Average iterator not working?

Hi, 

I have an iterator set up to calculate an average of surrounding figures, and it appears to be giving me different results than if I run the same calculation with the same numbers in Excel.

It's for (i = 12; i < (array.size() - 12); i += 1)

Array[i] = (Array[i] + Array[i+1] + Array[i-1] etc to Array[i-12]) / 25;

This should give me an average value over 25 readings, but it is off!

564 in Excel... 650-ish in VS Code... (so about 100 different?)

Why might that be?

  • I think you need to have i<array.size() - without the -12

  • The other thing is that you're modifying the array while you do this, so the value of array[17] will already have changed by the time you calculate some of the later (and earlier) items... The code is very strange; most probably it's not doing what you think it is...
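A tiny illustration of that diagnosis (a Python sketch; the function names are made up): once an element is overwritten with its average, every later window that includes it reads the averaged value instead of the raw one.

```python
# Averaging in place vs. into a separate output array. 'half' is the
# number of neighbours on each side, so the window is 2*half + 1 wide.
def centered_avg_in_place(a, half):
    for i in range(half, len(a) - half):
        a[i] = sum(a[i - half:i + half + 1]) / (2 * half + 1)  # clobbers a[i]
    return a

def centered_avg_out_of_place(a, half):
    out = list(a)  # averages go here; 'a' stays untouched
    for i in range(half, len(a) - half):
        out[i] = sum(a[i - half:i + half + 1]) / (2 * half + 1)
    return out

raw = [0, 9, 0, 9, 0]
print(centered_avg_out_of_place(raw, 1))  # [0, 3.0, 6.0, 3.0, 0] - correct
print(centered_avg_in_place(raw, 1))      # middle values differ: windows saw averages
```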

  • Genius - you are spot on - that is the issue: the averaged values were overwriting the past values. I need to create, from an array of say 150 instant readings, a rolling average of 25 readings. I.e. - average of 0 to 24, then average of 1 to 25, etc.

    Is the iterator the best way to achieve this?

  • In general, the best way to do this is probably to use a First-In-First-Out queue with a fixed maximum size and a second array to store your averages.

    Assume you have a class Queue which stores Numbers and has the following operations:

    initialize(maxSize as Number): creates a new Queue with the specified max size

    enqueue(value as Number) as Void: adds a number to the queue

    dequeue() as Number: removes a number from the queue (first in, first out) and returns its value

    size() as Number: returns the current number of elements in queue

    Your rolling average function would look something like this:

    function calculateRollingAverages(data as Array<Number>, window as Number) as Array<Number> {
        var dataSize = data.size();
        var dataQueue = new Queue(window);
        var rollingTotal = 0;
        var rollingAverages = [];
        for (var i = 0; i < dataSize; i++) {
            rollingTotal += data[i];
            dataQueue.enqueue(data[i]);
            if (dataQueue.size() == window) {
                var rollingAverage = rollingTotal / window;
                rollingAverages.add(rollingAverage);
                rollingTotal -= dataQueue.dequeue();
            }
        }
        return rollingAverages;
    }
    
    var data = new [150];
    // ... populate data
    var rollingAverages = calculateRollingAverages(data, 25);

    I haven't tested this but the general idea is there.

    Couple of notes:

    - The implementation of Queue is left as an exercise

    - Updating the rollingAverages array could be made much more efficient by creating it as a fixed-size array and keeping track of the current position. I'll also leave that as an exercise.

    - This code obviously only calculates rolling averages after *all* the data has been collected. You'd have to rewrite it to update the rolling averages as your instant-reading array is updated.

    (This is assuming you don't have any unusual memory constraints. For example, if the code and memory associated with the use of a class for Queue create too much overhead for your app, you could implement a queue without the use of classes. This probably isn't the case for you, though)
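Since the Queue itself was left as an exercise, here is one possible shape for it: a fixed-size ring buffer matching the interface above (initialize/enqueue/dequeue/size). This is a Python sketch purely for illustration; the same pattern translates directly to the language above.

```python
# Fixed-size ring buffer FIFO: one preallocated list plus head/count
# indices, so enqueue and dequeue never reallocate memory.
class Queue:
    def __init__(self, max_size):          # "initialize(maxSize)"
        self._buf = [0] * max_size
        self._max = max_size
        self._head = 0                     # index of the oldest element
        self._count = 0

    def enqueue(self, value):
        if self._count == self._max:
            raise OverflowError("queue full")
        tail = (self._head + self._count) % self._max
        self._buf[tail] = value
        self._count += 1

    def dequeue(self):
        if self._count == 0:
            raise IndexError("queue empty")
        value = self._buf[self._head]
        self._head = (self._head + 1) % self._max
        self._count -= 1
        return value

    def size(self):
        return self._count
```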

  • Thanks, yeah - I have already rewritten the code twice due to stack overflows, and have a 3rd rewrite coming, as I just figured out I can reduce the number of arrays I am using.

    When you mention fixed-size arrays being more efficient, do you mean initialising a 150-element array and changing values is more efficient than adding values to a zero-sized array, for example?

    Also, given I am running into compute limits here, what is the reason I would open a new window to do this rather than doing it inside the current window?

    I'm using accelerometer data, which comes out in 1-second bursts of 25, but then I need to string the seconds together and age the data at the same time.

    And implement a 1-second rolling average conversion on top of that for at least the last second's worth of data, after aging or before the next age.

    When you mention fixed-size arrays being more efficient, do you mean initialising a 150-element array and changing values is more efficient than adding values to a zero-sized array, for example?

    Yes, because there's always a possibility that Array.add() will cause memory reallocation and a copy.

    Also, given I am running into compute limits here, what is the reason I would open a new window to do this rather than doing it inside the current window?

    Yeah, my code was more of a generic example and not specifically geared towards your case. I'll admit the example of calculating rolling averages after all data has been collected isn't very realistic.

    I'm using accelerometer data, which comes out in 1-second bursts of 25, but then I need to string the seconds together and age the data at the same time.

    And implement a 1-second rolling average conversion on top of that for at least the last second's worth of data, after aging or before the next age.

    I'm a little confused here. Is the rolling average for 1 second or for several seconds? If it's just for 1 second's worth of samples (1 burst of 25), then you don't really need a "rolling" average, you just need a plain average. If it's for several seconds' worth of data (6 seconds?), then the code might be a bit more complicated due to the fact that you're getting bursts of data.

    Regardless of the details, the general principle is the same. You want to use a FIFO queue to accumulate your rolling data, because it's the most efficient way to maintain a running total for the rolling average. Otherwise you'll just be re-reading the same data over and over again, which is inefficient (especially if you try to scale this to large amounts of data).

    For example, if you are getting 25 samples per second, then each second you would enqueue 25 numbers and add them to the rolling total. If you have enough numbers to fill your window (150?), then you would calculate your rolling average by dividing the rolling total by the number of samples in the queue. You would then dequeue 25 numbers (representing the oldest burst of samples).
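That burst-by-burst flow can be sketched like this (Python for illustration; deque stands in for the Queue, and the class and method names are made up):

```python
# Each second a burst of 25 samples is enqueued and added to the running
# total; once the 150-sample window is full, one rolling average is
# emitted and the oldest burst (25 samples) is dequeued.
from collections import deque

class BurstAverager:
    def __init__(self, burst=25, window=150):
        self.burst = burst
        self.window = window       # assumes window is a multiple of burst
        self.queue = deque()
        self.total = 0.0

    def on_burst(self, samples):
        """Feed one burst; returns the rolling average, or None until full."""
        for x in samples:
            self.queue.append(x)
            self.total += x
        if len(self.queue) < self.window:
            return None
        average = self.total / self.window
        for _ in range(self.burst):        # drop the oldest second of data
            self.total -= self.queue.popleft()
        return average
```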

  • So in pseudocode:

    initialize:
      dataQueue = new Queue(windowSize)
      rollingTotal = 0

    on receiving new data sample x:
      rollingTotal += x
      dataQueue.enqueue(x)

      if dataQueue.size() == windowSize:
        rollingAverage = rollingTotal / dataQueue.size()
        # do something with rollingAverage
        rollingTotal -= dataQueue.dequeue()

    If all you need to do is display the rolling average rather than store it (for some reason), then you don't really need much more than this. Your queue data structure would be a class that contains one array and 1 or 2 numbers which serve as indices.
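Filled out a little, that pseudocode could look like this (a Python sketch; `handle_average` is a stand-in name for the "do something with rollingAverage" step, e.g. updating the display):

```python
# Streaming rolling average: feed one sample at a time; once the window
# is full, every new sample produces one new average via the callback.
from collections import deque

class RollingAverager:
    def __init__(self, window_size, handle_average):
        self.window_size = window_size
        self.handle_average = handle_average   # callback for each new average
        self.data_queue = deque()
        self.rolling_total = 0.0

    def on_sample(self, x):
        self.rolling_total += x
        self.data_queue.append(x)
        if len(self.data_queue) == self.window_size:
            self.handle_average(self.rolling_total / self.window_size)
            self.rolling_total -= self.data_queue.popleft()
```

For example, feeding 1..5 through a window of 3 produces the averages 2.0, 3.0, 4.0.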

  • Yeah, so basically, every 1 second I need to look at the last 1 second's worth of rolling averages (25 readings that are each individual averages of 25 readings, offset by 1 reading). If something is found in there, then I need to be able to look at the last 6 seconds of rolling averages, which is why the initial array is the size it is. Then if something else is found, I add more seconds to the end (new seconds) and collect an even bigger array, which takes another 6 or 7 seconds to gather - so 300 readings now - and do some calculations on the raw data along the way, ending up with a bigger array that needs to be averaged also.

    I can save some duplication of effort here by aging that 1 second's worth of rolling averages into its own array; that way the 6 seconds' worth is always there, and I can use it later on if my second trigger is met, by simply adding new info to the end of it.

  • Sorry, to be clear about what rolling average means:

    I get 2 seconds' worth of accelerometer data (50 raw data points).

    I then add up reading 1 to reading 25, and divide by 25 - that gives me a 1-second average, based on the 12th reading (I'm adding up plus and minus 12 readings each side of the 12th reading to get the average. It's a 1-second average that is calculated around every single individual accelerometer return).

    The next reading would be from reading 2 to reading 26, divided by 25. Reading 26's data point is from the "next" second, which is why I have to start off with 2 seconds of raw data; I only get 1 second's worth, or 25 new averaged readings (1-second average readings for 25 of the raw data points).

    This would give me 25 (1-second average readings) that are centred on raw data point array index 12, and go up to the 12th-last reading. I.e. - they represent a rolling 1-second average from half a second into the first second, up until half a second into the second second's raw data points.
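That centered-average scheme can be sketched as follows (Python for illustration; variable names are made up). One small note: a 50-sample buffer actually holds 26 positions with 12 neighbours on each side, though in steady state each new second of 25 samples contributes 25 new averages.

```python
# Each output is the mean of one raw sample plus 12 neighbours on either
# side - 25 samples, i.e. 1 second of data at 25 Hz.
def centered_rolling_averages(raw, half_window=12):
    width = 2 * half_window + 1                        # 25 samples per window
    return [
        sum(raw[i - half_window:i + half_window + 1]) / width
        for i in range(half_window, len(raw) - half_window)
    ]

two_seconds = list(range(50))   # stand-in for 50 raw accelerometer values
averages = centered_rolling_averages(two_seconds)      # 26 centered averages
```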

  • So, efficiency wise, right now I have about 5 different sized arrays.

    One sized 150, which is a rolling array that continually overwrites itself (that is, the new 25 raw readings simply overwrite the oldest 25 readings).

    One smaller one takes information from the 150 into a 50 array, to calculate the rolling average of the last 1 second... which goes into a 25 array.

    If triggers are met, the 50 array gets array.add'ed to until it is 150, and then the 25 gets array.add'ed to until it is 125.

    then

    One of size 300 populates right at the end using array.add - initially from the 150 array, and then directly from the accelerometer.

    So, I think maybe I am better off just running 2 x 300 arrays, or possibly 3 to make the calculations easier. 4 would make the calculations easiest of all.

    1 will contain the raw data points, continually overwriting the oldest 25 data points.

    1 will contain the corresponding 1-second rolling averages, which also overwrite the oldest 1 second's worth of rolling averages.

    Then I need to somehow index all of my calculations, in case the readings I am interested in occur over the end of the array index and into the start of the index. OR

    Array 3 is simply a re-ordered version of array 1, that gets made after triggers are met.

    Array 4 is simply a re-ordered version of array 2, that gets made after triggers are met.

    Such that all the data I am interested in, from a time perspective, starts at the beginning of the array and ends at the end, with no "rolling over" of time over the end and into the beginning of the array.

    Thoughts?
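For what it's worth, the "re-ordered version" in arrays 3 and 4 is cheap to build: if you know the index of the oldest entry in the circular buffer, two slices stitch it back into chronological order (a Python sketch; `oldest_index` is just the write position you're already tracking):

```python
# "Unroll" a circular buffer into chronological order. With a rolling
# overwrite scheme, oldest_index is simply the next write position.
def unroll(circular, oldest_index):
    return circular[oldest_index:] + circular[:oldest_index]

# e.g. a 5-slot buffer where values 4 and 5 overwrote the oldest slots:
history = unroll([4, 5, 1, 2, 3], oldest_index=2)   # -> [1, 2, 3, 4, 5]
```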