Metric average data types

I don't know if this will make sense unless you've tried to do it, but the Activity.Info "average" data types, at least for Power and HR, are integers. And they really should be floats.

Here is why.

Say you want to know the AVG POWER for the preceding 10 minutes (any segment of a longer activity). You have two choices.

A: You can save EVERY current power reading for every second, and calculate the segment's avg power by adding up possibly thousands of values and dividing by the number of values. Ugh

But if the avg metric was saved in ACTIVITY.INFO as a float, there is a much easier way.

B: Simply save the AVG POWER at the start of the segment, then 10 minutes later grab the AVG POWER again, and use a simple formula to back out the segment value that caused that change in the AVG.

Except, since the AVG metric is an integer, that approach is broken.

For example: say you are 5 hours into a ride, your AVG POWER is 183, and 10 mins later it is 184. What was the POWER during those 10 minutes? You have no clue, since:

That 183 could have been anywhere from 182.50 to 183.49, and that 184 anywhere from 183.50 to 184.49.

Anyway - probably too late to ask Garmin to fix the precision of AVERAGE metrics. :-(
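
To put rough numbers on that ambiguity, here is a sketch of the bounds (Python just to illustrate the arithmetic; it assumes the integer averages are rounded to the nearest watt):

```python
# Bounds on the 10-minute segment power implied by two rounded
# overall averages: 183 at t = 5 h and 184 at t = 5 h 10 min.
t1 = 5 * 3600          # seconds at the start of the segment
t2 = t1 + 600          # seconds at the end of the segment
dt = t2 - t1

# The true averages could be anywhere in these intervals.
avg1_lo, avg1_hi = 182.5, 183.5
avg2_lo, avg2_hi = 183.5, 184.5

# Segment average = (avg2 * t2 - avg1 * t1) / (t2 - t1); bound it
# by pairing the extremes that minimize / maximize the result.
seg_lo = (avg2_lo * t2 - avg1_hi * t1) / dt
seg_hi = (avg2_hi * t2 - avg1_lo * t1) / dt
print(seg_lo, seg_hi)  # 183.5 244.5
```

So a one-watt change in the rounded overall average, five hours in, is consistent with a 10-minute segment average anywhere from about 183.5 W to 244.5 W.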
  • For A, there's a much simpler way than recalcing everything and storing 1000's of samples:

    let's say you've seen x samples so far and the avg is curAvg and you get a newSample

    curAvg=((curAvg*x)+newSample)/(x+1);
    x++;

    A single calculation and no need to store 1000's of samples. Reset x and curAvg as needed, if you want it based on time, lap, etc. Or say a value for 10 minutes as well as one for the lap.

    For a rolling avg, say for 10 minutes, when 10 minutes have passed you remove the impact on the avg of the reading you no longer need and add the impact of the new reading, so there you do need all the samples, but there's still minimal computation needed.
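
    A minimal sketch of that exact rolling average (Python just to show the bookkeeping; a Monkey C version would use an array as a ring buffer the same way):

```python
from collections import deque

class RollingAverage:
    """Exact rolling average over the last `window` samples.

    Stores the samples (the unavoidable cost of an exact rolling
    average), but each update is O(1): add the new sample, and
    subtract the one falling out of the window.
    """
    def __init__(self, window):
        self.window = window
        self.samples = deque()
        self.total = 0

    def add(self, sample):
        self.samples.append(sample)
        self.total += sample
        if len(self.samples) > self.window:
            self.total -= self.samples.popleft()

    def average(self):
        return self.total / len(self.samples) if self.samples else None

ra = RollingAverage(3)
for p in (400, 200, 400, 200):
    ra.add(p)
print(ra.average())  # average of the last 3 samples: 800 / 3
```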
  • jim_m_58 I think Dave just described an algorithm (B) for rolling avg which doesn’t require saving all the samples, it just requires you to know or compute the average at all times, and to remember the average from T(n-1). Using your method to calculate the average and his method in B, he could get what he wants without storing ten minutes worth of data.

    I think it’s basically a form of the algorithms below. Note the disclaimer about loss of precision as the count or time gets larger....
    https://stackoverflow.com/questions/12636613/how-to-calculate-moving-average-without-keeping-the-count-and-data-total

    trying to find a way to calculate a moving cumulative average without storing the count and total data that is received so far.
    I came up with two algorithms but both need to store the count:
    • new average = ((old count * old data) + next data) / next count
    • new average = old average + (next data - old average) / next count

    The problem with these methods is that the count gets bigger and bigger resulting in losing precision in the resulting average.
    The first method uses the old count and next count which are obviously 1 apart. This got me thinking that perhaps there is a way to remove the count but unfortunately I haven't found it yet. It did get me a bit further though, resulting in the second method but still count is present.


    There are a couple of other good algorithms at that link, which don’t involve storing the count. But I think they’re all exponential weighted moving averages, which may not be what’s wanted.
    So having said that, storing all the values for the time period may be the most computationally sound method for a non-exponentially-weighted moving average after all....
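
    For reference, the exponentially weighted kind mentioned above needs no sample storage at all; a sketch (alpha is a smoothing constant you'd tune yourself, not anything Garmin exposes):

```python
def ewma_update(avg, sample, alpha=0.1):
    """Exponentially weighted moving average update.

    No stored samples and no count, but recent samples dominate:
    this is NOT the same thing as a fixed-window rolling average.
    """
    if avg is None:
        return float(sample)
    return alpha * sample + (1 - alpha) * avg

avg = None
for p in (400, 200, 400, 200):
    avg = ewma_update(avg, p, alpha=0.5)
print(avg)  # 275.0 with this (deliberately heavy) alpha
```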
  • All this would be so much easier if Garmin simply stored their averages as floats. That precision would make it as simple as storing the overall AVG value and timestamp at the start of the segment, then at the end of the segment using those two values together with the new overall AVG value and timestamp. An accurate segment avg can be calculated by computing a formula once. Oh well.... I'll use one of the work-arounds. THANKS
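
    That "formula once" would look like this (a sketch assuming float overall averages and timestamps measured from the start of the activity; the names are made up):

```python
def segment_average(avg_start, t_start, avg_end, t_end):
    """Average over [t_start, t_end] recovered from two overall
    averages taken since the start of the activity at t = 0.

    The total so far is average * elapsed time, so the segment's
    total is the difference of the two overall totals.
    """
    return (avg_end * t_end - avg_start * t_start) / (t_end - t_start)

# The 183 -> 184 example from the top of the thread, if the
# averages had been exact floats: 5 h in, then 10 minutes later.
print(segment_average(183.0, 18000, 184.0, 18600))  # 214.0
```

With full-precision averages, the 183 to 184 example resolves to exactly 214 W for the segment instead of a wide range of possibilities.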
  • No problem, but I want to point out again that you can calculate the overall average every second (without storing every single value or using an approximation) and take the two values at the start and end of the segment as you described.

    If Garmin can calculate the overall average, so can you. They're not doing anything special that you are unable to do in your own code. All you have to do is store the total and the number of samples; I'm pretty sure that's all Garmin does. Garmin doesn't have access to unlimited amounts of memory to store every sample (and storing every sample isn't needed anyway to average a small quantity like power, whose running total isn't going to overflow a 32-bit integer during a 3-hour bike ride).

    There is absolutely nothing stopping you from implementing your algorithm in the way you originally intended, as long as you add the step of calculating the average yourself. Unless I'm missing something...

    e.g.

    var N = 0;
    var total = 0;

    function compute(info) {
        if (info has :currentPower && info.currentPower != null) {
            N++;
            total += info.currentPower;
        }

        if (N != 0) {
            var averagePowerFloat = total.toFloat() / N;
            // if this isn't how Garmin does it (although with integer division or rounding), I'll eat my hat

            // Now you can do whatever you want with your floating-point average power,
            // such as sample it at the beginning and end of your segment
        }
    }


    Whether taking the average at the beginning and end of a segment is the most precise way to get the segment average is another story....
  • WillNorthYork,

    I am implementing your idea about a running average. Three additions need to be made (don't worry, no hat-eating required) in order to mirror Garmin's internal power average (except ours will be floating-point).

    First, you can't assume that compute() is called at exactly the same frequency each time. It can be anywhere from every 800 ms to every few seconds, depending on what else is running: say you add a couple of complex CIQ data fields to your screen, or just activate a turn-by-turn course. So on each compute() loop, take a timestamp, compare it against the prior run's, and apply a time-duration-weighted increment for better accuracy. That said, I get that an ASSUMED fixed duration is probably OK.

    Second, and this is huge: you need to maintain a timerState variable and only perform the increment when the timer is in a RUN state (not stopped, not paused, etc.).

    Third, you have a choice of whether or not to include zero values in the average. Most people include zeros for power averaging; for cadence, most people don't. I'm not sure how to grab the user's include-zeros setting via CIQ, so I'll hard-code my preference to include zeros.
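
    Those three additions could be sketched like so (Python to show the logic only; the real field would read Activity.Info and the timer state from CIQ, and all names here are made up). The current power reading is attributed to the whole interval since the previous compute() call:

```python
class TimerAwareAverage:
    """Time-weighted running average that only accumulates while
    the timer is running. Times are in milliseconds.
    """
    RUN = "run"  # stand-in for the real timer-state constant

    def __init__(self):
        self.weighted_total = 0.0  # sum of power * duration
        self.elapsed_ms = 0        # total accumulated duration
        self.last_ts = None

    def compute(self, now_ms, power, timer_state, include_zeros=True):
        # Only accumulate when the timer is running; zeros are
        # optionally excluded (the cadence convention).
        if self.last_ts is not None and timer_state == self.RUN:
            dt = now_ms - self.last_ts
            if power is not None and (include_zeros or power > 0):
                self.weighted_total += power * dt
                self.elapsed_ms += dt
        self.last_ts = now_ms

    def average(self):
        return self.weighted_total / self.elapsed_ms if self.elapsed_ms else None

ta = TimerAwareAverage()
ta.compute(0, 400, ta.RUN)       # first call just records the timestamp
ta.compute(1250, 203, ta.RUN)    # 1.25 s attributed to 203 W
print(ta.average())              # 203.0
```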

    Again, thanks for the idea!!
  • Pretty sure you can't actually get an accurate rolling average without saving every value. You are getting an estimated rolling average, but not an actual one. I've heard of this estimate being used before, but it really doesn't work: if you have a series of 400 W values and a series of 200 W values in that average, you can't just remove 1 second's worth of the average of those. You need to know whether the second you are removing from the average was 400 or 200.

    It's probably good enough for your needs, but it isn't accurate. And the whole premise of this thread is that you want precision.

    And as Jim shows, computations can be minimal, it's just storing all the values that gets expensive.
  • No worries.

    Yes, compute() isn't called exactly every second, but I think most devs will assume once per second, as no one will notice the difference over a 30-minute to 3-hour activity.

    Right, most of Garmin's calculated metrics (max, average) don't get updated while the timer is paused, so you def want to mirror that.

    I have to agree with ekutter that your estimate is not accurate. You may as well just use a "standard" exponentially weighted average algorithm if you don't want to store all the values - you'll get a similar level of precision without having to store the average from the beginning of the window.
  • ekutter - I'm trying to understand your comment....

    Example: say at 3 hours in, our avg power is 234.56789 W (at 10,800,000 milliseconds).

    We capture a current power of 203 W at the next compute() cycle, which happens to come 1.25 seconds later (at 10,801,250 milliseconds). The power is an integer metric.

    I agree that we need to assume that for that entire last 1.25 seconds, the power has been 203W (limited to an integer). But that is as good as we can get. And even storing every value, that assumption is still made.

    The NEW avg power is simply (234.56789 * 10,800,000 + 203 * 1,250) / 10,801,250 = 234.56424 W

    Are you saying that by saving all 10,801 integers, the average would be more accurate? In fact, I think it would be less accurate, since you'd be assuming all 10,801 values were captured with the same gap between samples, whereas the example above takes the actual sample-to-sample time into account (1.25 seconds in this case).

    I'd like my rolling average to be as accurate as possible. Thanks for helping me get this right.
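
    Checking the arithmetic in that example (just the time-weighted average formula, nothing Garmin-specific):

```python
# Time-weighted update of an overall average: the old average
# covers old_ms of riding, the new 203 W reading covers dt_ms.
old_avg = 234.56789
old_ms = 10_800_000
power = 203
dt_ms = 1_250

new_avg = (old_avg * old_ms + power * dt_ms) / (old_ms + dt_ms)
print(round(new_avg, 5))  # 234.56424
```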
  • To be most accurate, you'd need to store the exact time AND value for each data point, as well as use higher precision (either floats or fixed-point numbers, i.e. values shifted left by some number of digits), and do weighted averages rather than assuming even one-second intervals. But to keep the math simple, let's assume even one-second intervals.

    Say you have the following data set. Yes, it's contrived, as you can't realistically oscillate between 200 W and 400 W on an every-second basis, but it's made extreme so you can easily see the difference. And say you are averaging over 3 seconds.

    (Yes, I'm rounding to integers below for clearer display, but that makes no difference to the argument.)

    [400,200,400,200,400,200, ...]

    second
    3: (400 + 200 + 400) / 3 = 1000 / 3 = 333
    4: (1000 + 200 - 400) / 3 = 800 / 3 = 267
    5: (800 + 400 - 200) / 3 = 1000 / 3 = 333
    ...

    whereas if you were just using your currently adjusted average to compute what to subtract:
    4: ((333 * 2) + 200) / 3 = 866 / 3 = 289
    5: ((289 * 2) + 400) / 3 = 978 / 3 = 326
    ...

    So subtracting off the average is not the same as subtracting off the oldest value. And as the number of elements grows, the approximation's swings keep shrinking (it settles into oscillating around 300), while the true rolling average keeps swinging between 333 and 267.
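
    The divergence is easy to see in a quick simulation of the two methods above (Python for illustration):

```python
from collections import deque

data = [400, 200] * 10   # the contrived alternating series
window = 3

true_avgs, approx_avgs = [], []
samples, total = deque(), 0
approx = None
for i, x in enumerate(data):
    # Exact rolling average: keep the window's actual samples.
    samples.append(x)
    total += x
    if len(samples) > window:
        total -= samples.popleft()
    true_avg = total / len(samples)

    # Approximation: subtract the current average instead of the
    # sample that is actually leaving the window.
    if i < window:
        approx = true_avg
    else:
        approx = (approx * (window - 1) + x) / window

    true_avgs.append(round(true_avg))
    approx_avgs.append(round(approx))

print(true_avgs[-4:])    # keeps oscillating 333 / 267
print(approx_avgs[-4:])  # settles near 320 / 280
```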

    In general, all of this doesn't really matter, as I can guarantee your power meter isn't accurate to 5 watts, especially on an every-second basis. Probably not even 10 or 15. Over an extended period of many seconds or minutes it might be in the 2% range, but not second by second, and over that extended period the imprecision we're talking about is lost in the noise.