Efficient Data Logging with Memory Constraints: Dynamic Buffer and Downsampling (merge) on Embedded Devices

Many developers face the same problem: a Garmin watch, like most embedded devices, has very limited memory, yet you want to log data continuously for hours - often offline, with no way to upload it for a long time.

If you simply store every sample in memory, you'll quickly run out of space.
The solution: use a fixed-size buffer, and when it fills up, automatically merge (downsample) the samples so that older data is always retained - just at lower temporal resolution - while the newest data is stored at the highest possible detail.

The Core Idea

  1. Collect samples in a fixed-size buffer (maxSamples, say 60), adding a new sample every tick (e.g., every second).

  2. When the buffer is full (overflow):

    • Merge adjacent pairs of samples (downsampling): every two samples become one (e.g., 60 → 30), freeing up space.

    • After the merge, each sample covers twice as much time as before.

  3. Crucial note:
    After each merge, every sample in the buffer - including the newest and oldest - covers exactly the same time interval
    (the "sampling interval" as set after the latest merge, e.g., 2s, 4s, 8s, etc.).

  4. All new incoming samples are also stored using this “coarser” (enlarged) interval until the next merge.

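To make the loop concrete, here is a minimal sketch in Python (illustrative only; the class name and the merge_pair callback are invented for this example, and the type-specific merge rules are described in the next section):

class DownsamplingLog:
    def __init__(self, merge_pair, max_samples=60):
        self.merge_pair = merge_pair   # type-specific merge rules (see table below)
        self.max_samples = max_samples
        self.samples = []
        self.interval = 1              # seconds covered by each sample

    def add(self, sample):
        self.samples.append(sample)
        if len(self.samples) >= self.max_samples:
            # pairwise merge: 60 samples -> 30, freeing half the buffer
            self.samples = [self.merge_pair(self.samples[i], self.samples[i + 1])
                            for i in range(0, len(self.samples) - 1, 2)]
            self.interval *= 2         # each slot now covers twice as much time
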
General Rules for Merging All Data Types

Data type                            | Merge logic                              | Explanation
Timestamp                            | Always take from the first sample        | Marks the start of the merged time window
Maximum value                        | Take the max of the pair                 | E.g., max depth, max speed
Minimum value                        | Take the min of the pair                 | E.g., lowest temperature
Averaged value                       | Take the average                         | E.g., avg heart rate, avg temp, avg speed
Monotonically increasing/decreasing  | Always take from the last sample         | E.g., total distance, battery left
Latest/current value                 | Always take from the last sample         | E.g., GPS position, current sensor state
Aggregate/trend                      | Type-dependent: average, min, max, diff  | Compress as fits the type: aggregate, max, min
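
To make these rules concrete, a pair-merge function might look like the Python sketch below (the field names are only examples, not a fixed schema):

def merge_pair(a, b):
    # a is the older sample, b the newer one
    return {
        "timestamp": a["timestamp"],                       # start of the merged time window
        "max_depth": max(a["max_depth"], b["max_depth"]),  # maximum-type value
        "min_temp":  min(a["min_temp"], b["min_temp"]),    # minimum-type value
        "avg_hr":    (a["avg_hr"] + b["avg_hr"]) / 2,      # average-type value (both cover equal time spans)
        "distance":  b["distance"],                        # monotonically increasing: take the last
        "position":  b["position"],                        # latest/current value: take the last
    }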

Changing Resolution Over Long Offline Periods

  • After each merge, all buffer samples cover the same time interval:

    • E.g., 1s → 2s → 4s → 8s → 16s, etc.

  • New samples are stored only at this interval (until the next merge).

  • There will never be finer-resolution samples at the end of the buffer than at the start.

Formula:

If your total offline period is T seconds and your buffer holds N samples, the sampling interval should be:

  • The smallest power of 2 where N × interval ≥ T

  • (e.g., 6 hours = 21,600s → 60 × 512 = 30,720s, so interval = 512s)

Example Table:

Offline period | Sampling interval (s) | Each sample covers | Buffer covers in total (h:mm)
2 hours        | 128                   | 2 m 08 s           | 2:08
4 hours        | 256                   | 4 m 16 s           | 4:16
6 hours        | 512                   | 8 m 32 s           | 8:32
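
This calculation fits in a tiny helper; a Python sketch of just the formula above (nothing device-specific) reproduces the table values:

def required_interval(total_seconds, buffer_size=60):
    # smallest power-of-two interval such that buffer_size * interval >= total_seconds
    interval = 1
    while buffer_size * interval < total_seconds:
        interval *= 2
    return interval

required_interval(2 * 3600)   # -> 128  (2 hours)
required_interval(4 * 3600)   # -> 256  (4 hours)
required_interval(6 * 3600)   # -> 512  (6 hours)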

Why Is This Good?

  • You can log data for hours or days without running out of RAM - the memory usage is always fixed.

  • You never actually lose data, only reduce temporal resolution for the oldest records (the information is still retained, just in coarser “chunks”).

  • It’s always easy to reconstruct the real time interval for each sample - just use the timestamp.

Short "merge" algorithm (pseudo-code):

function mergeSamples(a, b):
    ts = a.timestamp                // Take the timestamp from the first sample (start of the merged window)
    maxValue = max(a.maxValue, b.maxValue)
    minValue = min(a.minValue, b.minValue)
    avgValue = (a.avgValue + b.avgValue) / 2   // Valid because both samples cover equal time spans
    lastValue = b.lastValue         // Always use the last sample for “latest” values
    // For other fields, apply type-specific logic!
    return [ts, maxValue, minValue, avgValue, lastValue, ...]

Key Takeaway

With this method, you can log indefinitely, regardless of offline duration, and never lose data - only the temporal resolution falls as the logging period grows.
After every buffer merge, all samples in the buffer cover the same time interval.
There are never finer-resolution samples at the end than at the start.

This approach is universal - whether for Garmin, any embedded system, IoT device, or custom health/activity tracker.

  • Can you give some examples of data where this makes sense?

  • In practice, this approach is most useful for any custom-calculated data that your device or app generates - anything that Garmin/Connect IQ does not natively log or sync, but you want to save, analyze, or upload to your own server.

    For example:

    Suppose you have a custom DataField that displays a calculated heart rate percentage or zone index - something you compute every second using raw sensor data, but that Garmin does not store natively.

    If you want to keep this unique value for long-term analysis (e.g., to upload to your own database later), simply store each computed value (with timestamp, and optionally min, max, average, etc.) into a fixed-size array - let’s say one value per second.

    When the array fills up, you merge (downsample) adjacent elements - combining values according to their type (e.g., average, min, max) - so your buffer stays the same size, but each slot now covers twice as much time. This process repeats: every time the buffer fills up, you merge again, and double the interval.

    This way, you always keep your buffer in RAM, but never lose your custom data - only its time resolution drops for older samples. You can upload this array to your own server, reconstruct the timeline, and analyze as you wish.

    Example pseudo-code for storing and merging custom samples: 

    sample = [timestamp, customValue, minValue, maxValue, avgValue, ...]
    
    // Check if it's time to start a new sample:
    // (For example: if the number of seconds since start is an exact multiple of the current sampling interval)
    if (it_is_time_for_new_sample):
        samples.add(sample)
    else:
        lastSample = samples.last()
        merged = mergeSamples(lastSample, sample)
        samples[samples.size - 1] = merged
    
    if (samples.size >= maxSamples):
        samples = mergeAllSamples(samples)
        sampleInterval *= 2

    • sampleInterval starts at 1 (second), then doubles after each merge.

    • mergeSamples() merges two adjacent samples, combining their values (min, max, avg, etc.).

    • mergeAllSamples() merges all elements in the array in pairs.

    • At upload, you send the buffer as-is.
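
    Tying these pieces together, here is one way the loop could look in runnable Python (a sketch only; the names and the simple 2-way average mirror the pseudo-code above - this is not any Connect IQ API):

    import time

    MAX_SAMPLES = 60
    samples = []              # each sample: [timestamp, value, min, max, avg]
    sample_interval = 1       # seconds covered by one slot; doubles after each merge
    start = int(time.time())

    def merge_samples(a, b):
        # timestamp from the first sample, latest value from the second, then min, max, avg
        # (2-way average as in the pseudo-code; a count-weighted average would be more precise)
        return [a[0], b[1], min(a[2], b[2]), max(a[3], b[3]), (a[4] + b[4]) / 2]

    def store(value):
        global samples, sample_interval
        now = int(time.time())
        sample = [now, value, value, value, value]
        if not samples or (now - start) % sample_interval == 0:
            samples.append(sample)                            # start a new slot
        else:
            samples[-1] = merge_samples(samples[-1], sample)  # fold into the current slot
        if len(samples) >= MAX_SAMPLES:
            # pairwise merge the whole buffer, then double the interval
            samples = [merge_samples(samples[i], samples[i + 1])
                       for i in range(0, len(samples) - 1, 2)]
            sample_interval *= 2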

    This is ideal for any self-generated time series - custom fitness indices, unique event detections, complex calculated fields, or advanced research metrics - where you want to keep your data, but have limited memory!

  • I ran into this situation with my recent battery monitor app. It tracks battery usage over time and plots it. I first did something like you mentioned: as data got older, I merged every two samples together, keeping the timestamp of the first one and the higher battery level of the two. I did that for the bottom half of the array, leaving room for 25% new data until a merge was needed again.

    However, I soon ran into issues when the app was launched through a glance, as the glance has FAR less available memory than the main app. To circumvent this problem, I broke the array up into smaller chunks, and only the latest "chunk" is loaded while in the glance. I keep a stored array that contains the timestamps of the first element of each of those "chunks", and the "chunks" themselves are stored as history_timestamp, so they are easy to retrieve and combine to rebuild the whole stored data set for plotting.

    Since I calculate the slopes of all down trends in the data for discharge prediction, this can be very time consuming when there are thousands of data elements. To prevent that, once a down slope has been calculated, it is stored in an array called slopes_timestamp (using the same timestamp as the corresponding "chunk"), so each slope is only calculated once. I then simply average the previously calculated slopes to get the projected down slope for all the data. I've also moved the slope calculation into its own timer loop so that, I hope (I never got an answer to my question from a few weeks ago), it is decoupled from the time budget allotted to onUpdate.

    Once the maximum number of "chunks" is reached, I drop the oldest one and start a fresh one. I'm pondering the idea of taking the last two chunks and merging them together instead of dropping the older one like I currently do. But with 500 elements per chunk, I have to make sure I don't go over the time limit allowed per run.

    I also preallocate the current "chunk" I'm working with so I don't have to use .add() to append new data, which, from what I've read, creates a new array with the new element at its end - pretty inefficient.

    I learned quite a bit about the limitations of CIQ apps while building this one.

  • Why would you do that when, in most cases, you can just add it to the recorded FIT data, which is synced along with the other data (distance, HR, elevation, pace, ...) and adds almost nothing to the memory needs (probably less than what you described)?

  • The battery monitor mentioned is a case where this would be useful, as there's no FIT recording involved, plus it's something you'd want running 24/7, and maybe over a few days.

  • Thanks for your question, Flocsy!

    In my use cases, I’m almost always developing custom DataFields with unique, app-specific calculated values - not just displaying built-in Garmin/FIT-compatible metrics.
    This means the data I want to log and later analyze (or upload to my own server) doesn’t have a corresponding FIT field, and can’t be saved through standard Garmin activity recording.

    So for me, FIT integration simply isn’t an option - that’s exactly why I rely on this buffer + merge approach for efficient, long-term offline storage of my custom metrics.

    This is especially important for calculated, experimental metrics and similar fields that Garmin doesn't support natively. If I want to keep these for later, I have to handle storage myself!

  • But if it's a data field, which anyway runs during an activity that records some fields, then I understand it even less. Can you give an example? Because when I do "custom calculations" in my DFs, it usually makes sense to correlate them with some of the other data (maybe the input data the calculation is based on), and it doesn't make sense to look at them individually (like you say: later, detached from the activity they belong to), even less to "lose" precision by dropping or merging values.

  • Great question! Let me give you a concrete example:

    My HydroDepth DataField (already released, you can check it out on my site: https://felixgear.net) is a custom data field for tracking water depth with centimeter precision, using both pressure and temperature corrections.
    But more importantly, I’m about to release DiveDepth, a fully standalone dive logging app (currently in testing), which will log depth, temperature, location and more - even when there is no "standard" FIT activity recording happening.

    For my projects, a regular chart or graph isn’t enough. I use a custom online dashboard with map-based visualization - so you can hover over any datapoint and instantly see more details, context, and statistics.
    That’s only possible because I log my own custom data fields offline and later upload them to my server for deep analysis and interactive visualization.

    Of course, I totally understand your point:
    If someone is only using built-in data fields and doesn’t plan to create custom dashboards or analyses, then standard FIT recording is usually perfectly fine!
    But that’s not what my post is about. I’m presenting a data storage technique that’s generally useful for any kind of advanced, offline, or long-term data collection - regardless of what you ultimately want to do with the data.

    One more important point:
    Technically, you can use the FitContributor API in a DataField, but in practice, custom fields written this way rarely appear in the real FIT file or Garmin Connect unless you develop a full standalone activity app.
    With built-in activities (like Run, Bike, Swim, etc.), DataFields cannot officially store extra custom data in FIT - the Garmin ecosystem does not support this reliably for DataFields.

    That’s exactly why I (and many others) need a custom buffer and storage system for DataFields:

    As a DataField developer, you have no official, guaranteed way to log your generated data to the standard Garmin FIT file during built-in activities.
    If you want to capture your own custom metrics or calculated values, and especially if you want to analyze or visualize them later, you must manage the storage yourself.

    This is a big part of why I advocate this buffer+merge method.
    It’s not about rejecting FIT recording for its own sake - it’s because in many advanced use cases (especially with DataFields), FIT just isn’t available or reliable!

  • By the way, if you actually have a working example where custom DataField values are reliably saved to the FIT file (and later show up in Garmin Connect), I’d love to see it!
    So far, I haven’t found a robust way to achieve this - if you’ve managed to make it work, please share your method or code snippet.

  • There are lots of examples of this. For instance, the Hiker DF: https://apps.garmin.com/apps/4debed0e-4266-495e-8215-a3a8755ee87a

    With simple graphs (that I can't attach...)

    But I am confused by some of the things you wrote. I have the impression that either you are not aware of the possibility of recording your custom data to a FIT file from a data field, or that you haven't figured out how to do it properly (maybe because of the data structure you want to record?)

    Also, you can record data to the FIT file and intentionally "hide" it from Garmin Connect, so it won't display strange data in an improper way, and you can download the FIT file and analyze it just like you do now (plus the additional advantage of being able to correlate it with other data, like time, GPS coordinates, etc., that are in the same FIT file).