First list of FIT decoding libraries and basic benchmarking

Hi,

Just wanted to share: I've started putting together a list of libraries that decode FIT files, along with a simple benchmark of each one's performance, since performance is an issue for my own application.

Hopefully it can be useful to others as a reference. Feel free to add other libraries or suggest others to look at.

https://github.com/roznet/fit-benchmarks

  • I've now started including "gps loss" detection in my post-parsing because that factor is really bothering me.

    Basically: count all records, count the records with vs. without lat/lon, and compare the counts.

    More than 2 records without lat/lon = gps loss, >2% of records is serious gps loss, >20% is severe gps loss.

    90% of files are fine, but there is a surprising amount of GPS dropout, sometimes just a few records in a file, and it's not obvious when mapped on Strava, etc.
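
    A rough sketch of that counting, in case it helps anyone. It assumes the decoder hands back each record as a dict with position_lat/position_long keys (hypothetical layout, adapt to your parser), and it reads the first threshold as "more than 2 records without a position":

```python
def classify_gps_loss(records):
    """Classify GPS dropout from a list of decoded record messages.

    Assumption: each record is a dict, and it "has GPS" when both
    position_lat and position_long are present and not None.
    """
    total = len(records)
    missing = sum(
        1 for r in records
        if r.get("position_lat") is None or r.get("position_long") is None
    )
    percent = 100.0 * missing / total if total else 0.0

    # Thresholds as described in the post: >20% severe, >2% serious,
    # more than 2 records without a position counts as (minor) loss.
    if percent > 20:
        level = "severe"
    elif percent > 2:
        level = "serious"
    elif missing > 2:
        level = "loss"
    else:
        level = "ok"
    return {"total": total, "missing": missing, "percent": percent, "level": level}
```

    The percentage checks come first so that a file with many missing records is not reported as plain "loss".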

    I intend to someday build a FIT repair tool that uses known waypoints, previous runs or even a mapped route (e.g. from Strava) to patch some gps data back in (with some kind of color coding on the map to indicate it's not actual real-world data).

    If the user provides a known starting and ending point and several waypoints it should be possible to use the distance data from the watch or stryd to time-stretch and estimate where they were and when within a certain small error factor.
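
    One way the time-stretch idea could look, as a sketch only. It assumes the waypoints are (lat, lon) tuples in degrees (at least two), the distances are the watch/Stryd values in metres for the records inside the gap in increasing order, and that straight lines between waypoints are good enough. The function names are made up for the example:

```python
import math

def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(h))

def estimate_positions(waypoints, distances):
    """Place each recorded distance on the waypoint path.

    The recorded distances are rescaled so the first one lands on the first
    waypoint and the last one on the final waypoint (the "stretch").
    """
    # Cumulative path length at each waypoint.
    cum = [0.0]
    for p, q in zip(waypoints, waypoints[1:]):
        cum.append(cum[-1] + haversine_m(p, q))
    path_len = cum[-1]

    d0 = distances[0]
    span = distances[-1] - d0
    out = []
    seg = 0  # current path segment; only moves forward since distances increase
    for d in distances:
        frac = (d - d0) / span if span else 0.0
        target = frac * path_len
        while seg < len(cum) - 2 and cum[seg + 1] < target:
            seg += 1
        seg_len = cum[seg + 1] - cum[seg]
        t = (target - cum[seg]) / seg_len if seg_len else 0.0
        (lat1, lon1), (lat2, lon2) = waypoints[seg], waypoints[seg + 1]
        # Linear interpolation in lat/lon: an approximation, fine for short segments.
        out.append((lat1 + t * (lat2 - lat1), lon1 + t * (lon2 - lon1)))
    return out
```

    Each estimated point can then be paired with the timestamp of the record its distance came from, and flagged as synthetic.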

    I discovered that even if all gps drops out of the records after a certain point, the lap data after the failure sometimes contains some restored gps data. Strava etc. won't use the lap lat/lon though, only the records. It's a strange bug on the watch, where the gps chip must either stop sending data or the watch cpu loses its way interpreting the data.

  • I think GPS in general can be a bit unreliable. Sudden loss of direct line of sight to a satellite means points won't be logged, and reflections can cause incorrectly logged locations (don't know what kind of mitigation/buffering is in place, but we only get smoothed out gps data anyway, I guess). For the VIRB I can spot this visually due to how we sometimes work with the data (e.g. put the points on a visual timeline and you get occasional gaps, wider than 100ms). I have plans for a repair of sorts, on top of interpolation between logged points, but I'm not sure when that'll be done.

  • I'd like to suggest the inclusion of other FIT files to help test decoders even though that's not the primary purpose of the benchmark.

    How about a CORRUPT.FIT with garbage/corruption purposely added after the ending CRC in the file,

    and also one with some base types purposely changed to invalid values (the CRC might have to be recalculated for that, but I think most decoders ignore the CRC)?

    Got this idea from a bad file I came across in my archive and had to patch my code to handle it for the remaining valid data.
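
    For anyone wanting to build such test files, here is a minimal sketch. The CRC routine is the nibble-table CRC-16 from the FIT SDK as I remember it (it matches the common CRC-16/ARC). The two helper names are invented for the example, and the second one assumes the trailing CRC covers everything before it (header plus data):

```python
import struct

_CRC_TABLE = (
    0x0000, 0xCC01, 0xD801, 0x1400, 0xF001, 0x3C00, 0x2800, 0xE401,
    0xA001, 0x6C00, 0x7800, 0xB401, 0x5000, 0x9C01, 0x8801, 0x4400,
)

def fit_crc16(data, crc=0):
    """CRC-16 over a bytes-like object, processed one nibble at a time."""
    for byte in data:
        # low nibble
        tmp = _CRC_TABLE[crc & 0xF]
        crc = (crc >> 4) & 0x0FFF
        crc = crc ^ tmp ^ _CRC_TABLE[byte & 0xF]
        # high nibble
        tmp = _CRC_TABLE[crc & 0xF]
        crc = (crc >> 4) & 0x0FFF
        crc = crc ^ tmp ^ _CRC_TABLE[(byte >> 4) & 0xF]
    return crc

def append_garbage(fit_bytes, garbage=b"\xde\xad\xbe\xef"):
    """Test case 1: a copy of the file with junk after the file CRC."""
    return bytes(fit_bytes) + garbage

def patch_byte_and_fix_crc(fit_bytes, offset, value):
    """Test case 2: overwrite one byte, then recompute the trailing CRC."""
    body = bytearray(fit_bytes[:-2])   # everything except the old CRC
    body[offset] = value
    return bytes(body) + struct.pack("<H", fit_crc16(body))
```

    To plant an invalid base type you would pass the offset of a base-type byte inside a definition message; finding that offset is left to your own decoder.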

    It also occurs to me it's possible to make a very large synthetic FIT file by just creating a massive route and then patching in actual distance/speed data into the records.

    I have a friend that does multi-day ultras so was thinking of getting some FIT files from them to test too.

    By the way, has anyone realized that Garmin could release the entire FIT spec inside a FIT file itself using the 207/206 message types? Since there is a "native field" option, it could just define each native field. It would be very large, but it would be nicely meta-recursive, with scale, offset and descriptions available for every field. That would be in addition to the spreadsheet, which otherwise needs special tools/code to parse.

  • I actually have a couple of corrupt ones. One is from https://www.thisisant.com/forum/viewthread/6470/ (not mine) but the dropbox-link no longer works so I'm not sure whether it's ok to repost.

    I also have a VIRB FIT-file where the header reports data size 0, the data is there however so I've added a check and estimate the data size via file size. Can't share this one, sorry.

    Another is from this site, with biking routes in Southern Sweden: https://www.musette.se/cykling/rutter/. Not all have FIT-files. Click a route and on the following page there's a link saying "Ladda hem rutten" ("download this route") at the bottom. Those that are FIT-files consistently report an additional 11 bytes of data in the header, which exceeds the file size. I had to add a check for this. :)
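
    The two header checks described above (data size 0, and data size running past the end of the file) could be sketched like this. The header layout is the standard one (size, protocol, profile, data size, ".FIT"); the function name and the assumption that a 2-byte CRC still sits at the very end of the file are mine:

```python
import struct

def read_fit_header(buf):
    """Parse the fixed part of a FIT header and sanity-check data_size.

    Returns (header_size, data_size, note). data_size is adjusted when the
    header value is 0 or would run past the end of the buffer.
    """
    if len(buf) < 12:
        raise ValueError("too short to be a FIT file")
    header_size, _protocol, _profile, data_size, magic = struct.unpack_from("<BBHI4s", buf, 0)
    if magic != b".FIT":
        raise ValueError("missing .FIT signature")

    # Space left for data, assuming a 2-byte file CRC at the very end.
    available = len(buf) - header_size - 2
    note = "ok"
    if data_size == 0 and available > 0:
        data_size, note = available, "header said 0, estimated from file size"
    elif data_size > available:
        note = "header over-reports by %d bytes, clamped" % (data_size - available)
        data_size = available
    return header_size, data_size, note
```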

    Don't know which device, but if it helps the file_id message contains the following (is manufacturer = 1 Garmin?):

    [1] Global ID: 0 | Message type: file_id | Header: 0/0b00000000
        id:   3 serial_number         : UINT32Z([0]) 
        id:   4 time_created          : UINT32([965038856]) 
        id:   1 manufacturer          : UINT16([1]) 
        id:   2 product               : UINT16([65534]) 
        id:   0 type                  : ENUM([6]) ENUM_NOT_IMPLEMENTED_FOR_GLOBAL_ID_0

  • Oh those are very interesting!

    Yeah, I think a header can only be 12 or 14 bytes by design. But in theory, if one knew for a fact it was a FIT file, you could just scan for global 0 and start from there, although even starting with global 0 is not required, just advised.

    In theory it would also be possible to just scan for anything that looks like a Garmin timestamp (with the +631065600 offset) and try to figure out record positions. Oh, that's another benchmark idea: how fast can you identify all timestamps in a FIT file and figure out the very last one recorded, to correct the "last modified" file timestamp on uploaded/downloaded files? Global 0 has the first timestamp, now find the last.
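
    For the "last modified" part, a small sketch assuming the timestamps have already been decoded into a list of raw FIT integers. The helper names are made up. Skipping values below 0x10000000 is based on my reading of the profile, which treats those as seconds since power-on rather than real dates:

```python
import os
from datetime import datetime, timezone

FIT_EPOCH_OFFSET = 631065600    # 1989-12-31 00:00:00 UTC expressed as Unix time
SYSTEM_TIME_LIMIT = 0x10000000  # assumption: smaller values are "seconds since power-on"

def fit_to_unix(fit_ts):
    return fit_ts + FIT_EPOCH_OFFSET

def fit_to_datetime(fit_ts):
    return datetime.fromtimestamp(fit_to_unix(fit_ts), tz=timezone.utc)

def last_absolute_timestamp(fit_timestamps):
    """Latest timestamp that looks like a real date, or None."""
    absolute = [t for t in fit_timestamps if t >= SYSTEM_TIME_LIMIT]
    return max(absolute) if absolute else None

def fix_mtime(path, fit_timestamps):
    """Set the file's access/modified time to the last recorded timestamp."""
    last = last_absolute_timestamp(fit_timestamps)
    if last is not None:
        unix = fit_to_unix(last)
        os.utime(path, (unix, unix))
```

    As a sanity check, the time_created value 965038856 from the file_id dump above comes out as 2020-07-30 10:20:56 UTC.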

    Something I've noticed is that because the Garmin mass storage uses old FAT32, cross-linked files and other FAT problems can occur over time. That is probably why Garmin support is always telling people to just factory-reset every time there is a problem: the FIT files that control the watch get corrupted.

    (I've been meaning to make a utility to create settings FIT files that can be uploaded to re-formatted watches to restore them. It would be neat if Garmin added the ability to dump all watch settings and values into a FIT on the watch; it can take FIT files to set things but I don't think it exports them.)

    Another thing to detect in FIT decoding is whether timestamps ever get out of sequence. It should never ever happen, except if a file gets cross-linked, like that person with the Boston Marathon corruption. SDK decoders would likely reject the FIT completely at that point or make a crazy map, etc.
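
    The check itself is tiny once the timestamps are decoded. A sketch, assuming a plain list of record timestamps in file order (the function name is just for the example):

```python
def find_out_of_sequence(timestamps):
    """Return (index, previous, current) for every timestamp that goes backwards."""
    problems = []
    for i in range(1, len(timestamps)):
        if timestamps[i] < timestamps[i - 1]:
            problems.append((i, timestamps[i - 1], timestamps[i]))
    return problems
```

    Equal consecutive timestamps are not flagged here, since several messages can legitimately share the same second.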

  • Sorry, badly worded by me on the "11 byte extra"-file. The part that specifies data size in the header consistently reports 11 bytes too much for the bike routes. The header itself has the correct size (yes headers should be 12 or 14 afaik).

    I also noticed the marathon dropbox file is in fact still there, good!

    EDIT: By the way I get to the same point at 2k as one of the posters in the marathon thread with an invalid message header error (set to 255).

    With my current implementation I just read the whole file into a uint8 array and parse that, since even the huge files from the VIRB are less than 20MB. Perhaps I could do some pre-scan for better error mitigation. On the other hand it seems fairly fast to operate on that array if one wants to retrieve timestamps only etc.

    I assume the FAT32 decision is one of the reasons for the VIRB splitting up every recording session into clips (much smaller than the 32-bit 4GB limit though). :/

    Out of sequence timestamps? I have them, but for a very specific VIRB message type called camera_event/161. It logs e.g. start/stop of recording, when the session is split into a new clip etc. Check the fractional timestamp timestamp_ms:

    [23] Global ID: 161 | Message type: camera_event | Header: 11/0b00001011
        id: 253 timestamp             : UINT32([6210]) 
        id:   2 camera_file_uuid      : STRING("VIRBactioncameraULTRA30_Expansive_1920_1440_29.9700_3937280306_344591dc_11_69_2017-10-15-12-02-01.fit") 
        id:   0 timestamp_ms          : UINT16([686]) 
        id:   1 camera_event_type     : ENUM([2]) video_end
        id:   3 camera_orientation    : ENUM([0]) camera_orientation_0
    [24] Global ID: 161 | Message type: camera_event | Header: 11/0b00001011
        id: 253 timestamp             : UINT32([6210]) 
        id:   2 camera_file_uuid      : STRING("VIRBactioncameraULTRA30_Expansive_1920_1440_29.9700_3937280306_344591dc_11_69_2017-10-15-12-02-01.fit") 
        id:   0 timestamp_ms          : UINT16([670]) 
        id:   1 camera_event_type     : ENUM([6]) video_second_stream_end
        id:   3 camera_orientation    : ENUM([0]) camera_orientation_0

    I assume this is not much of an issue in this specific case since one wouldn't use these messages via timestamp, but rather for what they represent. I use these messages to filter data down to a specific recording session.
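
    To make the ordering explicit, the two fields can be folded into one millisecond value. A small sketch using the two messages above, with the dict layout being hypothetical:

```python
def to_millis(msg):
    """Combine timestamp (seconds) and timestamp_ms (fractional part) into milliseconds."""
    return msg["timestamp"] * 1000 + msg.get("timestamp_ms", 0)

# Values copied from messages [23] and [24] above, in file order.
events = [
    {"index": 23, "timestamp": 6210, "timestamp_ms": 686, "camera_event_type": "video_end"},
    {"index": 24, "timestamp": 6210, "timestamp_ms": 670, "camera_event_type": "video_second_stream_end"},
]

# Sorting by the combined value puts message 24 before message 23.
ordered = sorted(events, key=to_millis)
gap_ms = to_millis(events[0]) - to_millis(events[1])
```

    So message [24] was logged 16 ms earlier than [23] even though it comes later in the file.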

    Your settings idea sounds intriguing (I'm only parsing/reading FIT data at this point though).

  • I need to save up and buy a Garmin camera apparently, it produces really interesting files and sensor data.

    I found another nicely corrupt file here from an ironman, my parser doesn't do multi-session yet so it's processing the whole thing as one massive session.

    File corruption, gps loss, watt-meter dropout, it's got everything.

    forums.garmin.com/.../corrupted-fit-file-after-an-ironman-tri

  • The VIRB is pretty cool. Essentially a data hub with a camera, since you can just attach other ant-accessories depending on your needs. My only worry is that GoPro will pass (or already has passed) the current VIRB in terms of being a good camera. Hoping Garmin is at least contemplating a new release, but fitness watches etc are probably where the money is.

    Yes, more files! It parses in full for me, but my parsing is pretty basic and the initial parse is not evaluating anything at all, just getting the data out of there. Lots of unidentified message types - I guess it's time to start a little unofficial database. Maybe I'm just unknowingly skipping over the corrupted bit or the file is ok to parse in terms of alignment etc, but fails at the data evaluation stage (which I'm not doing here).

    I haven't even checked what multi-session does to the data structure...

    Perhaps something is missing, but this is the overview I get for the original file in that thread (there's a fixed version further down). As usual, my results are pretty uninteresting until processed, but I get no parse errors (well, none that my parser reports at least... again, it's pretty basic):

    ---------------------------------------------------
    Header
    
          size: 14
      protocol: 16
       profile: 2095
      datasize: 1036125
        dotfit: ['.', 'F', 'I', 'T']
           crc: Some(18894)
    ---------------------------------------------------
    Data
    
     Global ID | Message type                 | Count
    ...................................................
             0 | file_id                      |      1
            21 | event                        |     18
           233 | UNDEFINED_MESSAGE_TYPE_233   |     14
           104 | UNDEFINED_MESSAGE_TYPE_104   |    129
           140 | UNDEFINED_MESSAGE_TYPE_140   |      3
           141 | UNDEFINED_MESSAGE_TYPE_141   |      1
             7 | zones_target                 |      5
             2 | device_settings              |      1
            19 | lap                          |     46
            18 | session                      |      5
            20 | record                       |  24572
            23 | device_info                  |     60
           113 | UNDEFINED_MESSAGE_TYPE_113   |      5
            49 | file_creator                 |      1
             3 | user_profile                 |      1
           216 | UNDEFINED_MESSAGE_TYPE_216   |     51
            12 | sport                        |      5
           125 | UNDEFINED_MESSAGE_TYPE_125   |      1
            34 | activity                     |      1
            22 | UNDEFINED_MESSAGE_TYPE_22    |    536
           147 | UNDEFINED_MESSAGE_TYPE_147   |      9
            79 | UNDEFINED_MESSAGE_TYPE_79    |      3
            13 | UNDEFINED_MESSAGE_TYPE_13    |      5
    ...................................................
                                        Total:   25473 
    ---------------------------------------------------

    From a super quick glance record/20 seems ok at least, but I guess those of you who are really into the fitness data may find things missing elsewhere. Final record/20 message:

    [24572] Global ID: 20 | Message type: record | Header: 8/0b00001000
        id: 253 timestamp             : UINT32([938018539]) 
        id:   0 position_lat          : SINT32([527986785]) 
        id:   1 position_long         : SINT32([147509926]) 
        id:   5 distance              : UINT32([22617170]) 
        id:  29 accumulated_power     : UINT32([3305444]) 
        id:   2 altitude              : UINT16([2346]) 
        id:   6 speed                 : UINT16([3060]) 
        id:  39 vertical_oscillation  : UINT16([1297]) 
        id:  40 stance_time_percent   : UINT16([3375]) 
        id:  41 stance_time           : UINT16([2470]) 
        id:  83 vertical_ratio        : UINT16([3346]) 
        id:  84 stance_time_balance   : UINT16([5231]) 
        id:  85 step_length           : UINT16([3670]) 
        id:  87 UNDEFINED_FIELD_87    : UINT16([0]) 
        id:  88 UNDEFINED_FIELD_88    : UINT16([300]) 
        id:   3 heart_rate            : UINT8([140]) 
        id:   4 cadence               : UINT8([87]) 
        id:  13 temperature           : SINT8([19]) 
        id:  42 activity_type         : ENUM([1]) ENUM_NOT_IMPLEMENTED_FOR_GLOBAL_ID_20
        id:  53 fractional_cadence    : UINT8([64]) 
        id:  90 UNDEFINED_FIELD_90    : SINT8([0]) 
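
    For reference, a sketch of turning a few of those raw values into physical units. The scale/offset numbers are the ones I remember from the public FIT profile (check them against Profile.xlsx before relying on them), and the dict layout is only for the example:

```python
SEMICIRCLE_TO_DEG = 180.0 / 2**31

# field name -> (scale, offset); physical value = raw / scale - offset
SCALES = {
    "distance": (100, 0),    # metres
    "altitude": (5, 500),    # metres
    "speed": (1000, 0),      # metres per second
}

def convert_record(raw):
    """Convert raw integer fields to degrees / SI units, leaving the rest untouched."""
    out = {}
    for name, value in raw.items():
        if name in ("position_lat", "position_long"):
            out[name] = value * SEMICIRCLE_TO_DEG
        elif name in SCALES:
            scale, offset = SCALES[name]
            out[name] = value / scale - offset
        else:
            out[name] = value
    return out

# Raw values copied from the final record/20 message above.
final_record = convert_record({
    "position_lat": 527986785,
    "position_long": 147509926,
    "distance": 22617170,
    "altitude": 2346,
    "speed": 3060,
    "heart_rate": 140,
})
```

    With those assumptions the final record lands at roughly 44.26 N, 12.36 E with a total distance of about 226.2 km, which at least looks plausible for a full Ironman.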

  • I won't clog up this thread more with this, but it's fascinating, so here's an example of Firstbeat changing watch settings via a crafted FIT file placed into \NEWFILES:

    https://support.firstbeat.com/hc/en-us/articles/360015729193-How-to-Set-on-the-RR-recording-for-Older-Garmin-Devices

    That's all that's in that FIT; it's literally the smallest possible file that does anything:

    file_id (0, type: 0, length: 2 bytes):
      type (0-1-ENUM): settings (2)
    hrm_profile (4, type: 0, length: 2 bytes):
      log_hrv (2-1-ENUM): 1
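
    A sketch of how a file with exactly those two messages could be assembled by hand. This is not Firstbeat's file and I have not tried it on a watch. The 12-byte header, protocol version 0x10 and profile version 2095 are arbitrary picks for the example, and the file CRC is assumed to cover header plus data:

```python
import struct

_CRC_TABLE = (
    0x0000, 0xCC01, 0xD801, 0x1400, 0xF001, 0x3C00, 0x2800, 0xE401,
    0xA001, 0x6C00, 0x7800, 0xB401, 0x5000, 0x9C01, 0x8801, 0x4400,
)

def fit_crc16(data, crc=0):
    """Nibble-table CRC-16 as used by FIT."""
    for byte in data:
        tmp = _CRC_TABLE[crc & 0xF]
        crc = ((crc >> 4) & 0x0FFF) ^ tmp ^ _CRC_TABLE[byte & 0xF]
        tmp = _CRC_TABLE[crc & 0xF]
        crc = ((crc >> 4) & 0x0FFF) ^ tmp ^ _CRC_TABLE[(byte >> 4) & 0xF]
    return crc

ENUM = 0x00  # FIT base type for enum fields

def definition(local_type, global_num, fields):
    """Little-endian definition message; fields = [(field_num, size, base_type)]."""
    out = struct.pack("<BBBHB", 0x40 | local_type, 0, 0, global_num, len(fields))
    for field_num, size, base_type in fields:
        out += struct.pack("<BBB", field_num, size, base_type)
    return out

def build_settings_fit():
    records = b"".join([
        definition(0, 0, [(0, 1, ENUM)]),  # file_id (global 0), field 0 = type
        bytes([0x00, 2]),                  # data, local type 0: type = settings (2)
        definition(0, 4, [(2, 1, ENUM)]),  # hrm_profile (global 4), field 2 = log_hrv
        bytes([0x00, 1]),                  # data, local type 0: log_hrv = 1
    ])
    header = struct.pack("<BBHI4s", 12, 0x10, 2095, len(records), b".FIT")
    body = header + records
    return body + struct.pack("<H", fit_crc16(body))
```

    Each data message is 2 bytes (record header plus one enum byte), which matches the "length: 2 bytes" in the dump above.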

    and I wonder if ALL the watch settings can be backed up and restored that way

    I would bet settings could be at least made that way if not read. Which means you could make a "watch settings" tool for the desktop that just saves to the usb mass storage after factory resets, etc.

    Maybe there is a FIT that can be placed that causes the watch to dump all settings into a resulting FIT

    There is a SETTINGS.FIT in the \SETTINGS folder that may give us hints.

    Maybe it is as simple as backing up that file and after a full format, placing it into NEWFILES

  • I have to investigate this further but when you use the "download heartrate data" feature with the HRM-TRI (and probably the Pro and Swim) apparently the new FIT file that the watch builds from the old one is "corrupt" with trailing garbage in the file.

    But I suspect it's not garbage, maybe some kind of old data it appends, have to study it more closely.
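
    One quick way to look at such a tail, as a sketch: cut the file where the header says it ends and see whether the remainder starts with something header-like. As far as I know the protocol allows FIT files to be chained back to back, so that is worth ruling out first. The function names are invented for the example:

```python
import struct

def split_trailing(buf):
    """Split a buffer into (first_fit, trailing) using the sizes in the header."""
    header_size = buf[0]
    data_size = struct.unpack_from("<I", buf, 4)[0]
    end = header_size + data_size + 2  # +2 for the file CRC
    return buf[:end], buf[end:]

def looks_like_fit(chunk):
    """True when the chunk starts with a plausible FIT header."""
    return len(chunk) >= 12 and chunk[0] in (12, 14) and chunk[8:12] == b".FIT"
```

    If looks_like_fit() is false for the tail, dumping it next to the end of the original file may show whether it is leftover old data.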