First list of FIT decoding libraries and basic benchmarking

Hi,

Just wanted to share: I've started putting together a list of libraries that decode FIT files, along with a simple benchmark of each, since performance is an issue for my own application.

Hopefully it can be useful to others as a reference. Feel free to add other libraries or suggest others to look at.

https://github.com/roznet/fit-benchmarks

  • I need to get my mac vmware emulator running again just to check out your tools, looks fantastic.

  • Yes, parsing performance is quite an issue for me/my app, so I spent some time thinking about how to optimise:

    The fastest approach for me is that of the C SDK: lay out the FIT data in the memory layout of a C structure (you just need to order the fields correctly and give each one the right size), so that reading a field is just a C struct member lookup.

    This is really fast and natural, just a bit of rigorous pointer arithmetic. It beats all the other approaches I tried, but I would love to learn of another one. The only downside is that the code must predefine, in a C structure, all the fields you are interested in. This is what ParsingType.fast does in FitFileParser.
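
    A minimal sketch of the fixed-layout idea, written in Python rather than C or Swift. The record layout and field names below are made up for illustration and are not FitFileParser's actual structures; the point is only that a layout known in advance turns decoding into a single unpack.

```python
import struct

# Hypothetical fixed layout for one data message, little-endian, no padding:
# timestamp (uint32), position_lat (sint32), position_long (sint32), heart_rate (uint8).
RECORD_LAYOUT = struct.Struct("<IiiB")
FIELD_NAMES = ("timestamp", "position_lat", "position_long", "heart_rate")


def decode_fixed(payload: bytes, offset: int = 0) -> dict:
    """Decode one message whose field order and sizes are known in advance."""
    values = RECORD_LAYOUT.unpack_from(payload, offset)
    return dict(zip(FIELD_NAMES, values))


# Build a sample payload and decode it again.
sample = RECORD_LAYOUT.pack(137, 664615144, 157484303, 142)
record = decode_fixed(sample)
```

    Any field that is not part of the predefined layout is simply invisible to this kind of decoder, which is the downside mentioned above.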

    To add flexibility and process any field dynamically, including unknown ones, the code can walk a memory buffer and typecast each location into the corresponding higher-level language type. The code is a bit ugly/scary (take a look at this function), but still quite fast. What slows down the generic approach is the dynamic lookup, for each field's bytes, of what they correspond to; typically this maps the field number to what the field means (depending on your app). In the fast approach, this field mapping is known at compile time and is much faster. This dynamic approach is what ParsingType.generic does in FitFileParser.
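
    For comparison, a rough sketch of the generic idea in Python. It assumes the definition message has already been read into (field_number, size, base_type) triples; the function name is hypothetical, only a subset of the FIT base types is listed, and this is not how FitFileParser implements it.

```python
import struct

# Subset of FIT base types: base type id -> (struct format char, size in bytes).
BASE_TYPES = {
    0x00: ("B", 1),  # enum
    0x01: ("b", 1),  # sint8
    0x02: ("B", 1),  # uint8
    0x83: ("h", 2),  # sint16
    0x84: ("H", 2),  # uint16
    0x85: ("i", 4),  # sint32
    0x86: ("I", 4),  # uint32
}


def decode_generic(payload, field_defs, little_endian=True):
    """Decode a data message from (field_number, size, base_type) triples.

    field_defs mirrors what a definition message provides at run time.
    Returns {field_number: list of values}; array fields give longer lists.
    """
    endian = "<" if little_endian else ">"
    fields = {}
    offset = 0
    for number, size, base_type in field_defs:
        fmt, width = BASE_TYPES[base_type]
        count = size // width  # more than 1 for array fields
        fields[number] = list(
            struct.unpack_from(f"{endian}{count}{fmt}", payload, offset)
        )
        offset += size
    return fields


# timestamp (253), offset_cal (4, three sint32 values), sensor_type (0).
defs = [(253, 4, 0x86), (4, 12, 0x85), (0, 1, 0x00)]
payload = struct.pack("<I3iB", 2, 33, -20, -42, 0)
decoded = decode_generic(payload, defs)
```

    The per-field dictionary lookup and the run-time format string are the kind of overhead that the fixed-layout approach avoids.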

  • That's great - I recognise the street. :)

    Next challenge in Advent of Code FIT Edition: how long was the video recording session that took place during that short "hike"?

    SUMMARY
    ---------------------------------------------------
    Header
    
          size: 14
      protocol: 16
       profile: 2010
      datasize: 101735
        dotfit: ['.', 'F', 'I', 'T']
           crc: Some(11130)
    ---------------------------------------------------
    Data
    
     Global ID | Message type                 | Count
    ...................................................
            21 | event                        |      1
            23 | device_info                  |      3
           219 | UNDEFINED_MESSAGE_TYPE_219   |      1
            49 | file_creator                 |      1
           104 | UNDEFINED_MESSAGE_TYPE_104   |      4
            22 | UNDEFINED_MESSAGE_TYPE_22    |      2
           160 | gps_metadata                 |    512 *
           161 | camera_event                 |      3 *
           209 | barometer_data               |    187
           165 | accelerometer_data           |     53
           210 | one_d_sensor_calibration     |      1
           208 | magnetometer_data            |     53
           164 | gyroscope_data               |     53
           167 | three_d_sensor_calibration   |      4
           162 | timestamp_correlation        |      1 *
             0 | file_id                      |      1
            20 | record                       |    188
    ...................................................
                                        Total:    1068 
    ---------------------------------------------------
    Session time span
    
      Start:    2017-05-29T11:08:34.768
      End:      2017-05-29T11:08:51.068
      Duration: 16s 300ms
    ---------------------------------------------------
    UUIDs in selected session
    
      1. VIRBactioncameraULTRA30_Video_1920_1080_29.9700_3937280306_338eb533_1_44_2017-05-29-13-05-42.fit
    ---------------------------------------------------
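
    The header section of that summary can be reproduced with a few lines. This is a minimal Python sketch of reading the 12- or 14-byte FIT file header, with a hypothetical function name and not the code of the tool above. As a sanity check, 14 (header) + 101735 (data) + 2 (trailing file CRC) = 101751 bytes, which matches the size listed for virb/2017-05-29-13-05-42.fit later in the thread.

```python
import struct


def parse_fit_header(data: bytes) -> dict:
    """Parse the FIT file header (12 bytes, or 14 bytes with a header CRC)."""
    size = data[0]
    # protocol version (uint8), profile version (uint16), data size (uint32), ".FIT"
    protocol, profile, data_size, dotfit = struct.unpack_from("<BHI4s", data, 1)
    header = {
        "size": size,
        "protocol": protocol,
        "profile": profile,
        "data_size": data_size,
        "dotfit": dotfit,
        "crc": None,
    }
    if size >= 14:
        (header["crc"],) = struct.unpack_from("<H", data, 12)
    return header


# Header bytes rebuilt from the values in the summary above.
raw = struct.pack("<BBHI4sH", 14, 16, 2010, 101735, b".FIT", 11130)
header = parse_fit_header(raw)
expected_file_size = header["size"] + header["data_size"] + 2  # + file CRC
```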

  • No indication of temperature being logged in the Fenix6 data linked above, unfortunately. The dev data contains:

    field_name              units
    "Pace"                  "mins/mile"
    "Power"                 "Watts"
    "Cadence"               "RPM"
    "Ground Time"           "Milliseconds"
    "Vertical Oscillation"  "Centimeters"
    "Air Power"             "Watts"
    "Form Power"            "Watts"
    "Leg Spring Stiffness"  "kN/m"
    "Lap Power"             "Watts"

    For accelerometer data, I've set things up so I can get the calibrated accelerometer data on demand - I have plans for those (at least my logic seems to work out and the test values from the sdk correspond to what my algorithm returns - but eh, who knows...). Only thing is that the section in the sdk on calibration seemingly has the orientation_matrix normalised (?) to -1 to 1, so I'm not sure what to expect for other devices. The VIRB logs this in the following way:

    [1] Global ID: 167 | Message type: three_d_sensor_calibration | Header: 4/0b00000100
        id: 253 timestamp             : UINT32([2]) 
        id:   1 calibration_factor    : UINT32([5]) 
        id:   2 calibration_divisor   : UINT32([82]) 
        id:   3 level_shift           : UINT32([32768]) 
        id:   4 offset_cal            : SINT32([22, 1, -48]) 
        id:   5 orientation_matrix    : SINT32([0, -65535, 0, 0, 0, -65535, -65535, 0, 0]) 
        id:   0 sensor_type           : ENUM([1]) gyroscope
    [2] Global ID: 167 | Message type: three_d_sensor_calibration | Header: 4/0b00000100
        id: 253 timestamp             : UINT32([2]) 
        id:   1 calibration_factor    : UINT32([1]) 
        id:   2 calibration_divisor   : UINT32([2048]) 
        id:   3 level_shift           : UINT32([32768]) 
        id:   4 offset_cal            : SINT32([33, -20, -42]) 
        id:   5 orientation_matrix    : SINT32([0, -65535, 0, 0, 0, -65535, -65535, 0, 0]) 
        id:   0 sensor_type           : ENUM([0]) accelerometer
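
    A sketch of how these calibration values might be applied to one raw sample, in Python. It follows my reading of the calibration steps in the SDK (subtract level_shift and offset_cal, scale by factor/divisor, then rotate) and assumes that orientation_matrix is a row-major 3x3 matrix scaled by 65535; both points are assumptions to verify against the SDK, and the function name is hypothetical.

```python
def calibrate_sample(raw, cal):
    """Apply three_d_sensor_calibration values to one raw (x, y, z) sample."""
    scale = cal["calibration_factor"] / cal["calibration_divisor"]
    shifted = [
        (value - cal["level_shift"] - offset) * scale
        for value, offset in zip(raw, cal["offset_cal"])
    ]
    # Assumption: orientation_matrix is row-major 3x3, scaled by 65535.
    m = [entry / 65535 for entry in cal["orientation_matrix"]]
    return [
        sum(m[row * 3 + col] * shifted[col] for col in range(3))
        for row in range(3)
    ]


# Accelerometer calibration message [2] from the listing above.
accel_cal = {
    "calibration_factor": 1,
    "calibration_divisor": 2048,
    "level_shift": 32768,
    "offset_cal": [33, -20, -42],
    "orientation_matrix": [0, -65535, 0, 0, 0, -65535, -65535, 0, 0],
}

# Made-up raw sample: zero on x and y, one full unit on z before rotation.
calibrated = calibrate_sample([32801, 32748, 34774], accel_cal)
```

    With a divisor of 2048 the results come out as multiples of 1/2048, so under these assumptions the matrix only permutes and flips the axes.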

  • So since both of you seem to have experience: what kind of performance should one aim for? My tool currently does a full parse of the Fenix files above (sample.fit + large.fit) in 0.050s + 0.309s respectively on an M1 Mac with the Rust 1.49 beta arm64 compiler. On my old work laptop, a dual-core 13" MacBook Pro, it takes two to three times as long. Virb files seem less demanding, even the largest 19MB one I have (0.373s on M1), since they contain no developer data so far. Basically I parse the fit in a way that incorporates the dev data/field_description, if it exists, into the "normal" definitions (sorry, crappy explanation).

    I also needed to implement filtering of data for a specific recording session for the virb, so finding e.g. the video UUID + start/end of the session is included in the times above (the virb logs outside of video recording, which is great imo).

    It's performant enough that I can traverse an arbitrary path recursively and pair Virb video clips with the correct fit-file on demand (a requirement for what we do). Disk IO is the real bottleneck so far.

    I want to be able to import the Profile.xlsx content automatically for name lookup niceness in the future, but it's not a requirement in my case.

    Most of the dev time goes to finding edge cases so that it accepts an arbitrary fit file (what about a watch file that was corrupt after half the run - I want to salvage as much data as possible - or another where the reported data size in the header *exceeds* the file size, or when it's 0 etc etc - have all of those :P).
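
    One way to handle the "header data size is wrong" cases, as a small Python sketch. The policy here (fall back to everything after the header when the declared size is 0 or larger than what is left in the file) is an assumption for illustration, not necessarily what the tool above does, and the function name is hypothetical.

```python
def usable_data_range(file_len, header_size, declared_size):
    """Return (start, end) byte offsets of the record data to try to parse."""
    start = min(header_size, file_len)
    available = file_len - start
    if declared_size == 0 or declared_size > available:
        # Header cannot be trusted (truncated or never finalised file):
        # try to salvage everything after the header.
        return start, file_len
    return start, start + declared_size


# Well-formed file: the data ends 2 bytes (file CRC) before the end of the file.
ok_range = usable_data_range(101751, 14, 101735)
# Truncated file: the declared size exceeds what is actually there.
truncated_range = usable_data_range(500, 14, 101735)
```

    The parser then still has to stop cleanly when a record is cut off mid-way, since the salvaged range may end inside a message.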

    I'm about to release this tool (cli so nothing fancy, but open source and should work on win, mac, linux) for a somewhat specific purpose (inspecting fit contents in this detail is a side effect) for an academic project as a companion tool for a paper, but I'm frankly a bit scared to release my code, since it'll be the first time I do so. Not that I expect anyone rushing to use it; it's meant for field work and we're all a bit travel constrained currently...

    No fancy stat curves for the athletes etc, it's all about mapping info to location :)

  • What should I measure? I have the following fit files, ranging from a few kb to almost 20MB:

    size_bytes  file
    -------------------------------------------
    2769422     fenix6x/large.fit
    258326      fenix6x/sample.fit
    771         sdk/Activity.fit
    210044      garmin_forerunner245m/2019-07-19-11-56-05[1].fit
    168581      garmin_forerunner245m/2019-07-20-09-46-51[1].fit
    13107688    virb/2019-10-17-18-43-04.fit
    8454450     virb/2017-01-28-05-16-40.fit
    18348422    virb/2017-10-15-12-01-53.fit
    18962970    virb/2017-10-15-12-02-01.fit
    6507863     virb/2018-10-27-16-44-22.fit
    10990525    virb/2017-11-23-01-22-22.fit
    11057368    virb/2018-01-13-22-17-25.fit
    3923562     virb/2018-01-16-10-56-47.fit
    11297268    virb/2018-01-16-11-46-36.fit
    16676470    virb/2018-01-16-16-25-12.fit
    15865772    virb/2018-01-17-16-53-13.fit
    596599      watch_corrupt/84883038.fit
    70445       musette_bicycle/En_langpanna_pa_asen.fit
    50419       musette_bicycle/Musette_Classics_2020.fit
    33224       musette_bicycle/Musetteride_100km.fit
    21197       musette_bicycle/Musetteride_Original.fit
    30716       musette_bicycle/Musetteride_med_Cadel_Evans.fit
    193907      musette_bicycle/Ronde_van_Skane.fit
    48348       musette_bicycle/Simrishamn_-_Malmo_pa_grus.fit
    101751      virb/2017-05-29-13-05-42.fit
    1257        wahoo_bolt/2020-08-29-153628-ELEMNT BOLT 86EC-1-0.fit
    117020      wahoo_bolt/2020-08-29-163117-ELEMNT BOLT 86EC-2-0.fit
    73352       wahoo_bolt/2020-08-30-110504-ELEMNT BOLT 86EC-3-0.fit
    116887      wahoo_bolt/2020-08-30-114810-ELEMNT BOLT 86EC-4-0.fit
    142218      wahoo_bolt/2020-09-01-085429-ELEMNT BOLT 86EC-6-0.fit
    129362      wahoo_bolt/2020-09-01-151421-ELEMNT BOLT 86EC-7-0.fit
    

    Running

    time for f in **/*.fit;my_fittool $f;my_fit_tool $f;end

    (will parse all of the above - fish is my shell btw) returns:

    Executed in    4,79 secs   fish           external 
       usr time    4,29 secs    3,28 millis    4,28 secs 
       sys time    0,38 secs   16,54 millis    0,36 secs

    But I'm not sure this is what you measure? This is doing a full parse on each input file. I'm just reading the whole fit as an array of bytes and iterating over that when parsing. I have set up generic structs, e.g. a "DataMessage"; these are returned as a HashMap with the global id as key and an array of DataMessage structs as value. The default is to return all data messages, but they can be filtered on global id and/or virb recording session during the parse. This returns the raw FIT basetype values (e.g. sint32 for a coordinate in semicircles), so further processing is required before use, which works well in my case.
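
    Roughly what that return structure looks like, sketched in Python instead of Rust. The class and function names are hypothetical, and the filter is applied after parsing here for brevity, whereas the tool filters during the parse.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DataMessage:
    global_id: int
    # field number -> raw FIT base-type values, not yet scaled or converted
    fields: dict = field(default_factory=dict)


def group_by_global_id(messages, only=None):
    """Group messages by global id, optionally keeping only the ids in `only`."""
    grouped = defaultdict(list)
    for message in messages:
        if only is None or message.global_id in only:
            grouped[message.global_id].append(message)
    return dict(grouped)


messages = [
    DataMessage(20, {253: [137]}),
    DataMessage(160, {1: [664615144], 2: [157484303]}),
    DataMessage(20, {253: [138]}),
]
everything = group_by_global_id(messages)
gps_only = group_by_global_id(messages, only={160})
```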

    Sorry, weird Christmas so I'm a bit too talkative (and bored perhaps). Seems we'll have a semi-proper one over here after all, though.

    Merry Christmas!

  • The intention is to measure how long it takes to read the file and convert it into something usable in the language of the library.

    What I measure is the time it takes to read the file and parse it into a structure native to the target language, in whatever form the library chose (not necessarily a common format like CSV or JSON). So in Swift, FitFileParser generates an array of message objects that are each a key -> value dictionary; the PHP library generates a map from key to array of values; FitDataProtocol generates a class for each message; and so on.
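
    In other words, something along these lines, shown as a Python sketch. The harness and its names are illustrative only and are not taken from fit-benchmarks; `parse` stands for whatever entry point the library under test exposes.

```python
import time
from pathlib import Path


def time_read_and_parse(path, parse, repeats=5):
    """Best wall-clock time in seconds to read `path` and run parse(bytes)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        parse(Path(path).read_bytes())  # reading the file is part of the timing
        best = min(best, time.perf_counter() - start)
    return best
```

    Taking the best of several runs reduces noise from disk caching and other processes, though the first (cold) run may be the more honest number if disk IO matters for your use case.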

    Make sense? Which language do you use?

  • That sounds quite fast. In fit-benchmarks the fastest library measured similarly on my MacBook Pro (Intel). I'll try on an M1 MacBook, which I am getting access to later today.

    Once it's released (on GitHub, I assume?), we can add it to the list in fit-benchmarks. Is it also a library or just a CLI tool?

  • So I got access to an M1 MacBook Air and compared the speed of parsing the fit file with my benchmarks there against my MacBook Pro with an Intel i9.

    All the online benchmarks seem to be confirmed: it's definitely faster on the M1!!

  • Yep, the M1 is pretty nice so far, isn't it? :)

    I did a crappy timed test for a few files of varying size and content. The runs for ALL: FULL PARSE return a hash map/dict with the global_id as the key; the values require further processing (since Rust is statically typed, I'm returning an enum where each member corresponds to a FIT basetype and has the values "containerised" - sorry, bad at the proper lingo).

    The gps_metadata section is a bit dishonest in that it's not fully converted to decimal degrees etc, but just re-structured for easier access. The values are still e.g. sint32/semicircles for longitude, but accessible as e.g. point.longitude. Calibrated three_d_sensor data does go through processing, however, and *should* return values in the form Profile.xlsx states for calibrated_x etc (needs further testing, but somewhat checks out)...

    So this is Rust v1.49 beta, compiled natively for M1 (may fluctuate 5-10% between runs, I didn't take an average):

    ALL: FULL PARSE, EXTRACT ALL MESSAGES (RETURNS FIT BASETYPES)
    | Time   | File size      | File name
    +--------+----------------+-----------------
    | 0.245s |  2769422 bytes | fenix6x/large.fit
    | 0.023s |   258326 bytes | fenix6x/sample.fit
    |  0.01s |   193907 bytes | musette_bicycle/Ronde_van_Skane.fit
    | 0.011s |   142218 bytes | wahoo_bolt/2020-09-01-085429-ELEMNT BOLT 86EC-6-0.fit
    | 0.016s |   210044 bytes | garmin_forerunner245m/2019-07-19-11-56-05[1].fit
    | 0.332s | 18962970 bytes | 2017-10-15-12-02-01.fit
    | 0.001s |   101751 bytes | 2017-05-29-13-05-42.fit
    
    VIRB: EXTRACT AND PROCESS GPS METADATA
    | Time   | File size      | Points         | File name
    +--------+----------------+----------------+-------------------------
    | 0.002s |   101751 bytes |   581 points   | 2017-05-29-13-05-42.fit
    | 0.362s | 18962970 bytes | 60114 points   | 2017-10-15-12-02-01.fit
    
    VIRB: EXTRACT, CALIBRATE THREE_D_SENSOR METADATA (ACCELEROMETER)
    | Time   | File size      | 3D messages    | File name
    +--------+----------------+----------------+-------------------------
    | 0.002s |   101751 bytes |    53 messages | 2017-05-29-13-05-42.fit
    | 0.362s | 18962970 bytes | 20405 messages | 2017-10-15-12-02-01.fit

    Example "processed" gps_metadata:

    2017-05-29-13-05-42.fit
    GpsMetadata {
        timestamp: 137,
        timestamp_ms: 990,
        latitude: 664615144,
        longitude: 157484303,
        altitude: 3064,
        speed: 0,
        heading: 0,
        utc_timestamp: 864990480,
        velocity: [ -3, 34, 0],
    }
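
    The remaining conversion for these values is small. A Python sketch, assuming the usual FIT conventions (positions in semicircles, timestamps in seconds since 1989-12-31T00:00:00Z); the function names are made up.

```python
from datetime import datetime, timedelta, timezone

FIT_EPOCH = datetime(1989, 12, 31, tzinfo=timezone.utc)
SEMICIRCLE_TO_DEG = 180 / 2**31


def semicircles_to_degrees(value: int) -> float:
    """Convert a sint32 semicircle value to decimal degrees."""
    return value * SEMICIRCLE_TO_DEG


def fit_timestamp_to_utc(seconds: int) -> datetime:
    """Convert seconds since the FIT epoch to a timezone-aware UTC datetime."""
    return FIT_EPOCH + timedelta(seconds=seconds)


# Values from the GpsMetadata example above.
latitude = semicircles_to_degrees(664615144)
longitude = semicircles_to_degrees(157484303)
utc = fit_timestamp_to_utc(864990480)
```

    For the example above this gives a position of roughly 55.707, 13.200 and a UTC time of 2017-05-29T11:08:00, which lines up with the session start in the earlier summary.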

    Example calibrated x, y, z three_d_sensor data, accelerometer:

    2017-10-15-12-02-01.fit
    CALIBRATED_X: [-0.021484375, -0.01123046875, 0.0009765625, 0.0068359375, 0.00390625, -0.0029296875, -0.0107421875, -0.0126953125, -0.0068359375, -0.0068359375, -0.0126953125, -0.0146484375, -0.013671875, -0.0126953125, -0.01513671875, -0.017578125, -0.01513671875, -0.00830078125, -0.00048828125, 0.017578125, 0.02392578125, 0.00732421875, -0.0234375, -0.03955078125, -0.03369140625, -0.01904296875, -0.0087890625, -0.00634765625, -0.00830078125, -0.0107421875]
    CALIBRATED_Y: [-0.02392578125, -0.03076171875, -0.03173828125, -0.02734375, -0.02099609375, -0.0166015625, -0.01611328125, -0.01806640625, -0.02197265625, -0.02490234375, -0.0263671875, -0.02587890625, -0.02392578125, -0.0224609375, -0.02294921875, -0.02197265625, -0.0205078125, -0.0185546875, -0.01708984375, -0.01953125, -0.01953125, -0.01708984375, -0.01513671875, -0.01611328125, -0.0185546875, -0.0205078125, -0.02197265625, -0.025390625, -0.02880859375, -0.029296875]
    CALIBRATED_Z: [-1.0068359375, -1.01123046875, -1.0068359375, -0.99755859375, -0.99169921875, -0.994140625, -1.0009765625, -1.00048828125, -0.9931640625, -0.9921875, -0.99462890625, -0.99462890625, -0.9931640625, -0.99169921875, -0.99755859375, -1.01025390625, -1.017578125, -1.0126953125, -0.99755859375, -0.98388671875, -0.97509765625, -0.96875, -0.974609375, -0.98681640625, -0.99560546875, -1.0, -1.00439453125, -1.0107421875, -1.0166015625, -1.01611328125]