FitCSVTool can't recreate original FIT file, is that really true?

Sorry if this is an obvious question, but I was exploring FitCSVTool to convert between FIT and CSV formats and I naively assumed that converting from FIT to CSV and then back again would result in a file identical to the original FIT file.

However, the regenerated FIT file is missing all the "unknown" fields and messages.

It would seem to be fairly trivial for FitCSVTool to write a hex value for "unknown" so that the conversion would work. Is there some reason why this isn't done? Is there an alternative to FitCSVTool that works as I would have expected?

  • Replying to myself:

    After reading more of the source code and other resources it is clear that FitCSVTool is indeed not a complete conversion tool.

    I can see that Garmin wants to keep some fields/messages proprietary, which is fair enough. Having the ability to easily create FIT files with these fields/messages in them could lead to all sorts of mess. E.g., I was thinking of editing some heart rate data where the sensor clearly went haywire for a few minutes. However, this would likely invalidate some proprietary fields/messages that depended on the sensor data at the time the original file was created (e.g., to calculate VO2 max, or time to recover).

    Still, I have a couple of suggestions:

    (1) Make the above information explicit in the documentation for FitCSVTool, and

    (2) Consider opening up some of the unknown fields. E.g., it is widely known how to extract VO2 max from message type 140, so why not include this in the SDK?

  • Using one of the SDKs that support both reading and writing files (C, C#, C++, Java, Objective-C), a FIT file can be decoded, modified, and then re-encoded without losing any of the proprietary data.

  • Thanks for your reply. I might look into modifying FitCSVTool to do this.

  • Hello.

    About half a year ago I was probably doing the same thing the OP is doing now. Maybe my story will help.

    Before starting to develop anything with the Java FIT SDK, I wanted to be sure that decoding and encoding are lossless. In other words, I expected encode(decode(fit)) = fit, as is true for Base64, Deflate, etc. After some digging I found that this is not 100% possible with the provided SDK, because com.garmin.fit.FileEncoder#writeFileHeader writes the current SDK version into the header, so those bytes, as well as the arch (big/little endian) and the checksum, will differ in any case.
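
    For anyone who wants to see which header bytes are involved, here is a minimal sketch that reads the fixed FIT file header. It assumes the standard 12- or 14-byte little-endian header layout from the FIT protocol; FitHeaderSketch is a hypothetical helper written for illustration, not an SDK class, and the sample bytes in main are made up.

    ```java
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    import java.nio.charset.StandardCharsets;

    /** Hypothetical helper (not part of the FIT SDK): reads the fixed FIT file header. */
    final class FitHeaderSketch {
        final int headerSize;       // byte 0: 12 or 14
        final int protocolVersion;  // byte 1: major version in the high nibble
        final int profileVersion;   // bytes 2-3: uint16, changes with the SDK/profile release
        final long dataSize;        // bytes 4-7: uint32, record bytes excluding header and CRC
        final String dataType;      // bytes 8-11: ASCII ".FIT"
        final int headerCrc;        // bytes 12-13: uint16, only in 14-byte headers; -1 if absent

        FitHeaderSketch(byte[] file) {
            if (file.length < 12) {
                throw new IllegalArgumentException("too short for a FIT header");
            }
            // The file header is always little endian, whatever the records use.
            ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
            headerSize = buf.get(0) & 0xFF;
            protocolVersion = buf.get(1) & 0xFF;
            profileVersion = buf.getShort(2) & 0xFFFF;
            dataSize = buf.getInt(4) & 0xFFFFFFFFL;
            dataType = new String(file, 8, 4, StandardCharsets.US_ASCII);
            headerCrc = (headerSize >= 14 && file.length >= 14) ? buf.getShort(12) & 0xFFFF : -1;
        }

        int protocolMajor() {
            return protocolVersion >> 4;
        }

        public static void main(String[] args) {
            // Made-up sample header, only to show the call.
            byte[] sample = {14, 0x20, 0x54, 0x08, 0x00, 0x01, 0x00, 0x00, '.', 'F', 'I', 'T', 0x34, 0x12};
            FitHeaderSketch h = new FitHeaderSketch(sample);
            System.out.println("profile version " + h.profileVersion + ", data size " + h.dataSize);
        }
    }
    ```

    Parsing the header of original.fit and of reencoded.fit this way should show whether the two files differ only in the version and CRC bytes or also in the data size.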

    I made some dirty hacks in the SDK to fix these differences, at least for my test FIT file, and ran into the next problem: there is no way to get FIT messages out of com.garmin.fit.FitMessages in their original order, so accurate encoding is again not possible. I modified FitMessages by adding one more list of messages and updated com.garmin.fit.FitListener#onMesg to store incoming messages in this list, then later read them back and encode them. So my code started to look like this:

    FitMessages messages = new FitDecoder().decode(new FileInputStream("original.fit"));
    
    FileEncoder encoder = new FileEncoder(new File("reencoded.fit"), V2_0);
    encoder.write(messages.getMessages()); // Non-SDK field is used here
    encoder.close();
    

    After that, reencoded.fit is still about 10 KB smaller than original.fit. After converting both files with FitCSVTool using the -s -iso8601 -b flags, the resulting CSV file for the reencoded FIT is 1.5 MB smaller, because lots of fields in messages are missing, empty, or have a different value. I think the reason is that the classes that extend com.garmin.fit.Mesg do not contain all existing fields from the messages and write only the fields that are known to the FIT SDK. I heavily debugged the com.garmin.fit.Decode#resume and com.garmin.fit.Decode#read(byte) methods, where the byte parsing occurs, and added some code here and there to store the original com.garmin.fit.MesgDefinition, which has all the field definitions read from the original file, along with all fields (ignoring getNumValues). During writing I use these stored values instead of message definitions created from the parsed message (which has already lost its unknown fields).

    With all that mess I've made of the SDK, I can now write about 5 or 6 messages out of 20k (I am close, ha-ha) byte-for-byte identically to how they are coded in the original file. Any attempt to fix the writing of the next message causes the encoding of previous messages to break (bytes begin to differ). Maybe it is because com.garmin.fit.Decode#localMesgDefs is too small. I also remember some problems with message definitions that share the same num and localNum but have different content.
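
    In case it helps anyone repeating this experiment, the byte-for-byte check itself can be sketched in a few lines. This is only an illustration, assuming Java 9+ for Arrays.mismatch; ByteDiffSketch and its method names are hypothetical and have nothing to do with the SDK.

    ```java
    import java.util.Arrays;

    /** Hypothetical helper: locates the first byte at which two encodings differ. */
    final class ByteDiffSketch {

        /** Returns the offset of the first differing byte, or -1 if the arrays are identical. */
        static int firstMismatch(byte[] original, byte[] reencoded) {
            return Arrays.mismatch(original, reencoded);
        }

        /** Describes the first difference in a human-readable form. */
        static String describe(byte[] original, byte[] reencoded) {
            int offset = firstMismatch(original, reencoded);
            if (offset < 0) {
                return "identical";
            }
            // If one array is a prefix of the other, the shorter one has no byte at this offset.
            String left = offset < original.length
                    ? String.format("%02X", original[offset] & 0xFF) : "EOF";
            String right = offset < reencoded.length
                    ? String.format("%02X", reencoded[offset] & 0xFF) : "EOF";
            return String.format("first difference at offset %d: %s vs %s", offset, left, right);
        }

        public static void main(String[] args) {
            // Made-up byte arrays standing in for the two files.
            byte[] a = {0x0E, 0x20, 0x54, 0x08};
            byte[] b = {0x0E, 0x20, 0x5A, 0x08};
            System.out.println(describe(a, b));
        }
    }
    ```

    With real files, the arrays would come from something like java.nio.file.Files.readAllBytes on original.fit and reencoded.fit.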

    For now I've stopped there. I understand that this is a terrible way of using the SDK, and I probably should have dived deeper into the documentation first, but whatever happened, happened.

    I'd like to learn how to use the original SDK to reencode FIT files without data loss. Thanks.

  • Hi 9811827, thanks for that summary of your investigations. My gut feeling is that there should be some elegant way to patch the Java FIT SDK to read/write hex values whenever an "unknown" field/message appears, but perhaps I am wrong. Perhaps adding dummy enums for the unknown fields/messages would be a start?
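
    To make the hex idea concrete, here is a rough sketch of the round trip I have in mind: the raw bytes of an unknown field are written to a CSV cell as hex and read back unchanged. HexFieldSketch is a hypothetical helper and is not tied to any SDK class; where the SDK would have to call it is exactly the part I have not worked out.

    ```java
    /** Hypothetical helper: lossless hex round trip for the raw bytes of an unknown field. */
    final class HexFieldSketch {
        private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

        /** Encodes raw field bytes as an upper-case hex string, two digits per byte. */
        static String toHex(byte[] raw) {
            StringBuilder sb = new StringBuilder(raw.length * 2);
            for (byte b : raw) {
                sb.append(DIGITS[(b >> 4) & 0xF]).append(DIGITS[b & 0xF]);
            }
            return sb.toString();
        }

        /** Decodes a hex string back into the original bytes. */
        static byte[] fromHex(String hex) {
            if (hex.length() % 2 != 0) {
                throw new IllegalArgumentException("odd number of hex digits");
            }
            byte[] raw = new byte[hex.length() / 2];
            for (int i = 0; i < raw.length; i++) {
                int hi = Character.digit(hex.charAt(2 * i), 16);
                int lo = Character.digit(hex.charAt(2 * i + 1), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("not a hex digit");
                }
                raw[i] = (byte) ((hi << 4) | lo);
            }
            return raw;
        }

        public static void main(String[] args) {
            // Made-up field bytes, only to show the round trip.
            byte[] field = {(byte) 0xFF, (byte) 0xFF, 0x00, 0x1A};
            String cell = toHex(field);
            System.out.println(cell + " -> " + fromHex(cell).length + " bytes");
        }
    }
    ```

    Since the bytes are copied verbatim, the byte order recorded in the message definition would stay valid without the tool needing to know what the field means.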

    I am surprised that there doesn't appear to be a good open-source alternative to FitCSVTool. There are quite a few similar things on GitHub, etc., but I haven't yet found a simple encoder/decoder that generates byte-identical files.

  • If you want to recode a FIT file, then you do not want to use the FitDecoder and FitMessages helper classes. Those are there for convenience when the goal is to decode and process the data in the file.

    With the Java SDK you can connect the decoder to an encoder and that is about it. 

    In pseudo code, but should be close...

    FileEncoder encoder = new FileEncoder(file, Fit.ProtocolVersion.V2_0); // This could be a BufferEncoder too
    Decode decoder = new Decode();
    decoder.addListener((MesgListener) encoder); // every decoded message goes straight to the encoder

    try {
        decoder.read(inputStream);
        inputStream.close();
    }
    catch (FitRuntimeException e) {
        // decoding failed; handle or report as appropriate
    }
    catch (java.io.IOException e) {
        // closing the input stream failed
    }

    try {
        encoder.close();
    }
    catch (FitRuntimeException e) {
        // closing the output file failed
    }

    That will not get you an exact binary copy of the file, but the data in the output file will be identical. The header will be different, and the Java SDK will strip out invalid values, which are different from unknown values. Otherwise all valid data will be in the output file and the messages will be in their original order.

    If you want to modify messages, then you need to create a class that implements both the message listener and message source interfaces, and slot this class between the decoder and the encoder. 

    import java.util.ArrayList;
    import com.garmin.fit.Mesg;
    import com.garmin.fit.MesgListener;
    import com.garmin.fit.MesgSource;

    public class IdentityTransform implements MesgSource, MesgListener {
        private final ArrayList<MesgListener> mesgListeners;

        public IdentityTransform() {
            mesgListeners = new ArrayList<MesgListener>();
        }

        // Forward each incoming message, unchanged, to every registered listener.
        public void onMesg(Mesg mesg) {
            for (MesgListener mesgListener : mesgListeners) {
                mesgListener.onMesg(mesg);
            }
        }

        public void addListener(MesgListener mesgListener) {
            if ((mesgListener != null) && !mesgListeners.contains(mesgListener)) {
                mesgListeners.add(mesgListener);
            }
        }
    }

    You could buffer the incoming messages, mess with them, and then broadcast them. For an example of that, look at the ActivityRepairFilter in the Java SDK.


  • Hi Ben,

    Thanks for taking the time to explain this! I have looked through this forum and seen your very helpful and patient replies on numerous threads. Greatly appreciated.

  • One issue with this approach is that all the editing/modifying of messages is done in Java, whereas the advantage of doing it with the CSV file is that the file can be edited with any old text editor. So I still like the idea of having a CSV translation of a FIT file that would contain all the information needed to recreate the original file.

  • Thanks for the reply.

    However, with this code I get exactly the same results that I had with my modified SDK:

    reencoded.fit is still about 10 KB smaller than original.fit. After converting both files with FitCSVTool using the -s -iso8601 -b flags, the resulting CSV file for the reencoded FIT is 1.5 MB smaller, because lots of fields in messages are missing, empty, or have a different value.

    You wrote: "the data in the output file will be identical" and "the Java SDK will strip out invalid values, which is different than unknown values". Yet the CSV file converted from the reencoded FIT is missing several fields:

    // original
    Definition,0,file_id,serial_number,1,,time_created,1,,unknown,1,,manufacturer,1,,product,1,,number,1,,type,1,,
    Data,0,file_id,serial_number,"my-serial-number",,time_created,"my-fit-time",,unknown,"4294967295",,manufacturer,"1",,garmin_product,"my-part-number",,number,"65535",,type,"4",,
    
    // reencoded
    Definition,0,file_id,serial_number,1,,time_created,1,,manufacturer,1,,product,1,,type,1,,
    Data,0,file_id,serial_number,"my-serial-number",,time_created,"my-fit-time",,manufacturer,"1",,garmin_product,"my-part-number",,type,"4",,
    

    I want to keep ALL fields and values, including FFFFFFFF = 4294967295 and FFFF = 65535. Are these values considered invalid, and is that why their fields are stripped?

    I would expect that a file created by a modern Garmin device (not from a third-party source) does not contain invalid fields or values that need to be removed.

    Can you confirm that this is the expected behavior? Thanks.

  • What you are seeing is the expected behavior. The Java SDK dynamically creates message definitions based on the fields with valid data. That is why fields with invalid values are being removed in the output file. The devices, on the other hand, use static message definitions, which means there needs to be a way to convey whether a value is valid or not.
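
    As a rough illustration of that difference (not the SDK's actual code), the sketch below models a static file_id layout like the one posted above and keeps only the fields whose value is not the "invalid" sentinel for its base type. The sentinels are the ones the FIT protocol defines for these base types; the class and method names, the base type assumed for the unknown field, and the sample values are all hypothetical.

    ```java
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    /** Hypothetical model of how dropping invalid values shrinks a message definition. */
    final class InvalidValueSketch {

        /** A few FIT base types with their "invalid" sentinel values. */
        enum BaseType {
            ENUM(0xFFL), UINT8(0xFFL), UINT16(0xFFFFL), UINT32(0xFFFFFFFFL), UINT32Z(0x0L);

            final long invalid;

            BaseType(long invalid) {
                this.invalid = invalid;
            }
        }

        /** Stand-in for one decoded field; not an SDK class. */
        static final class Field {
            final String name;
            final BaseType type;
            final long value;

            Field(String name, BaseType type, long value) {
                this.name = name;
                this.type = type;
                this.value = value;
            }

            boolean isValid() {
                return value != type.invalid;
            }
        }

        /** Names of the fields a definition built only from valid data would contain. */
        static List<String> dynamicDefinition(List<Field> staticLayout) {
            List<String> names = new ArrayList<>();
            for (Field f : staticLayout) {
                if (f.isValid()) {
                    names.add(f.name);
                }
            }
            return names;
        }

        public static void main(String[] args) {
            // Sample values are made up; the unknown field is assumed to be a uint32.
            List<Field> fileId = Arrays.asList(
                    new Field("serial_number", BaseType.UINT32Z, 123456L),
                    new Field("time_created", BaseType.UINT32, 1000000000L),
                    new Field("unknown", BaseType.UINT32, 4294967295L),
                    new Field("manufacturer", BaseType.UINT16, 1L),
                    new Field("product", BaseType.UINT16, 2697L),
                    new Field("number", BaseType.UINT16, 65535L),
                    new Field("type", BaseType.ENUM, 4L));
            System.out.println(dynamicDefinition(fileId));
        }
    }
    ```

    In this model the unknown field (4294967295) and number (65535) drop out, which matches the fields missing from the reencoded CSV above.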