Basic data compression to save watch memory use.

Good morning everyone, 

In one of my applications I have a very large array of float values (up to around 150 of them). I would like to convert each float into a 4-digit value (always with two decimal places, always less than 99, and always stored as an integer), and then combine several of these values into a single number.

For example [1.2344667,24.565544,32.342354,77.3424234]   would become 0123245732347734

My array of 150 float values would then become an array of 33 numbers, and I could have lots of these arrays.

I need to retain accuracy, so my question is: how many of these 4-digit, unsigned, whole numbers should I be able to group together and store in a single unsigned integer?

I might have, for example, 200 arrays each with 150 values, and the watch memory cannot cope with that number of values.
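The core question, how many small integers fit in one 64-bit word, comes down to bits per value. A sketch in Python (illustrative only; the watch code would be Monkey C, and `capacity` is a made-up helper name):

```python
# How many fixed-size values fit in one 64-bit unsigned word?
def capacity(max_value, word_bits=64):
    bits = max_value.bit_length()   # bits needed to hold max_value
    return word_bits // bits

print(capacity(9999))   # any 4-digit value needs 14 bits -> 4 per word
print(capacity(3999))   # values below 4000 need 12 bits -> 5 per word
```

With a signed 64-bit long only 63 bits are usable, but 5 x 12 = 60 bits still fits.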

  • You'll need to add the code that uncompresses them, and the uncompressed values will also be in memory, so I'm not sure how much you'll really save. Do you want to be able to keep only one array of 150 floats uncompressed in memory, and as many of these arrays as possible compressed?

    You might also be able to put them as JSON in the resources, so they are not in memory, and load them on demand. That might save you lots of coding of this "decompressor".

    But if you really want to compress, then what you can do is either BCD (https://en.wikipedia.org/wiki/Binary-coded_decimal) or maybe an even slightly more compact fixed-point fractional representation (https://en.wikipedia.org/wiki/Fixed-point_arithmetic).
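As an illustration of the BCD idea above, a sketch in Python (illustrative only; the helper names are my own, and the watch code would be Monkey C):

```python
def to_bcd(n):
    """Pack a 4-digit number into 16 bits, one decimal digit per nibble."""
    assert 0 <= n <= 9999
    out = 0
    for shift in (12, 8, 4, 0):
        digit = n // 10 ** (shift // 4) % 10   # extract one decimal digit
        out |= digit << shift
    return out

def from_bcd(b):
    """Reverse of to_bcd."""
    return sum(((b >> shift) & 0xF) * 10 ** (shift // 4)
               for shift in (12, 8, 4, 0))

print(hex(to_bcd(1234)))   # -> 0x1234 (each nibble is one decimal digit)
print(from_bcd(0x4575))    # -> 4575
```

At 16 bits per value, four such values fit in a 64-bit long.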

  • Or save them as byte arrays: two bytes for something like 123.45 (byte 1 is 123, byte 2 is 45). Not much work to save into the array or restore. 150 values take 300 bytes, vs. 600 bytes for 150 floats.
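The two-bytes-per-value scheme above can be sketched as follows (Python for illustration; `encode`/`decode` are made-up names, and values are assumed non-negative and below 256):

```python
def encode(values):
    """Store each value with 2 decimal places as two bytes: whole part, fraction."""
    out = bytearray()
    for v in values:
        whole = int(v)
        frac = round((v - whole) * 100)
        if frac == 100:                  # rounding carried over, e.g. 1.999
            whole, frac = whole + 1, 0
        out += bytes([whole, frac])
    return bytes(out)

def decode(data):
    """Rebuild the float values from consecutive byte pairs."""
    return [data[i] + data[i + 1] / 100 for i in range(0, len(data), 2)]

print(decode(encode([123.45, 1.23])))  # -> [123.45, 1.23]
```

Each value costs exactly 2 bytes, so 150 values take 300 bytes.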

  • Well... this is what I was thinking:

    Current data example: 0.000000, -0.000143, 0.000803, 0.003856, 0.009837, 0.019554, 0.033733, 0.053033, 0.077965, 0.109072, 0.146944, 0.192127, 0.244890, 0.305228, 0.372911, 0.447520, ...

    Multiply by 100 and make a 4-digit integer. **Original value will always be less than 40.00**

    0000,0000,0000,0000,0001,0002,0003,0005,0008,0011,0015,0019,0023,0030,0037,0045

    Combine 5 x integers below 4000 into a long integer

    0000 0000 0000 0000 0001,0002 0003 0005 0008 0011,00150019002300300037,    0045

    The array was 16 floats; it will now be 4 longs. Typically most of my array data will be 60 to 100 values, which should reduce to arrays of 12 to 20 long integers.

    I can then fairly easily convert that back to the original values in the original order. But I may have a single array on the watch that contains up to 300 of these smaller arrays (of size 12 to 20).

    For my original values I don't care about + or -, and I am only interested in preserving 2 decimal places; the biggest original value will be less than 40. That means I should be able to jam 5 sets of 4-digit numbers, each below 4000, into 60 bits, so within the long-integer size limit.

    I am only storing this data so it can be sent to a website when the watch has connectivity, and then parsed on the web server. My applications are very memory-intensive already. Is this strategy reasonable, or is there a better way, or a way that does not count as app memory use when not in contact with WiFi or BLE (stored on the watch somehow)?

    Also, checking whether my math is correct for calculating how many numbers I can fit into a 64-bit long: 2 to the power of 12 is 4096, so I assumed that as long as my numbers are less than 4000, each individual integer would take up a max of 12 bits, and the limit is 64 bits for a long, so I can fit 5 numbers contiguously in a long int?

    Or is it 4 max, since the value must be less than 2 to the power of 63?

  • If you want to preserve numbers greater than 32.00 then you'll need 6 bits for the integer part; if you won't have more than 31, then 5 bits would be enough. But the gain from shrinking by 1 bit is probably not worth the extra code you would need to deal with it.

    I think your math is OK
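The 12-bits-per-value packing this math describes can be sketched like this (Python for illustration; Monkey C would use the same shifts and masks on a Long, and `pack`/`unpack` are made-up names):

```python
BITS = 12                     # values below 4096 fit in 12 bits
PER_LONG = 64 // BITS         # 5 values per 64-bit word

def pack(values):
    """Pack up to PER_LONG integers below 4096 into one integer."""
    word = 0
    for v in values:
        assert 0 <= v < (1 << BITS)
        word = (word << BITS) | v
    return word

def unpack(word, count):
    """Recover `count` values in their original order."""
    mask = (1 << BITS) - 1
    return [(word >> (BITS * i)) & mask for i in range(count - 1, -1, -1)]

print(unpack(pack([0, 1, 2, 3999, 45]), 5))  # -> [0, 1, 2, 3999, 45]
```

Five values use only 60 bits, so the sign bit of a signed 64-bit long is never touched.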


  • Max number for a signed 64-bit integer is 9,223,372,036,854,775,807. Note it is 19 digits, and thus you are not going to fit 5 x 4-digit numbers into it. Plus, how are you going to deal with a negative number in the middle of your sequence? Start another long?
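The digit-count argument above is easy to check numerically (Python for illustration): concatenating numbers in base 10 costs 4 decimal digits per slot, and a signed 64-bit value only has 19 digits to offer.

```python
MAX_SIGNED_64 = 2 ** 63 - 1          # 9,223,372,036,854,775,807

print(len(str(MAX_SIGNED_64)))       # -> 19 digits available
print(10 ** 16 - 1 <= MAX_SIGNED_64) # 4 slots = 16 digits -> True, fits
print(10 ** 20 - 1 <= MAX_SIGNED_64) # 5 slots = 20 digits -> False, overflows
```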

  • With the source data I don't care about the sign, so I will convert to the absolute value (all positive) and multiply by 100 to remove the decimal places, because I only care about the original 2 decimal places. This gives me a 4-digit whole number, which is accurate enough for my purpose. I'm unfamiliar with storing as a JSON resource. Does it have persistence after the application on the watch is closed?

  • So the byte array is going to take up less space than shrinking a 150-float array down to a 34-element long int array?

  • Guys, Ultra and Nick, you both seem to be new to programming. You're mixing up the base-10 digits of a number with the bits of a Long. To represent a digit in BCD you only need 4 bits (and there are more compact ways), so 4 digits are 16 bits, and you can fit 4 numbers like that into a 64-bit Long. However, it's not that simple, as holding one Long in memory will use more than 64 bits (it can also be null).

    Using a byte array might give you better compression, because there you use all the bits (but it still has some overhead for the ByteArray object itself).

    However, from your questions it seems to me that this is what we call premature optimization. Start by developing the main feature; then, when it works, think about fitting more data into it. By that time you might be more familiar with basic programming concepts and Monkey C as well.

    To me it looks like the gain is very small compared to the hassle. But the best way to learn Monkey C is to try things out. You could create multiple compression schemes, then feed in the same data and see how each performs. Note that you'll have to look at the memory manager, because neither the mc file size nor the prg or code size alone will give you real data; you only get that when you see it working in the simulator (the decompressor code takes space, uses memory, etc.).

    Another thing to consider is how this will be used. Decompression takes time and resources, so take that into account.

  • I wasn't referring to BCD and wasn't confused. But let's say you were referring to BCD. Nick was talking about 5 integers of 4 digits into a Long. Each 4-digit number would occupy 16 bits (in BCD), and if you multiply that by 5 you'll see it comes to 80 bits. You can't fit a quart into a pint pot, as they say. Are you new to maths?
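The disagreement in the last few replies comes down to bits per value under each scheme, which can be tabulated quickly (Python for illustration): BCD spends 4 bits per decimal digit, while plain binary packing of values below 4000 needs only 12 bits per value.

```python
bcd_bits = 4 * 4                       # 4 decimal digits, one nibble each
binary_bits = (4000 - 1).bit_length()  # largest value that must fit

print(bcd_bits, 64 // bcd_bits)        # -> 16 bits each, 4 values per long
print(binary_bits, 64 // binary_bits)  # -> 12 bits each, 5 values per long
```

So both replies are right under their own scheme: 5 x 16 = 80 bits will not fit in a 64-bit Long, while 5 x 12 = 60 bits will.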