How to store a database?

Hey, I’m trying to make an app that needs to be offline. How can I store the database for it? (30 KB text file)

What I know is:

  • I can’t store it as a string or array because it’s too long
  • I can’t store it on the file system because??
  • I can’t store it as a resource because it’s not a picture or font

Is there any possible way to do this?

(I only need to load one line at a time, I’d like to search in it, basically)

  • TL;DR In my experience, as far as memory efficiency goes, I think that JSON resources are always better than hardcoded data (due to the huge overhead of the code to initialize the data), provided your app already calls loadResource() once for reasons other than loading JSON resources. This is because calling loadResource() at least once incurs the additional (permanent) RAM overhead of the resource table (which is roughly 1 KB in one of my apps which has a few dozen string resources for settings).

    If your app doesn't load other resources, then it's a toss-up depending on how big your data is and how many resources you have. (This is especially an issue if your app has settings, since the strings for app settings will take up the bulk of your resource table if you are not using any resources at runtime.)

    Also, there's no rule that says JSON data has to be a dictionary. As the docs point out, it can also be a primitive, array, or mix of data. I have an app which uses array JSON data.

    In this case, for data which is 8 KB * 19 in size, it seems like JSON data is a clear winner.

    -

    More details:

    Here's my 2 cents on the JSON resource vs. hardcoded data structure question, as far as memory usage goes. (I've used both approaches in the same memory-constrained data field app, since it supports old devices which don't support JSON resources, and I looked closely at the memory impact.)

    - JSON data can consist of arrays as well as dictionaries, so the overhead of a dictionary vs. an array doesn't seem to be particularly relevant here (other than the fact that you will be adding at least 1 entry to the object which represents the resource table - more on that later *)

    - A huge advantage of JSON data over hardcoded data in code is that the latter approach requires the code for *initializing the data structure* to always reside in RAM, even if the data itself doesn't always need to be in RAM. For example, I did a quick test by adding a static array with 100 numbers as a class member. The size of the array in memory was 515 bytes, but the size of the additional code to initialize the array was a whopping 1314 bytes (the impact of the additional application data was minuscule - 8 bytes). With JSON data, there is no such initializer code, which seems to be a huge win in its favor (but see the next point *)

    - (*) On the flip side, one issue with any kind of resource that hasn't been brought up in this thread is that the resource table isn't loaded into your app unless you call loadResource() at least once. As soon as you call loadResource(), this table is loaded into memory permanently, and it has entries for ALL resources (including strings, which obviously includes strings for app settings). The table maps resource IDs to numbers, so the content of the resources isn't a factor here, but only the *number* of resources. I have an app which uses JSON data and also has a few dozen strings for app settings; the string section of the resource table is 876 bytes and the entire resource table is a little over 1 KB. If my data were significantly smaller than 1 KB (including the size of the initializer code), it could've been more efficient to use hardcoded data.

    - For both approaches, one obvious strategy for saving memory is to load the data on demand, so you can use your memory for something else when you're not using the data. This can be accomplished with either approach. With JSON data, call loadResource() only when you need the data, store it in a variable X, and assign null to X when you're done. With hardcoded data, wrap it in a function and only call the function when you need it. Again, the persistent disadvantage with hardcoded data is that the code for initializing the data itself will always consume RAM.
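
    A minimal sketch of the two on-demand patterns from the last point. The class and function names are made up for illustration, and Rez.JsonData.Foo is a placeholder resource id, so treat this as an outline rather than tested app code:

    ```monkeyc
    using Toybox.WatchUi;

    // JSON approach: the data only occupies RAM between load() and unload().
    // Rez.JsonData.Foo is a placeholder id for a <jsonData> resource.
    class OnDemandTable {
        var _data = null;

        function load() {
            if (_data == null) {
                _data = WatchUi.loadResource(Rez.JsonData.Foo);
            }
            return _data;
        }

        function unload() {
            // Drop the reference so the loaded object can be freed.
            _data = null;
        }
    }

    // Hardcoded approach: the array only exists while a caller holds it,
    // but the code that builds it stays in RAM for the life of the app.
    function getHardcodedTable() {
        return [10, 20, 30, 40];
    }
    ```

    Either way the caller decides when the data is live. The difference is what stays behind afterwards: the resource table in the first case, the initializer code in the second.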

    So in my experience, the only issue with JSON resources is the resource table memory overhead they add to your program if you aren't already calling loadResource() at least once. If you are already calling loadResource() at least once, then it seems to me that the use of JSON resources for static data is a no-brainer, since you avoid the initializer code that the hardcoded approach always keeps in RAM.

  • Interesting. I have some data I want to load ONLY once, whenever a user starts the app for the very first time. I now have this hard coded in the code itself. I might want to poke this bear of JSON resources.

  • Funny Flow, but in another thread from just a few minutes back you said:

    "(Arrays are expensive, but dictionaries are even more expensive). "

    When you use JSON data, you always start with a dictionary, which can contain arrays. How expensive it is really depends on your code and the structure of the data.

  • Funny Flow, but in another thread from just a few minutes back you said:

    "(Arrays are expensive, but dictionaries are even more expensive). "

    Not sure what your point is? I agree that dictionaries (with lots of keys and/or nested arrays/dictionaries) are expensive. I disagree with your statement that JSON data has to be a dictionary. Even if it did, you could simply construct your JSON data resource as a single dictionary with a single key whose value is an array, so you'd have a *fixed* overhead for each JSON resource.

    Therefore, I also disagree with the implication that hardcoded data is better than JSON resources because "hardcoded data can be anything" and "JSON data must be a dictionary." If you followed my argument, the reason I think JSON data is better is that hardcoded data has a huge overhead in terms of the code to initialize the data (this code is not present for JSON data).

    The documentation literally says JSON data can be: a dictionary, array, primitive, or "mixed data" (e.g. any of those things nested within each other.)

    So maybe I'm not understanding what you're saying, but it sounds highly misleading to me.

    https://developer.garmin.com/connect-iq/core-topics/resources/#jsondata

    <resources>
        <jsonData id="jsonDictionary">{"key":"value", "3":"three", "three":3}</jsonData>
        <jsonData id="jsonArray">[1,2,3,4,5,6]</jsonData>
        <jsonData id="jsonMix">[1,{"1":"one"},["a","b","c"]]</jsonData>
        <jsonData id="jsonPrimitive">5</jsonData>
        <jsonData id="jsonFile" filename="data.json"/>
    </resources>

    "When you use JSON data, you always start with a dictionary, which can contain arrays. How expensive is really a matter of your code and the structure of the data."

    Are you sure about that? Maybe I'm not understanding what you're trying to say. It sounds like you're trying to say that JSON data must *always* be in the form of a dictionary (at the top-level), whereas when you hard code data, that data can obviously be in the form of an array.

    I have JSON data where the *top-level* data structure is an array. When I call loadResource(Rez.JsonData.Foo), I get an *array* back.
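
    One way to check this on your own resource, as a rough sketch: describeFoo is a made-up helper name, and Rez.JsonData.Foo stands for whatever jsonData id your app defines.

    ```monkeyc
    using Toybox.Lang;
    using Toybox.System;
    using Toybox.WatchUi;

    // Prints the top-level type of a JSON resource after loading it.
    function describeFoo() {
        var data = WatchUi.loadResource(Rez.JsonData.Foo);
        if (data instanceof Lang.Array) {
            System.println("array with " + data.size() + " elements");
        } else if (data instanceof Lang.Dictionary) {
            System.println("dictionary with " + data.size() + " keys");
        } else {
            System.println("primitive value");
        }
    }
    ```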

    As I said in my comment, yes, for every resource that you add, one entry has to be added to the resource table which will reside in RAM. That's a *fixed* overhead per resource (maybe 24-ish bytes per resource), not the kind of massive overhead that you would get by structuring data (JSON or hardcoded) as a dictionary with lots of keys, as opposed to a flat array full of elements.

    I even gave an example above of how hardcoding an array has a huge overhead (proportional to the size of the data) in terms of the code to initialize the array:

    Array with 100 numbers (hardcoded):

    - 515 bytes for the object in memory

    - 1314 bytes for code to initialize the object (this would not exist for a JSON resource)

    The same array as a JSON resource would still take 515 bytes for the object in memory, and a handful of bytes for the loadResource() code. Once again, the only question here is whether your app uses other resources. If your app loads other resources, then using JSON resources is a no-brainer, because the resource table has to reside in RAM anyway. If you are not using other resources, then as soon as you call loadResource() even once, you incur the permanent RAM penalty of loading the resource table, which is a fixed penalty for each resource in the table. Again, the issue here is for people who have lots of strings for app settings, as the IDs for those strings take up the bulk of the resource table (assuming you don't have a lot of other resources.)

    So please point out to me where I said anything that was factually incorrect?

    e.g. I have an app where the JSON data xml looks like this:

    <resources>
        <jsonData id="jsonLayoutRound240_2" filename="gen/round240.2.layout.json"/>
    ...

    And the corresponding file for that resource looks like this:

    [
      4294967295,
      0,
      4294967295,
      0,
    ...

    I'd really love to hear your explanation for how that's not an array? When I call loadResource(Rez.JsonData...) I literally get an array back.

    Again, I'm aware that for each JSON resource, an entry has to be added to the resource table which must reside in memory. That's not my understanding of the meaning of the statement "JSON resources have the overhead of dictionaries".

    And even if that's what you meant, so what?

    It's one dictionary, with one entry per JSON resource. That's a fixed overhead for each resource. The real issue with using dictionaries is when you have lots of keys, or when the values are also dictionaries (or arrays.) For example, a flat array with 10 numbers will be a lot smaller than the equivalent dictionary with 10 keys/values.

    Given that the code overhead for a 515 byte array is *1314* bytes when you hardcode the data, I'd prefer to have constant overhead for *one* dictionary entry, which is surely smaller than 1314 bytes.

    Usually in Monkey C when we talk about the overhead of dictionaries, we're talking about a complex dictionary with a ton of keys and/or nested arrays and dictionaries, not the overhead of defining and loading a resource.

    Like I never see anyone say "don't use string resources, because of the overhead of dictionaries". (EDIT: well, maybe it’s been said that string resources are expensive, but not in those words. The difference with strings compared to other resources is that apps will typically have lots of strings, especially for app settings where the use of string resources is mandatory)

    Nobody ever says "don't use fonts or bitmaps, because of the overhead of dictionaries".

    And nobody ever says "don't use layouts, because of the overhead of dictionaries".

    All of those things incur the same per-resource overhead as JSON data, in the resource table.

    So I don't get why this would be said for JSON resource data, where there's literally no downside as long as your app already loads other resources at run time. And even if it doesn't, if your data is large enough, then moving it to a resource still saves more memory (compared to hardcoding the data.)

  • TL;DR the statement "dictionaries are more expensive than arrays" has subtext and nuance.

    Like, if I define a dictionary with a single key and set the value to a flat array, yeah that's "more expensive" than the flat array on its own, but it's not the end of the world. It's a small fixed overhead.

    The usual interpretation of the original statement is more like "if I have a flat array with X values, then the corresponding dictionary with X values and keys is horrendously more expensive." Even a flat array with 2 * X values is probably better than the same dictionary with X keys/values.

    If I were to refactor a flat array with X elements into a somewhat equivalent dictionary with X keys/values, that would incur a memory penalty that's proportional to X (as opposed to a fixed penalty). That's *my* understanding of the statement "dictionaries are more expensive (in terms of space) than arrays."

    Circling back to resources:

    If we're saying that "JSON data is bad because each JSON resource adds an extra entry to the JSON resource table (which is a dictionary I guess)", then I call BS, because that's a FIXED overhead per resource. For each new JSON resource, you're not creating a whole new complex dictionary with lots of keys and nested data, you're just adding one entry to an existing resource table.

    It's especially a disingenuous argument because the JSON data itself can be a primitive, array, dictionary or "mix" (e.g. array or dictionary with nested arrays or dictionaries.)

    Going back to my example of a flat array with 100 numbers, if I have the following two choices...

    A) Hardcode the array, so the total memory usage is: 515 bytes for the object and 1314 for the initialization code

    B) Use a JSON resource, so the total memory usage is: 515 bytes for the object, a handful of bytes for the loadResource() code, plus the size of the resource table (remember it's a *fixed cost* for each resource in the table)

    ...then my first question is:

    1) Do I already call loadResource()?

    If the answer is yes, then it's a no-brainer. The resource table is already in memory, and adding one JSON resource will cost me a small fixed amount of memory, regardless of resource data size, which will surely be less than the size of the initialization code, unless my JSON data is something stupid like a single number. (But in this example it's an array of 100 numbers)

    If the answer is no, then it's more complicated. Now that I have to call loadResource(), I incur the penalty of having the resource table reside in memory. Then the next question is:

    2) How big is the resource table compared to the initialization code for my data?

    If SIZE(resource table) >= SIZE(initialization code), then obviously I hardcode.

    If SIZE(resource table) < SIZE(initialization code), then I will use JSON data.
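
    The two questions above can be written down as a small helper. This is only a sketch of the rule of thumb: the function and parameter names are invented, and the byte counts are numbers you would have to measure for your own app.

    ```monkeyc
    // Returns true if a JSON resource is expected to use less RAM than
    // hardcoding the same data. Sizes are in bytes.
    // Simplification: ignores the small per-resource table entry and the
    // few bytes of code for the loadResource() call itself.
    function shouldUseJsonResource(alreadyCallsLoadResource, resourceTableBytes, initCodeBytes) {
        if (alreadyCallsLoadResource) {
            // Question 1: the resource table is already resident anyway.
            return true;
        }
        // Question 2: would the newly loaded table cost less than the initializer code?
        return resourceTableBytes < initCodeBytes;
    }
    ```

    For instance, with a resource table of roughly 1100 bytes (an illustrative figure for "a little over 1 KB") and the 1314 bytes of initializer code from the 100-number array, the helper comes out in favor of the JSON resource even when nothing else calls loadResource().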

    --

    Also, regarding the comment that "it matters how you structure your data" - of course it does. But if we're comparing apples to apples, then I would compare the same data structure whether it's hardcoded or it's in a JSON resource. So that's a non sequitur - you obviously want to choose the best data structure for your purposes (to maximize speed of access, minimize memory usage, or balance the two), but it doesn't play a role in whether you should use a JSON resource or hardcode your data.