Big update to prettier-extension-monkeyc

I've posted about prettier-extension-monkeyc before, but I've added a bunch of new features that developers will probably like (well, I've been missing them, so maybe you have too).

The new features it implements for VSCode include:

  • Goto Definition. Ctrl/Cmd-click a symbol, or put the cursor on it and press F12, and it will take you to the definition.
  • Goto References. Right click on a symbol and select "Goto References", or press Shift-F12. It will show you all the references.
  • Peek Definition/Peek References. Same as above, but in a popup window so you don't lose your place in the original document.
  • Rename Symbol. Right click on a local, function, class or module name, and select "Rename Symbol". It will rename all the references. It doesn't yet work for class members/methods.
  • Goto Symbol. Type Ctrl/Cmd-Shift-O and pick a symbol from the drop down (which has a hierarchical view of all symbols in the current file). This also appears as an outline across the top of the file.
  • Open Symbol By Name. Type Ctrl/Cmd-T, then start typing letters from a symbol name. A drop down will be populated with all matching symbols from anywhere in your project.

Older features include a Prettier-based formatter for monkeyc, and a monkeyc optimizer that will build/run/export an optimized version of your project.

[edit: My last couple of replies seem to have just disappeared, and the whole conversation seems to be in a jumbled order, so tldr: there's a new test-release at https://github.com/markw65/prettier-extension-monkeyc/releases/tag/v2.0.9 which seems to work for me on linux. I'll do more verification tomorrow, and push a proper update to the vscode store once I'm sure everything is working]

  • so when I do this:

    function f() {
      {
        var a ... // only used in this block
      }
      // here I can't use a
    }

    you say that even though the compiler doesn't let me use a after the block, it's not freed?

  • I looked at the wiki. Can't you know in the optimization phase that the compiler will be called with -l 3, and based on that decide automatically whether or not to trust declared types?

    Sounds like sizeBasedPRE should be on by default; we all use this to optimize size. Is there any use case, other than you debugging, where anyone would turn this off?

  • Can't you know in the optimization phase that the compiler will be called with -l 3 and based on that decide automatically that you trust(or not) declared types

    Sort of. But these optimizations introduce a lot of scope for bugs. So at least until it's been out for a while, I want a way for people to kill it, independently of whether or not Garmin's type checker is enabled. I'm also finding that (for the parts that work so far) mine is much better than Garmin's; so at some point, when it's mature enough, I'm likely to just turn Garmin's off, and rely on mine. Other people may or may not want to do the same. And if it does get to where I'd like it to be, using both won't be a good idea, because Garmin's tends to insist on explicit casts where they shouldn't be necessary, and I'd like to drop them. But initially, it won't do any type reporting; it will just be there to help with optimization.

    Sounds like sizeBasedPRE should be on by default,

    At least one person in this thread was pretty adamant that they didn't want it on, because (as I stated when I first released it) it can result in slower code, even though it's smaller. I'm pretty sure that *generally* it produces smaller, faster code, but you can imagine a huge switch statement, with 20 cases, and 10 globals, with each global used in 2 cases. All of the globals will get loaded up before the switch every time, even though only one of them will actually be used.
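
    A minimal sketch of that worst case, scaled down to two globals and four cases. The names gA, gB, use and the pre_ locals are made up for illustration, and the second function is only an approximation of what the pass might emit, not its actual output:

        var gA = 1;
        var gB = 2;

        function use(x) {}

        // Before: each call loads only the one global that its case needs.
        function before(k) {
            switch (k) {
                case 0: use(gA); break;
                case 1: use(gA); break;
                case 2: use(gB); break;
                case 3: use(gB); break;
            }
        }

        // After (assumed shape): each global is loaded once into a local,
        // so there are fewer global loads in the code, but both loads now
        // run on every call, whichever case is taken.
        function after(k) {
            var pre_gA = gA;
            var pre_gB = gB;
            switch (k) {
                case 0: use(pre_gA); break;
                case 1: use(pre_gA); break;
                case 2: use(pre_gB); break;
                case 3: use(pre_gB); break;
            }
        }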

  • you say that even though the compiler doesn't let me use a after the block, it's not freed

    What I said was that the stack storage for a is allocated for the duration of the function. That's 4 bytes (btw I don't think the current stack size contributes to the memory limit of the device, but I'm not 100% certain).

    Whether or not the contents of a are freed (eg if it holds a large array) is a separate issue. As it happens I *think* that the storage isn't freed - because as far as I can see there would have to be an explicit instruction to free it at the end of the block - since the runtime has no sense of the block structure - and I don't see such an instruction.

    But I'm not certain either way. I guess the answer is to write a function with several consecutive blocks, where each one declares a new variable, and stick a large array in it. Then step through the function in the debugger and watch to see if the memory footprint keeps going up...

  • v2.0.38 is out.

    This features

    • better analysis of calls to determine what state they might affect, which gives much better results from the size-based-pre pass
    • fixes a bug that dropped unsupported languages from the optimized project
    • allows "-" in the names of properties
    • fixes several minor parser bugs (where the parser would refuse to parse legal, but relatively obscure code).
    • fixes some display issues with the results of constant folding floats. Typically, the optimized code would have way too much precision for floats. eg Math.PI + 0 would end up with 16 decimal places, most of which were garbage. Now it ends up with the same digits as the definition of Math.PI.

    The fix for unsupported languages uses a trick to avoid warnings. If you set base.lang.heb, or round.lang.heb, or round-240x240.lang.heb, and then set fenix3.lang = $(<whichever>.lang), fenix3 will inherit the setting without generating a warning. So that's how I do it. Since there's a limited number of such "bases", if fenix3 and fr235 both use the same heb directory, I arrange for them to share a "base".
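
    As a rough illustration of that trick, the relevant lines of an optimized jungle might look something like this. The resources-heb directory and the choice of "round" as the shared base are hypothetical; the optimizer picks the base itself:

        # put the Hebrew resources on a base qualifier
        round.lang.heb = resources-heb
        # devices that would otherwise warn inherit the setting from the base
        fenix3.lang = $(round.lang)
        fr235.lang = $(round.lang)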

    Note that the bases are picked in alphabetical order; so it would be entirely possible that you end up with "fenix3.lang = $(rectangle-320x360)". This looks odd, but is not a bug. The bases are not used by the optimized jungle file except to handle this one special case.

    Finally, if the optimizer runs out of bases (I don't think that's possible, but maybe if every device that has unsupported languages has its own directory for that language), it just resorts to explicitly listing the languages, and you'll end up with warnings for some of the devices.

    So in this case, it's quite possible that you start with a jungle that causes unsupported language warnings, but the optimized jungle does not.

  • I can confirm that the warnings about Hebrew are gone, no other warnings are added.

    "-" works.

    I also see a slight (6 byte) code size improvement.

  • you say that even though the compiler doesn't let me use a after the block, it's not freed?

    I got curious enough to actually try it. With this code:

        function foobarbaz() as Void {
            {
                var x = new [500];
            }
            {
                var y = new [500];
            }
            {
                var z = new [500];
            }
            return;
        }
    

    I see this in the -g output (with my annotations):

    globals/foobarbaz:
        // 1 argument - the hidden "self"
        argc 1
        // 3 locals; so non-concurrent locals don't share space
        incsp 3
        // allocate the first array
        ipush 500
        newa
        // store it in local number 1 (x)
        lputv 1
        // allocate the second array
        ipush 500
        newa
        // store it in local number 2 (y)
        lputv 2
        // allocate the third array
        ipush 500
        newa
        // store it in local number 3 (z)
        lputv 3
        // return, and implicitly free all the locals
        return

    and stepping through in the debugger does indeed show memory increasing at each allocation, and none of it is freed until the function returns.

    So there's another opportunity for savings that I could implement: move all locals to the top of the function, and remap non-concurrent variables to a single name. In the above example that would mean we would have a single shared xyz variable, and at most two of the arrays would be live at any given time (two because the new array is allocated before the local is assigned to, and it's not until the assignment that the old value will be freed). This should add no new code, reduce stack size, and potentially reduce peak memory (eg by 2k in the above example).

  • Wow, interesting. But is it worth doing this as an automatic optimization? It will be used only in a few places, and probably only gain a few bytes, not 500 :) Especially if it's only used when there are blocks. Though inlined code becomes a block most of the time, and I believe it really is possible to reuse the variable names between those blocks. I believe that this is exactly the opposite of what you do now, isn't it? And in many cases you'll even be able to do something similar without the need for blocks. Your example code above would work exactly the same way without the blocks:

    original:
    var x = new [500];
    // use x
    var y = new [500];
    // use y
    var z = new [500];
    // use z

    optimized:
    var xyz = new [500];
    // use xyz
    xyz = new [500];
    // use xyz
    xyz = new [500];
    // use xyz

    This could reuse any variable after it was used for the last time in the function (after all the inlining). Which for sure would decrease the stack size.


    But isn't there a problem with the strict typecheck? The tmp var would be Number at the beginning, then Number or String, then Number or String or Long, .... soon it'll practically be Any. Won't this cause the generated code to fail with -l 3 if the tmp variable would be used as a right-value in any of the later blocks (if there was a type change)?

    var mZ as Float;
    function f() {
      var x = "str";
      // use x
      var y = 1;
      // use y
      var z = 3.14;
      // use z:
      mZ = z;
    }

    optimized:
    var mZ as Float;
    function f() {
      var xyz = "str"; // String

      // use xyz
      xyz = 1; // String or Number
      // use xyz
      xyz = 3.14; // String or Number or Float
      // use xyz
      mZ = xyz; // I guess strict wouldn't like this, so you might need to do:
      mZ = xyz as Float;
    }


  • I have another idea, tell me what you think about it (IMHO it's only worth it if there's a way to do it that also works without the optimizer, so it's tricky): currently we can create a global function, a global variable or a class that has different implementations under different source folders. It would be nice if we could somehow declare an "abstract function" in a class that is implemented in different source folders. In some easier cases it might be possible using inheritance, but even that adds both code and memory overhead, and it's not always possible when things start to get complex.

    source/A.mc:
    class A {
      (:abstract) function y();

      function x() {
        y();
      }
    }

    source-<foo>/A.y.mc:
    function y() {
      // do something foo way
    }

    source-<bar>/A.y.mc:

    function y() {
      // do something bar way
    }

    Ah BTW this leads me to another idea: switch inheritance with inlining. Example from my actual datafield:

    (:memory16K)
    class MyBaseField extends WatchUi.SimpleDataField {
      // ... foo
    }
    (:memory16Kplus)
    class MyBaseField extends WatchUi.DataField {
      // ... bar
    }
    class MyDataField extends MyBaseField {
      // most of the code is here
    }

    This inheritance is technical, but could be replaced by inlining code. I see why it's not that simple: it's probably a very special case, I don't know how many others use this, and it's not easy to detect (it would probably need a special annotation like :inline). Even then it would not be easy: you can't move the code from MyDataField to MyBaseField, because you need the type to be MyDataField (that's used all over the code). But maybe you can move the code of the relevant MyBaseField into MyDataField, and replace the extends with the relevant parent class:

    (:memory16K)
    class MyDataField extends WatchUi.SimpleDataField {
      // ... foo
      // most of the code is here
    }

  • But isn't there a problem with the strict typecheck? The tmp var would be Number at the beginning, then Number or String, then Number or String or Long, .... soon it'll practically be Any. Won't this cause the generated code to fail with -l 3 if the tmp variable would be used as a right-value in any of the later blocks (if there was a type change)?

    I don't think that's a problem. At least according to the documentation, and some superficial experiments, the type checker really does track the types of locals properly. So after "xyz = 1" the type of xyz is Number, not String or Number. In particular, your example does typecheck without any casts.