Ticket Created
over 5 years ago

WERETECH-8672

ERA Backtrace points to a function that is never called

I was recently looking through some ERA backtraces and came across a few that looked strange in two respects:

1) The backtrace consisted of only one line

2) The exception reference seemed to be impossible within the code that is pointed to by the backtrace

After pushing out a bunch of releases with small tweaks trying to narrow in on the issue, I finally came across an interesting finding: the backtrace *always* points to the last function in the module--even if it's never called!

Looking through the debug XML for my app, I believe I can see why: the last function in the module is also the last symbol referenced in the pcToLineNum mapping.

So I'm guessing the PC for the backtrace is outside the bounds of my code, and the failure mode for the backtrace-prettification is not to print 'Unknown' or just the PC, but instead to default to the highest (or possibly lowest) addressed symbol in the pcToLineNum mapping.

This is a big problem because, well, it causes developers to spend a large number of frustrating hours trying to disprove something that in reality never actually happened!

A better solution IMHO would be to print something like <Unknown Symbol> or <PC is outside application scope> or something to let the developer know that yes it's a problem, but not *your problem*.
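
To make the suspected failure mode concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: I don't know how the real prettifier or the pcToLineNum table is implemented, so the table layout, the addresses, the `CODE_END` bound, and the function names `resolve_naive` and `resolve_checked` are all hypothetical. It only shows how a "nearest entry at or below the PC" lookup with no bounds check lands on the last symbol, and how a range check avoids that.

```python
from bisect import bisect_right
from typing import NamedTuple


class Entry(NamedTuple):
    pc: int      # hypothetical start address covered by this entry
    symbol: str
    line: int


# Hypothetical pcToLineNum-style table, sorted by address.
TABLE = [
    Entry(0x1000, "DownloadSpeed.initialize", 10),
    Entry(0x1040, "DownloadSpeed.compute", 32),
    Entry(0x10A0, "DownloadSpeed.lastFunctionInModule", 65),
]
CODE_END = 0x10C0  # hypothetical first address past the app's code


def resolve_naive(pc: int) -> str:
    """Pick the nearest entry at or below pc, with no bounds check."""
    i = bisect_right([e.pc for e in TABLE], pc) - 1
    entry = TABLE[max(i, 0)]  # clamps low PCs to the first entry
    return f"{entry.symbol}:{entry.line}"


def resolve_checked(pc: int) -> str:
    """Same lookup, but report PCs that fall outside the app's range."""
    if not (TABLE[0].pc <= pc < CODE_END):
        return f"<PC 0x{pc:08X} is outside application scope>"
    i = bisect_right([e.pc for e in TABLE], pc) - 1
    entry = TABLE[i]
    return f"{entry.symbol}:{entry.line}"


if __name__ == "__main__":
    bogus_pc = 0x10000000  # far beyond the app's code
    print(resolve_naive(bogus_pc))    # blames the last function in the module
    print(resolve_checked(bogus_pc))  # says the PC is out of range instead
```

With the naive lookup, any PC past the end of the table resolves to `DownloadSpeed.lastFunctionInModule:65`, which matches what shows up in ERA. The checked version still resolves in-range PCs normally.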

Examples from ERA: (NOTE: lastFunctionInModule is never called; it was added just to prove the theory that the backtrace was erroneous)

Error Name: Too Many Arguments Error
Occurrences: 3
First Occurrence: 2020-03-03
Last Occurrence: 2020-03-03
Devices:
fēnix® 6S Pro / 6S Sapphire: 3.00
App Versions: 4.6.8
Languages: deu
Backtrace:
DownloadSpeed.lastFunctionInModule:65

Error Name: Symbol Not Found Error
Occurrences: 1
First Occurrence: 2020-03-03
Last Occurrence: 2020-03-03
Devices:
Forerunner® 945: 2.80
App Versions: 4.6.8
Languages: deu
Backtrace:
DownloadSpeed.lastFunctionInModule:65

  • Thanks for looking into it!

    I'd be happy to try to build a minimal test case for this issue, but since it requires old firmware to replicate, I don't think I have a means to do so. There still isn't a firmware archive available for devs, is there?

    Obviously this is pie-in-the-sky, but what would be phenomenal:

    1) An Eclipse plugin, like ERA, that lets developers reflash their devices to arbitrary firmware versions

    2) This tool would need developer keys, etc., so would be somewhat protected from use/abuse by casual users (thus avoiding support burden)

    3) This tool would also abstract away the storage/delivery of the firmware. Garmin wouldn't be sharing links directly, but would be authorizing download through the tool and then performing the download for the developers. Very similar to the SDK manager. The difference would be the post-download hook that performs the install via Garmin Express integration

    4) This hypothetical tool could display any EULA or legal stuff needed because of the experimental nature of this tool

    Given the realities of when folks update firmware (hint: they mostly don't), a tool like this is in the best interest of both developers and Garmin. But I understand it would be *a lot* of work. Just thinking out loud...

  • Yeah. I can fix the bit of code in question and beef up the unit tests for it, but that doesn't necessarily guarantee that the bug you're seeing will be fixed.

  • To clarify: the functional test case would require the whole app (since I don't have any idea what the root cause of the bug is so I can't isolate the affected lines), and a watch with old firmware.

  • Unfortunately I don't really have a test-case per-se: I can only trigger the bug by deploying my app to production where it ends up running on a smattering of watches with old (likely buggy) firmware and then I observe the results in ERA.

    The root-cause bug is still unknown (to me at least), but it's necessary to generate a backtrace with a PC outside of the application's range.

    The easier route to go is probably to add/update the unit tests for the Backtrace-Prettifier (whatever component is using pcToLineNum) and verify that, if the PC is outside of the application's range, it indicates so and doesn't just fall through to the last thing it saw.

    I'd imagine looking at the code it might be possible to just see the bug jump out (?) That could be really naive though....
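
The kind of unit test suggested above might look roughly like the following Python sketch. It is not based on the real prettifier: `prettify` and the `RANGES` table are hypothetical stand-ins, and the addresses are made up. The point is the set of boundary cases worth covering (just below the range, at the last valid address, at the end address, and far past it).

```python
import unittest

# Hypothetical stand-in table: (start_pc, end_pc, label), end exclusive.
RANGES = [
    (0x1000, 0x1040, "DownloadSpeed.compute:32"),
    (0x1040, 0x1060, "DownloadSpeed.lastFunctionInModule:65"),
]


def prettify(pc):
    """Hypothetical prettifier: return a label only if pc is inside a range."""
    for start, end, label in RANGES:
        if start <= pc < end:
            return label
    return "<Unknown Symbol>"


class OutOfRangePcTest(unittest.TestCase):
    def test_pc_inside_range_resolves(self):
        self.assertEqual(prettify(0x1000), "DownloadSpeed.compute:32")
        self.assertEqual(
            prettify(0x105F), "DownloadSpeed.lastFunctionInModule:65"
        )

    def test_pc_below_range_is_unknown(self):
        self.assertEqual(prettify(0x0FFF), "<Unknown Symbol>")

    def test_pc_at_or_past_end_is_unknown(self):
        # Must not fall through to the last symbol in the table.
        for pc in (0x1060, 0x10000000):
            self.assertEqual(prettify(pc), "<Unknown Symbol>")
```

Saved as a module, this can be run with `python -m unittest`. The last test is the one that would catch the behavior reported in this ticket.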