kripken edited this page Apr 10, 2011 · 12 revisions

#summary Debugging Emscriptened Code

Introduction

If you compile code and things don't go right, you may have run into a bug in Emscripten, or a limitation of it. This page gives some ideas of how to figure out what is going wrong.

Compilation Failures

If the code doesn't compile at all, the compiler should leave some information in the .js file it creates. To see which line of the .ll file caused the problem, compile with frameworkLines in settings.js (add it to DEBUG_TAGS_SHOWING). This prints each .ll line as processing of it starts.

It is also useful to add framework in settings.js. This tells you exactly which actor was running during the crash, so you can tell which part of the .ll line was responsible. (Search for that actor in the .js files in src/.)

Also worth mentioning is that compiling with SpiderMonkey gives slightly more useful crash information than V8.

From here on out, we assume the problem is with running the code, not compiling it.

First Things First

It's a good idea to compare the emscriptened code with how that code behaves when compiled normally. More specifically, compare the following 3:

  • The emscriptened code
  • The same source code, compiled into a binary directly, using gcc or clang
  • The same .ll file that was emscriptened, assembled into bitcode (llvm-as) and run in the LLVM interpreter (lli)

If the first gives different results than the other two, then you are hitting either a limitation of Emscripten or a bug in it. If instead the last two don't agree with each other, then something may be wrong with the generation of the .ll file, or you may have hit a bug in LLVM.

Assuming the last two agree, and differ from the first, then you can proceed to the next step.
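
The comparison is easiest when the program under test prints plain numeric values at several points, so the three outputs can be compared line by line. Below is a minimal sketch of such a test program body. The names fold_bytes and print_trace are hypothetical and not part of Emscripten, and the mixing function is arbitrary. It only needs to be deterministic and free of pointer values or timing.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: folds a buffer into one 32-bit value.
   The exact mixing is arbitrary; it only has to be deterministic. */
uint32_t fold_bytes(const unsigned char *data, size_t len)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < len; i++) {
        /* Rotate left by 5 bits, then mix in the next byte. */
        acc = (acc << 5) | (acc >> 27);
        acc ^= data[i];
    }
    return acc;
}

/* Prints one line per prefix of the buffer. The same lines should
   appear in every build if all of them behave the same way. */
void print_trace(const unsigned char *data, size_t len)
{
    for (size_t i = 1; i <= len; i++) {
        printf("after %lu bytes: %lu\n",
               (unsigned long)i,
               (unsigned long)fold_bytes(data, i));
    }
}
```

Calling print_trace from main on a fixed buffer and saving the output of each of the three builds to a file gives something that diff can compare directly. The first line that differs points at the step where the emscriptened code diverges.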

Limitation or Bug?

Emscripten cannot compile all C/C++ code out there. Some limitations exist, due mainly to one of the following:

  • Limitations of the JavaScript runtime environment. For example, no shared-state threads, no extended-precision floating point, etc.
  • Self-imposed limitations, to make the generated code efficient, potentially one of the following:
    • Numerical overflows. In JavaScript, adding numbers naively will not overflow like in C/C++. For example, in an unsigned char, adding 1 repeatedly will eventually get you from 255 to 0, but in JavaScript that won't happen by default. To see if this issue occurs, compile with CHECK_OVERFLOWS (set that to 1 in settings.js). You can correct overflows everywhere using CORRECT_OVERFLOWS=1, but this is slow. It is better to use CORRECT_OVERFLOWS=2 and specify the precise lines that need this correction - see the linespecific test.
    • Signing issues. In C/C++, the 8-bit values -1 and 255 have the same bit pattern, so comparing them as 8-bit values of the same type returns true, but in JavaScript they are different numbers and the comparison returns false. Use CORRECT_SIGNS to force the generated code to fix signs depending on the type of the variable. You can also use CORRECT_SIGNS=2 and specify the lines to fix. Use CHECK_SIGNS to see which lines are an issue.
    • Rounding of integers. To get true C-like rounding, use CORRECT_ROUNDINGS=1, or =2 with line-specific instructions, as with CORRECT_OVERFLOWS.
    • The load-store consistency hypothesis, see [http://code.google.com/p/emscripten/issues/detail?id=8 Issue 8]. To check if there is a problem with this, compile with SAFE_HEAP. Code that violates this is rare, and to make it work properly would mean emulating an x86 (or other) CPU, which would be very slow. So usually you should find a way to change the original code. (Note though that SAFE_HEAP might generate warnings that can be ignored - like valgrind, if you are familiar with that useful tool.)
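
The first three self-imposed limitations above can be probed with a few lines of C. This is only an illustrative sketch. The function names are hypothetical, it assumes an 8-bit unsigned char, and it assumes that "rounding of integers" refers to integer division truncating toward zero, which the text above does not spell out.

```c
#include <stdio.h>

/* Overflow: adds 1 to an unsigned char `steps` times.
   With an 8-bit unsigned char, C wraps the value from 255 to 0. */
unsigned char bump(unsigned char start, int steps)
{
    unsigned char v = start;
    for (int i = 0; i < steps; i++) {
        v = (unsigned char)(v + 1);
    }
    return v;
}

/* Signs: stores -1 into an unsigned 8-bit variable and compares it
   with 255. In C both hold the same value, so this returns 1. */
int same_after_store(void)
{
    unsigned char a = (unsigned char)-1;
    unsigned char b = 255;
    return a == b;
}

/* Rounding (assumed meaning): C integer division truncates toward
   zero, so int_div(-7, 2) is -3, not -4 and not -3.5. */
int int_div(int x, int y)
{
    return x / y;
}

/* Prints the three results so that builds can be compared. */
void print_checks(void)
{
    printf("bump(255, 1)       = %d\n", bump(255, 1));
    printf("same_after_store() = %d\n", same_after_store());
    printf("int_div(-7, 2)     = %d\n", int_div(-7, 2));
}
```

If the emscriptened build prints something different from a native build for one of these lines, the corresponding CHECK_ or CORRECT_ setting described above is the first thing to try.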

If you wrote the code you are compiling, you might already know whether it runs into any of these limitations. If not, you can use the techniques described above to find out (the CHECK_ settings and SAFE_HEAP). See also the methods mentioned later on this page.
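
As an illustration of code whose author might recognize the load-store consistency limitation, the sketch below stores a value as one type and loads part of it back as another. The function name first_byte_of is hypothetical, and reading the hypothesis as "a load uses the same type as the last store to that address" is an assumption based on the description above.

```c
#include <stdio.h>

/* Stores an unsigned int, then loads its first byte through an
   unsigned char pointer. This is legal C, but the store and the load
   use different types for the same address, and the result depends on
   how the int is laid out in memory. */
unsigned int first_byte_of(unsigned int value)
{
    unsigned int stored = value;                       /* store as int   */
    const unsigned char *p = (const unsigned char *)&stored;
    return p[0];                                       /* load as a byte */
}

/* Prints the result so that builds can be compared. */
void print_first_byte(unsigned int value)
{
    printf("first byte of %u is %u\n", value, first_byte_of(value));
}
```

On a native build the printed byte depends on the machine's byte order. If the emscriptened build prints something else, this pattern is a likely cause, and rewriting the original code to load with the same type it stored is usually the practical fix.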

Aside from these limitations, it's possible you ran into a bug in Emscripten, such as

  • Missing functionality, for example a library function which hasn't yet been implemented in Emscripten. Possible solutions here are to implement the function (see library.js) or to compile it from C++.
  • An actual mistake in Emscripten. Please report it!

Additional Tools

As already mentioned, some useful settings appear in src/settings.js. Change the settings there and then recompile the code to have them take effect. When code isn't running properly, you should compile with SAFE_HEAP, CHECK_OVERFLOWS, CHECK_SIGNS and CHECK_ROUNDINGS to find potential problems (see descriptions above). Additional settings are:

  • EXCEPTION_DEBUG - Will print out exceptions as they occur. This is useful because if the compiled code catches exceptions, it may catch the wrong ones, and/or not give enough details about those exceptions. It's a good idea to enable this if you have any suspicions about something not running properly.
  • LABEL_DEBUG - Will print out each function and each label in each function, as we enter them. This is extremely useful if the generated code enters an infinite loop that it shouldn't: Run it until it hits the loop, then you can see exactly where that is.
  • GUARD_LABELS - Checks for mistakes in the flow of code from one label to another. Generally doesn't hurt to enable this, but it is useful only on rare occasions.

The AutoDebugger

The 'nuclear option' when debugging is to use the autodebugger tool. The autodebugger will rewrite the LLVM bitcode so it prints out each store to memory. You can then run the exact same LLVM bitcode in the LLVM interpreter (lli) and JavaScript, and compare the output (diff is useful if the output is large). For how to use the autodebugger tool, see the autodebug test.

The autodebugger can potentially find any problem in the generated code, so it is strictly more powerful than the CHECK_ settings and SAFE_HEAP. However, it has some limitations:

  • The autodebugger generates a lot of output. Using diff can be very helpful here.
  • The autodebugger doesn't print out pointer values, just simple numerical values. The reason is that pointer values change from run to run, so you can't compare them. This has two downsides: a problem that shows up only in a pointer value may be missed, and a pointer that was converted into an integer and stored will still be printed, in which case the difference should be ignored.

Debug Info

It can be very useful to compile the C/C++ files with -g to get debugging info - Emscripten will add the source file and line number to each line in the generated code. Note however that code compiled with -g may crash lli. So you may need to build once without -g for lli, then build again with -g. Alternatively, use tools/exec_llvm.py in Emscripten, which will run lli after cleaning out debug info.

Additional Tips

You can also do something similar to what the autodebugger does, manually - modify the original source code with some printfs, then compile and run that, to investigate issues.
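
A minimal sketch of this manual approach is shown below. The TRACE_STORE macro and the sum_to function are hypothetical names made up for illustration. The macro performs a store and then prints the source line and the stored value as a plain number, similar in spirit to what the autodebugger prints for each store.

```c
#include <stdio.h>

/* Hypothetical macro: performs the store, then prints the source line
   number and the stored value as a plain integer (never a pointer). */
#define TRACE_STORE(lvalue, value)                                    \
    do {                                                              \
        (lvalue) = (value);                                           \
        printf("line %d: store %ld\n", __LINE__, (long)(lvalue));     \
    } while (0)

/* Example function under investigation: sums 1..n, tracing each
   update of the running total. */
long sum_to(int n)
{
    long total = 0;
    for (int i = 1; i <= n; i++) {
        TRACE_STORE(total, total + i);
    }
    return total;
}
```

Running the traced program natively and as emscriptened code, then comparing the two outputs with diff, narrows the problem down to the first store that differs.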

Another useful tip: if you have a good idea of which line in the generated .js is problematic, you can add print(new Error().stack) there to get a stack trace.

Additional Help

Of course you can also ask the Emscripten devs for help :) See links to IRC and the Google Group on the main project page.
