Description
This issue was previously filed as emscripten-core/emscripten#23717.
llvm-objdump gives wrong line info for a simple WebAssembly file.
Steps to reproduce
- Create a simple main.cpp:
int main() { return 42; }
- Now compile with debug symbols:
em++ -g main.cpp
Verbose output:
"/home/swdv/emsdk/upstream/bin/clang++" -target wasm64-unknown-emscripten -fignore-exceptions -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr --sysroot=/home/swdv/emsdk/upstream/emscripten/cache/sysroot -DEMSCRIPTEN -Xclang -iwithsysroot/include/fakesdl -Xclang -iwithsysroot/include/compat -g3 -DNO_USE_MYFUN -v -c main.cpp -o /tmp/emscripten_temp_pe2lfvyf/main_0.o
clang version 21.0.0git (https:/github.com/llvm/llvm-project 6dc41a639334b913e762f65410fcd14a722b137f)
Target: wasm64-unknown-emscripten
Thread model: posix
InstalledDir: /home/swdv/emsdk/upstream/bin
(in-process)
"/home/swdv/emsdk/upstream/bin/clang-21" -cc1 -triple wasm64-unknown-emscripten -emit-obj -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name main.cpp -mrelocation-model static -mframe-pointer=none -ffp-contract=on -fno-rounding-math -mconstructor-aliases -target-cpu generic -fvisibility=hidden -debug-info-kind=constructor -dwarf-version=4 -debugger-tuning=gdb -fdebug-compilation-dir=/home/swdv/Downloads/plainwasmtest -v -fcoverage-compilation-dir=/home/swdv/Downloads/plainwasmtest -resource-dir /home/swdv/emsdk/upstream/lib/clang/21 -D EMSCRIPTEN -D NO_USE_MYFUN -isysroot /home/swdv/emsdk/upstream/emscripten/cache/sysroot -internal-isystem /home/swdv/emsdk/upstream/emscripten/cache/sysroot/include/wasm64-emscripten/c++/v1 -internal-isystem /home/swdv/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1 -internal-isystem /home/swdv/emsdk/upstream/lib/clang/21/include -internal-isystem /home/swdv/emsdk/upstream/emscripten/cache/sysroot/include/wasm64-emscripten -internal-isystem /home/swdv/emsdk/upstream/emscripten/cache/sysroot/include -fdeprecated-macro -ferror-limit 19 -fgnuc-version=4.2.1 -fskip-odr-check-in-gmf -fcxx-exceptions -fignore-exceptions -fexceptions -fcolor-diagnostics -iwithsysroot/include/fakesdl -iwithsysroot/include/compat -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -o /tmp/emscripten_temp_pe2lfvyf/main_0.o -x c++ main.cpp
clang -cc1 version 21.0.0git based upon LLVM 21.0.0git default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/home/swdv/emsdk/upstream/emscripten/cache/sysroot/include/wasm64-emscripten/c++/v1"
ignoring nonexistent directory "/home/swdv/emsdk/upstream/emscripten/cache/sysroot/include/wasm64-emscripten"
#include "..." search starts here:
#include <...> search starts here:
/home/swdv/emsdk/upstream/emscripten/cache/sysroot/include/fakesdl
/home/swdv/emsdk/upstream/emscripten/cache/sysroot/include/compat
/home/swdv/emsdk/upstream/emscripten/cache/sysroot/include/c++/v1
/home/swdv/emsdk/upstream/lib/clang/21/include
/home/swdv/emsdk/upstream/emscripten/cache/sysroot/include
End of search list.
/home/swdv/emsdk/upstream/bin/clang --version
/home/swdv/emsdk/upstream/bin/wasm-ld -o hello.wasm /tmp/emscripten_temp_pe2lfvyf/main_0.o -L/home/swdv/emsdk/upstream/emscripten/cache/sysroot/lib/wasm64-emscripten -L/home/swdv/emsdk/upstream/emscripten/src/lib -lGL-getprocaddr -lal -lhtml5 -lstubs-debug -lnoexit -lc-debug -ldlmalloc-debug -lcompiler_rt -lc++-noexcept -lc++abi-debug-noexcept -lsockets -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -mwasm64 /tmp/tmp5u5b29eklibemscripten_js_symbols.so --export=emscripten_stack_get_end --export=emscripten_stack_get_free --export=emscripten_stack_get_base --export=emscripten_stack_get_current --export=emscripten_stack_init --export=_emscripten_stack_alloc --export=__wasm_call_ctors --export=_emscripten_stack_restore --export-if-defined=__start_em_asm --export-if-defined=__stop_em_asm --export-if-defined=__start_em_lib_deps --export-if-defined=__stop_em_lib_deps --export-if-defined=__start_em_js --export-if-defined=__stop_em_js --export-if-defined=main --export-if-defined=__main_argc_argv --export-if-defined=fflush --export-table -z stack-size=65536 --no-growable-memory --initial-heap=16777216 --no-entry --stack-first --table-base=1
/home/swdv/emsdk/upstream/bin/llvm-objcopy hello.wasm hello.wasm --remove-section=producers
/home/swdv/emsdk/node/20.18.0_64bit/bin/node /home/swdv/emsdk/upstream/emscripten/src/compiler.mjs /tmp/tmp3fupbzr6.json
/home/swdv/emsdk/node/20.18.0_64bit/bin/node /home/swdv/emsdk/upstream/emscripten/tools/preprocessor.mjs /tmp/emscripten_temp_pe2lfvyf/settings.js shell.html
- Now disassemble the main function:
~/emsdk/upstream/bin/llvm-objdump --disassemble-symbols=__original_main --line-numbers a.out.wasm
- Observe how the line numbers and file are completely incorrect, mentioning fflush.c instead of our main.cpp:
a.out.wasm: file format wasm
Disassembly of section CODE:
0000017c <__original_main>:
.local i32, i32, i32, i32, i32, i32, i32
; __original_main():
; /emsdk/emscripten/system/lib/libc/musl/src/stdio/fflush.c:17
180: 23 80 80 80 80 00 global.get 0
186: 21 00 local.set 0
188: 41 10 i32.const 16
18a: 21 01 local.set 1
18c: 20 00 local.get 0
18e: 20 01 local.get 1
190: 6b i32.sub
191: 21 02 local.set 2
193: 41 00 i32.const 0
195: 21 03 local.set 3
; /emsdk/emscripten/system/lib/libc/musl/src/stdio/fflush.c:18
197: 20 02 local.get 2
199: 20 03 local.get 3
19b: 36 02 0c i32.store 12
19e: 41 8d 21 i32.const 4237
1a1: 21 04 local.set 4
1a3: 41 15 i32.const 21
; /emsdk/emscripten/system/lib/libc/musl/src/stdio/fflush.c:15
1a5: 21 05 local.set 5
1a7: 20 04 local.get 4
1a9: 20 05 local.get 5
1ab: 36 02 00 i32.store 0
1ae: 41 2a i32.const 42
; /emsdk/emscripten/system/lib/libc/musl/src/stdio/fflush.c:20
1b0: 21 06 local.set 6
1b2: 20 06 local.get 6
1b4: 0f return
1b5: 0b end
Version of emscripten/emsdk
emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 4.0.3 (a9651ff57165f5710bb09a5fe52590fd6ddb72df)
clang version 21.0.0git (https:/github.com/llvm/llvm-project 6dc41a639334b913e762f65410fcd14a722b137f)
Target: wasm32-unknown-emscripten
Thread model: posix
InstalledDir: /home/swdv/emsdk/upstream/bin
More findings from emscripten-core/emscripten#23717
@kripken emscripten-core/emscripten#23717 (comment):
[...]
llvm-dwarfdump gives proper output.
@dschuff emscripten-core/emscripten#23717 (comment):
So this problem has to do with the way LLVM handles symbols for linked wasm files and debug info. Specifically, symbol addresses in DWARF are always encoded as offsets in the code section, whereas for linked files, LLVM uses the offset in the file as the address for a function (this is to match how engines print code addresses in backtraces). See some changes (and llvm/llvm-project#76198) I made to implement this about a year ago in LLVM. So if you use e.g. llvm-objdump to print symbol addresses, they will match what browser backtraces show, but not match what you see if you use llvm-dwarfdump to look at the debug info, and llvm-symbolizer will not get the right answer. I think the same mechanism in LLVM that causes the latter problem is what is happening when llvm-objdump is looking up line information from the debug info during disassembly (despite the fact that it's correctly finding the right code address when you ask it to disassemble a symbol by name).
So this is an unfortunate mismatch and not everything works right, as you have seen. Emscripten has a tool emsymbolizer that knows a bunch of ways emscripten can store name/address information (e.g. DWARF, source maps, name sections) and can symbolize addresses. It papers over this problem using the --adjust-vma flag of llvm-symbolizer, but it currently only supports the use case of looking up a name or line from an address one at a time.
We might be able to improve this situation. Adjusting how symbols are represented in LLVM is tricky, since they are used in various places in assembly, linking, etc. Ideally we also wouldn't need a bunch of special hacks in the tools such as llvm-objdump (although I wouldn't necessarily be above some kind of special case if it wasn't too horrible). [...]
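To make the address mismatch described in the comment above concrete, here is a minimal Python sketch of the translation between the two address spaces. It is not taken from emsymbolizer or LLVM. The function names are hypothetical, and it assumes that "offset in the code section" means an offset from the start of the Code section payload (the byte right after the section id and size fields).

```python
import io

WASM_MAGIC = b"\0asm"
CODE_SECTION_ID = 10  # id of the Code section in the wasm binary format


def read_uleb128(stream):
    """Decode one unsigned LEB128 integer from a binary stream."""
    result = 0
    shift = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise ValueError("unexpected end of file inside LEB128 value")
        byte = chunk[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def code_section_payload_offset(stream):
    """Return the file offset at which the Code section payload starts."""
    header = stream.read(8)  # 4 bytes magic + 4 bytes version
    if header[:4] != WASM_MAGIC:
        raise ValueError("not a WebAssembly binary")
    while True:
        section_id = stream.read(1)
        if not section_id:
            raise ValueError("no Code section found")
        size = read_uleb128(stream)
        payload_start = stream.tell()
        if section_id[0] == CODE_SECTION_ID:
            return payload_start
        stream.seek(payload_start + size)  # skip this section's payload


def file_offset_to_dwarf_address(file_offset, code_offset):
    """Map a file offset (objdump/backtrace style) to a DWARF-style address.

    Assumption: DWARF addresses are relative to the Code section payload.
    """
    return file_offset - code_offset


# Hand-built module with one function: () -> i32 { i32.const 42 }
demo_module = bytes([
    0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00,  # magic, version
    0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7F,        # Type section
    0x03, 0x02, 0x01, 0x00,                          # Function section
    0x0A, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2A, 0x0B,  # Code section
])

code_offset = code_section_payload_offset(io.BytesIO(demo_module))
print(code_offset)                                    # 21
print(file_offset_to_dwarf_address(24, code_offset))  # 3 (the i32.const)
```

For a real file, open it with `open("a.out.wasm", "rb")` and pass the file object in place of the `io.BytesIO` wrapper. The resulting offset is presumably the kind of value a wrapper would hand to llvm-symbolizer through `--adjust-vma`, although the comment does not say which value emsymbolizer actually passes.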