| 1 | +============== |
| 2 | +Debugging LLVM |
| 3 | +============== |
| 4 | + |
| 5 | +This document is a collection of tips and tricks for debugging LLVM |
| 6 | +using a source-level debugger. The assumption is that you are trying to |
| 7 | +figure out the root cause of a miscompilation in the program that you |
| 8 | +are compiling. |
| 9 | + |
| 10 | +Extract and rerun the compile command |
| 11 | +===================================== |
| 12 | + |
| 13 | +Extract the Clang command that produces the buggy code. The way to do |
| 14 | +this depends on the build system used by your program. |
| 15 | + |
| 16 | +- For Ninja-based build systems, you can pass ``-t commands`` to Ninja |
| 17 | + and filter the output by the targeted source file name. For example: |
| 18 | + ``ninja -t commands myprogram | grep path/to/file.cpp``. |
| 19 | + |
| 20 | +- For Bazel-based build systems using Bazel 9 or newer (not released yet |
| 21 | + as of this writing), you can pass ``--output=commands`` to the ``bazel |
| 22 | + aquery`` subcommand for a similar result. For example: ``bazel aquery |
| 23 | + --output=commands 'deps(//myprogram)' | grep path/to/file.cpp``. Build |
| 24 | + commands must generally be run from a subdirectory of the source |
| 25 | + directory named ``bazel-$PROJECTNAME``. Bazel typically makes the target |
| 26 | + paths of ``-o`` and ``-MF`` read-only when running commands outside |
| 27 | + of a build, so it may be necessary to change or remove these flags. |
| 28 | + |
| 29 | +- A method that should work with any build system is to build your program |
| 30 | + under `Bear <https://github.com/rizsotto/Bear>`_ and look for the |
| 31 | + compile command in the resulting ``compile_commands.json`` file. |
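As a rough illustration of the last approach, the sketch below looks up the command for one source file in a ``compile_commands.json`` file. It assumes the standard JSON compilation database layout (a list of entries with ``directory``, ``file``, and either ``command`` or ``arguments``); the function name ``find_compile_commands`` is hypothetical and is not part of Bear or LLVM.

```python
import json
import shlex
from pathlib import Path


def find_compile_commands(database_path, source_suffix):
    """Return (directory, command) pairs whose "file" ends with source_suffix."""
    entries = json.loads(Path(database_path).read_text())
    matches = []
    for entry in entries:
        if not entry["file"].endswith(source_suffix):
            continue
        # An entry stores the command either as one shell string ("command")
        # or as a pre-split argument list ("arguments").
        if "arguments" in entry:
            command = shlex.join(entry["arguments"])
        else:
            command = entry["command"]
        # The command is meant to be run from the recorded directory.
        matches.append((entry["directory"], command))
    return matches
```

Calling ``find_compile_commands("compile_commands.json", "path/to/file.cpp")`` would return the working directory and command line to rerun for that file, if the database contains a matching entry.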
| 32 | + |
| 33 | +Once you have the command, you can use the following steps to debug |
| 34 | +it. Note that any flags mentioned later in this document are LLVM flags, |
| 35 | +so they must be prefixed with ``-mllvm`` when passed to the Clang driver, |
| 36 | +e.g. ``-mllvm -print-after-all``. |
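If you rerun the command from a script, the ``-mllvm`` prefixing can be done mechanically. The following is a minimal sketch, assuming the Clang command is already available as an argument list; the helper name ``with_mllvm`` is hypothetical.

```python
def with_mllvm(clang_command, llvm_flags):
    """Return clang_command with each LLVM flag appended after its own -mllvm."""
    result = list(clang_command)
    for flag in llvm_flags:
        # The Clang driver forwards only the single argument that follows
        # each -mllvm, so every LLVM flag needs its own prefix.
        result += ["-mllvm", flag]
    return result
```

For example, ``with_mllvm(["clang", "-c", "file.cpp"], ["-print-after-all"])`` yields an argument list ending in ``-mllvm -print-after-all``.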
| 37 | + |
| 38 | +Understanding the source of the issue |
| 39 | +===================================== |
| 40 | + |
| 41 | +If you have a miscompilation introduced by a pass, it is |
| 42 | +frequently possible to identify the pass where things go wrong |
| 43 | +by searching a pass-by-pass printout, which is enabled using the |
| 44 | +``-print-after-all`` flag. Pipe stderr into ``less`` (append ``2>&1 | |
| 45 | +less`` to the command line) and use text search to move between passes |
| 46 | +(e.g. type ``/Dump After<Enter>``, ``n`` to move to next pass, |
| 47 | +``N`` to move to previous pass). If the name of the function |
| 48 | +containing the buggy IR is known, you can filter the output by passing |
| 49 | +``-filter-print-funcs=functionname``. You can sometimes pass ``-debug`` to |
| 50 | +get useful details about what passes are doing. See also `PrintPasses.cpp |
| 51 | +<https://github.com/llvm/llvm-project/blob/main/llvm/lib/IR/PrintPasses.cpp>`_ |
| 52 | +for more useful options. |
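For large dumps it can help to do the same search with a script. The sketch below assumes you have saved the stderr output to a file and that every pass banner line contains the text ``Dump After``, as the ``less`` search above suggests; it makes no other assumption about the banner format. The function names are hypothetical.

```python
def split_pass_dumps(dump_text, marker="Dump After"):
    """Split a pass-by-pass printout into (banner, body) pairs."""
    sections = []
    banner, body = None, []
    for line in dump_text.splitlines():
        if marker in line:
            # A new banner closes the previous pass's dump.
            if banner is not None:
                sections.append((banner, "\n".join(body)))
            banner, body = line, []
        elif banner is not None:
            body.append(line)
    if banner is not None:
        sections.append((banner, "\n".join(body)))
    return sections


def first_pass_containing(dump_text, needle):
    """Return the banner of the first dump whose body contains needle, or None."""
    for banner, body in split_pass_dumps(dump_text):
        if needle in body:
            return banner
    return None
```

Passing the text of a buggy instruction as ``needle`` gives the banner of the first dump in which it appears, which is a reasonable starting point for identifying the pass that introduced it.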
| 53 | + |
| 54 | +Creating a debug build of LLVM |
| 55 | +============================== |
| 56 | + |
| 57 | +The subsequent debugging steps require a debug build of LLVM. Pass |
| 58 | +``-DCMAKE_BUILD_TYPE=Debug`` to CMake in a separate build tree to create |
| 59 | +a debug build. |
| 60 | + |
| 61 | +Understanding where an instruction came from |
| 62 | +============================================ |
| 63 | + |
| 64 | +A common debugging task involves understanding which part of the code |
| 65 | +introduced a buggy instruction. The pass-by-pass dump is sometimes enough, |
| 66 | +but for complex or unfamiliar passes, more information is often required. |
| 67 | + |
| 68 | +The first step is to record a run of the debug build of Clang under `rr |
| 69 | +<https://rr-project.org>`_ passing the LLVM flag ``-print-inst-addrs`` |
| 70 | +together with ``-print-after-all`` and any desired filters. This will |
| 71 | +cause each instruction printed by LLVM to be suffixed with a comment |
| 72 | +showing the address of the ``Instruction`` object. You can then replay |
| 73 | +the run of Clang with ``rr replay``. Because ``rr`` is deterministic, |
| 74 | +the instruction will receive the same address during the replay, so |
| 75 | +you can break on the instruction's construction using a conditional |
| 76 | +breakpoint that checks for the address printed by LLVM, with commands |
| 77 | +such as the following: |
| 78 | + |
| 79 | +.. code-block:: text |
| 80 | + |
| 81 | + b Instruction::Instruction if this == 0x12345678 |
| 82 | + |
| 83 | +When the breakpoint is hit, you will likely be at the location where |
| 84 | +the instruction was created, so you can print the stack trace with |
| 85 | +``bt``. It is also possible that an instruction was |
| 86 | +created multiple times at the same address, so you may need to continue |
| 87 | +until reaching the desired location, but in the author's experience this |
| 88 | +is unlikely to occur. |
| 89 | + |
| 90 | +Identifying the source locations of instructions |
| 91 | +================================================ |
| 92 | + |
| 93 | +To identify the source location that caused a particular instruction |
| 94 | +to be created, pass the LLVM flag ``-print-inst-debug-locs``. Each |
| 95 | +instruction printed by LLVM is then suffixed with the file and line |
| 96 | +number recorded for it in the debug information. Note that |
| 97 | +this requires debug information to be enabled (e.g. pass ``-g`` to Clang). |
| 98 | + |
| 99 | +GDB pretty printers |
| 100 | +=================== |
| 101 | + |
| 102 | +A handful of `GDB pretty printers |
| 103 | +<https://sourceware.org/gdb/onlinedocs/gdb/Pretty-Printing.html>`__ are |
| 104 | +provided for some of the core LLVM libraries. To use them, execute the |
| 105 | +following (or add it to your ``~/.gdbinit``):: |
| 106 | + |
| 107 | + source /path/to/llvm/src/utils/gdb-scripts/prettyprinters.py |
| 108 | + |
| 109 | +It may also be handy to enable the `print pretty |
| 110 | +<https://sourceware.org/gdb/current/onlinedocs/gdb.html/Print-Settings.html>`__ |
| 111 | +option to avoid data structures being printed as a big block of text. |