| 
 | 1 | +==============  | 
 | 2 | +Debugging LLVM  | 
 | 3 | +==============  | 
 | 4 | + | 
 | 5 | +This document is a collection of tips and tricks for debugging LLVM  | 
 | 6 | +using a source-level debugger. The assumption is that you are trying to  | 
 | 7 | +figure out the root cause of a miscompilation in the program that you  | 
 | 8 | +are compiling.  | 
 | 9 | + | 
 | 10 | +Extract and rerun the compile command  | 
 | 11 | +=====================================  | 
 | 12 | + | 
 | 13 | +Extract the Clang command that produces the buggy code. The way to do  | 
 | 14 | +this depends on the build system used by your program.  | 
 | 15 | + | 
 | 16 | +- For Ninja-based build systems, you can pass ``-t commands`` to Ninja  | 
 | 17 | +  and filter the output by the targeted source file name. For example:  | 
 | 18 | +  ``ninja -t commands myprogram | grep path/to/file.cpp``.  | 
 | 19 | + | 
 | 20 | +- For Bazel-based build systems using Bazel 9 or newer (not released yet  | 
 | 21 | +  as of this writing), you can pass ``--output=commands`` to the ``bazel  | 
 | 22 | +  aquery`` subcommand for a similar result. For example: ``bazel aquery  | 
 | 23 | +  --output=commands 'deps(//myprogram)' | grep path/to/file.cpp``. Build  | 
 | 24 | +  commands must generally be run from a subdirectory of the source  | 
 | 25 | +  directory named ``bazel-$PROJECTNAME``. Bazel typically makes the target  | 
 | 26 | +  paths of ``-o`` and ``-MF`` read-only when running commands outside  | 
 | 27 | +  of a build, so it may be necessary to change or remove these flags.  | 
 | 28 | + | 
 | 29 | +- A method that should work with any build system is to build your program  | 
 | 30 | +  under `Bear <https://github.com/rizsotto/Bear>`_ and look for the  | 
 | 31 | +  compile command in the resulting ``compile_commands.json`` file.  | 
 | 32 | + | 
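As an illustration of the last method, the following minimal Python sketch looks up the compile command for one source file in a ``compile_commands.json`` database. It assumes the standard compilation database layout, in which each entry has ``directory`` and ``file`` keys plus either an ``arguments`` list or a ``command`` string. The helper name ``find_compile_commands`` and the sample entries are hypothetical and only serve the example.

```python
import json
import shlex


def find_compile_commands(database_text, source_suffix):
    """Return (directory, argv) for every entry whose file ends with source_suffix."""
    matches = []
    for entry in json.loads(database_text):
        if not entry["file"].endswith(source_suffix):
            continue
        # An entry carries either an "arguments" list or a "command" string.
        if "arguments" in entry:
            argv = list(entry["arguments"])
        else:
            argv = shlex.split(entry["command"])
        matches.append((entry["directory"], argv))
    return matches


# Hypothetical database contents standing in for a real compile_commands.json.
SAMPLE_DATABASE = json.dumps([
    {
        "directory": "/build",
        "file": "/src/path/to/file.cpp",
        "command": "clang++ -O2 -c /src/path/to/file.cpp -o file.o",
    },
    {
        "directory": "/build",
        "file": "/src/other.cpp",
        "arguments": ["clang++", "-O2", "-c", "/src/other.cpp", "-o", "other.o"],
    },
])

for directory, argv in find_compile_commands(SAMPLE_DATABASE, "path/to/file.cpp"):
    # Print a line that can be pasted into a shell to rerun the compile.
    print(f"cd {shlex.quote(directory)} && {shlex.join(argv)}")
```

To use it on a real project, replace ``SAMPLE_DATABASE`` with the text read from the generated ``compile_commands.json``.
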
 | 33 | +Once you have the command you can use the following steps to debug  | 
 | 34 | +it. Note that any flags mentioned later in this document are LLVM flags  | 
 | 35 | +so they must be prefixed with ``-mllvm`` when passed to the Clang driver,  | 
 | 36 | +e.g. ``-mllvm -print-after-all``.  | 
 | 37 | + | 
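The ``-mllvm`` prefixing rule can be sketched as a small Python helper that appends LLVM flags to an extracted Clang command. This is only an illustration: the helper name ``with_llvm_flags`` and the sample command line are hypothetical, and the two LLVM flags are the ones discussed in this document.

```python
def with_llvm_flags(clang_argv, llvm_flags):
    """Return a copy of clang_argv with each LLVM flag preceded by -mllvm."""
    result = list(clang_argv)
    for flag in llvm_flags:
        # The Clang driver forwards the argument that follows -mllvm to LLVM.
        result += ["-mllvm", flag]
    return result


# Hypothetical compile command extracted from the build system.
argv = with_llvm_flags(
    ["clang++", "-O2", "-c", "path/to/file.cpp"],
    ["-print-after-all", "-filter-print-funcs=functionname"],
)
print(" ".join(argv))
```
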
 | 38 | +Understanding the source of the issue  | 
 | 39 | +=====================================  | 
 | 40 | + | 
 | 41 | +If you have a miscompilation introduced by a pass, it is  | 
 | 42 | +frequently possible to identify the pass where things go wrong  | 
 | 43 | +by searching a pass-by-pass printout, which is enabled using the  | 
 | 44 | +``-print-after-all`` flag. Pipe stderr into ``less`` (append ``2>&1 |  | 
 | 45 | +less`` to the command line) and use text search to move between passes  | 
 | 46 | +(e.g. type ``/Dump After<Enter>``, then ``n`` to move to the next pass and  | 
 | 47 | +``N`` to move to the previous pass). If the name of the function  | 
 | 48 | +containing the buggy IR is known, you can filter the output by passing  | 
 | 49 | +``-filter-print-funcs=functionname``. You can sometimes pass ``-debug`` to  | 
 | 50 | +get useful details about what passes are doing. See also `PrintPasses.cpp  | 
 | 51 | +<https://github.com/llvm/llvm-project/blob/main/llvm/lib/IR/PrintPasses.cpp>`_  | 
 | 52 | +for more useful options.  | 
 | 53 | + | 
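When the printout is too long to page through by hand, the same search can be scripted. The following Python sketch splits a saved pass-by-pass printout at each header line and reports the first dump in which a given piece of text appears. It assumes, as the ``less`` search above does, that every dump starts with a line containing ``Dump After``. The function names, the pass names ``PassA`` and ``PassB``, and the sample IR are made up for illustration.

```python
def split_dumps(log_text, marker="Dump After"):
    """Split a pass-by-pass printout into (header, body) pairs."""
    dumps = []
    header, body = None, []
    for line in log_text.splitlines():
        if marker in line:
            # A new header closes the dump collected so far.
            if header is not None:
                dumps.append((header, "\n".join(body)))
            header, body = line.strip(), []
        elif header is not None:
            body.append(line)
    if header is not None:
        dumps.append((header, "\n".join(body)))
    return dumps


def first_dump_containing(dumps, needle):
    """Return the header of the first dump whose body contains needle, or None."""
    for header, body in dumps:
        if needle in body:
            return header
    return None


# Made-up printout; real header text and pass names will differ.
SAMPLE_LOG = """\
; *** IR Dump After PassA on f ***
define i32 @f(i32 %x) {
  %y = add i32 %x, 1
  ret i32 %y
}
; *** IR Dump After PassB on f ***
define i32 @f(i32 %x) {
  %y = sub i32 %x, 1
  ret i32 %y
}
"""

print(first_dump_containing(split_dumps(SAMPLE_LOG), "sub i32"))
```

The pass named in the reported header is the first one after which the text is present, which makes it a reasonable starting point for closer inspection.
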
 | 54 | +Creating a debug build of LLVM  | 
 | 55 | +==============================  | 
 | 56 | + | 
 | 57 | +The subsequent debugging steps require a debug build of LLVM. Pass  | 
 | 58 | +``-DCMAKE_BUILD_TYPE=Debug`` to CMake in a separate build tree to create  | 
 | 59 | +a debug build.  | 
 | 60 | + | 
 | 61 | +Understanding where an instruction came from  | 
 | 62 | +============================================  | 
 | 63 | + | 
 | 64 | +A common debugging task involves understanding which part of the code  | 
 | 65 | +introduced a buggy instruction. The pass-by-pass dump is sometimes enough,  | 
 | 66 | +but for complex or unfamiliar passes, more information is often required.  | 
 | 67 | + | 
 | 68 | +The first step is to record a run of the debug build of Clang under `rr  | 
 | 69 | +<https://rr-project.org>`_ passing the LLVM flag ``-print-inst-addrs``  | 
 | 70 | +together with ``-print-after-all`` and any desired filters. This will  | 
 | 71 | +cause each instruction printed by LLVM to be suffixed with a comment  | 
 | 72 | +showing the address of the ``Instruction`` object. You can then replay  | 
 | 73 | +the run of Clang with ``rr replay``. Because ``rr`` is deterministic,  | 
 | 74 | +the instruction will receive the same address during the replay, so  | 
 | 75 | +you can break on the instruction's construction using a conditional  | 
 | 76 | +breakpoint that checks for the address printed by LLVM, with commands  | 
 | 77 | +such as the following:  | 
 | 78 | + | 
 | 79 | +.. code-block:: text  | 
 | 80 | + | 
 | 81 | +    b Instruction::Instruction if this == 0x12345678  | 
 | 82 | + | 
 | 83 | +When the breakpoint is hit, you will likely be at the location where  | 
 | 84 | +the instruction was created, so you can unwind the stack with ``bt``  | 
 | 85 | +to see the stack trace. It is also possible that an instruction was  | 
 | 86 | +created multiple times at the same address, so you may need to continue  | 
 | 87 | +until reaching the desired location, but in the author's experience this  | 
 | 88 | +is unlikely to occur.  | 
 | 89 | + | 
 | 90 | +Identifying the source locations of instructions  | 
 | 91 | +================================================  | 
 | 92 | + | 
 | 93 | +To identify the source location that caused a particular instruction  | 
 | 94 | +to be created, pass the LLVM flag ``-print-inst-debug-locs``,  | 
 | 95 | +which causes each instruction printed by LLVM to be suffixed with the file and line  | 
 | 96 | +number of the instruction according to the debug information. Note that  | 
 | 97 | +this requires debug information to be enabled (e.g. pass ``-g`` to Clang).  | 
 | 98 | + | 
 | 99 | +GDB pretty printers  | 
 | 100 | +===================  | 
 | 101 | + | 
 | 102 | +A handful of `GDB pretty printers  | 
 | 103 | +<https://sourceware.org/gdb/onlinedocs/gdb/Pretty-Printing.html>`__ are  | 
 | 104 | +provided for some of the core LLVM libraries. To use them, execute the  | 
 | 105 | +following (or add it to your ``~/.gdbinit``)::  | 
 | 106 | + | 
 | 107 | +  source /path/to/llvm/src/utils/gdb-scripts/prettyprinters.py  | 
 | 108 | + | 
 | 109 | +It may also be handy to enable the `print pretty  | 
 | 110 | +<https://sourceware.org/gdb/current/onlinedocs/gdb.html/Print-Settings.html>`__  | 
 | 111 | +option to avoid data structures being printed as a big block of text.  | 