|
| 1 | +--- |
| 2 | +hide: |
| 3 | + - tags |
| 4 | +tags: |
| 5 | + - CPU |
| 6 | + - CUDA |
| 7 | + - GPU |
| 8 | + - ROCm |
| 9 | +icon: octicons/bug-16 |
| 10 | +--- |
| 11 | + |
| 12 | +# LLVM debugging playbook |
| 13 | + |
| 14 | +This page aims to collect notes on how to debug or reduce issues that |
| 15 | +appear to arise from within LLVM itself and how to generate useful LLVM |
| 16 | +bug reports. |
| 17 | + |
| 18 | +This guide contains platform-independent notes applicable to both |
| 19 | +CPU and GPU compilation. Additional GPU-specific notes (such as how |
| 20 | +to perform binary substitutions in an AMD GPU context) are contained |
| 21 | +within [the GPU debugging playbook](./gpu.md). |
| 22 | + |
| 23 | +## Generating LLVM IR |
| 24 | + |
| 25 | +When bisecting, reducing, or debugging an issue that might manifest within |
| 26 | +LLVM, it can be helpful to use the |
| 27 | +`--iree-hal-dump-executable-intermediates-to=[directory]` (or the more general |
| 28 | +`--iree-hal-dump-executable-files-to=[directory]`) flags to `iree-compile` |
| 29 | +or `iree-opt`. These flags will cause IREE to write out the compiled LLVM module |
| 30 | +to the specified directory so you can operate on it directly. |
| 31 | + |
| 32 | +Generally, there will be a `.linked` file, which contains the LLVM IR shortly after |
| 33 | +it was generated by MLIR (though after steps like bitcode library linking where |
| 34 | +applicable) and a `.optimized` file, which contains the IR after the `opt` passes |
| 35 | +have been run. The `.optimized` file may include reproduction instructions (if |
| 36 | +it doesn't, the relevant compiler plugin hasn't been updated to add them). |
| 37 | + |
| 38 | +Similarly, the final generated assembly (a `.s`, `.rocmasm`, or similar file) may |
| 39 | +include reproduction instructions. Where those are present, they should be helpful |
| 40 | +in manually recreating the LLVM compilation so that you no longer have to |
| 41 | +route any changes through IREE. |
| 42 | + |
| 43 | +For more details and other related flags, see |
| 44 | +[the documentation on dumping intermediate files](../general/developer-tips/#dumping-executable-files). |
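
As a rough sketch of what recreating the compilation by hand can look like when a dumped file carries no reproduction instructions, the helper below runs `opt` and then `llc` on a dumped IR file. The function name `recompile_ir`, the `LLVM_BIN` variable, the `default<O2>` pipeline, the `-O3` backend level, and the output file names are illustrative assumptions, not something IREE emits. Where reproduction instructions are present, prefer them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rerun the LLVM middle-end and backend on a dumped IR file.
# LLVM_BIN is assumed to point at a directory containing opt and llc.
recompile_ir() {
  local input="$1" triple="$2" cpu="$3" out_prefix="$4"
  # Middle-end: -S keeps the output human-readable; the triple and CPU are
  # passed explicitly so target-dependent passes see the same target.
  "${LLVM_BIN}/opt" -S -passes='default<O2>' -mtriple="${triple}" -mcpu="${cpu}" \
    "${input}" -o "${out_prefix}.opt.ll" || return 1
  # Backend: emit assembly from the optimized IR.
  "${LLVM_BIN}/llc" -O3 -mtriple="${triple}" -mcpu="${cpu}" --filetype=asm \
    "${out_prefix}.opt.ll" -o "${out_prefix}.s"
}
# Usage (all arguments are placeholders):
#   LLVM_BIN=[build-directory]/llvm-project/bin \
#     recompile_ir [dumped-file].linked [triple] [cpu] repro
```

Keeping the triple and CPU as explicit arguments makes it harder to drop them by accident while editing the command during a reduction.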
| 45 | + |
| 46 | +!!! tip |
| 47 | + |
| 48 | + While `opt` is "target-independent", many passes (such as vectorization) |
| 49 | + have substantial dependencies on target information. Ensure your LLVM IR |
| 50 | + contains a `target triple` or that you're passing `-mtriple=` to your `opt` |
| 51 | + invocations. There are fewer dependencies on `-mcpu=`, but it should also be |
| 52 | + preserved to reduce debug variability. |
| 53 | + |
| 54 | +## LLVM binaries |
| 55 | + |
| 56 | +To create LLVM binaries built from the same LLVM commit that your IREE checkout |
| 57 | +uses, run |
| 58 | + |
| 59 | +``` shell |
| 60 | +cmake --build [build-directory] --target opt llc |
| 61 | +``` |
| 62 | + |
| 63 | +to produce binaries in `[build-directory]/llvm-project/bin`. You can similarly |
| 64 | +produce other utility binaries such as `llvm-reduce`, which aren't built by default. |
| 65 | + |
| 66 | +## Reducing optimization levels in IREE |
| 67 | + |
| 68 | +If you suspect an LLVM bug, try disabling (or reducing) one or both optimization |
| 69 | +levels. LLVM has two places where an `-O[n]` level is applied: the middle-end |
| 70 | +(`opt`) and the backend/codegen (`llc`). The backend optimization level |
| 71 | +is selected by values of the `llvm::CodeGenOptLevel` enum, which is passed to |
| 72 | +a `createTargetMachine` call in the compiler plugin. This level defaults to |
| 73 | +`-O3`. On the other hand, the generic/middle-end `opt` optimization level |
| 74 | +is controlled by `llvm::OptimizationLevel` and defaults to `-O2` currently. |
| 75 | + |
| 76 | +Setting one or both of these values to the `-O0` or `-O1` equivalent |
| 77 | +and seeing the issue go away is an indicator that there may be an LLVM bug |
| 78 | +in play. It may, however, also indicate that there's a race condition or |
| 79 | +other correctness issue in the generated LLVM IR that is masked by a lack |
| 80 | +of compiler optimizations. |
| 81 | + |
| 82 | +## Useful flags for `opt` and `llc` |
| 83 | + |
| 84 | +- `-print-after-all` and `-print-before/after=[passname]` can help |
| 85 | + locate places where suspect IR is introduced or where crashes occur, just as their |
| 86 | + MLIR equivalents can be used in IREE. |
| 87 | +- `-print-module-scope` ensures IR dumps include attributes and metadata |
| 88 | + if those are relevant. |
| 89 | +- The exact process for feeding a binary back into IREE after manually compiling |
| 90 | + it is target-specific, but will generally involve |
| 91 | + `--iree-hal-substitute-executable-object=[executable]=[filename]`. |
| 92 | +- `-global-isel=1` (changing the instruction selection system in `llc`) |
| 93 | + can be helpful in localizing a bug to instruction selection. If it solves |
| 94 | + your problem (or turns it into a different bug), you've substantially narrowed |
| 95 | + down the code that needs to be searched. |
| 96 | +- `opt` produces human-readable output when passed the `-S` flag, and often |
| 97 | + needs a `-o -` to send its results to standard output. `llc` takes a |
| 98 | + `--filetype={asm,obj}` argument to control whether assembly or |
| 99 | + assembled objects are produced. |
| 100 | + |
| 101 | +## `llvm-diff` |
| 102 | + |
| 103 | +When adjusting an `opt` invocation to isolate misbehaving passes or when |
| 104 | +comparing LLVM IR from a working and a broken commit, you may be able to |
| 105 | +use the `llvm-diff` tool to compare two LLVM IR files without the noise that |
| 106 | +is induced by LLVM's IR numbering scheme. Note that `llvm-diff` output is written |
| 107 | +to standard error and should be redirected with a `2>&1`. |
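
A minimal sketch of such a comparison is shown below. The function name `diff_ir`, the `LLVM_BIN` variable, and the file names in the usage comment are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare two LLVM IR files with llvm-diff.
# llvm-diff writes its report to standard error, so 2>&1 folds it into
# standard output where it can be piped or saved.
diff_ir() {
  "${LLVM_BIN}/llvm-diff" "$1" "$2" 2>&1
}
# Usage (file names are placeholders):
#   LLVM_BIN=[build-directory]/llvm-project/bin diff_ir good.ll bad.ll | less
```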
| 108 | + |
| 109 | +## `llvm-reduce` |
| 110 | + |
| 111 | +In some cases, particularly compiler crashes, the `llvm-reduce` program |
| 112 | +(part of LLVM) may be useful. It takes an LLVM IR file and an "interestingness |
| 113 | +script" which returns 0 (success) if there **is** a problem with a proposed |
| 114 | +reduced input and returns a nonzero exit code otherwise. |
| 115 | + |
| 116 | +When writing such a script, the `not` tool (especially its |
| 117 | +`not --crash [program] [args]` mode) and `FileCheck` from the LLVM test suite |
| 118 | +are often useful. |
| 119 | + |
| 120 | +For example, the interestingness script used to reduce the crash in |
| 121 | +[issue #22001](https://github.com/iree-org/iree/issues/22001) was |
| 122 | + |
| 123 | +``` shell |
| 124 | +#!/usr/bin/env bash |
| 125 | + |
| 126 | +[llvm-bin]/not --crash [llvm-bin]/opt -passes='amdgpu-lower-buffer-fat-pointers' -disable-output "$@" 2>/dev/null |
| 127 | +``` |
| 128 | + |
| 129 | +which was used with `llvm-reduce -test=interesting.sh pre-buffer-loads.ll` |
| 130 | +on output created by adding |
| 131 | +`--print-before=amdgpu-lower-buffer-fat-pointers --print-module-scope` |
| 132 | +to the crashing `llc` invocation (the crashing pass was located through |
| 133 | +backtraces and `--print-after-all`). |
| 134 | + |
| 135 | +[This is a helpful LLVM slide deck on how to operate `llvm-reduce`](https://www.llvm.org/devmtg/2025-04/slides/tutorial/arsenault_reduce.pdf). |
| 136 | +These slides include other useful flags and tips. |
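
The script above detects crashes. For a failure that prints a diagnostic rather than crashing, an interestingness test can instead check for the message. The following is a sketch only: it uses plain `grep` in place of `FileCheck` for simplicity, and the function name `is_interesting` and the `LLVM_BIN`, `PASSES`, and `NEEDLE` variables are hypothetical placeholders that would need to be set (for example, exported before invoking `llvm-reduce`).

```shell
#!/usr/bin/env bash
# Hypothetical interestingness test for a non-crash failure: the candidate
# input is "interesting" (exit status 0) only if opt still prints the message.
# LLVM_BIN, PASSES, and NEEDLE are placeholders to be set for the bug at hand.
is_interesting() {
  "${LLVM_BIN}/opt" -passes="${PASSES}" -disable-output "$1" 2>&1 \
    | grep -q -- "${NEEDLE}"
}
# When run as a script by llvm-reduce, the candidate file is the argument,
# and the script's exit status is the status of is_interesting.
if [[ $# -gt 0 ]]; then
  is_interesting "$1"
fi
```

Matching on a specific message rather than on any failure helps keep the reduction from drifting toward an unrelated error.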
| 137 | + |
| 138 | +## Creating reproducers |
| 139 | + |
| 140 | +If you're planning to file a bug against LLVM, it's helpful to create a small reproducer. |
| 141 | + |
| 142 | +In many cases, an input that demonstrates the behavior you've identified as a bug |
| 143 | +can be either created with `llvm-reduce` or by hand. |
| 144 | + |
| 145 | +However, in some cases (such as incorrect dispatch results that aren't |
| 146 | +clearly attributable to a particular change) all you can do is create a |
| 147 | +reproduction harness. The exact process for creating these is target-specific, |
| 148 | +but such a harness should be a piece of standalone code that links against / loads |
| 149 | +different versions of the misbehaving input, calls the function at issue, |
| 150 | +and reports the results (likely checking against a naive implementation). |
| 151 | + |
| 152 | +This wrapper program should be accompanied by a simple build process that doesn't |
| 153 | +depend on IREE and instructions on how to run it. The build should produce binaries |
| 154 | +from LLVM IR (ideally the post-`opt` IR) by calling `llc` (or, if needed, `opt`). |
| 155 | + |
| 156 | +If the bug goes away at different optimization levels, you should build a working |
| 157 | +and a non-working binary. |