Commit d5e06fe

[Triton] Generate local MLIR reproducers when possible (#5155)
When a reproducer path is set through the `TRITON_REPRODUCER_PATH` environment variable, the pass manager dumps a standard MLIR reproducer before each pass manager invocation. This PR also enables local crash reproducer generation, written to the same path: if the pass pipeline fails at any point, the local reproducer tries to narrow the failure down to the specific pass that failed.
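As a usage sketch, the variable can be set from Python before any kernel is compiled. The helper name `enable_triton_reproducer` and the file name `reproducer.mlir` are hypothetical; only the variable name `TRITON_REPRODUCER_PATH` comes from this commit. The sketch assumes Triton reads the variable when it runs the pass manager for a kernel.

```python
import os
import tempfile
from pathlib import Path


def enable_triton_reproducer(directory=None):
    """Point TRITON_REPRODUCER_PATH at a file and return that path.

    Hypothetical helper: only the environment variable name is taken
    from the commit; the directory layout and file name are assumptions.
    """
    if directory is None:
        # No directory given: create a fresh temporary one.
        target_dir = Path(tempfile.mkdtemp(prefix="triton_repro_"))
    else:
        target_dir = Path(directory)
        target_dir.mkdir(parents=True, exist_ok=True)

    reproducer_path = target_dir / "reproducer.mlir"
    os.environ["TRITON_REPRODUCER_PATH"] = str(reproducer_path)
    return reproducer_path


# Call this before the first kernel launch so the variable is already
# set when compilation starts.
reproducer_path = enable_triton_reproducer()
print(f"MLIR reproducer will be written to: {reproducer_path}")
```

If a kernel is served from the Triton cache, it is presumably not recompiled and no reproducer would be written, so clearing the cache first may be necessary.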
1 parent 8bf3ae9 commit d5e06fe

2 files changed: +16 -0 lines changed

README.md

Lines changed: 3 additions & 0 deletions
@@ -176,6 +176,9 @@ For detailed instructions on how to debug Triton's frontend, please refer to thi
   kernels. Use `MLIR_ENABLE_DUMP=kernelName` to dump for a specific kernel only.
   - Triton cache can interfere with the dump. In cases where `MLIR_ENABLE_DUMP=1` does not work, try cleaning your triton cache: `rm -r ~/.triton/cache/*`
 - `LLVM_IR_ENABLE_DUMP=1` dumps the IR before every pass run over the LLVM IR.
+- `TRITON_REPRODUCER_PATH=<reproducer_path>` will generate an MLIR reproducer file
+  at `<reproducer_path>` before each MLIR compiler stage. If any of the stages fail,
+  `<reproducer_path>` will be a local MLIR reproducer captured right before the failing pass.
 - `TRITON_INTERPRET=1` uses the Triton interpreter instead of running on the
   GPU. You can insert Python breakpoints in your kernel code!
 - `TRITON_ENABLE_LLVM_DEBUG=1` passes `-debug` to LLVM, printing a lot of

python/src/ir.cc

Lines changed: 13 additions & 0 deletions
@@ -1707,7 +1707,14 @@ void init_triton_ir(py::module &&m) {
       auto anchorName = self.getOpAnchorName();
       auto passes = self.getPasses();
       Operation *op = mod.getOperation();
+      // Save a reproducer for the current pass manager invocation
+      // immediately.
       makeReproducer(anchorName, passes, op, reproducerPath);
+      // But if the pass manager crashes, attempt to generate a local
+      // reproducer instead.
+      mod.getContext()->disableMultithreading();
+      self.enableCrashReproducerGeneration(reproducerPath,
+                                           /*genLocalReproducer=*/true);
     }
 
     if (triton::tools::getBoolEnv("TRITON_ENABLE_LLVM_DEBUG")) {
@@ -1740,6 +1747,12 @@ void init_triton_ir(py::module &&m) {
       self.enableTiming();
     }
 
+    // Run the pass manager under a source manager diagnostic handler, which
+    // enables emitted MLIR diagnostics to directly reference Python source
+    // code.
+    llvm::SourceMgr sourceMgr;
+    SourceMgrDiagnosticHandler diagHandler(sourceMgr, mod.getContext(),
+                                           llvm::errs());
     if (failed(self.run(mod.getOperation())))
       throw std::runtime_error("PassManager::run failed");
   });
