Commit 6f5baf6

fywkevin and Yuanwei Fang authored
Allow TRITON_KERNEL_OVERRIDE on .amdgcn and .hsaco files (#5394)
Enable the `TRITON_KERNEL_OVERRIDE` feature to work on AMD assembly (`.amdgcn`) and binary (`.hsaco`) files. Currently, at the backend level, it only works on Nvidia `ptx` and `cubin`.

---------

Co-authored-by: Yuanwei Fang <[email protected]>
1 parent f257479 commit 6f5baf6

2 files changed: +7 -7 lines changed

README.md

Lines changed: 5 additions & 5 deletions
@@ -211,10 +211,10 @@ For detailed instructions on how to debug Triton's frontend, please refer to thi
 - `LLVM_ENABLE_TIMING` dumps the timing information for each LLVM pass.
 - `TRITON_DEFAULT_FP_FUSION` overrides the default behavior of allowing fp fusion (mul+add->fma).
 - `MLIR_ENABLE_REMARK` enables the performance warnings that are emitted as remarks.
-- `TRITON_KERNEL_DUMP` enables the dumping of the IR from each compilation stage and the final ptx.
-- `TRITON_DUMP_DIR` specifies the directory to save the dumped IR and ptx when `TRITON_KERNEL_DUMP` is set to 1.
-- `TRITON_KERNEL_OVERRIDE` enables the override of the compiled kernel with a user-specified IR/ptx at the beginning of each compilation stage.
-- `TRITON_OVERRIDE_DIR` specifies the directory from which to load the IR/ptx files when `TRITON_KERNEL_OVERRIDE` is set to 1.
+- `TRITON_KERNEL_DUMP` enables the dumping of the IR from each compilation stage and the final ptx/amdgcn.
+- `TRITON_DUMP_DIR` specifies the directory to save the dumped IR and ptx/amdgcn when `TRITON_KERNEL_DUMP` is set to 1.
+- `TRITON_KERNEL_OVERRIDE` enables the override of the compiled kernel with a user-specified IR/ptx/amdgcn at the beginning of each compilation stage.
+- `TRITON_OVERRIDE_DIR` specifies the directory from which to load the IR/ptx/amdgcn files when `TRITON_KERNEL_OVERRIDE` is set to 1.
 
 **Kernel Override Steps**
 
@@ -224,7 +224,7 @@ export TRITON_KERNEL_DUMP=1
 export TRITON_DUMP_DIR=<dump_dir>
 export TRITON_KERNEL_OVERRIDE=1
 export TRITON_OVERRIDE_DIR=<override_dir>
-# Step 1: Run the kernel once to dump kernel's IRs and ptx in $TRITON_DUMP_DIR
+# Step 1: Run the kernel once to dump kernel's IRs and ptx/amdgcn in $TRITON_DUMP_DIR
 # Step 2: Copy $TRITON_DUMP_DIR/<kernel_hash> to $TRITON_OVERRIDE_DIR
 # Step 3: Delete the stages that you do not want to override and modify the stage you do want to override
 # Step 4: Run the kernel again to see the overridden result
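
Steps 2 and 3 of the README hunk above could be scripted roughly as follows. This is only a sketch: `stage_override` is a hypothetical helper, not part of Triton, and it assumes that each dumped stage is a file named `<kernel_name>.<ext>` inside `<dump_dir>/<kernel_hash>/`. Instead of copying everything and deleting unwanted stages, it copies only the stages to be overridden. The demo uses temporary directories and placeholder files, so no GPU or Triton installation is involved.

```python
import shutil
import tempfile
from pathlib import Path


def stage_override(dump_dir, override_dir, kernel_hash, keep_exts):
    """Copy selected stage files of one dumped kernel into the override dir.

    Hypothetical helper (not a Triton API). Assumes the dump layout is
    <dump_dir>/<kernel_hash>/<kernel_name>.<ext>.
    """
    src = Path(dump_dir) / kernel_hash
    dst = Path(override_dir) / kernel_hash
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        # The stage is identified by the file extension, e.g. "amdgcn".
        if path.is_file() and path.suffix.lstrip(".") in keep_exts:
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied


# Demo on temporary directories with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    dump = Path(tmp) / "dump"
    override = Path(tmp) / "override"
    kernel_dir = dump / "abc123"  # placeholder standing in for <kernel_hash>
    kernel_dir.mkdir(parents=True)
    for ext in ("llir", "amdgcn", "hsaco"):
        (kernel_dir / f"add_kernel.{ext}").write_bytes(b"placeholder")

    # Keep only the AMD assembly stage for overriding.
    demo_copied = stage_override(dump, override, "abc123", {"amdgcn"})
    demo_override_files = sorted(p.name for p in (override / "abc123").iterdir())
```

After running something like this, the remaining file in the override directory would be edited by hand before Step 4.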

python/triton/compiler/compiler.py

Lines changed: 2 additions & 2 deletions
@@ -170,9 +170,9 @@ def parse(full_name, ext, context):
         module = ir.parse_mlir_module(full_name, context)
         module.context = context
         return module
-    if ext == "llir" or ext == "ptx":
+    if ext == "llir" or ext == "ptx" or ext == "amdgcn":
         return Path(full_name).read_text()
-    if ext == "cubin":
+    if ext == "cubin" or ext == "hsaco":
         return Path(full_name).read_bytes()
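
The effect of the changed branches can be illustrated with a standalone sketch. `read_override` is a hypothetical stand-in for the text and binary branches of `parse()` only: the MLIR branch is left out because it needs Triton's `ir` module, and raising `ValueError` for other extensions is an assumption of this sketch, not behavior shown in the diff.

```python
import tempfile
from pathlib import Path

TEXT_EXTS = {"llir", "ptx", "amdgcn"}  # textual stages, returned as str
BINARY_EXTS = {"cubin", "hsaco"}       # binary stages, returned as bytes


def read_override(full_name, ext):
    """Simplified stand-in for the text/binary branches of parse()."""
    if ext in TEXT_EXTS:
        return Path(full_name).read_text()
    if ext in BINARY_EXTS:
        return Path(full_name).read_bytes()
    # Assumption of this sketch: unknown extensions are rejected explicitly.
    raise ValueError(f"unsupported override extension: {ext!r}")


# Demo with placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    asm_path = Path(tmp) / "kernel.amdgcn"
    asm_path.write_text("; placeholder assembly\n")
    bin_path = Path(tmp) / "kernel.hsaco"
    bin_path.write_bytes(b"\x7fELF placeholder")

    asm = read_override(asm_path, "amdgcn")    # read as text
    binary = read_override(bin_path, "hsaco")  # read as raw bytes
```

The split matters because assembly such as `.amdgcn` is meant to be edited as text, while `.hsaco` and `.cubin` are binary objects that must be passed through byte for byte.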