Commit 77894ef

Speedup 'rewrite stack ptr' pass by moving it after canonicalizer (#4841)
Before: `0.6506 ( 4.4%) 0.6506 ( 6.9%) TritonIntelGPURewriteStackPtr`
After: `0.2397 ( 1.9%) 0.2397 ( 3.1%) TritonIntelGPURewriteStackPtr`

The canonicalizer reduces 300k IR to 100k IR, which speeds up the pass considerably. This data is for one of the big Flex Attn kernels.

Signed-off-by: Anatoly Myachev <[email protected]>
1 parent 9e5bd73 commit 77894ef

1 file changed: +1 -1 lines changed

third_party/intel/backend/compiler.py

Lines changed: 1 addition & 1 deletion
@@ -320,8 +320,8 @@ def make_llir(src, metadata, options):
     intel.passes.ttgpuir.add_to_llvmir(pm, options.advanced_path, options.one_matrix_per_load_for_bt,
                                        options.enable_tile_load_linear_layout)
     intel.passes.ttgpuir.add_gen_to_llvm(pm)
-    intel.passes.ttgpuir.add_rewrite_stack_ptr(pm)
     passes.common.add_canonicalizer(pm)
+    intel.passes.ttgpuir.add_rewrite_stack_ptr(pm)
     passes.common.add_cse(pm)
     passes.convert.add_arith_to_llvmir(pm)
     passes.common.add_canonicalizer(pm)
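
The reordering pays off because a pass that walks the whole module does less work once an earlier pass has shrunk it. The sketch below is a toy model of that effect, not the Triton or MLIR API: `Stats`, `canonicalize`, `rewrite_stack_ptr`, `run_pipeline`, and `make_module` are hypothetical names. It assumes the rewrite pass costs one unit per op visited and that both orders produce the same final IR, which holds for this toy but is not verified here for the real pass.

```python
# Toy model only: hypothetical stand-ins, not the Triton/MLIR pass API.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Stats:
    """Counts how many ops each pass had to visit."""
    visited: Dict[str, int] = field(default_factory=dict)

    def count(self, name: str, n: int) -> None:
        self.visited[name] = self.visited.get(name, 0) + n


def canonicalize(ops: List[str], stats: Stats) -> List[str]:
    """Drop ops marked 'dead'; stands in for a size-reducing cleanup pass."""
    stats.count("canonicalize", len(ops))
    return [op for op in ops if op != "dead"]


def rewrite_stack_ptr(ops: List[str], stats: Stats) -> List[str]:
    """Visit every op and rewrite the ones that use the stack pointer."""
    stats.count("rewrite_stack_ptr", len(ops))
    return ["use_new_sp" if op == "use_sp" else op for op in ops]


def run_pipeline(ops, passes):
    """Run the passes in order and return the final ops plus visit counts."""
    stats = Stats()
    for p in passes:
        ops = p(ops, stats)
    return ops, stats


def make_module() -> List[str]:
    # 300 ops, 200 of them removable: same ratio as the 300k -> 100k figure.
    return ["use_sp", "dead", "dead"] * 100


# Old order: rewrite first, then canonicalize.
before_ops, before = run_pipeline(make_module(), [rewrite_stack_ptr, canonicalize])
# New order: canonicalize first, then rewrite.
after_ops, after = run_pipeline(make_module(), [canonicalize, rewrite_stack_ptr])

print(before.visited["rewrite_stack_ptr"])  # 300
print(after.visited["rewrite_stack_ptr"])   # 100
```

The measured improvement in the commit message (0.6506 to 0.2397) is close to, but not exactly, the 3x this linear-cost toy would predict.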
