Commit 31fe770

Improve performance of shape 1024x1024x1024 out of box (#2839)
Close #2822
After: 90% -> 103%
1 parent 4d3a94d commit 31fe770

1 file changed: +1 -1 lines changed


benchmarks/triton_kernels_benchmark/gemm_benchmark.py

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@
 ] + [
     triton.Config(
         {'BLOCK_SIZE_M': 256, 'BLOCK_SIZE_N': 128, 'BLOCK_SIZE_K': 32, 'GROUP_SIZE_M': 4, 'grf_mode': 'large'},
-        num_stages=s, num_warps=32) for s in [2, 3]
+        num_stages=s, num_warps=32) for s in [2, 3, 4]
 ] + [
     triton.Config(
         {'BLOCK_SIZE_M': 64, 'BLOCK_SIZE_N': 128, 'BLOCK_SIZE_K': 32, 'GROUP_SIZE_M': 4, 'grf_mode': 'large'},
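The effect of the one-line change can be sketched as below. This is a minimal illustration and not the benchmark's own code: `Config` is a hypothetical stand-in for `triton.Config` so the snippet runs without Triton or a GPU, and `expand_configs` is a helper invented for this example. It shows only that extending the list from `[2, 3]` to `[2, 3, 4]` adds one more candidate, with `num_stages=4`, for the 256x128x32 tile. It does not measure anything.

```python
from dataclasses import dataclass


@dataclass
class Config:
    """Hypothetical stand-in for triton.Config (assumption, for illustration only)."""
    kwargs: dict
    num_stages: int
    num_warps: int


def expand_configs(stages):
    # Mirrors the comprehension in the diff: one config per num_stages value.
    return [
        Config(
            {'BLOCK_SIZE_M': 256, 'BLOCK_SIZE_N': 128, 'BLOCK_SIZE_K': 32,
             'GROUP_SIZE_M': 4, 'grf_mode': 'large'},
            num_stages=s, num_warps=32) for s in stages
    ]


before = expand_configs([2, 3])     # candidates before the commit
after = expand_configs([2, 3, 4])   # candidates after the commit
added = [c for c in after if c not in before]

print(len(before), len(after), [c.num_stages for c in added])  # prints: 2 3 [4]
```

Every other field of the added candidate (tile sizes, `GROUP_SIZE_M`, `grf_mode`, `num_warps`) is identical to the existing two. The commit widens the set of candidates only along `num_stages`.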
