Commit 38633c0

[rocm7.1_internal_testing_inductor][ROCm][inductor] Additional pointwise tunings (#2653)
This config improves the performance of a 1D pointwise kernel by 20%, as measured on MI350. (cherry picked from commit a7bac0a) Duplicate of #2642.
1 parent 6259a21 commit 38633c0

1 file changed: +9 -0 lines changed

torch/_inductor/runtime/triton_heuristics.py

Lines changed: 9 additions & 0 deletions
@@ -2510,6 +2510,15 @@ def pointwise(
                 ),
                 *hinted_configs,
             ]
+            # Additional reduction configs appended for ROCm builds
+            if torch.version.hip:
+                configs.append(triton_config_with_settings(
+                    size_hints,
+                    2048,
+                    num_warps=8,
+                    num_stages=2,
+                    waves_per_eu=1
+                )) # 20% improvement
     if len(size_hints) == 2:
         if (
             disable_pointwise_autotuning(inductor_meta)  # or tile_hint == TileHint.SQUARE
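
The pattern in this hunk, appending one extra candidate config only on ROCm builds, can be sketched in isolation. The following is a minimal illustration, not Inductor's code: `KernelConfig` and `candidate_configs` are hypothetical stand-ins for the real config objects and for the `pointwise()` heuristic, and the base block sizes are made up. The `hip_version` argument plays the role of `torch.version.hip`, which is `None` on non-ROCm builds.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class KernelConfig:
    """Hypothetical stand-in for a Triton launch config (not Inductor's type)."""

    xblock: int
    num_warps: int = 4
    num_stages: int = 1
    waves_per_eu: Optional[int] = None  # AMD-specific hint; unset elsewhere


def candidate_configs(base_block: int, hip_version: Optional[str]) -> List[KernelConfig]:
    """Build the candidate list; add the extra 2048-wide config on ROCm only."""
    # Baseline candidates (illustrative values, assumed for this sketch).
    configs = [KernelConfig(base_block), KernelConfig(base_block // 2)]
    # Mirrors `if torch.version.hip:` in the diff: a truthy version string
    # means a ROCm build, so the additional candidate is appended.
    if hip_version:
        configs.append(
            KernelConfig(2048, num_warps=8, num_stages=2, waves_per_eu=1)
        )
    return configs


cuda_like = candidate_configs(1024, hip_version=None)
rocm_like = candidate_configs(1024, hip_version="7.1")
print(len(cuda_like), len(rocm_like))
```

Because the config is appended rather than substituted, the existing candidates are still available on ROCm, and non-ROCm builds see an unchanged list.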
