Commit 6b8ee6d

Increase the dynamo recompile limit number for the flex attention benchmark testing (#3819)
Torch Dynamo's default recompile limit is 8. When a benchmark test case triggers more recompilations than that, Dynamo falls back to the eager-mode kernel, so the benchmark no longer measures the Triton kernel. Raise the limit so that all flex attention kernels run as Triton kernels.
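The fallback behavior described above can be illustrated with a minimal, self-contained sketch. The `RecompileCache` class and its names are hypothetical stand-ins for Dynamo's internal recompile tracking, not real torch APIs; they only model the idea that once the number of distinct compiled variants exceeds the limit, every further call runs eagerly.

```python
# Hypothetical sketch of recompile-limit fallback; RecompileCache is
# illustrative only and is NOT part of torch._dynamo.

class RecompileCache:
    """Caches one compiled variant per input shape; permanently falls
    back to eager once the recompile limit is exhausted."""

    def __init__(self, recompile_limit=8):
        self.recompile_limit = recompile_limit
        self.compiled = {}       # shape -> compiled-variant marker
        self.fell_back = False

    def __call__(self, shape):
        if self.fell_back:
            return "eager"
        if shape not in self.compiled:
            if len(self.compiled) >= self.recompile_limit:
                # Too many recompilations: give up and run eagerly.
                self.fell_back = True
                return "eager"
            self.compiled[shape] = True   # simulate a fresh compile
        return "compiled"

# With the default limit of 8, the ninth distinct shape falls back:
low = RecompileCache(recompile_limit=8)
modes_low = [low((n,)) for n in range(9)]

# With a raised limit (the commit uses 100), every variant stays compiled:
high = RecompileCache(recompile_limit=100)
modes_high = [high((n,)) for n in range(9)]
```

Benchmarks that sweep many input shapes hit exactly this pattern: each new shape is one more recompilation, so a low limit silently turns later measurements into eager-mode numbers.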
1 parent cfe7c98 commit 6b8ee6d

File tree

2 files changed: +4 −0 lines


benchmarks/triton_kernels_benchmark/flex_attention_benchmark_causal_mask.py

Lines changed: 2 additions & 0 deletions
@@ -11,6 +11,8 @@
 import triton_kernels_benchmark as benchmark_suit
 from triton_kernels_benchmark import xetla_kernel
 
+torch._dynamo.config.recompile_limit = 100  # pylint: disable=protected-access
+
 # Compile the flex_attention function
 flex_attention = torch.compile(flex_attention, dynamic=False)

benchmarks/triton_kernels_benchmark/flex_attention_benchmark_custom_masks.py

Lines changed: 2 additions & 0 deletions
@@ -12,6 +12,8 @@
 
 import triton_kernels_benchmark as benchmark_suit
 
+torch._dynamo.config.recompile_limit = 100  # pylint: disable=protected-access
+
 # Compile the flex_attention function
 flex_attention = torch.compile(flex_attention, dynamic=False)
