Commit fca27a5

empty cache between each run to avoid OOM
1 parent d415da6 commit fca27a5

1 file changed: +1 -0 lines changed

benchmarks/triton_kernels_benchmark/gemm_postop_addmatrix_benchmark.py

Lines changed: 1 addition & 0 deletions
@@ -311,6 +311,7 @@ def benchmark(B, M, N, K, dtype, provider):
     # Maximum across onednn=600, triton=1000
     # For onednn and triton: Some configs increase performance with warmup as a step function, but some
     # slowly decrease with saturation. Performance is best at 150-200ms range, but we want stable, not just best
+    torch.xpu.empty_cache()
     do_bench = benchmark_suite.get_do_bench(n_warmup=1000, n_repeat=10, quantiles=[0.5, 0.0, 1.0])
     res_dtype = torch.float32 if dtype.is_floating_point else torch.int32
     if dtype.is_floating_point:
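
The added line clears the XPU caching allocator at the start of each benchmark call, so memory cached by a previous configuration does not accumulate into an out-of-memory failure. A minimal sketch of the same pattern outside the benchmark suite might look like the following. It assumes a PyTorch build that exposes torch.xpu; the helper names release_cached_device_memory and run_configs are hypothetical and are not part of this repository. When PyTorch or an XPU device is unavailable, the helper is a no-op.

```python
import gc


def release_cached_device_memory() -> bool:
    """Release unused cached XPU memory; return True if a cache was cleared.

    Hypothetical helper for illustration, not part of the benchmark suite.
    """
    # Drop unreachable Python objects first so their tensors can be freed.
    gc.collect()
    try:
        import torch
    except ImportError:
        return False  # PyTorch not installed: nothing to clear.
    xpu = getattr(torch, "xpu", None)  # Older builds may lack torch.xpu.
    if xpu is not None and xpu.is_available():
        xpu.empty_cache()
        return True
    return False


def run_configs(configs, bench_fn):
    """Run bench_fn once per config, clearing the cache before each run."""
    results = []
    for cfg in configs:
        release_cached_device_memory()
        results.append(bench_fn(cfg))
    return results
```

Note that emptying the cache only returns blocks that are no longer in use; tensors that are still referenced stay allocated, so references to a previous run's inputs and outputs need to be dropped before the call for it to help.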
