Commit fe47f98

09-persistent-matmul.py bugfix (#4820)
Currently we sleep between each rep for the Triton kernels, but not for the cuBLAS kernel. Sleeping between cuBLAS reps as well may improve cuBLAS performance on fp8 due to thermal issues.
1 parent 0b4feb7 commit fe47f98

1 file changed
+1 -1 lines changed

python/tutorials/09-persistent-matmul.py

Lines changed: 1 addition & 1 deletion
@@ -554,7 +554,7 @@ def bench(K, dtype, tiles_per_update, reps=10):
     if cublas is not None:
         for _ in range(reps):
             cublas_matmul(a, b)
-        time.sleep(0.01)
+            time.sleep(0.01)
     if dtype == torch.float16:
         for _ in range(reps):
             torch_matmul(a, b)
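
The one-line change above only moves the sleep call's indentation. As a rough, GPU-free illustration of why that matters, the sketch below pauses once per rep from inside the loop. The names `run_reps`, `kernel`, `pause_s`, and `sleep` are hypothetical and do not come from the tutorial; the injectable `sleep` argument exists only so the ordering can be observed without actually waiting.

```python
import time


def run_reps(kernel, reps=10, pause_s=0.01, sleep=time.sleep):
    """Call `kernel` `reps` times, pausing after every call.

    Hypothetical helper for illustration; not part of the tutorial.
    """
    for _ in range(reps):
        kernel()
        # Inside the loop body: one pause per rep. If this line were
        # dedented to the level of `for`, it would run only once,
        # after all reps had already executed back to back.
        sleep(pause_s)


# Record the order of events instead of launching a real kernel.
events = []
run_reps(
    lambda: events.append("run"),
    reps=3,
    sleep=lambda s: events.append("sleep"),
)
print(events)  # ['run', 'sleep', 'run', 'sleep', 'run', 'sleep']
```

With the pause dedented out of the loop, the same call would record three consecutive "run" entries followed by a single "sleep", which matches the behavior the commit message describes for the cuBLAS path before the fix.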
