Commit b8639c1

add description on perf boost
1 parent 29effc5 commit b8639c1

1 file changed: +2 -0 lines changed

prototype_source/max_autotune_on_CPU_tutorial.rst

Lines changed: 2 additions & 0 deletions
@@ -17,6 +17,8 @@ This is similar to the max-autotune mode on CUDA, where implementations from ATe
 
 We have covered most popular data types, including FP32, BF16, FP16, and INT8, with epilogue fusions for x86 CPUs.
 
+While the development is still in progress, we have already seen promising speedups over pure ATen-based GEMMs as measured by the three benchmark suites and the inference of LLMs.
+
 How to activate ``max-autotune`` mode
 ------------
 To activate the ``max-autotune`` mode in PyTorch, set the ``mode`` argument to ``max-autotune`` when compiling your model using ``torch.compile``.
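
The activation step described in the last context line can be sketched as follows. This is a minimal illustration and not part of the commit: the ``TinyMLP`` model, its layer sizes, and the ``run_once`` helper are hypothetical names chosen for the example. Only ``torch.compile(..., mode="max-autotune")`` comes from the tutorial text.

```python
import torch


class TinyMLP(torch.nn.Module):
    """Hypothetical toy model; its Linear layer is a GEMM candidate."""

    def __init__(self, in_features: int = 64, out_features: int = 32) -> None:
        super().__init__()
        self.linear = torch.nn.Linear(in_features, out_features)
        self.relu = torch.nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.relu(self.linear(x))


model = TinyMLP().eval()

# Select max-autotune through the `mode` argument of torch.compile.
# Wrapping is lazy: no code generation or tuning happens on this line.
compiled_model = torch.compile(model, mode="max-autotune")


def run_once(batch_size: int = 8) -> torch.Size:
    """Run one inference pass; the first call triggers compilation."""
    x = torch.randn(batch_size, 64)  # 64 matches the assumed in_features
    with torch.no_grad():
        return compiled_model(x).shape
```

Calling ``run_once()`` performs the actual compilation, so the first call is expected to be noticeably slower than later ones. It also assumes a working C++ compiler is available, since the Inductor CPU backend generates C++ code.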
