Commit b6bbbcb

Update README.md
1 parent f975cc9 commit b6bbbcb

1 file changed: +11 -0 lines changed

README.md

Lines changed: 11 additions & 0 deletions
@@ -18,6 +18,17 @@ Summary of the optimizations:
 * `coordinate_descent_check_all_directions = True`
 * `torch.export` + Ahead-of-time Inductor (AOTI) + CUDAGraphs
 
+All of the above optimizations are lossless (outside of minor numerical differences sometimes
+introduced through the use of `torch.compile` / `torch.export`) EXCEPT FOR dynamic float8 quantization.
+Disable quantization if you want the same quality results as the baseline while still being
+quite a bit faster.
+
+**Example baseline output:**
+![baseline_output](https://github.com/user-attachments/assets/8ba746d2-fbf3-4e30-adc4-11303231c146)
+
+**Example fully-optimized output (with quantization):**
+![fast_output](https://github.com/user-attachments/assets/1a31dec4-38d5-45b2-8ae6-c7fb2e6413a4)
+
 ## Setup
 We rely primarily on pure PyTorch for the optimizations. Currently, a relatively recent nightly version of PyTorch is required.
 The numbers reported here were gathered using:

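The added README text singles out dynamic float8 quantization as the one lossy optimization. A small simulation shows why a quantize/dequantize round trip cannot be exact. This is only a sketch under stated assumptions: it is not the repository's implementation and does not call PyTorch. The function names are hypothetical, the format is assumed to be e4m3-like (3 explicit mantissa bits, largest finite value 448), the scale is computed per tensor from the current maximum, and exponent-range limits and subnormals are ignored.

```python
import math

# Assumed largest finite magnitude of an e4m3-style float8 format.
E4M3_MAX = 448.0


def quantize_fp8_like(x: float, mantissa_bits: int = 3) -> float:
    """Round x to the nearest value with `mantissa_bits` explicit mantissa bits.

    Simplification: only the mantissa is limited; the exponent range is not.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1
    steps = 2 ** (mantissa_bits + 1)  # implicit leading bit + explicit bits
    return math.ldexp(round(m * steps) / steps, e)


def dynamic_quant_roundtrip(values):
    """Scale by the current max ("dynamic"), quantize, then dequantize."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return list(values)
    scale = amax / E4M3_MAX  # maps the largest magnitude onto E4M3_MAX
    return [quantize_fp8_like(v / scale) * scale for v in values]


if __name__ == "__main__":
    original = [0.1, -0.37, 1.0, 2.5, 3.14159]
    restored = dynamic_quant_roundtrip(original)
    for before, after in zip(original, restored):
        rel_err = abs(after - before) / abs(before)
        print(f"{before:+.5f} -> {after:+.5f} (relative error {rel_err:.3%})")
```

With only four significant bits, each value can move by up to about 6% in this simplified model. That rounding is the kind of error the other optimizations do not introduce, which is why disabling quantization is the suggested route to baseline-quality output.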