@@ -18,6 +18,17 @@ Summary of the optimizations:
   * `coordinate_descent_check_all_directions = True`
 * `torch.export` + Ahead-of-time Inductor (AOTI) + CUDAGraphs
 
+All of the above optimizations are lossless (outside of minor numerical differences sometimes
+introduced through the use of `torch.compile` / `torch.export`) EXCEPT FOR dynamic float8 quantization.
+Disable quantization if you want the same quality results as the baseline while still being
+quite a bit faster.
+
+**Example baseline output:**
+![baseline_output](https://github.com/user-attachments/assets/8ba746d2-fbf3-4e30-adc4-11303231c146)
+
+**Example fully-optimized output (with quantization):**
+![fast_output](https://github.com/user-attachments/assets/1a31dec4-38d5-45b2-8ae6-c7fb2e6413a4)
+
 ## Setup
 We rely primarily on pure PyTorch for the optimizations. Currently, a relatively recent nightly version of PyTorch is required.
 The numbers reported here were gathered using: