
Commit 101d10c

update docs
1 parent de97a51 commit 101d10c

File tree

1 file changed: +9 -0 lines changed


docs/source/en/quantization/torchao.md

Lines changed: 9 additions & 0 deletions
@@ -48,6 +48,15 @@ image = pipe(prompt, num_inference_steps=4, guidance_scale=0.0).images[0]
image.save("output.png")
```

TorchAO offers seamless compatibility with `torch.compile`, setting it apart from other quantization methods. This makes it easy to achieve significant speedups.

```python
# In the above code, add the following after initializing the transformer
transformer = torch.compile(transformer, mode="max-autotune", fullgraph=True)
```

For speed/memory benchmarks on Flux/CogVideoX, please refer to the table [here](https://github.com/huggingface/diffusers/pull/10009#issue-2688781450).
Additionally, TorchAO supports an automatic quantization API exposed with [`autoquant`](https://github.com/pytorch/ao/blob/main/torchao/quantization/README.md#autoquantization). Autoquantization determines the best quantization strategy applicable to a model by comparing the performance of each technique on chosen input types and shapes. This can directly be used with the underlying modeling components at the moment, but Diffusers will also expose an autoquant configuration option in the future.
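
To make that lower-level usage concrete, here is a minimal sketch following the usage shown in the linked torchao README; the toy `nn.Sequential` model and the input shape are illustrative stand-ins, not part of Diffusers:

```python
# Minimal autoquant sketch (illustrative; not a Diffusers-level API).
# Assumes torchao is installed and a CUDA device is available.
import torch
import torchao

# A stand-in module; in practice this would be a modeling component
# such as a pipeline's transformer.
model = torch.nn.Sequential(torch.nn.Linear(64, 64)).cuda()

# Wrap with torch.compile, then let autoquant benchmark candidate
# quantization techniques per layer on the shapes it observes.
model = torchao.autoquant(torch.compile(model, mode="max-autotune"))

# The first call records input shapes and runs the benchmarks;
# subsequent calls use the selected quantized kernels.
x = torch.randn(8, 64, device="cuda")
out = model(x)
```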

## Resources
