You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs/source/en/optimization/fp16.md
+7Lines changed: 7 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -239,6 +239,13 @@ The `step()` function is [called](https://github.com/huggingface/diffusers/blob/
239
239
240
240
In general, the `sigmas` should [stay on the CPU](https://github.com/huggingface/diffusers/blob/35a969d297cba69110d175ee79c59312b9f49e1e/src/diffusers/schedulers/scheduling_euler_discrete.py#L240) to avoid the communication sync and latency.
241
241
242
+
<Tip>
243
+
244
+
Refer to [this blog post](https://pytorch.org/blog/torch-compile-and-diffusers-a-hands-on-guide-to-peak-performance/) for a comprehensive coverage on how to use
245
+
`torch.compile` for getting maximum performance in Diffusers for various scenarios.
246
+
247
+
</Tip>
248
+
242
249
### Benchmarks
243
250
244
251
Refer to the [diffusers/benchmarks](https://huggingface.co/datasets/diffusers/benchmarks) dataset to see inference latency and memory usage data for compiled pipelines.
0 commit comments