2 changes: 2 additions & 0 deletions docs/source/en/api/pipelines/flux.md
@@ -25,6 +25,8 @@ Original model checkpoints for Flux can be found [here](https://huggingface.co/b

Flux can be quite expensive to run on consumer hardware devices. However, you can perform a suite of optimizations to run it faster and in a more memory-friendly manner. Check out [this section](https://huggingface.co/blog/sd3#memory-optimizations-for-sd3) for more details. Additionally, Flux can benefit from quantization for memory efficiency with a trade-off in inference latency. Refer to [this blog post](https://huggingface.co/blog/quanto-diffusers) to learn more. For an exhaustive list of resources, check out [this gist](https://gist.github.com/sayakpaul/b664605caf0aa3bf8585ab109dd5ac9c).

+[Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

</Tip>

Flux comes in the following variants:
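The caching note added above points to the cache guide, where techniques such as Pyramid Attention Broadcast are enabled on the transformer through `enable_cache`. A minimal sketch with Flux, assuming a recent diffusers release where `PyramidAttentionBroadcastConfig` is available and supported by the Flux transformer; the checkpoint and skip ranges are illustrative, not tuned values:

```py
import torch
from diffusers import FluxPipeline, PyramidAttentionBroadcastConfig

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# Reuse (broadcast) spatial attention outputs for every other block inside the
# chosen timestep window; the values below are illustrative, not tuned.
config = PyramidAttentionBroadcastConfig(
    spatial_attention_block_skip_range=2,
    spatial_attention_timestep_skip_range=(100, 950),
    # assumes the pipeline exposes the current denoising timestep
    current_timestep_callback=lambda: pipe.current_timestep,
)
pipe.transformer.enable_cache(config)

image = pipe("A cat holding a sign that says hello world").images[0]
image.save("flux_cached.png")
```

The same pattern applies to the other pipelines touched in this diff; the linked cache guide lists which methods each model supports.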
2 changes: 1 addition & 1 deletion docs/source/en/api/pipelines/hidream.md
@@ -18,7 +18,7 @@

<Tip>

-Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
+[Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

</Tip>

2 changes: 1 addition & 1 deletion docs/source/en/api/pipelines/ltx_video.md
@@ -88,7 +88,7 @@ export_to_video(video, "output.mp4", fps=24)
</hfoption>
<hfoption id="inference speed">

-[Compilation](../../optimization/fp16#torchcompile) is slow the first time but subsequent calls to the pipeline are faster.
+[Compilation](../../optimization/fp16#torchcompile) is slow the first time but subsequent calls to the pipeline are faster. [Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

```py
import torch
# ...
```
2 changes: 1 addition & 1 deletion docs/source/en/api/pipelines/qwenimage.md
@@ -20,7 +20,7 @@ Check out the model card [here](https://huggingface.co/Qwen/Qwen-Image) to learn

<Tip>

-Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
+[Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

</Tip>

2 changes: 1 addition & 1 deletion docs/source/en/api/pipelines/wan.md
@@ -119,7 +119,7 @@ export_to_video(output, "output.mp4", fps=16)
</hfoption>
<hfoption id="T2V inference speed">

-[Compilation](../../optimization/fp16#torchcompile) is slow the first time but subsequent calls to the pipeline are faster.
+[Compilation](../../optimization/fp16#torchcompile) is slow the first time but subsequent calls to the pipeline are faster. [Caching](../../optimization/cache) may also speed up inference by storing and reusing intermediate outputs.

```py
# pip install ftfy
# ...
```
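For context on the compilation note, compiling typically targets the pipeline's transformer. A minimal sketch with Wan, assuming the `Wan-AI/Wan2.1-T2V-1.3B-Diffusers` checkpoint; the prompt, resolution, and frame count are illustrative:

```py
# pip install ftfy
import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers", torch_dtype=torch.bfloat16
).to("cuda")

# Compile the transformer; the first call captures the graph and is slow,
# later calls with the same shapes reuse it and run faster.
pipe.transformer = torch.compile(pipe.transformer)

output = pipe(
    prompt="A cat walks on the grass, photorealistic style",
    height=480,
    width=832,
    num_frames=81,
).frames[0]
export_to_video(output, "output.mp4", fps=16)
```

The first call pays the compilation cost, which is what the updated tip refers to; caching can be layered on top with `enable_cache` as sketched earlier.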