[docs] Pipeline-level quantization #11604
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thank you! Great refactor. Some minor nits.
> Initialize [`~quantizers.PipelineQuantizationConfig`] with the following parameters.
>
> </Tip>
>
> - `quant_backend` specifies which quantization backend to use. Currently supported backends include: `bitsandbytes_4bit`, `bitsandbytes_8bit`, `gguf`, `quanto`, and `torchao`.
Easy to miss when adding a new backend. Additionally, with the granular option, it is possible to use a quantization backend for Transformers that may not exist in Diffusers, and vice versa. How do we make that info clear? 👀
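For reference, the simple path described in the quoted lines looks roughly like this. A minimal sketch: the parameter names follow the doc under review, but the checkpoint, `quant_kwargs` values, and component names are illustrative, not prescriptive.

```python
import torch
from diffusers.quantizers import PipelineQuantizationConfig

# Quantize every listed component with the same backend and kwargs.
pipeline_quant_config = PipelineQuantizationConfig(
    quant_backend="bitsandbytes_4bit",
    quant_kwargs={
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_compute_dtype": torch.bfloat16,
    },
    components_to_quantize=["transformer", "text_encoder_2"],
)
```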
> Then pass it to [`~DiffusionPipeline.from_pretrained`] and run inference:
>
> There is a separate bitsandbytes backend in [Transformers](https://huggingface.co/docs/transformers/main_classes/quantization#transformers.BitsAndBytesConfig). You need to import and use [`transformers.BitsAndBytesConfig`] for components that come from Transformers. For example, `text_encoder_2` in [`FluxPipeline`] is a [`~transformers.T5EncoderModel`] from Transformers so you need to use [`transformers.BitsAndBytesConfig`] instead of [`diffusers.BitsAndBytesConfig`].
Suggested change (format `bitsandbytes` as inline code):

> There is a separate `bitsandbytes` backend in [Transformers](https://huggingface.co/docs/transformers/main_classes/quantization#transformers.BitsAndBytesConfig). You need to import and use [`transformers.BitsAndBytesConfig`] for components that come from Transformers. For example, `text_encoder_2` in [`FluxPipeline`] is a [`~transformers.T5EncoderModel`] from Transformers so you need to use [`transformers.BitsAndBytesConfig`] instead of [`diffusers.BitsAndBytesConfig`].
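To make the distinction concrete, here is a sketch of the granular `quant_mapping` option, which is where the two `BitsAndBytesConfig` classes are easiest to confuse. The aliases and kwargs are illustrative assumptions, not the doc's exact snippet.

```python
import torch
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig
from diffusers.quantizers import PipelineQuantizationConfig
from transformers import BitsAndBytesConfig as TransformersBitsAndBytesConfig

pipeline_quant_config = PipelineQuantizationConfig(
    quant_mapping={
        # Diffusers component -> diffusers.BitsAndBytesConfig
        "transformer": DiffusersBitsAndBytesConfig(
            load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16
        ),
        # Transformers component (the T5 encoder) -> transformers.BitsAndBytesConfig
        "text_encoder_2": TransformersBitsAndBytesConfig(
            load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16
        ),
    }
)
```

Importing both classes under distinct aliases, as above, is one way to surface the namespace difference the review comments call out.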
> recommended to quantize the text encoders that are memory-intensive. Some examples include T5, Llama, Gemma, etc. In the above example, you quantized the T5 model of [`FluxPipeline`] through `text_encoder_2` while keeping the CLIP model intact (accessible through `text_encoder`).
>
> - Read the [Exploring Quantization Backends in Diffusers](https://huggingface.co/blog/diffusers-quantization) blog post for a brief introduction to each quantization backend, how to choose a backend, and combining quantization with other memory optimizations.
Perfect!
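Putting the pieces together, the end-to-end flow the doc describes would look something like the sketch below. The model ID, prompt, and generation kwargs are placeholders; only the `quantization_config` plumbing follows the doc.

```python
import torch
from diffusers import DiffusionPipeline
from diffusers.quantizers import PipelineQuantizationConfig

pipeline_quant_config = PipelineQuantizationConfig(
    quant_backend="bitsandbytes_4bit",
    quant_kwargs={"load_in_4bit": True, "bnb_4bit_compute_dtype": torch.bfloat16},
    components_to_quantize=["transformer", "text_encoder_2"],
)

# Pass the pipeline-level config to from_pretrained, then run inference.
pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    quantization_config=pipeline_quant_config,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe("photo of an astronaut in a jungle", num_inference_steps=28).images[0]
image.save("astronaut.png")
```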
Lightly refactors the pipeline-level quantization doc to emphasize the two ways of using `PipelineQuantizationConfig` and the difference between `diffusers.BitsAndBytesConfig` and `transformers.BitsAndBytesConfig`, and creates a new Resources section.