Commit 529a523

Merge branch 'main' into lora-hot-swapping
2 parents e07323a + e45c25d commit 529a523

156 files changed (+4651 -293 lines changed)

.github/workflows/pr_test_peft_backend.yml

Lines changed: 6 additions & 4 deletions
@@ -92,12 +92,14 @@ jobs:
       run: |
         python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
         python -m uv pip install -e [quality,test]
+        # TODO (sayakpaul, DN6): revisit `--no-deps`
         if [ "${{ matrix.lib-versions }}" == "main" ]; then
-            python -m pip install -U peft@git+https://github.com/huggingface/peft.git
-            python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git
-            pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
+            python -m pip install -U peft@git+https://github.com/huggingface/peft.git --no-deps
+            python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
+            pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git --no-deps
         else
-            python -m uv pip install -U peft transformers accelerate
+            python -m uv pip install -U peft --no-deps
+            python -m uv pip install -U transformers accelerate --no-deps
         fi
 
     - name: Environment

docker/diffusers-onnxruntime-cuda/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3.10 -m uv pip install --no-cache-dir \
-        torch \
+        "torch<2.5.0" \
         torchvision \
         torchaudio \
         "onnxruntime-gpu>=1.13.1" \

docker/diffusers-pytorch-compile-cuda/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3.10 -m uv pip install --no-cache-dir \
-        torch \
+        "torch<2.5.0" \
         torchvision \
         torchaudio \
         invisible_watermark && \

docker/diffusers-pytorch-cpu/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3.10 -m uv pip install --no-cache-dir \
-        torch \
+        "torch<2.5.0" \
         torchvision \
         torchaudio \
         invisible_watermark \

docker/diffusers-pytorch-cuda/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3.10 -m uv pip install --no-cache-dir \
-        torch \
+        "torch<2.5.0" \
         torchvision \
         torchaudio \
         invisible_watermark && \

docker/diffusers-pytorch-xformers-cuda/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3.10 -m pip install --no-cache-dir \
-        torch \
+        "torch<2.5.0" \
         torchvision \
         torchaudio \
         invisible_watermark && \

docs/source/en/_toctree.yml

Lines changed: 8 additions & 0 deletions
@@ -150,6 +150,12 @@
       title: Reinforcement learning training with DDPO
     title: Methods
   title: Training
+- sections:
+  - local: quantization/overview
+    title: Getting Started
+  - local: quantization/bitsandbytes
+    title: bitsandbytes
+  title: Quantization Methods
 - sections:
   - local: optimization/fp16
     title: Speed up inference
@@ -209,6 +215,8 @@
       title: Logging
     - local: api/outputs
       title: Outputs
+    - local: api/quantization
+      title: Quantization
     title: Main Classes
   - isExpanded: false
     sections:

docs/source/en/api/pipelines/controlnet_flux.md

Lines changed: 9 additions & 1 deletion
@@ -1,4 +1,4 @@
-<!--Copyright 2024 The HuggingFace Team and The InstantX Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team, The InstantX Team, and the XLabs Team. All rights reserved.
 
 Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
 the License. You may obtain a copy of the License at
@@ -31,6 +31,14 @@ This controlnet code is implemented by [The InstantX Team](https://huggingface.c
 | Depth | [The InstantX Team](https://huggingface.co/InstantX) | [Link](https://huggingface.co/Shakker-Labs/FLUX.1-dev-ControlNet-Depth) |
 | Union | [The InstantX Team](https://huggingface.co/InstantX) | [Link](https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union) |
 
+XLabs ControlNets are also supported, which was contributed by the [XLabs team](https://huggingface.co/XLabs-AI).
+
+| ControlNet type | Developer | Link |
+| -------- | ---------- | ---- |
+| Canny | [The XLabs Team](https://huggingface.co/XLabs-AI) | [Link](https://huggingface.co/XLabs-AI/flux-controlnet-canny-diffusers) |
+| Depth | [The XLabs Team](https://huggingface.co/XLabs-AI) | [Link](https://huggingface.co/XLabs-AI/flux-controlnet-depth-diffusers) |
+| HED | [The XLabs Team](https://huggingface.co/XLabs-AI) | [Link](https://huggingface.co/XLabs-AI/flux-controlnet-hed-diffusers) |
+
 
 <Tip>

docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_3.md

Lines changed: 25 additions & 0 deletions
@@ -54,6 +54,11 @@ image = pipe(
 image.save("sd3_hello_world.png")
 ```
 
+**Note:** Stable Diffusion 3.5 can also be run using the SD3 pipeline, and all mentioned optimizations and techniques apply to it as well. In total there are three official models in the SD3 family:
+- [`stabilityai/stable-diffusion-3-medium-diffusers`](https://huggingface.co/stabilityai/stable-diffusion-3-medium-diffusers)
+- [`stabilityai/stable-diffusion-3.5-large`](https://huggingface.co/stabilityai/stable-diffusion-3-5-large)
+- [`stabilityai/stable-diffusion-3.5-large-turbo`](https://huggingface.co/stabilityai/stable-diffusion-3-5-large-turbo)
+
 ## Memory Optimisations for SD3
 
 SD3 uses three text encoders, one if which is the very large T5-XXL model. This makes it challenging to run the model on GPUs with less than 24GB of VRAM, even when using `fp16` precision. The following section outlines a few memory optimizations in Diffusers that make it easier to run SD3 on low resource hardware.
@@ -308,6 +313,26 @@ image = pipe("a picture of a cat holding a sign that says hello world").images[0
 image.save('sd3-single-file-t5-fp8.png')
 ```
 
+### Loading the single file checkpoint for the Stable Diffusion 3.5 Transformer Model
+
+```python
+import torch
+from diffusers import SD3Transformer2DModel, StableDiffusion3Pipeline
+
+transformer = SD3Transformer2DModel.from_single_file(
+    "https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo/blob/main/sd3.5_large.safetensors",
+    torch_dtype=torch.bfloat16,
+)
+pipe = StableDiffusion3Pipeline.from_pretrained(
+    "stabilityai/stable-diffusion-3.5-large",
+    transformer=transformer,
+    torch_dtype=torch.bfloat16,
+)
+pipe.enable_model_cpu_offload()
+image = pipe("a cat holding a sign that says hello world").images[0]
+image.save("sd35.png")
+```
+
 ## StableDiffusion3Pipeline
 
 [[autodoc]] StableDiffusion3Pipeline
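
Reading aid for the single-file hunk above: the checkpoint URL follows the Hub pattern `https://huggingface.co/<org>/<repo>/blob/<revision>/<file>`. The sketch below splits such a URL into its parts using only the standard library. `split_hub_blob_url` is a hypothetical helper written for this note, not a Diffusers API, and it is not how `from_single_file` resolves URLs internally. Run on the values from the hunk, it shows that the checkpoint URL points at the `-large-turbo` repository while the pipeline is loaded from the `-large` repository, which may be worth double-checking.

```python
from urllib.parse import urlparse


def split_hub_blob_url(url):
    """Split https://huggingface.co/<org>/<repo>/blob/<revision>/<path> into parts.

    Hypothetical helper for illustration only; not part of Diffusers.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError(f"not a Hub blob URL: {url}")
    repo_id = "/".join(parts[:2])    # "<org>/<repo>"
    revision = parts[3]              # branch, tag, or commit
    filename = "/".join(parts[4:])   # path of the file inside the repo
    return repo_id, revision, filename


# Values copied from the added documentation example.
checkpoint_url = (
    "https://huggingface.co/stabilityai/stable-diffusion-3.5-large-turbo"
    "/blob/main/sd3.5_large.safetensors"
)
pipeline_repo = "stabilityai/stable-diffusion-3.5-large"

repo_id, revision, filename = split_hub_blob_url(checkpoint_url)
print(repo_id, revision, filename)
print("same repo as pipeline:", repo_id == pipeline_repo)
```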

docs/source/en/api/quantization.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+
+-->
+
+# Quantization
+
+Quantization techniques reduce memory and computational costs by representing weights and activations with lower-precision data types like 8-bit integers (int8). This enables loading larger models you normally wouldn't be able to fit into memory, and speeding up inference. Diffusers supports 8-bit and 4-bit quantization with [bitsandbytes](https://huggingface.co/docs/bitsandbytes/en/index).
+
+Quantization techniques that aren't supported in Transformers can be added with the [`DiffusersQuantizer`] class.
+
+<Tip>
+
+Learn how to quantize models in the [Quantization](../quantization/overview) guide.
+
+</Tip>
+
+
+## BitsAndBytesConfig
+
+[[autodoc]] BitsAndBytesConfig
+
+## DiffusersQuantizer
+
+[[autodoc]] quantizers.base.DiffusersQuantizer
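
To make the idea in the added `quantization.md` paragraph concrete, here is a minimal sketch of representing weights as 8-bit integers. It uses plain absmax scaling with a single scale for the whole list, which is a deliberately simplified stand-in and should not be read as the algorithm bitsandbytes or Diffusers actually implements. The function names and the sample weights are made up for illustration.

```python
from array import array


def quantize_absmax_int8(weights):
    """Map floats to int8 with one shared absmax scale (simplified illustration)."""
    absmax = max(abs(w) for w in weights)
    scale = absmax / 127 if absmax else 1.0
    # array("b") stores signed 8-bit integers, one byte per value.
    quantized = array("b", (round(w / scale) for w in weights))
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 0.9, -0.25]  # made-up sample values
q, scale = quantize_absmax_int8(weights)
restored = dequantize(q, scale)

# Storage cost: 4 bytes per float32 value versus 1 byte per int8 value.
fp32_bytes = array("f", weights).itemsize * len(weights)
int8_bytes = q.itemsize * len(q)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 codes:", list(q))
print("bytes fp32 vs int8:", fp32_bytes, int8_bytes)
print("max round-trip error:", max_error)
```

The int8 copy takes a quarter of the float32 storage, and the round-trip error of each value is bounded by half the scale. That is the trade-off the paragraph describes: less memory in exchange for reduced precision.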
