Commit 9e402de

Merge branch 'main' into pipeline_onnx_stable_diffusion-remove-float64
2 parents: 6689572 + 1cb73cb

390 files changed: +25894 −6206 lines

.github/workflows/benchmark.yml

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ jobs:
           python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m uv pip install -e [quality,test]
           python -m uv pip install pandas peft
+          python -m uv pip uninstall transformers && python -m uv pip install transformers==4.48.0
       - name: Environment
         run: |
           python utils/print_env.py

.github/workflows/nightly_tests.yml

Lines changed: 7 additions & 0 deletions
@@ -414,12 +414,16 @@ jobs:
         config:
           - backend: "bitsandbytes"
             test_location: "bnb"
+            additional_deps: ["peft"]
           - backend: "gguf"
             test_location: "gguf"
+            additional_deps: ["peft"]
           - backend: "torchao"
             test_location: "torchao"
+            additional_deps: []
           - backend: "optimum_quanto"
             test_location: "quanto"
+            additional_deps: []
     runs-on:
       group: aws-g6e-xlarge-plus
     container:

@@ -437,6 +441,9 @@
           python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m uv pip install -e [quality,test]
           python -m uv pip install -U ${{ matrix.config.backend }}
+          if [ "${{ join(matrix.config.additional_deps, ' ') }}" != "" ]; then
+            python -m uv pip install ${{ join(matrix.config.additional_deps, ' ') }}
+          fi
           python -m uv pip install pytest-reportlog
       - name: Environment
         run: |

.github/workflows/pr_style_bot.yml

Lines changed: 0 additions & 34 deletions
@@ -13,39 +13,5 @@ jobs:
     uses: huggingface/huggingface_hub/.github/workflows/style-bot-action.yml@main
     with:
       python_quality_dependencies: "[quality]"
-      pre_commit_script_name: "Download and Compare files from the main branch"
-      pre_commit_script: |
-        echo "Downloading the files from the main branch"
-
-        curl -o main_Makefile https://raw.githubusercontent.com/huggingface/diffusers/main/Makefile
-        curl -o main_setup.py https://raw.githubusercontent.com/huggingface/diffusers/refs/heads/main/setup.py
-        curl -o main_check_doc_toc.py https://raw.githubusercontent.com/huggingface/diffusers/refs/heads/main/utils/check_doc_toc.py
-
-        echo "Compare the files and raise error if needed"
-
-        diff_failed=0
-        if ! diff -q main_Makefile Makefile; then
-          echo "Error: The Makefile has changed. Please ensure it matches the main branch."
-          diff_failed=1
-        fi
-
-        if ! diff -q main_setup.py setup.py; then
-          echo "Error: The setup.py has changed. Please ensure it matches the main branch."
-          diff_failed=1
-        fi
-
-        if ! diff -q main_check_doc_toc.py utils/check_doc_toc.py; then
-          echo "Error: The utils/check_doc_toc.py has changed. Please ensure it matches the main branch."
-          diff_failed=1
-        fi
-
-        if [ $diff_failed -eq 1 ]; then
-          echo "❌ Error happened as we detected changes in the files that should not be changed ❌"
-          exit 1
-        fi
-
-        echo "No changes in the files. Proceeding..."
-        rm -rf main_Makefile main_setup.py main_check_doc_toc.py
-      style_command: "make style && make quality"
     secrets:
       bot_token: ${{ secrets.GITHUB_TOKEN }}

.github/workflows/pr_tests_gpu.yml

Lines changed: 47 additions & 1 deletion
@@ -28,7 +28,51 @@ env:
   PIPELINE_USAGE_CUTOFF: 1000000000 # set high cutoff so that only always-test pipelines run
 
 jobs:
+  check_code_quality:
+    runs-on: ubuntu-22.04
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.8"
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install .[quality]
+      - name: Check quality
+        run: make quality
+      - name: Check if failure
+        if: ${{ failure() }}
+        run: |
+          echo "Quality check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make style && make quality'" >> $GITHUB_STEP_SUMMARY
+
+  check_repository_consistency:
+    needs: check_code_quality
+    runs-on: ubuntu-22.04
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python
+        uses: actions/setup-python@v4
+        with:
+          python-version: "3.8"
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install .[quality]
+      - name: Check repo consistency
+        run: |
+          python utils/check_copies.py
+          python utils/check_dummies.py
+          python utils/check_support_list.py
+          make deps_table_check_updated
+      - name: Check if failure
+        if: ${{ failure() }}
+        run: |
+          echo "Repo consistency check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make fix-copies'" >> $GITHUB_STEP_SUMMARY
+
   setup_torch_cuda_pipeline_matrix:
+    needs: [check_code_quality, check_repository_consistency]
     name: Setup Torch Pipelines CUDA Slow Tests Matrix
     runs-on:
       group: aws-general-8-plus

@@ -133,6 +177,7 @@ jobs:
 
   torch_cuda_tests:
     name: Torch CUDA Tests
+    needs: [check_code_quality, check_repository_consistency]
     runs-on:
       group: aws-g4dn-2xlarge
     container:

@@ -201,7 +246,7 @@
 
   run_examples_tests:
     name: Examples PyTorch CUDA tests on Ubuntu
-    pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
+    needs: [check_code_quality, check_repository_consistency]
     runs-on:
       group: aws-g4dn-2xlarge
 

@@ -220,6 +265,7 @@
       - name: Install dependencies
         run: |
           python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
           python -m uv pip install -e [quality,test,training]
 
       - name: Environment

docs/source/en/_toctree.yml

Lines changed: 15 additions & 3 deletions
@@ -175,7 +175,7 @@
    title: gguf
  - local: quantization/torchao
    title: torchao
-   - local: quantization/quanto
+  - local: quantization/quanto
    title: quanto
  title: Quantization Methods
- sections:

@@ -265,19 +265,23 @@
  sections:
  - local: api/models/overview
    title: Overview
+  - local: api/models/auto_model
+    title: AutoModel
  - sections:
    - local: api/models/controlnet
      title: ControlNetModel
+    - local: api/models/controlnet_union
+      title: ControlNetUnionModel
    - local: api/models/controlnet_flux
      title: FluxControlNetModel
    - local: api/models/controlnet_hunyuandit
      title: HunyuanDiT2DControlNetModel
+    - local: api/models/controlnet_sana
+      title: SanaControlNetModel
    - local: api/models/controlnet_sd3
      title: SD3ControlNetModel
    - local: api/models/controlnet_sparsectrl
      title: SparseControlNetModel
-    - local: api/models/controlnet_union
-      title: ControlNetUnionModel
    title: ControlNets
  - sections:
    - local: api/models/allegro_transformer3d

@@ -298,6 +302,8 @@
      title: EasyAnimateTransformer3DModel
    - local: api/models/flux_transformer
      title: FluxTransformer2DModel
+    - local: api/models/hidream_image_transformer
+      title: HiDreamImageTransformer2DModel
    - local: api/models/hunyuan_transformer2d
      title: HunyuanDiT2DModel
    - local: api/models/hunyuan_video_transformer_3d

@@ -420,6 +426,8 @@
      title: ControlNet with Stable Diffusion 3
    - local: api/pipelines/controlnet_sdxl
      title: ControlNet with Stable Diffusion XL
+    - local: api/pipelines/controlnet_sana
+      title: ControlNet-Sana
    - local: api/pipelines/controlnetxs
      title: ControlNet-XS
    - local: api/pipelines/controlnetxs_sdxl

@@ -444,6 +452,8 @@
      title: Flux
    - local: api/pipelines/control_flux_inpaint
      title: FluxControlInpaint
+    - local: api/pipelines/hidream
+      title: HiDream-I1
    - local: api/pipelines/hunyuandit
      title: Hunyuan-DiT
    - local: api/pipelines/hunyuan_video

@@ -496,6 +506,8 @@
      title: PixArt-Σ
    - local: api/pipelines/sana
      title: Sana
+    - local: api/pipelines/sana_sprint
+      title: Sana Sprint
    - local: api/pipelines/self_attention_guidance
      title: Self-Attention Guidance
    - local: api/pipelines/semantic_stable_diffusion

docs/source/en/api/cache.md

Lines changed: 33 additions & 0 deletions
@@ -38,6 +38,33 @@ config = PyramidAttentionBroadcastConfig(
 pipe.transformer.enable_cache(config)
 ```
 
+## Faster Cache
+
+[FasterCache](https://huggingface.co/papers/2410.19355) from Zhengyao Lv, Chenyang Si, Junhao Song, Zhenyu Yang, Yu Qiao, Ziwei Liu, Kwan-Yee K. Wong.
+
+FasterCache is a method that speeds up inference in diffusion transformers by:
+- Reusing attention states between successive inference steps, due to high similarity between them
+- Skipping unconditional branch prediction used in classifier-free guidance by revealing redundancies between unconditional and conditional branch outputs for the same timestep, and therefore approximating the unconditional branch output using the conditional branch output
+
+```python
+import torch
+from diffusers import CogVideoXPipeline, FasterCacheConfig
+
+pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
+pipe.to("cuda")
+
+config = FasterCacheConfig(
+    spatial_attention_block_skip_range=2,
+    spatial_attention_timestep_skip_range=(-1, 681),
+    current_timestep_callback=lambda: pipe.current_timestep,
+    attention_weight_callback=lambda _: 0.3,
+    unconditional_batch_skip_range=5,
+    unconditional_batch_timestep_skip_range=(-1, 781),
+    tensor_format="BFCHW",
+)
+pipe.transformer.enable_cache(config)
+```
+
 ### CacheMixin
 
 [[autodoc]] CacheMixin

@@ -47,3 +74,9 @@ pipe.transformer.enable_cache(config)
 [[autodoc]] PyramidAttentionBroadcastConfig
 
 [[autodoc]] apply_pyramid_attention_broadcast
+
+### FasterCacheConfig
+
+[[autodoc]] FasterCacheConfig
+
+[[autodoc]] apply_faster_cache
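
Note: the FasterCache snippet added above stops at `enable_cache`; generation itself is unchanged. A minimal sketch of the remaining steps, assuming the same pipeline as the doc snippet (the prompt and export settings below are illustrative, not from this commit):

```python
import torch
from diffusers import CogVideoXPipeline, FasterCacheConfig
from diffusers.utils import export_to_video

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Same config as in the doc snippet above: FasterCache reuses attention
# states across steps and approximates the unconditional CFG branch.
config = FasterCacheConfig(
    spatial_attention_block_skip_range=2,
    spatial_attention_timestep_skip_range=(-1, 681),
    current_timestep_callback=lambda: pipe.current_timestep,
    attention_weight_callback=lambda _: 0.3,
    unconditional_batch_skip_range=5,
    unconditional_batch_timestep_skip_range=(-1, 781),
    tensor_format="BFCHW",
)
pipe.transformer.enable_cache(config)

# Generation is the usual pipeline call; the caching is transparent here.
video = pipe("A panda playing guitar in a bamboo forest", num_frames=49).frames[0]
export_to_video(video, "output.mp4", fps=8)
```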
docs/source/en/api/models/auto_model.md

Lines changed: 29 additions & 0 deletions

@@ -0,0 +1,29 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# AutoModel
+
+The `AutoModel` is designed to make it easy to load a checkpoint without needing to know the specific model class. `AutoModel` automatically retrieves the correct model class from the checkpoint `config.json` file.
+
+```python
+from diffusers import AutoModel, AutoPipelineForText2Image
+
+unet = AutoModel.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="unet")
+pipe = AutoPipelineForText2Image.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", unet=unet)
+```
+
+
+## AutoModel
+
+[[autodoc]] AutoModel
+  - all
+  - from_pretrained
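
Since `AutoModel` resolves the class from the subfolder's `config.json`, the same call pattern loads any component without naming its class. A small sketch of that behavior, assuming the same checkpoint as the doc snippet (the printed class name follows from the known SD 1.5 layout):

```python
from diffusers import AutoModel

# The class is read from config.json in the subfolder, so the caller
# never has to know this component is an AutoencoderKL.
vae = AutoModel.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="vae"
)
print(type(vae).__name__)  # expected: AutoencoderKL
```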

docs/source/en/api/models/autoencoderkl_allegro.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ The model can be loaded with the following code snippet.
 ```python
 from diffusers import AutoencoderKLAllegro
 
-vae = AutoencoderKLCogVideoX.from_pretrained("rhymes-ai/Allegro", subfolder="vae", torch_dtype=torch.float32).to("cuda")
+vae = AutoencoderKLAllegro.from_pretrained("rhymes-ai/Allegro", subfolder="vae", torch_dtype=torch.float32).to("cuda")
 ```
 
 ## AutoencoderKLAllegro
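
Even after this fix, the doc snippet passes `torch.float32` without importing `torch`. A self-contained version of the corrected example:

```python
import torch
from diffusers import AutoencoderKLAllegro

# Same checkpoint and dtype as the doc snippet, with the missing import added.
vae = AutoencoderKLAllegro.from_pretrained(
    "rhymes-ai/Allegro", subfolder="vae", torch_dtype=torch.float32
).to("cuda")
```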
docs/source/en/api/models/controlnet_sana.md

Lines changed: 29 additions & 0 deletions

@@ -0,0 +1,29 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# SanaControlNetModel
+
+The ControlNet model was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang, Anyi Rao, Maneesh Agrawala. It provides a greater degree of control over text-to-image generation by conditioning the model on additional inputs such as edge maps, depth maps, segmentation maps, and keypoints for pose detection.
+
+The abstract from the paper is:
+
+*We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers pretrained with billions of images as a strong backbone to learn a diverse set of conditional controls. The neural architecture is connected with "zero convolutions" (zero-initialized convolution layers) that progressively grow the parameters from zero and ensure that no harmful noise could affect the finetuning. We test various conditioning controls, eg, edges, depth, segmentation, human pose, etc, with Stable Diffusion, using single or multiple conditions, with or without prompts. We show that the training of ControlNets is robust with small (<50k) and large (>1m) datasets. Extensive results show that ControlNet may facilitate wider applications to control image diffusion models.*
+
+This model was contributed by [ishan24](https://huggingface.co/ishan24). ❤️
+The original codebase can be found at [NVlabs/Sana](https://github.com/NVlabs/Sana), and you can find official ControlNet checkpoints on [Efficient-Large-Model's](https://huggingface.co/Efficient-Large-Model) Hub profile.
+
+## SanaControlNetModel
+[[autodoc]] SanaControlNetModel
+
+## SanaControlNetOutput
+[[autodoc]] models.controlnets.controlnet_sana.SanaControlNetOutput
+
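
The new page documents the model class only. As a hedged sketch of how such a ControlNet is typically loaded, the checkpoint ID below is a placeholder rather than anything stated in this commit:

```python
import torch
from diffusers import SanaControlNetModel

# Placeholder checkpoint ID for illustration; official weights live on the
# Efficient-Large-Model Hub profile mentioned above.
controlnet = SanaControlNetModel.from_pretrained(
    "Efficient-Large-Model/Sana-ControlNet", torch_dtype=torch.bfloat16
)
# The model is then passed to the matching pipeline via `controlnet=...`,
# following the usual diffusers ControlNet pattern.
```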
docs/source/en/api/models/hidream_image_transformer.md

Lines changed: 30 additions & 0 deletions

@@ -0,0 +1,30 @@
+<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License. -->
+
+# HiDreamImageTransformer2DModel
+
+A Transformer model for image-like data from [HiDream-I1](https://huggingface.co/HiDream-ai).
+
+The model can be loaded with the following code snippet.
+
+```python
+from diffusers import HiDreamImageTransformer2DModel
+
+transformer = HiDreamImageTransformer2DModel.from_pretrained("HiDream-ai/HiDream-I1-Full", subfolder="transformer", torch_dtype=torch.bfloat16)
+```
+
+## HiDreamImageTransformer2DModel
+
+[[autodoc]] HiDreamImageTransformer2DModel
+
+## Transformer2DModelOutput
+
+[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
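
As committed, this snippet passes `torch.bfloat16` without importing `torch`. A self-contained version:

```python
import torch
from diffusers import HiDreamImageTransformer2DModel

# Same call as the doc snippet, with the missing torch import added.
transformer = HiDreamImageTransformer2DModel.from_pretrained(
    "HiDream-ai/HiDream-I1-Full", subfolder="transformer", torch_dtype=torch.bfloat16
)
```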
