Commit 2c94c8b

Merge branch 'main' into feat/AutoEncoderKLWan/gradient_checkpointing
2 parents c559b9b + a00c73a commit 2c94c8b

415 files changed (+23051 -6576 lines changed)

.github/workflows/nightly_tests.yml

Lines changed: 1 addition & 1 deletion
@@ -417,7 +417,7 @@ jobs:
 additional_deps: ["peft"]
 - backend: "gguf"
 test_location: "gguf"
-additional_deps: []
+additional_deps: ["peft"]
 - backend: "torchao"
 test_location: "torchao"
 additional_deps: []

.github/workflows/pr_style_bot.yml

Lines changed: 0 additions & 34 deletions
@@ -13,39 +13,5 @@ jobs:
 uses: huggingface/huggingface_hub/.github/workflows/style-bot-action.yml@main
 with:
 python_quality_dependencies: "[quality]"
-pre_commit_script_name: "Download and Compare files from the main branch"
-pre_commit_script: |
-echo "Downloading the files from the main branch"
-
-curl -o main_Makefile https://raw.githubusercontent.com/huggingface/diffusers/main/Makefile
-curl -o main_setup.py https://raw.githubusercontent.com/huggingface/diffusers/refs/heads/main/setup.py
-curl -o main_check_doc_toc.py https://raw.githubusercontent.com/huggingface/diffusers/refs/heads/main/utils/check_doc_toc.py
-
-echo "Compare the files and raise error if needed"
-
-diff_failed=0
-if ! diff -q main_Makefile Makefile; then
-echo "Error: The Makefile has changed. Please ensure it matches the main branch."
-diff_failed=1
-fi
-
-if ! diff -q main_setup.py setup.py; then
-echo "Error: The setup.py has changed. Please ensure it matches the main branch."
-diff_failed=1
-fi
-
-if ! diff -q main_check_doc_toc.py utils/check_doc_toc.py; then
-echo "Error: The utils/check_doc_toc.py has changed. Please ensure it matches the main branch."
-diff_failed=1
-fi
-
-if [ $diff_failed -eq 1 ]; then
-echo "❌ Error happened as we detected changes in the files that should not be changed ❌"
-exit 1
-fi
-
-echo "No changes in the files. Proceeding..."
-rm -rf main_Makefile main_setup.py main_check_doc_toc.py
-style_command: "make style && make quality"
 secrets:
 bot_token: ${{ secrets.GITHUB_TOKEN }}

.github/workflows/pr_tests_gpu.yml

Lines changed: 3 additions & 1 deletion
@@ -177,6 +177,7 @@ jobs:
 
 torch_cuda_tests:
 name: Torch CUDA Tests
+needs: [check_code_quality, check_repository_consistency]
 runs-on:
 group: aws-g4dn-2xlarge
 container:
@@ -245,7 +246,7 @@ jobs:
 
 run_examples_tests:
 name: Examples PyTorch CUDA tests on Ubuntu
-pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
+needs: [check_code_quality, check_repository_consistency]
 runs-on:
 group: aws-g4dn-2xlarge
 
@@ -264,6 +265,7 @@ jobs:
 - name: Install dependencies
 run: |
 python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
 python -m uv pip install -e [quality,test,training]
 
 - name: Environment

docker/diffusers-onnxruntime-cpu/Dockerfile

Lines changed: 3 additions & 3 deletions
@@ -28,9 +28,9 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
 python3 -m uv pip install --no-cache-dir \
-torch==2.1.2 \
-torchvision==0.16.2 \
-torchaudio==2.1.2 \
+torch \
+torchvision \
+torchaudio\
 onnxruntime \
 --extra-index-url https://download.pytorch.org/whl/cpu && \
 python3 -m uv pip install --no-cache-dir \

docs/source/en/_toctree.yml

Lines changed: 45 additions & 33 deletions
@@ -175,7 +175,7 @@
 title: gguf
 - local: quantization/torchao
 title: torchao
-- local: quantization/quanto
+- local: quantization/quanto
 title: quanto
 title: Quantization Methods
 - sections:
@@ -265,19 +265,23 @@
 sections:
 - local: api/models/overview
 title: Overview
+- local: api/models/auto_model
+title: AutoModel
 - sections:
 - local: api/models/controlnet
 title: ControlNetModel
+- local: api/models/controlnet_union
+title: ControlNetUnionModel
 - local: api/models/controlnet_flux
 title: FluxControlNetModel
 - local: api/models/controlnet_hunyuandit
 title: HunyuanDiT2DControlNetModel
+- local: api/models/controlnet_sana
+title: SanaControlNetModel
 - local: api/models/controlnet_sd3
 title: SD3ControlNetModel
 - local: api/models/controlnet_sparsectrl
 title: SparseControlNetModel
-- local: api/models/controlnet_union
-title: ControlNetUnionModel
 title: ControlNets
 - sections:
 - local: api/models/allegro_transformer3d
@@ -286,30 +290,32 @@
 title: AuraFlowTransformer2DModel
 - local: api/models/cogvideox_transformer3d
 title: CogVideoXTransformer3DModel
-- local: api/models/consisid_transformer3d
-title: ConsisIDTransformer3DModel
 - local: api/models/cogview3plus_transformer2d
 title: CogView3PlusTransformer2DModel
 - local: api/models/cogview4_transformer2d
 title: CogView4Transformer2DModel
+- local: api/models/consisid_transformer3d
+title: ConsisIDTransformer3DModel
 - local: api/models/dit_transformer2d
 title: DiTTransformer2DModel
 - local: api/models/easyanimate_transformer3d
 title: EasyAnimateTransformer3DModel
 - local: api/models/flux_transformer
 title: FluxTransformer2DModel
+- local: api/models/hidream_image_transformer
+title: HiDreamImageTransformer2DModel
 - local: api/models/hunyuan_transformer2d
 title: HunyuanDiT2DModel
 - local: api/models/hunyuan_video_transformer_3d
 title: HunyuanVideoTransformer3DModel
 - local: api/models/latte_transformer3d
 title: LatteTransformer3DModel
-- local: api/models/lumina_nextdit2d
-title: LuminaNextDiT2DModel
-- local: api/models/lumina2_transformer2d
-title: Lumina2Transformer2DModel
 - local: api/models/ltx_video_transformer3d
 title: LTXVideoTransformer3DModel
+- local: api/models/lumina2_transformer2d
+title: Lumina2Transformer2DModel
+- local: api/models/lumina_nextdit2d
+title: LuminaNextDiT2DModel
 - local: api/models/mochi_transformer3d
 title: MochiTransformer3DModel
 - local: api/models/omnigen_transformer
@@ -318,10 +324,10 @@
 title: PixArtTransformer2DModel
 - local: api/models/prior_transformer
 title: PriorTransformer
-- local: api/models/sd3_transformer2d
-title: SD3Transformer2DModel
 - local: api/models/sana_transformer2d
 title: SanaTransformer2DModel
+- local: api/models/sd3_transformer2d
+title: SD3Transformer2DModel
 - local: api/models/stable_audio_transformer
 title: StableAudioDiTModel
 - local: api/models/transformer2d
@@ -336,10 +342,10 @@
 title: StableCascadeUNet
 - local: api/models/unet
 title: UNet1DModel
-- local: api/models/unet2d
-title: UNet2DModel
 - local: api/models/unet2d-cond
 title: UNet2DConditionModel
+- local: api/models/unet2d
+title: UNet2DModel
 - local: api/models/unet3d-cond
 title: UNet3DConditionModel
 - local: api/models/unet-motion
@@ -348,6 +354,10 @@
 title: UViT2DModel
 title: UNets
 - sections:
+- local: api/models/asymmetricautoencoderkl
+title: AsymmetricAutoencoderKL
+- local: api/models/autoencoder_dc
+title: AutoencoderDC
 - local: api/models/autoencoderkl
 title: AutoencoderKL
 - local: api/models/autoencoderkl_allegro
@@ -364,10 +374,6 @@
 title: AutoencoderKLMochi
 - local: api/models/autoencoder_kl_wan
 title: AutoencoderKLWan
-- local: api/models/asymmetricautoencoderkl
-title: AsymmetricAutoencoderKL
-- local: api/models/autoencoder_dc
-title: AutoencoderDC
 - local: api/models/consistency_decoder_vae
 title: ConsistencyDecoderVAE
 - local: api/models/autoencoder_oobleck
@@ -420,6 +426,8 @@
 title: ControlNet with Stable Diffusion 3
 - local: api/pipelines/controlnet_sdxl
 title: ControlNet with Stable Diffusion XL
+- local: api/pipelines/controlnet_sana
+title: ControlNet-Sana
 - local: api/pipelines/controlnetxs
 title: ControlNet-XS
 - local: api/pipelines/controlnetxs_sdxl
@@ -444,6 +452,8 @@
 title: Flux
 - local: api/pipelines/control_flux_inpaint
 title: FluxControlInpaint
+- local: api/pipelines/hidream
+title: HiDream-I1
 - local: api/pipelines/hunyuandit
 title: Hunyuan-DiT
 - local: api/pipelines/hunyuan_video
@@ -496,6 +506,8 @@
 title: PixArt-Σ
 - local: api/pipelines/sana
 title: Sana
+- local: api/pipelines/sana_sprint
+title: Sana Sprint
 - local: api/pipelines/self_attention_guidance
 title: Self-Attention Guidance
 - local: api/pipelines/semantic_stable_diffusion
@@ -509,40 +521,40 @@
 - sections:
 - local: api/pipelines/stable_diffusion/overview
 title: Overview
-- local: api/pipelines/stable_diffusion/text2img
-title: Text-to-image
+- local: api/pipelines/stable_diffusion/depth2img
+title: Depth-to-image
+- local: api/pipelines/stable_diffusion/gligen
+title: GLIGEN (Grounded Language-to-Image Generation)
+- local: api/pipelines/stable_diffusion/image_variation
+title: Image variation
 - local: api/pipelines/stable_diffusion/img2img
 title: Image-to-image
 - local: api/pipelines/stable_diffusion/svd
 title: Image-to-video
 - local: api/pipelines/stable_diffusion/inpaint
 title: Inpainting
-- local: api/pipelines/stable_diffusion/depth2img
-title: Depth-to-image
-- local: api/pipelines/stable_diffusion/image_variation
-title: Image variation
+- local: api/pipelines/stable_diffusion/k_diffusion
+title: K-Diffusion
+- local: api/pipelines/stable_diffusion/latent_upscale
+title: Latent upscaler
+- local: api/pipelines/stable_diffusion/ldm3d_diffusion
+title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler
 - local: api/pipelines/stable_diffusion/stable_diffusion_safe
 title: Safe Stable Diffusion
+- local: api/pipelines/stable_diffusion/sdxl_turbo
+title: SDXL Turbo
 - local: api/pipelines/stable_diffusion/stable_diffusion_2
 title: Stable Diffusion 2
 - local: api/pipelines/stable_diffusion/stable_diffusion_3
 title: Stable Diffusion 3
 - local: api/pipelines/stable_diffusion/stable_diffusion_xl
 title: Stable Diffusion XL
-- local: api/pipelines/stable_diffusion/sdxl_turbo
-title: SDXL Turbo
-- local: api/pipelines/stable_diffusion/latent_upscale
-title: Latent upscaler
 - local: api/pipelines/stable_diffusion/upscale
 title: Super-resolution
-- local: api/pipelines/stable_diffusion/k_diffusion
-title: K-Diffusion
-- local: api/pipelines/stable_diffusion/ldm3d_diffusion
-title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler
 - local: api/pipelines/stable_diffusion/adapter
 title: T2I-Adapter
-- local: api/pipelines/stable_diffusion/gligen
-title: GLIGEN (Grounded Language-to-Image Generation)
+- local: api/pipelines/stable_diffusion/text2img
+title: Text-to-image
 title: Stable Diffusion
 - local: api/pipelines/stable_unclip
 title: Stable unCLIP

docs/source/en/api/cache.md

Lines changed: 33 additions & 0 deletions
@@ -38,6 +38,33 @@ config = PyramidAttentionBroadcastConfig(
 pipe.transformer.enable_cache(config)
 ```
 
+## Faster Cache
+
+[FasterCache](https://huggingface.co/papers/2410.19355) from Zhengyao Lv, Chenyang Si, Junhao Song, Zhenyu Yang, Yu Qiao, Ziwei Liu, Kwan-Yee K. Wong.
+
+FasterCache is a method that speeds up inference in diffusion transformers by:
+- Reusing attention states between successive inference steps, due to high similarity between them
+- Skipping unconditional branch prediction used in classifier-free guidance by revealing redundancies between unconditional and conditional branch outputs for the same timestep, and therefore approximating the unconditional branch output using the conditional branch output
+
+```python
+import torch
+from diffusers import CogVideoXPipeline, FasterCacheConfig
+
+pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
+pipe.to("cuda")
+
+config = FasterCacheConfig(
+    spatial_attention_block_skip_range=2,
+    spatial_attention_timestep_skip_range=(-1, 681),
+    current_timestep_callback=lambda: pipe.current_timestep,
+    attention_weight_callback=lambda _: 0.3,
+    unconditional_batch_skip_range=5,
+    unconditional_batch_timestep_skip_range=(-1, 781),
+    tensor_format="BFCHW",
+)
+pipe.transformer.enable_cache(config)
+```
+
 ### CacheMixin
 
 [[autodoc]] CacheMixin
@@ -47,3 +74,9 @@ pipe.transformer.enable_cache(config)
 [[autodoc]] PyramidAttentionBroadcastConfig
 
 [[autodoc]] apply_pyramid_attention_broadcast
+
+### FasterCacheConfig
+
+[[autodoc]] FasterCacheConfig
+
+[[autodoc]] apply_faster_cache
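
Note on the FasterCache settings documented in the cache.md change above: as a rough illustration only, the first bullet (reusing attention states between successive inference steps) can be pictured with the toy sketch below. It assumes, judging from the parameter names, that a skip range of N means recomputing on every N-th call and that the timestep range is an open interval in which reuse is allowed. `SkipRangeCache` and all of its fields are hypothetical names invented for this sketch, not part of diffusers, and the real implementation (which also applies `attention_weight_callback`) may well differ.

```python
from typing import Callable, Optional, Tuple


class SkipRangeCache:
    """Toy cache (hypothetical, not the diffusers implementation).

    Inside the timestep window, the wrapped computation runs only on every
    `skip_range`-th call and the cached result is reused otherwise.
    Outside the window, the computation always runs.
    """

    def __init__(self, skip_range: int, timestep_window: Tuple[int, int]) -> None:
        self.skip_range = skip_range
        self.timestep_window = timestep_window
        self.iteration = 0
        self.cached: Optional[float] = None
        self.compute_calls = 0

    def __call__(self, compute: Callable[[], float], timestep: int) -> float:
        low, high = self.timestep_window
        in_window = low < timestep < high  # assumed: open interval
        can_reuse = (
            self.cached is not None
            and in_window
            and self.iteration % self.skip_range != 0
        )
        self.iteration += 1
        if can_reuse:
            return self.cached
        # Recompute and refresh the cache.
        self.cached = compute()
        self.compute_calls += 1
        return self.cached


# Values mirror the documented example: skip range 2, window (-1, 681).
cache = SkipRangeCache(skip_range=2, timestep_window=(-1, 681))
timesteps = [900, 800, 600, 500, 400, 300]
# Stand-in for an attention computation: just returns the timestep as a float.
outputs = [cache(lambda t=t: float(t), t) for t in timesteps]
print(outputs)  # [900.0, 800.0, 600.0, 600.0, 400.0, 400.0]
```

With these inputs, the two calls outside the timestep window are always recomputed, while inside the window every second call returns the cached value, so the stand-in computation runs four times instead of six.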

docs/source/en/api/loaders/lora.md

Lines changed: 14 additions & 0 deletions
@@ -20,10 +20,13 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
 - [`FluxLoraLoaderMixin`] provides similar functions for [Flux](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux).
 - [`CogVideoXLoraLoaderMixin`] provides similar functions for [CogVideoX](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogvideox).
 - [`Mochi1LoraLoaderMixin`] provides similar functions for [Mochi](https://huggingface.co/docs/diffusers/main/en/api/pipelines/mochi).
+- [`AuraFlowLoraLoaderMixin`] provides similar functions for [AuraFlow](https://huggingface.co/fal/AuraFlow).
 - [`LTXVideoLoraLoaderMixin`] provides similar functions for [LTX-Video](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video).
 - [`SanaLoraLoaderMixin`] provides similar functions for [Sana](https://huggingface.co/docs/diffusers/main/en/api/pipelines/sana).
 - [`HunyuanVideoLoraLoaderMixin`] provides similar functions for [HunyuanVideo](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hunyuan_video).
 - [`Lumina2LoraLoaderMixin`] provides similar functions for [Lumina2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/lumina2).
+- [`WanLoraLoaderMixin`] provides similar functions for [Wan](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan).
+- [`CogView4LoraLoaderMixin`] provides similar functions for [CogView4](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogview4).
 - [`AmusedLoraLoaderMixin`] is for the [`AmusedPipeline`].
 - [`LoraBaseMixin`] provides a base class with several utility methods to fuse, unfuse, unload, LoRAs and more.
 
@@ -56,6 +59,9 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse
 ## Mochi1LoraLoaderMixin
 
 [[autodoc]] loaders.lora_pipeline.Mochi1LoraLoaderMixin
+## AuraFlowLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.AuraFlowLoraLoaderMixin
 
 ## LTXVideoLoraLoaderMixin
 
@@ -73,6 +79,14 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse
 
 [[autodoc]] loaders.lora_pipeline.Lumina2LoraLoaderMixin
 
+## CogView4LoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.CogView4LoraLoaderMixin
+
+## WanLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.WanLoraLoaderMixin
+
 ## AmusedLoraLoaderMixin
 
 [[autodoc]] loaders.lora_pipeline.AmusedLoraLoaderMixin
