Skip to content

Commit f2a6146

Browse files
Merge branch 'main' into lora-hot-swapping
2 parents b181a47 + 723dbdd commit f2a6146

File tree

143 files changed

+10845
-735
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

143 files changed

+10845
-735
lines changed

.github/workflows/pr_style_bot.yml

Lines changed: 0 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -13,39 +13,5 @@ jobs:
1313
uses: huggingface/huggingface_hub/.github/workflows/style-bot-action.yml@main
1414
with:
1515
python_quality_dependencies: "[quality]"
16-
pre_commit_script_name: "Download and Compare files from the main branch"
17-
pre_commit_script: |
18-
echo "Downloading the files from the main branch"
19-
20-
curl -o main_Makefile https://raw.githubusercontent.com/huggingface/diffusers/main/Makefile
21-
curl -o main_setup.py https://raw.githubusercontent.com/huggingface/diffusers/refs/heads/main/setup.py
22-
curl -o main_check_doc_toc.py https://raw.githubusercontent.com/huggingface/diffusers/refs/heads/main/utils/check_doc_toc.py
23-
24-
echo "Compare the files and raise error if needed"
25-
26-
diff_failed=0
27-
if ! diff -q main_Makefile Makefile; then
28-
echo "Error: The Makefile has changed. Please ensure it matches the main branch."
29-
diff_failed=1
30-
fi
31-
32-
if ! diff -q main_setup.py setup.py; then
33-
echo "Error: The setup.py has changed. Please ensure it matches the main branch."
34-
diff_failed=1
35-
fi
36-
37-
if ! diff -q main_check_doc_toc.py utils/check_doc_toc.py; then
38-
echo "Error: The utils/check_doc_toc.py has changed. Please ensure it matches the main branch."
39-
diff_failed=1
40-
fi
41-
42-
if [ $diff_failed -eq 1 ]; then
43-
echo "❌ Error happened as we detected changes in the files that should not be changed ❌"
44-
exit 1
45-
fi
46-
47-
echo "No changes in the files. Proceeding..."
48-
rm -rf main_Makefile main_setup.py main_check_doc_toc.py
49-
style_command: "make style && make quality"
5016
secrets:
5117
bot_token: ${{ secrets.GITHUB_TOKEN }}

.github/workflows/pr_tests_gpu.yml

Lines changed: 47 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,51 @@ env:
2828
PIPELINE_USAGE_CUTOFF: 1000000000 # set high cutoff so that only always-test pipelines run
2929

3030
jobs:
31+
check_code_quality:
32+
runs-on: ubuntu-22.04
33+
steps:
34+
- uses: actions/checkout@v3
35+
- name: Set up Python
36+
uses: actions/setup-python@v4
37+
with:
38+
python-version: "3.8"
39+
- name: Install dependencies
40+
run: |
41+
python -m pip install --upgrade pip
42+
pip install .[quality]
43+
- name: Check quality
44+
run: make quality
45+
- name: Check if failure
46+
if: ${{ failure() }}
47+
run: |
48+
echo "Quality check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make style && make quality'" >> $GITHUB_STEP_SUMMARY
49+
50+
check_repository_consistency:
51+
needs: check_code_quality
52+
runs-on: ubuntu-22.04
53+
steps:
54+
- uses: actions/checkout@v3
55+
- name: Set up Python
56+
uses: actions/setup-python@v4
57+
with:
58+
python-version: "3.8"
59+
- name: Install dependencies
60+
run: |
61+
python -m pip install --upgrade pip
62+
pip install .[quality]
63+
- name: Check repo consistency
64+
run: |
65+
python utils/check_copies.py
66+
python utils/check_dummies.py
67+
python utils/check_support_list.py
68+
make deps_table_check_updated
69+
- name: Check if failure
70+
if: ${{ failure() }}
71+
run: |
72+
echo "Repo consistency check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make fix-copies'" >> $GITHUB_STEP_SUMMARY
73+
3174
setup_torch_cuda_pipeline_matrix:
75+
needs: [check_code_quality, check_repository_consistency]
3276
name: Setup Torch Pipelines CUDA Slow Tests Matrix
3377
runs-on:
3478
group: aws-general-8-plus
@@ -133,6 +177,7 @@ jobs:
133177

134178
torch_cuda_tests:
135179
name: Torch CUDA Tests
180+
needs: [check_code_quality, check_repository_consistency]
136181
runs-on:
137182
group: aws-g4dn-2xlarge
138183
container:
@@ -201,7 +246,7 @@ jobs:
201246

202247
run_examples_tests:
203248
name: Examples PyTorch CUDA tests on Ubuntu
204-
pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
249+
needs: [check_code_quality, check_repository_consistency]
205250
runs-on:
206251
group: aws-g4dn-2xlarge
207252

@@ -220,6 +265,7 @@ jobs:
220265
- name: Install dependencies
221266
run: |
222267
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
268+
pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
223269
python -m uv pip install -e [quality,test,training]
224270
225271
- name: Environment

docs/source/en/_toctree.yml

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -496,6 +496,8 @@
496496
title: PixArt-Σ
497497
- local: api/pipelines/sana
498498
title: Sana
499+
- local: api/pipelines/sana_sprint
500+
title: Sana Sprint
499501
- local: api/pipelines/self_attention_guidance
500502
title: Self-Attention Guidance
501503
- local: api/pipelines/semantic_stable_diffusion

docs/source/en/api/cache.md

Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -38,6 +38,33 @@ config = PyramidAttentionBroadcastConfig(
3838
pipe.transformer.enable_cache(config)
3939
```
4040

41+
## Faster Cache
42+
43+
[FasterCache](https://huggingface.co/papers/2410.19355) from Zhengyao Lv, Chenyang Si, Junhao Song, Zhenyu Yang, Yu Qiao, Ziwei Liu, Kwan-Yee K. Wong.
44+
45+
FasterCache is a method that speeds up inference in diffusion transformers by:
46+
- Reusing attention states between successive inference steps, due to high similarity between them
47+
- Skipping unconditional branch prediction used in classifier-free guidance by revealing redundancies between unconditional and conditional branch outputs for the same timestep, and therefore approximating the unconditional branch output using the conditional branch output
48+
49+
```python
50+
import torch
51+
from diffusers import CogVideoXPipeline, FasterCacheConfig
52+
53+
pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
54+
pipe.to("cuda")
55+
56+
config = FasterCacheConfig(
57+
spatial_attention_block_skip_range=2,
58+
spatial_attention_timestep_skip_range=(-1, 681),
59+
current_timestep_callback=lambda: pipe.current_timestep,
60+
attention_weight_callback=lambda _: 0.3,
61+
unconditional_batch_skip_range=5,
62+
unconditional_batch_timestep_skip_range=(-1, 781),
63+
tensor_format="BFCHW",
64+
)
65+
pipe.transformer.enable_cache(config)
66+
```
67+
4168
### CacheMixin
4269

4370
[[autodoc]] CacheMixin
@@ -47,3 +74,9 @@ pipe.transformer.enable_cache(config)
4774
[[autodoc]] PyramidAttentionBroadcastConfig
4875

4976
[[autodoc]] apply_pyramid_attention_broadcast
77+
78+
### FasterCacheConfig
79+
80+
[[autodoc]] FasterCacheConfig
81+
82+
[[autodoc]] apply_faster_cache

docs/source/en/api/pipelines/deepfloyd_if.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.
1414

1515
<div class="flex flex-wrap space-x-1">
1616
<img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
17+
<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
1718
</div>
1819

1920
## Overview

docs/source/en/api/pipelines/flux.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.
1414

1515
<div class="flex flex-wrap space-x-1">
1616
<img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
17+
<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
1718
</div>
1819

1920
Flux is a series of text-to-image generation models based on diffusion transformers. To know more about Flux, check out the original [blog post](https://blackforestlabs.ai/announcing-black-forest-labs/) by the creators of Flux, Black Forest Labs.

docs/source/en/api/pipelines/hunyuan_video.md

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -50,7 +50,8 @@ The following models are available for the image-to-video pipeline:
5050
| Model name | Description |
5151
|:---|:---|
5252
| [`Skywork/SkyReels-V1-Hunyuan-I2V`](https://huggingface.co/Skywork/SkyReels-V1-Hunyuan-I2V) | Skywork's custom finetune of HunyuanVideo (de-distilled). Performs best with `97x544x960` resolution. Performs best at `97x544x960` resolution, `guidance_scale=1.0`, `true_cfg_scale=6.0` and a negative prompt. |
53-
| [`hunyuanvideo-community/HunyuanVideo-I2V`](https://huggingface.co/hunyuanvideo-community/HunyuanVideo-I2V) | Tecent's official HunyuanVideo I2V model. Performs best at resolutions of 480, 720, 960, 1280. A higher `shift` value when initializing the scheduler is recommended (good values are between 7 and 20) |
53+
| [`hunyuanvideo-community/HunyuanVideo-I2V-33ch`](https://huggingface.co/hunyuanvideo-community/HunyuanVideo-I2V) | Tecent's official HunyuanVideo 33-channel I2V model. Performs best at resolutions of 480, 720, 960, 1280. A higher `shift` value when initializing the scheduler is recommended (good values are between 7 and 20). |
54+
| [`hunyuanvideo-community/HunyuanVideo-I2V`](https://huggingface.co/hunyuanvideo-community/HunyuanVideo-I2V) | Tecent's official HunyuanVideo 16-channel I2V model. Performs best at resolutions of 480, 720, 960, 1280. A higher `shift` value when initializing the scheduler is recommended (good values are between 7 and 20) |
5455

5556
## Quantization
5657

docs/source/en/api/pipelines/kolors.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@ specific language governing permissions and limitations under the License.
1414

1515
<div class="flex flex-wrap space-x-1">
1616
<img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
17+
<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
1718
</div>
1819

1920
![](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/kolors/kolors_header_collage.png)

docs/source/en/api/pipelines/ltx_video.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@
1616

1717
<div class="flex flex-wrap space-x-1">
1818
<img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
19+
<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
1920
</div>
2021

2122
[LTX Video](https://huggingface.co/Lightricks/LTX-Video) is the first DiT-based video generation model capable of generating high-quality videos in real-time. It produces 24 FPS videos at a 768x512 resolution faster than they can be watched. Trained on a large-scale dataset of diverse videos, the model generates high-resolution videos with realistic and varied content. We provide a model for both text-to-video as well as image + text-to-video usecases.
@@ -32,6 +33,7 @@ Available models:
3233
|:-------------:|:-----------------:|
3334
| [`LTX Video 0.9.0`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.safetensors) | `torch.bfloat16` |
3435
| [`LTX Video 0.9.1`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.1.safetensors) | `torch.bfloat16` |
36+
| [`LTX Video 0.9.5`](https://huggingface.co/Lightricks/LTX-Video/blob/main/ltx-video-2b-v0.9.5.safetensors) | `torch.bfloat16` |
3537

3638
Note: The recommended dtype is for the transformer component. The VAE and text encoders can be either `torch.float32`, `torch.bfloat16` or `torch.float16` but the recommended dtype is `torch.bfloat16` as used in the original repository.
3739

@@ -196,6 +198,12 @@ export_to_video(video, "ship.mp4", fps=24)
196198
- all
197199
- __call__
198200

201+
## LTXConditionPipeline
202+
203+
[[autodoc]] LTXConditionPipeline
204+
- all
205+
- __call__
206+
199207
## LTXPipelineOutput
200208

201209
[[autodoc]] pipelines.ltx.pipeline_output.LTXPipelineOutput

docs/source/en/api/pipelines/sana.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,7 @@
1616

1717
<div class="flex flex-wrap space-x-1">
1818
<img alt="LoRA" src="https://img.shields.io/badge/LoRA-d8b4fe?style=flat"/>
19+
<img alt="MPS" src="https://img.shields.io/badge/MPS-000000?style=flat&logo=apple&logoColor=white%22">
1920
</div>
2021

2122
[SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers](https://huggingface.co/papers/2410.10629) from NVIDIA and MIT HAN Lab, by Enze Xie, Junsong Chen, Junyu Chen, Han Cai, Haotian Tang, Yujun Lin, Zhekai Zhang, Muyang Li, Ligeng Zhu, Yao Lu, Song Han.

0 commit comments

Comments
 (0)