Commit c29684f

Merge branch 'main' into add-quanto

2 parents: c4b6e24 + f10d3c6

File tree

127 files changed: +7399 -761 lines (large commit; only a subset of the changed files is shown below)


.github/workflows/pr_style_bot.yml

Lines changed: 127 additions & 0 deletions (new file, @@ -0,0 +1,127 @@)
```yaml
name: PR Style Bot

on:
  issue_comment:
    types: [created]

permissions:
  contents: write
  pull-requests: write

jobs:
  run-style-bot:
    if: >
      contains(github.event.comment.body, '@bot /style') &&
      github.event.issue.pull_request != null
    runs-on: ubuntu-latest

    steps:
      - name: Extract PR details
        id: pr_info
        uses: actions/github-script@v6
        with:
          script: |
            const prNumber = context.payload.issue.number;
            const { data: pr } = await github.rest.pulls.get({
              owner: context.repo.owner,
              repo: context.repo.repo,
              pull_number: prNumber
            });

            // We capture both the branch ref and the "full_name" of the head repo
            // so that we can check out the correct repository & branch (including forks).
            core.setOutput("prNumber", prNumber);
            core.setOutput("headRef", pr.head.ref);
            core.setOutput("headRepoFullName", pr.head.repo.full_name);

      - name: Check out PR branch
        uses: actions/checkout@v3
        env:
          HEADREPOFULLNAME: ${{ steps.pr_info.outputs.headRepoFullName }}
          HEADREF: ${{ steps.pr_info.outputs.headRef }}
        with:
          # Instead of checking out the base repo, use the contributor's repo name
          repository: ${{ env.HEADREPOFULLNAME }}
          ref: ${{ env.HEADREF }}
          # fetch-depth: 0 may be needed to be able to push
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Debug
        env:
          HEADREPOFULLNAME: ${{ steps.pr_info.outputs.headRepoFullName }}
          HEADREF: ${{ steps.pr_info.outputs.headRef }}
          PRNUMBER: ${{ steps.pr_info.outputs.prNumber }}
        run: |
          echo "PR number: ${{ env.PRNUMBER }}"
          echo "Head Ref: ${{ env.HEADREF }}"
          echo "Head Repo Full Name: ${{ env.HEADREPOFULLNAME }}"

      - name: Set up Python
        uses: actions/setup-python@v4

      - name: Install dependencies
        run: |
          pip install .[quality]

      - name: Download Makefile from main branch
        run: |
          curl -o main_Makefile https://raw.githubusercontent.com/huggingface/diffusers/main/Makefile

      - name: Compare Makefiles
        run: |
          if ! diff -q main_Makefile Makefile; then
            echo "Error: The Makefile has changed. Please ensure it matches the main branch."
            exit 1
          fi
          echo "No changes in Makefile. Proceeding..."
          rm -rf main_Makefile

      - name: Run make style and make quality
        run: |
          make style && make quality

      - name: Commit and push changes
        id: commit_and_push
        env:
          HEADREPOFULLNAME: ${{ steps.pr_info.outputs.headRepoFullName }}
          HEADREF: ${{ steps.pr_info.outputs.headRef }}
          PRNUMBER: ${{ steps.pr_info.outputs.prNumber }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          echo "HEADREPOFULLNAME: ${{ env.HEADREPOFULLNAME }}, HEADREF: ${{ env.HEADREF }}"
          # Configure git with the Actions bot user
          git config user.name "github-actions[bot]"
          git config user.email "github-actions[bot]@users.noreply.github.com"

          # Make sure the 'origin' remote points at the contributor's fork
          git remote set-url origin "https://x-access-token:${GITHUB_TOKEN}@github.com/${{ env.HEADREPOFULLNAME }}.git"

          # If there are changes after running style/quality, commit them
          if [ -n "$(git status --porcelain)" ]; then
            git add .
            git commit -m "Apply style fixes"
            # Push to the contributor's forked branch
            git push origin HEAD:${{ env.HEADREF }}
            echo "changes_pushed=true" >> $GITHUB_OUTPUT
          else
            echo "No changes to commit."
            echo "changes_pushed=false" >> $GITHUB_OUTPUT
          fi

      - name: Comment on PR with workflow run link
        if: steps.commit_and_push.outputs.changes_pushed == 'true'
        uses: actions/github-script@v6
        with:
          script: |
            const prNumber = parseInt(process.env.prNumber, 10);
            const runUrl = `${process.env.GITHUB_SERVER_URL}/${process.env.GITHUB_REPOSITORY}/actions/runs/${process.env.GITHUB_RUN_ID}`;

            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: prNumber,
              body: `Style fixes have been applied. [View the workflow run here](${runUrl}).`
            });
        env:
          prNumber: ${{ steps.pr_info.outputs.prNumber }}
```

.github/workflows/pr_tests.yml

Lines changed: 2 additions & 2 deletions

```diff
@@ -2,8 +2,8 @@ name: Fast tests for PRs

 on:
   pull_request:
-    branches:
-      - main
+    branches: [main]
+    types: [synchronize]
   paths:
     - "src/diffusers/**.py"
     - "benchmarks/**.py"
```

docs/source/en/_toctree.yml

Lines changed: 4 additions & 0 deletions

```diff
@@ -280,6 +280,8 @@
       title: ConsisIDTransformer3DModel
     - local: api/models/cogview3plus_transformer2d
       title: CogView3PlusTransformer2DModel
+    - local: api/models/cogview4_transformer2d
+      title: CogView4Transformer2DModel
     - local: api/models/dit_transformer2d
       title: DiTTransformer2DModel
     - local: api/models/flux_transformer
@@ -384,6 +386,8 @@
       title: CogVideoX
     - local: api/pipelines/cogview3
       title: CogView3
+    - local: api/pipelines/cogview4
+      title: CogView4
     - local: api/pipelines/consisid
       title: ConsisID
     - local: api/pipelines/consistency_models
```

docs/source/en/api/loaders/lora.md

Lines changed: 20 additions & 0 deletions

```diff
@@ -20,6 +20,10 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
 - [`FluxLoraLoaderMixin`] provides similar functions for [Flux](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux).
 - [`CogVideoXLoraLoaderMixin`] provides similar functions for [CogVideoX](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogvideox).
 - [`Mochi1LoraLoaderMixin`] provides similar functions for [Mochi](https://huggingface.co/docs/diffusers/main/en/api/pipelines/mochi).
+- [`LTXVideoLoraLoaderMixin`] provides similar functions for [LTX-Video](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video).
+- [`SanaLoraLoaderMixin`] provides similar functions for [Sana](https://huggingface.co/docs/diffusers/main/en/api/pipelines/sana).
+- [`HunyuanVideoLoraLoaderMixin`] provides similar functions for [HunyuanVideo](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hunyuan_video).
+- [`Lumina2LoraLoaderMixin`] provides similar functions for [Lumina2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/lumina2).
 - [`AmusedLoraLoaderMixin`] is for the [`AmusedPipeline`].
 - [`LoraBaseMixin`] provides a base class with several utility methods to fuse, unfuse, unload, LoRAs and more.

@@ -53,6 +57,22 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse

 [[autodoc]] loaders.lora_pipeline.Mochi1LoraLoaderMixin

+## LTXVideoLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.LTXVideoLoraLoaderMixin
+
+## SanaLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.SanaLoraLoaderMixin
+
+## HunyuanVideoLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.HunyuanVideoLoraLoaderMixin
+
+## Lumina2LoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.Lumina2LoraLoaderMixin
+
 ## AmusedLoraLoaderMixin

 [[autodoc]] loaders.lora_pipeline.AmusedLoraLoaderMixin
```
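
All of these mixins expose the same loading surface on their pipelines. As a rough, hedged sketch of that shared API (the base-model and LoRA identifiers below are placeholders, not checkpoints referenced by this commit):

```python
import torch
from diffusers import DiffusionPipeline

# "<base-model-id>" and "<lora-repo-or-path>" are hypothetical placeholders.
pipe = DiffusionPipeline.from_pretrained("<base-model-id>", torch_dtype=torch.bfloat16).to("cuda")

# load_lora_weights comes from the pipeline's *LoraLoaderMixin
pipe.load_lora_weights("<lora-repo-or-path>", adapter_name="my_lora")

# ... run inference with the adapter active ...

# Utility from LoraBaseMixin: remove the adapter and restore the base weights
pipe.unload_lora_weights()
```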

docs/source/en/api/models/cogview4_transformer2d.md (path inferred from the _toctree.yml entry above)

Lines changed: 30 additions & 0 deletions (new file)

<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# CogView4Transformer2DModel

A Diffusion Transformer model for 2D data from [CogView4]().

The model can be loaded with the following code snippet.

```python
import torch
from diffusers import CogView4Transformer2DModel

transformer = CogView4Transformer2DModel.from_pretrained("THUDM/CogView4-6B", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
```

## CogView4Transformer2DModel

[[autodoc]] CogView4Transformer2DModel

## Transformer2DModelOutput

[[autodoc]] models.modeling_outputs.Transformer2DModelOutput

docs/source/en/api/pipelines/cogview4.md (path inferred from the _toctree.yml entry above)

Lines changed: 34 additions & 0 deletions (new file)

<!--Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-->

# CogView4

<Tip>

Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.

</Tip>

This pipeline was contributed by [zRzRzRzRzRzRzR](https://github.com/zRzRzRzRzRzRzR). The original codebase can be found [here](https://huggingface.co/THUDM). The original weights can be found under [hf.co/THUDM](https://huggingface.co/THUDM).

## CogView4Pipeline

[[autodoc]] CogView4Pipeline
  - all
  - __call__

## CogView4PipelineOutput

[[autodoc]] pipelines.cogview4.pipeline_output.CogView4PipelineOutput
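
For orientation, a minimal usage sketch of the pipeline documented above. It assumes the weights are hosted at "THUDM/CogView4-6B" (the checkpoint used by the transformer snippet earlier in this commit); the prompt and output path are illustrative only:

```python
import torch
from diffusers import CogView4Pipeline

# Load the full CogView4 pipeline in bfloat16 and move it to the GPU
pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", torch_dtype=torch.bfloat16).to("cuda")

# Text-to-image generation with default settings
image = pipe(prompt="A watercolor painting of a red panda sipping tea").images[0]
image.save("cogview4_example.png")
```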

docs/source/en/api/utilities.md

Lines changed: 4 additions & 0 deletions

```diff
@@ -45,3 +45,7 @@ Utility and helper functions for working with 🤗 Diffusers.
 ## apply_layerwise_casting

 [[autodoc]] hooks.layerwise_casting.apply_layerwise_casting
+
+## apply_group_offloading
+
+[[autodoc]] hooks.group_offloading.apply_group_offloading
```

docs/source/en/optimization/memory.md

Lines changed: 40 additions & 0 deletions

The hunk (@@ -158,6 +158,46 @@) inserts a new "Group offloading" section after the closing </Tip> of the model-offloading note:

## Group offloading

Group offloading is the middle ground between sequential and model offloading. It works by offloading groups of internal layers (either `torch.nn.ModuleList` or `torch.nn.Sequential`), which uses less memory than model-level offloading. It is also faster than sequential-level offloading because the number of device synchronizations is reduced.

To enable group offloading, call the [`~ModelMixin.enable_group_offload`] method on the model if it is a Diffusers model implementation. For any other model implementation, use [`~hooks.group_offloading.apply_group_offloading`]:

```python
import torch
from diffusers import CogVideoXPipeline
from diffusers.hooks import apply_group_offloading
from diffusers.utils import export_to_video

# Load the pipeline
onload_device = torch.device("cuda")
offload_device = torch.device("cpu")
pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)

# We can utilize the enable_group_offload method for Diffusers model implementations
pipe.transformer.enable_group_offload(onload_device=onload_device, offload_device=offload_device, offload_type="leaf_level", use_stream=True)

# For any other model implementations, the apply_group_offloading function can be used
apply_group_offloading(pipe.text_encoder, onload_device=onload_device, offload_type="block_level", num_blocks_per_group=2)
apply_group_offloading(pipe.vae, onload_device=onload_device, offload_type="leaf_level")

prompt = (
    "A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. "
    "The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other "
    "pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, "
    "casting a gentle glow on the scene. The panda's face is expressive, showing concentration and joy as it plays. "
    "The background includes a small, flowing stream and vibrant green foliage, enhancing the peaceful and magical "
    "atmosphere of this unique musical performance."
)
video = pipe(prompt=prompt, guidance_scale=6, num_inference_steps=50).frames[0]
# This run used about 14.79 GB. It can be reduced further by using tiling and leaf_level offloading throughout the pipeline.
print(f"Max memory allocated: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GB")
export_to_video(video, "output.mp4", fps=8)
```

On CUDA devices that support asynchronous data-transfer streams, group offloading overlaps data transfer with computation, reducing overall execution time compared to sequential offloading. This works through layer prefetching with CUDA streams: the next layer to be executed is loaded onto the accelerator device while the current layer is still running, which slightly increases memory requirements. Group offloading also supports leaf-level offloading (equivalent to sequential CPU offloading), which can be made much faster when streams are used.
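
To make the block-versus-leaf distinction concrete, here is a minimal, hedged sketch of `apply_group_offloading` on a bare `torch.nn.Module` with stream-based prefetching; the toy module is an assumption for illustration, and `use_stream=True` presumes a CUDA device with asynchronous transfer support:

```python
import torch
from diffusers.hooks import apply_group_offloading

# A toy stand-in for any non-Diffusers torch.nn.Module (assumption, not from this commit)
module = torch.nn.Sequential(
    torch.nn.Linear(64, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 64),
)

# Offload each leaf module to CPU, prefetching the next one onto the GPU
# while the current one computes (overlapping transfer with compute).
apply_group_offloading(
    module,
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="leaf_level",
    use_stream=True,
)
```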
The new section sits directly above the existing "FP8 layerwise weight-casting" section:

## FP8 layerwise weight-casting

PyTorch supports `torch.float8_e4m3fn` and `torch.float8_e5m2` as weight storage dtypes, but they can't be used for computation in many different tensor operations due to unimplemented kernel support. However, you can use these dtypes to store model weights in fp8 precision and upcast them on-the-fly when the layers are used in the forward pass. This is known as layerwise weight-casting.

docs/source/en/training/custom_diffusion.md

Lines changed: 4 additions & 1 deletion

```diff
@@ -339,7 +339,10 @@ import torch
 from huggingface_hub.repocard import RepoCard
 from diffusers import DiffusionPipeline

-pipeline = DiffusionPipeline.from_pretrained("sayakpaul/custom-diffusion-cat-wooden-pot", torch_dtype=torch.float16).to("cuda")
+pipeline = DiffusionPipeline.from_pretrained(
+    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16,
+).to("cuda")
+model_id = "sayakpaul/custom-diffusion-cat-wooden-pot"
 pipeline.unet.load_attn_procs(model_id, weight_name="pytorch_custom_diffusion_weights.bin")
 pipeline.load_textual_inversion(model_id, weight_name="<new1>.bin")
 pipeline.load_textual_inversion(model_id, weight_name="<new2>.bin")
```

The fix defines `model_id` (previously referenced without being defined) and loads the base Stable Diffusion checkpoint before applying the Custom Diffusion attention processors and textual inversion embeddings.

docs/source/en/tutorials/using_peft_for_inference.md

Lines changed: 4 additions & 0 deletions

Appended after the tutorial's final code example (@@ -221,3 +221,7 @@):

## PeftInputAutocastDisableHook

[[autodoc]] hooks.layerwise_casting.PeftInputAutocastDisableHook
