Skip to content

Commit 48a4d60

Browse files
authored
Merge branch 'main' into AnandK27-lr-prev-timestep-patch
2 parents 9f81c89 + fc72e0f commit 48a4d60

File tree

421 files changed

+39297
-5095
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

421 files changed

+39297
-5095
lines changed

.github/workflows/nightly_tests.yml

Lines changed: 114 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -180,6 +180,62 @@ jobs:
180180
pip install slack_sdk tabulate
181181
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
182182
183+
run_big_gpu_torch_tests:
184+
name: Torch tests on big GPU
185+
strategy:
186+
fail-fast: false
187+
max-parallel: 2
188+
runs-on:
189+
group: aws-g6e-xlarge-plus
190+
container:
191+
image: diffusers/diffusers-pytorch-cuda
192+
options: --shm-size "16gb" --ipc host --gpus 0
193+
steps:
194+
- name: Checkout diffusers
195+
uses: actions/checkout@v3
196+
with:
197+
fetch-depth: 2
198+
- name: NVIDIA-SMI
199+
run: nvidia-smi
200+
- name: Install dependencies
201+
run: |
202+
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
203+
python -m uv pip install -e [quality,test]
204+
python -m uv pip install peft@git+https://github.com/huggingface/peft.git
205+
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
206+
python -m uv pip install pytest-reportlog
207+
- name: Environment
208+
run: |
209+
python utils/print_env.py
210+
- name: Selected Torch CUDA Test on big GPU
211+
env:
212+
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
213+
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
214+
CUBLAS_WORKSPACE_CONFIG: :16:8
215+
BIG_GPU_MEMORY: 40
216+
run: |
217+
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
218+
-m "big_gpu_with_torch_cuda" \
219+
--make-reports=tests_big_gpu_torch_cuda \
220+
--report-log=tests_big_gpu_torch_cuda.log \
221+
tests/
222+
- name: Failure short reports
223+
if: ${{ failure() }}
224+
run: |
225+
cat reports/tests_big_gpu_torch_cuda_stats.txt
226+
cat reports/tests_big_gpu_torch_cuda_failures_short.txt
227+
- name: Test suite reports artifacts
228+
if: ${{ always() }}
229+
uses: actions/upload-artifact@v4
230+
with:
231+
name: torch_cuda_big_gpu_test_reports
232+
path: reports
233+
- name: Generate Report and Notify Channel
234+
if: always()
235+
run: |
236+
pip install slack_sdk tabulate
237+
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
238+
183239
run_flax_tpu_tests:
184240
name: Nightly Flax TPU Tests
185241
runs-on: docker-tpu
@@ -291,6 +347,64 @@ jobs:
291347
pip install slack_sdk tabulate
292348
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
293349
350+
run_nightly_quantization_tests:
351+
name: Torch quantization nightly tests
352+
strategy:
353+
fail-fast: false
354+
max-parallel: 2
355+
matrix:
356+
config:
357+
- backend: "bitsandbytes"
358+
test_location: "bnb"
359+
runs-on:
360+
group: aws-g6e-xlarge-plus
361+
container:
362+
image: diffusers/diffusers-pytorch-cuda
363+
options: --shm-size "20gb" --ipc host --gpus 0
364+
steps:
365+
- name: Checkout diffusers
366+
uses: actions/checkout@v3
367+
with:
368+
fetch-depth: 2
369+
- name: NVIDIA-SMI
370+
run: nvidia-smi
371+
- name: Install dependencies
372+
run: |
373+
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
374+
python -m uv pip install -e [quality,test]
375+
python -m uv pip install -U ${{ matrix.config.backend }}
376+
python -m uv pip install pytest-reportlog
377+
- name: Environment
378+
run: |
379+
python utils/print_env.py
380+
- name: ${{ matrix.config.backend }} quantization tests on GPU
381+
env:
382+
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
383+
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
384+
CUBLAS_WORKSPACE_CONFIG: :16:8
385+
BIG_GPU_MEMORY: 40
386+
run: |
387+
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
388+
--make-reports=tests_${{ matrix.config.backend }}_torch_cuda \
389+
--report-log=tests_${{ matrix.config.backend }}_torch_cuda.log \
390+
tests/quantization/${{ matrix.config.test_location }}
391+
- name: Failure short reports
392+
if: ${{ failure() }}
393+
run: |
394+
cat reports/tests_${{ matrix.config.backend }}_torch_cuda_stats.txt
395+
cat reports/tests_${{ matrix.config.backend }}_torch_cuda_failures_short.txt
396+
- name: Test suite reports artifacts
397+
if: ${{ always() }}
398+
uses: actions/upload-artifact@v4
399+
with:
400+
name: torch_cuda_${{ matrix.config.backend }}_reports
401+
path: reports
402+
- name: Generate Report and Notify Channel
403+
if: always()
404+
run: |
405+
pip install slack_sdk tabulate
406+
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
407+
294408
# M1 runner currently not well supported
295409
# TODO: (Dhruv) add these back when we setup better testing for Apple Silicon
296410
# run_nightly_tests_apple_m1:

.github/workflows/pr_test_peft_backend.yml

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -92,12 +92,14 @@ jobs:
9292
run: |
9393
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
9494
python -m uv pip install -e [quality,test]
95+
# TODO (sayakpaul, DN6): revisit `--no-deps`
9596
if [ "${{ matrix.lib-versions }}" == "main" ]; then
96-
python -m pip install -U peft@git+https://github.com/huggingface/peft.git
97-
python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git
98-
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
97+
python -m pip install -U peft@git+https://github.com/huggingface/peft.git --no-deps
98+
python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
99+
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git --no-deps
99100
else
100-
python -m uv pip install -U peft transformers accelerate
101+
python -m uv pip install -U peft --no-deps
102+
python -m uv pip install -U transformers accelerate --no-deps
101103
fi
102104
103105
- name: Environment

.github/workflows/push_tests.yml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -81,7 +81,7 @@ jobs:
8181
- name: Environment
8282
run: |
8383
python utils/print_env.py
84-
- name: Slow PyTorch CUDA checkpoint tests on Ubuntu
84+
- name: PyTorch CUDA checkpoint tests on Ubuntu
8585
env:
8686
HF_TOKEN: ${{ secrets.HF_TOKEN }}
8787
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
@@ -184,7 +184,7 @@ jobs:
184184
run: |
185185
python utils/print_env.py
186186
187-
- name: Run slow Flax TPU tests
187+
- name: Run Flax TPU tests
188188
env:
189189
HF_TOKEN: ${{ secrets.HF_TOKEN }}
190190
run: |
@@ -232,7 +232,7 @@ jobs:
232232
run: |
233233
python utils/print_env.py
234234
235-
- name: Run slow ONNXRuntime CUDA tests
235+
- name: Run ONNXRuntime CUDA tests
236236
env:
237237
HF_TOKEN: ${{ secrets.HF_TOKEN }}
238238
run: |

.github/workflows/ssh-runner.yml

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,12 +4,13 @@ on:
44
workflow_dispatch:
55
inputs:
66
runner_type:
7-
description: 'Type of runner to test (aws-g6-4xlarge-plus: a10 or aws-g4dn-2xlarge: t4)'
7+
description: 'Type of runner to test (aws-g6-4xlarge-plus: a10, aws-g4dn-2xlarge: t4, aws-g6e-xlarge-plus: L40)'
88
type: choice
99
required: true
1010
options:
1111
- aws-g6-4xlarge-plus
1212
- aws-g4dn-2xlarge
13+
- aws-g6e-xlarge-plus
1314
docker_image:
1415
description: 'Name of the Docker image'
1516
required: true

docs/source/en/_toctree.yml

Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -55,6 +55,8 @@
5555
- sections:
5656
- local: using-diffusers/overview_techniques
5757
title: Overview
58+
- local: using-diffusers/create_a_server
59+
title: Create a server
5860
- local: training/distributed_inference
5961
title: Distributed inference
6062
- local: using-diffusers/merge_loras
@@ -150,6 +152,12 @@
150152
title: Reinforcement learning training with DDPO
151153
title: Methods
152154
title: Training
155+
- sections:
156+
- local: quantization/overview
157+
title: Getting Started
158+
- local: quantization/bitsandbytes
159+
title: bitsandbytes
160+
title: Quantization Methods
153161
- sections:
154162
- local: optimization/fp16
155163
title: Speed up inference
@@ -182,6 +190,8 @@
182190
title: Metal Performance Shaders (MPS)
183191
- local: optimization/habana
184192
title: Habana Gaudi
193+
- local: optimization/neuron
194+
title: AWS Neuron
185195
title: Optimized hardware
186196
title: Accelerate inference and reduce memory
187197
- sections:
@@ -209,6 +219,8 @@
209219
title: Logging
210220
- local: api/outputs
211221
title: Outputs
222+
- local: api/quantization
223+
title: Quantization
212224
title: Main Classes
213225
- isExpanded: false
214226
sections:
@@ -242,6 +254,8 @@
242254
title: SparseControlNetModel
243255
title: ControlNets
244256
- sections:
257+
- local: api/models/allegro_transformer3d
258+
title: AllegroTransformer3DModel
245259
- local: api/models/aura_flow_transformer2d
246260
title: AuraFlowTransformer2DModel
247261
- local: api/models/cogvideox_transformer3d
@@ -258,6 +272,8 @@
258272
title: LatteTransformer3DModel
259273
- local: api/models/lumina_nextdit2d
260274
title: LuminaNextDiT2DModel
275+
- local: api/models/mochi_transformer3d
276+
title: MochiTransformer3DModel
261277
- local: api/models/pixart_transformer2d
262278
title: PixArtTransformer2DModel
263279
- local: api/models/prior_transformer
@@ -290,8 +306,12 @@
290306
- sections:
291307
- local: api/models/autoencoderkl
292308
title: AutoencoderKL
309+
- local: api/models/autoencoderkl_allegro
310+
title: AutoencoderKLAllegro
293311
- local: api/models/autoencoderkl_cogvideox
294312
title: AutoencoderKLCogVideoX
313+
- local: api/models/autoencoderkl_mochi
314+
title: AutoencoderKLMochi
295315
- local: api/models/asymmetricautoencoderkl
296316
title: AsymmetricAutoencoderKL
297317
- local: api/models/consistency_decoder_vae
@@ -308,6 +328,8 @@
308328
sections:
309329
- local: api/pipelines/overview
310330
title: Overview
331+
- local: api/pipelines/allegro
332+
title: Allegro
311333
- local: api/pipelines/amused
312334
title: aMUSEd
313335
- local: api/pipelines/animatediff
@@ -384,6 +406,8 @@
384406
title: Lumina-T2X
385407
- local: api/pipelines/marigold
386408
title: Marigold
409+
- local: api/pipelines/mochi
410+
title: Mochi
387411
- local: api/pipelines/panorama
388412
title: MultiDiffusion
389413
- local: api/pipelines/musicldm
Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
1+
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
4+
the License. You may obtain a copy of the License at
5+
6+
http://www.apache.org/licenses/LICENSE-2.0
7+
8+
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
9+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10+
specific language governing permissions and limitations under the License. -->
11+
12+
# AllegroTransformer3DModel
13+
14+
A Diffusion Transformer model for 3D data from [Allegro](https://github.com/rhymes-ai/Allegro) was introduced in [Allegro: Open the Black Box of Commercial-Level Video Generation Model](https://huggingface.co/papers/2410.15458) by RhymesAI.
15+
16+
The model can be loaded with the following code snippet.
17+
18+
```python
19+
from diffusers import AllegroTransformer3DModel
20+
21+
vae = AllegroTransformer3DModel.from_pretrained("rhymes-ai/Allegro", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
22+
```
23+
24+
## AllegroTransformer3DModel
25+
26+
[[autodoc]] AllegroTransformer3DModel
27+
28+
## Transformer2DModelOutput
29+
30+
[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
Lines changed: 37 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,37 @@
1+
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
4+
the License. You may obtain a copy of the License at
5+
6+
http://www.apache.org/licenses/LICENSE-2.0
7+
8+
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
9+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10+
specific language governing permissions and limitations under the License. -->
11+
12+
# AutoencoderKLAllegro
13+
14+
The 3D variational autoencoder (VAE) model with KL loss used in [Allegro](https://github.com/rhymes-ai/Allegro) was introduced in [Allegro: Open the Black Box of Commercial-Level Video Generation Model](https://huggingface.co/papers/2410.15458) by RhymesAI.
15+
16+
The model can be loaded with the following code snippet.
17+
18+
```python
19+
from diffusers import AutoencoderKLAllegro
20+
21+
vae = AutoencoderKLCogVideoX.from_pretrained("rhymes-ai/Allegro", subfolder="vae", torch_dtype=torch.float32).to("cuda")
22+
```
23+
24+
## AutoencoderKLAllegro
25+
26+
[[autodoc]] AutoencoderKLAllegro
27+
- decode
28+
- encode
29+
- all
30+
31+
## AutoencoderKLOutput
32+
33+
[[autodoc]] models.autoencoders.autoencoder_kl.AutoencoderKLOutput
34+
35+
## DecoderOutput
36+
37+
[[autodoc]] models.autoencoders.vae.DecoderOutput
Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,32 @@
1+
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
2+
3+
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
4+
the License. You may obtain a copy of the License at
5+
6+
http://www.apache.org/licenses/LICENSE-2.0
7+
8+
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
9+
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
10+
specific language governing permissions and limitations under the License. -->
11+
12+
# AutoencoderKLMochi
13+
14+
The 3D variational autoencoder (VAE) model with KL loss used in [Mochi](https://github.com/genmoai/models) was introduced in [Mochi 1 Preview](https://huggingface.co/genmo/mochi-1-preview) by Tsinghua University & ZhipuAI.
15+
16+
The model can be loaded with the following code snippet.
17+
18+
```python
19+
from diffusers import AutoencoderKLMochi
20+
21+
vae = AutoencoderKLMochi.from_pretrained("genmo/mochi-1-preview", subfolder="vae", torch_dtype=torch.float32).to("cuda")
22+
```
23+
24+
## AutoencoderKLMochi
25+
26+
[[autodoc]] AutoencoderKLMochi
27+
- decode
28+
- all
29+
30+
## DecoderOutput
31+
32+
[[autodoc]] models.autoencoders.vae.DecoderOutput

docs/source/en/api/models/controlnet.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -39,12 +39,12 @@ pipe = StableDiffusionControlNetPipeline.from_single_file(url, controlnet=contro
3939

4040
## ControlNetOutput
4141

42-
[[autodoc]] models.controlnet.ControlNetOutput
42+
[[autodoc]] models.controlnets.controlnet.ControlNetOutput
4343

4444
## FlaxControlNetModel
4545

4646
[[autodoc]] FlaxControlNetModel
4747

4848
## FlaxControlNetOutput
4949

50-
[[autodoc]] models.controlnet_flax.FlaxControlNetOutput
50+
[[autodoc]] models.controlnets.controlnet_flax.FlaxControlNetOutput

0 commit comments

Comments
 (0)