Commit a63c316

Merge branch 'main' into patch-1
2 parents 605c7ae + 914a585 commit a63c316

287 files changed: +29368 -2620 lines changed

.github/workflows/nightly_tests.yml

Lines changed: 58 additions & 0 deletions
@@ -347,6 +347,64 @@ jobs:
           pip install slack_sdk tabulate
           python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
 
+  run_nightly_quantization_tests:
+    name: Torch quantization nightly tests
+    strategy:
+      fail-fast: false
+      max-parallel: 2
+      matrix:
+        config:
+          - backend: "bitsandbytes"
+            test_location: "bnb"
+    runs-on:
+      group: aws-g6e-xlarge-plus
+    container:
+      image: diffusers/diffusers-pytorch-cuda
+      options: --shm-size "20gb" --ipc host --gpus 0
+    steps:
+      - name: Checkout diffusers
+        uses: actions/checkout@v3
+        with:
+          fetch-depth: 2
+      - name: NVIDIA-SMI
+        run: nvidia-smi
+      - name: Install dependencies
+        run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install -U ${{ matrix.config.backend }}
+          python -m uv pip install pytest-reportlog
+      - name: Environment
+        run: |
+          python utils/print_env.py
+      - name: ${{ matrix.config.backend }} quantization tests on GPU
+        env:
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
+          # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
+          CUBLAS_WORKSPACE_CONFIG: :16:8
+          BIG_GPU_MEMORY: 40
+        run: |
+          python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
+            --make-reports=tests_${{ matrix.config.backend }}_torch_cuda \
+            --report-log=tests_${{ matrix.config.backend }}_torch_cuda.log \
+            tests/quantization/${{ matrix.config.test_location }}
+      - name: Failure short reports
+        if: ${{ failure() }}
+        run: |
+          cat reports/tests_${{ matrix.config.backend }}_torch_cuda_stats.txt
+          cat reports/tests_${{ matrix.config.backend }}_torch_cuda_failures_short.txt
+      - name: Test suite reports artifacts
+        if: ${{ always() }}
+        uses: actions/upload-artifact@v4
+        with:
+          name: torch_cuda_${{ matrix.config.backend }}_reports
+          path: reports
+      - name: Generate Report and Notify Channel
+        if: always()
+        run: |
+          pip install slack_sdk tabulate
+          python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
+
 # M1 runner currently not well supported
 # TODO: (Dhruv) add these back when we setup better testing for Apple Silicon
 # run_nightly_tests_apple_m1:
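
As a rough illustration of how the `matrix.config` entry in the job above parameterizes its steps, the following Python sketch expands each entry into the step name, pytest command, and artifact name that the placeholders would resolve to. The names `MATRIX` and `expand_matrix` are hypothetical and exist only for this illustration; GitHub Actions performs the substitution itself, and nothing here is part of the repository.

```python
# Hypothetical sketch: mimic how `${{ matrix.config.* }}` placeholders in the
# workflow resolve for each matrix entry. GitHub Actions does this natively.

MATRIX = [
    # The single entry defined in the workflow's strategy.matrix.config list.
    {"backend": "bitsandbytes", "test_location": "bnb"},
]


def expand_matrix(matrix):
    """Return one dict of derived names per matrix entry."""
    jobs = []
    for config in matrix:
        # Shared stem used by --make-reports and --report-log in the workflow.
        report = f"tests_{config['backend']}_torch_cuda"
        jobs.append(
            {
                "step_name": f"{config['backend']} quantization tests on GPU",
                "command": (
                    "python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile "
                    f"--make-reports={report} "
                    f"--report-log={report}.log "
                    f"tests/quantization/{config['test_location']}"
                ),
                "artifact": f"torch_cuda_{config['backend']}_reports",
            }
        )
    return jobs


if __name__ == "__main__":
    for job in expand_matrix(MATRIX):
        print(job["step_name"])
        print(job["command"])
        print(job["artifact"])
```

Under this reading, supporting another quantization backend would mostly be a matter of appending one more `backend` / `test_location` pair to the matrix, since every step derives its names from those two keys.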

README.md

Lines changed: 3 additions & 3 deletions
@@ -112,9 +112,9 @@ Check out the [Quickstart](https://huggingface.co/docs/diffusers/quicktour) to l
 | **Documentation** | **What can I learn?** |
 |---------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | [Tutorial](https://huggingface.co/docs/diffusers/tutorials/tutorial_overview) | A basic crash course for learning how to use the library's most important features like using models and schedulers to build your own diffusion system, and training your own diffusion model. |
-| [Loading](https://huggingface.co/docs/diffusers/using-diffusers/loading_overview) | Guides for how to load and configure all the components (pipelines, models, and schedulers) of the library, as well as how to use different schedulers. |
-| [Pipelines for inference](https://huggingface.co/docs/diffusers/using-diffusers/pipeline_overview) | Guides for how to use pipelines for different inference tasks, batched generation, controlling generated outputs and randomness, and how to contribute a pipeline to the library. |
-| [Optimization](https://huggingface.co/docs/diffusers/optimization/opt_overview) | Guides for how to optimize your diffusion model to run faster and consume less memory. |
+| [Loading](https://huggingface.co/docs/diffusers/using-diffusers/loading) | Guides for how to load and configure all the components (pipelines, models, and schedulers) of the library, as well as how to use different schedulers. |
+| [Pipelines for inference](https://huggingface.co/docs/diffusers/using-diffusers/overview_techniques) | Guides for how to use pipelines for different inference tasks, batched generation, controlling generated outputs and randomness, and how to contribute a pipeline to the library. |
+| [Optimization](https://huggingface.co/docs/diffusers/optimization/fp16) | Guides for how to optimize your diffusion model to run faster and consume less memory. |
 | [Training](https://huggingface.co/docs/diffusers/training/overview) | Guides for how to train a diffusion model for different tasks with different training techniques. |
 ## Contribution
 

docker/diffusers-onnxruntime-cuda/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -28,7 +28,7 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3.10 -m uv pip install --no-cache-dir \
-        "torch<2.5.0" \
+        torch \
         torchvision \
         torchaudio \
         "onnxruntime-gpu>=1.13.1" \

docker/diffusers-pytorch-compile-cuda/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3.10 -m uv pip install --no-cache-dir \
-        "torch<2.5.0" \
+        torch \
         torchvision \
         torchaudio \
         invisible_watermark && \

docker/diffusers-pytorch-cpu/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3.10 -m uv pip install --no-cache-dir \
-        "torch<2.5.0" \
+        torch \
         torchvision \
         torchaudio \
         invisible_watermark \

docker/diffusers-pytorch-cuda/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3.10 -m uv pip install --no-cache-dir \
-        "torch<2.5.0" \
+        torch \
         torchvision \
         torchaudio \
         invisible_watermark && \

docker/diffusers-pytorch-xformers-cuda/Dockerfile

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ ENV PATH="/opt/venv/bin:$PATH"
 # pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
 RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
     python3.10 -m pip install --no-cache-dir \
-        "torch<2.5.0" \
+        torch \
         torchvision \
         torchaudio \
         invisible_watermark && \

docs/source/en/_toctree.yml

Lines changed: 8 additions & 0 deletions
@@ -55,6 +55,8 @@
 - sections:
   - local: using-diffusers/overview_techniques
     title: Overview
+  - local: using-diffusers/create_a_server
+    title: Create a server
   - local: training/distributed_inference
     title: Distributed inference
   - local: using-diffusers/merge_loras
@@ -250,6 +252,8 @@
         title: SD3ControlNetModel
       - local: api/models/controlnet_sparsectrl
         title: SparseControlNetModel
+      - local: api/models/controlnet_union
+        title: ControlNetUnionModel
       title: ControlNets
     - sections:
       - local: api/models/allegro_transformer3d
@@ -312,6 +316,8 @@
         title: AutoencoderKLMochi
       - local: api/models/asymmetricautoencoderkl
         title: AsymmetricAutoencoderKL
+      - local: api/models/autoencoder_dc
+        title: AutoencoderDC
       - local: api/models/consistency_decoder_vae
         title: ConsistencyDecoderVAE
       - local: api/models/autoencoder_oobleck
@@ -364,6 +370,8 @@
       title: ControlNet-XS
     - local: api/pipelines/controlnetxs_sdxl
       title: ControlNet-XS with Stable Diffusion XL
+    - local: api/pipelines/controlnet_union
+      title: ControlNetUnion
     - local: api/pipelines/dance_diffusion
       title: Dance Diffusion
     - local: api/pipelines/ddim
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
+<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License. -->
+
+# AutoencoderDC
+
+The 2D Autoencoder model used in [SANA](https://huggingface.co/papers/2410.10629) and introduced in [DCAE](https://huggingface.co/papers/2410.10733) by authors Junyu Chen\*, Han Cai\*, Junsong Chen, Enze Xie, Shang Yang, Haotian Tang, Muyang Li, Yao Lu, Song Han from MIT HAN Lab.
+
+The abstract from the paper is:
+
+*We present Deep Compression Autoencoder (DC-AE), a new family of autoencoder models for accelerating high-resolution diffusion models. Existing autoencoder models have demonstrated impressive results at a moderate spatial compression ratio (e.g., 8x), but fail to maintain satisfactory reconstruction accuracy for high spatial compression ratios (e.g., 64x). We address this challenge by introducing two key techniques: (1) Residual Autoencoding, where we design our models to learn residuals based on the space-to-channel transformed features to alleviate the optimization difficulty of high spatial-compression autoencoders; (2) Decoupled High-Resolution Adaptation, an efficient decoupled three-phases training strategy for mitigating the generalization penalty of high spatial-compression autoencoders. With these designs, we improve the autoencoder's spatial compression ratio up to 128 while maintaining the reconstruction quality. Applying our DC-AE to latent diffusion models, we achieve significant speedup without accuracy drop. For example, on ImageNet 512x512, our DC-AE provides 19.1x inference speedup and 17.9x training speedup on H100 GPU for UViT-H while achieving a better FID, compared with the widely used SD-VAE-f8 autoencoder. Our code is available at [this https URL](https://github.com/mit-han-lab/efficientvit).*
+
+The following DCAE models are released and supported in Diffusers.
+
+| Diffusers format | Original format |
+|:----------------:|:---------------:|
+| [`mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers`](https://huggingface.co/mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers) | [`mit-han-lab/dc-ae-f32c32-sana-1.0`](https://huggingface.co/mit-han-lab/dc-ae-f32c32-sana-1.0) |
+| [`mit-han-lab/dc-ae-f32c32-in-1.0-diffusers`](https://huggingface.co/mit-han-lab/dc-ae-f32c32-in-1.0-diffusers) | [`mit-han-lab/dc-ae-f32c32-in-1.0`](https://huggingface.co/mit-han-lab/dc-ae-f32c32-in-1.0) |
+| [`mit-han-lab/dc-ae-f32c32-mix-1.0-diffusers`](https://huggingface.co/mit-han-lab/dc-ae-f32c32-mix-1.0-diffusers) | [`mit-han-lab/dc-ae-f32c32-mix-1.0`](https://huggingface.co/mit-han-lab/dc-ae-f32c32-mix-1.0) |
+| [`mit-han-lab/dc-ae-f64c128-in-1.0-diffusers`](https://huggingface.co/mit-han-lab/dc-ae-f64c128-in-1.0-diffusers) | [`mit-han-lab/dc-ae-f64c128-in-1.0`](https://huggingface.co/mit-han-lab/dc-ae-f64c128-in-1.0) |
+| [`mit-han-lab/dc-ae-f64c128-mix-1.0-diffusers`](https://huggingface.co/mit-han-lab/dc-ae-f64c128-mix-1.0-diffusers) | [`mit-han-lab/dc-ae-f64c128-mix-1.0`](https://huggingface.co/mit-han-lab/dc-ae-f64c128-mix-1.0) |
+| [`mit-han-lab/dc-ae-f128c512-in-1.0-diffusers`](https://huggingface.co/mit-han-lab/dc-ae-f128c512-in-1.0-diffusers) | [`mit-han-lab/dc-ae-f128c512-in-1.0`](https://huggingface.co/mit-han-lab/dc-ae-f128c512-in-1.0) |
+| [`mit-han-lab/dc-ae-f128c512-mix-1.0-diffusers`](https://huggingface.co/mit-han-lab/dc-ae-f128c512-mix-1.0-diffusers) | [`mit-han-lab/dc-ae-f128c512-mix-1.0`](https://huggingface.co/mit-han-lab/dc-ae-f128c512-mix-1.0) |
+
+Load a model in Diffusers format with [`~ModelMixin.from_pretrained`].
+
+```python
+import torch
+from diffusers import AutoencoderDC
+
+ae = AutoencoderDC.from_pretrained("mit-han-lab/dc-ae-f32c32-sana-1.0-diffusers", torch_dtype=torch.float32).to("cuda")
+```
+
+## Load a model in Diffusers via `from_single_file`
+
+```python
+from diffusers import AutoencoderDC
+
+ckpt_path = "https://huggingface.co/mit-han-lab/dc-ae-f32c32-sana-1.0/blob/main/model.safetensors"
+model = AutoencoderDC.from_single_file(ckpt_path)
+```
+
+The `AutoencoderDC` model has `in` and `mix` single file checkpoint variants that have matching checkpoint keys, but use different scaling factors. Diffusers cannot automatically infer the correct config file from the checkpoint alone, so it defaults to configuring the model with the `mix` variant config file. To override the automatically determined config, pass the `config` argument when loading `in` variant checkpoints from a single file.
+
+```python
+from diffusers import AutoencoderDC
+
+ckpt_path = "https://huggingface.co/mit-han-lab/dc-ae-f128c512-in-1.0/blob/main/model.safetensors"
+model = AutoencoderDC.from_single_file(ckpt_path, config="mit-han-lab/dc-ae-f128c512-in-1.0-diffusers")
+```
+
+
+## AutoencoderDC
+
+[[autodoc]] AutoencoderDC
+    - encode
+    - decode
+    - all
+
+## DecoderOutput
+
+[[autodoc]] models.autoencoders.vae.DecoderOutput
+
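
The checkpoint table and the `config` override described in the AutoencoderDC page above can be sketched with a few small helpers. This is a minimal, standard-library-only illustration and not part of Diffusers: the helper names are hypothetical, and the reading of `f32c32` as "spatial compression factor 32, 32 latent channels" is an assumption drawn from the DC-AE naming scheme rather than something the page states. The `-diffusers` suffix and the rule that only `in` variants need an explicit `config` come from the table and paragraph above.

```python
import re
from typing import NamedTuple, Tuple


class DCAEName(NamedTuple):
    spatial_factor: int   # assumed meaning of the "f<N>" part of the name
    latent_channels: int  # assumed meaning of the "c<N>" part of the name
    variant: str          # "sana", "in", or "mix" in the table


# Matches original-format repo ids such as "mit-han-lab/dc-ae-f32c32-in-1.0".
_PATTERN = re.compile(r"dc-ae-f(\d+)c(\d+)-([a-z]+)-1\.0$")


def parse_repo(repo_id: str) -> DCAEName:
    """Split an original-format repo id into factor, channels, and variant."""
    match = _PATTERN.search(repo_id)
    if match is None:
        raise ValueError(f"not an original-format DC-AE repo id: {repo_id!r}")
    return DCAEName(int(match.group(1)), int(match.group(2)), match.group(3))


def diffusers_repo(repo_id: str) -> str:
    """Per the table, the Diffusers-format repo appends '-diffusers'."""
    parse_repo(repo_id)  # validate the id before deriving the counterpart
    return repo_id + "-diffusers"


def needs_explicit_config(repo_id: str) -> bool:
    """Single-file loading defaults to the `mix` config, so `in` needs an override."""
    return parse_repo(repo_id).variant == "in"


def latent_shape(repo_id: str, height: int, width: int) -> Tuple[int, int, int]:
    """Latent (channels, height, width) under the assumed naming convention."""
    name = parse_repo(repo_id)
    if height % name.spatial_factor or width % name.spatial_factor:
        raise ValueError("image size must be divisible by the spatial factor")
    return (
        name.latent_channels,
        height // name.spatial_factor,
        width // name.spatial_factor,
    )


if __name__ == "__main__":
    repo = "mit-han-lab/dc-ae-f128c512-in-1.0"
    print(diffusers_repo(repo), needs_explicit_config(repo))
    print(latent_shape("mit-han-lab/dc-ae-f32c32-sana-1.0", 512, 512))
```

A caller could use `needs_explicit_config` to decide whether to pass `config=diffusers_repo(repo)` to `from_single_file`, which mirrors the `in` variant example in the page.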
Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+<!--Copyright 2024 The HuggingFace Team and The InstantX Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License.
+-->
+
+# ControlNetUnionModel
+
+ControlNetUnionModel is an implementation of ControlNet for Stable Diffusion XL.
+
+The ControlNet model was introduced in [ControlNetPlus](https://github.com/xinsir6/ControlNetPlus) by xinsir6. It supports multiple conditioning inputs without increasing computation.
+
+*We design a new architecture that can support 10+ control types in condition text-to-image generation and can generate high resolution images visually comparable with midjourney. The network is based on the original ControlNet architecture, we propose two new modules to: 1 Extend the original ControlNet to support different image conditions using the same network parameter. 2 Support multiple conditions input without increasing computation offload, which is especially important for designers who want to edit image in detail, different conditions use the same condition encoder, without adding extra computations or parameters.*
+
+## Loading
+
+By default, the [`ControlNetUnionModel`] should be loaded with [`~ModelMixin.from_pretrained`].
+
+```py
+from diffusers import StableDiffusionXLControlNetUnionPipeline, ControlNetUnionModel
+
+controlnet = ControlNetUnionModel.from_pretrained("xinsir/controlnet-union-sdxl-1.0")
+pipe = StableDiffusionXLControlNetUnionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", controlnet=controlnet)
+```
+
+## ControlNetUnionModel
+
+[[autodoc]] ControlNetUnionModel
+
