
Commit 2683bca

Merge branch 'main' into simplify-bnb-int8-dequant
2 parents: caa0074 + a6f043a

File tree: 289 files changed (+3744, -1646 lines)


.github/workflows/build_docker_images.yml

Lines changed: 2 additions & 1 deletion
@@ -34,7 +34,7 @@ jobs:
        id: file_changes
        uses: jitterbit/get-changed-files@v1
        with:
-         format: 'space-delimited'
+         format: "space-delimited"
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Build Changed Docker Images
@@ -67,6 +67,7 @@ jobs:
          - diffusers-pytorch-cuda
          - diffusers-pytorch-compile-cuda
          - diffusers-pytorch-xformers-cuda
+         - diffusers-pytorch-minimum-cuda
          - diffusers-flax-cpu
          - diffusers-flax-tpu
          - diffusers-onnxruntime-cpu

.github/workflows/nightly_tests.yml

Lines changed: 57 additions & 0 deletions
@@ -235,7 +235,64 @@ jobs:
        run: |
          pip install slack_sdk tabulate
          python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
+
+  torch_minimum_version_cuda_tests:
+    name: Torch Minimum Version CUDA Tests
+    runs-on:
+      group: aws-g4dn-2xlarge
+    container:
+      image: diffusers/diffusers-pytorch-minimum-cuda
+      options: --shm-size "16gb" --ipc host --gpus 0
+    defaults:
+      run:
+        shell: bash
+    steps:
+      - name: Checkout diffusers
+        uses: actions/checkout@v3
+        with:
+          fetch-depth: 2
+
+      - name: Install dependencies
+        run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install peft@git+https://github.com/huggingface/peft.git
+          pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git

+      - name: Environment
+        run: |
+          python utils/print_env.py
+
+      - name: Run PyTorch CUDA tests
+        env:
+          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
+          CUBLAS_WORKSPACE_CONFIG: :16:8
+        run: |
+          python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
+            -s -v -k "not Flax and not Onnx" \
+            --make-reports=tests_torch_minimum_version_cuda \
+            tests/models/test_modeling_common.py \
+            tests/pipelines/test_pipelines_common.py \
+            tests/pipelines/test_pipeline_utils.py \
+            tests/pipelines/test_pipelines.py \
+            tests/pipelines/test_pipelines_auto.py \
+            tests/schedulers/test_schedulers.py \
+            tests/others
+
+      - name: Failure short reports
+        if: ${{ failure() }}
+        run: |
+          cat reports/tests_torch_minimum_version_cuda_stats.txt
+          cat reports/tests_torch_minimum_version_cuda_failures_short.txt
+
+      - name: Test suite reports artifacts
+        if: ${{ always() }}
+        uses: actions/upload-artifact@v4
+        with:
+          name: torch_minimum_version_cuda_test_reports
+          path: reports
+
    run_flax_tpu_tests:
      name: Nightly Flax TPU Tests
      runs-on:
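Aside on the new job's environment: the `CUBLAS_WORKSPACE_CONFIG: :16:8` variable is what lets the CUDA test run opt into PyTorch's deterministic algorithms (see the randomness notes linked in the workflow). A minimal sketch of the PyTorch-side switch this enables, for illustration only and not part of the commit:

```py
# Hedged sketch: why the job exports CUBLAS_WORKSPACE_CONFIG.
# PyTorch requires CUBLAS_WORKSPACE_CONFIG=:16:8 (or :4096:8) before cuBLAS kernels
# can run deterministically; without it, deterministic mode raises an error once
# affected CUDA ops are executed.
import os
import torch

os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":16:8")  # same value as the workflow
torch.use_deterministic_algorithms(True)  # seeded tests then stay reproducible on GPU
```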

.github/workflows/pr_tests.yml

Lines changed: 1 addition & 0 deletions
@@ -266,6 +266,7 @@ jobs:
          # TODO (sayakpaul, DN6): revisit `--no-deps`
          python -m pip install -U peft@git+https://github.com/huggingface/peft.git --no-deps
          python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
+         python -m uv pip install -U tokenizers
          pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git --no-deps

      - name: Environment

.github/workflows/release_tests_fast.yml

Lines changed: 57 additions & 0 deletions
@@ -157,6 +157,63 @@ jobs:
          name: torch_cuda_${{ matrix.module }}_test_reports
          path: reports

+  torch_minimum_version_cuda_tests:
+    name: Torch Minimum Version CUDA Tests
+    runs-on:
+      group: aws-g4dn-2xlarge
+    container:
+      image: diffusers/diffusers-pytorch-minimum-cuda
+      options: --shm-size "16gb" --ipc host --gpus 0
+    defaults:
+      run:
+        shell: bash
+    steps:
+      - name: Checkout diffusers
+        uses: actions/checkout@v3
+        with:
+          fetch-depth: 2
+
+      - name: Install dependencies
+        run: |
+          python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
+          python -m uv pip install -e [quality,test]
+          python -m uv pip install peft@git+https://github.com/huggingface/peft.git
+          pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
+
+      - name: Environment
+        run: |
+          python utils/print_env.py
+
+      - name: Run PyTorch CUDA tests
+        env:
+          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
+          CUBLAS_WORKSPACE_CONFIG: :16:8
+        run: |
+          python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
+            -s -v -k "not Flax and not Onnx" \
+            --make-reports=tests_torch_minimum_cuda \
+            tests/models/test_modeling_common.py \
+            tests/pipelines/test_pipelines_common.py \
+            tests/pipelines/test_pipeline_utils.py \
+            tests/pipelines/test_pipelines.py \
+            tests/pipelines/test_pipelines_auto.py \
+            tests/schedulers/test_schedulers.py \
+            tests/others
+
+      - name: Failure short reports
+        if: ${{ failure() }}
+        run: |
+          cat reports/tests_torch_minimum_version_cuda_stats.txt
+          cat reports/tests_torch_minimum_version_cuda_failures_short.txt
+
+      - name: Test suite reports artifacts
+        if: ${{ always() }}
+        uses: actions/upload-artifact@v4
+        with:
+          name: torch_minimum_version_cuda_test_reports
+          path: reports
+
    flax_tpu_tests:
      name: Flax TPU Tests
      runs-on: docker-tpu
docker/diffusers-pytorch-minimum-cuda/Dockerfile

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
+FROM nvidia/cuda:12.1.0-runtime-ubuntu20.04
+LABEL maintainer="Hugging Face"
+LABEL repository="diffusers"
+
+ENV DEBIAN_FRONTEND=noninteractive
+ENV MINIMUM_SUPPORTED_TORCH_VERSION="2.1.0"
+ENV MINIMUM_SUPPORTED_TORCHVISION_VERSION="0.16.0"
+ENV MINIMUM_SUPPORTED_TORCHAUDIO_VERSION="2.1.0"
+
+RUN apt-get -y update \
+    && apt-get install -y software-properties-common \
+    && add-apt-repository ppa:deadsnakes/ppa
+
+RUN apt install -y bash \
+    build-essential \
+    git \
+    git-lfs \
+    curl \
+    ca-certificates \
+    libsndfile1-dev \
+    libgl1 \
+    python3.10 \
+    python3.10-dev \
+    python3-pip \
+    python3.10-venv && \
+    rm -rf /var/lib/apt/lists
+
+# make sure to use venv
+RUN python3.10 -m venv /opt/venv
+ENV PATH="/opt/venv/bin:$PATH"
+
+# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
+RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
+    python3.10 -m uv pip install --no-cache-dir \
+        torch==$MINIMUM_SUPPORTED_TORCH_VERSION \
+        torchvision==$MINIMUM_SUPPORTED_TORCHVISION_VERSION \
+        torchaudio==$MINIMUM_SUPPORTED_TORCHAUDIO_VERSION \
+        invisible_watermark && \
+    python3.10 -m pip install --no-cache-dir \
+        accelerate \
+        datasets \
+        hf-doc-builder \
+        huggingface-hub \
+        hf_transfer \
+        Jinja2 \
+        librosa \
+        numpy==1.26.4 \
+        scipy \
+        tensorboard \
+        transformers \
+        hf_transfer
+
+CMD ["/bin/bash"]
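The ENV pins above define the oldest supported stack this image is meant to exercise. A minimal sketch (not part of the commit) for checking that a container or local environment actually sits at that floor before trusting the test results, assuming the `packaging` library is installed:

```py
# Hedged sketch: verify the environment matches the minimum pinned in the Dockerfile above.
# The version string mirrors MINIMUM_SUPPORTED_TORCH_VERSION; adjust if the pin changes.
from packaging import version
import torch

MINIMUM_SUPPORTED_TORCH_VERSION = "2.1.0"

installed = version.parse(torch.__version__.split("+")[0])  # drop local tags like "+cu121"
required = version.parse(MINIMUM_SUPPORTED_TORCH_VERSION)

assert installed >= required, f"torch {installed} is older than the supported minimum {required}"
print(f"torch {installed} satisfies the {required} minimum")
```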

docs/source/en/api/pipelines/aura_flow.md

Lines changed: 27 additions & 0 deletions
@@ -62,6 +62,33 @@ image = pipeline(prompt).images[0]
 image.save("auraflow.png")
 ```

+Loading [GGUF checkpoints](https://huggingface.co/docs/diffusers/quantization/gguf) is also supported:
+
+```py
+import torch
+from diffusers import (
+    AuraFlowPipeline,
+    GGUFQuantizationConfig,
+    AuraFlowTransformer2DModel,
+)
+
+transformer = AuraFlowTransformer2DModel.from_single_file(
+    "https://huggingface.co/city96/AuraFlow-v0.3-gguf/blob/main/aura_flow_0.3-Q2_K.gguf",
+    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
+    torch_dtype=torch.bfloat16,
+)
+
+pipeline = AuraFlowPipeline.from_pretrained(
+    "fal/AuraFlow-v0.3",
+    transformer=transformer,
+    torch_dtype=torch.bfloat16,
+)
+
+prompt = "a cute pony in a field of flowers"
+image = pipeline(prompt).images[0]
+image.save("auraflow.png")
+```
+
 ## AuraFlowPipeline

 [[autodoc]] AuraFlowPipeline

docs/source/en/api/pipelines/sana.md

Lines changed: 2 additions & 2 deletions
@@ -59,10 +59,10 @@ Refer to the [Quantization](../../quantization/overview) overview to learn more
 ```py
 import torch
 from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig, SanaTransformer2DModel, SanaPipeline
-from transformers import BitsAndBytesConfig as BitsAndBytesConfig, AutoModelForCausalLM
+from transformers import BitsAndBytesConfig as BitsAndBytesConfig, AutoModel

 quant_config = BitsAndBytesConfig(load_in_8bit=True)
-text_encoder_8bit = AutoModelForCausalLM.from_pretrained(
+text_encoder_8bit = AutoModel.from_pretrained(
     "Efficient-Large-Model/Sana_1600M_1024px_diffusers",
     subfolder="text_encoder",
     quantization_config=quant_config,
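For context, the surrounding sana.md snippet goes on to quantize the transformer as well and wire both 8-bit modules into `SanaPipeline`. A hedged sketch of that continuation, with the repo id, `subfolder` names, and `device_map="balanced"` assumed from the surrounding docs rather than shown in this hunk:

```py
# Hedged continuation of the docs snippet above (not part of this diff).
import torch
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig, SanaTransformer2DModel, SanaPipeline
from transformers import BitsAndBytesConfig, AutoModel

# 8-bit text encoder, exactly as in the updated hunk above.
text_encoder_8bit = AutoModel.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers",
    subfolder="text_encoder",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
)

# 8-bit transformer via diffusers' own BitsAndBytesConfig (assumed continuation).
transformer_8bit = SanaTransformer2DModel.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers",
    subfolder="transformer",
    quantization_config=DiffusersBitsAndBytesConfig(load_in_8bit=True),
    torch_dtype=torch.float16,
)

# Hand both quantized modules to the pipeline; device placement kwarg is assumed.
pipeline = SanaPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers",
    text_encoder=text_encoder_8bit,
    transformer=transformer_8bit,
    torch_dtype=torch.float16,
    device_map="balanced",
)

image = pipeline("a capybara wearing a suit").images[0]
image.save("sana_8bit.png")
```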

examples/advanced_diffusion_training/README.md

Lines changed: 11 additions & 0 deletions
@@ -67,6 +67,17 @@ write_basic_config()
 When running `accelerate config`, if we specify torch compile mode to True there can be dramatic speedups.
 Note also that we use PEFT library as backend for LoRA training, make sure to have `peft>=0.6.0` installed in your environment.

+Lastly, we recommend logging into your HF account so that your trained LoRA is automatically uploaded to the Hub:
+```bash
+huggingface-cli login
+```
+This command will prompt you for a token. Copy-paste yours from your [settings/tokens](https://huggingface.co/settings/tokens), and press Enter.
+
+> [!NOTE]
+> In the examples below we use `wandb` to document the training runs. To do the same, make sure to install `wandb`:
+> `pip install wandb`
+> Alternatively, you can use other tools / train without reporting by modifying the flag `--report_to="wandb"`.
+
 ### Pivotal Tuning
 **Training with text encoder(s)**

examples/advanced_diffusion_training/README_flux.md

Lines changed: 11 additions & 0 deletions
@@ -65,6 +65,17 @@ write_basic_config()
 When running `accelerate config`, if we specify torch compile mode to True there can be dramatic speedups.
 Note also that we use PEFT library as backend for LoRA training, make sure to have `peft>=0.6.0` installed in your environment.

+Lastly, we recommend logging into your HF account so that your trained LoRA is automatically uploaded to the Hub:
+```bash
+huggingface-cli login
+```
+This command will prompt you for a token. Copy-paste yours from your [settings/tokens](https://huggingface.co/settings/tokens), and press Enter.
+
+> [!NOTE]
+> In the examples below we use `wandb` to document the training runs. To do the same, make sure to install `wandb`:
+> `pip install wandb`
+> Alternatively, you can use other tools / train without reporting by modifying the flag `--report_to="wandb"`.
+
 ### Target Modules
 When LoRA was first adapted from language models to diffusion models, it was applied to the cross-attention layers in the Unet that relate the image representations with the prompts that describe them.
 More recently, SOTA text-to-image diffusion models replaced the Unet with a diffusion Transformer(DiT). With this change, we may also want to explore
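Both README additions above describe the same CLI login. For scripted or notebook workflows, a hedged Python equivalent using `huggingface_hub` (not something this diff adds) is:

```py
# Hedged alternative to `huggingface-cli login` (not part of the diff):
# log in from Python so the trained LoRA can be pushed to the Hub.
from huggingface_hub import login

login()  # interactively prompts for a token from https://huggingface.co/settings/tokens
# or, non-interactively: login(token="hf_...")  # never hard-code real tokens in shared code
```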
