Commit 9410e46

Merge branch 'cogview4_control' of https://github.com/zRzRzRzRzRzRzR/diffusers into cogview4_control
2 parents f55e3cc + efa0f41 commit 9410e46

120 files changed: +5031 -1103 lines changed

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+name: "\U0001F31F Remote VAE"
+description: Feedback for remote VAE pilot
+labels: [ "Remote VAE" ]
+
+body:
+  - type: textarea
+    id: positive
+    validations:
+      required: true
+    attributes:
+      label: Did you like the remote VAE solution?
+      description: |
+        If you liked it, we would appreciate it if you could elaborate what you liked.
+
+  - type: textarea
+    id: feedback
+    validations:
+      required: true
+    attributes:
+      label: What can be improved about the current solution?
+      description: |
+        Let us know the things you would like to see improved. Note that we will work optimizing the solution once the pilot is over and we have usage.
+
+  - type: textarea
+    id: others
+    validations:
+      required: true
+    attributes:
+      label: What other VAEs you would like to see if the pilot goes well?
+      description: |
+        Provide a list of the VAEs you would like to see in the future if the pilot goes well.
+
+  - type: textarea
+    id: additional-info
+    attributes:
+      label: Notify the members of the team
+      description: |
+        Tag the following folks when submitting this feedback: @hlky @sayakpaul

.github/workflows/pr_tests.yml

Lines changed: 3 additions & 1 deletion
@@ -64,6 +64,7 @@ jobs:
       run: |
         python utils/check_copies.py
         python utils/check_dummies.py
+        python utils/check_support_list.py
         make deps_table_check_updated
     - name: Check if failure
       if: ${{ failure() }}
@@ -120,7 +121,8 @@ jobs:
       run: |
         python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
         python -m uv pip install -e [quality,test]
-        python -m uv pip install accelerate
+        pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
+        pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git --no-deps
 
     - name: Environment
       run: |

.github/workflows/push_tests.yml

Lines changed: 11 additions & 2 deletions
@@ -1,6 +1,13 @@
 name: Fast GPU Tests on main
 
 on:
+  pull_request:
+    branches: main
+    paths:
+      - "src/diffusers/models/modeling_utils.py"
+      - "src/diffusers/models/model_loading_utils.py"
+      - "src/diffusers/pipelines/pipeline_utils.py"
+      - "src/diffusers/pipeline_loading_utils.py"
   workflow_dispatch:
   push:
     branches:
@@ -160,6 +167,7 @@ jobs:
           path: reports
 
   flax_tpu_tests:
+    if: ${{ github.event_name != 'pull_request' }}
     name: Flax TPU Tests
     runs-on:
       group: gcp-ct5lp-hightpu-8t
@@ -208,6 +216,7 @@ jobs:
           path: reports
 
   onnx_cuda_tests:
+    if: ${{ github.event_name != 'pull_request' }}
     name: ONNX CUDA Tests
     runs-on:
       group: aws-g4dn-2xlarge
@@ -256,6 +265,7 @@ jobs:
           path: reports
 
   run_torch_compile_tests:
+    if: ${{ github.event_name != 'pull_request' }}
     name: PyTorch Compile CUDA tests
 
     runs-on:
@@ -299,6 +309,7 @@ jobs:
           path: reports
 
   run_xformers_tests:
+    if: ${{ github.event_name != 'pull_request' }}
     name: PyTorch xformers CUDA tests
 
     runs-on:
@@ -349,7 +360,6 @@ jobs:
     container:
       image: diffusers/diffusers-pytorch-cuda
       options: --gpus 0 --shm-size "16gb" --ipc host
-
     steps:
       - name: Checkout diffusers
         uses: actions/checkout@v3
@@ -359,7 +369,6 @@ jobs:
       - name: NVIDIA-SMI
         run: |
           nvidia-smi
-
       - name: Install dependencies
         run: |
           python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
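
The gating added to push_tests.yml above can be approximated in a few lines of Python. This is only a sketch under simplifying assumptions: the function names are hypothetical, the `branches: main` filter and the (truncated) `push` filters are ignored, and the path filter is treated as an exact-match list because none of the four entries contains a glob.

```python
# Sketch only: approximates the trigger and job gating added to push_tests.yml.
# Function names and the simplified event model are assumptions for illustration.

# Entries of the `pull_request.paths` filter, copied verbatim from the diff.
PR_PATHS = {
    "src/diffusers/models/modeling_utils.py",
    "src/diffusers/models/model_loading_utils.py",
    "src/diffusers/pipelines/pipeline_utils.py",
    "src/diffusers/pipeline_loading_utils.py",
}

# Jobs that gained `if: ${{ github.event_name != 'pull_request' }}` in this commit.
JOBS_SKIPPED_ON_PR = {
    "flax_tpu_tests",
    "onnx_cuda_tests",
    "run_torch_compile_tests",
    "run_xformers_tests",
}


def pr_triggers_workflow(changed_files):
    """Return True if at least one changed file is listed in the path filter."""
    return any(path in PR_PATHS for path in changed_files)


def job_runs(job_name, event_name):
    """Return True unless the job is gated off for pull_request events."""
    return not (event_name == "pull_request" and job_name in JOBS_SKIPPED_ON_PR)
```

Under this reading, a pull request that touches one of the four loading utilities starts the workflow, but the Flax TPU, ONNX, torch.compile and xformers jobs are skipped for it and continue to run on other events.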

.github/workflows/run_tests_from_a_pr.yml

Lines changed: 7 additions & 7 deletions
@@ -7,8 +7,8 @@ on:
         default: 'diffusers/diffusers-pytorch-cuda'
         description: 'Name of the Docker image'
         required: true
-      branch:
-        description: 'PR Branch to test on'
+      pr_number:
+        description: 'PR number to test on'
         required: true
       test:
         description: 'Tests to run (e.g.: `tests/models`).'
@@ -43,8 +43,8 @@ jobs:
             exit 1
           fi
 
-          if [[ ! "$PY_TEST" =~ ^tests/(models|pipelines) ]]; then
-            echo "Error: The input string must contain either 'models' or 'pipelines' after 'tests/'."
+          if [[ ! "$PY_TEST" =~ ^tests/(models|pipelines|lora) ]]; then
+            echo "Error: The input string must contain either 'models', 'pipelines', or 'lora' after 'tests/'."
             exit 1
           fi
 
@@ -53,13 +53,13 @@ jobs:
             exit 1
           fi
           echo "$PY_TEST"
+
+        shell: bash -e {0}
 
       - name: Checkout PR branch
         uses: actions/checkout@v4
         with:
-          ref: ${{ github.event.inputs.branch }}
-          repository: ${{ github.event.pull_request.head.repo.full_name }}
-
+          ref: refs/pull/${{ inputs.pr_number }}/head
 
       - name: Install pytest
         run: |
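
The updated prefix check and the new checkout ref in run_tests_from_a_pr.yml above can be mirrored in Python for a quick local sanity check. This is a minimal sketch: `validate_test_path` and `pr_head_ref` are hypothetical helper names, and only the one regex check visible in the diff is reproduced, not the other validations that surround it in the workflow.

```python
import re

# Same pattern as the bash test `[[ "$PY_TEST" =~ ^tests/(models|pipelines|lora) ]]`.
# It is a prefix match only, so anything after the allowed directory name passes.
ALLOWED_PREFIX = re.compile(r"^tests/(models|pipelines|lora)")


def validate_test_path(py_test: str) -> str:
    """Return the path unchanged, or raise if it fails the prefix check."""
    if not ALLOWED_PREFIX.search(py_test):
        raise ValueError(
            "The input string must contain either 'models', 'pipelines', "
            "or 'lora' after 'tests/'."
        )
    return py_test


def pr_head_ref(pr_number: int) -> str:
    """Build the ref that the checkout step now uses for a given PR number."""
    return f"refs/pull/{pr_number}/head"
```

Checking out `refs/pull/<number>/head` fetches the PR head from the base repository, which is presumably why the separate `repository:` input for the fork is no longer needed.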

docs/source/en/api/activations.md

Lines changed: 13 additions & 0 deletions
@@ -25,3 +25,16 @@ Customized activation functions for supporting various models in 🤗 Diffusers.
 ## ApproximateGELU
 
 [[autodoc]] models.activations.ApproximateGELU
+
+
+## SwiGLU
+
+[[autodoc]] models.activations.SwiGLU
+
+## FP32SiLU
+
+[[autodoc]] models.activations.FP32SiLU
+
+## LinearActivation
+
+[[autodoc]] models.activations.LinearActivation

docs/source/en/api/attnprocessor.md

Lines changed: 17 additions & 0 deletions
@@ -147,3 +147,20 @@ An attention processor is a class for applying different types of attention mech
 ## XLAFlashAttnProcessor2_0
 
 [[autodoc]] models.attention_processor.XLAFlashAttnProcessor2_0
+
+## XFormersJointAttnProcessor
+
+[[autodoc]] models.attention_processor.XFormersJointAttnProcessor
+
+## IPAdapterXFormersAttnProcessor
+
+[[autodoc]] models.attention_processor.IPAdapterXFormersAttnProcessor
+
+## FluxIPAdapterJointAttnProcessor2_0
+
+[[autodoc]] models.attention_processor.FluxIPAdapterJointAttnProcessor2_0
+
+
+## XLAFluxFlashAttnProcessor2_0
+
+[[autodoc]] models.attention_processor.XLAFluxFlashAttnProcessor2_0

docs/source/en/api/loaders/lora.md

Lines changed: 5 additions & 0 deletions
@@ -23,6 +23,7 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
 - [`LTXVideoLoraLoaderMixin`] provides similar functions for [LTX-Video](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video).
 - [`SanaLoraLoaderMixin`] provides similar functions for [Sana](https://huggingface.co/docs/diffusers/main/en/api/pipelines/sana).
 - [`HunyuanVideoLoraLoaderMixin`] provides similar functions for [HunyuanVideo](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hunyuan_video).
+- [`Lumina2LoraLoaderMixin`] provides similar functions for [Lumina2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/lumina2).
 - [`AmusedLoraLoaderMixin`] is for the [`AmusedPipeline`].
 - [`LoraBaseMixin`] provides a base class with several utility methods to fuse, unfuse, unload, LoRAs and more.
 
@@ -68,6 +69,10 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse
 
 [[autodoc]] loaders.lora_pipeline.HunyuanVideoLoraLoaderMixin
 
+## Lumina2LoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.Lumina2LoraLoaderMixin
+
 ## AmusedLoraLoaderMixin
 
 [[autodoc]] loaders.lora_pipeline.AmusedLoraLoaderMixin

docs/source/en/api/normalization.md

Lines changed: 40 additions & 0 deletions
@@ -29,3 +29,43 @@ Customized normalization layers for supporting various models in 🤗 Diffusers.
 ## AdaGroupNorm
 
 [[autodoc]] models.normalization.AdaGroupNorm
+
+## AdaLayerNormContinuous
+
+[[autodoc]] models.normalization.AdaLayerNormContinuous
+
+## RMSNorm
+
+[[autodoc]] models.normalization.RMSNorm
+
+## GlobalResponseNorm
+
+[[autodoc]] models.normalization.GlobalResponseNorm
+
+
+## LuminaLayerNormContinuous
+[[autodoc]] models.normalization.LuminaLayerNormContinuous
+
+## SD35AdaLayerNormZeroX
+[[autodoc]] models.normalization.SD35AdaLayerNormZeroX
+
+## AdaLayerNormZeroSingle
+[[autodoc]] models.normalization.AdaLayerNormZeroSingle
+
+## LuminaRMSNormZero
+[[autodoc]] models.normalization.LuminaRMSNormZero
+
+## LpNorm
+[[autodoc]] models.normalization.LpNorm
+
+## CogView3PlusAdaLayerNormZeroTextImage
+[[autodoc]] models.normalization.CogView3PlusAdaLayerNormZeroTextImage
+
+## CogVideoXLayerNormZero
+[[autodoc]] models.normalization.CogVideoXLayerNormZero
+
+## MochiRMSNormZero
+[[autodoc]] models.transformers.transformer_mochi.MochiRMSNormZero
+
+## MochiRMSNorm
+[[autodoc]] models.normalization.MochiRMSNorm

docs/source/en/api/pipelines/hunyuan_video.md

Lines changed: 15 additions & 0 deletions
@@ -32,6 +32,21 @@ Recommendations for inference:
 - For smaller resolution videos, try lower values of `shift` (between `2.0` to `5.0`) in the [Scheduler](https://huggingface.co/docs/diffusers/main/en/api/schedulers/flow_match_euler_discrete#diffusers.FlowMatchEulerDiscreteScheduler.shift). For larger resolution images, try higher values (between `7.0` and `12.0`). The default value is `7.0` for HunyuanVideo.
 - For more information about supported resolutions and other details, please refer to the original repository [here](https://github.com/Tencent/HunyuanVideo/).
 
+## Available models
+
+The following models are available for the [`HunyuanVideoPipeline`](text-to-video) pipeline:
+
+| Model name | Description |
+|:---|:---|
+| [`hunyuanvideo-community/HunyuanVideo`](https://huggingface.co/hunyuanvideo-community/HunyuanVideo) | Official HunyuanVideo (guidance-distilled). Performs best at multiple resolutions and frames. Performs best with `guidance_scale=6.0`, `true_cfg_scale=1.0` and without a negative prompt. |
+| [`https://huggingface.co/Skywork/SkyReels-V1-Hunyuan-T2V`](https://huggingface.co/Skywork/SkyReels-V1-Hunyuan-T2V) | Skywork's custom finetune of HunyuanVideo (de-distilled). Performs best with `97x544x960` resolution, `guidance_scale=1.0`, `true_cfg_scale=6.0` and a negative prompt. |
+
+The following models are available for the image-to-video pipeline:
+
+| Model name | Description |
+|:---|:---|
+| [`https://huggingface.co/Skywork/SkyReels-V1-Hunyuan-I2V`](https://huggingface.co/Skywork/SkyReels-V1-Hunyuan-I2V) | Skywork's custom finetune of HunyuanVideo (de-distilled). Performs best with `97x544x960` resolution. Performs best at `97x544x960` resolution, `guidance_scale=1.0`, `true_cfg_scale=6.0` and a negative prompt. |
+
 ## Quantization
 
 Quantization helps reduce the memory requirements of very large models by storing model weights in a lower precision data type. However, quantization may have varying impact on video quality depending on the video model.
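
The recommendations in the "Available models" tables added above can be collected into a small lookup, which may be handy when switching between checkpoints. This is a sketch, not part of the commit: the helper name and dictionary layout are hypothetical, and reading `97x544x960` as frames x height x width (hence the `num_frames`, `height` and `width` keys) is an assumption.

```python
# Sketch only: recommended settings transcribed from the tables in the diff.
# The key names and the frames x height x width reading of `97x544x960`
# are assumptions made for illustration.
SKYREELS_SETTINGS = {
    "guidance_scale": 1.0,
    "true_cfg_scale": 6.0,
    "use_negative_prompt": True,
    "num_frames": 97,
    "height": 544,
    "width": 960,
}

RECOMMENDED_SETTINGS = {
    "hunyuanvideo-community/HunyuanVideo": {
        "guidance_scale": 6.0,
        "true_cfg_scale": 1.0,
        "use_negative_prompt": False,
    },
    "Skywork/SkyReels-V1-Hunyuan-T2V": dict(SKYREELS_SETTINGS),
    "Skywork/SkyReels-V1-Hunyuan-I2V": dict(SKYREELS_SETTINGS),
}


def recommended_settings(model_id: str) -> dict:
    """Return a copy of the documented settings for a known checkpoint."""
    return dict(RECOMMENDED_SETTINGS[model_id])
```

Note the inversion between the two families: the guidance-distilled official checkpoint pairs `guidance_scale=6.0` with `true_cfg_scale=1.0`, while the de-distilled SkyReels finetunes swap the two values and expect a negative prompt.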

docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_3.md

Lines changed: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ from diffusers import StableDiffusion3Pipeline
 from transformers import SiglipVisionModel, SiglipImageProcessor
 
 image_encoder_id = "google/siglip-so400m-patch14-384"
-ip_adapter_id = "guiyrt/InstantX-SD3.5-Large-IP-Adapter-diffusers"
+ip_adapter_id = "InstantX/SD3.5-Large-IP-Adapter"
 
 feature_extractor = SiglipImageProcessor.from_pretrained(
     image_encoder_id,