
Commit 176d30f

Merge branch 'main' into model-loading-refactor
2 parents: f1138d3 + 6fe05b9

File tree

62 files changed: +2633 −202 lines


docs/source/en/_toctree.yml

Lines changed: 4 additions & 0 deletions
@@ -278,6 +278,8 @@
   title: ConsisIDTransformer3DModel
 - local: api/models/cogview3plus_transformer2d
   title: CogView3PlusTransformer2DModel
+- local: api/models/cogview4_transformer2d
+  title: CogView4Transformer2DModel
 - local: api/models/dit_transformer2d
   title: DiTTransformer2DModel
 - local: api/models/flux_transformer
@@ -382,6 +384,8 @@
   title: CogVideoX
 - local: api/pipelines/cogview3
   title: CogView3
+- local: api/pipelines/cogview4
+  title: CogView4
 - local: api/pipelines/consisid
   title: ConsisID
 - local: api/pipelines/consistency_models

docs/source/en/api/loaders/lora.md

Lines changed: 15 additions & 0 deletions
@@ -20,6 +20,9 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
 - [`FluxLoraLoaderMixin`] provides similar functions for [Flux](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux).
 - [`CogVideoXLoraLoaderMixin`] provides similar functions for [CogVideoX](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogvideox).
 - [`Mochi1LoraLoaderMixin`] provides similar functions for [Mochi](https://huggingface.co/docs/diffusers/main/en/api/pipelines/mochi).
+- [`LTXVideoLoraLoaderMixin`] provides similar functions for [LTX-Video](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video).
+- [`SanaLoraLoaderMixin`] provides similar functions for [Sana](https://huggingface.co/docs/diffusers/main/en/api/pipelines/sana).
+- [`HunyuanVideoLoraLoaderMixin`] provides similar functions for [HunyuanVideo](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hunyuan_video).
 - [`AmusedLoraLoaderMixin`] is for the [`AmusedPipeline`].
 - [`LoraBaseMixin`] provides a base class with several utility methods to fuse, unfuse, unload, LoRAs and more.

@@ -53,6 +56,18 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse

 [[autodoc]] loaders.lora_pipeline.Mochi1LoraLoaderMixin

+## LTXVideoLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.LTXVideoLoraLoaderMixin
+
+## SanaLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.SanaLoraLoaderMixin
+
+## HunyuanVideoLoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.HunyuanVideoLoraLoaderMixin
+
 ## AmusedLoraLoaderMixin

 [[autodoc]] loaders.lora_pipeline.AmusedLoraLoaderMixin
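
The three new mixins expose the same entry points as the existing LoRA loaders, so they are driven through the familiar `load_lora_weights` / `set_adapters` / `unload_lora_weights` flow. A minimal sketch for the LTX-Video case — the LoRA repo id and adapter name below are placeholders, not real checkpoints:

```python
import torch

from diffusers import LTXPipeline

pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16).to("cuda")

# LTXPipeline inherits from LTXVideoLoraLoaderMixin, so the generic LoRA
# methods below are available directly on the pipeline object.
pipe.load_lora_weights("your-username/ltx-video-lora", adapter_name="my_lora")  # placeholder repo id
pipe.set_adapters("my_lora", 0.8)  # weight the adapter's contribution

# ... generate as usual, then restore the base model:
pipe.unload_lora_weights()
```

The Sana and HunyuanVideo mixins work the same way on their respective pipelines.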
docs/source/en/api/models/cogview4_transformer2d.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License. -->
+
+# CogView4Transformer2DModel
+
+A Diffusion Transformer model for 2D data from [CogView4]().
+
+The model can be loaded with the following code snippet.
+
+```python
+import torch
+
+from diffusers import CogView4Transformer2DModel
+
+transformer = CogView4Transformer2DModel.from_pretrained("THUDM/CogView4-6B", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
+```
+
+## CogView4Transformer2DModel
+
+[[autodoc]] CogView4Transformer2DModel
+
+## Transformer2DModelOutput
+
+[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
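
Because the snippet above loads only the transformer, a common follow-up is to hand it to the pipeline so the checkpoint's transformer weights are not loaded twice. A sketch of that pattern, assuming the same `THUDM/CogView4-6B` layout:

```python
import torch

from diffusers import CogView4Pipeline, CogView4Transformer2DModel

transformer = CogView4Transformer2DModel.from_pretrained(
    "THUDM/CogView4-6B", subfolder="transformer", torch_dtype=torch.bfloat16
)
# Components passed explicitly to from_pretrained override the ones in the
# checkpoint, so the pipeline reuses this transformer instance.
pipe = CogView4Pipeline.from_pretrained(
    "THUDM/CogView4-6B", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")
```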
docs/source/en/api/pipelines/cogview4.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+-->
+
+# CogView4
+
+<Tip>
+
+Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
+
+</Tip>
+
+This pipeline was contributed by [zRzRzRzRzRzRzR](https://github.com/zRzRzRzRzRzRzR). The original codebase can be found [here](https://huggingface.co/THUDM). The original weights can be found under [hf.co/THUDM](https://huggingface.co/THUDM).
+
+## CogView4Pipeline
+
+[[autodoc]] CogView4Pipeline
+  - all
+  - __call__
+
+## CogView4PipelineOutput
+
+[[autodoc]] pipelines.cogview4.pipeline_output.CogView4PipelineOutput
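
For orientation, a minimal text-to-image sketch with the new pipeline — the prompt and sampling settings are illustrative, not recommendations from the CogView4 authors:

```python
import torch

from diffusers import CogView4Pipeline

pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", torch_dtype=torch.bfloat16).to("cuda")

image = pipe(
    prompt="A photo of an astronaut riding a horse on Mars",
    num_inference_steps=50,  # illustrative step count
    guidance_scale=3.5,      # illustrative guidance strength
).images[0]
image.save("cogview4.png")
```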

docs/source/en/training/custom_diffusion.md

Lines changed: 4 additions & 1 deletion
@@ -339,7 +339,10 @@ import torch
 from huggingface_hub.repocard import RepoCard
 from diffusers import DiffusionPipeline

-pipeline = DiffusionPipeline.from_pretrained("sayakpaul/custom-diffusion-cat-wooden-pot", torch_dtype=torch.float16).to("cuda")
+pipeline = DiffusionPipeline.from_pretrained(
+    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16,
+).to("cuda")
+model_id = "sayakpaul/custom-diffusion-cat-wooden-pot"
 pipeline.unet.load_attn_procs(model_id, weight_name="pytorch_custom_diffusion_weights.bin")
 pipeline.load_textual_inversion(model_id, weight_name="<new1>.bin")
 pipeline.load_textual_inversion(model_id, weight_name="<new2>.bin")
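
The edit points `from_pretrained` at the base Stable Diffusion checkpoint and defines `model_id` before it is used, so the fine-tuned repo only supplies the Custom Diffusion attention processors and the `<new1>`/`<new2>` embeddings. Once attached, inference runs as with any Stable Diffusion pipeline; a short sketch (the prompt and sampler settings here are illustrative):

```python
image = pipeline(
    "<new1> cat sitting in a wooden pot",
    num_inference_steps=100,
    guidance_scale=6.0,
    eta=1.0,
).images[0]
image.save("custom-diffusion-cat.png")
```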

examples/community/pipeline_flux_differential_img2img.py

Lines changed: 2 additions & 2 deletions
@@ -87,7 +87,7 @@ def calculate_shift(
     base_seq_len: int = 256,
     max_seq_len: int = 4096,
     base_shift: float = 0.5,
-    max_shift: float = 1.16,
+    max_shift: float = 1.15,
 ):
     m = (max_shift - base_shift) / (max_seq_len - base_seq_len)
     b = base_shift - m * base_seq_len
@@ -878,7 +878,7 @@ def __call__(
             self.scheduler.config.get("base_image_seq_len", 256),
             self.scheduler.config.get("max_image_seq_len", 4096),
             self.scheduler.config.get("base_shift", 0.5),
-            self.scheduler.config.get("max_shift", 1.16),
+            self.scheduler.config.get("max_shift", 1.15),
         )
         timesteps, num_inference_steps = retrieve_timesteps(
             self.scheduler,
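
The same one-line default change recurs in the community pipelines below, since each vendors its own copy of `calculate_shift`. The function linearly interpolates the timestep shift between `base_shift` and `max_shift` as the image token sequence length grows from `base_seq_len` to `max_seq_len`. A self-contained sketch of what the new 1.15 default produces (the sample sequence lengths are illustrative):

```python
def calculate_shift(
    image_seq_len: int,
    base_seq_len: int = 256,
    max_seq_len: int = 4096,
    base_shift: float = 0.5,
    max_shift: float = 1.15,  # updated default, previously 1.16
) -> float:
    # Line through (base_seq_len, base_shift) and (max_seq_len, max_shift).
    m = (max_shift - base_shift) / (max_seq_len - base_seq_len)
    b = base_shift - m * base_seq_len
    return image_seq_len * m + b

print(calculate_shift(256))   # 0.5  -> base shift at the shortest sequence
print(calculate_shift(4096))  # 1.15 -> the new maximum shift
print(calculate_shift(1024))  # ~0.63, interpolated between the two
```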

examples/community/pipeline_flux_rf_inversion.py

Lines changed: 3 additions & 3 deletions
@@ -94,7 +94,7 @@ def calculate_shift(
     base_seq_len: int = 256,
     max_seq_len: int = 4096,
     base_shift: float = 0.5,
-    max_shift: float = 1.16,
+    max_shift: float = 1.15,
 ):
     m = (max_shift - base_shift) / (max_seq_len - base_seq_len)
     b = base_shift - m * base_seq_len
@@ -823,7 +823,7 @@ def __call__(
             self.scheduler.config.get("base_image_seq_len", 256),
             self.scheduler.config.get("max_image_seq_len", 4096),
             self.scheduler.config.get("base_shift", 0.5),
-            self.scheduler.config.get("max_shift", 1.16),
+            self.scheduler.config.get("max_shift", 1.15),
         )
         timesteps, num_inference_steps = retrieve_timesteps(
             self.scheduler,
@@ -993,7 +993,7 @@ def invert(
             self.scheduler.config.get("base_image_seq_len", 256),
             self.scheduler.config.get("max_image_seq_len", 4096),
             self.scheduler.config.get("base_shift", 0.5),
-            self.scheduler.config.get("max_shift", 1.16),
+            self.scheduler.config.get("max_shift", 1.15),
         )
         timesteps, num_inversion_steps = retrieve_timesteps(
             self.scheduler,

examples/community/pipeline_flux_semantic_guidance.py

Lines changed: 2 additions & 2 deletions
@@ -91,7 +91,7 @@ def calculate_shift(
     base_seq_len: int = 256,
     max_seq_len: int = 4096,
     base_shift: float = 0.5,
-    max_shift: float = 1.16,
+    max_shift: float = 1.15,
 ):
     m = (max_shift - base_shift) / (max_seq_len - base_seq_len)
     b = base_shift - m * base_seq_len
@@ -1041,7 +1041,7 @@ def __call__(
             self.scheduler.config.get("base_image_seq_len", 256),
             self.scheduler.config.get("max_image_seq_len", 4096),
             self.scheduler.config.get("base_shift", 0.5),
-            self.scheduler.config.get("max_shift", 1.16),
+            self.scheduler.config.get("max_shift", 1.15),
         )
         timesteps, num_inference_steps = retrieve_timesteps(
             self.scheduler,

examples/community/pipeline_flux_with_cfg.py

Lines changed: 2 additions & 2 deletions
@@ -70,7 +70,7 @@ def calculate_shift(
     base_seq_len: int = 256,
     max_seq_len: int = 4096,
     base_shift: float = 0.5,
-    max_shift: float = 1.16,
+    max_shift: float = 1.15,
 ):
     m = (max_shift - base_shift) / (max_seq_len - base_seq_len)
     b = base_shift - m * base_seq_len
@@ -759,7 +759,7 @@ def __call__(
             self.scheduler.config.get("base_image_seq_len", 256),
             self.scheduler.config.get("max_image_seq_len", 4096),
             self.scheduler.config.get("base_shift", 0.5),
-            self.scheduler.config.get("max_shift", 1.16),
+            self.scheduler.config.get("max_shift", 1.15),
         )
         timesteps, num_inference_steps = retrieve_timesteps(
             self.scheduler,

examples/controlnet/train_controlnet.py

Lines changed: 1 addition & 1 deletion
@@ -1143,7 +1143,7 @@ def load_model_hook(models, input_dir):
         if global_step >= args.max_train_steps:
             break

-    # Create the pipeline using using the trained modules and save it.
+    # Create the pipeline using the trained modules and save it.
     accelerator.wait_for_everyone()
     if accelerator.is_main_process:
         controlnet = unwrap_model(controlnet)
