
Commit e9f6626

Merge remote-tracking branch 'origin/cogview4' into cogview4
2 parents: c7d1227 + 6090ea7

241 files changed: +3329 additions, -1025 deletions


.github/workflows/push_tests.yml

Lines changed: 2 additions & 2 deletions
@@ -83,7 +83,7 @@ jobs:
           python utils/print_env.py
       - name: PyTorch CUDA checkpoint tests on Ubuntu
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
           # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
           CUBLAS_WORKSPACE_CONFIG: :16:8
         run: |
@@ -137,7 +137,7 @@ jobs:

       - name: Run PyTorch CUDA tests
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
           # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
           CUBLAS_WORKSPACE_CONFIG: :16:8
         run: |
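Both hunks replace the broad `HF_TOKEN` secret with a read-scoped `DIFFUSERS_HF_HUB_READ_TOKEN`. For context, `huggingface_hub` picks the token up from the `HF_TOKEN` environment variable the workflow exports; a minimal sketch of the check a test job effectively relies on (the snippet is illustrative, not part of this commit):

```py
import os

from huggingface_hub import whoami

# huggingface_hub reads the HF_TOKEN environment variable automatically,
# so the workflow only needs to export it for gated downloads to succeed.
token = os.environ.get("HF_TOKEN")
if token is None:
    raise RuntimeError("HF_TOKEN is not set; gated checkpoints cannot be downloaded.")

# Sanity-check that the token is valid before the test suite starts.
print(whoami(token=token)["name"])
```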

docs/source/en/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -370,6 +370,8 @@
       title: CogVideoX
     - local: api/pipelines/cogview3
       title: CogView3
+    - local: api/pipelines/cogview4
+      title: CogView4
     - local: api/pipelines/consistency_models
       title: Consistency Models
     - local: api/pipelines/controlnet
docs/source/en/api/pipelines/cogview4.md

Lines changed: 34 additions & 0 deletions

@@ -0,0 +1,34 @@
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+-->
+
+# CogView4
+
+<Tip>
+
+Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
+
+</Tip>
+
+This pipeline was contributed by [zRzRzRzRzRzRzR](https://github.com/zRzRzRzRzRzRzR). The original codebase can be found [here](https://huggingface.co/THUDM). The original weights can be found under [hf.co/THUDM](https://huggingface.co/THUDM).
+
+## CogView4Pipeline
+
+[[autodoc]] CogView4Pipeline
+  - all
+  - __call__
+
+## CogView4PipelineOutput
+
+[[autodoc]] pipelines.cogview4.pipeline_output.CogView4PipelineOutput
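For orientation, a minimal usage sketch of the pipeline this new page documents; the checkpoint id `THUDM/CogView4-6B` and the generation arguments are assumptions for illustration, not taken from this commit:

```py
import torch

from diffusers import CogView4Pipeline

# Checkpoint id is an assumption for illustration.
pipe = CogView4Pipeline.from_pretrained("THUDM/CogView4-6B", torch_dtype=torch.bfloat16)
pipe.to("cuda")

image = pipe("a watercolor painting of a lighthouse at dawn").images[0]
image.save("cogview4.png")
```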

docs/source/en/using-diffusers/other-formats.md

Lines changed: 40 additions & 0 deletions
@@ -240,6 +240,46 @@ Benefits of using a single-file layout include:
 1. Easy compatibility with diffusion interfaces such as [ComfyUI](https://github.com/comfyanonymous/ComfyUI) or [Automatic1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui) which commonly use a single-file layout.
 2. Easier to manage (download and share) a single file.

+### DDUF
+
+> [!WARNING]
+> DDUF is an experimental file format and APIs related to it can change in the future.
+
+DDUF (**D**DUF **D**iffusion **U**nified **F**ormat) is a file format designed to make storing, distributing, and using diffusion models much easier. Built on the ZIP file format, DDUF offers a standardized, efficient, and flexible way to package all parts of a diffusion model into a single, easy-to-manage file. It provides a balance between Diffusers multi-folder format and the widely popular single-file format.
+
+Learn more details about DDUF on the Hugging Face Hub [documentation](https://huggingface.co/docs/hub/dduf).
+
+Pass a checkpoint to the `dduf_file` parameter to load it in [`DiffusionPipeline`].
+
+```py
+from diffusers import DiffusionPipeline
+import torch
+
+pipe = DiffusionPipeline.from_pretrained(
+    "DDUF/FLUX.1-dev-DDUF", dduf_file="FLUX.1-dev.dduf", torch_dtype=torch.bfloat16
+).to("cuda")
+image = pipe(
+    "photo a cat holding a sign that says Diffusers", num_inference_steps=50, guidance_scale=3.5
+).images[0]
+image.save("cat.png")
+```
+
+To save a pipeline as a `.dduf` checkpoint, use the [`~huggingface_hub.export_folder_as_dduf`] utility, which takes care of all the necessary file-level validations.
+
+```py
+from huggingface_hub import export_folder_as_dduf
+from diffusers import DiffusionPipeline
+import torch
+
+pipe = DiffusionPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
+
+save_folder = "flux-dev"
+pipe.save_pretrained("flux-dev")
+export_folder_as_dduf("flux-dev.dduf", folder_path=save_folder)
+```
+
+> [!TIP]
+> Packaging and loading quantized checkpoints in the DDUF format is supported as long as they respect the multi-folder structure.
+
 ## Convert layout and files

 Diffusers provides many scripts and methods to convert storage layouts and file formats to enable broader support across the diffusion ecosystem.
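The new TIP implies that a quantized checkpoint packaged as DDUF loads through the same `dduf_file` argument; a sketch under that assumption (the repo and file names below are hypothetical):

```py
import torch

from diffusers import DiffusionPipeline

# Hypothetical archive: a DDUF file whose components were quantized before
# export. Loading goes through the same dduf_file argument shown above.
pipe = DiffusionPipeline.from_pretrained(
    "my-org/FLUX.1-dev-quantized-DDUF",  # hypothetical repo id
    dduf_file="FLUX.1-dev-quantized.dduf",  # hypothetical file name
    torch_dtype=torch.bfloat16,
)
```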

examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced.py

Lines changed: 4 additions & 4 deletions
@@ -818,9 +818,9 @@ def initialize_new_tokens(self, inserting_toks: List[str]):
         idx = 0
         for tokenizer, text_encoder in zip(self.tokenizers, self.text_encoders):
             assert isinstance(inserting_toks, list), "inserting_toks should be a list of strings."
-            assert all(
-                isinstance(tok, str) for tok in inserting_toks
-            ), "All elements in inserting_toks should be strings."
+            assert all(isinstance(tok, str) for tok in inserting_toks), (
+                "All elements in inserting_toks should be strings."
+            )

             self.inserting_toks = inserting_toks
             special_tokens_dict = {"additional_special_tokens": self.inserting_toks}
@@ -1683,7 +1683,7 @@ def load_model_hook(models, input_dir):
         lora_state_dict = FluxPipeline.lora_state_dict(input_dir)

         transformer_state_dict = {
-            f'{k.replace("transformer.", "")}': v for k, v in lora_state_dict.items() if k.startswith("transformer.")
+            f"{k.replace('transformer.', '')}": v for k, v in lora_state_dict.items() if k.startswith("transformer.")
         }
         transformer_state_dict = convert_unet_state_dict_to_peft(transformer_state_dict)
         incompatible_keys = set_peft_model_state_dict(transformer_, transformer_state_dict, adapter_name="default")
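The second hunk only flips the quote style inside the f-string (the `f"..."` wrapper is redundant in any case); the comprehension's job is to keep the transformer entries of the LoRA state dict and strip their prefix. A standalone illustration with toy values:

```py
lora_state_dict = {
    "transformer.attn.to_q.lora_A.weight": "tensor-A",  # toy placeholder values
    "text_encoder.embed.lora_A.weight": "tensor-B",
}

# Keep only transformer keys and drop the "transformer." prefix,
# mirroring the comprehension in load_model_hook.
transformer_state_dict = {
    k.replace("transformer.", ""): v
    for k, v in lora_state_dict.items()
    if k.startswith("transformer.")
}
assert transformer_state_dict == {"attn.to_q.lora_A.weight": "tensor-A"}
```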

examples/advanced_diffusion_training/train_dreambooth_lora_sd15_advanced.py

Lines changed: 8 additions & 8 deletions
@@ -200,7 +200,7 @@ def save_model_card(
         "diffusers",
         "diffusers-training",
         lora,
-        "template:sd-lora" "stable-diffusion",
+        "template:sd-lorastable-diffusion",
         "stable-diffusion-diffusers",
     ]
     model_card = populate_model_card(model_card, tags=tags)
@@ -724,9 +724,9 @@ def initialize_new_tokens(self, inserting_toks: List[str]):
         idx = 0
         for tokenizer, text_encoder in zip(self.tokenizers, self.text_encoders):
             assert isinstance(inserting_toks, list), "inserting_toks should be a list of strings."
-            assert all(
-                isinstance(tok, str) for tok in inserting_toks
-            ), "All elements in inserting_toks should be strings."
+            assert all(isinstance(tok, str) for tok in inserting_toks), (
+                "All elements in inserting_toks should be strings."
+            )

             self.inserting_toks = inserting_toks
             special_tokens_dict = {"additional_special_tokens": self.inserting_toks}
@@ -746,9 +746,9 @@ def initialize_new_tokens(self, inserting_toks: List[str]):
                 .to(dtype=self.dtype)
                 * std_token_embedding
             )
-            self.embeddings_settings[
-                f"original_embeddings_{idx}"
-            ] = text_encoder.text_model.embeddings.token_embedding.weight.data.clone()
+            self.embeddings_settings[f"original_embeddings_{idx}"] = (
+                text_encoder.text_model.embeddings.token_embedding.weight.data.clone()
+            )
             self.embeddings_settings[f"std_token_embedding_{idx}"] = std_token_embedding

             inu = torch.ones((len(tokenizer),), dtype=torch.bool)
@@ -1322,7 +1322,7 @@ def load_model_hook(models, input_dir):

         lora_state_dict, network_alphas = StableDiffusionPipeline.lora_state_dict(input_dir)

-        unet_state_dict = {f'{k.replace("unet.", "")}': v for k, v in lora_state_dict.items() if k.startswith("unet.")}
+        unet_state_dict = {f"{k.replace('unet.', '')}": v for k, v in lora_state_dict.items() if k.startswith("unet.")}
         unet_state_dict = convert_unet_state_dict_to_peft(unet_state_dict)
         incompatible_keys = set_peft_model_state_dict(unet_, unet_state_dict, adapter_name="default")
         if incompatible_keys is not None:
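The first hunk in this file makes an implicit string concatenation explicit: in Python, adjacent string literals fuse at compile time, so the old tag list already contained one fused tag rather than two separate ones. A minimal demonstration:

```py
# Adjacent string literals are concatenated at compile time, so the
# old code produced a single fused tag instead of two separate ones.
tags = ["template:sd-lora" "stable-diffusion"]
assert tags == ["template:sd-lorastable-diffusion"]
assert len(tags) == 1
```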

examples/advanced_diffusion_training/train_dreambooth_lora_sdxl_advanced.py

Lines changed: 8 additions & 8 deletions
@@ -116,7 +116,7 @@ def save_model_card(
     for i, image in enumerate(images):
         image.save(os.path.join(repo_folder, f"image_{i}.png"))
         img_str += f"""
-    - text: '{validation_prompt if validation_prompt else ' ' }'
+    - text: '{validation_prompt if validation_prompt else " "}'
       output:
         url:
             "image_{i}.png"
@@ -891,9 +891,9 @@ def initialize_new_tokens(self, inserting_toks: List[str]):
         idx = 0
         for tokenizer, text_encoder in zip(self.tokenizers, self.text_encoders):
             assert isinstance(inserting_toks, list), "inserting_toks should be a list of strings."
-            assert all(
-                isinstance(tok, str) for tok in inserting_toks
-            ), "All elements in inserting_toks should be strings."
+            assert all(isinstance(tok, str) for tok in inserting_toks), (
+                "All elements in inserting_toks should be strings."
+            )

             self.inserting_toks = inserting_toks
             special_tokens_dict = {"additional_special_tokens": self.inserting_toks}
@@ -913,9 +913,9 @@ def initialize_new_tokens(self, inserting_toks: List[str]):
                 .to(dtype=self.dtype)
                 * std_token_embedding
             )
-            self.embeddings_settings[
-                f"original_embeddings_{idx}"
-            ] = text_encoder.text_model.embeddings.token_embedding.weight.data.clone()
+            self.embeddings_settings[f"original_embeddings_{idx}"] = (
+                text_encoder.text_model.embeddings.token_embedding.weight.data.clone()
+            )
             self.embeddings_settings[f"std_token_embedding_{idx}"] = std_token_embedding

             inu = torch.ones((len(tokenizer),), dtype=torch.bool)
@@ -1648,7 +1648,7 @@ def load_model_hook(models, input_dir):

         lora_state_dict, network_alphas = StableDiffusionLoraLoaderMixin.lora_state_dict(input_dir)

-        unet_state_dict = {f'{k.replace("unet.", "")}': v for k, v in lora_state_dict.items() if k.startswith("unet.")}
+        unet_state_dict = {f"{k.replace('unet.', '')}": v for k, v in lora_state_dict.items() if k.startswith("unet.")}
         unet_state_dict = convert_unet_state_dict_to_peft(unet_state_dict)
         incompatible_keys = set_peft_model_state_dict(unet_, unet_state_dict, adapter_name="default")
         if incompatible_keys is not None:

examples/amused/train_amused.py

Lines changed: 1 addition & 1 deletion
@@ -720,7 +720,7 @@ def load_model_hook(models, input_dir):
     # Train!
     logger.info("***** Running training *****")
     logger.info(f"  Num training steps = {args.max_train_steps}")
-    logger.info(f"  Instantaneous batch size per device = { args.train_batch_size}")
+    logger.info(f"  Instantaneous batch size per device = {args.train_batch_size}")
     logger.info(f"  Total train batch size (w. parallel, distributed & accumulation) = {total_batch_size}")
     logger.info(f"  Gradient Accumulation steps = {args.gradient_accumulation_steps}")

examples/cogvideo/train_cogvideox_image_to_video_lora.py

Lines changed: 1 addition & 1 deletion
@@ -1138,7 +1138,7 @@ def load_model_hook(models, input_dir):
         lora_state_dict = CogVideoXImageToVideoPipeline.lora_state_dict(input_dir)

         transformer_state_dict = {
-            f'{k.replace("transformer.", "")}': v for k, v in lora_state_dict.items() if k.startswith("transformer.")
+            f"{k.replace('transformer.', '')}": v for k, v in lora_state_dict.items() if k.startswith("transformer.")
         }
         transformer_state_dict = convert_unet_state_dict_to_peft(transformer_state_dict)
         incompatible_keys = set_peft_model_state_dict(transformer_, transformer_state_dict, adapter_name="default")

examples/cogvideo/train_cogvideox_lora.py

Lines changed: 1 addition & 1 deletion
@@ -1159,7 +1159,7 @@ def load_model_hook(models, input_dir):
         lora_state_dict = CogVideoXPipeline.lora_state_dict(input_dir)

         transformer_state_dict = {
-            f'{k.replace("transformer.", "")}': v for k, v in lora_state_dict.items() if k.startswith("transformer.")
+            f"{k.replace('transformer.', '')}": v for k, v in lora_state_dict.items() if k.startswith("transformer.")
         }
         transformer_state_dict = convert_unet_state_dict_to_peft(transformer_state_dict)
         incompatible_keys = set_peft_model_state_dict(transformer_, transformer_state_dict, adapter_name="default")
