
Commit 2ffd45d

Merge branch 'main' into Add-Lora-Alpha-For-HiDream-Lora
2 parents 6bafe8f + f36ba9f commit 2ffd45d

File tree

95 files changed, +11910 −1342 lines


docs/source/en/_toctree.yml

Lines changed: 168 additions & 153 deletions
Large diffs are not rendered by default.

docs/source/en/api/loaders/lora.md

Lines changed: 7 additions & 2 deletions
```diff
@@ -26,6 +26,7 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
 - [`HunyuanVideoLoraLoaderMixin`] provides similar functions for [HunyuanVideo](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hunyuan_video).
 - [`Lumina2LoraLoaderMixin`] provides similar functions for [Lumina2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/lumina2).
 - [`WanLoraLoaderMixin`] provides similar functions for [Wan](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan).
+- [`SkyReelsV2LoraLoaderMixin`] provides similar functions for [SkyReels-V2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/skyreels_v2).
 - [`CogView4LoraLoaderMixin`] provides similar functions for [CogView4](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogview4).
 - [`AmusedLoraLoaderMixin`] is for the [`AmusedPipeline`].
 - [`HiDreamImageLoraLoaderMixin`] provides similar functions for [HiDream Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hidream)
@@ -92,6 +93,10 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse
 
 [[autodoc]] loaders.lora_pipeline.WanLoraLoaderMixin
 
+## SkyReelsV2LoraLoaderMixin
+
+[[autodoc]] loaders.lora_pipeline.SkyReelsV2LoraLoaderMixin
+
 ## AmusedLoraLoaderMixin
 
 [[autodoc]] loaders.lora_pipeline.AmusedLoraLoaderMixin
@@ -100,6 +105,6 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse
 
 [[autodoc]] loaders.lora_pipeline.HiDreamImageLoraLoaderMixin
 
-## WanLoraLoaderMixin
+## LoraBaseMixin
 
-[[autodoc]] loaders.lora_pipeline.WanLoraLoaderMixin
+[[autodoc]] loaders.lora_base.LoraBaseMixin
```
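All of these mixins expose the same `load_lora_weights` entry point on their pipelines. As a minimal sketch of the pattern (using the Wan pipeline; the LoRA path below is a placeholder, not a real checkpoint):

```python
# Minimal sketch: WanPipeline inherits WanLoraLoaderMixin, which provides
# load_lora_weights(). "path/to/wan-lora" is a placeholder, not a real repo.
import torch
from diffusers import WanPipeline

pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers", torch_dtype=torch.bfloat16
)
pipe.load_lora_weights("path/to/wan-lora", adapter_name="example")
```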
Lines changed: 30 additions & 0 deletions (new file)

````diff
@@ -0,0 +1,30 @@
+<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
+
+Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
+the License. You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
+an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
+specific language governing permissions and limitations under the License. -->
+
+# SkyReelsV2Transformer3DModel
+
+A Diffusion Transformer model for 3D video-like data was introduced in [SkyReels-V2](https://github.com/SkyworkAI/SkyReels-V2) by Skywork AI.
+
+The model can be loaded with the following code snippet.
+
+```python
+from diffusers import SkyReelsV2Transformer3DModel
+
+transformer = SkyReelsV2Transformer3DModel.from_pretrained("Skywork/SkyReels-V2-DF-1.3B-540P-Diffusers", subfolder="transformer", torch_dtype=torch.bfloat16)
+```
+
+## SkyReelsV2Transformer3DModel
+
+[[autodoc]] SkyReelsV2Transformer3DModel
+
+## Transformer2DModelOutput
+
+[[autodoc]] models.modeling_outputs.Transformer2DModelOutput
````
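Note that the snippet in the new page references `torch.bfloat16` without importing `torch`; a self-contained version of the same load would be:

```python
import torch
from diffusers import SkyReelsV2Transformer3DModel

transformer = SkyReelsV2Transformer3DModel.from_pretrained(
    "Skywork/SkyReels-V2-DF-1.3B-540P-Diffusers",
    subfolder="transformer",
    torch_dtype=torch.bfloat16,
)
```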

docs/source/en/api/pipelines/skyreels_v2.md

Lines changed: 367 additions & 0 deletions
Large diffs are not rendered by default.

docs/source/en/optimization/fp16.md

Lines changed: 9 additions & 1 deletion
```diff
@@ -239,6 +239,12 @@ The `step()` function is [called](https://github.com/huggingface/diffusers/blob/
 
 In general, the `sigmas` should [stay on the CPU](https://github.com/huggingface/diffusers/blob/35a969d297cba69110d175ee79c59312b9f49e1e/src/diffusers/schedulers/scheduling_euler_discrete.py#L240) to avoid the communication sync and latency.
 
+<Tip>
+
+Refer to the [torch.compile and Diffusers: A Hands-On Guide to Peak Performance](https://pytorch.org/blog/torch-compile-and-diffusers-a-hands-on-guide-to-peak-performance/) blog post for maximizing performance with `torch.compile` for diffusion models.
+
+</Tip>
+
 ### Benchmarks
 
 Refer to the [diffusers/benchmarks](https://huggingface.co/datasets/diffusers/benchmarks) dataset to see inference latency and memory usage data for compiled pipelines.
@@ -298,4 +304,6 @@ pipeline.fuse_qkv_projections()
 
 - Read the [Presenting Flux Fast: Making Flux go brrr on H100s](https://pytorch.org/blog/presenting-flux-fast-making-flux-go-brrr-on-h100s/) blog post to learn more about how you can combine all of these optimizations with [TorchInductor](https://docs.pytorch.org/docs/stable/torch.compiler.html) and [AOTInductor](https://docs.pytorch.org/docs/stable/torch.compiler_aot_inductor.html) for a ~2.5x speedup using recipes from [flux-fast](https://github.com/huggingface/flux-fast).
 
-These recipes support AMD hardware and [Flux.1 Kontext Dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev).
+These recipes support AMD hardware and [Flux.1 Kontext Dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev).
+- Read the [torch.compile and Diffusers: A Hands-On Guide to Peak Performance](https://pytorch.org/blog/torch-compile-and-diffusers-a-hands-on-guide-to-peak-performance/) blog post
+to maximize performance when using `torch.compile`.
```
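For reference, a minimal sketch of the `torch.compile` pattern the linked post covers, applied to a diffusers pipeline (the model id and compile settings are illustrative, not prescribed by this commit):

```python
import torch
from diffusers import FluxPipeline

pipeline = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
# Compile only the denoiser; the scheduler's sigmas stay on the CPU, so
# step() avoids a device-communication sync on every iteration.
pipeline.transformer = torch.compile(
    pipeline.transformer, mode="max-autotune", fullgraph=True
)
image = pipeline("an astronaut riding a horse", num_inference_steps=28).images[0]
```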

docs/source/en/tutorials/tutorial_overview.md

Lines changed: 0 additions & 23 deletions
This file was deleted.

docs/source/en/using-diffusers/overview_techniques.md

Lines changed: 0 additions & 18 deletions
This file was deleted.

examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced.py

Lines changed: 7 additions & 8 deletions
```diff
@@ -971,6 +971,7 @@ class DreamBoothDataset(Dataset):
 
     def __init__(
         self,
+        args,
         instance_data_root,
         instance_prompt,
         class_prompt,
@@ -980,10 +981,8 @@ def __init__(
         class_num=None,
         size=1024,
         repeats=1,
-        center_crop=False,
     ):
         self.size = size
-        self.center_crop = center_crop
 
         self.instance_prompt = instance_prompt
         self.custom_instance_prompts = None
@@ -1058,7 +1057,7 @@ def __init__(
         if interpolation is None:
             raise ValueError(f"Unsupported interpolation mode {interpolation=}.")
         train_resize = transforms.Resize(size, interpolation=interpolation)
-        train_crop = transforms.CenterCrop(size) if center_crop else transforms.RandomCrop(size)
+        train_crop = transforms.CenterCrop(size) if args.center_crop else transforms.RandomCrop(size)
         train_flip = transforms.RandomHorizontalFlip(p=1.0)
         train_transforms = transforms.Compose(
             [
@@ -1075,11 +1074,11 @@ def __init__(
                 # flip
                 image = train_flip(image)
             if args.center_crop:
-                y1 = max(0, int(round((image.height - args.resolution) / 2.0)))
-                x1 = max(0, int(round((image.width - args.resolution) / 2.0)))
+                y1 = max(0, int(round((image.height - self.size) / 2.0)))
+                x1 = max(0, int(round((image.width - self.size) / 2.0)))
                 image = train_crop(image)
             else:
-                y1, x1, h, w = train_crop.get_params(image, (args.resolution, args.resolution))
+                y1, x1, h, w = train_crop.get_params(image, (self.size, self.size))
                 image = crop(image, y1, x1, h, w)
             image = train_transforms(image)
             self.pixel_values.append(image)
@@ -1102,7 +1101,7 @@ def __init__(
         self.image_transforms = transforms.Compose(
             [
                 transforms.Resize(size, interpolation=interpolation),
-                transforms.CenterCrop(size) if center_crop else transforms.RandomCrop(size),
+                transforms.CenterCrop(size) if args.center_crop else transforms.RandomCrop(size),
                 transforms.ToTensor(),
                 transforms.Normalize([0.5], [0.5]),
             ]
@@ -1827,6 +1826,7 @@ def load_model_hook(models, input_dir):
 
     # Dataset and DataLoaders creation:
    train_dataset = DreamBoothDataset(
+        args=args,
         instance_data_root=args.instance_data_dir,
         instance_prompt=args.instance_prompt,
         train_text_encoder_ti=args.train_text_encoder_ti,
@@ -1836,7 +1836,6 @@ def load_model_hook(models, input_dir):
         class_num=args.num_class_images,
         size=args.resolution,
         repeats=args.repeats,
-        center_crop=args.center_crop,
     )
 
     train_dataloader = torch.utils.data.DataLoader(
```
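The net effect of this refactor: the dataset now reads `center_crop` directly from `args` and keys the crop math off its own `size` instead of `args.resolution`. A hypothetical standalone sketch of the resulting crop logic (all names are local to this example, not the script's):

```python
from PIL import Image
from torchvision import transforms
from torchvision.transforms.functional import crop

size = 1024           # stands in for self.size
center_crop = False   # stands in for args.center_crop
image = Image.new("RGB", (1280, 1536))

train_crop = transforms.CenterCrop(size) if center_crop else transforms.RandomCrop(size)
if center_crop:
    # Record the top-left corner of the deterministic center crop.
    y1 = max(0, int(round((image.height - size) / 2.0)))
    x1 = max(0, int(round((image.width - size) / 2.0)))
    image = train_crop(image)
else:
    # RandomCrop.get_params returns (top, left, height, width) for one random draw.
    y1, x1, h, w = train_crop.get_params(image, (size, size))
    image = crop(image, y1, x1, h, w)
```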

examples/dreambooth/train_dreambooth_lora_flux_kontext.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -1614,7 +1614,7 @@ def load_model_hook(models, input_dir):
         )
     if args.cond_image_column is not None:
         logger.info("I2I fine-tuning enabled.")
-        batch_sampler = BucketBatchSampler(train_dataset, batch_size=args.train_batch_size, drop_last=False)
+        batch_sampler = BucketBatchSampler(train_dataset, batch_size=args.train_batch_size, drop_last=True)
     train_dataloader = torch.utils.data.DataLoader(
         train_dataset,
         batch_sampler=batch_sampler,
```
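Switching `drop_last` to `True` discards any trailing batch smaller than `train_batch_size`. A quick illustration with PyTorch's stock `BatchSampler` (the script's `BucketBatchSampler` is its own class; this only shows the effect of the flag):

```python
from torch.utils.data import BatchSampler, SequentialSampler

sampler = SequentialSampler(range(10))
print(list(BatchSampler(sampler, batch_size=4, drop_last=False)))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]  <- ragged trailing batch kept
print(list(BatchSampler(sampler, batch_size=4, drop_last=True)))
# [[0, 1, 2, 3], [4, 5, 6, 7]]          <- trailing batch dropped
```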
