
Commit e9ff8bc

Merge branch 'main' into rf-inversion
2 parents: b2c6455 + 8421c14 · commit e9ff8bc

21 files changed: +134 −102 lines

README.md

Lines changed: 1 addition & 1 deletion
@@ -114,7 +114,7 @@ Check out the [Quickstart](https://huggingface.co/docs/diffusers/quicktour) to l
 | [Tutorial](https://huggingface.co/docs/diffusers/tutorials/tutorial_overview) | A basic crash course for learning how to use the library's most important features like using models and schedulers to build your own diffusion system, and training your own diffusion model. |
 | [Loading](https://huggingface.co/docs/diffusers/using-diffusers/loading_overview) | Guides for how to load and configure all the components (pipelines, models, and schedulers) of the library, as well as how to use different schedulers. |
 | [Pipelines for inference](https://huggingface.co/docs/diffusers/using-diffusers/pipeline_overview) | Guides for how to use pipelines for different inference tasks, batched generation, controlling generated outputs and randomness, and how to contribute a pipeline to the library. |
-| [Optimization](https://huggingface.co/docs/diffusers/optimization/opt_overview) | Guides for how to optimize your diffusion model to run faster and consume less memory. |
+| [Optimization](https://huggingface.co/docs/diffusers/optimization/fp16) | Guides for how to optimize your diffusion model to run faster and consume less memory. |
 | [Training](https://huggingface.co/docs/diffusers/training/overview) | Guides for how to train a diffusion model for different tasks with different training techniques. |
 ## Contribution

examples/community/README.md

Lines changed: 14 additions & 8 deletions
Large diffs are not rendered by default.

examples/community/README_community_scripts.md

Lines changed: 3 additions & 3 deletions
@@ -6,9 +6,9 @@ If a community script doesn't work as expected, please open an issue and ping th

 | Example | Description | Code Example | Colab | Author |
 |:--------|:------------|:-------------|:------|-------:|
-| Using IP-Adapter with Negative Noise | Using negative noise with IP-adapter to better control the generation (see the [original post](https://github.com/huggingface/diffusers/discussions/7167) on the forum for more details) | [IP-Adapter Negative Noise](#ip-adapter-negative-noise) | https://github.com/huggingface/notebooks/blob/main/diffusers/ip_adapter_negative_noise.ipynb | [Álvaro Somoza](https://github.com/asomoza)|
-| Asymmetric Tiling |configure seamless image tiling independently for the X and Y axes | [Asymmetric Tiling](#Asymmetric-Tiling ) |https://github.com/huggingface/notebooks/blob/main/diffusers/asymetric_tiling.ipynb | [alexisrolland](https://github.com/alexisrolland)|
-| Prompt Scheduling Callback |Allows changing prompts during a generation | [Prompt Scheduling-Callback](#Prompt-Scheduling-Callback ) |https://github.com/huggingface/notebooks/blob/main/diffusers/prompt_scheduling_callback.ipynb | [hlky](https://github.com/hlky)|
+| Using IP-Adapter with Negative Noise | Using negative noise with IP-adapter to better control the generation (see the [original post](https://github.com/huggingface/diffusers/discussions/7167) on the forum for more details) | [IP-Adapter Negative Noise](#ip-adapter-negative-noise) | [Notebook](https://github.com/huggingface/notebooks/blob/main/diffusers/ip_adapter_negative_noise.ipynb) | [Álvaro Somoza](https://github.com/asomoza)|
+| Asymmetric Tiling |configure seamless image tiling independently for the X and Y axes | [Asymmetric Tiling](#Asymmetric-Tiling ) |[Notebook](https://github.com/huggingface/notebooks/blob/main/diffusers/asymetric_tiling.ipynb) | [alexisrolland](https://github.com/alexisrolland)|
+| Prompt Scheduling Callback |Allows changing prompts during a generation | [Prompt Scheduling-Callback](#Prompt-Scheduling-Callback ) |[Notebook](https://github.com/huggingface/notebooks/blob/main/diffusers/prompt_scheduling_callback.ipynb) | [hlky](https://github.com/hlky)|

 ## Example usages

examples/controlnet/train_controlnet.py

Lines changed: 1 addition & 3 deletions
@@ -571,9 +571,6 @@ def parse_args(input_args=None):
     if args.dataset_name is None and args.train_data_dir is None:
         raise ValueError("Specify either `--dataset_name` or `--train_data_dir`")

-    if args.dataset_name is not None and args.train_data_dir is not None:
-        raise ValueError("Specify only one of `--dataset_name` or `--train_data_dir`")
-
     if args.proportion_empty_prompts < 0 or args.proportion_empty_prompts > 1:
         raise ValueError("`--proportion_empty_prompts` must be in the range [0, 1].")

@@ -615,6 +612,7 @@ def make_train_dataset(args, tokenizer, accelerator):
             args.dataset_name,
             args.dataset_config_name,
             cache_dir=args.cache_dir,
+            data_dir=args.train_data_dir,
         )
     else:
         if args.train_data_dir is not None:
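This hunk (and the matching ones in train_controlnet_sdxl.py and train_text_to_image_sdxl.py below) drops the check that made `--dataset_name` and `--train_data_dir` mutually exclusive and instead forwards the local folder to `load_dataset` via `data_dir`. A minimal sketch of the resulting call, with a hypothetical builder name and folder path standing in for the parsed arguments:

```python
from datasets import load_dataset

# Hypothetical values standing in for args.dataset_name, args.dataset_config_name,
# args.cache_dir, and args.train_data_dir after this change.
dataset = load_dataset(
    "imagefolder",                      # args.dataset_name: a Hub dataset or builder script
    None,                               # args.dataset_config_name
    cache_dir=None,                     # args.cache_dir
    data_dir="./my_controlnet_images",  # args.train_data_dir: local files for the builder
)
print(dataset)
```

With both flags supplied, the builder named by `--dataset_name` resolves its data files from `--train_data_dir` instead of the two options being mutually exclusive.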

examples/controlnet/train_controlnet_sdxl.py

Lines changed: 1 addition & 3 deletions
@@ -598,9 +598,6 @@ def parse_args(input_args=None):
     if args.dataset_name is None and args.train_data_dir is None:
         raise ValueError("Specify either `--dataset_name` or `--train_data_dir`")

-    if args.dataset_name is not None and args.train_data_dir is not None:
-        raise ValueError("Specify only one of `--dataset_name` or `--train_data_dir`")
-
     if args.proportion_empty_prompts < 0 or args.proportion_empty_prompts > 1:
         raise ValueError("`--proportion_empty_prompts` must be in the range [0, 1].")

@@ -642,6 +639,7 @@ def get_train_dataset(args, accelerator):
             args.dataset_name,
             args.dataset_config_name,
             cache_dir=args.cache_dir,
+            data_dir=args.train_data_dir,
         )
     else:
         if args.train_data_dir is not None:

examples/text_to_image/train_text_to_image_sdxl.py

Lines changed: 1 addition & 4 deletions
@@ -483,7 +483,6 @@ def parse_args(input_args=None):
     # Sanity checks
     if args.dataset_name is None and args.train_data_dir is None:
         raise ValueError("Need either a dataset name or a training folder.")
-
     if args.proportion_empty_prompts < 0 or args.proportion_empty_prompts > 1:
         raise ValueError("`--proportion_empty_prompts` must be in the range [0, 1].")

@@ -824,9 +823,7 @@ def load_model_hook(models, input_dir):
     if args.dataset_name is not None:
         # Downloading and loading a dataset from the hub.
         dataset = load_dataset(
-            args.dataset_name,
-            args.dataset_config_name,
-            cache_dir=args.cache_dir,
+            args.dataset_name, args.dataset_config_name, cache_dir=args.cache_dir, data_dir=args.train_data_dir
         )
     else:
         data_files = {}

src/diffusers/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -338,8 +338,8 @@
             "StableDiffusion3ControlNetPipeline",
             "StableDiffusion3Img2ImgPipeline",
             "StableDiffusion3InpaintPipeline",
-            "StableDiffusion3PAGPipeline",
             "StableDiffusion3PAGImg2ImgPipeline",
+            "StableDiffusion3PAGPipeline",
             "StableDiffusion3Pipeline",
             "StableDiffusionAdapterPipeline",
             "StableDiffusionAttendAndExcitePipeline",

src/diffusers/models/autoencoders/autoencoder_kl_temporal_decoder.py

Lines changed: 2 additions & 1 deletion
@@ -11,6 +11,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
+import itertools
 from typing import Dict, Optional, Tuple, Union

 import torch

@@ -94,7 +95,7 @@ def forward(

         sample = self.conv_in(sample)

-        upscale_dtype = next(iter(self.up_blocks.parameters())).dtype
+        upscale_dtype = next(itertools.chain(self.up_blocks.parameters(), self.up_blocks.buffers())).dtype
         if torch.is_grad_enabled() and self.gradient_checkpointing:

             def create_custom_forward(module):
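The `upscale_dtype` change makes the dtype lookup fall back to buffers when `up_blocks` exposes no parameters, since `next(iter(module.parameters()))` raises `StopIteration` on a parameter-less module. A minimal sketch with a hypothetical buffer-only module:

```python
import itertools

import torch
import torch.nn as nn


class BufferOnlyBlock(nn.Module):
    """Hypothetical stand-in for up_blocks whose weights live in buffers, not parameters."""

    def __init__(self):
        super().__init__()
        self.register_buffer("scale", torch.ones(4, dtype=torch.float16))


up_blocks = BufferOnlyBlock()

# Old lookup: raises StopIteration because the module has no parameters.
# upscale_dtype = next(iter(up_blocks.parameters())).dtype

# New lookup: chains parameters and buffers, so the first available tensor decides the dtype.
upscale_dtype = next(itertools.chain(up_blocks.parameters(), up_blocks.buffers())).dtype
print(upscale_dtype)  # torch.float16
```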

src/diffusers/pipelines/allegro/pipeline_allegro.py

Lines changed: 6 additions & 13 deletions
@@ -251,13 +251,6 @@ def encode_prompt(
         if device is None:
             device = self._execution_device

-        if prompt is not None and isinstance(prompt, str):
-            batch_size = 1
-        elif prompt is not None and isinstance(prompt, list):
-            batch_size = len(prompt)
-        else:
-            batch_size = prompt_embeds.shape[0]
-
         # See Section 3.1. of the paper.
         max_length = max_sequence_length

@@ -302,12 +295,12 @@ def encode_prompt(
         # duplicate text embeddings and attention mask for each generation per prompt, using mps friendly method
         prompt_embeds = prompt_embeds.repeat(1, num_videos_per_prompt, 1)
         prompt_embeds = prompt_embeds.view(bs_embed * num_videos_per_prompt, seq_len, -1)
-        prompt_attention_mask = prompt_attention_mask.view(bs_embed, -1)
-        prompt_attention_mask = prompt_attention_mask.repeat(num_videos_per_prompt, 1)
+        prompt_attention_mask = prompt_attention_mask.repeat(1, num_videos_per_prompt)
+        prompt_attention_mask = prompt_attention_mask.view(bs_embed * num_videos_per_prompt, -1)

         # get unconditional embeddings for classifier free guidance
         if do_classifier_free_guidance and negative_prompt_embeds is None:
-            uncond_tokens = [negative_prompt] * batch_size if isinstance(negative_prompt, str) else negative_prompt
+            uncond_tokens = [negative_prompt] * bs_embed if isinstance(negative_prompt, str) else negative_prompt
             uncond_tokens = self._text_preprocessing(uncond_tokens, clean_caption=clean_caption)
             max_length = prompt_embeds.shape[1]
             uncond_input = self.tokenizer(

@@ -334,10 +327,10 @@ def encode_prompt(
             negative_prompt_embeds = negative_prompt_embeds.to(dtype=dtype, device=device)

             negative_prompt_embeds = negative_prompt_embeds.repeat(1, num_videos_per_prompt, 1)
-            negative_prompt_embeds = negative_prompt_embeds.view(batch_size * num_videos_per_prompt, seq_len, -1)
+            negative_prompt_embeds = negative_prompt_embeds.view(bs_embed * num_videos_per_prompt, seq_len, -1)

-            negative_prompt_attention_mask = negative_prompt_attention_mask.view(bs_embed, -1)
-            negative_prompt_attention_mask = negative_prompt_attention_mask.repeat(num_videos_per_prompt, 1)
+            negative_prompt_attention_mask = negative_prompt_attention_mask.repeat(1, num_videos_per_prompt)
+            negative_prompt_attention_mask = negative_prompt_attention_mask.view(bs_embed * num_videos_per_prompt, -1)
         else:
             negative_prompt_embeds = None
             negative_prompt_attention_mask = None

src/diffusers/pipelines/pag/pipeline_pag_pixart_sigma.py

Lines changed: 6 additions & 13 deletions
@@ -227,13 +227,6 @@ def encode_prompt(
         if device is None:
             device = self._execution_device

-        if prompt is not None and isinstance(prompt, str):
-            batch_size = 1
-        elif prompt is not None and isinstance(prompt, list):
-            batch_size = len(prompt)
-        else:
-            batch_size = prompt_embeds.shape[0]
-
         # See Section 3.1. of the paper.
         max_length = max_sequence_length

@@ -278,12 +271,12 @@ def encode_prompt(
         # duplicate text embeddings and attention mask for each generation per prompt, using mps friendly method
         prompt_embeds = prompt_embeds.repeat(1, num_images_per_prompt, 1)
         prompt_embeds = prompt_embeds.view(bs_embed * num_images_per_prompt, seq_len, -1)
-        prompt_attention_mask = prompt_attention_mask.view(bs_embed, -1)
-        prompt_attention_mask = prompt_attention_mask.repeat(num_images_per_prompt, 1)
+        prompt_attention_mask = prompt_attention_mask.repeat(1, num_images_per_prompt)
+        prompt_attention_mask = prompt_attention_mask.view(bs_embed * num_images_per_prompt, -1)

         # get unconditional embeddings for classifier free guidance
         if do_classifier_free_guidance and negative_prompt_embeds is None:
-            uncond_tokens = [negative_prompt] * batch_size if isinstance(negative_prompt, str) else negative_prompt
+            uncond_tokens = [negative_prompt] * bs_embed if isinstance(negative_prompt, str) else negative_prompt
             uncond_tokens = self._text_preprocessing(uncond_tokens, clean_caption=clean_caption)
             max_length = prompt_embeds.shape[1]
             uncond_input = self.tokenizer(

@@ -310,10 +303,10 @@ def encode_prompt(
             negative_prompt_embeds = negative_prompt_embeds.to(dtype=dtype, device=device)

             negative_prompt_embeds = negative_prompt_embeds.repeat(1, num_images_per_prompt, 1)
-            negative_prompt_embeds = negative_prompt_embeds.view(batch_size * num_images_per_prompt, seq_len, -1)
+            negative_prompt_embeds = negative_prompt_embeds.view(bs_embed * num_images_per_prompt, seq_len, -1)

-            negative_prompt_attention_mask = negative_prompt_attention_mask.view(bs_embed, -1)
-            negative_prompt_attention_mask = negative_prompt_attention_mask.repeat(num_images_per_prompt, 1)
+            negative_prompt_attention_mask = negative_prompt_attention_mask.repeat(1, num_images_per_prompt)
+            negative_prompt_attention_mask = negative_prompt_attention_mask.view(bs_embed * num_images_per_prompt, -1)
         else:
             negative_prompt_embeds = None
             negative_prompt_attention_mask = None
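Both pipeline hunks apply the same fix to `encode_prompt`: the locally recomputed `batch_size` is dropped in favor of `bs_embed` (taken from the prompt embeddings), and the attention masks are now duplicated with the same repeat-then-view pattern as the embeddings, so each prompt's copies stay contiguous when both the batch size and the per-prompt count are greater than one. A small sketch of the ordering difference, using toy shapes:

```python
import torch

# Toy shapes: 2 prompts in the batch, 3 generations per prompt, sequence length 4.
bs_embed, num_per_prompt, seq_len = 2, 3, 4
# Row i of the mask is filled with i so the ordering is easy to read off.
mask = torch.arange(bs_embed).unsqueeze(1).repeat(1, seq_len)

# Old ordering: tiles whole batches -> prompt order 0, 1, 0, 1, 0, 1
old = mask.view(bs_embed, -1).repeat(num_per_prompt, 1)

# New ordering: tiles per prompt -> prompt order 0, 0, 0, 1, 1, 1,
# matching prompt_embeds.repeat(1, n, 1).view(bs_embed * n, seq_len, -1)
new = mask.repeat(1, num_per_prompt).view(bs_embed * num_per_prompt, -1)

print(old[:, 0].tolist())  # [0, 1, 0, 1, 0, 1]
print(new[:, 0].tolist())  # [0, 0, 0, 1, 1, 1]
```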
