Commit 6e41806

fixfeat: added _get_crops_coords_list function to pipeline to automatically define ctop,cleft coord to focus on image generation, helps to better harmonize the image and corrects the problem of flattened elements.
1 parent 8a792cd commit 6e41806

3 files changed

+69
-18
lines changed

examples/community/README.md

Lines changed: 8 additions & 8 deletions
@@ -50,8 +50,9 @@ Please also check out our [Community Scripts](https://github.com/huggingface/dif
5050
| IADB Pipeline | Implementation of [Iterative α-(de)Blending: a Minimalist Deterministic Diffusion Model](https://arxiv.org/abs/2305.03486) | [IADB Pipeline](#iadb-pipeline) | - | [Thomas Chambon](https://github.com/tchambon)
5151
| Zero1to3 Pipeline | Implementation of [Zero-1-to-3: Zero-shot One Image to 3D Object](https://arxiv.org/abs/2303.11328) | [Zero1to3 Pipeline](#zero1to3-pipeline) | - | [Xin Kong](https://github.com/kxhit) |
5252
| Stable Diffusion XL Long Weighted Prompt Pipeline | A pipeline support unlimited length of prompt and negative prompt, use A1111 style of prompt weighting | [Stable Diffusion XL Long Weighted Prompt Pipeline](#stable-diffusion-xl-long-weighted-prompt-pipeline) | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1LsqilswLR40XLLcp6XFOl5nKb_wOe26W?usp=sharing) | [Andrew Zhu](https://xhinker.medium.com/) |
53-
| Stable Diffusion Mixture Tiling Pipeline SD 1.5 | A pipeline generates cohesive images by integrating multiple diffusion processes, each focused on a specific image region and considering boundary effects for smooth blending | [Stable Diffusion Mixture Tiling Pipeline SD 1.5](#stable-diffusion-mixture-tiling-sd-15) | [![Hugging Face Space](https://img.shields.io/badge/🤗%20Hugging%20Face-Space-yellow)](https://huggingface.co/spaces/albarji/mixture-of-diffusers) | [Álvaro B Jiménez](https://github.com/albarji/) |
54-
| Stable Diffusion Mixture Tiling Pipeline SDXL | A pipeline generates cohesive images by integrating multiple diffusion processes, each focused on a specific image region and considering boundary effects for smooth blending | [Stable Diffusion Mixture Tiling Pipeline SDXL](#stable-diffusion-mixture-tiling-sdxl) | [![Hugging Face Space](https://img.shields.io/badge/🤗%20Hugging%20Face-Space-yellow)](https://huggingface.co/spaces/elismasilva/mixture-of-diffusers-sdxl-tiling) | [Eliseu Silva](https://github.com/DEVAIEXP/) |
53+
| Stable Diffusion Mixture Tiling Pipeline SD 1.5 | A pipeline generates cohesive images by integrating multiple diffusion processes, each focused on a specific image region and considering boundary effects for smooth blending | [Stable Diffusion Mixture Tiling Pipeline SD 1.5](#stable-diffusion-mixture-tiling-pipeline-sd-15) | [![Hugging Face Space](https://img.shields.io/badge/🤗%20Hugging%20Face-Space-yellow)](https://huggingface.co/spaces/albarji/mixture-of-diffusers) | [Álvaro B Jiménez](https://github.com/albarji/) |
54+
| Stable Diffusion Mixture Canvas Pipeline SD 1.5 | A pipeline generates cohesive images by integrating multiple diffusion processes, each focused on a specific image region and considering boundary effects for smooth blending. Works by defining a list of Text2Image region objects that detail the region of influence of each diffuser. | [Stable Diffusion Mixture Canvas Pipeline SD 1.5](#stable-diffusion-mixture-canvas-pipeline-sd-15) | [![Hugging Face Space](https://img.shields.io/badge/🤗%20Hugging%20Face-Space-yellow)](https://huggingface.co/spaces/albarji/mixture-of-diffusers) | [Álvaro B Jiménez](https://github.com/albarji/) |
55+
| Stable Diffusion Mixture Tiling Pipeline SDXL | A pipeline generates cohesive images by integrating multiple diffusion processes, each focused on a specific image region and considering boundary effects for smooth blending | [Stable Diffusion Mixture Tiling Pipeline SDXL](#stable-diffusion-mixture-tiling-pipeline-sdxl) | [![Hugging Face Space](https://img.shields.io/badge/🤗%20Hugging%20Face-Space-yellow)](https://huggingface.co/spaces/elismasilva/mixture-of-diffusers-sdxl-tiling) | [Eliseu Silva](https://github.com/DEVAIEXP/) |
5556
| FABRIC - Stable Diffusion with feedback Pipeline | pipeline supports feedback from liked and disliked images | [Stable Diffusion Fabric Pipeline](#stable-diffusion-fabric-pipeline) | [Notebook](https://github.com/huggingface/notebooks/blob/main/diffusers/stable_diffusion_fabric.ipynb)| [Shauray Singh](https://shauray8.github.io/about_shauray/) |
5657
| sketch inpaint - Inpainting with non-inpaint Stable Diffusion | sketch inpaint much like in automatic1111 | [Masked Im2Im Stable Diffusion Pipeline](#stable-diffusion-masked-im2im) | - | [Anatoly Belikov](https://github.com/noskill) |
5758
| sketch inpaint xl - Inpainting with non-inpaint Stable Diffusion | sketch inpaint much like in automatic1111 | [Masked Im2Im Stable Diffusion XL Pipeline](#stable-diffusion-xl-masked-im2im) | - | [Anatoly Belikov](https://github.com/noskill) |
@@ -2404,7 +2405,7 @@ pipe_images = mixing_pipeline(
24042405

24052406
![image_mixing_result](https://huggingface.co/datasets/TheDenk/images_mixing/resolve/main/boromir_gigachad.png)
24062407

2407-
### Stable Diffusion Mixture Tiling SD 1.5
2408+
### Stable Diffusion Mixture Tiling Pipeline SD 1.5
24082409

24092410
This pipeline uses the Mixture. Refer to the [Mixture](https://arxiv.org/abs/2302.02412) paper for more details.
24102411

@@ -2435,7 +2436,7 @@ image = pipeline(
24352436

24362437
![mixture_tiling_results](https://huggingface.co/datasets/kadirnar/diffusers_readme_images/resolve/main/mixture_tiling.png)
24372438

2438-
### Stable Diffusion Mixture Canvas
2439+
### Stable Diffusion Mixture Canvas Pipeline SD 1.5
24392440

24402441
This pipeline uses the Mixture. Refer to the [Mixture](https://arxiv.org/abs/2302.02412) paper for more details.
24412442

@@ -2470,7 +2471,7 @@ output = pipeline(
24702471
![Input_Image](https://huggingface.co/datasets/kadirnar/diffusers_readme_images/resolve/main/input_image.png)
24712472
![mixture_canvas_results](https://huggingface.co/datasets/kadirnar/diffusers_readme_images/resolve/main/canvas.png)
24722473

2473-
### Stable Diffusion Mixture Tiling SDXL
2474+
### Stable Diffusion Mixture Tiling Pipeline SDXL
24742475

24752476
This pipeline uses the Mixture. Refer to the [Mixture](https://arxiv.org/abs/2302.02412) paper for more details.
24762477

@@ -2516,14 +2517,13 @@ image = pipe(
25162517
tile_col_overlap=256,
25172518
guidance_scale_tiles=[[7, 7, 7]], # or guidance_scale=7 if is the same for all prompts
25182519
height=1024,
2519-
width=3840,
2520-
target_size=(1024, 3840),
2520+
width=3840,
25212521
generator=generator,
25222522
num_inference_steps=30,
25232523
)["images"][0]
25242524
```
25252525

2526-
![mixture_tiling_results](https://huggingface.co/datasets/elismasilva/results/resolve/main/mixture_sdxl.png)
2526+
![mixture_tiling_results](https://huggingface.co/datasets/elismasilva/results/resolve/main/mixture_of_diffusers_sdxl_1.png)
25272527

25282528
### TensorRT Inpainting Stable Diffusion Pipeline
25292529

examples/community/mixture_tiling_sdxl.py

Lines changed: 59 additions & 8 deletions
@@ -1,4 +1,4 @@
1-
# Copyright 2024 The HuggingFace Team. All rights reserved.
1+
# Copyright 2025 The HuggingFace Team. All rights reserved.
22
#
33
# Licensed under the Apache License, Version 2.0 (the "License");
44
# you may not use this file except in compliance with the License.
@@ -151,6 +151,51 @@ def _tile2latent_exclusive_indices(
151151
return row_segment[0], row_segment[1], col_segment[0], col_segment[1]
152152

153153

154+
def _get_crops_coords_list(num_rows, num_cols, output_width):
155+
"""
156+
Generates a list of lists of `crops_coords_top_left` tuples for focusing on
157+
different horizontal parts of an image, and repeats this list for the specified
158+
number of rows in the output structure.
159+
160+
This function calculates `crops_coords_top_left` tuples to create horizontal
161+
focus variations (like left, center, right focus) based on `output_width`
162+
and `num_cols` (which represents the number of horizontal focus points/columns).
163+
It then repeats the *list* of these horizontal focus tuples `num_rows` times to
164+
create the final list of lists output structure.
165+
166+
Args:
167+
num_rows (int): The desired number of rows in the output list of lists.
168+
This determines how many times the list of horizontal
169+
focus variations will be repeated.
170+
num_cols (int): The number of horizontal focus points (columns) to generate.
171+
This determines how many horizontal focus variations are
172+
created based on dividing the `output_width`.
173+
output_width (int): The desired width of the output image.
174+
175+
Returns:
176+
list[list[tuple[int, int]]]: A list of lists of tuples. Each inner list
177+
contains `num_cols` tuples of `(ctop, cleft)`,
178+
representing horizontal focus points. The outer list
179+
contains `num_rows` such inner lists.
180+
"""
181+
crops_coords_list = []
182+
if num_cols <= 0:
183+
crops_coords_list = []
184+
elif num_cols == 1:
185+
crops_coords_list = [(0, 0)]
186+
else:
187+
section_width = output_width / num_cols
188+
for i in range(num_cols):
189+
cleft = int(round(i * section_width))
190+
crops_coords_list.append((0, cleft))
191+
192+
result_list = []
193+
for _ in range(num_rows):
194+
result_list.append(list(crops_coords_list))
195+
196+
return result_list
197+
198+
154199
# Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion.rescale_noise_cfg
155200
def rescale_noise_cfg(noise_cfg, noise_pred_text, guidance_rescale=0.0):
156201
r"""
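To make the behaviour of the new `_get_crops_coords_list` helper concrete, the sketch below restates its logic as standalone Python and prints the result for a 1x3 grid. The function name without the leading underscore and the width of 1024 are illustrative choices, not values taken from the commit.

```python
def get_crops_coords_list(num_rows, num_cols, output_width):
    """Standalone restatement of the helper added in this commit (illustrative)."""
    if num_cols <= 0:
        row = []
    elif num_cols == 1:
        row = [(0, 0)]
    else:
        section_width = output_width / num_cols
        # ctop stays 0; cleft steps across the width in equal sections.
        row = [(0, int(round(i * section_width))) for i in range(num_cols)]
    # Every grid row receives its own copy of the same horizontal offsets.
    return [list(row) for _ in range(num_rows)]


coords = get_crops_coords_list(num_rows=1, num_cols=3, output_width=1024)
print(coords)  # [[(0, 0), (0, 341), (0, 683)]]
```

Note that the call site added further down passes `tile_width`, not the full canvas width, as `output_width`, so the offsets are spread across the width of a single tile.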
@@ -757,10 +802,10 @@ def __call__(
757802
return_dict: bool = True,
758803
cross_attention_kwargs: Optional[Dict[str, Any]] = None,
759804
original_size: Optional[Tuple[int, int]] = None,
760-
crops_coords_top_left: Tuple[int, int] = (0, 0),
805+
crops_coords_top_left: Optional[List[List[Tuple[int, int]]]] = None,
761806
target_size: Optional[Tuple[int, int]] = None,
762807
negative_original_size: Optional[Tuple[int, int]] = None,
763-
negative_crops_coords_top_left: Tuple[int, int] = (0, 0),
808+
negative_crops_coords_top_left: Optional[List[List[Tuple[int, int]]]] = None,
764809
negative_target_size: Optional[Tuple[int, int]] = None,
765810
clip_skip: Optional[int] = None,
766811
tile_height: Optional[int] = 1024,
@@ -826,7 +871,7 @@ def __call__(
826871
`original_size` defaults to `(height, width)` if not specified. Part of SDXL's micro-conditioning as
827872
explained in section 2.2 of
828873
[https://huggingface.co/papers/2307.01952](https://huggingface.co/papers/2307.01952).
829-
crops_coords_top_left (`Tuple[int]`, *optional*, defaults to (0, 0)):
874+
crops_coords_top_left (`List[List[Tuple[int, int]]]`, *optional*, defaults to `None`):
830875
`crops_coords_top_left` can be used to generate an image that appears to be "cropped" from the position
831876
`crops_coords_top_left` downwards. Favorable, well-centered images are usually achieved by setting
832877
`crops_coords_top_left` to (0, 0). Part of SDXL's micro-conditioning as explained in section 2.2 of
@@ -840,7 +885,7 @@ def __call__(
840885
micro-conditioning as explained in section 2.2 of
841886
[https://huggingface.co/papers/2307.01952](https://huggingface.co/papers/2307.01952). For more
842887
information, refer to this issue thread: https://github.com/huggingface/diffusers/issues/4208.
843-
negative_crops_coords_top_left (`Tuple[int]`, *optional*, defaults to (0, 0)):
888+
negative_crops_coords_top_left (`List[List[Tuple[int, int]]]`, *optional*, defaults to `None`):
844889
To negatively condition the generation process based on a specific crop coordinates. Part of SDXL's
845890
micro-conditioning as explained in section 2.2 of
846891
[https://huggingface.co/papers/2307.01952](https://huggingface.co/papers/2307.01952). For more
@@ -883,6 +928,8 @@ def __call__(
883928

884929
original_size = original_size or (height, width)
885930
target_size = target_size or (height, width)
931+
negative_original_size = negative_original_size or (height, width)
932+
negative_target_size = negative_target_size or (height, width)
886933

887934
self._guidance_scale = guidance_scale
888935
self._clip_skip = clip_skip
@@ -891,7 +938,6 @@ def __call__(
891938

892939
grid_rows = len(prompt)
893940
grid_cols = len(prompt[0])
894-
895941
tiles_mode = [mode.value for mode in self.SeedTilesMode]
896942

897943
if isinstance(seed_tiles_mode, str):
@@ -914,6 +960,11 @@ def __call__(
914960

915961
device = self._execution_device
916962

963+
# update crops coords list
964+
crops_coords_top_left = _get_crops_coords_list(grid_rows, grid_cols, tile_width)
965+
if negative_original_size is not None and negative_target_size is not None:
966+
negative_crops_coords_top_left = _get_crops_coords_list(grid_rows, grid_cols, tile_width)
967+
917968
# update height and width tile size and tile overlap size
918969
height = tile_height + (grid_rows - 1) * (tile_height - tile_row_overlap)
919970
width = tile_width + (grid_cols - 1) * (tile_width - tile_col_overlap)
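The two context lines above derive the full canvas size from the tile grid: each tile after the first adds only its non-overlapping part. A minimal worked example follows. The helper name `canvas_size` and the tile values are hypothetical and chosen only for illustration.

```python
def canvas_size(grid_rows, grid_cols, tile_height, tile_width,
                tile_row_overlap, tile_col_overlap):
    """Apply the same two formulas used in the pipeline's __call__."""
    # The first tile contributes its full size; each later one adds (size - overlap).
    height = tile_height + (grid_rows - 1) * (tile_height - tile_row_overlap)
    width = tile_width + (grid_cols - 1) * (tile_width - tile_col_overlap)
    return height, width


# One row of three 1024x1024 tiles with a 256-pixel column overlap (illustrative values).
print(canvas_size(1, 3, 1024, 1024, 0, 256))  # (1024, 2560)
```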
@@ -1020,15 +1071,15 @@ def __call__(
10201071
text_encoder_projection_dim = self.text_encoder_2.config.projection_dim
10211072
add_time_ids = self._get_add_time_ids(
10221073
original_size,
1023-
crops_coords_top_left,
1074+
crops_coords_top_left[row][col],
10241075
target_size,
10251076
dtype=prompt_embeds.dtype,
10261077
text_encoder_projection_dim=text_encoder_projection_dim,
10271078
)
10281079
if negative_original_size is not None and negative_target_size is not None:
10291080
negative_add_time_ids = self._get_add_time_ids(
10301081
negative_original_size,
1031-
negative_crops_coords_top_left,
1082+
negative_crops_coords_top_left[row][col],
10321083
negative_target_size,
10331084
dtype=prompt_embeds.dtype,
10341085
text_encoder_projection_dim=text_encoder_projection_dim,
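The hunk above switches from one shared crop coordinate to a per-tile lookup, `crops_coords_top_left[row][col]`. The sketch below shows the effect on the micro-conditioning values, assuming the usual SDXL layout in which `original_size`, the crop offset, and `target_size` are concatenated in that order. `build_time_ids` is a hypothetical stand-in for `_get_add_time_ids`, which additionally takes `dtype` and `text_encoder_projection_dim`. The sizes are illustrative.

```python
def build_time_ids(original_size, crop_top_left, target_size):
    """Hypothetical stand-in: concatenate the three SDXL conditioning tuples."""
    # Assumed layout: (orig_h, orig_w) + (ctop, cleft) + (target_h, target_w).
    return list(original_size + crop_top_left + target_size)


# Illustrative 1x3 grid of crop offsets and an illustrative canvas size.
crops = [[(0, 0), (0, 341), (0, 683)]]
size = (1024, 2560)

per_tile_ids = [
    [build_time_ids(size, crops[row][col], size) for col in range(len(crops[row]))]
    for row in range(len(crops))
]
print(per_tile_ids[0][1])  # [1024, 2560, 0, 341, 1024, 2560]
```

Each tile therefore carries a different `cleft` in its conditioning, while sharing the same original and target sizes.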

examples/research_projects/instructpix2pix_lora/train_instruct_pix2pix_lora.py

Lines changed: 2 additions & 2 deletions
@@ -15,8 +15,8 @@
1515
# limitations under the License.
1616

1717
"""
18-
Script to fine-tune Stable Diffusion for LORA InstructPix2Pix.
19-
Base code referred from: https://github.com/huggingface/diffusers/blob/main/examples/instruct_pix2pix/train_instruct_pix2pix.py
18+
Script to fine-tune Stable Diffusion for LORA InstructPix2Pix.
19+
Base code referred from: https://github.com/huggingface/diffusers/blob/main/examples/instruct_pix2pix/train_instruct_pix2pix.py
2020
"""
2121

2222
import argparse
