@elismasilva
Contributor

What does this PR do?

This PR adds a _get_crops_coords_list function to the community Mixture-of-Diffusers Tiling Pipeline for SDXL. The function automatically computes
the (ctop, cleft) coordinates so that generation is better focused, which helps harmonize the image and corrects the problem of flattened elements.
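
As a rough illustration of what a list of (ctop, cleft) pairs for a tile grid can look like, here is a minimal sketch. It is not the PR's implementation: the function name `sketch_crops_coords`, its parameters, and the choice to use each tile's pixel origin as its crop coordinate are all assumptions made for this example.

```python
from typing import List, Tuple


def sketch_crops_coords(
    num_rows: int,
    num_cols: int,
    tile_height: int,
    tile_width: int,
    tile_row_overlap: int,
    tile_col_overlap: int,
) -> List[List[Tuple[int, int]]]:
    """Hypothetical helper: one (ctop, cleft) pair per tile, row by row.

    Assumption: each tile's crop coordinate is its top-left pixel offset
    in the full canvas. The real _get_crops_coords_list may differ.
    """
    # Distance between the origins of two neighbouring tiles.
    row_stride = tile_height - tile_row_overlap
    col_stride = tile_width - tile_col_overlap
    return [
        [(row * row_stride, col * col_stride) for col in range(num_cols)]
        for row in range(num_rows)
    ]


# One row of three tiles, using the tile sizes from the reproduction script.
coords = sketch_crops_coords(1, 3, 1024, 1280, 0, 256)
print(coords)  # [[(0, 0), (0, 1024), (0, 2048)]]
```

In the standard SDXL pipelines a pair like this corresponds to the crops_coords_top_left micro-conditioning input; how the tiling pipeline applies it to each tile is not shown here.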

Related to PR #10759.

For local reproduction:

import torch
from diffusers import DPMSolverMultistepScheduler, AutoencoderKL
from mixture_tiling_sdxl import StableDiffusionXLTilingPipeline

device="cuda"

vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
).to(device)

model_id="stablediffusionapi/yamermix-v8-vae"
scheduler = DPMSolverMultistepScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", num_train_timesteps=1000)
pipe = StableDiffusionXLTilingPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    vae=vae,
    scheduler=scheduler,
    use_safetensors=False    
).to(device)

pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()
pipe.enable_vae_slicing()

generator = torch.Generator(device).manual_seed(297984183)

# Mixture of Diffusers generation
image = pipe(
    prompt=[[
        "A charming house in the countryside, by jakub rozalski, sunset lighting, elegant, highly detailed, smooth, sharp focus, artstation, stunning masterpiece",
        "A dirt road in the countryside crossing pastures, by jakub rozalski, sunset lighting, elegant, highly detailed, smooth, sharp focus, artstation, stunning masterpiece",        
        "An old and rusty giant robot lying on a dirt road, by jakub rozalski, dark sunset lighting, elegant, highly detailed, smooth, sharp focus, artstation, stunning masterpiece"
    ]],
    tile_height=1024,
    tile_width=1280,
    tile_row_overlap=0,
    tile_col_overlap=256,
    guidance_scale_tiles=[[7, 7, 7]],  # or guidance_scale=7 if it is the same for all prompts
    height=1024,
    width=3840,
    generator=generator,
    num_inference_steps=30,
)["images"][0]

image.save("mixture_sdxl.png")

Once published:

import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler, AutoencoderKL

device="cuda"

vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
).to(device)

model_id="stablediffusionapi/yamermix-v8-vae"
scheduler = DPMSolverMultistepScheduler(beta_start=0.00085, beta_end=0.012, beta_schedule="scaled_linear", num_train_timesteps=1000)
pipe = DiffusionPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    vae=vae,
    custom_pipeline="mixture_tiling_sdxl",
    scheduler=scheduler,
    use_safetensors=False    
).to(device)

pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()
pipe.enable_vae_slicing()

generator = torch.Generator(device).manual_seed(297984183)

# Mixture of Diffusers generation
image = pipe(
    prompt=[[
        "A charming house in the countryside, by jakub rozalski, sunset lighting, elegant, highly detailed, smooth, sharp focus, artstation, stunning masterpiece",
        "A dirt road in the countryside crossing pastures, by jakub rozalski, sunset lighting, elegant, highly detailed, smooth, sharp focus, artstation, stunning masterpiece",        
        "An old and rusty giant robot lying on a dirt road, by jakub rozalski, dark sunset lighting, elegant, highly detailed, smooth, sharp focus, artstation, stunning masterpiece"
    ]],
    tile_height=1024,
    tile_width=1280,
    tile_row_overlap=0,
    tile_col_overlap=256,
    guidance_scale_tiles=[[7, 7, 7]],  # or guidance_scale=7 if it is the same for all prompts
    height=1024,
    width=3840,    
    generator=generator,
    num_inference_steps=30,
)["images"][0]

image.save("mixture_sdxl.png")

Final result

[Image: mixture_of_diffusers_sdxl_1]

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@asomoza @sayakpaul

@asomoza
Member

asomoza commented Feb 12, 2025

Thanks. There are some changes to the instruct_pix2pix pipeline; even if they're correct, let's keep the PR to the relevant pipeline only.

@elismasilva
Contributor Author

elismasilva commented Feb 12, 2025

It was not me. I think it was make style and make quality that changed it. Tomorrow I will see if I can undo this.

@elismasilva elismasilva force-pushed the add-mixture-tiling-sdxl branch from b84c9c0 to 4efbcc9 Compare February 12, 2025 13:47
@elismasilva
Contributor Author

Thanks. There are some changes to the instruct_pix2pix pipeline; even if they're correct, let's keep the PR to the relevant pipeline only.

Done!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@asomoza
Member

asomoza commented Feb 12, 2025

Thanks!

@asomoza asomoza merged commit 051ebc3 into huggingface:main Feb 12, 2025
8 of 9 checks passed