Conversation

@sayakpaul (Member) commented Oct 1, 2025

What does this PR do?

Test code:
import torch
from diffusers import ModularPipeline
from diffusers.utils import load_image

# repo_id = "Qwen/Qwen-Image-Edit"
repo_id = "Qwen/Qwen-Image-Edit-2509"

pipeline = ModularPipeline.from_pretrained(repo_id)
pipeline.load_components(torch_dtype=torch.bfloat16)
pipeline.to("cuda")

guider_spec = pipeline.get_component_spec("guider")
guider = guider_spec.create(guidance_scale=4.5)
pipeline.update_components(guider=guider)

image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/yarn-art-pikachu.png"
).convert("RGB")
prompt = (
    "Make Pikachu hold a sign that says 'Qwen is awesome', yarn art style, detailed, vibrant colors"
)
image = pipeline(
    image=image, 
    prompt=prompt, 
    negative_prompt=" ",
    num_inference_steps=40,
    generator=torch.manual_seed(0),
).images[0]
image.save("qwenimage_edit_plus_modular.png")

Result:

[Result image: edited Pikachu holding a "Qwen is awesome" sign, yarn art style]

Suggested change:
 )
-except EnvironmentError:
+except EnvironmentError as e:
+    logger.debug(f"EnvironmentError: {e}")
@sayakpaul (Member Author):

Hopefully this is okay?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Comment on lines 664 to 672
block_state.img_shapes = [
    [
        (1, height // components.vae_scale_factor // 2, width // components.vae_scale_factor // 2),
        *[
            (1, vae_height // components.vae_scale_factor // 2, vae_width // components.vae_scale_factor // 2)
            for vae_width, vae_height in vae_image_sizes
        ],
    ]
] * block_state.batch_size
@sayakpaul (Member Author):
This is where this step differs from the existing RoPE input step.
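As a standalone sketch, the shape computation above can be reproduced with plain Python. The concrete numbers here (`vae_scale_factor`, the image sizes, and the batch size) are illustrative assumptions, not values taken from the pipeline:

```python
# Standalone sketch of the img_shapes computation above.
# vae_scale_factor, sizes, and batch_size are illustrative assumptions.
vae_scale_factor = 8
height, width = 1024, 1024        # target output resolution
vae_image_sizes = [(1024, 768)]   # (width, height) per conditioning image
batch_size = 2

img_shapes = [
    [
        # latent-grid shape of the generated image
        (1, height // vae_scale_factor // 2, width // vae_scale_factor // 2),
        # one latent-grid shape per conditioning image
        *[
            (1, vae_height // vae_scale_factor // 2, vae_width // vae_scale_factor // 2)
            for vae_width, vae_height in vae_image_sizes
        ],
    ]
] * batch_size

print(img_shapes)
```

With these numbers, each of the two batch entries is `[(1, 64, 64), (1, 48, 64)]`: the output latent grid followed by the conditioning image's grid.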

def inputs(self) -> List[InputParam]:
    inputs_list = super().inputs
    return inputs_list + [
        InputParam(name=self._image_size_output_name, required=True),
    ]
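The override pattern above can be sketched in a self-contained way with a stand-in `InputParam` dataclass instead of the real diffusers class; the base step's inputs here are hypothetical:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class InputParam:
    # Stand-in for diffusers' InputParam, for illustration only.
    name: str
    required: bool = False

class RoPEInputsStep:
    @property
    def inputs(self) -> List[InputParam]:
        # Hypothetical base inputs.
        return [InputParam("height"), InputParam("width")]

class EditPlusRoPEInputsStep(RoPEInputsStep):
    _image_size_output_name = "image_sizes"

    @property
    def inputs(self) -> List[InputParam]:
        # Extend the parent's list with the extra required parameter.
        return super().inputs + [
            InputParam(name=self._image_size_output_name, required=True),
        ]

print([(p.name, p.required) for p in EditPlusRoPEInputsStep().inputs])
# [('height', False), ('width', False), ('image_sizes', True)]
```

The subclass only appends to `super().inputs`, so any change to the parent step's inputs is picked up automatically.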
@asomoza (Member) commented Oct 5, 2025:
Thanks! This part is just for this model. When running the denoise step alone and passing the image latents, it raised an error because the image_sizes input was missing:

ValueError: Required input 'image_sizes' is missing
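The error can be reproduced with a generic required-input check. This is a hedged sketch of the kind of validation a modular pipeline performs, not the actual diffusers implementation:

```python
def validate_inputs(provided: dict, required: list) -> None:
    # Raise if any declared-required input was not supplied.
    for name in required:
        if name not in provided:
            raise ValueError(f"Required input '{name}' is missing")

try:
    # image_sizes is declared required but never passed.
    validate_inputs({"image_latents": object()}, ["image_latents", "image_sizes"])
except ValueError as e:
    print(e)  # Required input 'image_sizes' is missing
```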

@sayakpaul (Member Author):
Can you test again?

@asomoza (Member):
Fixed, it works!

@sayakpaul (Member Author):
@asomoza if your tests are looking good, maybe we could merge it? If so, could you review and approve as you go?

@asomoza (Member) commented Oct 5, 2025:

Sure. I wanted to test and review more scenarios, but that would take more time and they will probably work. Everything else looks good to me; if I find something afterwards, we can fix it in a follow-up PR.

@sayakpaul sayakpaul merged commit c3675d4 into main Oct 5, 2025
16 of 17 checks passed
@sayakpaul (Member Author):
Sure! Please hit me up if any issue comes up.

@sayakpaul sayakpaul deleted the qwen-edit-plus-modular branch October 5, 2025 16:27