Conversation

@haofanwang (Contributor) commented Aug 22, 2025

What does this PR do?

Add ControlNet-Union (InstantX/Qwen-Image-ControlNet-Union) support for Qwen-Image.

Inference

import torch
from diffusers.utils import load_image
from diffusers import QwenImageControlNetPipeline, QwenImageControlNetModel

base_model = "Qwen/Qwen-Image"
controlnet_model = "InstantX/Qwen-Image-ControlNet-Union"

controlnet = QwenImageControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)

pipe = QwenImageControlNetPipeline.from_pretrained(
    base_model, controlnet=controlnet, torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# canny
control_image = load_image("https://huggingface.co/InstantX/Qwen-Image-ControlNet-Union/resolve/main/conds/canny.png")
prompt = "Aesthetics art, traditional asian pagoda, elaborate golden accents, sky blue and white color palette, swirling cloud pattern, digital illustration, east asian architecture, ornamental rooftop, intricate detailing on building, cultural representation."
controlnet_conditioning_scale = 1.0

image = pipe(
    prompt=prompt,
    negative_prompt=" ",
    control_image=control_image,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=control_image.size[0],
    height=control_image.size[1],
    num_inference_steps=30,
    true_cfg_scale=4.0,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
image.save("qwenimage_cn_union_result.png")

Multi-Conditions

import torch
from diffusers.utils import load_image
from diffusers import QwenImageControlNetPipeline, QwenImageControlNetModel, QwenImageMultiControlNetModel

base_model = "Qwen/Qwen-Image"
controlnet_model = "InstantX/Qwen-Image-ControlNet-Union"

controlnet = QwenImageControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)
controlnet = QwenImageMultiControlNetModel([controlnet])

pipe = QwenImageControlNetPipeline.from_pretrained(
    base_model, controlnet=controlnet, torch_dtype=torch.bfloat16
)
pipe.to("cuda")

# canny
control_image = load_image("https://huggingface.co/InstantX/Qwen-Image-ControlNet-Union/resolve/main/conds/canny.png")
prompt = "Aesthetics art, traditional asian pagoda, elaborate golden accents, sky blue and white color palette, swirling cloud pattern, digital illustration, east asian architecture, ornamental rooftop, intricate detailing on building, cultural representation."
controlnet_conditioning_scale = 1.0

# Note: the results will not be identical to the single-condition run, because the generator is consumed in a different order.
image = pipe(
    prompt=prompt,
    negative_prompt=" ",
    control_image=[control_image, control_image],
    controlnet_conditioning_scale=[controlnet_conditioning_scale/2, controlnet_conditioning_scale/2],
    width=control_image.size[0],
    height=control_image.size[1],
    num_inference_steps=30,
    true_cfg_scale=4.0,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
image.save("qwenimage_cn_union_multi_result.png")
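
The note in the snippet above — that multi-condition results diverge from the single-condition run because the generator is consumed in a different order — can be illustrated with a toy sketch. This uses Python's `random` module as a stand-in for `torch.Generator`; the principle, not the exact RNG, is the point:

```python
import random

# Seeded RNG, single-condition path: the relevant "draw" is the first sample.
rng = random.Random(42)
single_path_draw = rng.random()

# Same seed, multi-condition path: an extra draw happens earlier in the call
# sequence, so the corresponding sample is no longer the same number.
rng = random.Random(42)
_extra_draw = rng.random()
multi_path_draw = rng.random()

print(single_path_draw != multi_path_draw)  # True
```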

Sanity Check


# 3. Prepare control image
num_channels_latents = self.transformer.config.in_channels // 4
if isinstance(self.controlnet, QwenImageControlNetModel):
A Collaborator commented:
ohh, do we not support multi-controlnets use case?
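
The `in_channels // 4` in the quoted snippet is the usual latent-packing arithmetic in this family of pipelines: each 2x2 spatial patch of the VAE latent is flattened into one token, multiplying the channel count by 4. The concrete numbers below are illustrative assumptions, not values read from the Qwen-Image config:

```python
# Illustrative sketch of the channel arithmetic (assumed values): packing each
# 2x2 latent patch quadruples the channel count, so dividing the transformer's
# in_channels by 4 recovers the raw latent channel count.
latent_channels = 16                                # hypothetical VAE latent channels
transformer_in_channels = latent_channels * 2 * 2   # 64 after 2x2 packing
num_channels_latents = transformer_in_channels // 4
print(num_channels_latents)  # 16
```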

Comment on lines +328 to +332
if len(self.nets) == 1:
    controlnet = self.nets[0]

for i, (image, scale) in enumerate(zip(controlnet_cond, conditioning_scale)):
    block_samples = controlnet(
@yiyixuxu (Collaborator) commented Aug 22, 2025:

we can probably do something like this, and that way we don't need to do anything special for a single union controlnet

Suggested change (before):

if len(self.nets) == 1:
    controlnet = self.nets[0]

for i, (image, scale) in enumerate(zip(controlnet_cond, conditioning_scale)):
    block_samples = controlnet(

Suggested change (after):

if len(self.nets) == 1:
    controlnets = [self.nets] * len(controlnet_cond)

for i, (image, scale, controlnet) in enumerate(zip(controlnet_cond, conditioning_scale, controlnets)):
    block_samples = controlnet(
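
The fan-out idea in the suggestion can be sketched in plain Python without diffusers. Everything here is a toy stand-in (`run`, `double`, and the numeric "conditions" are hypothetical names, not library API):

```python
# Toy sketch of the reviewer's idea: when there is a single net, replicate it
# so every condition is paired with a net, and the loop body stays identical
# for the single-net and multi-net cases.
def run(nets, conds, scales):
    if len(nets) == 1:
        nets = nets * len(conds)  # fan the single net out across conditions
    return [net(cond) * scale for net, cond, scale in zip(nets, conds, scales)]

double = lambda x: 2 * x
print(run([double], [1, 3], [0.5, 0.5]))  # [1.0, 3.0]
```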

@haofanwang (Contributor, author):
@yiyixuxu Added support for multiple conditions. ruff check has passed.

@haofanwang haofanwang requested a review from yiyixuxu August 22, 2025 11:57
@vuongminh1907 (Contributor):

Excellent work @haofanwang! Did you try ControlNet with Flux Kontext too?

@yiyixuxu (Collaborator):
@bot /style

@github-actions (bot) commented Aug 22, 2025:

Style bot fixed some files and pushed the changes.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@yiyixuxu (Collaborator) left a review:

thanks!

@yiyixuxu yiyixuxu merged commit 561ab54 into huggingface:main Aug 22, 2025
14 checks passed
@vladmandic (Contributor):

@haofanwang (Contributor, author):
Will be released soon.

@DN6 DN6 added the roadmap Add to current release roadmap label Aug 28, 2025
6 participants