---
title: "Qwen-Image ComfyUI Native Workflow Example"
description: "Qwen-Image is a 20B parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license."
sidebarTitle: "Qwen-Image"
---
import UpdateReminder from '/snippets/tutorials/update-reminder.mdx'
Qwen-Image is the first image generation foundation model released by Alibaba's Qwen team. It's a 20B parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license. The model has made significant advances in complex text rendering and precise image editing, achieving high-fidelity output for multiple languages including English and Chinese.
Model Highlights:
- Excellent Multilingual Text Rendering: Supports high-precision text generation in multiple languages including English, Chinese, Korean, and Japanese, while maintaining font details and layout consistency
- Diverse Artistic Styles: From photorealistic scenes to impressionist paintings, from anime aesthetics to minimalist design, fluidly adapting to various creative prompts
Related Links:
The models used in this document can be obtained from Hugging Face or ModelScope.
After updating ComfyUI, you can find the workflow file in the templates, or drag the workflow below into ComfyUI to load it.

<a className="prose" target='_blank' href="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/image_qwen_image.json" style={{ display: 'inline-block', backgroundColor: '#0078D6', color: '#ffffff', padding: '10px 20px', borderRadius: '8px', borderColor: "transparent", textDecoration: 'none', fontWeight: 'bold'}}> <p className="prose" style={{ margin: 0, fontSize: "0.8rem" }}>Download JSON Workflow</p> </a>
You can find all the models on Hugging Face or ModelScope:

- Diffusion Model
- Text Encoder
- VAE
Model Storage Location

```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── qwen_image_fp8_e4m3fn.safetensors
│   ├── 📂 vae/
│   │   └── qwen_image_vae.safetensors
│   └── 📂 text_encoders/
│       └── qwen_2.5_vl_7b_fp8_scaled.safetensors
```
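Once the files are downloaded, a quick way to confirm they landed in the right folders is a small check script. This is an illustrative sketch, not part of ComfyUI; the `missing_models` helper name is made up here, and the `"ComfyUI"` path you pass in is an assumption about where your install lives.

```python
import os

# Expected model files, keyed by their subfolder under ComfyUI/models/
# (matches the storage layout shown above).
EXPECTED_FILES = {
    "diffusion_models": "qwen_image_fp8_e4m3fn.safetensors",
    "vae": "qwen_image_vae.safetensors",
    "text_encoders": "qwen_2.5_vl_7b_fp8_scaled.safetensors",
}

def missing_models(comfyui_dir: str) -> list:
    """Return the expected model paths that do not exist yet."""
    missing = []
    for subdir, filename in EXPECTED_FILES.items():
        path = os.path.join(comfyui_dir, "models", subdir, filename)
        if not os.path.isfile(path):
            missing.append(path)
    return missing

# Usage: print anything still missing from your install directory.
#   for path in missing_models("ComfyUI"):
#       print("missing:", path)
```

If the list comes back empty, all three files are in place and the workflow should load without "model not found" errors.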
1. Load `qwen_image_fp8_e4m3fn.safetensors` in the `Load Diffusion Model` node
2. Load `qwen_2.5_vl_7b_fp8_scaled.safetensors` in the `Load CLIP` node
3. Load `qwen_image_vae.safetensors` in the `Load VAE` node
4. Set the image dimensions in the `EmptySD3LatentImage` node
5. Enter your prompts in the `CLIP Text Encoder` node (supports English, Chinese, Korean, Japanese, Italian, etc.)
6. Click Queue or press `Ctrl+Enter` to run
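Besides the Queue button, a running ComfyUI server also accepts workflows over HTTP, so the same run can be triggered from a script. The sketch below assumes the default server address (`127.0.0.1:8188`) and an API-format export of the workflow saved as `image_qwen_image.json`; the helper name `build_request` is invented for this example.

```python
import json
import urllib.request

SERVER = "http://127.0.0.1:8188"  # assumed default ComfyUI address

def build_request(workflow: dict, client_id: str = "docs-example") -> urllib.request.Request:
    """Wrap a workflow graph in the JSON payload ComfyUI's /prompt endpoint expects."""
    payload = {"prompt": workflow, "client_id": client_id}
    return urllib.request.Request(
        f"{SERVER}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# Usage (requires a running server and an API-format workflow export):
#   with open("image_qwen_image.json") as f:
#       req = build_request(json.load(f))
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp))  # response includes the queued prompt_id
```

Note that the API expects the workflow in API format (exported via "Save (API Format)" in ComfyUI), not the regular UI-format JSON.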
