---
title: "Qwen-Image ComfyUI Native Workflow Example"
description: "Qwen-Image is a 20B parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license."
sidebarTitle: "Qwen-Image"
---

import UpdateReminder from '/snippets/tutorials/update-reminder.mdx'

**Qwen-Image** is the first image generation foundation model released by Alibaba's Qwen team. It's a 20B parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license. The model has made significant advances in **complex text rendering** and **precise image editing**, achieving high-fidelity output in multiple languages including English and Chinese.

**Model Highlights**:
- **Excellent Multilingual Text Rendering**: Supports high-precision text generation in multiple languages including English, Chinese, Korean, and Japanese, while preserving font details and layout consistency
- **Diverse Artistic Styles**: From photorealistic scenes to impressionist paintings, and from anime aesthetics to minimalist design, the model adapts fluidly to a wide range of creative prompts

**Related Links**:
- [GitHub](https://github.com/QwenLM/Qwen-Image)
- [Hugging Face](https://huggingface.co/Qwen/Qwen-Image)
- [ModelScope](https://modelscope.cn/models/qwen/Qwen-Image)

## Qwen-Image Native Workflow Example

<UpdateReminder />

The models used in this document can be obtained from [Hugging Face](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main) or [ModelScope](https://modelscope.cn/models/Comfy-Org/Qwen-Image_ComfyUI/files).

### 1. Workflow File

After updating ComfyUI, you can find the workflow file in the templates, or drag the workflow below into ComfyUI to load it.

<a className="prose" target='_blank' href="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/image_qwen_image.json" style={{ display: 'inline-block', backgroundColor: '#0078D6', color: '#ffffff', padding: '10px 20px', borderRadius: '8px', borderColor: "transparent", textDecoration: 'none', fontWeight: 'bold'}}>
    <p className="prose" style={{ margin: 0, fontSize: "0.8rem" }}>Download JSON Workflow</p>
</a>
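
If you prefer to fetch the template from a script instead of the button above, the sketch below is an illustrative alternative (not part of the official workflow). It assumes the `requests` package is installed, and the local filename is arbitrary.

```python
# Minimal sketch: download the Qwen-Image template JSON and save it locally,
# then drag the saved file onto the ComfyUI canvas (or open it from the workflow menu).
import requests

URL = (
    "https://raw.githubusercontent.com/Comfy-Org/workflow_templates/"
    "refs/heads/main/templates/image_qwen_image.json"
)

resp = requests.get(URL, timeout=30)
resp.raise_for_status()  # stop early on HTTP errors

with open("image_qwen_image.json", "wb") as f:  # arbitrary local filename
    f.write(resp.content)
print("Saved image_qwen_image.json")
```
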

### 2. Model Download

You can find all the models on [Hugging Face](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main) or [ModelScope](https://modelscope.cn/models/Comfy-Org/Qwen-Image_ComfyUI/files).

**Diffusion Model**

- [qwen_image_fp8_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors)

**Text Encoder**

- [qwen_2.5_vl_7b_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)

**VAE**

- [qwen_image_vae.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors)

**Model Storage Location**

```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 diffusion_models/
│   │   └── qwen_image_fp8_e4m3fn.safetensors
│   ├── 📂 vae/
│   │   └── qwen_image_vae.safetensors
│   └── 📂 text_encoders/
│       └── qwen_2.5_vl_7b_fp8_scaled.safetensors
```
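
As an alternative to downloading the files by hand, the sketch below places them into the folders shown above with the `huggingface_hub` package. It is a minimal sketch, assuming `pip install huggingface_hub` and a ComfyUI installation at `~/ComfyUI` (adjust `COMFYUI_DIR` for your setup); the repository id and file paths come from the links above.

```python
# Minimal sketch: fetch the three Qwen-Image files from the Comfy-Org repo
# and copy them into the ComfyUI model folders shown in the tree above.
import shutil
from pathlib import Path

from huggingface_hub import hf_hub_download

COMFYUI_DIR = Path.home() / "ComfyUI"  # assumption: adjust to your install path
REPO_ID = "Comfy-Org/Qwen-Image_ComfyUI"

# repo path -> target subfolder under ComfyUI/models/
FILES = {
    "split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors": "diffusion_models",
    "split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors": "text_encoders",
    "split_files/vae/qwen_image_vae.safetensors": "vae",
}

for repo_path, subdir in FILES.items():
    cached = hf_hub_download(repo_id=REPO_ID, filename=repo_path)  # downloads into the HF cache
    target_dir = COMFYUI_DIR / "models" / subdir
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(repo_path).name
    shutil.copy2(cached, target)  # copy out of the cache into ComfyUI's folder
    print("Placed", target)
```
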

### 3. Complete the Workflow Step by Step

1. Load `qwen_image_fp8_e4m3fn.safetensors` in the `Load Diffusion Model` node
2. Load `qwen_2.5_vl_7b_fp8_scaled.safetensors` in the `Load CLIP` node
3. Load `qwen_image_vae.safetensors` in the `Load VAE` node
4. Set the image dimensions in the `EmptySD3LatentImage` node
5. Enter your prompt in the `CLIP Text Encoder` node (supports English, Chinese, Korean, Japanese, Italian, etc.)
6. Click Queue or press `Ctrl+Enter` to run; a scripted alternative using ComfyUI's HTTP API is sketched after this list
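
If you want to queue runs from a script instead of the UI, the sketch below posts a workflow to ComfyUI's `/prompt` HTTP endpoint. It is a minimal sketch, assuming a local server at `127.0.0.1:8188` and a workflow saved in API format (use ComfyUI's API export option; the UI-format template JSON linked above cannot be posted directly). The filename `qwen_image_api.json` and the node id in the comment are hypothetical.

```python
# Minimal sketch: queue an API-format workflow on a locally running ComfyUI server.
import json
import urllib.request

SERVER = "http://127.0.0.1:8188"  # assumption: default ComfyUI address and port

# Load a workflow exported in API format (hypothetical filename).
with open("qwen_image_api.json", "r", encoding="utf-8") as f:
    workflow = json.load(f)

# Optionally edit inputs before queuing, e.g. swap the positive prompt text.
# The node id "6" is purely illustrative; check the ids in your own export.
# workflow["6"]["inputs"]["text"] = "A neon sign that reads 'Qwen-Image'"

payload = json.dumps({"prompt": workflow}).encode("utf-8")
req = urllib.request.Request(
    f"{SERVER}/prompt",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))  # response includes the queued prompt_id
```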