diff --git a/docs.json b/docs.json
index c1d08caa..45d8057a 100644
--- a/docs.json
+++ b/docs.json
@@ -123,6 +123,7 @@
           {
             "group": "Image",
             "pages": [
+              "tutorials/image/qwen/qwen-image",
               {
                 "group": "HiDream",
                 "pages": [
@@ -672,6 +673,7 @@
           {
             "group": "Image",
             "pages": [
+              "zh-CN/tutorials/image/qwen/qwen-image",
               {
                 "group": "HiDream",
                 "pages": [
diff --git a/images/tutorial/image/qwen/image_qwen_image-guide.jpg b/images/tutorial/image/qwen/image_qwen_image-guide.jpg
new file mode 100644
index 00000000..05090107
Binary files /dev/null and b/images/tutorial/image/qwen/image_qwen_image-guide.jpg differ
diff --git a/tutorials/image/qwen/qwen-image.mdx b/tutorials/image/qwen/qwen-image.mdx
new file mode 100644
index 00000000..49fddce7
--- /dev/null
+++ b/tutorials/image/qwen/qwen-image.mdx
@@ -0,0 +1,72 @@
+---
+title: "Qwen-Image ComfyUI Native Workflow Example"
+description: "Qwen-Image is a 20B parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license."
+sidebarTitle: "Qwen-Image"
+---
+
+import UpdateReminder from '/snippets/tutorials/update-reminder.mdx'
+
+**Qwen-Image** is the first image generation foundation model released by Alibaba's Qwen team. It's a 20B parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license. The model has made significant advances in **complex text rendering** and **precise image editing**, achieving high-fidelity output for multiple languages including English and Chinese.
+
+**Model Highlights**:
+- **Excellent Multilingual Text Rendering**: Supports high-precision text generation in multiple languages including English, Chinese, Korean, and Japanese, while maintaining font details and layout consistency
+- **Diverse Artistic Styles**: From photorealistic scenes to impressionist paintings, from anime aesthetics to minimalist design, it adapts fluidly to a wide range of creative prompts
+
+**Related Links**:
+ - [GitHub](https://github.com/QwenLM/Qwen-Image)
+ - [Hugging Face](https://huggingface.co/Qwen/Qwen-Image)
+ - [ModelScope](https://modelscope.cn/models/qwen/Qwen-Image)
+
+## Qwen-Image Native Workflow Example
+
+<UpdateReminder/>
+
+The models used in this document can be obtained from [Hugging Face](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main) or [ModelScope](https://modelscope.cn/models/Comfy-Org/Qwen-Image_ComfyUI/files).
+
+## 1. Workflow File
+
+After updating ComfyUI, you can find the workflow file in the templates, or drag the workflow image below into ComfyUI to load it.
+
+![Qwen-image Text-to-Image Workflow](https://raw.githubusercontent.com/Comfy-Org/example_workflows/main/image/qwen/qwen-image.png)
+
+Download JSON Workflow
+
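The model files listed in the next section can also be fetched from the command line instead of through a browser. Below is a minimal, hypothetical sketch (Python standard library only; `MODEL_FILES` and `download_models` are illustrative names, and it assumes network access to Hugging Face plus a local `ComfyUI/` checkout). The URLs and target folders mirror the download section that follows.

```python
import urllib.request
from pathlib import Path

# Target paths (relative to ComfyUI/models/) mapped to their download URLs,
# as listed in the Model Download section of this tutorial.
MODEL_FILES = {
    "diffusion_models/qwen_image_fp8_e4m3fn.safetensors":
        "https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors",
    "text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors":
        "https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors",
    "vae/qwen_image_vae.safetensors":
        "https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors",
}

def download_models(comfyui_root: str = "ComfyUI") -> None:
    """Download each model into ComfyUI/models/<subdir>/, skipping files that already exist."""
    for rel_path, url in MODEL_FILES.items():
        dest = Path(comfyui_root) / "models" / rel_path
        dest.parent.mkdir(parents=True, exist_ok=True)
        if not dest.exists():
            urllib.request.urlretrieve(url, dest)
```

Note that calling `download_models()` fetches roughly 20 GB of weights, so run it only when you actually want the files on disk.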
+## 2. Model Download
+
+You can find all the models on [Hugging Face](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main) or [ModelScope](https://modelscope.cn/models/Comfy-Org/Qwen-Image_ComfyUI/files).
+
+**Diffusion Model**
+
+- [qwen_image_fp8_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors)
+
+**Text Encoder**
+
+- [qwen_2.5_vl_7b_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)
+
+**VAE**
+
+- [qwen_image_vae.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors)
+
+**Model Storage Location**
+
+```
+📂 ComfyUI/
+├── 📂 models/
+│   ├── 📂 diffusion_models/
+│   │   └── qwen_image_fp8_e4m3fn.safetensors
+│   ├── 📂 vae/
+│   │   └── qwen_image_vae.safetensors
+│   └── 📂 text_encoders/
+│       └── qwen_2.5_vl_7b_fp8_scaled.safetensors
+```
+
+## 3. Complete the Workflow Step by Step
+
+![Step Guide](/images/tutorial/image/qwen/image_qwen_image-guide.jpg)
+
+1. Load `qwen_image_fp8_e4m3fn.safetensors` in the `Load Diffusion Model` node
+2. Load `qwen_2.5_vl_7b_fp8_scaled.safetensors` in the `Load CLIP` node
+3. Load `qwen_image_vae.safetensors` in the `Load VAE` node
+4. Set the image dimensions in the `EmptySD3LatentImage` node
+5. Enter your prompts in the `CLIP Text Encoder` node (supports English, Chinese, Korean, Japanese, Italian, etc.)
+6. Click `Queue` or press `Ctrl+Enter` to run the workflow
\ No newline at end of file
diff --git a/zh-CN/tutorials/image/qwen/qwen-image.mdx b/zh-CN/tutorials/image/qwen/qwen-image.mdx
new file mode 100644
index 00000000..5d7a6b30
--- /dev/null
+++ b/zh-CN/tutorials/image/qwen/qwen-image.mdx
@@ -0,0 +1,72 @@
+---
+title: "Qwen-Image ComfyUI Native Workflow Example"
+description: "Qwen-Image is a 20B parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license."
+sidebarTitle: "Qwen-Image"
+---
+
+import UpdateReminder from '/snippets/zh/tutorials/update-reminder.mdx'
+
+**Qwen-Image** is the first image generation foundation model released by Alibaba's Qwen team, a 20B parameter MMDiT (Multimodal Diffusion Transformer) model open-sourced under the Apache 2.0 license. The model has made significant advances in **complex text rendering** and **precise image editing**, achieving high-fidelity output in multiple languages including English and Chinese.
+
+**Model Highlights**:
+- **Excellent Multilingual Text Rendering**: Supports high-precision text generation in multiple languages including English, Chinese, Korean, and Japanese, while maintaining font details and layout consistency
+- **Diverse Artistic Styles**: From photorealistic scenes to impressionist paintings, from anime aesthetics to minimalist design, it adapts fluidly to a wide range of creative prompts
+
+**Related Links**:
+ - [GitHub](https://github.com/QwenLM/Qwen-Image)
+ - [Hugging Face](https://huggingface.co/Qwen/Qwen-Image)
+ - [ModelScope](https://modelscope.cn/models/qwen/Qwen-Image)
+
+## Qwen-Image Native Workflow Example
+
+<UpdateReminder/>
+
+The models used in this document can be obtained from [Hugging Face](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main) or [ModelScope](https://modelscope.cn/models/Comfy-Org/Qwen-Image_ComfyUI/files).
+
+## 1. Workflow File
+
+After updating ComfyUI, you can find the workflow file in the templates, or drag the workflow image below into ComfyUI to load it.
+
+![Qwen-image Text-to-Image Workflow](https://raw.githubusercontent.com/Comfy-Org/example_workflows/main/image/qwen/qwen-image.png)
+
+Download JSON Workflow
+
+## 2. Model Download
+
+You can find all the models on [Hugging Face](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/tree/main) or [ModelScope](https://modelscope.cn/models/Comfy-Org/Qwen-Image_ComfyUI/files).
+
+**Diffusion Model**
+
+- [qwen_image_fp8_e4m3fn.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/diffusion_models/qwen_image_fp8_e4m3fn.safetensors)
+
+**Text Encoder**
+
+- [qwen_2.5_vl_7b_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)
+
+**VAE**
+
+- [qwen_image_vae.safetensors](https://huggingface.co/Comfy-Org/Qwen-Image_ComfyUI/resolve/main/split_files/vae/qwen_image_vae.safetensors)
+
+**Model Storage Location**
+
+```
+📂 ComfyUI/
+├── 📂 models/
+│   ├── 📂 diffusion_models/
+│   │   └── qwen_image_fp8_e4m3fn.safetensors
+│   ├── 📂 vae/
+│   │   └── qwen_image_vae.safetensors
+│   └── 📂 text_encoders/
+│       └── qwen_2.5_vl_7b_fp8_scaled.safetensors
+```
+
+## 3. Complete the Workflow Step by Step
+
+![Step Guide](/images/tutorial/image/qwen/image_qwen_image-guide.jpg)
+
+1. Make sure the `Load Diffusion Model` node has loaded `qwen_image_fp8_e4m3fn.safetensors`
+2. Make sure the `Load CLIP` node has loaded `qwen_2.5_vl_7b_fp8_scaled.safetensors`
+3. Make sure the `Load VAE` node has loaded `qwen_image_vae.safetensors`
+4. Make sure the image dimensions are set in the `EmptySD3LatentImage` node
+5. Set your prompts in the `CLIP Text Encoder` node; testing so far shows it supports at least English, Chinese, Korean, Japanese, and Italian
+6. Click the `Queue` button, or use the shortcut `Ctrl(cmd) + Enter` to run the workflow
\ No newline at end of file
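Once the files are downloaded, the storage layout shown above can be sanity-checked before launching ComfyUI. A minimal sketch (the `missing_models` helper and `EXPECTED` table are illustrative, not part of ComfyUI itself):

```python
from pathlib import Path

# Expected model files per the storage layout in this tutorial,
# keyed by subdirectory under ComfyUI/models/.
EXPECTED = {
    "diffusion_models": "qwen_image_fp8_e4m3fn.safetensors",
    "text_encoders": "qwen_2.5_vl_7b_fp8_scaled.safetensors",
    "vae": "qwen_image_vae.safetensors",
}

def missing_models(comfyui_root: str = "ComfyUI") -> list[str]:
    """Return relative paths of expected model files not found under <root>/models."""
    models_dir = Path(comfyui_root) / "models"
    return [
        f"{subdir}/{filename}"
        for subdir, filename in EXPECTED.items()
        if not (models_dir / subdir / filename).exists()
    ]
```

Calling `missing_models()` from the directory containing your ComfyUI checkout returns an empty list when everything is in place; any entries it returns are the files still to download.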