12 changes: 12 additions & 0 deletions docs.json
@@ -150,6 +150,12 @@
"tutorials/image/z-image/z-image-turbo"
]
},
{
"group": "Ovis",
"pages": [
"tutorials/image/ovis/ovis-image"
]
},
"tutorials/image/cosmos/cosmos-predict2-t2i",
"tutorials/image/omnigen/omnigen2"
]
@@ -773,6 +779,12 @@
"zh-CN/tutorials/image/z-image/z-image-turbo"
]
},
{
"group": "Ovis",
"pages": [
"zh-CN/tutorials/image/ovis/ovis-image"
]
},
"zh-CN/tutorials/image/cosmos/cosmos-predict2-t2i",
"zh-CN/tutorials/image/omnigen/omnigen2"
]
54 changes: 54 additions & 0 deletions tutorials/image/ovis/ovis-image.mdx
@@ -0,0 +1,54 @@
---
title: "Ovis-Image ComfyUI Workflow Example"
description: "Ovis-Image is a 7B text-to-image model specifically optimized for high-quality text rendering, designed to operate efficiently under stringent computational constraints."
sidebarTitle: "Ovis-Image"
---

import UpdateReminder from '/snippets/tutorials/update-reminder.mdx'

**Ovis-Image** is a 7B text-to-image model built upon [Ovis-U1](https://github.com/AIDC-AI/Ovis-U1), specifically optimized for high-quality text rendering. It delivers text rendering quality comparable to much larger 20B-class systems while remaining compact enough to run on widely accessible hardware.

**Model Highlights**:
- **Strong Text Rendering at 7B Scale**: Delivers text rendering quality comparable to much larger 20B-class systems like Qwen-Image, and is competitive with leading closed-source models like GPT-4o in text-centric scenarios
- **High Fidelity on Text-Heavy Prompts**: Excels on prompts that demand tight alignment between linguistic content and rendered typography (e.g., posters, banners, logos, UI mockups, infographics)
- **Accurate Bilingual Text Rendering**: Produces legible, correctly spelled, and semantically consistent text in both Chinese and English across diverse fonts, sizes, and aspect ratios
- **Efficiency and Deployability**: Fits on a single high-end GPU with moderate memory and supports low-latency interactive use

**Related Links**:
- [GitHub](https://github.com/AIDC-AI/Ovis-Image)
- [Hugging Face](https://huggingface.co/AIDC-AI/Ovis-Image-7B)

## Ovis-Image text-to-image workflow

<a className="prose" target='_blank' href="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/image_ovis_text_to_image.json" style={{ display: 'inline-block', backgroundColor: '#0078D6', color: '#ffffff', padding: '10px 20px', borderRadius: '8px', borderColor: "transparent", textDecoration: 'none', fontWeight: 'bold'}}>
<p className="prose" style={{ margin: 0, fontSize: "0.8rem" }}>Download JSON Workflow File</p>
</a>

<UpdateReminder />
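
After downloading the workflow file, you can check which model files it refers to before fetching anything. The following is a minimal sketch that assumes nothing about the workflow schema beyond it being valid JSON: it walks the parsed data and collects every string ending in `.safetensors`. The function name `find_model_files` and the local path `image_ovis_text_to_image.json` are illustrative, not part of ComfyUI.

```python
import json
from pathlib import Path


def find_model_files(node, suffix=".safetensors"):
    """Recursively collect string values ending in `suffix` from parsed JSON."""
    found = set()
    if isinstance(node, str):
        if node.endswith(suffix):
            found.add(node)
    elif isinstance(node, dict):
        for value in node.values():
            found |= find_model_files(value, suffix)
    elif isinstance(node, list):
        for item in node:
            found |= find_model_files(item, suffix)
    return found


if __name__ == "__main__":
    # Hypothetical local path: wherever you saved the downloaded workflow.
    workflow_path = Path("image_ovis_text_to_image.json")
    if workflow_path.exists():
        workflow = json.loads(workflow_path.read_text(encoding="utf-8"))
        for name in sorted(find_model_files(workflow)):
            print(name)
```

Because the search is schema-agnostic, it may also pick up file names mentioned in notes or comments inside the workflow, so treat the result as a checklist rather than an exact dependency list.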

## Model links

**text_encoders**

- [ovis_2.5.safetensors](https://huggingface.co/Comfy-Org/Ovis-Image/resolve/main/split_files/text_encoders/ovis_2.5.safetensors)

**diffusion_models**

- [ovis_image_bf16.safetensors](https://huggingface.co/Comfy-Org/Ovis-Image/resolve/main/split_files/diffusion_models/ovis_image_bf16.safetensors)

**vae**

- [ae.safetensors](https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/vae/ae.safetensors)
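
Each download URL above contains a `split_files/<folder>/<file>` segment, and the folder name matches the subfolder of `ComfyUI/models/` where the file belongs. The sketch below derives the destination path from a URL under that assumption. The helper name `destination_for` and the default root `ComfyUI` are illustrative; adjust the root to your own installation.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# The three links listed in this tutorial.
MODEL_URLS = [
    "https://huggingface.co/Comfy-Org/Ovis-Image/resolve/main/split_files/text_encoders/ovis_2.5.safetensors",
    "https://huggingface.co/Comfy-Org/Ovis-Image/resolve/main/split_files/diffusion_models/ovis_image_bf16.safetensors",
    "https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/vae/ae.safetensors",
]


def destination_for(url, comfyui_root="ComfyUI"):
    """Map a `.../split_files/<folder>/<file>` URL to `<root>/models/<folder>/<file>`."""
    parts = PurePosixPath(urlparse(url).path).parts
    # Raises ValueError if the URL does not follow the split_files convention.
    idx = parts.index("split_files")
    folder, filename = parts[idx + 1], parts[idx + 2]
    return PurePosixPath(comfyui_root) / "models" / folder / filename


for url in MODEL_URLS:
    print(destination_for(url))
```

The sketch only computes paths and does not download anything, so it is safe to run before committing to the full download.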

**Model Storage Location**

```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │      └── ovis_2.5.safetensors
│   ├── 📂 diffusion_models/
│   │      └── ovis_image_bf16.safetensors
│   └── 📂 vae/
│          └── ae.safetensors
```
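
To confirm the files ended up in the layout shown above, a short check like the following can help. It is a sketch only: the function name `missing_models` and the root path `ComfyUI` are assumptions, so point it at your actual install directory.

```python
from pathlib import Path

# Relative paths taken from the storage layout in this tutorial.
EXPECTED_FILES = [
    "models/text_encoders/ovis_2.5.safetensors",
    "models/diffusion_models/ovis_image_bf16.safetensors",
    "models/vae/ae.safetensors",
]


def missing_models(comfyui_root):
    """Return the expected model files that are not present under `comfyui_root`."""
    root = Path(comfyui_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    # Hypothetical install location; replace with your own ComfyUI path.
    missing = missing_models("ComfyUI")
    if missing:
        print("Missing model files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("All Ovis-Image model files are in place.")
```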
54 changes: 54 additions & 0 deletions zh-CN/tutorials/image/ovis/ovis-image.mdx
@@ -0,0 +1,54 @@
---
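title: "Ovis-Image ComfyUI Workflow Example"
description: "Ovis-Image is a 7B text-to-image model specifically optimized for high-quality text rendering, designed to run efficiently under strict computational constraints."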
sidebarTitle: "Ovis-Image"
---

import UpdateReminder from '/snippets/zh/tutorials/update-reminder.mdx'

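**Ovis-Image** is a 7B text-to-image model built on [Ovis-U1](https://github.com/AIDC-AI/Ovis-U1) and specifically optimized for high-quality text rendering. It delivers text rendering quality comparable to much larger 20B-class systems while remaining compact enough to run on common hardware.

**Model Highlights**:
- **Strong Text Rendering at 7B Scale**: Delivers text rendering quality comparable to much larger 20B-class systems like Qwen-Image, and is competitive with leading closed-source models like GPT-4o in text-centric scenarios
- **High Fidelity on Text-Heavy Prompts**: Excels on prompts that demand tight alignment between linguistic content and rendered typography (e.g., posters, banners, logos, UI mockups, infographics)
- **Accurate Bilingual Text Rendering**: Produces legible, correctly spelled, and semantically consistent text in both Chinese and English across diverse fonts, sizes, and aspect ratios
- **Efficiency and Deployability**: Runs on a single high-end GPU with moderate memory requirements and supports low-latency interactive use

**Related Links**: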
- [GitHub](https://github.com/AIDC-AI/Ovis-Image)
- [Hugging Face](https://huggingface.co/AIDC-AI/Ovis-Image-7B)

## Ovis-Image text-to-image workflow

<a className="prose" target='_blank' href="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/image_ovis_text_to_image.json" style={{ display: 'inline-block', backgroundColor: '#0078D6', color: '#ffffff', padding: '10px 20px', borderRadius: '8px', borderColor: "transparent", textDecoration: 'none', fontWeight: 'bold'}}>
<p className="prose" style={{ margin: 0, fontSize: "0.8rem" }}>Download JSON Workflow File</p>
</a>

<UpdateReminder />

## Model links

**text_encoders**

- [ovis_2.5.safetensors](https://huggingface.co/Comfy-Org/Ovis-Image/resolve/main/split_files/text_encoders/ovis_2.5.safetensors)

**diffusion_models**

- [ovis_image_bf16.safetensors](https://huggingface.co/Comfy-Org/Ovis-Image/resolve/main/split_files/diffusion_models/ovis_image_bf16.safetensors)

**vae**

- [ae.safetensors](https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/vae/ae.safetensors)

**Model Storage Location**

```
📂 ComfyUI/
├── 📂 models/
│   ├── 📂 text_encoders/
│   │      └── ovis_2.5.safetensors
│   ├── 📂 diffusion_models/
│   │      └── ovis_image_bf16.safetensors
│   └── 📂 vae/
│          └── ae.safetensors
```