Add Kandinsky 5.0 video generation tutorial (#630)

mintlify[bot] · web-flow · commit 93b054b2bb1b · 2025-12-09T20:18:31.000+08:00
* Update tutorials/video/kandinsky/kandinsky-5.mdx

Co-Authored-By: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>

* Update docs.json

Co-Authored-By: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>

* Update tutorials/video/kandinsky/kandinsky-5.mdx

Co-Authored-By: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>

* Update tutorials/video/kandinsky/kandinsky-5.mdx

Co-Authored-By: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>

* Update tutorials/video/kandinsky/kandinsky-5.mdx

Co-Authored-By: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>

* Update zh-CN/tutorials/video/kandinsky/kandinsky-5.mdx

Co-Authored-By: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>

* Update docs.json

Co-Authored-By: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>

---------

Co-authored-by: mintlify[bot] <109931778+mintlify[bot]@users.noreply.github.com>
diff --git a/docs.json b/docs.json
@@ -204,6 +204,12 @@
                         "pages": [
                           "tutorials/video/cosmos/cosmos-predict2-video2world"
                         ]
+                      },
+                      {
+                        "group": "Kandinsky",
+                        "pages": [
+                          "tutorials/video/kandinsky/kandinsky-5"
+                        ]
                       }
                     ]
                   },
@@ -834,6 +840,12 @@
                         "pages": [
                           "zh-CN/tutorials/video/cosmos/cosmos-predict2-video2world"
                         ]
+                      },
+                      {
+                        "group": "Kandinsky",
+                        "pages": [
+                          "zh-CN/tutorials/video/kandinsky/kandinsky-5"
+                        ]
                       }
                     ]
                   },
diff --git a/tutorials/video/kandinsky/kandinsky-5.mdx b/tutorials/video/kandinsky/kandinsky-5.mdx
@@ -0,0 +1,111 @@
+---
+title: "Kandinsky 5.0"
+description: "This guide shows how to use Kandinsky 5.0 video generation workflows in ComfyUI"
+sidebarTitle: "Kandinsky 5.0"
+---
+
+import UpdateReminder from "/snippets/tutorials/update-reminder.mdx";
+
+[Kandinsky 5.0](https://huggingface.co/kandinskylab/Kandinsky-5.0-I2V-Lite-5s) is a family of diffusion models for video and image generation developed by [Kandinsky Lab](https://huggingface.co/kandinskylab). Kandinsky 5.0 T2V Lite is a lightweight 2B-parameter model that ranks among the top open-source video generation models and can generate videos up to 10 seconds long.
+
+<UpdateReminder/>
+
+## Overview
+
+Kandinsky 5.0 uses a latent diffusion pipeline with Flow Matching and features:
+
+- **Diffusion Transformer (DiT):** Main generative backbone with cross-attention to text embeddings
+- **Qwen2.5-VL and CLIP:** Provide high-quality text embeddings
+- **HunyuanVideo 3D VAE:** Encodes video into a latent space and decodes latents back into video
+
+The model family includes multiple variants optimized for different use cases:
+- **SFT model:** Highest generation quality
+- **CFG-distilled:** 2× faster inference
+- **Diffusion-distilled:** 6× faster with minimal quality loss (16 steps)
+- **Pretrain model:** Designed for fine-tuning
+
+The T2V Lite models are available in 5-second and 10-second versions; the I2V Lite model listed below is 5-second only.
+
+## Model variants
+
+| Model | Video Duration | NFE | Latency (H100) |
+|-------|---------------|-----|----------------|
+| Kandinsky 5.0 T2V Lite SFT | 5s / 10s | 100 | 139s / 224s |
+| Kandinsky 5.0 T2V Lite no-CFG | 5s / 10s | 50 | 77s / 124s |
+| Kandinsky 5.0 T2V Lite distill | 5s / 10s | 16 | 35s / 61s |
+| Kandinsky 5.0 I2V Lite | 5s | 100 | 673s |
+
+## Text-to-Video workflow
+
+### 1. Download workflow file
+
+Update ComfyUI to the latest version, then open `Workflow` -> `Browse Templates` -> `Video` and select "Kandinsky 5.0 T2V" to load the workflow.
+
+<a className="prose" target='_blank' href="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/video_kandinsky5_t2v.json" style={{ display: 'inline-block', backgroundColor: '#0078D6', color: '#ffffff', padding: '10px 20px', borderRadius: '8px', borderColor: "transparent", textDecoration: 'none', fontWeight: 'bold'}}>
+    <p className="prose" style={{ margin: 0, fontSize: "0.8rem" }}>Download JSON Workflow File</p>
+</a>
+
+### 2. Manually download models
+
+**Text Encoders**
+- [qwen_2.5_vl_7b_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_1.5_repackaged/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)
+- [clip_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors)
+
+**Diffusion Model**
+- [kandinsky5lite_t2v_sft_5s.safetensors](https://huggingface.co/kandinskylab/Kandinsky-5.0-T2V-Lite-sft-5s/resolve/main/model/kandinsky5lite_t2v_sft_5s.safetensors)
+
+**VAE**
+- [hunyuan_video_vae_bf16.safetensors](https://huggingface.co/Kijai/HunyuanVideo_comfy/resolve/main/hunyuan_video_vae_bf16.safetensors)
+
+```
+ComfyUI/
+├── 📂 models/
+│   ├── 📂 text_encoders/
+│   │      ├── qwen_2.5_vl_7b_fp8_scaled.safetensors
+│   │      └── clip_l.safetensors
+│   ├── 📂 diffusion_models/
+│   │      └── kandinsky5lite_t2v_sft_5s.safetensors
+│   └── 📂 vae/
+│          └── hunyuan_video_vae_bf16.safetensors
+```
+
+## Image-to-Video workflow
+
+### 1. Download workflow file
+
+Update ComfyUI to the latest version, then open `Workflow` -> `Browse Templates` -> `Video` and select "Kandinsky 5.0 I2V" to load the workflow.
+
+<a className="prose" target='_blank' href="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/video_kandinsky5_i2v.json" style={{ display: 'inline-block', backgroundColor: '#0078D6', color: '#ffffff', padding: '10px 20px', borderRadius: '8px', borderColor: "transparent", textDecoration: 'none', fontWeight: 'bold'}}>
+    <p className="prose" style={{ margin: 0, fontSize: "0.8rem" }}>Download JSON Workflow File</p>
+</a>
+
+### 2. Manually download models
+
+**Text Encoders**
+- [qwen_2.5_vl_7b_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_1.5_repackaged/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)
+- [clip_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors)
+
+**Diffusion Model**
+- [kandinsky5lite_i2v_sft_5s.safetensors](https://huggingface.co/kandinskylab/Kandinsky-5.0-I2V-Lite-5s/resolve/main/model/kandinsky5lite_i2v_sft_5s.safetensors)
+
+**VAE**
+- [hunyuan_video_vae_bf16.safetensors](https://huggingface.co/Kijai/HunyuanVideo_comfy/resolve/main/hunyuan_video_vae_bf16.safetensors)
+
+```
+ComfyUI/
+├── 📂 models/
+│   ├── 📂 text_encoders/
+│   │      ├── qwen_2.5_vl_7b_fp8_scaled.safetensors
+│   │      └── clip_l.safetensors
+│   ├── 📂 diffusion_models/
+│   │      └── kandinsky5lite_i2v_sft_5s.safetensors
+│   └── 📂 vae/
+│          └── hunyuan_video_vae_bf16.safetensors
+```
+
+## Resources
+
+- [HuggingFace Model Collection](https://huggingface.co/collections/kandinskylab/kandinsky-50-video-lite)
+- [GitHub Repository](https://github.com/ai-forever/Kandinsky-5)
+- [ComfyUI Integration](https://github.com/ai-forever/Kandinsky-5/blob/main/comfyui/README.md)
+- [Project Page](https://ai-forever.github.io/Kandinsky-5/)
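
Review note on the speedup figures: the Overview in the tutorial above quotes "2× faster" and "6× faster", while the Model variants table gives NFE and H100 latency per variant. The following is a minimal sketch that recomputes both ratios from the table's own numbers so the two can be compared. The dictionary layout and the `speedups` helper are illustrative names, not part of Kandinsky or ComfyUI.

```python
# Figures copied from the "Model variants" table (5s / 10s clips, H100).
# The structure and names below are illustrative, not an official API.
VARIANTS = {
    "T2V Lite SFT": {"nfe": 100, "latency_s": (139, 224)},
    "T2V Lite no-CFG": {"nfe": 50, "latency_s": (77, 124)},
    "T2V Lite distill": {"nfe": 16, "latency_s": (35, 61)},
}


def speedups(baseline="T2V Lite SFT"):
    """Return {variant: (nfe_ratio, (latency_ratio_5s, latency_ratio_10s))}."""
    base = VARIANTS[baseline]
    result = {}
    for name, row in VARIANTS.items():
        nfe_ratio = base["nfe"] / row["nfe"]
        latency_ratios = tuple(
            b / x for b, x in zip(base["latency_s"], row["latency_s"])
        )
        result[name] = (nfe_ratio, latency_ratios)
    return result


if __name__ == "__main__":
    for name, (nfe_ratio, (r5, r10)) in speedups().items():
        print(f"{name}: NFE x{nfe_ratio:.2f}, "
              f"latency x{r5:.2f} (5s) / x{r10:.2f} (10s)")
```

With the table's figures, the distilled variant uses 6.25× fewer function evaluations than SFT but is about 3.97× (5s) and 3.67× (10s) faster in measured latency, and the no-CFG variant is 2× by NFE and about 1.8× by latency. One reading, offered as an assumption, is that the quoted 2× and 6× track the NFE reduction rather than wall-clock time.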
diff --git a/zh-CN/tutorials/video/kandinsky/kandinsky-5.mdx b/zh-CN/tutorials/video/kandinsky/kandinsky-5.mdx
@@ -0,0 +1,111 @@
+---
+title: "Kandinsky 5.0"
+description: "本指南介绍如何在 ComfyUI 中使用 Kandinsky 5.0 视频生成工作流"
+sidebarTitle: "Kandinsky 5.0"
+---
+
+import UpdateReminder from "/snippets/zh/tutorials/update-reminder.mdx";
+
+[Kandinsky 5.0](https://huggingface.co/kandinskylab/Kandinsky-5.0-I2V-Lite-5s) 是由 [Kandinsky Lab](https://huggingface.co/kandinskylab) 开发的视频和图像生成扩散模型系列。Kandinsky 5.0 T2V Lite 是一个轻量级的 2B 参数模型，在开源视频生成模型中名列前茅，能够生成长达 10 秒的视频。
+
+<UpdateReminder/>
+
+## 概述
+
+Kandinsky 5.0 使用带有 Flow Matching 的潜在扩散管道，具有以下特点：
+
+- **扩散 Transformer (DiT)：** 主要生成骨干网络，通过交叉注意力连接文本嵌入
+- **Qwen2.5-VL 和 CLIP：** 提供高质量的文本嵌入
+- **HunyuanVideo 3D VAE：** 将视频编码和解码到潜在空间
+
+该模型系列包含多个针对不同用例优化的变体：
+- **SFT 模型：** 最高生成质量
+- **CFG-distilled：** 推理速度提升 2 倍
+- **Diffusion-distilled：** 速度提升 6 倍，质量损失极小（16 步）
+- **Pretrain 模型：** 专为微调设计
+
+所有模型均提供 5 秒和 10 秒视频生成版本。
+
+## 模型变体
+
+| 模型 | 视频时长 | NFE | 延迟 (H100) |
+|-------|---------------|-----|----------------|
+| Kandinsky 5.0 T2V Lite SFT | 5s / 10s | 100 | 139s / 224s |
+| Kandinsky 5.0 T2V Lite no-CFG | 5s / 10s | 50 | 77s / 124s |
+| Kandinsky 5.0 T2V Lite distill | 5s / 10s | 16 | 35s / 61s |
+| Kandinsky 5.0 I2V Lite | 5s | 100 | 673s |
+
+## 文生视频工作流
+
+### 1. 下载工作流文件
+
+请更新你的 ComfyUI 到最新版本，并通过菜单 `工作流` -> `浏览模板` -> `视频` 找到 "Kandinsky 5.0 T2V" 以加载工作流。
+
+<a className="prose" target='_blank' href="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/video_kandinsky5_t2v.json" style={{ display: 'inline-block', backgroundColor: '#0078D6', color: '#ffffff', padding: '10px 20px', borderRadius: '8px', borderColor: "transparent", textDecoration: 'none', fontWeight: 'bold'}}>
+    <p className="prose" style={{ margin: 0, fontSize: "0.8rem" }}>下载 JSON 格式工作流</p>
+</a>
+
+### 2. 手动下载模型
+
+**Text Encoders**
+- [qwen_2.5_vl_7b_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_1.5_repackaged/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)
+- [clip_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors)
+
+**Diffusion Model**
+- [kandinsky5lite_t2v_sft_5s.safetensors](https://huggingface.co/kandinskylab/Kandinsky-5.0-T2V-Lite-sft-5s/resolve/main/model/kandinsky5lite_t2v_sft_5s.safetensors)
+
+**VAE**
+- [hunyuan_video_vae_bf16.safetensors](https://huggingface.co/Kijai/HunyuanVideo_comfy/resolve/main/hunyuan_video_vae_bf16.safetensors)
+
+```
+ComfyUI/
+├── 📂 models/
+│   ├── 📂 text_encoders/
+│   │      ├── qwen_2.5_vl_7b_fp8_scaled.safetensors
+│   │      └── clip_l.safetensors
+│   ├── 📂 diffusion_models/
+│   │      └── kandinsky5lite_t2v_sft_5s.safetensors
+│   └── 📂 vae/
+│          └── hunyuan_video_vae_bf16.safetensors
+```
+
+## 图生视频工作流
+
+### 1. 下载工作流文件
+
+请更新你的 ComfyUI 到最新版本，并通过菜单 `工作流` -> `浏览模板` -> `视频` 找到 "Kandinsky 5.0 I2V" 以加载工作流。
+
+<a className="prose" target='_blank' href="https://raw.githubusercontent.com/Comfy-Org/workflow_templates/refs/heads/main/templates/video_kandinsky5_i2v.json" style={{ display: 'inline-block', backgroundColor: '#0078D6', color: '#ffffff', padding: '10px 20px', borderRadius: '8px', borderColor: "transparent", textDecoration: 'none', fontWeight: 'bold'}}>
+    <p className="prose" style={{ margin: 0, fontSize: "0.8rem" }}>下载 JSON 格式工作流</p>
+</a>
+
+### 2. 手动下载模型
+
+**Text Encoders**
+- [qwen_2.5_vl_7b_fp8_scaled.safetensors](https://huggingface.co/Comfy-Org/HunyuanVideo_1.5_repackaged/resolve/main/split_files/text_encoders/qwen_2.5_vl_7b_fp8_scaled.safetensors)
+- [clip_l.safetensors](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors)
+
+**Diffusion Model**
+- [kandinsky5lite_i2v_sft_5s.safetensors](https://huggingface.co/kandinskylab/Kandinsky-5.0-I2V-Lite-5s/resolve/main/model/kandinsky5lite_i2v_sft_5s.safetensors)
+
+**VAE**
+- [hunyuan_video_vae_bf16.safetensors](https://huggingface.co/Kijai/HunyuanVideo_comfy/resolve/main/hunyuan_video_vae_bf16.safetensors)
+
+```
+ComfyUI/
+├── 📂 models/
+│   ├── 📂 text_encoders/
+│   │      ├── qwen_2.5_vl_7b_fp8_scaled.safetensors
+│   │      └── clip_l.safetensors
+│   ├── 📂 diffusion_models/
+│   │      └── kandinsky5lite_i2v_sft_5s.safetensors
+│   └── 📂 vae/
+│          └── hunyuan_video_vae_bf16.safetensors
+```
+
+## 资源
+
+- [HuggingFace 模型合集](https://huggingface.co/collections/kandinskylab/kandinsky-50-video-lite)
+- [GitHub 仓库](https://github.com/ai-forever/Kandinsky-5)
+- [ComfyUI 集成](https://github.com/ai-forever/Kandinsky-5/blob/main/comfyui/README.md)
+- [项目主页](https://ai-forever.github.io/Kandinsky-5/)
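
Review note on the manual download steps: both tutorials list the model files and the folder each one belongs in. A minimal sketch of a check that reports which of those files are still missing from a local install is given here. The filenames and subfolders are taken from the directory trees in the diff; the helper names and the `"ComfyUI"` root path are assumptions for illustration.

```python
from pathlib import Path

# Filename -> subfolder under ComfyUI/models/, as listed in the tutorial's
# directory trees (T2V and I2V combined). Helper names are illustrative.
MODEL_FOLDERS = {
    "qwen_2.5_vl_7b_fp8_scaled.safetensors": "text_encoders",
    "clip_l.safetensors": "text_encoders",
    "kandinsky5lite_t2v_sft_5s.safetensors": "diffusion_models",
    "kandinsky5lite_i2v_sft_5s.safetensors": "diffusion_models",
    "hunyuan_video_vae_bf16.safetensors": "vae",
}


def expected_paths(comfy_root):
    """Map each model filename to the path where the tutorial places it."""
    models_dir = Path(comfy_root) / "models"
    return {
        name: models_dir / folder / name
        for name, folder in MODEL_FOLDERS.items()
    }


def missing_models(comfy_root):
    """Return the sorted filenames that are not present under comfy_root."""
    return sorted(
        name
        for name, path in expected_paths(comfy_root).items()
        if not path.is_file()
    )


if __name__ == "__main__":
    # "ComfyUI" is a placeholder; point this at the actual install directory.
    for name in missing_models("ComfyUI"):
        print("missing:", name)
```

Only one of the two diffusion models is needed for a given workflow, so a T2V-only setup would still report the I2V file as missing.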

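
Review note on the `docs.json` hunks: each new "Kandinsky" group registers a page path that should correspond to one of the `.mdx` files added in this commit. A minimal sketch of a check that walks a navigation tree and derives the expected `.mdx` filenames is given here. The `SAMPLE` fragment mirrors only the two added groups; the enclosing `"groups"` key and the `collect_pages` helper are assumptions for illustration, not the real file's full structure.

```python
import json


def collect_pages(node):
    """Recursively yield every page path string found under "pages" keys."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "pages":
                for item in value:
                    if isinstance(item, str):
                        yield item
                    else:
                        # Nested group object inside a "pages" list.
                        yield from collect_pages(item)
            else:
                yield from collect_pages(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_pages(item)


# Hypothetical fragment: only the two groups added by this commit.
SAMPLE = json.loads("""
{"groups": [
  {"group": "Kandinsky",
   "pages": ["tutorials/video/kandinsky/kandinsky-5"]},
  {"group": "Kandinsky",
   "pages": ["zh-CN/tutorials/video/kandinsky/kandinsky-5"]}
]}
""")

pages = list(collect_pages(SAMPLE))
mdx_files = [f"{page}.mdx" for page in pages]

if __name__ == "__main__":
    for path in mdx_files:
        print(path)
```

The derived names match the two files the diff creates, `tutorials/video/kandinsky/kandinsky-5.mdx` and its `zh-CN/` counterpart.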