Changes from 95 commits
101 commits
d53f848
add transformer pipeline first version
leffff Oct 4, 2025
7db6093
updates
leffff Oct 6, 2025
a0cf07f
fix 5sec generation
leffff Oct 9, 2025
0bd738f
Merge branch 'huggingface:main' into main
leffff Oct 9, 2025
c8f3a36
rewrite Kandinsky5T2VPipeline to diffusers style
leffff Oct 10, 2025
86b6c2b
Merge branch 'huggingface:main' into main
leffff Oct 10, 2025
723d149
add multiprompt support
leffff Oct 10, 2025
22e14bd
remove prints in pipeline
leffff Oct 10, 2025
70fa62b
add nabla attention
leffff Oct 12, 2025
07e11b2
Merge branch 'huggingface:main' into main
leffff Oct 12, 2025
45240a7
Wrap Transformer in Diffusers style
leffff Oct 13, 2025
43bd1e8
fix license
leffff Oct 13, 2025
f35c279
Merge branch 'huggingface:main' into main
leffff Oct 13, 2025
149fd53
fix prompt type
leffff Oct 13, 2025
e3a3e9d
Merge branch 'main' of https://github.com/leffff/diffusers
leffff Oct 13, 2025
7af80e9
add gradient checkpointing and peft support
leffff Oct 14, 2025
04efb19
add usage example
leffff Oct 14, 2025
4aa22f3
Merge branch 'main' into main
leffff Oct 14, 2025
235f0d5
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 14, 2025
88a8eea
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 14, 2025
f52f3b4
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 14, 2025
0190e55
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 14, 2025
d62dffc
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 14, 2025
7084106
remove unused imports
leffff Oct 14, 2025
d5dcd94
Merge branch 'huggingface:main' into main
leffff Oct 15, 2025
b615d5c
add 10 second models support
leffff Oct 15, 2025
6a0233e
Merge branch 'main' of https://github.com/leffff/diffusers
leffff Oct 15, 2025
588c12a
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 16, 2025
327ab84
remove no_grad and simplified prompt paddings
leffff Oct 16, 2025
9b06afb
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 16, 2025
8fd22c0
merge
leffff Oct 16, 2025
28458d0
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 16, 2025
e7b91ed
merge suggestions
leffff Oct 16, 2025
cd3cc61
moved template to __init__
leffff Oct 16, 2025
4450265
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 16, 2025
b9a3be2
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 16, 2025
78a23b9
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 16, 2025
56b90b1
moved sdps inside processor
leffff Oct 16, 2025
600e9d6
Merge branch 'main' of https://github.com/leffff/diffusers
leffff Oct 16, 2025
31a1474
remove oneline function
leffff Oct 16, 2025
894aa98
remove reset_dtype methods
leffff Oct 16, 2025
c8be081
Transformer: move all methods to forward
leffff Oct 16, 2025
3ffdf7f
separated prompt encoding
leffff Oct 16, 2025
b0e1b86
Merge branch 'main' into main
leffff Oct 16, 2025
9f52335
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 16, 2025
cc46e2d
refactoring
leffff Oct 16, 2025
573b966
Merge branch 'main' of https://github.com/leffff/diffusers
leffff Oct 16, 2025
9672c6b
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 16, 2025
1e597cb
Merge branch 'main' of https://github.com/leffff/diffusers
leffff Oct 16, 2025
900feba
refactoring acording to https://github.com/huggingface/diffusers/comm…
leffff Oct 17, 2025
3839f5e
Merge branch 'main' into main
yiyixuxu Oct 17, 2025
226bbf8
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 17, 2025
9504fb0
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 17, 2025
f0eca08
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 17, 2025
cc74c1e
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 17, 2025
cb915d7
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 17, 2025
9aa3c2e
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 17, 2025
feac8f0
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 17, 2025
d3b9597
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 17, 2025
693b9aa
Update src/diffusers/models/transformers/transformer_kandinsky.py
leffff Oct 17, 2025
e2ed6ec
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
2925447
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
b02ad82
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
dc67c2b
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
d0fc426
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
222ba4c
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
3a49505
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
1e12017
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
5a30079
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
0d96ecf
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
aadafc1
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
54cf03c
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
22c503f
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
211d3dd
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
70cfb9e
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
6e83133
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
7ad87f3
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
bf229af
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
06afd9b
Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py
leffff Oct 17, 2025
e1a635e
fixed
leffff Oct 17, 2025
e4856e5
Merge branch 'main' into main
leffff Oct 17, 2025
1bf19f0
style +copies
yiyixuxu Oct 18, 2025
1746f6d
Update src/diffusers/models/transformers/transformer_kandinsky.py
yiyixuxu Oct 18, 2025
5bb1657
more
yiyixuxu Oct 18, 2025
a26300f
Apply suggestions from code review
yiyixuxu Oct 18, 2025
ecbe522
add lora loader doc
yiyixuxu Oct 18, 2025
11200b4
Merge branch 'huggingface:main' into main
leffff Oct 20, 2025
b35445c
add compiled Nabla Attention
leffff Oct 21, 2025
51b078c
Merge branch 'huggingface:main' into main
leffff Oct 21, 2025
4ed2f53
Merge branch 'main' into main
sayakpaul Oct 22, 2025
54e7757
all needed changes for 10 sec models are added!
leffff Oct 22, 2025
939f7d0
Merge branch 'main' of https://github.com/leffff/diffusers
leffff Oct 22, 2025
91133e0
Merge branch 'huggingface:main' into main
leffff Oct 22, 2025
25f2e9c
add docs
leffff Oct 23, 2025
e45c036
Merge branch 'huggingface:main' into main
leffff Oct 23, 2025
3bbc232
Apply style fixes
github-actions[bot] Oct 23, 2025
e181f13
Merge branch 'huggingface:main' into main
leffff Oct 24, 2025
dd6bf39
update docs
leffff Oct 24, 2025
add757b
Merge branch 'main' into main
yiyixuxu Oct 24, 2025
5fb528b
add kandinsky5 to toctree
leffff Oct 24, 2025
c9c1190
Merge branch 'main' of https://github.com/leffff/diffusers
leffff Oct 24, 2025
109 changes: 109 additions & 0 deletions docs/source/en/api/pipelines/kandinsky_v5.md
@@ -0,0 +1,109 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Kandinsky 5.0

Kandinsky 5.0 was created by the Kandinsky team: Alexey Letunovskiy, Maria Kovaleva, Ivan Kirillov, Lev Novitskiy, Denis Koposov, Dmitrii Mikhailov, Anna Averchenkova, Andrey Shutkin, Julia Agafonova, Olga Kim, Anastasiia Kargapoltseva, Nikita Kiselev, Anna Dmitrienko, Anastasia Maltseva, Kirill Chernyshev, Ilia Vasiliev, Viacheslav Vasilev, Vladimir Polovnikov, Yury Kolabushin, Alexander Belykh, Mikhail Mamaev, Anastasia Aliaskina, Tatiana Nikulina, Polina Gavrilova, Vladimir Arkhipkin, Vladimir Korviakov, Nikolai Gerasimenko, Denis Parkhomenko, and Denis Dimitrov.


Kandinsky 5.0 is a family of diffusion models for video and image generation. Kandinsky 5.0 T2V Lite is a lightweight (2B-parameter) video generation model that ranks #1 among open-source models in its class. It outperforms larger models and offers the best understanding of Russian concepts in the open-source ecosystem.

The model introduces several key innovations:
- **Latent diffusion pipeline** with **Flow Matching** for improved training stability
- **Diffusion Transformer (DiT)** as the main generative backbone with cross-attention to text embeddings
- Dual text encoding using **Qwen2.5-VL** and **CLIP** for comprehensive text understanding
- **HunyuanVideo 3D VAE** for efficient video encoding and decoding
- **Sparse attention mechanisms** (NABLA) for efficient long-sequence processing
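
To make the Flow Matching bullet concrete, the sketch below shows the sampling idea on a toy problem: start from noise and integrate a velocity field with Euler steps. It is not the Kandinsky implementation. In the real pipeline the velocity would be predicted by the DiT, and the time convention may differ. Here `toy_velocity` and `euler_sample` are hypothetical names, and the velocity is the closed-form field for a straight-line path that ends at a single known target point.

```python
import random


def toy_velocity(x, t, target):
    """Ideal velocity for a straight-line path that reaches `target` at t = 1.

    Stand-in for a learned model: a real pipeline would call the transformer here.
    """
    return [(xt - xi) / (1.0 - t) for xi, xt in zip(x, target)]


def euler_sample(noise, target, num_steps):
    """Integrate dx/dt = v(x, t) from t = 0 to t = 1 with fixed-size Euler steps."""
    x = list(noise)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k / num_steps  # t stays strictly below 1, so the division is safe
        v = toy_velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(4)]  # toy "latent" drawn from N(0, 1)
target = [1.0, -2.0, 0.5, 3.0]                   # toy "data" point
sample = euler_sample(noise, target, num_steps=16)
```

Because the toy path is a straight line, the Euler steps follow it exactly and `sample` ends at `target` up to floating-point error. A learned velocity field is only approximately straight, which is why real models still need multiple steps.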

The original codebase can be found at [ai-forever/Kandinsky-5](https://github.com/ai-forever/Kandinsky-5).

> [!TIP]
> Check out the [AI Forever](https://huggingface.co/ai-forever) organization on the Hub for the official model checkpoints for text-to-video generation, including pretrained, SFT, no-CFG, and distilled variants.

## Available Models

Kandinsky 5.0 T2V Lite comes in several variants optimized for different use cases:

| Model Type | Description | Use Cases |
|------------|-------------|-----------|
| **SFT** | Supervised Fine-Tuned model | Highest generation quality |
| **no-CFG** | Classifier-Free Guidance distilled | 2× faster inference |
| **Distilled** | Diffusion distilled to 16 steps | 6× faster inference, minimal quality loss |
| **Pretrain** | Base pretrained model | Research and fine-tuning |

All models are available in 5-second and 10-second video generation versions.
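
The speedups in the table are consistent with a simple count of transformer forward passes. The sketch below is a rough estimate under stated assumptions: classifier-free guidance costs two passes per step whenever `guidance_scale > 1`, the SFT and distilled settings are the ones from the usage examples on this page, and the no-CFG variant is assumed to run 50 steps at `guidance_scale=1.0`. `transformer_passes` is a hypothetical helper, not part of diffusers.

```python
def transformer_passes(num_inference_steps, guidance_scale):
    """Estimate transformer forward passes for one generation.

    Assumption: guidance_scale > 1 enables classifier-free guidance, which needs
    a conditional and an unconditional pass at every step.
    """
    passes_per_step = 2 if guidance_scale > 1.0 else 1
    return num_inference_steps * passes_per_step


sft = transformer_passes(50, 5.0)        # SFT with CFG
no_cfg = transformer_passes(50, 1.0)     # no-CFG variant (assumed settings)
distilled = transformer_passes(16, 1.0)  # 16-step distilled variant

speedup_no_cfg = sft / no_cfg
speedup_distilled = sft / distilled
```

Under these assumptions the ratios come out to 2.0 and 6.25, in line with the "2× faster" and "6× faster" figures above. Actual wall-clock gains also depend on text encoding and VAE decoding, which this count ignores.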

## Kandinsky5T2VPipeline

[[autodoc]] Kandinsky5T2VPipeline
  - all
  - __call__

## Usage Examples

### Basic Text-to-Video Generation

```python
import torch
from diffusers import Kandinsky5T2VPipeline
from diffusers.utils import export_to_video

# Load the pipeline
model_id = "ai-forever/Kandinsky-5.0-T2V-Lite-sft-5s-Diffusers"
pipe = Kandinsky5T2VPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")

# Generate video
prompt = "A cat and a dog baking a cake together in a kitchen."
negative_prompt = "Static, 2D cartoon, cartoon, 2d animation, paintings, images, worst quality, low quality, ugly, deformed, walking backwards"

output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=512,
    width=768,
    num_frames=121,  # ~5 seconds at 24 fps
    num_inference_steps=50,
    guidance_scale=5.0,
).frames[0]

export_to_video(output, "output.mp4", fps=24, quality=9)
```
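
The `num_frames=121` value above corresponds to 5 seconds at 24 fps plus one extra frame. A small helper can make that arithmetic explicit. This is a sketch: `frames_for_duration` is a hypothetical name, the "+1" rule is inferred from the 121-frame example, and the value for the 10-second checkpoints is an assumption that should be checked against the model card.

```python
def frames_for_duration(seconds, fps=24):
    """Frame count for a clip of `seconds` at `fps`, plus the initial frame.

    Assumption: the pipeline expects seconds * fps + 1 frames, as suggested by
    the 121-frame example for a ~5 second clip at 24 fps.
    """
    return int(round(seconds * fps)) + 1


frames_5s = frames_for_duration(5)    # matches num_frames=121 in the example
frames_10s = frames_for_duration(10)  # assumed value for the 10-second models
```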


### Using Different Model Variants

```python
import torch
from diffusers import Kandinsky5T2VPipeline
from diffusers.utils import export_to_video

# For faster generation, load the distilled model
model_id = "ai-forever/Kandinsky-5.0-T2V-Lite-distilled16steps-5s-Diffusers"
pipe = Kandinsky5T2VPipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
pipe = pipe.to("cuda")

# Generate with fewer steps
output = pipe(
    prompt="A beautiful sunset over mountains",
    num_inference_steps=16,  # Only 16 steps needed for the distilled model
    guidance_scale=1.0,
).frames[0]

export_to_video(output, "output_distilled.mp4", fps=24, quality=9)
```

## Citation
```bibtex
@misc{kandinsky2025,
    author = {Alexey Letunovskiy and Maria Kovaleva and Ivan Kirillov and Lev Novitskiy and Denis Koposov and
              Dmitrii Mikhailov and Anna Averchenkova and Andrey Shutkin and Julia Agafonova and Olga Kim and
              Anastasiia Kargapoltseva and Nikita Kiselev and Vladimir Arkhipkin and Vladimir Korviakov and
              Nikolai Gerasimenko and Denis Parkhomenko and Anna Dmitrienko and Anastasia Maltseva and
              Kirill Chernyshev and Ilia Vasiliev and Viacheslav Vasilev and Vladimir Polovnikov and
              Yury Kolabushin and Alexander Belykh and Mikhail Mamaev and Anastasia Aliaskina and
              Tatiana Nikulina and Polina Gavrilova and Denis Dimitrov},
    title = {Kandinsky 5.0: A family of diffusion models for Video \& Image generation},
    howpublished = {\url{https://github.com/ai-forever/Kandinsky-5}},
    year = 2025
}
```
4 changes: 3 additions & 1 deletion src/diffusers/models/transformers/transformer_kandinsky.py
@@ -324,9 +324,10 @@ def apply_rotary(x, rope):
sparse_params["sta_mask"],
thr=sparse_params["P"],
)

else:
attn_mask = None

hidden_states = dispatch_attention_fn(
query,
key,
@@ -335,6 +336,7 @@ def apply_rotary(x, rope):
backend=self._attention_backend,
parallel_config=self._parallel_config,
)

hidden_states = hidden_states.flatten(-2, -1)

attn_out = attn.out_layer(hidden_states)