
Conversation

@leffff (Contributor) commented Oct 13, 2025

What does this PR do?

This PR adds Kandinsky5T2VPipeline and Kandinsky5Transformer3DModel, as well as several layer classes needed for the Kandinsky 5.0 Lite T2V model.

@sayakpaul Please review

@sayakpaul requested review from DN6 and yiyixuxu, October 14, 2025
@sayakpaul (Member) commented:

Could you please update the PR with test code and some example outputs?

@leffff (Contributor, Author) commented Oct 14, 2025

Sure!

@leffff (Contributor, Author) commented Oct 14, 2025

Dear @sayakpaul @yiyixuxu @DN6
What should the test code and example outputs look like?

@leffff (Contributor, Author) commented Oct 14, 2025

import torch
from diffusers import Kandinsky5T2VPipeline
from diffusers.utils import export_to_video

pipe = Kandinsky5T2VPipeline.from_pretrained(
    "ai-forever/Kandinsky-5.0-T2V-Lite-sft-5s-Diffusers", 
    torch_dtype=torch.bfloat16
)
pipe = pipe.to("cuda")

negative_prompt = [
    "Static, 2D cartoon, cartoon, 2d animation, paintings, images, worst quality, low quality, ugly, deformed, walking backwards",
]
prompt = [
    "A cat and a dog baking a cake together in a kitchen.",
]

output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=512,
    width=768,
    num_frames=121,
    num_inference_steps=50,
    guidance_scale=5.0,
    num_videos_per_prompt=1,
    generator=torch.Generator().manual_seed(42),
)
# 121 frames at 24 fps ≈ 5 s of video
export_to_video(output.frames[0], "output.mp4", fps=24)

[video: output.10.mp4]
prompt = [
    "A monkey riding a skateboard",
]

[video: output.10.mp4]

prompt = [
    "Several giant wooly mammoths threading through the meadow",
]

[video: output.10.mp4]

@sayakpaul (Member) commented:

Great, thanks for providing the examples! Does the model also do realistic generations? 👀

@linoytsaban @apolinario @asomoza in case you wanna test it?

@leffff (Contributor, Author) commented Oct 14, 2025

Yes of course!

A stylish woman struts confidently down a rain-drenched Tokyo street, where vibrant neon signs flicker and pulse with electric color. She wears a sleek black leather jacket over a flowing red dress, paired with polished black boots and a matching black purse. Her sunglasses reflect the glowing cityscape as she moves with a calm, assured demeanor, red lipstick adding a bold contrast to her look. The wet pavement mirrors the dazzling lights, doubling the intensity of the urban glow around her. Pedestrians bustle along the sidewalks, their silhouettes blending into the dynamic, cinematic atmosphere of the neon-lit metropolis.

[video: output.10.mp4]

A cinematic movie trailer unfolds with a 30-year-old space man traversing a vast salt desert beneath a brilliant blue sky. He wears a uniquely styled red wool knitted motorcycle helmet, adding an eccentric yet rugged charm to his spacefaring look. As he rides a retro-futuristic vehicle across the shimmering white terrain, the wind kicks up clouds of glittering salt, creating a surreal atmosphere. The scene is captured in a vivid, cinematic style, shot on 35mm film to enhance the nostalgic and dramatic grain. Explosions of color and dynamic camera movements highlight the space man's daring escape from a collapsing alien base in the distance.

[video: output.11.mp4]

@asomoza (Member) left a comment:

Thanks, looks cool! Left some suggestions for the unused imports.

@leffff (Contributor, Author) commented Oct 17, 2025

@yiyixuxu
Done! All your suggested fixes are applied. Ready to merge!

@asomoza (Member) commented Oct 17, 2025

@leffff just want to let you know that I've been testing the 10s model and I'm really impressed with it. I like it a lot, congrats to the team. Can't wait for you to release the I2V one.

[video: kangaroo.mp4]

@leffff (Contributor, Author) commented Oct 17, 2025

@asomoza Great! Gonna add them in the next iteration!

@yiyixuxu (Collaborator) left a comment:

Will merge once CI is green!

@yiyixuxu merged commit 23ebbb4 into huggingface:main on Oct 18, 2025 (28 of 31 checks passed).
@leffff (Contributor, Author) commented Oct 18, 2025

Hurrah!!!

@yiyixuxu (Collaborator) commented:

@leffff looking forward to the follow-up PR for the 10s model!
We are very happy to help too - let me know if you need anything :)

@leffff (Contributor, Author) commented Oct 20, 2025

Hi!
@yiyixuxu how can we make Kandinsky 5 appear here: https://huggingface.co/docs/diffusers/api/pipelines/overview?

@sayakpaul (Member) commented:

You need to add a page like: https://github.com/huggingface/diffusers/blob/main/docs/source/en/api/pipelines/kandinsky.md
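
For reference, a minimal sketch of what such a page could contain, following the [[autodoc]] conventions used by the existing pipeline docs. The file path and description below are illustrative only, and the page also needs an entry in docs/source/en/_toctree.yml to appear in the navigation:

<!-- docs/source/en/api/pipelines/kandinsky5.md (illustrative path) -->
# Kandinsky 5.0

Kandinsky 5.0 Lite is a text-to-video model. <!-- placeholder description -->

## Kandinsky5T2VPipeline

[[autodoc]] Kandinsky5T2VPipeline
  - all
  - __call__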

@leffff (Contributor, Author) commented Oct 20, 2025

Great! Thanks!

@leffff (Contributor, Author) commented Oct 21, 2025

> Just commenting to note that we support all kinds of different attention backends now. So, as long as we implement the attention class in this way, swapping a backend from SDPA ("native" in our terminology) to "flex", for example, should be very easy.
>
> model.set_attention_backend("flex")

Yes, you are right. I tried doing

pipe.transformer.set_attention_backend("flex")

and it almost worked. You see, when I made separate processors, I did this:

import torch
import torch.nn as nn
from torch.nn.attention.flex_attention import flex_attention

# nablaT_v2 is the block-mask-building helper introduced in this PR.


class Kandinsky5NablaAttentionProcessor(nn.Module):
    """Custom attention processor for Nabla (sparse) attention."""

    @torch.compile(mode="max-autotune-no-cudagraphs", dynamic=True)
    def __call__(
        self,
        attn,
        query,
        key,
        value,
        sparse_params=None,
        **kwargs,
    ):
        if sparse_params is None:
            raise ValueError("sparse_params is required for Nabla attention")

        # flex_attention expects (batch, heads, seq, head_dim)
        query = query.transpose(1, 2).contiguous()
        key = key.transpose(1, 2).contiguous()
        value = value.transpose(1, 2).contiguous()

        # build the sparse block mask from the query/key statistics,
        # the STA mask, and the threshold carried in sparse_params
        block_mask = nablaT_v2(
            query,
            key,
            sparse_params["sta_mask"],
            thr=sparse_params["P"],
        )
        out = (
            flex_attention(query, key, value, block_mask=block_mask)
            .transpose(1, 2)
            .contiguous()
        )
        # back to (batch, seq, heads, head_dim), then merge the head dims
        out = out.flatten(-2, -1)
        return out
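
For context, here is a minimal standalone sketch of the flex_attention + block-mask pattern the processor above builds on. The causal mask_mod and the create_block_mask call are illustrative stand-ins: nablaT_v2 (from this PR) derives its block mask from the query/key statistics instead.

import torch
from torch.nn.attention.flex_attention import create_block_mask, flex_attention

def causal(b, h, q_idx, kv_idx):
    # keep only positions on or below the diagonal
    return q_idx >= kv_idx

B, H, S, D = 1, 8, 256, 64
q = torch.randn(B, H, S, D, device="cuda", dtype=torch.bfloat16)
k = torch.randn_like(q)
v = torch.randn_like(q)

block_mask = create_block_mask(causal, B=None, H=None, Q_LEN=S, KV_LEN=S, device="cuda")
out = flex_attention(q, k, v, block_mask=block_mask)  # (B, H, S, D)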

@sayakpaul (Member) commented:

> and it almost worked.

What do you mean? It didn't work as expected or are we good? 👀

@leffff (Contributor, Author) commented Oct 21, 2025

It worked as expected, but that's not the whole story: flex requires additional compilation. Please see #12520
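
For reference, the extra step looks roughly like the pattern from the PyTorch flex_attention docs: without wrapping it in torch.compile, flex_attention runs through a slow eager reference path.

import torch
from torch.nn.attention.flex_attention import flex_attention

# compile once at startup, then reuse for every call
flex_attention = torch.compile(flex_attention, dynamic=True)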

@sayakpaul (Member) commented:

I will reply to that PR.
