Kandinsky 5 10 sec (NABLA support) #12520

leffff · 2025-10-21T10:44:16Z

This PR adds support for the 10-second Kandinsky 5.0 model family.

import torch
from diffusers import Kandinsky5T2VPipeline
from diffusers.utils import export_to_video

# Load the pipeline
pipe = Kandinsky5T2VPipeline.from_pretrained(
    "ai-forever/Kandinsky-5.0-T2V-Lite-sft-10s-Diffusers", 
    torch_dtype=torch.bfloat16
)
pipe = pipe.to("cuda")

# Generate video
prompt = [
    "Photorealistic closeup video of two intricately detailed pirate ships locked in a fierce battle, complete with cannon fire and billowing sails, as they sail through the swirling waters of a steaming cup of coffee. The ships are miniature but highly realistic, with wooden textures and flags fluttering in the liquid breeze. Coffee splashes and foam ripple around them as they maneuver through the turbulent surface, dodging each other's attacks. A detailed reflection of the battle appears on the glossy surface of the coffee, adding to the dynamic realism. The camera pans and zooms to capture every dramatic moment of the high-seas clash within this tiny, unexpected world.",
    "Bad quality",
]
negative_prompt = "Static, 2D cartoon, cartoon, 2d animation, paintings, images, worst quality, low quality, ugly, deformed, walking backwards"

pipe.transformer.set_attention_backend("flex")

output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=512,
    width=768,
    num_frames=241,
    num_inference_steps=50,
    guidance_scale=5.0,
    num_videos_per_prompt=1,
    generator=torch.Generator(device="cuda").manual_seed(42),  # seeded generator for reproducible sampling
)

output.12.mp4

Co-authored-by: Álvaro Somoza <[email protected]>

Co-authored-by: YiYi Xu <[email protected]>

sayakpaul · 2025-10-22T00:35:19Z

Yes, that should cut it!

HuggingFaceDocBuilderDev · 2025-10-22T00:43:33Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

leffff · 2025-10-22T11:24:35Z

Okay, this seems to be working:

import torch
from diffusers import Kandinsky5T2VPipeline
from diffusers.utils import export_to_video

# Load the pipeline
pipe = Kandinsky5T2VPipeline.from_pretrained(
    "ai-forever/Kandinsky-5.0-T2V-Lite-sft-10s-Diffusers", 
    torch_dtype=torch.bfloat16
)
pipe = pipe.to("cuda")

pipe.transformer.set_attention_backend("flex")
pipe.transformer.compile(mode="max-autotune-no-cudagraphs", dynamic=True)

# Generate video
prompt = "A cat and a dog baking a cake together in a kitchen."
negative_prompt = "Static, 2D cartoon, cartoon, 2d animation, paintings, images, worst quality, low quality, ugly, deformed, walking backwards"

output = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    height=512,
    width=768,
    num_frames=241,
    num_inference_steps=50,
    guidance_scale=5.0,
).frames[0]

# Save the video
export_to_video(output, "output.mp4", fps=24, quality=9)

from: https://huggingface.co/ai-forever/Kandinsky-5.0-T2V-Lite-sft-10s-Diffusers
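A note on `num_frames=241` in the example above: the value is consistent with 10 seconds at the `fps=24` passed to `export_to_video`, plus one extra frame (10 × 24 + 1 = 241). The sketch below only illustrates that arithmetic. The "+1" rule is an assumption inferred from these numbers, not something this PR states, and `frames_for_duration` is a hypothetical helper, not part of diffusers.

```python
def frames_for_duration(seconds: float, fps: int = 24) -> int:
    """Frame count for a clip of `seconds` at `fps`, plus one extra frame.

    Assumption: the "+1" is inferred from num_frames=241 for a 10 s clip
    at 24 fps; it is not confirmed by the pipeline documentation.
    """
    if seconds <= 0 or fps <= 0:
        raise ValueError("seconds and fps must be positive")
    return int(round(seconds * fps)) + 1


# Matches the value used in the example: 10 * 24 + 1
num_frames = frames_for_duration(10)
print(num_frames)
```

If the assumption holds, the result can be passed as `num_frames=num_frames` in the pipeline call instead of hard-coding the value.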

leffff · 2025-10-22T11:28:29Z

@yiyixuxu @sayakpaul
Please review, but I believe we're good. If this PR looks OK, we can close it.
Please also check out PR #12527; we need to merge it ASAP.
Still left on my side: docs and tests.

sayakpaul · 2025-10-22T16:31:04Z

@leffff let's add the tests and docs as well.

yiyixuxu · 2025-10-22T17:55:57Z

ok, let's just use this PR to add docs and tests?

leffff · 2025-10-22T18:27:41Z

Okay

leffff · 2025-10-23T15:12:27Z

Please check out the docs.

yiyixuxu

thanks!

docs/source/en/api/pipelines/kandinsky_v5.md

yiyixuxu · 2025-10-23T17:42:58Z

@bot /style

github-actions · 2025-10-23T17:43:32Z

Style bot fixed some files and pushed the changes.

leffff · 2025-10-24T12:10:18Z

@yiyixuxu plz check the new docs version!

yiyixuxu

looks really good! thanks!

sayakpaul · 2025-10-24T18:31:50Z

@leffff could you also add kandinsky_v5 to _toctree.yml?

leffff · 2025-10-24T19:05:00Z

Okay!

leffff · 2025-10-24T21:46:24Z

@sayakpaul @yiyixuxu done!

leffff and others added 30 commits October 4, 2025 10:10

add transformer pipeline first version

d53f848

updates

7db6093

fix 5sec generation

a0cf07f

Merge branch 'huggingface:main' into main

0bd738f

rewrite Kandinsky5T2VPipeline to diffusers style

c8f3a36

Merge branch 'huggingface:main' into main

86b6c2b

add multiprompt support

723d149

remove prints in pipeline

22e14bd

add nabla attention

70fa62b

Merge branch 'huggingface:main' into main

07e11b2

Wrap Transformer in Diffusers style

45240a7

fix license

43bd1e8

Merge branch 'huggingface:main' into main

f35c279

fix prompt type

149fd53

Merge branch 'main' of https://github.com/leffff/diffusers

e3a3e9d

add gradient checkpointing and peft support

7af80e9

add usage example

04efb19

Merge branch 'main' into main

4aa22f3

Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py

235f0d5

Co-authored-by: Álvaro Somoza <[email protected]>

Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py

88a8eea

Co-authored-by: Álvaro Somoza <[email protected]>

Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py

f52f3b4

Co-authored-by: Álvaro Somoza <[email protected]>

Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py

0190e55

Co-authored-by: Álvaro Somoza <[email protected]>

Update src/diffusers/models/transformers/transformer_kandinsky.py

d62dffc

Co-authored-by: Álvaro Somoza <[email protected]>

remove unused imports

7084106

Merge branch 'huggingface:main' into main

d5dcd94

add 10 second models support

b615d5c

Merge branch 'main' of https://github.com/leffff/diffusers

6a0233e

Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py

588c12a

Co-authored-by: YiYi Xu <[email protected]>

remove no_grad and simplified prompt paddings

327ab84

Update src/diffusers/pipelines/kandinsky5/pipeline_kandinsky.py

9b06afb

Co-authored-by: YiYi Xu <[email protected]>

leffff closed this Oct 21, 2025

leffff reopened this Oct 21, 2025

Merge branch 'main' into main

4ed2f53

leffff added 2 commits October 22, 2025 11:25

all needed changes for 10 sec models are added!

54e7757

Merge branch 'main' of https://github.com/leffff/diffusers

939f7d0

Merge branch 'huggingface:main' into main

91133e0

add docs

25f2e9c

Merge branch 'huggingface:main' into main

e45c036

yiyixuxu reviewed Oct 23, 2025

docs/source/en/api/pipelines/kandinsky_v5.md (two review threads, both outdated and resolved)

github-actions bot and others added 3 commits October 23, 2025 17:44

Apply style fixes

3bbc232

Merge branch 'huggingface:main' into main

e181f13

update docs

dd6bf39

Merge branch 'main' into main

add757b

yiyixuxu approved these changes Oct 24, 2025

leffff added 2 commits October 24, 2025 21:43

add kandinsky5 to toctree

5fb528b

Merge branch 'main' of https://github.com/leffff/diffusers

c9c1190

5 participants
