
Conversation

@adi776borate (Contributor) commented Dec 9, 2025

What does this PR do?

Fixes #12809. This PR does so by:

  1. Removing the @torch.autocast decorator (fixes the import warning).
  2. Explicitly casting inputs to float32 inside the forward method (preserves the required numerical stability).
  3. Casting the result back to weight.dtype before passing it to the Linear layers (fixes the dtype-mismatch crash); a sketch of the resulting forward pass follows this list.
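
For concreteness, here is a minimal sketch of the resulting time-embedding forward pass, assembled from the diff hunks quoted in the review threads below. The class name, __init__, and frequency construction are illustrative stand-ins, not the actual diffusers code:

import math

import torch
import torch.nn as nn

class TimeEmbeddingsSketch(nn.Module):
    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        # Hypothetical sinusoidal frequency table (dim // 2 entries).
        self.register_buffer(
            "freqs", torch.exp(-math.log(10000.0) * torch.arange(0, dim, 2) / dim)
        )
        self.in_layer = nn.Linear(dim, hidden_dim)

    def forward(self, time):
        # Do the numerically sensitive sinusoid math in float32 ...
        time = time.to(dtype=torch.float32)
        freqs = self.freqs.to(device=time.device, dtype=torch.float32)
        args = torch.outer(time, freqs)
        time_embed = torch.cat([torch.cos(args), torch.sin(args)], dim=-1)
        # ... then cast back to the Linear layer's weight dtype so the
        # matmul dtypes agree when the pipeline is loaded in fp16/bf16.
        time_embed = time_embed.to(dtype=self.in_layer.weight.dtype)
        return self.in_layer(time_embed)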

Verification

I verified that the results remain stable before and after this change by generating images with a fixed seed (generator=torch.manual_seed(42)).

The results are almost the same with some minor differences.

Before fix vs. after fix:

[Images: kandinsky_before_fix, kandinsky_after_fix]
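
To quantify "almost the same", the two PNGs produced by the reproduction script below can be compared pixel-wise. A small sketch (assumes both images exist and have the same size):

import numpy as np
from PIL import Image

before = np.asarray(Image.open("kandinsky_before_fix.png"), dtype=np.float32)
after = np.asarray(Image.open("kandinsky_after_fix.png"), dtype=np.float32)
diff = np.abs(before - after)
print(f"max abs pixel diff:  {diff.max():.1f} / 255")
print(f"mean abs pixel diff: {diff.mean():.3f} / 255")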
Reproduction Script
import torch
from diffusers import Kandinsky5T2IPipeline

model_id = "kandinskylab/Kandinsky-5.0-T2I-Lite-sft-Diffusers"
device = "cuda" if torch.cuda.is_available() else "cpu"

dtype = torch.bfloat16 if torch.cuda.is_bf16_supported() else torch.float32
pipe = Kandinsky5T2IPipeline.from_pretrained(model_id, torch_dtype=dtype)
pipe.to(device)

seed = 42
generator = torch.Generator(device=device).manual_seed(seed)

print("Generating image...")
output = pipe(
    prompt="A cat and a dog baking a cake together in a kitchen.",
    negative_prompt="",
    num_inference_steps=25, # Reduced for faster verification
    guidance_scale=3.5,
    height=1024,
    width=1024,
    generator=generator, 
)

image = output.images[0]
image.save("kandinsky_after_fix.png")

Who can review?

@yiyixuxu @leffff
Anyone in the community is free to review the PR once the tests have passed.

@leffff (Contributor) commented Dec 9, 2025

Looks good to me!

@knd0331 commented Dec 10, 2025

Thanks for the quick fix! I didn't have time to submit a PR myself, so I really appreciate you jumping on this. 🙏
@adi776borate

@adi776borate (Contributor, Author) commented:

@yiyixuxu @sayakpaul
A gentle ping to review

@sayakpaul (Member) left a comment:

Thank you! Could you also provide your testing script?

@adi776borate (Contributor, Author) replied:

> Thank you! Could you also provide your testing script?

The verification script is already provided in the PR description above.
If you want a minimal test, you can just do:

from diffusers.models.transformers import transformer_kandinsky
print("Import successful.")

On main this should emit a UserWarning (on non-CUDA systems), but on this branch it should not.
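
A scriptable version of that check could capture the warning explicitly. A sketch (run in a fresh interpreter so the module is not already imported and cached):

import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    from diffusers.models.transformers import transformer_kandinsky  # noqa: F401

# On main (non-CUDA machine) this prints the autocast UserWarning;
# on this branch the list should stay empty.
for w in caught:
    print(w.category.__name__, w.message)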

@sayakpaul sayakpaul requested a review from yiyixuxu December 11, 2025 12:00

@yiyixuxu (Collaborator) left a comment:

thanks!

@yiyixuxu (Collaborator) commented:

@bot /style

@github-actions (bot) commented Dec 11, 2025

Style bot fixed some files and pushed the changes.

@HuggingFaceDocBuilderDev (bot) commented:

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Review thread on transformer_kandinsky.py (diff context):

-@torch.autocast(device_type="cuda", dtype=torch.float32)
 def forward(self, time):
-    args = torch.outer(time, self.freqs.to(device=time.device))
+    time = time.to(dtype=torch.float32)

@yiyixuxu (Collaborator) commented:

Suggested change
-    time = time.to(dtype=torch.float32)
+    original_dtype = time.dtype
+    time = time.to(dtype=torch.float32)

Further diff context:

+    freqs = self.freqs.to(device=time.device, dtype=torch.float32)
+    args = torch.outer(time, freqs)
+    time_embed = torch.cat([torch.cos(args), torch.sin(args)], dim=-1)
+    time_embed = time_embed.to(dtype=self.in_layer.weight.dtype)

@yiyixuxu (Collaborator) commented:

Suggested change
-    time_embed = time_embed.to(dtype=self.in_layer.weight.dtype)
+    time_embed = time_embed.to(dtype=original_dtype)

@adi776borate (Contributor, Author) replied:

The reason I cast to self.in_layer.weight.dtype instead of original_dtype is to prevent runtime crashes on backends like XPU, as mentioned by @vladmandic here.
If users load the pipeline in float16 and we pass time_embed as float32, that will raise an error, won't it?
I might be wrong; correct me if so.
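
That concern is easy to demonstrate in isolation. A minimal sketch, independent of diffusers:

import torch

# A Linear layer held in half precision, as when loading with torch_dtype=torch.float16.
lin = torch.nn.Linear(8, 8, dtype=torch.float16)
x = torch.randn(2, 8, dtype=torch.float32)

try:
    lin(x)
except RuntimeError as e:
    print(e)  # e.g. "mat1 and mat2 must have the same dtype ..."

# Casting to the weight's dtype first avoids the crash:
out = lin(x.to(lin.weight.dtype))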


Review thread on a second forward (diff context):

-@torch.autocast(device_type="cuda", dtype=torch.float32)
 def forward(self, x):
+    x = x.to(dtype=self.out_layer.weight.dtype)

@yiyixuxu (Collaborator) commented:

umm actually this did not look correct to me - we want to upcast it to float32, no?


@adi776borate (Contributor, Author) replied:

Similarly, if we force x to float32 here, we might hit the same mismatch crash if the out_layer weights are float16/bfloat16.
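
In other words, the pattern on this branch (as I read the diff) is: keep upstream math in float32, and convert right at the boundary of the Linear call. A hedged sketch with an illustrative module name:

import torch
import torch.nn as nn

class OutLayerSketch(nn.Module):
    # Hypothetical stand-in for the module under review.
    def __init__(self, dim: int):
        super().__init__()
        self.out_layer = nn.Linear(dim, dim)

    def forward(self, x):
        # x may arrive as float32 from upstream float32-sensitive math;
        # match the weight dtype (fp16/bf16 under torch_dtype) instead of
        # forcing float32, so the matmul dtypes agree on all backends.
        x = x.to(dtype=self.out_layer.weight.dtype)
        return self.out_layer(x)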



Development

Successfully merging this pull request may close this issue:

Kandinsky5TimeEmbeddings hardcodes 'cuda' in @torch.autocast decorator, causing warning on non-CUDA systems (#12809)
