@DN6 DN6 commented Mar 6, 2025

What does this PR do?

Fixes # (issue)

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@DN6 DN6 requested review from a-r-r-o-w and yiyixuxu March 6, 2025 16:11
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


@yiyixuxu yiyixuxu left a comment

thanks!

self.norm_added_q = RMSNorm(dim_head * heads, eps=eps)
# Wan applies qk norm across all heads
# Wan also doesn't apply a q norm
self.norm_added_q = None

@DN6 DN6 Mar 7, 2025

@yiyixuxu I discovered an issue when running the slow test for the transformer. The Diffusers implementation has an extra norm_added_q key that the original does not. When converting from the original checkpoint there is no weight to assign to this norm, so it remains a meta tensor, and we run into an error when moving the model to a device.

This PR removes the norm and adds norm_added_q to _keys_to_ignore_on_load_unexpected in the transformer, so that the warning about extra keys in the Diffusers version is suppressed.

The ideal solution is to update the weights in the model repo, but that could take time. This is a fix for the meantime.
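
For readers unfamiliar with the ignore list, here is a minimal sketch of the idea in plain Python. It is not the Diffusers loading code: the helper name find_unexpected_keys and the key names are hypothetical, and it assumes the ignore entries are treated as regex patterns matched against checkpoint keys that the model has no parameter for.

```python
import re


def find_unexpected_keys(checkpoint_keys, model_keys, ignore_patterns=()):
    """Return checkpoint keys the model has no parameter for,
    skipping any key that matches one of the ignore patterns.

    Hypothetical helper for illustration only.
    """
    known = set(model_keys)
    extra = [key for key in checkpoint_keys if key not in known]
    return [
        key
        for key in extra
        if not any(re.search(pattern, key) for pattern in ignore_patterns)
    ]


# Hypothetical key names: the model no longer defines norm_added_q,
# but the already-published checkpoint still carries a weight for it.
model_keys = ["blocks.0.attn2.norm_added_k.weight"]
checkpoint_keys = [
    "blocks.0.attn2.norm_added_k.weight",
    "blocks.0.attn2.norm_added_q.weight",
]

# Without an ignore list, the stale key is reported as unexpected.
without_ignore = find_unexpected_keys(checkpoint_keys, model_keys)

# With the pattern on the ignore list, nothing is reported.
with_ignore = find_unexpected_keys(
    checkpoint_keys, model_keys, ignore_patterns=["norm_added_q"]
)
```

The point of the pattern is only to silence the warning for a key that is known to be stale; the weight itself is simply not loaded.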

sounds good!

@DN6 DN6 merged commit 1357931 into main Mar 7, 2025
14 of 15 checks passed