@a-r-r-o-w
Contributor

While training LoRAs, I found a few things that slipped by when we merged HunyuanVideo. This PR:

  • Enables gradient checkpointing in transformer
  • Fixes the encode implementation in VAE
  • Lowers the default frame tile size. At 768x512 resolution, a tile size of 64 requires ~24 GB of memory, while a tile size of 16 requires only ~8 GB
  • Adds docstring for transformer
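
To illustrate why a smaller frame tile size lowers peak memory, here is a minimal sketch of temporal tiling: a video is processed in overlapping chunks of frames, so only one chunk has to be held in memory at a time. The helper name `frame_tiles`, the frame count of 129, and the stride values are assumptions chosen for illustration. This is not the VAE code from this PR.

```python
def frame_tiles(num_frames: int, tile_size: int, stride: int) -> list[tuple[int, int]]:
    """Return (start, end) frame ranges that cover num_frames with overlapping tiles.

    Hypothetical helper for illustration only; not part of diffusers.
    """
    if tile_size <= 0 or stride <= 0:
        raise ValueError("tile_size and stride must be positive")
    if stride > tile_size:
        raise ValueError("stride larger than tile_size would leave gaps between tiles")
    if num_frames <= 0:
        return []

    tiles = []
    start = 0
    while True:
        # Clamp the last tile so it never runs past the end of the video.
        end = min(start + tile_size, num_frames)
        tiles.append((start, end))
        if end >= num_frames:
            break
        start += stride
    return tiles


# Illustrative comparison (frame count and strides are assumed values).
for tile_size, stride in [(64, 48), (16, 12)]:
    tiles = frame_tiles(129, tile_size, stride)
    peak = max(end - start for start, end in tiles)
    print(f"tile_size={tile_size}: {len(tiles)} tiles, at most {peak} frames per tile")
```

With these assumed numbers, the larger setting covers the video in 3 tiles of up to 64 frames each, while the smaller one uses 11 tiles of up to 16 frames each. The trade-off is more passes through the model in exchange for a smaller working set per pass.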

@a-r-r-o-w a-r-r-o-w requested a review from yiyixuxu December 19, 2024 01:57
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Collaborator

@yiyixuxu yiyixuxu left a comment

thanks!

@a-r-r-o-w
Contributor Author

The test now also matches the VAE in the model test with the VAE in the transformer test, for consistency.

@yiyixuxu
Collaborator

feel free to merge!

@a-r-r-o-w a-r-r-o-w merged commit f781b8c into main Dec 19, 2024
14 of 15 checks passed
@a-r-r-o-w a-r-r-o-w deleted the hunyuan-fixes branch December 19, 2024 04:58
Foundsheep pushed a commit to Foundsheep/diffusers that referenced this pull request Dec 23, 2024
sayakpaul pushed a commit that referenced this pull request Dec 23, 2024