@okaris (Contributor) commented Jul 29, 2025

What does this PR do?

https://huggingface.co/posts/a-r-r-o-w/278025275110164

Currently this cache does not work with the latest diffusers; the reason is that the cache context is not properly initialized in the WanImageToVideoPipeline.

Fixes #12012 (which was prematurely closed)
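For context on what the first-block cache is doing, here is a hand-written sketch of the general technique (this is not the diffusers implementation; `FirstBlockCacheSketch` and all its names are hypothetical). After running only the first transformer block, the output is compared to the one from the previous denoising step; if the relative change is below the threshold, the remaining blocks are skipped and a cached residual is reused. It is this per-step state that lives in the cache context the pipeline has to set up.

```python
# Illustrative sketch of the first-block-cache idea, NOT the diffusers code.
# All names here are hypothetical.

class FirstBlockCacheSketch:
    def __init__(self, threshold):
        self.threshold = threshold
        self.prev_first = None       # first-block output from the last full pass
        self.cached_residual = None  # (full output - first-block output)

    def step(self, first_block_out, run_remaining_blocks):
        """first_block_out: list of floats already computed by block 1.
        run_remaining_blocks: callable running the expensive remaining blocks.
        Returns (output, was_cache_hit)."""
        if self.prev_first is not None:
            # Relative L1 change versus the previous step's first-block output.
            num = sum(abs(a - b) for a, b in zip(first_block_out, self.prev_first))
            den = sum(abs(b) for b in self.prev_first) or 1.0
            if num / den < self.threshold:
                # Cache hit: reuse the residual, skip the remaining blocks.
                out = [a + r for a, r in zip(first_block_out, self.cached_residual)]
                return out, True
        # Cache miss: run everything and refresh the cached state.
        full = run_remaining_blocks(first_block_out)
        self.cached_residual = [f - a for f, a in zip(full, first_block_out)]
        self.prev_first = first_block_out
        return full, False
```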

Who can review?

@yiyixuxu @asomoza @a-r-r-o-w @DN6

@yiyixuxu (Collaborator)

It's not a bug; I think we just hadn't added support for the I2V pipeline yet.

cc @a-r-r-o-w, I think it's ok to add it here, no? I will merge PR #12006 first, though, and there will be conflicts.

@a-r-r-o-w (Contributor)

@yiyixuxu Yeah, it should be okay to add here after we merge your PR.

@okaris (Contributor, Author) commented Jul 30, 2025

@yiyixuxu @a-r-r-o-w I rebased my branch onto main, so there shouldn't be conflicts anymore; hope that helps. Thanks!

@a-r-r-o-w (Contributor) left a comment:

Nice, thanks for looking into this!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@a-r-r-o-w a-r-r-o-w merged commit 843e3f9 into huggingface:main Jul 30, 2025
10 of 11 checks passed
@gluttony-10

@okaris @a-r-r-o-w
I used apply_first_block_cache(pipe.transformer, FirstBlockCacheConfig(threshold=0.2)) on the Wan2.2 5B model and it works well.
I then tried
apply_first_block_cache(pipe.transformer, FirstBlockCacheConfig(threshold=0.2))
apply_first_block_cache(pipe.transformer_2, FirstBlockCacheConfig(threshold=0.2))
on the Wan2.2 A14B model, and it works badly: inference runs, but its speed hasn't increased and its quality has deteriorated.
How can I fix it?

@okaris (Contributor, Author) commented Jul 30, 2025

@gluttony-10 you can check my implementation here: https://github.com/inference-sh/grid/blob/main/video/wan2-2-i2v-a14b/inference.py

Overall, 0.2 is a bit too aggressive; I find that 0.1 and 0.05 give better results while still saving some time. It also depends on the content of the video: if there is a lot of movement, the cache may rarely trigger, or it could degrade the results.
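To illustrate that trade-off with a toy example (the relative-change sequences below are invented for illustration, not measured from any model): a denoising step is a cache hit when the first block's output changed by less than the threshold relative to the previous step, so a high-motion clip, with its larger step-to-step changes, gets fewer hits at any given threshold.

```python
# Toy illustration of threshold vs. cache-hit rate. The relative-change
# sequences below are made up, not measured from Wan or any other model.

def cache_hits(rel_changes, threshold):
    """Count denoising steps whose first-block output changed by less than
    `threshold` relative to the previous step (steps the cache would skip)."""
    return sum(1 for c in rel_changes if c < threshold)

low_motion = [0.03, 0.04, 0.06, 0.08, 0.12, 0.05]   # calm clip (hypothetical)
high_motion = [0.15, 0.22, 0.18, 0.25, 0.30, 0.19]  # fast-moving clip (hypothetical)

for t in (0.05, 0.1, 0.2):
    print(f"threshold={t}: low-motion hits={cache_hits(low_motion, t)}, "
          f"high-motion hits={cache_hits(high_motion, t)}")
```

With these made-up numbers, raising the threshold to 0.2 skips every step of the calm clip (speed at the cost of drift), while the fast-moving clip still hits only half the time, matching the observation that high-motion content benefits less.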

@gluttony-10

@okaris
Thanks a lot, it works. I now use:
I use pipe.transformer.disable_cache()
pipe.transformer_2.disable_cache()
pipe.transformer.enable_cache(FirstBlockCacheConfig(threshold=0.1))
pipe.transformer_2.enable_cache(FirstBlockCacheConfig(threshold=0.1))

@okaris (Contributor, Author) commented Jul 30, 2025

@gluttony-10 we are building a new platform, https://inference.sh, for local AI generation; it might fit your channel's content. Let me know if you'd like to try it.

Beinsezii pushed a commit to Beinsezii/diffusers that referenced this pull request Aug 7, 2025
* enable caching for WanImageToVideoPipeline

* ruff format

Development

Successfully merging this pull request may close these issues.

apply_first_block_cache with Wan 2.2 causes ValueError: No context is set. Please set a context before retrieving the state
