apply_first_block_cache with Wan 2.2 causes ValueError: No context is set. Please set a context before retrieving the state #12012

@luke14free

Description

Describe the bug

Following @a-r-r-o-w's guide here: https://huggingface.co/posts/a-r-r-o-w/278025275110164

I tried both

apply_first_block_cache(pipe.transformer, FirstBlockCacheConfig(threshold=0.2))

and

pipe.transformer.enable_cache(FirstBlockCacheConfig(threshold=input_data.cache_threshold))

but both raise the same error I saw another user report in the FirstBlockCacheConfig PR: the context is never set.
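
For reference, here are both call sites together with the imports they need (the import path follows the guide above; `threshold=0.2` stands in for `input_data.cache_threshold` in my setup):

from diffusers.hooks import apply_first_block_cache, FirstBlockCacheConfig

# Functional form from the guide:
apply_first_block_cache(pipe.transformer, FirstBlockCacheConfig(threshold=0.2))

# Convenience method on the transformer, which should dispatch to the same hook:
pipe.transformer.enable_cache(FirstBlockCacheConfig(threshold=0.2))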

Reproduction

import torch
from diffusers import WanImageToVideoPipeline
from diffusers.hooks import FirstBlockCacheConfig
from PIL import Image

# Load the pipeline
model_id = "Wan-AI/Wan2.2-TI2V-5B-Diffusers"
pipe = WanImageToVideoPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)

# Move to device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
pipe.to(device)

# Enable cache with FirstBlockCacheConfig
print("Attempting to enable cache...")
cache_config = FirstBlockCacheConfig(threshold=0.2)
pipe.transformer.enable_cache(cache_config)
print("Cache enabled successfully")

# Test basic inference
# Create a dummy image
dummy_image = Image.new('RGB', (512, 512), color='red')

# Generate video
output = pipe(
    image=dummy_image,
    prompt="A simple test",
    num_frames=25,
    height=512,
    width=512,
    num_inference_steps=10,  # Reduced for quick test
)
print("Inference completed successfully")

Logs

[t+49s920ms] [ERROR] Traceback (most recent call last):
[t+49s920ms]   File "/server/tasks.py", line 50, in run_task
[t+49s920ms]     output = await result
[t+49s920ms]              ^^^^^^^^^^^^
[t+49s920ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/src/inference.py", line 128, in run
[t+49s921ms]     output = self.pipe(
[t+49s921ms]              ^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
[t+49s921ms]     return func(*args, **kwargs)
[t+49s921ms]            ^^^^^^^^^^^^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/diffusers/pipelines/wan/pipeline_wan_i2v.py", line 764, in __call__
[t+49s921ms]     noise_pred = current_model(
[t+49s921ms]                  ^^^^^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
[t+49s921ms]     return self._call_impl(*args, **kwargs)
[t+49s921ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
[t+49s921ms]     return forward_call(*args, **kwargs)
[t+49s921ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/accelerate/hooks.py", line 175, in new_forward
[t+49s921ms]     output = module._old_forward(*args, **kwargs)
[t+49s921ms]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/diffusers/models/transformers/transformer_wan.py", line 517, in forward
[t+49s921ms]     hidden_states = block(hidden_states, encoder_hidden_states, timestep_proj, rotary_emb)
[t+49s921ms]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
[t+49s921ms]     return self._call_impl(*args, **kwargs)
[t+49s921ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
[t+49s921ms]     return forward_call(*args, **kwargs)
[t+49s921ms]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/diffusers/hooks/hooks.py", line 189, in new_forward
[t+49s921ms]     output = function_reference.forward(*args, **kwargs)
[t+49s921ms]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/diffusers/hooks/first_block_cache.py", line 90, in new_forward
[t+49s921ms]     shared_state: FBCSharedBlockState = self.state_manager.get_state()
[t+49s921ms]                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[t+49s921ms]   File "/inferencesh/apps/gpu/24vpjytqdbkfh2156cj7bnqexw/venv/3.12/lib/python3.12/site-packages/diffusers/hooks/hooks.py", line 44, in get_state
[t+49s921ms]     raise ValueError("No context is set. Please set a context before retrieving the state.")
[t+49s921ms] ValueError: No context is set. Please set a context before retrieving the state.
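
Reading the traceback: the FirstBlockCache hook calls state_manager.get_state() inside the transformer's forward, and get_state() requires a cache context to be active, which nothing in the 0.33.1 Wan image-to-video pipeline appears to set. A possible workaround, sketched below and building on the reproduction above, is to enter a cache context manually around the call. This is an assumption, not a confirmed fix: cache_context is the context-manager helper on the transformer's CacheMixin in more recent diffusers releases and may not exist in 0.33.1.

# Hypothetical workaround (unverified): enter a cache context before the
# forward passes so state_manager.get_state() has a context to read from.
# `cache_context` is assumed to exist on the transformer; check your
# installed diffusers version before relying on it.
with pipe.transformer.cache_context("cond"):
    output = pipe(
        image=dummy_image,
        prompt="A simple test",
        num_frames=25,
        height=512,
        width=512,
        num_inference_steps=10,
    )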

System Info

  • 🤗 Diffusers version: 0.33.1
  • Platform: Linux-5.15.0-136-generic-x86_64-with-glibc2.35
  • Running on Google Colab?: No
  • Python version: 3.10.12
  • PyTorch version (GPU?): 2.7.1+cu126 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.33.0
  • Transformers version: 4.52.4
  • Accelerate version: 1.8.1
  • PEFT version: 0.16.0
  • Bitsandbytes version: not installed
  • Safetensors version: 0.5.3
  • xFormers version: not installed
  • Accelerator: 8× NVIDIA A100-SXM4-80GB, 81920 MiB
  • Using GPU in script?: yes
  • Using distributed or parallel set-up in script?: no

Who can help?

@yiyixuxu @a-r-r-o-w @DN6
