fix: use eager attention for SDPA compatibility with transformers >=4.36 #398

Open

majiayu000 wants to merge 2 commits into resemble-ai:master from majiayu000:fix/sdpa-attention-compatibility

Conversation

@majiayu000

Summary

  • Set attn_implementation='eager' when creating LlamaConfig
  • Ensures output_attentions=True works correctly

Problem

Fixes #339

In transformers >=4.36, SDPA became the default attention implementation. However, SDPA doesn't support output_attentions=True, which Chatterbox uses during inference with voice references.

This causes:

ValueError: The `output_attentions` attribute is not supported when using the `attn_implementation` set to sdpa.

Solution

Explicitly set attn_implementation='eager' in LlamaConfig. Eager attention fully supports all features including output_attentions.
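For reference, a minimal sketch of what the change amounts to. The config values below are placeholders rather than Chatterbox's actual settings; attn_implementation is a standard kwarg accepted by LlamaConfig in transformers >=4.36:

import torch
from transformers import LlamaConfig, LlamaModel

# Placeholder dimensions; Chatterbox's real T3 config differs.
cfg = LlamaConfig(
    hidden_size=64,
    intermediate_size=128,
    num_hidden_layers=2,
    num_attention_heads=4,
    attn_implementation="eager",  # the fix: don't fall back to the SDPA default
)
model = LlamaModel(cfg)

# Requesting attention weights now works, since eager attention can return them.
ids = torch.randint(0, cfg.vocab_size, (1, 8))
out = model(ids, output_attentions=True)
print(len(out.attentions), out.attentions[0].shape)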

Impact

  • Voice cloning and voice conversion features now work with modern transformers versions
  • No performance regression for typical use cases, since inference already requests output_attentions=True

Fixes resemble-ai#339

Set attn_implementation='eager' when creating LlamaConfig to avoid
ValueError when using output_attentions=True with transformers >=4.36.

SDPA (Scaled Dot Product Attention) became the default in transformers
>=4.36 but doesn't support output_attentions=True. Using eager attention
ensures compatibility with all features including voice cloning.

Co-Authored-By: Claude <noreply@anthropic.com>
@George0828Zhang

Looking at this comment:

# using it for all layers slows things down too much. We can apply it to just one layer

It is likely intended for a single layer. Is it possible to apply eager only to that layer, and keep the fast SDPA kernel on the other layers?

Instead of globally disabling SDPA optimization for the entire model,
this change applies eager attention only to the specific layers (9, 12, 13)
that need attention weights for alignment analysis.

Changes:
- Remove global attn_implementation='eager' from LlamaConfig
- Modify _add_attention_spy to wrap forward method of specific layers
- Temporarily switch to eager attention only during those layers' forward pass
- Restore original attn_implementation after each layer completes

This preserves SDPA performance benefits for all other layers while
still supporting output_attentions=True for alignment stream analysis.

Addresses review feedback from @George0828Zhang

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@majiayu000
Author

@George0828Zhang Thanks for the great suggestion!

I've updated the implementation to apply eager attention only to the specific layers that need output_attentions=True.

Key changes:

  • Removed global attn_implementation='eager' from LlamaConfig
  • Modified _add_attention_spy() to wrap the forward method of layers 9, 12, 13 (based on LLAMA_ALIGNED_HEADS)
  • Each wrapped layer temporarily switches to eager attention, calls forward with output_attentions=True, then restores the original implementation

This preserves SDPA performance for all other layers while supporting alignment stream analysis.
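Roughly, the wrapper works like the sketch below. The name _add_attention_spy and the layer indices come from this PR, but the attribute paths (model.layers, layer.self_attn.config) and the runtime toggling of config._attn_implementation are illustrative assumptions about how the hook is wired, not the exact Chatterbox code:

ALIGNED_LAYERS = (9, 12, 13)  # layers whose attention maps feed alignment analysis

def _add_attention_spy(model, layer_indices=ALIGNED_LAYERS):
    for idx in layer_indices:
        layer = model.layers[idx]
        orig_forward = layer.forward

        def spy_forward(*args, _orig=orig_forward, _layer=layer, **kwargs):
            cfg = _layer.self_attn.config
            prev = cfg._attn_implementation
            cfg._attn_implementation = "eager"  # eager can return attention weights
            kwargs["output_attentions"] = True  # ask this layer for its weights
            try:
                return _orig(*args, **kwargs)
            finally:
                cfg._attn_implementation = prev  # restore SDPA for subsequent calls

        layer.forward = spy_forward

The default arguments (_orig, _layer) pin each wrapper to its own layer, and the try/finally restores the original implementation even if the layer's forward raises.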

Please take a look and let me know if this addresses your concern!

@George0828Zhang

Thanks. I installed your commit and it seemed to work fine.

@majiayu000
Author

Happy to help!

