-
Notifications
You must be signed in to change notification settings - Fork 2.6k
fix: improve Claude Sonnet 4 1M context window beta header handling #7230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Fixed beta header array initialization to prevent mutation of model betas - Added proper deduplication of beta headers to avoid duplicates - Ensured context-1m-2025-08-07 beta is properly combined with prompt caching beta - Added comprehensive tests for 1M context window feature Fixes #7229
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewing my own code is like debugging in a mirror - everything looks backwards but the bugs are still mine.
| betas.push("prompt-caching-2024-07-31") | ||
| } | ||
| // Only set headers if we have betas to include | ||
| return betas.length > 0 ? { headers: { "anthropic-beta": betas.join(",") } } : undefined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is returning undefined here intentional when betas.length is 0? The API might expect headers to always be present. Could we consider returning an empty headers object instead to avoid potential issues?
| let { id: modelId, betas: modelBetas, maxTokens, temperature, reasoning: thinking } = this.getModel() | ||
|
|
||
| // Initialize betas array properly | ||
| const betas: string[] = modelBetas ? [...modelBetas] : [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we extract this beta header initialization and deduplication logic into a separate method? Something like prepareBetaHeaders(modelBetas, modelId) would make this more maintainable and easier to extend for future beta features.
| let { id: modelId, betas: modelBetas, maxTokens, temperature, reasoning: thinking } = this.getModel() | ||
|
|
||
| // Initialize betas array properly | ||
| const betas: string[] = modelBetas ? [...modelBetas] : [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call on creating a copy here to prevent mutation! Might be worth adding a comment explaining why we need to copy modelBetas to prevent mutating the original model configuration.
| if (modelId === "claude-sonnet-4-20250514" && this.options.anthropicBeta1MContext) { | ||
| betas.push("context-1m-2025-08-07") | ||
| // Only add if not already present | ||
| if (!betas.includes("context-1m-2025-08-07")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider using a Set for handling beta headers instead of checking with includes() and pushing. It would be more efficient and cleaner:
| if (!betas.includes("context-1m-2025-08-07")) { | |
| const betaSet = new Set(modelBetas || []); | |
| if (modelId === "claude-sonnet-4-20250514" && this.options.anthropicBeta1MContext) { | |
| betaSet.add("context-1m-2025-08-07"); | |
| } | |
| const betas = Array.from(betaSet); |
| }) | ||
| }) | ||
|
|
||
| describe("createMessage with 1M context beta", () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a test case to verify behavior when the beta header is already present in the model's betas array? This would ensure our deduplication logic works correctly and prevent duplicate headers from being sent.
This PR fixes an issue where the Claude Sonnet 4 1M context window beta feature was not working correctly, causing 429 rate limit errors at ~239K tokens even for Tier 4 users.
Problem
Users on Tier 4 were experiencing 429 rate limit errors when using Claude Sonnet 4 with the 1M context window beta enabled, hitting the error at around 239K tokens instead of being able to use the full 1M context window.
Solution
The issue was in how the beta headers were being handled in the AnthropicHandler:
betasarray was being initialized from the model's betas which could be mutatedChanges
context-1m-2025-08-07beta header is properly combined with the prompt caching betaTesting
Fixes #7229
Important
Fixes beta header handling in
AnthropicHandlerfor Claude Sonnet 4 1M context window, ensuring proper initialization and deduplication.AnthropicHandlerto prevent mutation of model betas and ensure proper deduplication.context-1m-2025-08-07beta header is combined with prompt caching beta.anthropic.spec.tsto verify 1M context window is enabled whenanthropicBeta1MContextis true.createMessageinanthropic.tsto initialize beta headers correctly and add deduplication checks.This description was created by
for 5006092. You can customize this summary. It will automatically update as commits are pushed.