feat: Update Vertex AI Claude Sonnet models to 1M context window #8685
Conversation
- Update contextWindow from 200k to 1M for claude-sonnet-4@20250514
- Update contextWindow from 200k to 1M for claude-sonnet-4-5@20250929
- Add tests to verify 1M context window configuration
- Addresses issue #8671 per Google Vertex AI documentation
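The description above amounts to a two-field change in the Vertex model table; a minimal sketch (object and field names are illustrative, not the repo's exact shape):

```typescript
// Illustrative fragment of the Vertex model table after the change described above.
const vertexModels: Record<string, { contextWindow: number }> = {
	"claude-sonnet-4@20250514": { contextWindow: 1_000_000 }, // was 200_000
	"claude-sonnet-4-5@20250929": { contextWindow: 1_000_000 }, // was 200_000
}
```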
@mrubens @cte @hannesrudolph Any update here?

Need this asap!

@mrubens @cte @hannesrudolph bump again!
This should allow compatibility with both context windows. To enable 1M context in Vertex AI, it needs the `anthropic-beta: context-1m-2025-08-07` header. The model names should have the `[1m]` suffix. My configuration on Claude Code works with this setup.
@dot-agi do we know if that header is required for Vertex, though? And is it for both 4 and 4.5? I don't see it in any of the documentation.
@roomote per a footnote on this page, Claude requires the `anthropic-beta: context-1m-2025-08-07` header.
Thanks for the note. We will add dedicated [1m] variants for the Vertex AI Claude Sonnet models and send the required header when those are selected:

- claude-sonnet-4@20250514[1m]
- claude-sonnet-4-5@20250929[1m]

We will wire the `anthropic-beta: context-1m-2025-08-07` header in the Vertex path and include tests. Will update this PR shortly.
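The [1m]-variant plan above can be sketched as a small helper that strips the suffix before calling the SDK and reports whether the beta header should be sent (function and field names are hypothetical, not the PR's actual code):

```typescript
// Sketch of [1m]-variant handling; the beta header value comes from the PR discussion.
const BETA_1M_HEADER = "context-1m-2025-08-07"

// Strip the [1m] suffix so the SDK sees the real model id, but remember it was present.
function resolveVertexModel(apiModelId: string): { sdkModelId: string; use1m: boolean } {
	const use1m = apiModelId.endsWith("[1m]")
	const sdkModelId = use1m ? apiModelId.slice(0, -"[1m]".length) : apiModelId
	return { sdkModelId, use1m }
}
```

When `use1m` is true, the request would carry `anthropic-beta: context-1m-2025-08-07`; otherwise no extra header is sent.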
…hropic-beta header on Vertex; restore base to 200k and add tests
Summary: added explicit [1m] Vertex AI variants for Claude Sonnet and wired the required header. Local provider tests pass. Monitoring CI checks now and will follow up once they complete.
No issues found.
…atisfy ModelInfo DTS build
…m] or setting; keep Computer Use enabled on [1m]; normalize [1m] suffix for SDK
… with ModelInfo type
Summary addressing pricing, [1m] suffix, beta header, and Computer Use

Changes pushed to feat/vertex-1m-context-sonnet:

Notes
No issues found.
CI status update
Change scope in this PR is limited to:

Behavioral updates per feedback:
Request
CI status: compile and unit tests passed; integration-test failed due to a single timeout in "Markdown List Rendering › Should render nested lists with proper hierarchy". The failure is unrelated to the Vertex pricing updates in useSelectedModel.ts and the provider logic in AnthropicVertexHandler.createMessage(). Integration job logs: https://github.com/RooCodeInc/Roo-Code/actions/runs/18879675443/job/53878976071 Requesting a maintainer rerun of the integration-test job. If preferred, I can push a trivial rebase to retrigger.
CI failure details (appears unrelated to this PR)
Change scope in this PR is limited to:

Summary of functional changes
Request
@roomote this "over 200k tokens input" pattern is becoming somewhat common across the industry; Gemini does it for Gemini 2.5 Pro. Make sure the implementation is somewhat generalized and not specific to just the Vertex AI provider.
Thanks for the note. Agreed that "over 200k tokens input" is becoming a cross-provider pattern. I will generalize support across providers rather than focusing only on Vertex:

I will start by refactoring the pricing and capability normalization, then update the UI hook and tests. Will push to this branch shortly.
…ply in selection logic (Vertex, Anthropic, Gemini, Bedrock)
Update: generalized the "over 200k tokens input" large-input tier across providers (not just Vertex). What changed:

Files touched:

Validation:

This makes the implementation generic and future-proof for providers that introduce tiered pricing or context above 200k.
```typescript
// Enable 1M context beta when using [1m] variants or when explicitly enabled via settings for Sonnet 4/4.5
const use1m =
	this.options.apiModelId?.endsWith("[1m]") === true ||
	((id === "claude-sonnet-4@20250514" || id === "claude-sonnet-4-5@20250929") &&
		this.options.anthropicBeta1MContext === true)
```
The code checks this.options.anthropicBeta1MContext which doesn't exist in the Vertex provider settings schema. The vertexSchema in provider-settings.ts (line 238) doesn't include anthropicBeta1MContext - that property only exists in anthropicSchema (line 199). This creates a critical discrepancy: when users enable largeInputTierEnabled for Vertex Sonnet 4/4.5 models, the frontend will charge 1M pricing but the backend won't send the required anthropic-beta: context-1m-2025-08-07 header, resulting in users being charged for 1M context while only receiving 200K. The check should use this.options.largeInputTierEnabled instead to match the frontend logic in useSelectedModel.ts line 253.
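A minimal sketch of the check the reviewer suggests, with option names taken from this discussion (the shape of the options object is assumed):

```typescript
// Option names mirror the PR discussion; the interface shape is an assumption.
interface VertexOptions {
	apiModelId?: string
	largeInputTierEnabled?: boolean
}

// Use the setting the frontend actually writes (largeInputTierEnabled),
// so pricing and the anthropic-beta header stay in sync.
function shouldUse1mBeta(options: VertexOptions, id: string): boolean {
	return (
		options.apiModelId?.endsWith("[1m]") === true ||
		((id === "claude-sonnet-4@20250514" || id === "claude-sonnet-4-5@20250929") &&
			options.largeInputTierEnabled === true)
	)
}
```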
Critical issue found: Backend does not check largeInputTierEnabled setting. This would cause users to be charged for 1M context while only receiving 200K. Changes required before merge.
Re-review complete. The critical issue regarding largeInputTierEnabled support in the backend remains unresolved. The latest commits improve Gemini tier handling but do not address the previously identified backend issue.
@roomote note that pricing changes within the same chat. One request is below 200k input tokens (billed at the cheaper rate), the next is over 200k (billed at the higher rate), the next below 200k again, and so on. Throughout a session (at least with Gemini 2.5 Pro and the two Sonnet models), with the same model identifier, there can (but will not always) be a price shift. This appears to be a bug in the overall cost calculator for all these models.
Thanks for the heads-up. Confirmed: the pricing tier can flip per request within the same chat for Gemini 2.5 Pro and Claude Sonnet across providers. I will update this PR to make the estimator tier-aware per request, not per session:
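A per-request tier-aware estimate can be sketched as below; the rates are Gemini 2.5 Pro's published per-million-token prices, used here only as an example, and the function is illustrative rather than the repo's actual calculator:

```typescript
// Per-request cost estimate: the tier is decided by THIS request's input size,
// never cached from earlier requests in the session.
// Example rates (Gemini 2.5 Pro style): <=200k input: $1.25/M in, $10/M out;
// >200k input: $2.50/M in, $15/M out.
function costForRequest(inputTokens: number, outputTokens: number): number {
	const overLargeInputThreshold = inputTokens > 200_000
	const inputPrice = overLargeInputThreshold ? 2.5 : 1.25
	const outputPrice = overLargeInputThreshold ? 15 : 10
	return (inputTokens / 1e6) * inputPrice + (outputTokens / 1e6) * outputPrice
}
```

Calling this once per request naturally reproduces the cheaper/higher/cheaper billing sequence described above.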
…ecting named service tiers
Re-review complete. The latest changes to cost.ts look good and do not introduce any new issues. However, the critical issue regarding anthropicBeta1MContext vs largeInputTierEnabled in src/api/providers/anthropic-vertex.ts remains unresolved and must be fixed before this PR can be merged.
Summary: pricing now switches tiers per request within the same chat; CI green

What changed
Validation
Effect
Commit on branch
@roomote try to move the config stuff to packages/types/src/providers/vertex.ts (and the other providers' folders), since this might be something that happens elsewhere on other models going forward.
Acknowledged, @nickcatal. I will move the model config into packages/types/src/providers/vertex.ts and mirror the pattern for other providers. Follow-up commits incoming on this branch.
…elper and consume in webview Implements feedback to move config logic into packages/types for reuse across providers.
Latest refactoring correctly centralizes Vertex pricing logic. No new issues found in this commit. However, the critical backend issue remains unfixed.
Update: centralized Vertex Claude Sonnet pricing and 1M context window logic in shared types. The webview model selection now delegates to that helper. Region-aware pricing for Sonnet 4.5 is preserved and the generic large-input tier is respected. Tests pass locally and CI checks are green. This establishes a reusable pattern for other providers.
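A hedged sketch of what a shared helper like getVertexAdjustedModelInfo() could look like; the 2x input / 1.5x output multipliers follow Anthropic's published 1M long-context pricing for Sonnet, but the function shape and field names here are assumptions:

```typescript
// Minimal ModelInfo shape for illustration; the real type in packages/types has more fields.
interface ModelInfo {
	contextWindow: number
	inputPrice: number // USD per million input tokens
	outputPrice: number // USD per million output tokens
}

// Return the 1M-tier variant of a base Sonnet model entry when requested.
// Multipliers mirror Anthropic's long-context pricing (2x input, 1.5x output).
function getVertexAdjustedModelInfo(base: ModelInfo, use1m: boolean): ModelInfo {
	if (!use1m) return base
	return {
		...base,
		contextWindow: 1_000_000,
		inputPrice: base.inputPrice * 2,
		outputPrice: base.outputPrice * 1.5,
	}
}
```

Keeping this in packages/types lets both the webview hook and the provider backend derive pricing from one place, which is the maintainers' stated goal.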
It seems like this PR needs to be created from scratch again |
This PR updates the context window for Claude Sonnet models in Vertex AI from 200k to 1M tokens.
Changes
- `claude-sonnet-4@20250514`: context window from 200,000 to 1,000,000 tokens
- `claude-sonnet-4-5@20250929`: context window from 200,000 to 1,000,000 tokens

Testing
Context
This change aligns with the increased context window capacity now available for Claude Sonnet models on Vertex AI.
Requested via GitHub comment: "@roomote any way you can create a PR here?"
Important
Update Vertex AI Claude Sonnet models to support 1M context window and adjust pricing, with added tests for verification.
- Update `claude-sonnet-4@20250514` and `claude-sonnet-4-5@20250929` to 1,000,000 tokens in `vertex.ts`.
- Add `getVertexAdjustedModelInfo()` in `vertex.ts`.
- Enable the 1M context beta via the `[1m]` suffix or `largeInputTierEnabled` in `anthropic-vertex.ts`.
- Add tests in `anthropic-vertex.spec.ts` for 1M context window handling.
- Update `useSelectedModel.spec.ts` for pricing and context window adjustments.
- Add `largeInputTierEnabled` to `provider-settings.ts` for toggling context window tiers.
- Update `calculateApiCostInternal()` in `cost.ts` to handle tier-based pricing.

This description was created by [bot] for 549ed0b.