Labels: Issue - In Progress, bug
Description
Problem (one or two sentences)
When trying to use Sonnet 4.5 through LiteLLM with Google Vertex as the upstream provider, this error shows up:
LiteLLM streaming error: 400 litellm.BadRequestError: VertexAIException BadRequestError - b'{"type":"error","error":{"type":"invalid_request_error","message":"max_tokens: 200000 > 64000, which is the maximum allowed number of output tokens for claude-sonnet-4-5-20250929"},"request_id":"req_vrtx_011CTeGWyomNL2s6LacBN6w5"}'. Received Model Group=claude-sonnet-4-5
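For illustration, the failing streaming request presumably looks roughly like the sketch below. This is a hedged reconstruction from the error message above, not the actual payload Roo Code builds:

```typescript
// Hypothetical sketch of the request body sent to the LiteLLM proxy,
// reconstructed from the error message; field values are assumptions.
const body = {
  model: "claude-sonnet-4-5",
  stream: true,
  // max_tokens was resolved from the model info's max_tokens field
  // (200000, the context window) instead of max_output_tokens (64000),
  // so Vertex rejects the request with a 400.
  max_tokens: 200_000,
  messages: [{ role: "user", content: "..." }],
};
```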
The issue seems to be confusion between max_tokens and max_output_tokens. The problem is most likely this line:
maxTokens: modelInfo.max_tokens || 8192,
In this line, max_output_tokens should be used if available, and max_tokens only as a fallback, as sketched below.
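A minimal sketch of the suggested change, assuming modelInfo carries LiteLLM's model-metadata fields; the interface and the resolveMaxTokens helper are hypothetical names for illustration, not the actual Roo Code source:

```typescript
// Fields of interest on LiteLLM's model metadata; for some entries
// (as in this report) max_tokens holds the context window, while
// max_output_tokens holds the completion-side limit.
interface LiteLLMModelInfo {
  max_tokens?: number;
  max_input_tokens?: number;
  max_output_tokens?: number;
}

// Suggested fix: prefer the explicit output limit, keeping the existing
// fallbacks from the snippet above.
function resolveMaxTokens(modelInfo: LiteLLMModelInfo): number {
  return modelInfo.max_output_tokens || modelInfo.max_tokens || 8192;
}

// With the values implied by the error message (hypothetical model info):
const sonnet45 = { max_tokens: 200_000, max_output_tokens: 64_000 };
console.log(resolveMaxTokens(sonnet45)); // 64000 instead of 200000
```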
Context (who is affected and when)
LiteLLM with Sonnet 4.5 on Google Vertex
Reproduction steps
- Add Sonnet 4.5 through LiteLLM with Google Vertex as the upstream provider
- Send any prompt
Expected result
Prompts should work
Actual result
Prompts fail because the requests ask for 200k output tokens, while the maximum allowed is 64k
Variations tried (optional)
No response
App Version
3.28.14
API Provider (optional)
LiteLLM
Model Used (optional)
Sonnet 4.5 via Google Vertex
Roo Code Task Links (optional)
No response
Relevant logs or errors (optional)
Status: Done