forked from cline/cline
Closed
Labels
Issue/PR - Triage: New issue. Needs quick review to confirm validity and assign labels.
bug: Something isn't working
Description
Problem (one or two sentences)
GLM 4.6 Turbo via Chutes doesn't work because the configured max output token count is incorrect. I suggest setting the max output token count to 20% of the 200k context window, i.e. 40k, so the model starts working correctly.
Context (who is affected and when)
Everyone who tries to use GLM 4.6 Turbo via the Chutes provider
Reproduction steps
- Create a new API configuration with the GLM 4.6 Turbo model via the Chutes provider
- Send a sample message
- Observe an error similar to the following: "Requested token count exceeds the model's maximum context length of 202752 tokens. You requested a total of 233093 tokens: 30341 tokens from the input messages and 202752 tokens for the completion. Please reduce the number of tokens in the input messages or the completion to fit within the limit."
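The arithmetic in the error message can be checked directly; it shows the full context window being requested as the completion budget on top of the input tokens:

```typescript
// Numbers taken from the error message above, assuming the provider
// currently sends the full context window as the completion budget.
const contextLength = 202_752
const inputTokens = 30_341
const completionTokens = 202_752 // entire window requested for the output

// Total the API sees, which it compares against the context length.
const totalRequested = inputTokens + completionTokens // 233_093
const exceedsLimit = totalRequested > contextLength // true, so the request is rejected
```

Any nonzero input pushes the total past the limit, which is why every message fails, not just long ones.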
Expected result
The model outputs messages correctly.
Actual result
The request fails with an error because the requested token count exceeds the model's maximum context length.
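A minimal sketch of the proposed fix, assuming a Roo Code-style model info shape (the interface and field names here are illustrative, not the actual codebase types): cap the max output tokens at 20% of the context window instead of the full window.

```typescript
// Hypothetical model-info shape; actual Roo Code types may differ.
interface ModelInfo {
  contextWindow: number
  maxTokens: number
}

const GLM_4_6_TURBO_CONTEXT = 202_752 // context length reported by the API error

// Proposed entry: max output tokens = 20% of the context window,
// leaving the remaining 80% for input messages.
const glm46Turbo: ModelInfo = {
  contextWindow: GLM_4_6_TURBO_CONTEXT,
  maxTokens: Math.floor(GLM_4_6_TURBO_CONTEXT * 0.2), // 40_550, roughly the 40k suggested above
}
```

With this cap, the failing request from the reproduction (30,341 input tokens) would total well under the 202,752-token limit.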
Variations tried (optional)
No response
App Version
3.29.2
API Provider (optional)
Chutes AI
Model Used (optional)
GLM-4.6-turbo
Roo Code Task Links (optional)
No response
Relevant logs or errors (optional)
Metadata
Assignees
Labels
Type
Projects
Status
Done