fix: respect includeMaxTokens option in BaseOpenAiCompatibleProvider (#6936) #6938
Conversation
- Modified `BaseOpenAiCompatibleProvider` to only include the `max_tokens` parameter when the `includeMaxTokens` option is true
- This fixes issue #6936, where Kimi K2 model output was being truncated to 1024 tokens
- Updated tests for all providers that extend `BaseOpenAiCompatibleProvider` (groq, fireworks, chutes, sambanova, zai)
- Added new test cases to verify that `max_tokens` is not included by default and is included when `includeMaxTokens` is true

Fixes #6936
Reviewed my own code. Found bugs I introduced 10 minutes ago. Classic.
```ts
// Only add max_tokens if includeMaxTokens is true
if (this.options.includeMaxTokens === true) {
	// Use user-configured modelMaxTokens if available, otherwise fall back to model's default maxTokens
	params.max_tokens = this.options.modelMaxTokens || info.maxTokens
}
```
I notice that OpenAiHandler uses max_completion_tokens (the modern parameter) instead of the deprecated max_tokens. Should we consider using max_completion_tokens here as well for consistency?
The OpenAI documentation indicates that max_tokens is deprecated in favor of max_completion_tokens. This might cause issues with newer API versions or certain providers that only support the modern parameter.
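If that change were made, the conditional might look like the following. This is only a sketch: the option and model-info shapes are assumed from the diff above, and whether every OpenAI-compatible backend accepts `max_completion_tokens` is an open question.

```ts
interface HandlerOptions {
	includeMaxTokens?: boolean
	modelMaxTokens?: number
}

interface ModelInfo {
	maxTokens?: number
}

// Sketch: prefer the modern parameter while keeping the same opt-in gate.
function addMaxTokens(
	params: { max_completion_tokens?: number },
	options: HandlerOptions,
	info: ModelInfo,
): void {
	if (options.includeMaxTokens === true) {
		// max_completion_tokens supersedes the deprecated max_tokens in the
		// OpenAI Chat Completions API; some compatible backends may still
		// require max_tokens, so a per-provider fallback could be needed.
		params.max_completion_tokens = options.modelMaxTokens || info.maxTokens
	}
}
```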
```ts
// Only add max_tokens if includeMaxTokens is true
if (this.options.includeMaxTokens === true) {
```
Could we add a comment here explaining the behavior?
```diff
 })

-it("createMessage should pass correct parameters to Groq client", async () => {
+it("createMessage should not include max_tokens by default", async () => {
```
Would it be helpful to add a test case that explicitly verifies the behavior when includeMaxTokens is set to false (not just undefined)? This would ensure complete coverage of all possible states.
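Such a test might look like the following. This is a sketch in the same vitest/jest style as the specs in this PR; the handler construction and the `mockCreate` spy are assumed to mirror the existing test setup rather than copied from it.

```ts
it("createMessage should not include max_tokens when includeMaxTokens is explicitly false", async () => {
	// Assumed setup: a handler whose underlying OpenAI client is mocked so
	// that mockCreate records the request body it receives.
	const handler = new GroqHandler({ apiKey: "test-key", includeMaxTokens: false })

	// Drain the stream so the request is actually issued.
	for await (const _chunk of handler.createMessage("system prompt", [])) {
		// no-op
	}

	// Assumes the mocked create receives the request body as its only argument.
	expect(mockCreate).toHaveBeenCalledWith(
		expect.not.objectContaining({ max_tokens: expect.anything() }),
	)
})
```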
```ts
	stream_options: { include_usage: true },
}

// Only add max_tokens if includeMaxTokens is true
```
Consider extracting this logic into a private method like addMaxTokensIfNeeded() similar to the OpenAiHandler implementation. This would improve code organization and make it easier to maintain consistency across providers.
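A sketch of that extraction (the method name follows the suggestion; `ModelInfo` and the exact params type are assumptions about the surrounding code, with `OpenAI` imported from the `openai` package):

```ts
private addMaxTokensIfNeeded(
	params: OpenAI.Chat.Completions.ChatCompletionCreateParamsStreaming,
	info: ModelInfo,
): void {
	// Only add max_tokens when the user explicitly opted in via includeMaxTokens.
	if (this.options.includeMaxTokens === true) {
		// Prefer the user-configured modelMaxTokens, else the model's default cap.
		params.max_tokens = this.options.modelMaxTokens || info.maxTokens
	}
}
```

`createMessage()` would then call `this.addMaxTokensIfNeeded(params, info)` before issuing the request, keeping the gate in one place across providers.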
Closing #6936 (comment)
This PR fixes issue #6936 where the Kimi K2 model output was being truncated to 1024 tokens when using the OpenAI Compatible API provider.
Problem
The `BaseOpenAiCompatibleProvider` class was always including the `max_tokens` parameter in API requests, regardless of the `includeMaxTokens` option setting. This caused issues with certain models like Kimi K2 that have different default token limits.

Solution
- Modified `BaseOpenAiCompatibleProvider` to only include the `max_tokens` parameter when the `includeMaxTokens` option is explicitly set to `true`
- This matches the `OpenAiHandler` class, which already implements this pattern

Changes
- Modified `src/api/providers/base-openai-compatible-provider.ts` to conditionally include `max_tokens`
- Updated tests for all providers that extend `BaseOpenAiCompatibleProvider`: `groq.spec.ts`, `fireworks.spec.ts`, `chutes.spec.ts`, `sambanova.spec.ts`, `zai.spec.ts`
- Added test cases verifying that `max_tokens` is NOT included by default, that `max_tokens` IS included when `includeMaxTokens` is true, and that the custom `modelMaxTokens` value is used when provided
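For illustration, the resulting behavior can be summarized in a small standalone sketch (the option and model-info shapes are assumed to match the fields named above, not the repo's actual types):

```ts
interface ProviderOptions {
	includeMaxTokens?: boolean
	modelMaxTokens?: number
}

interface ModelInfo {
	maxTokens?: number
}

function buildParams(options: ProviderOptions, info: ModelInfo): { max_tokens?: number } {
	const params: { max_tokens?: number } = {}
	// max_tokens is only sent when the user explicitly opted in.
	if (options.includeMaxTokens === true) {
		// A user-configured cap wins; otherwise fall back to the model default.
		params.max_tokens = options.modelMaxTokens || info.maxTokens
	}
	return params
}

console.log(buildParams({}, { maxTokens: 1024 })) // {} -> the provider's own default applies
console.log(buildParams({ includeMaxTokens: true }, { maxTokens: 1024 })) // { max_tokens: 1024 }
console.log(buildParams({ includeMaxTokens: true, modelMaxTokens: 4096 }, { maxTokens: 1024 })) // { max_tokens: 4096 }
```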
Testing
All tests pass successfully.
Fixes #6936
Important
Fixes `max_tokens` inclusion in `BaseOpenAiCompatibleProvider` to respect the `includeMaxTokens` option, with updated tests.

- `BaseOpenAiCompatibleProvider` now includes `max_tokens` only if `includeMaxTokens` is true.
- Uses `modelMaxTokens` if provided, otherwise defaults to the model's `maxTokens`.
- Modified `createMessage()` in `base-openai-compatible-provider.ts` to conditionally include `max_tokens`.
- Updated tests in `groq.spec.ts`, `fireworks.spec.ts`, `chutes.spec.ts`, `sambanova.spec.ts`, and `zai.spec.ts`.
- Added tests verifying `max_tokens` inclusion based on `includeMaxTokens` and custom `modelMaxTokens`.