Conversation


@MuriloFP MuriloFP commented Jul 16, 2025

PR Title: feat: Add Advanced Setting for Custom Max Tokens per Provider Profile (#5784)


Related GitHub Issue

Closes: #5784

Roo Code Task Context (Optional)

Description

This PR adds a new "Max Output Tokens" field in the Advanced Settings section of the provider configuration UI, allowing users to customize the max tokens per provider profile. Previously, Roo Code had a hard-coded limit of 8192 tokens for all API providers, which prevented users from fully utilizing models that support higher token limits.

Key implementation details:

  • Created a new MaxTokensControl component with numeric input validation
  • Integrated the control into the Advanced Settings section of ApiOptions
  • Updated getModelMaxOutputTokens() to respect user-configured modelMaxTokens for all models
  • Added comprehensive translations for all supported locales (23 languages)
  • Maintained backward compatibility with a default of 8192 tokens
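
The clamping behavior described above can be sketched as follows. This is a simplified illustration, not the actual Roo Code implementation; the function name `resolveMaxOutputTokens` is hypothetical:

```typescript
// Hypothetical sketch of how a user-configured modelMaxTokens could be
// reconciled with the model's own limit and the 8192-token default.
const DEFAULT_MAX_TOKENS = 8192

function resolveMaxOutputTokens(
	userMaxTokens: number | undefined, // user-configured modelMaxTokens, if set
	modelMaxTokens: number | undefined, // model's advertised max output tokens
): number {
	const modelCap = modelMaxTokens ?? DEFAULT_MAX_TOKENS
	// No user override: keep the backward-compatible 8192 default,
	// still capped by what the model supports.
	if (userMaxTokens === undefined) {
		return Math.min(DEFAULT_MAX_TOKENS, modelCap)
	}
	// User override: honor it, but never exceed the model's limit.
	return Math.min(userMaxTokens, modelCap)
}
```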

Design choices:

  • Minimum value set to 1000 tokens (reasonable minimum for practical use)
  • Maximum value capped at model's actual max tokens or 200,000
  • Validation errors displayed inline with clear messages
  • Model's supported max tokens shown as a hint to guide users
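
A minimal sketch of validation matching the rules above (minimum 1000, cap at the model's max or 200,000). This is illustrative only; the real `MaxTokensControl` component may differ:

```typescript
// Validation rules from the design choices above.
const MIN_TOKENS = 1000
const ABSOLUTE_MAX_TOKENS = 200_000

// Returns an error message for invalid input, or null when the value is valid.
function validateMaxTokens(value: number, modelMaxTokens?: number): string | null {
	const max = Math.min(modelMaxTokens ?? ABSOLUTE_MAX_TOKENS, ABSOLUTE_MAX_TOKENS)
	if (!Number.isInteger(value)) return "Max output tokens must be a whole number"
	if (value < MIN_TOKENS) return `Max output tokens must be at least ${MIN_TOKENS}`
	if (value > max) return `Max output tokens must not exceed ${max}`
	return null
}
```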

Test Procedure

Automated Tests:

  • ✅ Backend API tests: 22 tests passing in src/shared/__tests__/api.spec.ts
  • ✅ UI Component tests: 10 tests passing in webview-ui/src/components/settings/__tests__/MaxTokensControl.spec.tsx
  • ✅ All ESLint and TypeScript checks passing

Manual Testing Steps:

  1. Navigate to Settings > API Provider configuration
  2. Select any provider (e.g., Anthropic, OpenAI)
  3. Expand the "Advanced Settings" section
  4. Verify the "Max Output Tokens" field appears with default value of 8192
  5. Enter a custom value (e.g., 16000) and save
  6. Switch to a different provider profile and verify it maintains its own value
  7. Try invalid values (< 1000 or > model max) and verify validation errors appear
  8. Use the provider with custom max tokens and verify it's applied in API requests

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Testing: New and/or updated tests have been added to cover my changes (if applicable).
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

[Screenshots should be added showing the new Max Output Tokens field in the Advanced Settings section]

Documentation Updates

  • Yes, documentation updates are required. The settings documentation should be updated to explain the new "Max Output Tokens" field and its impact on token distribution.

Additional Notes

This implementation leverages the existing modelMaxTokens field in ProviderSettings which was previously only used for reasoning models. The field is now extended to work with all models, providing a consistent experience across different provider types.

Get in Touch

@MuriloFP


Important

Introduces customizable max output tokens per provider profile in the UI, with backend support and comprehensive testing.

  • Behavior:
    • Adds "Max Output Tokens" field in provider configuration UI for custom token limits.
    • Default limit remains 8192 tokens; users can set between 1000 and model's max or 200,000.
    • Validation errors for invalid values are shown inline.
  • Components:
    • New MaxTokensControl component with numeric input validation.
    • Integrated into ApiOptions in ApiOptions.tsx.
  • Functions:
    • Updates getModelMaxOutputTokens() to use user-configured modelMaxTokens.
    • Applies changes in chutes.ts, gemini.ts, and glama.ts.
  • Testing:
    • 22 backend API tests in api.spec.ts.
    • 10 UI tests in MaxTokensControl.spec.tsx.
  • Translations:
    • Adds translations for 23 languages.
  • Misc:
    • Maintains backward compatibility with default 8192 tokens.

This description was created by Ellipsis for 8ade97f.

@MuriloFP MuriloFP requested review from cte, jr and mrubens as code owners July 16, 2025 18:14
@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Jul 16, 2025
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request UI/UX UI/UX related or focused labels Jul 16, 2025

@daniel-lxs daniel-lxs left a comment


LGTM

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to Issue [Unassigned] in Roo Code Roadmap Jul 31, 2025
@daniel-lxs daniel-lxs moved this from Issue [Unassigned] to PR [Needs Review] in Roo Code Roadmap Jul 31, 2025
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jul 31, 2025

mrubens commented Jul 31, 2025

Hmm, not sure the setting is sticking correctly for me

[Screenshots: settings UI, 2025-07-30 at 10:27 PM and 10:28 PM]

The issue was that OpenAI-compatible providers (Chutes, Groq) were directly using model.info.maxTokens instead of calling getModelMaxOutputTokens(). This meant that the user's custom modelMaxTokens setting was being ignored.

Fixed by:
- Updating BaseOpenAiCompatibleProvider to use getModelMaxOutputTokens()
- Updating ChutesHandler's getCompletionParams to use getModelMaxOutputTokens()

This ensures that when users set a custom max output tokens value in the settings, it will be properly applied to API requests for all OpenAI-compatible providers.
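
The bug and fix described above can be shown with a toy example; the types and function names here are hypothetical, not the real Roo Code API:

```typescript
interface ModelInfo {
	maxTokens: number // model's advertised max output tokens
}
interface Settings {
	modelMaxTokens?: number // user's custom setting, if any
}

// Buggy path: reads the model's static limit directly, ignoring the user.
function buggyMaxTokens(info: ModelInfo, _settings: Settings): number {
	return info.maxTokens
}

// Fixed path: routes through a shared resolver so the user's custom
// value applies, capped at the model's actual limit.
function fixedMaxTokens(info: ModelInfo, settings: Settings): number {
	return Math.min(settings.modelMaxTokens ?? info.maxTokens, info.maxTokens)
}
```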
@MuriloFP MuriloFP force-pushed the feat/issue-5784-custom-max-tokens branch from 52a8eef to 6a4653a Compare July 31, 2025 03:27
@MuriloFP MuriloFP marked this pull request as draft July 31, 2025 03:41
@MuriloFP

> Hmm, not sure the setting is sticking correctly for me

The providers I hadn't tested were getting the max output tokens directly from the model, so I need to change them all. The commit above fixed it for Chutes, but I still need to update the remaining providers. I've marked this as draft and will get it done ASAP; I don't have access to all of the providers, so I'll test the ones I can.

MuriloFP added 2 commits July 31, 2025 00:53
… all providers

- Updated BaseOpenAiCompatibleProvider to use getModelMaxOutputTokens()
- Fixed ChutesHandler to respect user's custom max tokens
- Fixed LiteLLM createMessage and completePrompt methods
- Fixed Glama createMessage and completePrompt methods
- Fixed Unbound createMessage and completePrompt methods
- Fixed Mistral getModel method to use getModelMaxOutputTokens()
- Fixed XAI to use getModelMaxOutputTokens()
- Fixed OpenAI addMaxTokensIfNeeded to use getModelMaxOutputTokens()
- Fixed Gemini to use maxTokens from getModel() which already applies user settings

This ensures that when users set a custom max output tokens value in their provider settings, it will be respected across all providers (capped to the model's actual maximum).
- Fixed test expectation to properly cap user's modelMaxTokens to model's actual capability
- Added new test case for when user sets lower max tokens than model supports
- Removed debug logging

MuriloFP commented Jul 31, 2025

Fixed the failing test by updating the test expectation to match the correct behavior.

The test was expecting that a user could set modelMaxTokens higher than what the model actually supports (32000 > 4096). However, the correct behavior is to cap the user's request to the model's actual capability. This ensures we don't request more tokens than the model can provide.

The implementation correctly uses Math.min(userSetting, modelMax) to handle this.
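
As a worked example of the capping rule, using the numbers from the corrected test:

```typescript
// A 32000-token request against a model that supports 4096 is capped to 4096.
const userSetting = 32000
const modelMax = 4096
const effectiveMaxTokens = Math.min(userSetting, modelMax)
```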

////

Still have to test the providers.

…or OpenAI compatible providers

- Reverted BaseOpenAiCompatibleProvider to use maxTokens directly from model info
- OpenAI compatible providers have their own server-side max output configuration
- Hidden the generic MaxTokensSlider for OpenAI compatible provider in the UI
- This ensures OpenAI compatible providers use their own max tokens configuration
@MuriloFP

@mrubens
I've made changes to the following providers:

  • Chutes
  • Glama
  • Mistral
  • xAI
  • Gemini
  • Unbound
  • LiteLLM

I've tested all of them except Unbound and LiteLLM, and the ones I tested seem to be working fine. Previously they took the max tokens directly from the model; now they use the custom setting.

@MuriloFP MuriloFP marked this pull request as ready for review July 31, 2025 16:19
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jul 31, 2025
@daniel-lxs daniel-lxs moved this from PR [Needs Review] to PR [Changes Requested] in Roo Code Roadmap Jul 31, 2025
@mechanicmuthu

What happens if the user's max tokens setting is decreased in the middle of a conversation and the current context is already larger than the new setting?

@daniel-lxs daniel-lxs moved this from PR [Changes Requested] to PR [Draft / In Progress] in Roo Code Roadmap Aug 18, 2025
@daniel-lxs daniel-lxs marked this pull request as draft August 18, 2025 17:53
@github-project-automation github-project-automation bot moved this from PR [Draft / In Progress] to Done in Roo Code Roadmap Sep 23, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Sep 23, 2025