
Conversation

@roomote roomote bot (Contributor) commented Aug 11, 2025

This PR fixes the GPT-5 token limit issue by capping the max output tokens to 10k, preventing context window overflow when input approaches the 272k limit.

Problem

GPT-5 models allow up to 128k output tokens. When the input approaches the 272k input limit, reserving the full output budget can push the combined total past the 400k context window, causing API errors.
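
As a back-of-the-envelope check, the budget arithmetic looks like this (the numbers come from the paragraph above; the snippet is illustrative, not code from the PR):

    // Token budget from the PR text: 400k total context, 272k input, 128k output.
    const CONTEXT_WINDOW = 400_000
    const MAX_INPUT = 272_000
    const MAX_OUTPUT = 128_000
    // Reserving the full 128k output alongside a near-272k input already
    // saturates the window, so any overshoot in accounting exceeds 400k.
    console.log(MAX_INPUT + MAX_OUTPUT >= CONTEXT_WINDOW) // true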

Solution

As suggested in cline/cline#5474 (comment), this PR limits the max output tokens to 10k for all GPT-5 models (gpt-5, gpt-5-mini, gpt-5-nano).

Changes

  • Added special handling in getModelMaxOutputTokens() to detect GPT-5 models and cap their output at 10k tokens (sketched after this list)
  • Users can still set a lower value via settings, but anything higher is capped at 10k
  • Added comprehensive test coverage for the new behavior
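
A minimal sketch of that change, assuming a simplified signature (the real getModelMaxOutputTokens() takes richer model info; only the settings.modelMaxTokens override and the 10k cap come from this PR):

    const GPT5_MAX_OUTPUT_TOKENS = 10_000

    function getModelMaxOutputTokens(
        modelId: string,
        settings?: { modelMaxTokens?: number },
    ): number | undefined {
        if (modelId.startsWith("gpt-5")) {
            // Honor a lower user override, but never exceed the 10k cap.
            const userMaxTokens = settings?.modelMaxTokens
            return userMaxTokens
                ? Math.min(userMaxTokens, GPT5_MAX_OUTPUT_TOKENS)
                : GPT5_MAX_OUTPUT_TOKENS
        }
        return undefined // non-GPT-5 models keep their existing behavior
    }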

Testing

  • All existing tests pass
  • Added 6 new tests specifically for GPT-5 token limiting behavior
  • Verified that non-GPT-5 models are not affected

Fixes #6856


Important

Caps GPT-5 models' max output tokens to 10k in getModelMaxOutputTokens() to prevent context window overflow, with comprehensive testing added.

  • Behavior:
    • Caps GPT-5 models' max output tokens to 10k in getModelMaxOutputTokens() to prevent context window overflow.
    • Allows user override for lower values but caps at 10k.
    • Non-GPT-5 models remain unaffected.
  • Testing:
    • Added 6 new tests in api.spec.ts for GPT-5 token limiting behavior.
    • Verified existing tests pass and non-GPT-5 models are unaffected.

This description was created by Ellipsis for 054ad59.

…verflow

- Added special handling for GPT-5 models in getModelMaxOutputTokens()
- Limits max output to 10k tokens as recommended in cline/cline#5474 (comment)
- Prevents context window overflow when input approaches 272k token limit
- Added comprehensive tests for GPT-5 token limiting behavior

Fixes #6856
@roomote roomote bot requested review from cte, jr and mrubens as code owners August 11, 2025 23:07
@dosubot dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and bug (Something isn't working) labels Aug 11, 2025
@roomote roomote bot (Contributor Author) left a comment


Reviewing my own code is like debugging in a mirror - everything looks backward but the bugs are still mine.

    // Allow user override via settings, but cap at 10k
    const userMaxTokens = settings?.modelMaxTokens
    if (userMaxTokens) {
        return Math.min(userMaxTokens, 10000)

@roomote roomote bot (Contributor Author) commented on the line above:

Consider extracting this magic number to a named constant like GPT5_MAX_OUTPUT_TOKENS = 10000 for better maintainability. Even I'm having to count zeros here.

    }

    // Special handling for GPT-5 models to prevent context window overflow
    // Limit max output to 10k tokens as per https://github.com/cline/cline/issues/5474#issuecomment-3172109387

@roomote roomote bot (Contributor Author) commented on the lines above:

The comment could be more descriptive about WHY we're limiting to 10k. Consider: // Limit max output to 10k tokens to prevent total context window overflow (400k) when input approaches the 272k limit

        format: "openai",
    })

    expect(result).toBe(10_000)

@roomote roomote bot (Contributor Author) commented on the line above:

Inconsistent number formatting: the implementation uses 10000 but the tests mix 10_000 and 10000. Consider standardizing on underscore separators throughout for better readability.

    expect(result).toBe(10_000)
    })

    test("should respect user override for GPT-5 models but cap at 10k", () => {

@roomote roomote bot (Contributor Author) commented on the line above:

Consider adding a test case for when settings.modelMaxTokens is undefined or 0 to ensure the default 10k is applied correctly in those edge cases.
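
A sketch of those edge-case tests in vitest syntax (the import path and the simplified call shape are assumptions, mirroring the sketch in the PR description above):

    import { test, expect } from "vitest"
    import { getModelMaxOutputTokens } from "./api" // hypothetical path

    test("defaults to 10k when settings.modelMaxTokens is undefined", () => {
        expect(getModelMaxOutputTokens("gpt-5", {})).toBe(10_000)
    })

    test("defaults to 10k when settings.modelMaxTokens is 0", () => {
        // 0 is falsy, so the override branch is skipped and the cap applies.
        expect(getModelMaxOutputTokens("gpt-5", { modelMaxTokens: 0 })).toBe(10_000)
    })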

@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Review] in Roo Code Roadmap Aug 12, 2025
@RooCodeInc RooCodeInc deleted a comment from roomote bot Aug 12, 2025
@RooCodeInc RooCodeInc deleted a comment from roomote bot Aug 12, 2025
@RooCodeInc RooCodeInc deleted a comment from roomote bot Aug 12, 2025
@daniel-lxs (Member) commented:

@roomote-agent use switch_tool to switch to pr-fixer and fix this PR

@roomote roomote bot (Contributor Author) commented Aug 12, 2025

I see your request to switch to pr-fixer mode. Let me analyze and fix this PR for you.

- Extract GPT5_MAX_OUTPUT_TOKENS as a named constant (10,000)
- Improve model detection with more specific regex pattern
- Add comprehensive documentation explaining the context window overflow issue
- Add tests for date-suffixed models and invalid model name patterns
- Update all test assertions to use the named constant

@roomote roomote bot (Contributor Author) commented Aug 12, 2025

✅ PR has been successfully improved!

Summary of Changes

I have enhanced the GPT-5 token limit implementation with the following improvements:

🔧 Code Quality Improvements

  1. Extracted magic number as a named constant: Created GPT5_MAX_OUTPUT_TOKENS = 10_000 for better maintainability
  2. Improved model detection: Replaced the simple startsWith() check with a more specific regex pattern /^gpt-5(-mini|-nano)?(-\d{4}-\d{2}-\d{2})?$/i that properly matches GPT-5 model variants (see the example after this list)
  3. Enhanced documentation: Added comprehensive JSDoc comments explaining the context window overflow issue and linking to the relevant GitHub issue
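
For illustration, that pattern in action (the regex is quoted from point 2 above; the helper name is hypothetical):

    const GPT5_MODEL_PATTERN = /^gpt-5(-mini|-nano)?(-\d{4}-\d{2}-\d{2})?$/i
    const isGpt5Model = (modelId: string): boolean => GPT5_MODEL_PATTERN.test(modelId)

    isGpt5Model("gpt-5")            // true
    isGpt5Model("gpt-5-mini")       // true
    isGpt5Model("gpt-5-2025-08-07") // true (date-suffixed variant)
    isGpt5Model("gpt-5-turbo")      // false (not a recognized variant)

Note that a later commit in this thread reverts this regex in favor of the simple startsWith('gpt-5') check.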

✅ Test Coverage

  • Updated all test assertions to use the named constant
  • Added new test cases for:
    • Date-suffixed GPT-5 models (e.g., gpt-5-2025-08-07)
    • Invalid model names that should NOT be treated as GPT-5 models

📊 CI Status

All checks are passing:

  • ✅ Compile
  • ✅ Unit tests (Ubuntu & Windows)
  • ✅ Integration tests
  • ✅ Code analysis
  • ✅ Translations
  • ✅ Knip
  • ✅ CodeQL

The PR is now ready for review with improved code quality and maintainability while preserving the original functionality.

- Removed complex regex pattern for detecting GPT-5 models
- Now using simple startsWith('gpt-5') check as requested
- Updated tests to reflect the new behavior where any model starting with 'gpt-5' is treated as a GPT-5 model
@daniel-lxs daniel-lxs closed this Aug 12, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Aug 12, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Aug 12, 2025
@daniel-lxs daniel-lxs reopened this Aug 12, 2025
@github-project-automation github-project-automation bot moved this from Done to New in Roo Code Roadmap Aug 12, 2025
@github-project-automation github-project-automation bot moved this from Done to Triage in Roo Code Roadmap Aug 12, 2025
@hannesrudolph hannesrudolph added the Issue/PR - Triage (New issue. Needs quick review to confirm validity and assign labels.) label Aug 12, 2025
@daniel-lxs daniel-lxs closed this Aug 12, 2025
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Aug 12, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Aug 12, 2025

Labels

  • bug: Something isn't working
  • Issue/PR - Triage: New issue. Needs quick review to confirm validity and assign labels.
  • size:L: This PR changes 100-499 lines, ignoring generated files.

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

[BUG] OpenAI GPT 5 Token Limits

4 participants