
Conversation

@mrubens (Collaborator) commented Feb 25, 2025

#1173

  • This issue was caused by the 64,000 max tokens setting exposing bugs in the context window math. But in general, that math was messy and overcomplicated. This change simplifies it to just truncate the conversation by half if context + max tokens would extend beyond the window (see the sketch after this list). In general, I don't think it's worth optimizing for caching vs. non-caching models right now (especially since DeepSeek and OpenAI cache behind the scenes but are not marked as prompt caching in our model lists)
  • However, there's another issue where the 64,000 (or, in OpenRouter's case, 128,000) max tokens would take up a ton of the context window. So, I centralized the max token logic for the various Anthropic providers (at least most of them 😬) and set Sonnet 3.7 to 16k and the rest to 8k. (We previously had code to set all Anthropic models to an 8k max tokens, but we were not using that same override in the truncation math.)
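
Below is a minimal sketch of the simplified rule, with hypothetical names and signatures; the actual logic lives in truncateConversationIfNeeded() in sliding-window/index.ts.

```typescript
type Message = { role: "user" | "assistant"; content: string }

// Sketch only: if the current context plus the reserved output budget would
// overflow the context window, drop half of the conversation. The first
// message is always kept, and the number of removed messages is rounded
// down to an even count so user/assistant pairs stay intact.
function truncateConversationIfNeeded(
  messages: Message[],
  totalTokens: number,
  contextWindow: number,
  maxTokens: number,
): Message[] {
  if (totalTokens + maxTokens <= contextWindow) {
    return messages // everything fits; leave the conversation alone
  }
  const fracToRemove = 0.5 // fixed truncation fraction
  const rawCount = Math.floor((messages.length - 1) * fracToRemove)
  const evenCount = rawCount - (rawCount % 2) // round down to even
  return [messages[0], ...messages.slice(evenCount + 1)]
}
```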

Important

Dynamic handling of maxTokens for anthropic/ models and simplified truncation logic in sliding-window with updated tests.

  • Behavior:
    • maxTokens dynamically set using this.getModel().info.maxTokens for models starting with anthropic/ in GlamaHandler, OpenRouterHandler, and UnboundHandler.
    • Simplified truncateConversationIfNeeded() in sliding-window/index.ts to use a fixed truncation fraction of 0.5.
  • Tests:
    • Updated sliding-window.test.ts to reflect changes in truncation logic, including new test cases for different maxTokens scenarios.
  • Model Handling:
    • Updated ClineProvider.ts to dynamically set maxTokens for models starting with anthropic/ using a switch statement (sketched after this list).
    • Adjusted anthropicModels in api.ts to reflect new maxTokens values.
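
A hedged sketch of that switch-based override follows; the helper name and model IDs are illustrative, but the 16k/8k split matches the description above.

```typescript
// Sketch only: centralize the output-token budget for anthropic/ models.
// getMaxTokensForModel and the model IDs below are assumptions, not the
// exact source.
function getMaxTokensForModel(modelId: string, infoMaxTokens?: number): number {
  if (!modelId.startsWith("anthropic/")) {
    // Non-Anthropic models keep whatever their model info reports.
    return infoMaxTokens ?? 8_192
  }
  switch (modelId) {
    case "anthropic/claude-3.7-sonnet":
      return 16_384 // Sonnet 3.7 gets the larger 16k output budget
    default:
      return 8_192 // every other Anthropic model is capped at 8k
  }
}
```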

This description was created by Ellipsis for 8e2b877d985723ac9609f7c68d37401e5efb1398. It will automatically update as commits are pushed.

@mrubens requested a review from cte as a code owner on February 25, 2025 20:24
@changeset-bot commented Feb 25, 2025

⚠️ No Changeset found

Latest commit: 91ef9fb

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types.
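
For reference, a changeset is a small markdown file committed under .changeset/; a minimal example (the package name and summary below are assumptions) looks like:

```md
---
"roo-cline": patch
---

Simplify context window truncation math and centralize Anthropic maxTokens.
```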

@dosubot added the size:L label (This PR changes 100-499 lines, ignoring generated files) on Feb 25, 2025
@mrubens force-pushed the fix_context_window_truncation_math branch from c7a264f to 8e2b877 on February 25, 2025 20:25
@mrubens force-pushed the fix_context_window_truncation_math branch from 8e2b877 to 91ef9fb on February 25, 2025 20:30
@dosubot added the lgtm label (This PR has been approved by a maintainer) on Feb 25, 2025
@mrubens merged commit 46576e0 into main on Feb 25, 2025
11 checks passed
@mrubens deleted the fix_context_window_truncation_math branch on February 25, 2025 20:52