
Conversation

Contributor

@mosleyit commented May 7, 2025

Related GitHub Issue

Closes: #1173

Description

This PR implements token counting for all Anthropic direct API models to prevent context window limit errors. The implementation:

  1. Uses Anthropic's token counting API to accurately count tokens before sending requests
  2. Proactively checks if the token count approaches the context window limit for each model
  3. Implements adaptive truncation based on how far over the limit we are
  4. Adds verification after truncation to ensure we stay under the limit
  5. Uses a safety buffer (1k tokens) to prevent hitting exact limits

Key implementation details:

  • Added a new countMessageTokens method to count tokens for entire message requests (see the sketch after this list)
  • Modified the sliding window implementation to handle all Anthropic models
  • Implemented model-specific context window handling
  • Added comprehensive tests for multiple Anthropic models
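
For illustration, a minimal sketch of the proactive flow described above, assuming the Anthropic SDK's messages.countTokens endpoint; the helper name, truncation strategy, and fraction formula here are placeholders rather than the PR's exact code:

import Anthropic from "@anthropic-ai/sdk"

const client = new Anthropic()
const SAFETY_BUFFER = 1_000 // stay 1k tokens below the context window

async function ensureWithinContextWindow(
  model: string,
  contextWindow: number,
  system: string,
  messages: Anthropic.MessageParam[],
): Promise<Anthropic.MessageParam[]> {
  const safeLimit = contextWindow - SAFETY_BUFFER
  const count = async (msgs: Anthropic.MessageParam[]) =>
    (await client.messages.countTokens({ model, system, messages: msgs })).input_tokens

  let tokens = await count(messages)
  while (tokens > safeLimit && messages.length > 2) {
    // Truncate a larger fraction the further we are over the limit, dropping
    // messages in pairs to preserve user/assistant alternation.
    const fraction = Math.min(0.9, 0.5 + (tokens - safeLimit) / tokens)
    const drop = Math.max(2, 2 * Math.floor((messages.length * fraction) / 2))
    messages = messages.slice(drop)
    // Re-count to verify we actually landed under the limit.
    tokens = await count(messages)
  }
  return messages
}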

Test Procedure

  1. Added unit tests in src/api/providers/__tests__/anthropic-token-counting.test.ts (a sketch of one such test follows this list) that verify:

    • Token counting for content blocks
    • Token counting for complete messages
    • Conversation truncation when token limits are exceeded
    • Behavior across multiple Anthropic models (Claude 3.7 Sonnet, Claude 3 Opus, Claude 3 Haiku)
  2. Manual testing steps:

    • Create a conversation with Claude 3.7 Sonnet that approaches the token limit
    • Verify that the conversation is truncated appropriately
    • Check console logs for token count warnings and truncation information
  3. Run tests with: npx jest src/api/providers/__tests__/anthropic-token-counting.test.ts
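
A unit test along the lines of step 1 might look roughly like this; the import path, constructor options, and method names (countMessageTokens, a handler-level truncateConversationIfNeeded) are assumptions for the sketch, not the PR's actual test code:

import { AnthropicHandler } from "../anthropic"

describe("AnthropicHandler token counting", () => {
  it("truncates the conversation when the safe limit is exceeded", async () => {
    const handler = new AnthropicHandler({ apiKey: "test", apiModelId: "claude-3-7-sonnet-20250219" })

    // First count is over the 200k window; the re-count after truncation is under it.
    jest
      .spyOn(handler, "countMessageTokens")
      .mockResolvedValueOnce(250_000)
      .mockResolvedValue(50_000)

    const messages = Array.from({ length: 100 }, (_, i) => ({
      role: i % 2 === 0 ? ("user" as const) : ("assistant" as const),
      content: `message ${i}`,
    }))

    const truncated = await handler.truncateConversationIfNeeded("system prompt", messages)
    expect(truncated.length).toBeLessThan(messages.length)
  })
})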

Type of Change

  • 🐛 Bug Fix: Non-breaking change that fixes an issue.
  • ✨ New Feature: Non-breaking change that adds functionality.
  • 💥 Breaking Change: Fix or feature that would cause existing functionality to not work as expected.
  • ♻️ Refactor: Code change that neither fixes a bug nor adds a feature.
  • 💅 Style: Changes that do not affect the meaning of the code (white-space, formatting, etc.).
  • 📚 Documentation: Updates to documentation files.
  • ⚙️ Build/CI: Changes to the build process or CI configuration.
  • 🧹 Chore: Other changes that don't modify src or test files.

Pre-Submission Checklist

  • Issue Linked: This PR is linked to an approved GitHub Issue (see "Related GitHub Issue" above).
  • Scope: My changes are focused on the linked issue (one major feature/fix per PR).
  • Self-Review: I have performed a thorough self-review of my code.
  • Code Quality:
    • My code adheres to the project's style guidelines.
    • There are no new linting errors or warnings (npm run lint).
    • All debug code (e.g., console.log) has been removed.
  • Testing:
    • New and/or updated tests have been added to cover my changes.
    • All tests pass locally (npm test).
    • The application builds successfully with my changes.
  • Branch Hygiene: My branch is up-to-date (rebased) with the main branch.
  • Documentation Impact: I have considered if my changes require documentation updates (see "Documentation Updates" section below).
  • Changeset: A changeset has been created using npm run changeset if this PR includes user-facing changes or dependency updates.
  • Contribution Guidelines: I have read and agree to the Contributor Guidelines.

Screenshots / Videos

N/A - This change doesn't affect the UI.

Documentation Updates

  • No documentation updates are required.

Additional Notes

This implementation addresses the issue where Claude 3.7 Sonnet was exceeding its 200k context window limit. The solution now works for all Anthropic models by using their token counting API and implementing adaptive truncation based on each model's specific context window size.


Important

Implements token counting and adaptive truncation for Anthropic models to prevent exceeding context window limits, with comprehensive tests added.

  • Behavior:
    • Implements token counting for Anthropic models using countMessageTokens in anthropic.ts.
    • Adds adaptive truncation logic in createMessage() and completePrompt() to prevent exceeding context window limits.
    • Introduces a safety buffer of 1k tokens below the context window limit.
  • Tests:
    • Adds anthropic-token-counting.test.ts to test token counting and truncation for multiple models.
    • Tests include scenarios for token counting, message truncation, and handling of different models.
  • Constants:
    • Defines CLAUDE_MAX_SAFE_TOKEN_LIMIT in constants.ts and sliding-window/index.ts to avoid circular dependencies.
  • Misc:
    • Updates truncateConversationIfNeeded() in sliding-window/index.ts to handle Anthropic models specifically.

This description was created by Ellipsis for c9a8c27. You can customize this summary. It will automatically update as commits are pushed.

changeset-bot bot commented May 7, 2025

⚠️ No Changeset found

Latest commit: c9a8c27

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

* @param model The model ID
* @returns A promise resolving to the token count
*/
async countMessageTokens(
Collaborator

I think we already have this implemented above in the countTokens method

Contributor Author

Thanks for the review! I see the confusion, but there's an important distinction between the existing countTokens method and my implementation:

  1. The existing countTokens method only counts tokens for individual content blocks, not the entire message request. It takes an array of ContentBlockParam as input and wraps it in a single user message for counting.

  2. My new countMessageTokens method counts tokens for the complete message request including the system prompt and all conversation messages. It takes the system prompt and an array of messages as input, providing a more accurate token count for the entire request.

This distinction is crucial because:

  • The context window limit applies to the entire request, not just individual content blocks
  • Issue #1173 (Claude Sonnet 3.7 exceeds 200k context window) occurs when the complete message (system prompt + all messages) exceeds the 200k token limit
  • My implementation adds proactive token counting and adaptive truncation before sending the request

While both methods use the Anthropic API, my implementation provides a more comprehensive solution that specifically addresses the issue where Claude 3.7 Sonnet was exceeding its context window limit.

The existing countTokens method is still used as a fallback in my implementation if the API call fails, ensuring robustness.
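
To make the distinction concrete, here is a sketch of the two call shapes against the SDK; systemPrompt and conversation are stand-ins for the real request:

import Anthropic from "@anthropic-ai/sdk"

const client = new Anthropic()
const model = "claude-3-7-sonnet-20250219"

declare const systemPrompt: string
declare const conversation: Anthropic.MessageParam[]

async function compareCounts() {
  // Existing countTokens: one array of content blocks wrapped in a single
  // user message (no system prompt, no conversation history).
  const blockCount = await client.messages.countTokens({
    model,
    messages: [{ role: "user", content: [{ type: "text", text: "Hello" }] }],
  })

  // New countMessageTokens: the complete request, i.e. the system prompt
  // plus every message in the conversation history.
  const requestCount = await client.messages.countTokens({
    model,
    system: systemPrompt,
    messages: conversation,
  })

  return { blocks: blockCount.input_tokens, request: requestCount.input_tokens }
}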

@mosleyit marked this pull request as ready for review May 7, 2025 14:47
@mosleyit requested a review from cte as a code owner May 7, 2025 14:47
@dosubot added the size:L (This PR changes 100-499 lines, ignoring generated files) and bug (Something isn't working) labels May 7, 2025
}
})

describe("AnthropicHandler Token Counting", () => {
Contributor

Consider adding tests that simulate failures in the token counting API (e.g. when countTokens rejects) to verify that the fallback logic in countMessageTokens is correctly used.

This comment was generated because it violated the following rules: mrule_oAUXVfj5l9XxF01R and mrule_OR1S8PRRHcvbdFib.
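
Such a test might look roughly like this; how the handler exposes its Anthropic client, and the method names, are assumptions for the sketch:

it("falls back to per-block counting when the counting API rejects", async () => {
  const handler = new AnthropicHandler({ apiKey: "test", apiModelId: "claude-3-7-sonnet-20250219" })

  // Simulate the token counting endpoint failing.
  jest
    .spyOn((handler as any).client.messages, "countTokens")
    .mockRejectedValue(new Error("503 Service Unavailable"))

  // Spy on the existing per-block path so we can assert the fallback ran.
  const fallback = jest.spyOn(handler, "countTokens").mockResolvedValue(42)

  const count = await handler.countMessageTokens("system", [{ role: "user", content: "hi" }])

  expect(fallback).toHaveBeenCalled()
  expect(count).toBeGreaterThan(0)
})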

const safeTokenLimit = Math.min(contextWindow - 1000, CLAUDE_MAX_SAFE_TOKEN_LIMIT)

// If token count exceeds the safe limit, truncate the conversation
if (tokenCount > safeTokenLimit) {
Contributor

Consider using a structured logging mechanism (with proper log levels) rather than using console.log and console.warn directly, to improve production log clarity.

This comment was generated because it violated a code review rule: mrule_OR1S8PRRHcvbdFib.
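
For example, a minimal leveled wrapper (purely illustrative; a real change would likely reuse the project's existing logging utility instead):

// Reusing tokenCount and safeTokenLimit from the snippet above.
declare const tokenCount: number
declare const safeTokenLimit: number

const logger = {
  debug: (msg: string, meta?: Record<string, unknown>) => console.debug(`[sliding-window] ${msg}`, meta),
  warn: (msg: string, meta?: Record<string, unknown>) => console.warn(`[sliding-window] ${msg}`, meta),
}

if (tokenCount > safeTokenLimit) {
  logger.warn("token count exceeds safe limit; truncating conversation", { tokenCount, safeTokenLimit })
}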

@hannesrudolph moved this from New to PR [Pre Approval Review] in Roo Code Roadmap May 7, 2025
bgilbert6 pushed a commit to bgilbert6/Roo-Code that referenced this pull request May 14, 2025
@hannesrudolph moved this from New to PR [Pre Approval Review] in Roo Code Roadmap May 20, 2025
@hannesrudolph moved this from PR [Needs Review] to TEMP in Roo Code Roadmap May 26, 2025
@daniel-lxs moved this from TEMP to PR [Needs Review] in Roo Code Roadmap May 27, 2025
Member

@daniel-lxs left a comment

Hey @mosleyit, Thank you for the contribution. Sorry for taking so long to review your PR.

I just had a couple of questions about some specific values in your implementation; nothing pops up for me as wrong.

Let me know if you want to discuss this further.

/**
* Maximum safe token limit for Claude 3.7 Sonnet (200k - 1k safety buffer)
* This duplicates the value in constants.ts to avoid circular dependencies
*/
Member

I see CLAUDE_MAX_SAFE_TOKEN_LIMIT is duplicated here and in constants.ts. What's the circular dependency that prevents importing from constants.ts? Any alternatives to avoid the duplication?


// Determine truncation fraction based on excess tokens
// Start with 0.5 (50%) and increase if needed
let truncationFraction = 0.5
Member

Just curious, how did you arrive at these specific values? Would it make sense to extract these as named constants with comments explaining the rationale?
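
One illustrative shape for that refactor (the names and the 0.9 ceiling are placeholders):

/** Initial fraction of messages to drop when over the limit (start by halving). */
const INITIAL_TRUNCATION_FRACTION = 0.5

/** Ceiling so a single truncation pass never removes the whole conversation. */
const MAX_TRUNCATION_FRACTION = 0.9

let truncationFraction = INITIAL_TRUNCATION_FRACTION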

return response.input_tokens
} catch (error) {
// Log error but fallback to estimating tokens by counting each part separately
console.warn("Anthropic message token counting failed, using fallback", error)
Member

When the Anthropic token counting API fails, the fallback adds a fixed overhead of 5 tokens per message. Is this estimate based on any specific data?
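
For context, the kind of fallback being discussed looks roughly like this, with countBlocks standing in for the existing per-block countTokens path; the 5-token overhead is a heuristic, not a documented Anthropic value:

import Anthropic from "@anthropic-ai/sdk"

// Stand-in for the existing per-block countTokens path.
declare function countBlocks(blocks: Anthropic.ContentBlockParam[]): Promise<number>

// Heuristic only: a few extra tokens per message for role markers and formatting.
const PER_MESSAGE_OVERHEAD = 5

async function estimateRequestTokens(
  system: string,
  messages: Anthropic.MessageParam[],
): Promise<number> {
  let total = await countBlocks([{ type: "text", text: system }])
  for (const message of messages) {
    const blocks: Anthropic.ContentBlockParam[] =
      typeof message.content === "string"
        ? [{ type: "text", text: message.content }]
        : message.content
    total += await countBlocks(blocks)
    total += PER_MESSAGE_OVERHEAD
  }
  return total
}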

@daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap May 29, 2025
@daniel-lxs
Member

I'll be closing this PR as stale to clean up our backlog.

If someone else wants to work on the linked issue please leave a comment on #1173 to have it assigned to you.

@daniel-lxs closed this Jun 9, 2025
@github-project-automation bot moved this from PR [Pre Approval Review] to Done in Roo Code Roadmap Jun 9, 2025
@github-project-automation bot moved this from PR [Changes Requested] to Done in Roo Code Roadmap Jun 9, 2025