fix(summarization): prevent context overflow and orphaned tool messages #228

Draft
christian-bromann wants to merge 1 commit into main from cb/summarization-improvements
@christian-bromann
Member

Three issues with the summarization middleware caused agent crashes:

  1. No post-summarization safety check: after cutting messages, a single oversized tool result (e.g. a 668K-char glob) in the preserved window could still exceed the model limit. Added emergency truncation both when summarization is not triggered and after summarization completes.

  2. Orphaned ToolMessages: determineCutoffIndex could split an AIMessage from its ToolMessages, leaving tool_result blocks without a preceding tool_use. Added adjustCutoffForToolMessages to advance the cutoff past any leading ToolMessages, applied at summarization time and defensively in getEffectiveMessages for backward compatibility.

  3. Token estimation: the count now includes system message and tool schema overhead (matching Python's approach). The trigger was lowered from 170K to 130K after removing the arbitrary 1.25x safety factor from the proactive trigger path; the safety factor is now used only for hard-limit checks (emergency truncation).
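
To illustrate item 2, here is a minimal sketch of the cutoff adjustment. It is not the code from this PR: the `Msg` type and its `role` field are simplified stand-ins for the real message classes, and only the function name `adjustCutoffForToolMessages` comes from the description above.

```typescript
// Hypothetical, simplified message shape (assumption); the middleware
// itself works with AIMessage / ToolMessage instances.
type Role = "system" | "human" | "ai" | "tool";

interface Msg {
  role: Role;
  content: string;
}

/**
 * Advance the cutoff past any tool messages at the start of the
 * preserved window, so no tool result is kept without the AI message
 * that requested it.
 */
function adjustCutoffForToolMessages(messages: Msg[], cutoff: number): number {
  let idx = cutoff;
  while (idx < messages.length && messages[idx].role === "tool") {
    idx += 1;
  }
  return idx;
}

const history: Msg[] = [
  { role: "human", content: "list files" },
  { role: "ai", content: "(calls glob)" },
  { role: "tool", content: "a.ts" },
  { role: "tool", content: "b.ts" },
  { role: "human", content: "thanks" },
];

// A naive cutoff of 2 would preserve two tool results whose AI message
// was summarized away; the adjusted cutoff skips over them.
const safeCutoff = adjustCutoffForToolMessages(history, 2);
const preserved = history.slice(safeCutoff);
```

In this sketch the skipped tool messages fall on the summarized side of the cut, so they stay with the AI message that produced them.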

Also cleaned up verbose per-call debug logging.
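
For items 1 and 3, the split between the proactive trigger and the hard-limit check could look roughly like the sketch below. Only the 130K trigger and the 1.25x factor are taken from the description; the 4-characters-per-token heuristic, the function names, the `ChatMessage` type, and the truncation marker are assumptions made for illustration.

```typescript
// Assumed constants, except where noted.
const CHARS_PER_TOKEN = 4; // assumption: rough heuristic, not from the PR
const TRIGGER_TOKENS = 130_000; // proactive trigger (from the description)
const SAFETY_FACTOR = 1.25; // used only for hard-limit checks (from the description)
const TRUNCATION_MARKER = "\n[truncated]"; // hypothetical marker text

// Hypothetical, simplified message shape.
interface ChatMessage {
  role: "system" | "human" | "ai" | "tool";
  content: string;
}

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Count message tokens plus system prompt and tool schema overhead.
function estimateRequestTokens(
  messages: ChatMessage[],
  systemPrompt: string,
  toolSchemas: string[],
): number {
  const overhead =
    estimateTokens(systemPrompt) +
    toolSchemas.reduce((sum, schema) => sum + estimateTokens(schema), 0);
  const body = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  return overhead + body;
}

// Proactive path: no safety factor.
function shouldSummarize(totalTokens: number): boolean {
  return totalTokens >= TRIGGER_TOKENS;
}

// Hard-limit path: pad the estimate before comparing to the model limit.
function exceedsHardLimit(totalTokens: number, modelLimit: number): boolean {
  return totalTokens * SAFETY_FACTOR > modelLimit;
}

// Emergency truncation: cap any single oversized message.
function truncateOversized(messages: ChatMessage[], maxChars: number): ChatMessage[] {
  return messages.map((m) =>
    m.content.length > maxChars
      ? { ...m, content: m.content.slice(0, maxChars) + TRUNCATION_MARKER }
      : m,
  );
}
```

A caller would run `exceedsHardLimit` both when `shouldSummarize` returns false and again after summarization, applying `truncateOversized` if the check fails.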

Co-authored-by: Cursor <cursoragent@cursor.com>
@changeset-bot

changeset-bot bot commented Feb 13, 2026

⚠️ No Changeset found

Latest commit: 1ec3c83

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

