Skip to content

Conversation

@catrielmuller
Copy link
Contributor

@catrielmuller catrielmuller commented Nov 5, 2025

Improved version of: #3468

Context

Fixed a race condition where checkpoint messages (checkpoint_saved) could arrive during partial message updates, causing the message history to break and leaving partial messages orphaned with partial=true state.

Root Cause:

The checkpoint service runs git operations asynchronously in the background. When a checkpoint completes, it calls task.say("checkpoint_saved", ...) through an event handler, which can execute while Task.ask() or Task.say() are still updating partial messages. This resulted in checkpoint messages being inserted between a partial message's start and completion, breaking the message chain.

Previous Workaround:

The findPartialAskMessage() and findPartialSayMessage() functions were added to search backwards through messages to locate orphaned partials, but this was a symptom of the underlying race condition.

Implementation

Implemented a Message Insertion Guard pattern that serializes all message insertions without blocking checkpoint git operations.

Program Flow

Message Insertion Sequence (with guard):

  • Any code calls task.say() or task.ask()
  • Eventually calls addToClineMessages()
  • Guard waits if another insertion is in progress (~10-50ms max)
  • Guard locks to claim exclusive insertion rights
  • Message is inserted, saved, and UI updated
  • Guard releases (in finally block for error safety)
  • Next waiting insertion can proceed

Checkpoint Flow (unchanged, still async):

  • User sends message → checkpoint triggers in background
  • Git operations run async (100-1000ms)
  • Checkpoint completes → event handler fires
  • Handler calls task.say("checkpoint_saved", ...)
  • NEW: Guard ensures this waits for any in-flight partial updates
  • Message inserted in correct order after partial completes

How to Test

Prerequisites

  • Enable checkpoints in settings (default: enabled)
  • Have a workspace directory (not Desktop/Documents/Downloads)
  • Git must be installed

Test Scenario 1: Verify No Orphaned Partial Messages

  • Start a new task with a complex request that triggers multiple tool uses
  • While the assistant is streaming a response (you'll see partial messages updating in real-time)
  • The checkpoint system will save automatically in the background

Expected: Message history remains in correct order with no partial=true messages left behind
How to verify:

  • Check the task history JSON file at ~/Library/Application Support/Code/User/globalStorage/[extension-id]/tasks/[task-id]/ui_messages.json
  • All messages should have partial: false or no partial field
  • No checkpoint_saved messages should appear between a partial ask/say's start and completion

Test Scenario 2: Verify Checkpoint Performance

  • Start a task and send several user messages in succession
  • Each message triggers a checkpoint (suppressed from UI but still runs)
  • Expected: No noticeable lag or delay when sending messages
  • Performance metric: Checkpoint git ops still take 100-1000ms (unchanged), but users never wait for them

Test Scenario 3: Run Automated Tests

  • cd src && npx vitest run core/task/tests/message-insertion-guard.spec.ts
  • Expected: All 13 tests pass

@changeset-bot
Copy link

changeset-bot bot commented Nov 5, 2025

⚠️ No Changeset found

Latest commit: 01fdc8c

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a message insertion guard mechanism to prevent race conditions when checkpoint messages arrive during partial message updates in the Task system. The guard ensures messages are inserted sequentially, preventing message history order corruption.

Key changes:

  • Added MessageInsertionGuard class with acquire/release locking and wait-for-clearance mechanism
  • Integrated the guard into Task.addToClineMessages() to serialize message insertions
  • Added comprehensive test suite covering basic locking, concurrent access, error handling, and realistic checkpoint scenarios

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/core/kilocode/task/message-utils.ts Adds MessageInsertionGuard class to handle message insertion synchronization
src/core/task/Task.ts Integrates guard into addToClineMessages() method and imports the new guard class
src/core/task/tests/message-insertion-guard.spec.ts Comprehensive test suite for the message insertion guard functionality
Comments suppressed due to low confidence (1)

src/core/task/Task.ts:694

  • The updateClineMessage method modifies existing messages and calls postMessageToWebview without acquiring the insertion guard. If this executes concurrently with addToClineMessages, it could lead to inconsistent state updates or ordering issues in the webview. Consider protecting message updates with the guard as well.
	private async updateClineMessage(message: ClineMessage) {
		const provider = this.providerRef.deref()
		await provider?.postMessageToWebview({ type: "messageUpdated", clineMessage: message })
		this.emit(RooCodeEventName.Message, { action: "updated", message })

@catrielmuller catrielmuller force-pushed the catrielmuller/fix-partial-ask-msg-reviewed branch from 8b63efc to 01fdc8c Compare November 5, 2025 12:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants