Fix the race condition for continue-as-new with extended sessions enabled #1303
Pull request overview
Fixes a race condition in the Azure Storage backend when extended sessions are enabled and an orchestration performs ContinueAsNew, where activity responses can arrive before the new execution ID is checkpointed, causing a stuck orchestration.
Changes:
- Moves `session.UpdateRuntimeState(runtimeState)` earlier in `CompleteTaskOrchestrationWorkItemAsync` so the in-memory session reflects the new execution ID before outbound messages are committed.
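The reordering can be sketched as follows (Python pseudocode mirroring the described flow; everything except the `UpdateRuntimeState` call and its placement is illustrative, not the actual DurableTask API):

```python
# Minimal sketch of the commit order inside CompleteTaskOrchestrationWorkItemAsync
# after this change. All names other than update_runtime_state are illustrative.

class ExtendedSession:
    """Stands in for the cached session used when extended sessions are enabled."""
    def __init__(self, execution_id):
        self.execution_id = execution_id

    def update_runtime_state(self, runtime_state):
        # After ContinueAsNew, the runtime state carries the NEW execution ID.
        self.execution_id = runtime_state["execution_id"]

def complete_work_item(session, runtime_state, outbound_messages, control_queue):
    # The fix: refresh the in-memory session BEFORE committing outbound
    # messages, so a fast activity reply addressed to the new execution ID
    # is evaluated against up-to-date in-memory state.
    session.update_runtime_state(runtime_state)
    control_queue.extend(outbound_messages)

session = ExtendedSession("exec-old")
queue = []
complete_work_item(session, {"execution_id": "exec-new"},
                   [{"event": "TaskScheduled"}], queue)
print(session.execution_id)  # → exec-new
```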
```csharp
// update the runtime state and execution id stored in the session
session.UpdateRuntimeState(runtimeState);
```
This change fixes a subtle race in checkpointing order for ContinueAsNew + extended sessions, but there is no regression test added to ensure the out-of-order `TaskCompleted` scenario is handled (message abandoned/retried rather than deleted, which would leave the instance stuck). Consider adding an AzureStorage end-to-end test that enables extended sessions and forces `trackingStore.UpdateStateAsync` to be delayed while an activity completion message is delivered for the new execution ID, asserting that the orchestration still completes and no control-queue message is lost.
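The suggested regression test could be approximated, outside the real Azure Storage backend, by a toy unit test that "delays" the checkpoint and delivers the activity reply early. This is only a sketch: `checkpoint_and_deliver` and every identifier inside it are hypothetical stand-ins, not DurableTask APIs.

```python
import unittest

class ContinueAsNewRaceTest(unittest.TestCase):
    # Toy stand-in for the suggested end-to-end test: the storage checkpoint
    # is "delayed" (never written here) while a TaskCompleted message for the
    # new execution ID is delivered. All names are illustrative.

    def checkpoint_and_deliver(self, update_session_first):
        session = {"execution_id": "exec-old"}   # in-memory extended session
        queue = []                               # control queue

        def commit_outbound():
            queue.append({"event": "TaskCompleted",
                          "execution_id": "exec-new"})

        if update_session_first:                 # ordering after the fix
            session["execution_id"] = "exec-new"
            commit_outbound()
        else:                                    # ordering before the fix
            commit_outbound()

        msg = queue.pop()
        # The message survives only if the in-memory session already
        # reflects the execution ID the message is addressed to.
        return session["execution_id"] == msg["execution_id"]

    def test_fixed_order_keeps_the_message(self):
        self.assertTrue(self.checkpoint_and_deliver(update_session_first=True))

    def test_old_order_loses_the_message(self):
        self.assertFalse(self.checkpoint_and_deliver(update_session_first=False))
```

Run with `python -m unittest` to execute both cases.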
Currently we have a subtle race condition that can occur if a user has extended sessions enabled and an orchestration that attempts to continue-as-new. The flow is as follows:

1. `TaskOrchestrationDispatcher` calls `CompleteTaskOrchestrationWorkItemAsync`, which commits a `TaskScheduled` event to start a new Activity.
2. The Activity completes quickly and sends a `TaskCompleted` event back to the orchestration, all before `CompleteTaskOrchestrationWorkItemAsync` has updated the orchestration's state in storage to reflect the new execution ID.
3. A call to `LockNextTaskOrchestrationWorkItemAsync` is made which retrieves the `TaskCompleted` event. The `TaskCompleted` event is addressed to the new execution ID, but since the orchestration's state has not yet been updated in storage, there is no record for that execution ID. The call to determine out-of-order messages should detect that this is potentially an "out of order" `TaskCompleted` message, since the instance does "not yet exist". However, `IsOutOfOrderMessage` uses the in-memory state of the session. The in-memory state (which exists, since extended sessions are enabled) holds the information for the old execution ID, so it does not detect that the instance "does not yet exist". It thinks the message is valid, and proceeds.
4. Later in the `LockNextTaskOrchestrationWorkItemAsync` method, when we attempt to retrieve information about this orchestration instance with the new execution ID from storage, we find none, and fail at this point. We delete the `TaskCompleted` event, which leaves the orchestration permanently stuck in a running state.

The core of the issue is that the in-memory session state is not updated to reflect the new execution ID before outbound messages are committed, which prevents the `IsOutOfOrderMessage` logic from functioning correctly. This PR moves the placement of the `session.UpdateRuntimeState` call to be before outbound messages are committed to fix this issue.

Resolves #1302
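The failure mode and the fix can be sketched with a minimal simulation. Python is used for illustration only; `dispatch` and the internals of `is_out_of_order` are assumptions based on the description above, not the actual implementation.

```python
# Toy reproduction of the race and the fix. All names are illustrative;
# only the ordering of the session update vs. message delivery mirrors
# the actual change.

def is_out_of_order(message, session_state):
    # Mirrors the described IsOutOfOrderMessage behaviour: a message is only
    # flagged as out of order when NO session state exists for the instance.
    return session_state is None

def dispatch(message, session_state, storage):
    """Returns what happens to an incoming TaskCompleted message."""
    if is_out_of_order(message, session_state):
        return "abandoned"   # retried later, once the checkpoint lands
    if session_state["execution_id"] == message["execution_id"]:
        return "processed"   # session already reflects the new execution ID
    # Session state looked valid but targets a different execution: fall
    # back to storage, find nothing, and (before the fix) delete the message.
    if storage.get(message["execution_id"]) is None:
        return "deleted"     # orchestration permanently stuck

msg = {"event": "TaskCompleted", "execution_id": "exec-new"}
storage = {}  # checkpoint for exec-new not written yet

# Before the fix: messages committed while the session holds the old ID.
print(dispatch(msg, {"execution_id": "exec-old"}, storage))  # → deleted

# After the fix: session.UpdateRuntimeState ran first, so the in-memory
# session already carries the new execution ID.
print(dispatch(msg, {"execution_id": "exec-new"}, storage))  # → processed
```

Note that with no in-memory session at all (extended sessions disabled), the early message is correctly abandoned and retried, which is why the bug only surfaces when extended sessions are enabled.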