Skip to content

Commit 498508e

Browse files
CopilotYunchuWang
andcommitted
Fix race condition in WaitForInstanceAsync causing RestartAsync_EndToEnd test to fail
The original code had a TOCTOU race condition where a completion notification could be missed if the orchestration completed between checking completion status and adding the waiter. The fix reorders the operations to add the waiter first, then check for completion. Co-authored-by: YunchuWang <[email protected]>
1 parent 134a166 commit 498508e

File tree

1 file changed

+8
-3
lines changed

1 file changed

+8
-3
lines changed

src/InProcessTestHost/Sidecar/InMemoryOrchestrationService.cs

Lines changed: 8 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -739,6 +739,12 @@ public void ReleaseLock(string instanceId)
739739

740740
public Task<OrchestrationState> WaitForInstanceAsync(string instanceId, CancellationToken cancellationToken)
741741
{
742+
// First, add the waiter before checking completion to avoid a race condition.
743+
// This ensures we don't miss a completion notification that happens between
744+
// checking the status and adding the waiter.
745+
var tcs = this.waiters.GetOrAdd(instanceId, _ => new TaskCompletionSource<OrchestrationState>());
746+
747+
// Now check if already completed - if so, complete the waiter immediately
742748
if (this.store.TryGetValue(instanceId, out SerializedInstanceState? state))
743749
{
744750
lock (state)
@@ -750,16 +756,15 @@ public Task<OrchestrationState> WaitForInstanceAsync(string instanceId, Cancella
750756
statusRecord.OrchestrationStatus == OrchestrationStatus.Failed ||
751757
statusRecord.OrchestrationStatus == OrchestrationStatus.Terminated)
752758
{
753-
// orchestration has already completed
754-
return Task.FromResult(statusRecord);
759+
// Orchestration has already completed - complete the waiter and return
760+
tcs.TrySetResult(statusRecord);
755761
}
756762
}
757763
}
758764
}
759765

760766
// Caller will be notified when the instance completes.
761767
// The ContinueWith is just to enable cancellation: https://stackoverflow.com/a/25652873/2069
762-
var tcs = this.waiters.GetOrAdd(instanceId, _ => new TaskCompletionSource<OrchestrationState>());
763768
return tcs.Task.ContinueWith(t => t.GetAwaiter().GetResult(), cancellationToken);
764769
}
765770

0 commit comments

Comments
 (0)