Skip to content
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -307,16 +307,16 @@ out DbConnectionInternal? internalConnection
Assert.NotNull(internalConnection);
}

// Use multiple ManualResetEventSlim to ensure proper ordering
using ManualResetEventSlim firstTaskReady = new(false);
using ManualResetEventSlim secondTaskReady = new(false);
using ManualResetEventSlim startRequests = new(false);
// Use TaskCompletionSource for coordination to avoid mixing async/await with native synchronization
TaskCompletionSource<bool> firstTaskReady = new();
TaskCompletionSource<bool> secondTaskReady = new();
TaskCompletionSource<bool> startRequests = new();

// Act
var recycledTask = Task.Run(() =>
var recycledTask = Task.Run(async () =>
{
firstTaskReady.Set();
startRequests.Wait();
firstTaskReady.SetResult(true);
await startRequests.Task;
pool.TryGetConnection(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think making these async introduced a deadlock. In some conditions, they'll hang on to threads and prevent future async operations from going through. I'm going to revert these changes other than the SpinWait -> Thread.Sleep()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be nice to get a better understanding of why the task completion source causes deadlocks in this case. I feel like this might be masking a bigger issue, or a lack of understanding of how the mechanisms are actually working.

new SqlConnection("Timeout=5000"),
null,
Expand All @@ -326,12 +326,14 @@ out DbConnectionInternal? recycledConnection
return recycledConnection;
});

var failedTask = Task.Run(() =>
var failedTask = Task.Run(async () =>
{
secondTaskReady.Set();
startRequests.Wait();
// Add a small delay to ensure this request comes after the first
Thread.Sleep(50);
secondTaskReady.SetResult(true);
await startRequests.Task;
// Add a small delay to ensure this request comes after the first.
// This is necessary because the channel-based pool queues requests in FIFO order,
// and we need to guarantee the order for this test to be deterministic.
await Task.Delay(50);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tough delay to choose. Is 50ms long enough to guarantee that recycledTask has started its call to TryGetConnection? Hard to say since it depends entirely on scheduling. How obvious will it be if this test fails because failedTask actually calls TryGetConnection first? I'm thinking about future pain diagnosing intermittent test failures. Is there any way to test this scenario without using delays - even if it means adding an internal API that lets you control things explicitly?

Copy link
Contributor Author

@mdaigle mdaigle Nov 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard because there's no way to run something after getting in the queue. The thread or Task is blocked, so we have no way to signal to another thread that our request to the channel is in. Using another thread to do the signaling just introduces another level of scheduling uncertainty. We would need access to the internals of Channel itself.

pool.TryGetConnection(
new SqlConnection("Timeout=1"),
null,
Expand All @@ -342,16 +344,18 @@ out DbConnectionInternal? failedConnection
});

// Wait for both tasks to be ready before starting the requests
firstTaskReady.Wait();
secondTaskReady.Wait();
await firstTaskReady.Task;
await secondTaskReady.Task;

// Use SpinWait to ensure both tasks are actually waiting
SpinWait.SpinUntil(() => false, 100);
// Allow both tasks to reach their wait state before proceeding
await Task.Delay(100);

// Start both requests
startRequests.Set();
startRequests.SetResult(true);

// Give time for both requests to be queued
// Give time for both requests to be queued.
// This delay ensures that both TryGetConnection calls have been made and are waiting in the channel
// before we return the connection, which is necessary to test FIFO ordering.
await Task.Delay(200);

// Return the connection which should satisfy the first queued request
Expand Down Expand Up @@ -401,7 +405,9 @@ out DbConnectionInternal? internalConnection
out DbConnectionInternal? recycledConnection
);

// Ensure sufficient time for the recycled connection request to be fully queued
// Ensure sufficient time for the recycled connection request to be fully queued.
// This delay is necessary because the channel-based pool queues async requests,
// and we need to guarantee the first request is in the queue before the second one.
await Task.Delay(200);

var exceeded2 = pool.TryGetConnection(
Expand All @@ -411,7 +417,8 @@ out DbConnectionInternal? recycledConnection
out DbConnectionInternal? failedConnection
);

// Ensure the second request is also queued
// Ensure the second request is also queued before returning the connection.
// This guarantees that both requests are waiting in FIFO order.
await Task.Delay(100);

pool.ReturnInternalConnection(firstConnection!, firstOwningConnection);
Expand Down
Loading