Skip to content

Fix unconditional semaphore release in BasicPublishAsync when Cancell…#1901

Open
EvheniyHlushko wants to merge 8 commits intorabbitmq:mainfrom
EvheniyHlushko:fix-confirm-semaphore-cancellation
Open

Fix unconditional semaphore release in BasicPublishAsync when Cancell…#1901
EvheniyHlushko wants to merge 8 commits intorabbitmq:mainfrom
EvheniyHlushko:fix-confirm-semaphore-cancellation

Conversation

@EvheniyHlushko
Copy link

When CancellationToken is already cancelled during BasicPublishAsync, _confirmSemaphore.WaitAsync() fails but the finally block unconditionally releases the semaphore. This breaks the semaphore and allows concurrent access, corrupting _nextPublishSeqNo.

The fix makes the semaphore release conditional on successful acquisition.

Fixes #1900

@lukebakken lukebakken self-assigned this Feb 6, 2026
@lukebakken lukebakken added this to the 7.2.1 milestone Feb 6, 2026
…ed during BasicPublishAsync

When CancellationToken is already cancelled during BasicPublishAsync,
_confirmSemaphore.WaitAsync() fails but the finally block unconditionally
releases the semaphore. This breaks the semaphore and allows concurrent
access, corrupting _nextPublishSeqNo.

The fix makes the semaphore release conditional on successful acquisition.
If WaitAsync throws OperationCanceledException, the lease is cleaned up
and no PublisherConfirmationInfo is returned, so the semaphore is not released.
Move the semaphore ownership and release responsibility from MaybeEndPublisherConfirmationTrackingAsync
to the PublisherConfirmationInfo class itself. This improves encapsulation by making the PublisherConfirmationInfo
responsible for managing the semaphore resource it owns.

Changes:
- Add _semaphore field and _semaphoreReleased flag to PublisherConfirmationInfo
- Add ReleaseSemaphore() method with idempotence check
- Update constructor to accept SemaphoreSlim parameter
- Update Dispose() to call ReleaseSemaphore()
- Simplify control flow in MaybeStartPublisherConfirmationTrackingAsync (early return pattern)
- Update MaybeEndPublisherConfirmationTrackingAsync to use ReleaseSemaphore() from info object
@danielmarbach danielmarbach force-pushed the fix-confirm-semaphore-cancellation branch from 28e91de to e19652f Compare February 6, 2026 19:37
@danielmarbach
Copy link
Collaborator

@EvheniyHlushko I tried to pull the change apart because I had a hard time seeing the bug fix

@danielmarbach
Copy link
Collaborator

@lukebakken I did try to split it. I'm not a big fan of the second commit. At first sight it looks like it cleans things up but then the outer code still has semaphore handling in place. I would personally not pull in the second part and only the first if that fixes the problem

@danielmarbach
Copy link
Collaborator

@lukebakken last commit contains my preference. Please don't hesitate to revert or take whatever you prefer

danielmarbach
danielmarbach previously approved these changes Feb 6, 2026
@lukebakken
Copy link
Collaborator

I haven't even had a chance to look 😸 @danielmarbach so, my understanding is that this PR, in its current state, is what you'd prefer?

@danielmarbach
Copy link
Collaborator

@lukebakken yes but please don't hesitate to challenge or partially revert or whatever ;)

@EvheniyHlushko
Copy link
Author

@danielmarbach The idea is to use the semaphore together with a lease and release them together. It’s difficult to reason about correctness when the semaphore is acquired and released in different methods.

If changes are introduced to MaybeStartPublisherConfirmationTrackingAsync in the future and an exception occurs the semaphore may never be released. In your version confirmSemaphoreAcquired is not necessary.

I’ve pushed a new version that uses an explicit lock instead of mixing the responsibility with PublisherConfirmationInfo. Feel free to revert if you prefer the previous approach.

@danielmarbach
Copy link
Collaborator

There are quite a few things in this code base that are "difficult to reason about". Sometimes it is accidental and sometimes it comes from avoiding additional allocations.

I get your point @EvheniyHlushko. That being said now we have a factory that is IDisposable because it owns a semaphore, an additional class allocation and the responsibility is spread around plus the flag name also changes in the inner concept. For me this is more complex but I realize this is also very subjective. I'll defer to @lukebakken since he has the most context in this lib

@EvheniyHlushko
Copy link
Author

EvheniyHlushko commented Feb 7, 2026

@danielmarbach Thanks for the feedback, I understand your point about the added complexity with the factory approach.
Based on history, this isn't the first issue with the semaphore, and the same applies to the RateLimitLease - if cancellation occurs after acquiring the lease but before the semaphore, the lease is never disposed, which silently exhausts the rate limiter over time.

My main point isn't about the factory specifically - it can be implemented in different ways. The key idea is to keep acquire and release in the same method using the standard lock-try-finally pattern, which protects from future changes accidentally breaking the resource lifecycle.

I've pushed a version without the factory. Again this is my personal preference - happy to defer to you and @lukebakken on what fits best in this codebase.

@danielmarbach
Copy link
Collaborator

@EvheniyHlushko haha I was also thinking to incorporate some of your feedback and was concurrently working on the same thing. I branched off for now to not interfere #1902

@danielmarbach
Copy link
Collaborator

I like this and it looks sound (but hey I'm in weekend and vacation mode, so I don't trust my brain)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Publisher confirms: _confirmSemaphore broken by cancelled CancellationToken allows concurrent access, corrupts _nextPublishSeqNo

3 participants