Fix unconditional semaphore release in BasicPublishAsync when Cancell…#1901
Fix unconditional semaphore release in BasicPublishAsync when Cancell…#1901EvheniyHlushko wants to merge 8 commits intorabbitmq:mainfrom
Conversation
…ed during BasicPublishAsync When CancellationToken is already cancelled during BasicPublishAsync, _confirmSemaphore.WaitAsync() fails but the finally block unconditionally releases the semaphore. This breaks the semaphore and allows concurrent access, corrupting _nextPublishSeqNo. The fix makes the semaphore release conditional on successful acquisition. If WaitAsync throws OperationCanceledException, the lease is cleaned up and no PublisherConfirmationInfo is returned, so the semaphore is not released.
Move the semaphore ownership and release responsibility from MaybeEndPublisherConfirmationTrackingAsync to the PublisherConfirmationInfo class itself. This improves encapsulation by making the PublisherConfirmationInfo responsible for managing the semaphore resource it owns. Changes: - Add _semaphore field and _semaphoreReleased flag to PublisherConfirmationInfo - Add ReleaseSemaphore() method with idempotence check - Update constructor to accept SemaphoreSlim parameter - Update Dispose() to call ReleaseSemaphore() - Simplify control flow in MaybeStartPublisherConfirmationTrackingAsync (early return pattern) - Update MaybeEndPublisherConfirmationTrackingAsync to use ReleaseSemaphore() from info object
28e91de to
e19652f
Compare
|
@EvheniyHlushko I tried to pull the change apart because I had a hard time seeing the bug fix |
|
@lukebakken I did try to split it. I'm not a big fan of the second commit. At first sight it looks like it cleans things up but then the outer code still has semaphore handling in place. I would personally not pull in the second part and only the first if that fixes the problem |
|
@lukebakken last commit contains my preference. Please don't hesitate to revert or take whatever you prefer |
|
I haven't even had a chance to look 😸 @danielmarbach so, my understanding is that this PR, in its current state, is what you'd prefer? |
|
@lukebakken yes but please don't hesitate to challenge or partially revert or whatever ;) |
|
@danielmarbach The idea is to use the semaphore together with a lease and release them together. It’s difficult to reason about correctness when the semaphore is acquired and released in different methods. If changes are introduced to MaybeStartPublisherConfirmationTrackingAsync in the future and an exception occurs the semaphore may never be released. In your version confirmSemaphoreAcquired is not necessary. I’ve pushed a new version that uses an explicit lock instead of mixing the responsibility with PublisherConfirmationInfo. Feel free to revert if you prefer the previous approach. |
|
There are quite a few things in this code base that are "difficult to reason about". Sometimes it is accidental and sometimes it comes from avoiding additional allocations. I get your point @EvheniyHlushko. That being said now we have a factory that is IDisposable because it owns a semaphore, an additional class allocation and the responsibility is spread around plus the flag name also changes in the inner concept. For me this is more complex but I realize this is also very subjective. I'll defer to @lukebakken since he has the most context in this lib |
|
@danielmarbach Thanks for the feedback, I understand your point about the added complexity with the factory approach. My main point isn't about the factory specifically - it can be implemented in different ways. The key idea is to keep acquire and release in the same method using the standard lock-try-finally pattern, which protects from future changes accidentally breaking the resource lifecycle. I've pushed a version without the factory. Again this is my personal preference - happy to defer to you and @lukebakken on what fits best in this codebase. |
|
@EvheniyHlushko haha I was also thinking to incorporate some of your feedback and was concurrently working on the same thing. I branched off for now to not interfere #1902 |
|
I like this and it looks sound (but hey I'm in weekend and vacation mode, so I don't trust my brain) |
When
CancellationTokenis already cancelled duringBasicPublishAsync,_confirmSemaphore.WaitAsync()fails but thefinallyblock unconditionally releases the semaphore. This breaks the semaphore and allows concurrent access, corrupting_nextPublishSeqNo.The fix makes the semaphore release conditional on successful acquisition.
Fixes #1900