Skip to content

Conversation

arturmelanchyk
Copy link

Tiny optimization to avoid unneeded locking

@arturmelanchyk arturmelanchyk changed the title Replace mutex with atomic in queue.go perf: replace mutex with atomic in queue.go Aug 30, 2025
@jakebailey
Copy link
Member

Are you actually seeing a case where this matters?

@arturmelanchyk
Copy link
Author

Hi @jakebailey
This looks cleaner and comes with less overhead

@LukeAbby
Copy link
Contributor

LukeAbby commented Aug 30, 2025

I was looking into this code since it smelled fishy. It helped me uncover that Wait actually isn't okay (even without this PR). sync.WaitGroup really buries the lede and isn't always super explicit but it can actually panic in a number of cases. In this case it's at least documented and says this under (*sync.WaitGroup).Add:

Note that calls with a positive delta that occur when the counter is zero must happen before a Wait.

And if you look in the implementation you can see this, confirming that it's a panic:

	if w != 0 && delta > 0 && v == int32(delta) {
		panic("sync: WaitGroup misuse: Add called concurrently with Wait")
	}

(w is the number of workers and delta equals 1 when beginning a task or -1 when ending a task)

So when (*Queue).Wait says "It does not prevent new tasks from being enqueued while waiting." this is untrue if Wait is called while 0 workers are running (say they simply finished quickly) as if Enqueue is run in parallel at that point it will panic.

Now granted (*Queue).Wait is currently only called in WaitForBackgroundTasks which says "This is intended to be used only for testing purposes." but the occasional panic in test still seems bad. To make Wait truly safe it needs to be holding the write end of the *sync.RWMutex. It can't be safely done by making Wait close the queue with the atomic boolean since this sequence of events could occur:

Thread A calls Enqueue -> Thread A loads q.close and observes it as false ->
Thread B calls Wait -> Thread B sets q.close to true (expecting to prevent any other thread from being able to enqueue anything) -> Thread B calls q.wg.Wait ->
Thread A calls q.wg.Add thus causing a panic.

@jakebailey
Copy link
Member

@arturmelanchyk That doesn't really answer my question. While this is better, it's best to focus on perf and correctness issues seen in actual codebases...

@LukeAbby I think you're right here, the saving grace is that we never actually call Close until the test has already completed, as part of deferred/cleanup work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants