
Avoid racing between session close + adding new streams #6

Merged
euank merged 1 commit into main from euan/muxado-deadlock on Mar 3, 2026

Conversation

@euank (Contributor) commented Feb 24, 2026

Until now, there was a subtle race between `session.die` and a new stream being created.

In `session.die`, we used `streamMap.Each` to close all streams. We also set `local.goneAway` in `die`, which in theory should mean new streams aren't added.

However, setting `goneAway`, iterating with `Each`, and adding a stream aren't adequately synchronized. It's possible for a goroutine already in `handleSyn` to be about to add a new stream, for a context switch to let `die` clean up the existing streams, and then for the original goroutine to resume and add the stream.

This leaves the stream in a state where it will never be closed, since `die` won't run again (due to `dieOnce`); at the same time, we won't process the stream properly, since `die` already closed the `s.dead` channel and cleaned up the state related to it.

So, what's the fix?

Instead of using `streamMap.Each` during `die`, we use `streamMap.Drain()` to atomically clear the map and mark it drained. `Set` then returns whether we just added to a drained map (the race condition), and if we did, we tear down the stream we just created.

The included test failed before this change.

The test was written by claude, the fix was written by me.

@euank requested a review from bmpngrok, February 27, 2026 06:40

@bmpngrok left a comment


Thanks for the fix! I called out one nit and one question, but after looking at the code I genuinely don't have a better suggestion that isn't a lot heavier than what you've got here.

m.Lock()
defer m.Unlock()
streams := m.table
m.table = nil // mark the map drained; later Set calls detect this

I accept this is the best solution, but since it does open us to NPEs, was there any alternative other than the nil set? (I strongly guess not, but just double-checking. I see the several checks you had to add for m.table == nil, and that makes me nervous.)

@euank (Contributor, Author)

I mean, we could add a bool like `m.frozen` or such, but I feel like this struct is small and self-contained enough that it's easiest to just be careful about nils.

@euank merged commit b295f96 into main, Mar 3, 2026
1 check passed