fix: close done channel on nil response to prevent goroutine leak #766

Merged
ezynda3 merged 1 commit into mark3labs:main from Sim-hu:fix/streamable-http-race
Apr 4, 2026

Conversation

@Sim-hu
Contributor

@Sim-hu Sim-hu commented Mar 26, 2026

Summary

Fixes a race condition panic in handlePost when processing notifications (nil responses).

When HandleMessage returns nil (e.g. for notifications/initialized), the handler sent 202 and returned without closing the done channel. The background notification pump goroutine kept running and could write to a dead ResponseWriter, causing panics like:

  • http: superfluous response.WriteHeader call
  • panic: nil pointer dereference in bufio.(*Writer).Flush

The fix closes the done channel under the mutex before returning, matching the pattern already used in the non-nil response path. It also checks upgradedHeader, so the handler skips its own WriteHeader call if the goroutine has already written the SSE headers.

Includes a race-detector test (TestStreamableHTTPNotificationRace) that reproduces the issue.

Fixes #763

Summary by CodeRabbit

  • Bug Fixes
    • Fixed a race condition in notification handling that could cause response conflicts, ensuring reliable notification delivery without conflicting writes to the HTTP response.
  • Tests
    • Added a regression test to verify notification handling under concurrent conditions.

@coderabbitai
Contributor

coderabbitai bot commented Mar 26, 2026

Walkthrough

This change fixes a race condition in the HTTP Streamable Server where nil responses (JSON-RPC notifications) failed to properly coordinate goroutine shutdown. The fix closes the done channel and conditionally writes 202 Accepted status only if SSE headers have not already been sent, preventing concurrent writes to the HTTP response object.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Race Condition Fix: server/streamable_http.go | Added mutex-protected closure of done channel and conditional status-code writing in the notification response path. Only writes http.StatusAccepted if SSE headers were not already sent (!upgradedHeader), preventing concurrent writes when streaming is active. |
| Regression Test: server/streamable_http_test.go | Added TestStreamableHTTPNotificationRace to validate handling of inbound notifications that trigger outbound notifications. Test makes 200 sequential POST requests and confirms responses are either 200 OK or 202 Accepted. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • #348: Directly addresses coordination of the notification goroutine, closing the done channel, and upgraded/upgradedHeader logic in handlePost.
  • #743: Addresses the same SSE upgrade race condition around upgradedHeader to ensure response writing accounts for already-upgraded headers.
  • #741: Modifies handlePost response-finalization logic to change when HTTP 202 Accepted vs. alternative status codes are written.

Suggested reviewers

  • dugenkui03
  • pottekkat
  • ezynda3

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly identifies the main fix: closing the done channel on nil responses to prevent goroutine leak, which is the primary objective of the PR. |
| Description check | ✅ Passed | The PR description is comprehensive, covering the issue, root cause, solution, test addition, and linked issue reference. All key sections are addressed with sufficient detail. |
| Linked Issues check | ✅ Passed | The code changes directly address issue #763 by closing the done channel in the nil-response path and checking upgradedHeader to prevent superfluous WriteHeader calls, matching the issue's requirements. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to fixing the race condition described in issue #763: handling the nil-response path in handlePost and adding a regression test. |

Contributor

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
server/streamable_http_test.go (1)

2861-2872: Use require here and give the client a timeout.

That keeps the loop fail-fast, aligns this test with the rest of the testify-based suite, and avoids wedging the package if one POST stops returning.

Suggested cleanup
-	client := &http.Client{Transport: &http.Transport{DisableKeepAlives: true}}
+	client := &http.Client{
+		Timeout:   2 * time.Second,
+		Transport: &http.Transport{DisableKeepAlives: true},
+	}
@@
-		resp, err := client.Post(ts.URL+"/mcp", "application/json", strings.NewReader(body))
-		if err != nil {
-			t.Fatalf("iteration %d: %v", i, err)
-		}
+		resp, err := client.Post(ts.URL+"/mcp", "application/json", strings.NewReader(body))
+		require.NoError(t, err, "iteration %d", i)
 		resp.Body.Close()
 
-		if resp.StatusCode != http.StatusAccepted && resp.StatusCode != http.StatusOK {
-			t.Fatalf("iteration %d: expected 200 or 202, got %d", i, resp.StatusCode)
-		}
+		require.True(
+			t,
+			resp.StatusCode == http.StatusAccepted || resp.StatusCode == http.StatusOK,
+			"iteration %d: expected 200 or 202, got %d",
+			i,
+			resp.StatusCode,
+		)
 	}

As per coding guidelines, **/*_test.go: Testing: use testify/assert and testify/require.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/streamable_http_test.go` around lines 2861 - 2872, Replace the plain
http.Client with one that has a timeout (e.g. client := &http.Client{Transport:
&http.Transport{DisableKeepAlives: true}, Timeout: 5*time.Second}) and switch
the loop's t.Fatalf checks to testify/require calls so the test fails fast: use
require.NoError(t, err, "iteration %d", i) for the POST error and
require.Truef(t, resp.StatusCode == http.StatusAccepted || resp.StatusCode ==
http.StatusOK, "iteration %d: expected 200 or 202, got %d", i, resp.StatusCode)
for status validation; import testify/require and time as needed and keep
resp.Body.Close().

📥 Commits

Reviewing files that changed from the base of the PR and between 4713d74 and 5c0a254.

📒 Files selected for processing (2)
  • server/streamable_http.go
  • server/streamable_http_test.go

@michaelrios

This is actively affecting me right now, so would love to get this merged. Is there anything I can do to help move this along?

@Shweta-Deshpande

This error is blocking us right now:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x245aa8]

goroutine 53 [running]:
bufio.(*Writer).Flush(0x810340?)
	bufio/bufio.go:636 +0x18
net/http.(*chunkWriter).flush(0x774a3638bd68?)
	net/http/server.go:405 +0x48
net/http.(*response).FlushError(0x774a36396690)
	net/http/server.go:1729 +0x4c
	

Can this fix be merged soon?

@Sim-hu
Contributor Author

Sim-hu commented Apr 4, 2026

Thanks for confirming the issue, @michaelrios @Shweta-Deshpande! The stack trace you shared matches exactly what this PR addresses. Hopefully a maintainer can take a look soon.

@ezynda3 ezynda3 merged commit 5b4d899 into mark3labs:main Apr 4, 2026
4 checks passed

Development

Successfully merging this pull request may close these issues.

bug: Panic due to Race Condition in HTTP Streamable Server Transport for Empty Responses

4 participants