fix(client): make readSSE context-aware to prevent goroutine leaks and HTTP/2 hangs #780

Merged
ezynda3 merged 7 commits into mark3labs:main from samkeet:fix/context-aware-readsse
Apr 4, 2026

Conversation


@samkeet samkeet commented Apr 2, 2026

Description

Resolves #779

readSSE in streamable_http.go reads with bufio.Reader.ReadString, a blocking I/O call that does not respect Go context cancellation. When an SSE stream is open but idle (e.g. stateless MCP servers that never send data on GET connections, or servers using text/event-stream responses that keep the HTTP stream open after sending a single event), ReadString blocks indefinitely. The existing select { case <-ctx.Done() } check runs only between reads, never during a blocked read.

This causes:

  • Goroutine leaks: Every cancelled SSE stream leaves a stuck goroutine blocked in ReadString
  • HTTP/2 hangs: resp.Body.Close() on HTTP/2 tries to drain the stream, blocking indefinitely
  • Client shutdown hangs: Close() cannot cleanly shut down listenForever GET connections
  • Affects WithContinuousListening: The GET SSE stream is always open, making this the primary hang path

Additionally, createGETConnectionToServer had defer resp.Body.Close() without cancelling the context first, which can cause HTTP/2 body drain hangs on shutdown.

Fix

Close the reader when the context is cancelled, which causes ReadString to return immediately with an I/O error:

go func() {
    <-ctx.Done()
    reader.Close()
}()

The read loop then checks ctx.Err() to distinguish context cancellation from real I/O errors. The ineffective select { case <-ctx.Done() } wrapper around ReadString is removed since the goroutine approach actually interrupts the blocking call.

Also fixed createGETConnectionToServer to cancel the context before closing the body, matching the pattern used in SendRequest and SendNotification (as established in PR #769).

Test

Includes two new tests:

  • TestReadSSEContextCancellation — deterministic unit test using io.Pipe that verifies readSSE exits promptly when the context is cancelled while ReadString is blocked. Without the fix this hangs indefinitely; with the fix it passes in <50ms.
  • TestSendRequestSSEStreamStaysOpen — integration test using a mock stateless SSE server that responds with text/event-stream, sends a single event, but keeps the stream open. Verifies that multiple SendRequest calls complete without hanging.

All existing tests pass with -race.

Dependencies

This PR depends on #769 (fix/http2-close-before-cancel), which fixed the cancel() before resp.Body.Close() ordering in SendRequest/SendNotification/sendResponseToServer. This PR builds on that branch and addresses the remaining readSSE blocking I/O issue described in #769's review discussion.
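To make the ordering issue from #769 concrete, here is a small sketch under stated assumptions. blockingBody is a hypothetical stand-in for an HTTP/2 response body whose Close() waits on the request context, and the 200ms timeout exists only so the example terminates; the real transport has no such timeout. The two functions differ only in whether cancel() runs before or after Close().

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// blockingBody is a hypothetical stand-in for an HTTP/2 response body whose
// Close waits until the request context is cancelled.
type blockingBody struct{ ctx context.Context }

func (b blockingBody) Close() error {
	select {
	case <-b.ctx.Done():
		return nil
	case <-time.After(200 * time.Millisecond):
		// Stands in for the indefinite hang so the example can finish.
		return errors.New("close timed out waiting for cancellation")
	}
}

// closeThenCancel registers cancel first and Close second. Deferred calls run
// last-in-first-out, so Close runs while the context is still live.
func closeThenCancel() (err error) {
	ctx, cancel := context.WithCancel(context.Background())
	body := blockingBody{ctx: ctx}
	defer cancel()
	defer func() { err = body.Close() }()
	return nil
}

// cancelThenClose cancels explicitly before closing, so Close sees a
// cancelled context and returns at once.
func cancelThenClose() (err error) {
	ctx, cancel := context.WithCancel(context.Background())
	body := blockingBody{ctx: ctx}
	defer func() {
		cancel()
		err = body.Close()
	}()
	return nil
}

func main() {
	fmt.Println("close then cancel:", closeThenCancel())
	fmt.Println("cancel then close:", cancelThenClose())
}
```

Putting both calls in one deferred function makes the order explicit instead of relying on the reader to recall that defers unwind in reverse.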

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Fixed a potential hang condition during HTTP/2 Server-Sent Events (SSE) stream closure by reordering resource cleanup to ensure proper request context cancellation before response body closure.
  • Tests
    • Added comprehensive tests covering SSE stream cancellation scenarios, including blocking HTTP/2 response body closure and continuous listening operations.

samkeet and others added 6 commits April 1, 2026 11:41
On HTTP/2, resp.Body.Close() blocks in a select waiting for stream cleanup
(cs.donec) or context cancellation (cs.ctx.Done()). When cc.wmu is contended,
cs.donec may never close, making ctx.Done() the only exit path.

The previous defer ordering (LIFO) ran Close() before cancel(), so ctx.Done()
never fired — causing an indefinite hang when the MCP server leaves the SSE
stream open after sending its response (allowed by the spec: SHOULD, not MUST).

Fix: call cancel() before resp.Body.Close() in all three locations:
- SendRequest
- SendNotification
- sendResponseToServer

Includes a deterministic reproduction test using a mock transport that
simulates the HTTP/2 Close() blocking behavior.

Fixes mark3labs#768
Branch-Creation-Time: 2026-03-27T18:18:02+0000
…in test

- Return descriptive error when resp is nil instead of wrapping a nil err
- Add done channel to blockingSSEReader so goroutines are cleanly unblocked
  when h2BodySimulator.Close() is called, preventing leaks on test panics
- Delete path/to/your/file.go (placeholder that caused typecheck lint failure)
- Delete revert_commit.md (stray file)
- Remove unused t *testing.T field from mockH2Transport struct
- Add defer transport.Close() after Start() in regression test

coderabbitai bot commented Apr 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 61ae8d89-7f22-43ee-91ba-b95ffe66f507

📥 Commits

Reviewing files that changed from the base of the PR and between 19725ac and 2e100d6.

📒 Files selected for processing (3)
  • client/transport/streamable_http.go
  • client/transport/streamable_http_bodyclose_test.go
  • client/transport/streamable_http_test.go
✅ Files skipped from review due to trivial changes (2)
  • client/transport/streamable_http.go
  • client/transport/streamable_http_test.go

Walkthrough

Reorders context cancellation and HTTP response body closing in StreamableHTTP to avoid HTTP/2 hangs; modifies SSE reader to close the reader on context cancellation to unblock blocked reads. Adds deterministic tests simulating HTTP/2 body-close hang and multiple SSE cancellation scenarios.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| StreamableHTTP core: client/transport/streamable_http.go | Reordered cleanup: explicit cancel() on error paths and ensure cancel() runs before resp.Body.Close(); return explicit error when resp == nil; changed readSSE to spawn a goroutine that closes the reader on ctx.Done() to interrupt blocking reads and rely on read errors to terminate the loop. |
| SSE & body-close tests: client/transport/streamable_http_bodyclose_test.go, client/transport/streamable_http_test.go | Added deterministic HTTP/2 body-close hang simulator (h2BodySimulator), mock transport, and tests verifying: resp.Body.Close() does not block waiting for ctx.Done(), readSSE exits promptly on context cancellation, and long-lived/continuous SSE listeners do not block subsequent requests. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

type: bug

Suggested reviewers

  • ezynda3
  • pottekkat
  • robert-jackson-glean
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix(client): make readSSE context-aware to prevent goroutine leaks and HTTP/2 hangs' clearly and specifically describes the main change: making readSSE context-aware to prevent multiple issues. |
| Description check | ✅ Passed | The PR description is comprehensive and well-structured, covering the problem (blocking I/O in readSSE), the fix (spawning goroutine to close reader on context cancellation), test coverage (two new tests), and dependencies (PR #769). |


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.



@samkeet samkeet changed the title from "fix: make readSSE context-aware to prevent goroutine leaks and HTTP/2 hangs" to "fix(client): make readSSE context-aware to prevent goroutine leaks and HTTP/2 hangs" Apr 2, 2026

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
client/transport/streamable_http_test.go (1)

1330-1331: Global retryInterval mutation may cause test interference.

Setting retryInterval = 10 * time.Millisecond modifies a package-level variable. If tests run in parallel (t.Parallel()), this could cause flaky behavior. The same mutation exists at line 749 in TestContinuousListening.

Consider using t.Cleanup to restore the original value, or pass the interval via a constructor option:

♻️ Suggested fix: Restore retryInterval after test
 func TestSendRequestSSEStreamStaysOpenWithContinuousListening(t *testing.T) {
+	origRetryInterval := retryInterval
 	retryInterval = 10 * time.Millisecond
+	t.Cleanup(func() { retryInterval = origRetryInterval })
+
 	serverURL, closeServer := startMockStatelessSSEServer()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/transport/streamable_http_test.go` around lines 1330 - 1331, The test
mutates the package-level retryInterval causing cross-test interference; in
TestSendRequestSSEStreamStaysOpenWithContinuousListening (and similarly
TestContinuousListening) capture the original value at start, set retryInterval
= 10*time.Millisecond for the test, and register t.Cleanup to restore the
original retryInterval when the test finishes (or refactor the code to accept an
injected retry interval via constructor/option and use that instead of the
global). Ensure you reference and update the global symbol retryInterval and the
test functions TestSendRequestSSEStreamStaysOpenWithContinuousListening /
TestContinuousListening when applying the cleanup or injection fix.
client/transport/streamable_http.go (1)

517-523: Goroutine may outlive readSSE if reader returns EOF before context cancellation.

When readSSE returns due to EOF (line 543), the goroutine spawned at line 520 continues waiting on ctx.Done(). While this is typically cleaned up when the caller's context is eventually cancelled, it creates an unnecessary lingering goroutine between EOF and context cancellation.

Consider using a local done channel to signal the goroutine to exit when readSSE returns:

♻️ Optional: Signal goroutine to exit on readSSE return
 func (c *StreamableHTTP) readSSE(ctx context.Context, reader io.ReadCloser, handler func(event, data string)) {
+	// done signals the close-goroutine to exit when readSSE returns normally.
+	done := make(chan struct{})
+	defer close(done)
+
 	// Close the reader when context is cancelled to interrupt blocking reads.
 	// This ensures ReadString returns immediately with an error instead of
 	// blocking indefinitely when the SSE stream is open but idle.
 	go func() {
-		<-ctx.Done()
-		reader.Close()
+		select {
+		case <-ctx.Done():
+			reader.Close()
+		case <-done:
+			// readSSE returned normally; no need to close (caller handles it)
+		}
 	}()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@client/transport/streamable_http.go` around lines 517 - 523, The goroutine
that closes reader waits only on ctx.Done() and can leak after readSSE returns
(e.g., on EOF); modify the logic in readSSE to create a local done channel
(e.g., done := make(chan struct{})), have the goroutine select between
ctx.Done() and done to decide to close reader, and close(done) right before
readSSE returns so the goroutine exits promptly; reference the readSSE function,
the reader variable, and ctx.Done() when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@client/transport/streamable_http.go`:
- Around line 686-688: The comment above the deferred resp.Body.Close() is stale
— it says "Cancel the context before closing the body" although this function
doesn't own a cancel function; update the comment to accurately describe
behavior (e.g., "Close response body; context cancellation is the caller's
responsibility" or similar) and remove the reference to cancel(), while
optionally noting similarity to SendRequest/SendNotification but not implying
ownership of ctx; locate the defer containing resp.Body.Close() in this file and
replace the misleading comment accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: d65d0675-33db-4117-b90f-3411a74a3832

📥 Commits

Reviewing files that changed from the base of the PR and between d52df1a and 19725ac.

📒 Files selected for processing (3)
  • client/transport/streamable_http.go
  • client/transport/streamable_http_bodyclose_test.go
  • client/transport/streamable_http_test.go

readSSE uses bufio.ReadString which is blocking I/O that does not respect
context cancellation. When an SSE stream is open but idle (e.g. stateless
MCP servers that never send data on GET connections), ReadString blocks
indefinitely. The select{case <-ctx.Done()} check only runs between reads,
not during a blocked read.

Fix: spawn a goroutine that closes the reader when ctx is cancelled. This
causes ReadString to return immediately with an error, allowing readSSE
to exit promptly.

Also fix createGETConnectionToServer to use the same cancel-before-close
pattern as SendRequest and SendNotification, preventing HTTP/2 body drain
hangs on shutdown.
Branch-Creation-Time: 2026-04-01T23:12:16+0000
@samkeet samkeet force-pushed the fix/context-aware-readsse branch from 19725ac to 2e100d6 April 2, 2026 16:42
@ezynda3 ezynda3 merged commit 231ba4d into mark3labs:main Apr 4, 2026
4 checks passed

Development

Successfully merging this pull request may close these issues.

bug: readSSE blocks indefinitely on context cancellation — blocking I/O ignores ctx.Done()

2 participants