-
Notifications
You must be signed in to change notification settings - Fork 2.4k
fix: resolve MCP server connection race condition after marketplace installation #8469
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add connection readiness check before marking server as connected - Implement retry logic with exponential backoff for initial tool list fetch - Add retry mechanism for tool calls to handle connection initialization race - Wait for server to be fully ready before showing 'Running' status This fixes the issue where the first MCP tool call fails with 'Connection closed' error after marketplace installation, ensuring servers are fully initialized before accepting tool calls. Fixes #8468
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Self-review: inspecting my own diffs like a deterministic mirror—flagging a couple reflections to fix.
| } catch (readinessError) { | ||
| // If we can't fetch tools, the connection isn't ready | ||
| console.error(`Server ${name} failed readiness check:`, readinessError) | ||
| connection.server.status = "disconnected" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[P1] Readiness failure leaves transport/client open and a 'connected'-typed entry in a 'disconnected' state. Close the transport/client (or call deleteConnection) before throwing to avoid resource leaks and inconsistent state.
| } | ||
|
|
||
| // Calculate exponential backoff delay | ||
| const delay = baseDelay * Math.pow(2, attempt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[P3] The local variable 'delay' shadows the imported delay() helper, which can cause confusion and accidental misuse later. Consider renaming to 'backoffMs'.
| const delay = baseDelay * Math.pow(2, attempt) | |
| const backoffMs = baseDelay * Math.pow(2, attempt) | |
| console.log(`Retrying tools fetch for ${serverName} after ${backoffMs}ms (attempt ${attempt + 1}/${maxRetries})`) | |
| await new Promise((resolve) => setTimeout(resolve, backoffMs)) |
| ) | ||
| } catch (error: any) { | ||
| const isLastAttempt = attempt === maxRetries - 1 | ||
| const isConnectionError = error?.code === -32000 && error?.message?.includes("Connection closed") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[P2] Retry predicate is narrowly scoped to '-32000' + 'Connection closed'. In practice, transient connection closures can surface as ECONNRESET, EPIPE, or AbortError from transports. Broadening this to catch common transient cases will make the first-call robustness closer to intent.
| const isConnectionError = error?.code === -32000 && error?.message?.includes("Connection closed") | |
| const isConnectionError = | |
| (error?.code === -32000 && error?.message?.includes(\"Connection closed\")) || | |
| /ECONNRESET|EPIPE|connection.*(closed|reset)/i.test(String(error?.message)) || | |
| error?.name === \"AbortError\" |
Summary
This PR fixes the issue where the first MCP tool call fails with "Connection closed" error immediately after installing a server from the marketplace.
Problem
When installing an MCP server from the marketplace, the server shows "Running" status in the UI, but the first tool call fails with "MCP error -32000: Connection closed". Users have to retry the exact same tool call for it to succeed.
Solution
The fix implements:
Changes
McpHub.connectToServer()to wait for server readiness before marking as connectedfetchToolsListWithRetry()method with exponential backoff (500ms, 1s, 2s)callTool()with retry logic for connection errors (2 retries with 1s delay)Testing
Fixes #8468
Important
Fixes MCP server connection race condition by adding readiness checks and retry mechanisms in
McpHubclass.McpHub.connectToServer().fetchToolsListWithRetry()with exponential backoff (500ms, 1s, 2s) for connection readiness.callTool()for connection errors (2 retries with 1s delay).connectToServer()to verify connection readiness before marking as connected.fetchToolsListWithRetry()to handle initial connection delays.callTool()with retry mechanism for connection errors.This description was created by
for 794ddfa. You can customize this summary. It will automatically update as commits are pushed.