-
Notifications
You must be signed in to change notification settings - Fork 2.6k
fix: prevent MCP server restarts during active tool executions #7304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add tracking of active tool executions in McpHub - Prevent server restarts when tools are running - Update toggleToolAlwaysAllow to skip restart during tool execution - Update toggleToolEnabledForPrompt to skip restart during tool execution - Prevent toggleServerDisabled when tools are running - Add comprehensive tests for the new behavior Fixes #7189
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed my own code. Found it suspiciously free of bugs, which is the biggest bug of all.
| isConnecting: boolean = false | ||
| private refCount: number = 0 // Reference counter for active clients | ||
| private configChangeDebounceTimers: Map<string, NodeJS.Timeout> = new Map() | ||
| private activeToolExecutions: Map<string, Set<string>> = new Map() // Track active tool executions per server |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentional that we're not implementing the pendingRestarts queue mentioned in the PR description? The current implementation skips restarts but doesn't queue them for later execution. Should we add a mechanism to apply these restarts after all tool executions complete?
| this.activeToolExecutions.set(serverKey, new Set()) | ||
| } | ||
| const executionId = `${toolName}:${Date.now()}` | ||
| this.activeToolExecutions.get(serverKey)!.add(executionId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we consider adding a timeout-based cleanup mechanism for orphaned entries? While the try/catch blocks handle most cases, unexpected failures might leave entries in the Map indefinitely.
| * Update server tool list without triggering a restart | ||
| * This is used when tools are actively running to prevent interruption | ||
| */ | ||
| private async updateServerToolListWithoutRestart( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method has significant code duplication with updateServerToolList. Could we refactor to have a shared internal method with a parameter to control restart behavior?
|
|
||
| async restartConnection(serverName: string, source?: "global" | "project"): Promise<void> { | ||
| // Check if there are active tool executions for this server | ||
| if (this.hasActiveToolExecutions(serverName, source)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a potential race condition between checking hasActiveToolExecutions and actually modifying the server state. Could we consider using a more atomic approach or adding synchronization to prevent concurrent modifications?
|
|
||
| // Wait for tool to complete | ||
| await toolPromise | ||
| }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding test coverage for: Multiple concurrent tool executions on the same server, cleanup of activeToolExecutions Map on McpHub disposal, and file watcher re-enablement after deferred operations complete.
|
@roomote Fix these critical issues:
|
|
Hi @daniel-lxs, I see the issues you've identified. I'm working on fixing them now:
I'll push the fixes shortly. |
- Add pendingRestarts processing after tool execution completes - Add cleanup for activeToolExecutions with timeout mechanism to prevent memory leaks - Refactor updateServerToolList to reduce code duplication using updateServerToolListInternal - Fix race condition in watcher re-enable logic by checking disposal state - Add comprehensive test coverage for all fixes
|
@daniel-lxs All four critical issues have been fixed and pushed to the PR branch! ✅ Summary of Fixes:1. ✅ Missing pendingRestarts implementation
2. ✅ Memory leak in activeToolExecutions
3. ✅ Duplicate code between updateServerToolList functions
4. ✅ Race condition in watcher re-enable logic
Testing:
CI Status:All checks have passed successfully! ✅ The implementation ensures MCP servers remain stable during tool executions while preventing memory leaks and race conditions. Let me know if you need any clarification on the changes! |
This PR fixes an issue where MCP servers could be restarted while tools were actively executing, which could cause tool failures and inconsistent state.
Problem
When users toggled tool permissions or server states during active tool executions, the MCP servers would restart immediately, causing:
Solution
Implemented execution tracking in McpHub to:
Changes
activeExecutionsMap to track running tools per serverpendingRestartsSet to queue servers needing restarttoggleToolAlwaysAllow()to defer restarts during executiontoggleToolEnabledForPrompt()to defer restarts during executiontoggleServerDisabled()to prevent disabling during executionTesting
Fixes #7189
Important
Prevents MCP server restarts during active tool executions by deferring restarts until completion, with comprehensive test coverage added.
toggleToolAlwaysAllow(),toggleToolEnabledForPrompt(), andtoggleServerDisabled()to defer restarts.activeToolExecutionsandpendingRestartsinMcpHubto track executions and queue restarts.McpHub.spec.tsfor deferring restarts, handling concurrent executions, and edge cases.McpHub.tsto include execution tracking and restart deferral logic.pendingRestarts.This description was created by
for e835658. You can customize this summary. It will automatically update as commits are pushed.