[fix] catch CancelledError in MCP reconnection and remove redundant build_tools#6631
[fix] catch CancelledError in MCP reconnection and remove redundant build_tools#6631hztBUAA wants to merge 5 commits intoagno-agi:mainfrom
Conversation
…ild_tools When an MCP server disconnects and reconnection is attempted, asyncio.CancelledError was being silently swallowed by broad BaseException catches, preventing proper task cancellation propagation. Additionally, build_tools was called redundantly during reconnection since connect(force=True) already calls initialize() -> build_tools() internally. Changes: - Add explicit asyncio.CancelledError handling in MCPTools.connect(), build_tools(), and initialize() to re-raise instead of swallowing - Add same CancelledError handling in MultiMCPTools.connect() and __aenter__() - Restructure refresh_connection logic in agent/_tools.py and team/_tools.py to only call build_tools() when connection is alive (avoiding duplicate call after reconnection) - Add asyncio.CancelledError re-raise in the refresh connection exception handlers
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 35aba6d6c0
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if not is_alive: | ||
| # connect(force=True) internally calls initialize() -> build_tools(), | ||
| # so no separate build_tools() call is needed after reconnection. | ||
| await tool.connect(force=True) # type: ignore |
There was a problem hiding this comment.
Avoid appending stale MCP tools after failed reconnect
In this if not is_alive branch, connect(force=True) is now the only refresh step, but MCPTools.connect() swallows non-cancellation connection failures and returns without raising, so execution falls through to tool inclusion when check_mcp_tools=False. That flag is used by schema endpoints (e.g. os/routers/agents/schema.py), so a dead MCP server can now leave a stale toolkit (with old functions) in the returned tool list instead of being skipped as before when the post-connect build_tools() failure triggered continue.
Useful? React with 👍 / 👎.
…ools Without this guard, a task cancellation during send_ping() would be caught by the broad `except BaseException` handler, returning False instead of propagating the cancellation signal. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Fixes asyncio.CancelledError being silently swallowed during MCP reconnection, and removes redundant build_tools calls after forced reconnection.
Fixes #6235
Type of change
Checklist
./scripts/format.shand./scripts/validate.shAdditional Notes
Root cause: Broad
except (RuntimeError, BaseException)handlers in MCPTools swallowedasyncio.CancelledError, preventing proper task cancellation. Additionally,refresh_connectioncalledconnect(force=True)(which runs build_tools internally) followed by another explicitbuild_tools().Fix: Added explicit
CancelledErrorre-raise before broad handlers in MCPTools/MultiMCPTools. Restructured refresh logic to skip redundant build_tools after reconnection.