mattvaughan

Description

When an MCP server connection dies, or the client receives a response that isn't part of the MCP protocol (such as a 400 for an unrecognized session ID), the caller is never notified of the error and hangs indefinitely, waiting for a result on the call_tool_(a)sync Future.

An easy way to reproduce this "hang" is to have an agent invoke a tool on an MCP server once, kill the server, then have it try to invoke the same MCP tool again.

This PR addresses the issue by keeping a reference to every Future that doesn't yet have a result and, when there is a problem with the MCP connection, failing those Futures fast instead of letting them hang.
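
The general pattern can be sketched in isolation. This is a minimal illustration, not the code in this PR: `PendingFutures` and its methods are hypothetical names, and `MCPConnectionError` is redefined locally as a stand-in so the snippet runs on its own.

```python
import threading
from concurrent.futures import Future, InvalidStateError


class MCPConnectionError(Exception):
    """Local stand-in for the exception this PR adds (assumption: plain Exception subclass)."""


class PendingFutures:
    """Hypothetical helper: remembers unresolved futures so they can be failed together."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending = set()  # futures that have no result yet

    def track(self, future: Future) -> Future:
        with self._lock:
            self._pending.add(future)
        # Forget the future as soon as it resolves on its own.
        future.add_done_callback(self._discard)
        return future

    def _discard(self, future: Future) -> None:
        with self._lock:
            self._pending.discard(future)

    def fail_all(self, error: BaseException) -> int:
        """Fail every unresolved future and return how many were failed."""
        with self._lock:
            pending = list(self._pending)
            self._pending.clear()
        failed = 0
        for future in pending:
            try:
                future.set_exception(error)
                failed += 1
            except InvalidStateError:
                # The future resolved between the snapshot and now; leave it alone.
                pass
        return failed


tracker = PendingFutures()
future = tracker.track(Future())
tracker.fail_all(MCPConnectionError("MCP background thread died"))
try:
    future.result(timeout=1)
except MCPConnectionError as exc:
    print(f"failed fast: {exc}")
```

The lock is released before `set_exception` is called, so the done-callback that removes a future from the set cannot deadlock against `fail_all`.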

I've intentionally left out any "retry" logic, leaving the reinitialization of the tool and its connection up to the caller.

Example logs with fix:

[2025-09-30 15:59:11.770] httpx [pid:376743] - INFO - HTTP Request: POST http://localhost:8888/mcp "HTTP/1.1 400 Bad Request"
[2025-09-30 15:59:11.771] strands.tools.mcp.mcp_client [pid:376743] - ERROR - MCP background thread failure during async tool call: MCP background thread died: unhandled errors in a TaskGroup (1 sub-exception)
Traceback (most recent call last):
  File "/path/to/project/sdk-python/src/strands/tools/mcp/mcp_client.py", line 343, in call_tool_async
    call_tool_result: MCPCallToolResult = await asyncio.wrap_future(future)
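
For readers unfamiliar with why the traceback points at the `asyncio.wrap_future` line: an exception set on a `concurrent.futures.Future` from another thread is re-raised in the coroutine awaiting the wrapped future. The sketch below shows this with the standard library only. The function names are illustrative, and the built-in `ConnectionError` stands in for the PR's own exception type.

```python
import asyncio
import threading
from concurrent.futures import Future


async def await_tool_result(future: Future) -> str:
    # Same awaiting pattern as the traceback line above.
    return await asyncio.wrap_future(future)


async def demo() -> str:
    future: Future = Future()
    # Simulate a background thread failing the pending future shortly after the call starts.
    timer = threading.Timer(
        0.05, future.set_exception, args=[ConnectionError("MCP background thread died")]
    )
    timer.start()
    try:
        # The timeout is only a safety net; the exception arrives well before it.
        return await asyncio.wait_for(await_tool_result(future), timeout=2)
    except ConnectionError as exc:
        return f"raised: {exc}"
    finally:
        timer.join()


result = asyncio.run(demo())
print(result)
```

Without something setting an exception (or a result) on the future, the `await` would never complete, which is the hang described above.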

Related Issues

#792

Documentation PR

N/A

Type of Change

Bug fix

Testing

This shouldn't impact any existing functionality. It only enhances how errors in the MCP client <> server background connection are surfaced to callers.

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

- Add MCPConnectionError exception for runtime connection failures
- Track pending futures in MCPClient to prevent hangs
- Fail pending futures when background thread encounters errors
- Add proper exception handling for MCPConnectionError in tool calls
- Clean up pending futures when client context exits

This prevents callers from hanging forever when the MCP background
thread dies due to connection issues or other runtime errors.

bbckr commented Oct 6, 2025

cc @dbschmigelski
