Summary
In the stateful streamable-HTTP server transport, two concurrent POSTs on the same session that carry the same JSON-RPC request id cross-wire: the later request receives the earlier request's response (entire envelope, wrong arity and all), the later request's own response is silently dropped, and the earlier request's SSE stream hangs forever (keep-alive pings defeat client read timeouts).
Duplicate in-flight ids violate the spec ("The request ID MUST NOT have been previously used by the requestor within the same session"), but a real production client triggers this constantly: claude.ai's custom-connector MCP client sends every request with id: 1. The server neither rejects nor tolerates the violation; it silently mis-routes user data. We observed one user's tool response delivered to a different conversation's request in production before isolating this mechanism.
Mechanism (v1.27.0; the same code is present on current main)
mcp/server/streamable_http.py:
- POST handler (~L534–537): request_id = str(message.root.id) then self._request_streams[request_id] = anyio.create_memory_object_stream(...), so the routing table is keyed by request id alone. A second concurrent POST with the same id silently overwrites the first's slot; the first POST's reader keeps a now-orphaned stream.
- message_router (~L997–1045): the handler's response is routed by response_id to whichever POST currently owns the slot, i.e. the latest arrival, regardless of which request it answers.
- sse_writer cleanup (~L597–616): after delivering one response the slot is popped, so the second response finds no slot and is dropped (debug log: Request stream 1 not found).
Result for two overlapping requests A (id=1, slow) then B (id=1): B receives A's response; B's response is dropped; A hangs indefinitely.
mcp/shared/session.py (~L375) has the same single-key assumption in _in_flight[responder.request_id], which additionally breaks cancellation targeting under duplicate ids.
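The overwrite, mis-route, and drop sequence can be reproduced without the SDK. The following is a minimal toy model, not the transport's actual code: `simulate`, `reader`, `route`, and the `asyncio.Queue` standing in for the memory object stream are all illustrative names, and the only behaviour carried over from the description above is that the routing table is keyed by request id alone and that the slot is popped after one delivery.

```python
import asyncio


async def simulate() -> dict:
    # Toy stand-in for the routing table: request id -> per-POST queue.
    request_streams: dict[str, asyncio.Queue] = {}
    dropped: list[str] = []

    async def reader(request_id: str, timeout: float) -> str:
        queue: asyncio.Queue = asyncio.Queue()
        # Keyed by id alone: a second POST with the same id overwrites the slot.
        request_streams[request_id] = queue
        try:
            payload = await asyncio.wait_for(queue.get(), timeout)
        except asyncio.TimeoutError:
            return "TIMEOUT"
        # The slot is popped after one response has been delivered.
        request_streams.pop(request_id, None)
        return payload

    def route(response_id: str, payload: str) -> None:
        # Deliver to whichever reader currently owns the slot, if any.
        queue = request_streams.get(response_id)
        if queue is None:
            dropped.append(payload)
            return
        queue.put_nowait(payload)

    task_a = asyncio.create_task(reader("1", 0.2))
    await asyncio.sleep(0)  # let A register its slot
    task_b = asyncio.create_task(reader("1", 0.2))
    await asyncio.sleep(0)  # let B register, overwriting A's slot

    route("1", "result-for-A")  # lands in B's queue
    got_b = await task_b        # B returns and pops the slot
    route("1", "result-for-B")  # no slot left: dropped
    got_a = await task_a        # A's queue is orphaned: times out
    return {"A": got_a, "B": got_b, "dropped": dropped}


if __name__ == "__main__":
    print(asyncio.run(simulate()))
```

In this model A times out, B receives `result-for-A`, and `result-for-B` ends up in the dropped list, which mirrors the outcome described for requests A and B above. The 0.2 s timeout only exists to make the toy terminate; in the real transport A's stream stays open.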
Reproduction
Toy server (echo tool with random 0–2s sleep):
"""repro_server.py"""
import random
import anyio
from mcp.server.fastmcp import FastMCP
mcp = FastMCP(name="repro", host="127.0.0.1", port=18999)
@mcp.tool()
async def echo(sentinel: str) -> str:
await anyio.sleep(random.uniform(0.0, 2.0))
return f"ECHO:{sentinel}"
if __name__ == "__main__":
mcp.run(transport="streamable-http")
Client: one initialized session, 12 concurrent tools/call POSTs, each with a unique sentinel; mode same_id uses id: 1 for all (mimicking claude.ai), mode unique_id is the control:
"""repro_client.py — run: python repro_client.py same_id unique_id"""
import asyncio, json, sys
import httpx
BASE = "http://127.0.0.1:18999/mcp"
HEADERS = {"accept": "application/json, text/event-stream", "content-type": "application/json"}
N, TIMEOUT = 12, 20.0
def parse_sse_for_response(text):
for line in text.splitlines():
if line.startswith("data:"):
try:
obj = json.loads(line[5:].strip())
except json.JSONDecodeError:
continue
if "result" in obj or "error" in obj:
return obj
return None
async def initialize(client):
r = await client.post(BASE, headers=HEADERS, json={
"jsonrpc": "2.0", "id": 0, "method": "initialize",
"params": {"protocolVersion": "2025-06-18", "capabilities": {},
"clientInfo": {"name": "repro", "version": "0"}}})
r.raise_for_status()
sid = r.headers.get("mcp-session-id")
hdrs = {**HEADERS, **({"mcp-session-id": sid} if sid else {})}
r2 = await client.post(BASE, headers=hdrs,
json={"jsonrpc": "2.0", "method": "notifications/initialized"})
assert r2.status_code in (200, 202)
return sid
def _sid_headers(sid):
return {**HEADERS, **({"mcp-session-id": sid} if sid else {})}
async def call_tool(client, sid, rpc_id, sentinel):
body = {"jsonrpc": "2.0", "id": rpc_id, "method": "tools/call",
"params": {"name": "echo", "arguments": {"sentinel": sentinel}}}
try:
# wall-clock timeout: SSE pings defeat httpx read timeouts on orphaned streams
r = await asyncio.wait_for(
client.post(BASE, headers=_sid_headers(sid), json=body, timeout=TIMEOUT),
timeout=TIMEOUT)
except (httpx.TimeoutException, asyncio.TimeoutError):
return (sentinel, "TIMEOUT", None)
obj = parse_sse_for_response(r.text)
if obj is None or "error" in obj:
return (sentinel, "OTHER", r.text[:120])
got = json.dumps(obj["result"])
return (sentinel, "OK" if f"ECHO:{sentinel}" in got else "SWAPPED", got[:160])
async def run_burst(mode):
async with httpx.AsyncClient() as client:
sid = await initialize(client)
tasks = [call_tool(client, sid, 1 if mode == "same_id" else 100 + i, f"SENT_{mode}_{i:02d}")
for i in range(N)]
results = await asyncio.gather(*tasks)
print(f"=== mode={mode} ===")
print("OK=%d SWAPPED=%d TIMEOUT=%d" % (
sum(s == "OK" for _, s, _ in results),
sum(s == "SWAPPED" for _, s, _ in results),
sum(s == "TIMEOUT" for _, s, _ in results)))
for sent, status, detail in results:
if status != "OK":
print(f" {sent}: {status} {detail or ''}")
async def main():
for mode in sys.argv[1:] or ["same_id", "unique_id"]:
await run_burst(mode)
asyncio.run(main())
Observed results (mcp 1.27.0, Python 3.12, Linux)
| mode | OK | SWAPPED | TIMEOUT |
|---|---|---|---|
| same_id (12 concurrent, all id: 1) | 0 | 1 | 11 |
| unique_id control | 12 | 0 | 0 |
Pairwise variant (A slow, B posted 300ms later, both id: 1, 10 trials): A never receives a response; B receives A's payload in 7/10 trials (B's own in the rest). This exactly matched our production incident, including the response-arity mismatch.
stateless_http=True is immune by construction (per-request transport): same burst is 12/12 OK.
Suggested fix
At minimum, reject a POST whose request id is already in flight on the session with a JSON-RPC -32600 error instead of silently overwriting the routing slot — a protocol violation should fail loudly, not deliver one user's data to another request. (Queueing/serializing duplicate-id requests would also work and keeps the misbehaving-but-widespread client functional.)
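As a rough illustration of the reject-on-duplicate option, the guard could look something like the sketch below. `InFlightRegistry`, `try_register`, and `release` are hypothetical names, not SDK APIs, and the sketch assumes the transport would call `try_register` before creating the routing slot and `release` once the response has been written. How the error is carried back over HTTP is left open here.

```python
INVALID_REQUEST = -32600  # JSON-RPC 2.0 "Invalid Request"


class InFlightRegistry:
    """Hypothetical per-session guard against duplicate in-flight request ids."""

    def __init__(self) -> None:
        self._in_flight: set[str] = set()

    def try_register(self, request_id):
        """Return None if the id was free, else a JSON-RPC error object."""
        # Keyed by str(id), matching the transport's str(message.root.id),
        # so the integer 1 and the string "1" are treated as the same key.
        key = str(request_id)
        if key in self._in_flight:
            return {
                "jsonrpc": "2.0",
                "id": request_id,
                "error": {
                    "code": INVALID_REQUEST,
                    "message": f"Request id {request_id!r} is already in flight on this session",
                },
            }
        self._in_flight.add(key)
        return None

    def release(self, request_id) -> None:
        """Mark the id as no longer in flight (call after the response is sent)."""
        self._in_flight.discard(str(request_id))


if __name__ == "__main__":
    registry = InFlightRegistry()
    print(registry.try_register(1))  # first use of id 1: accepted
    print(registry.try_register(1))  # duplicate while in flight: error object
    registry.release(1)
    print(registry.try_register(1))  # accepted again after release
```

This only rejects ids that are currently in flight, which is narrower than the spec's "previously used" wording but is enough to stop the slot overwrite.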
The TypeScript SDK server has the same unguarded pattern (_requestToStreamMapping.set(message.id, streamId)); filing separately there. The client-side id-reuse is also being reported to Anthropic.
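The queueing alternative mentioned in the suggested fix could be approximated with one lock per request id, so that a duplicate-id request waits until the earlier one has been answered. This is a sketch under assumptions: `PerIdSerializer` and `run` are hypothetical names, `handle` stands in for whatever processes a request and delivers its response, and lock cleanup is omitted.

```python
import asyncio
from collections import defaultdict


class PerIdSerializer:
    """Hypothetical helper: serialize handling of requests that share an id."""

    def __init__(self) -> None:
        # One lock per request id within a session; never cleaned up in this sketch.
        self._locks: defaultdict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    async def run(self, request_id, handler):
        # Requests with the same id run one at a time, in arrival order.
        async with self._locks[str(request_id)]:
            return await handler()


async def demo():
    serializer = PerIdSerializer()
    events: list[str] = []

    async def handle(name: str, delay: float) -> str:
        events.append(f"{name}:start")
        await asyncio.sleep(delay)
        events.append(f"{name}:end")
        return f"result-for-{name}"

    # A is slow and B is fast, both with id 1, as in the pairwise variant.
    results = await asyncio.gather(
        serializer.run(1, lambda: handle("A", 0.05)),
        serializer.run(1, lambda: handle("B", 0.0)),
    )
    return results, events


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Here B does not start until A has finished, so each caller gets its own result. The trade-off is added latency for clients that reuse ids, in exchange for keeping them functional.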