
streamable HTTP (stateful): concurrent requests with duplicate JSON-RPC ids cross-wire responses — one request receives another's payload, the other hangs #3060

Description

@scottmsilver

Summary

In the stateful streamable-HTTP server transport, two concurrent POSTs on the same session that carry the same JSON-RPC request id cross-wire: the later request receives the earlier request's response (entire envelope, wrong arity and all), the later request's own response is silently dropped, and the earlier request's SSE stream hangs forever (keep-alive pings defeat client read timeouts).

Duplicate in-flight ids violate the spec ("The request ID MUST NOT have been previously used by the requestor within the same session"), but a real production client triggers this constantly: claude.ai's custom-connector MCP client sends every request with id: 1. The server neither rejects nor tolerates the violation; it silently mis-routes user data. We observed one user's tool response delivered to a different conversation's request in production before isolating this mechanism.

Mechanism (v1.27.0; the same code is present on current main)

mcp/server/streamable_http.py:

  1. POST handler (~L534–537): request_id = str(message.root.id) then self._request_streams[request_id] = anyio.create_memory_object_stream(...) — the routing table is keyed by request id alone. A second concurrent POST with the same id silently overwrites the first's slot; the first POST's reader keeps a now-orphaned stream.
  2. message_router (~L997–1045): the handler's response is routed by response_id to whichever POST currently owns the slot — i.e. the latest arrival, regardless of which request it answers.
  3. sse_writer cleanup (~L597–616): after delivering one response the slot is popped, so the second response finds no slot and is dropped (debug log: Request stream 1 not found).

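Result for two overlapping requests A (id=1, slow) then B (id=1): whichever response is produced first is delivered to B, the current slot owner. When that is A's response, B receives A's payload and B's own response is dropped. In either case A hangs indefinitely.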
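The three steps above can be reproduced in isolation with a toy model. This is a minimal sketch, not the SDK's code: `ToyRouter`, `register`, and `route` are hypothetical names, `asyncio.Queue` stands in for the anyio memory object stream, and the slot is popped inside `route` rather than in a separate writer task.

```python
import asyncio


class ToyRouter:
    """Simplified model of a routing table keyed by request id alone."""

    def __init__(self):
        self.slots: dict = {}
        self.dropped: list = []

    def register(self, request_id: str) -> asyncio.Queue:
        # Step 1: a duplicate id silently overwrites the previous slot.
        queue = asyncio.Queue()
        self.slots[request_id] = queue
        return queue

    def route(self, request_id: str, payload: str) -> None:
        # Steps 2 and 3: deliver to the current slot owner, then pop the slot.
        queue = self.slots.pop(request_id, None)
        if queue is None:
            self.dropped.append(payload)  # analogous to "Request stream 1 not found"
        else:
            queue.put_nowait(payload)


async def simulate() -> dict:
    router = ToyRouter()
    stream_a = router.register("1")      # request A arrives
    stream_b = router.register("1")      # request B arrives and takes over the slot
    router.route("1", "response-for-A")  # assumption: A's handler finishes first
    router.route("1", "response-for-B")  # B's response finds no slot
    return {
        "B got": stream_b.get_nowait(),
        "A hangs": stream_a.empty(),     # A's orphaned stream never receives anything
        "dropped": router.dropped,
    }


outcome = asyncio.run(simulate())
print(outcome)
```

In this run B reads A's payload, A's stream stays empty, and B's own response lands in `dropped`, which matches the failure described above.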

mcp/shared/session.py (~L375) has the same single-key assumption in _in_flight[responder.request_id], which additionally breaks cancellation targeting under duplicate ids.
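One way the cancellation problem could show up is sketched below. It is an illustration under assumptions, not the session implementation: a plain dict of `asyncio.Task` objects stands in for `_in_flight`, and `handler` is a hypothetical slow tool call.

```python
import asyncio


async def simulate_cancel() -> dict:
    # Toy stand-in for a table keyed by request id alone (not the SDK's code).
    in_flight: dict = {}

    async def handler(name: str) -> str:
        await asyncio.sleep(10)  # pretend to be a slow tool call
        return name

    task_a = asyncio.create_task(handler("A"))
    in_flight[1] = task_a
    task_b = asyncio.create_task(handler("B"))
    in_flight[1] = task_b  # duplicate id: A's entry is overwritten

    # The client wants to cancel A, but can only name it by id.
    in_flight.pop(1).cancel()
    await asyncio.wait([task_b])  # wait until the cancellation has landed

    result = {"A cancelled": task_a.cancelled(), "B cancelled": task_b.cancelled()}

    task_a.cancel()  # tidy up so the event loop can exit promptly
    await asyncio.gather(task_a, return_exceptions=True)
    return result


cancel_outcome = asyncio.run(simulate_cancel())
print(cancel_outcome)
```

With a single key per id, the cancellation lands on B, the latest registrant, and A keeps running with no entry left that could reach it.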

Reproduction

Toy server (echo tool with random 0–2s sleep):

"""repro_server.py"""
import random
import anyio
from mcp.server.fastmcp import FastMCP

mcp = FastMCP(name="repro", host="127.0.0.1", port=18999)

@mcp.tool()
async def echo(sentinel: str) -> str:
    await anyio.sleep(random.uniform(0.0, 2.0))
    return f"ECHO:{sentinel}"

if __name__ == "__main__":
    mcp.run(transport="streamable-http")

Client: one initialized session, 12 concurrent tools/call POSTs, each with a unique sentinel; mode same_id uses id: 1 for all (mimicking claude.ai), mode unique_id is the control:

"""repro_client.py — run: python repro_client.py same_id unique_id"""
import asyncio, json, sys
import httpx

BASE = "http://127.0.0.1:18999/mcp"
HEADERS = {"accept": "application/json, text/event-stream", "content-type": "application/json"}
N, TIMEOUT = 12, 20.0

def parse_sse_for_response(text):
    for line in text.splitlines():
        if line.startswith("data:"):
            try:
                obj = json.loads(line[5:].strip())
            except json.JSONDecodeError:
                continue
            if "result" in obj or "error" in obj:
                return obj
    return None

async def initialize(client):
    r = await client.post(BASE, headers=HEADERS, json={
        "jsonrpc": "2.0", "id": 0, "method": "initialize",
        "params": {"protocolVersion": "2025-06-18", "capabilities": {},
                   "clientInfo": {"name": "repro", "version": "0"}}})
    r.raise_for_status()
    sid = r.headers.get("mcp-session-id")
    hdrs = {**HEADERS, **({"mcp-session-id": sid} if sid else {})}
    r2 = await client.post(BASE, headers=hdrs,
                           json={"jsonrpc": "2.0", "method": "notifications/initialized"})
    assert r2.status_code in (200, 202)
    return sid

def _sid_headers(sid):
    return {**HEADERS, **({"mcp-session-id": sid} if sid else {})}

async def call_tool(client, sid, rpc_id, sentinel):
    body = {"jsonrpc": "2.0", "id": rpc_id, "method": "tools/call",
            "params": {"name": "echo", "arguments": {"sentinel": sentinel}}}
    try:
        # wall-clock timeout: SSE pings defeat httpx read timeouts on orphaned streams
        r = await asyncio.wait_for(
            client.post(BASE, headers=_sid_headers(sid), json=body, timeout=TIMEOUT),
            timeout=TIMEOUT)
    except (httpx.TimeoutException, asyncio.TimeoutError):
        return (sentinel, "TIMEOUT", None)
    obj = parse_sse_for_response(r.text)
    if obj is None or "error" in obj:
        return (sentinel, "OTHER", r.text[:120])
    got = json.dumps(obj["result"])
    return (sentinel, "OK" if f"ECHO:{sentinel}" in got else "SWAPPED", got[:160])

async def run_burst(mode):
    async with httpx.AsyncClient() as client:
        sid = await initialize(client)
        tasks = [call_tool(client, sid, 1 if mode == "same_id" else 100 + i, f"SENT_{mode}_{i:02d}")
                 for i in range(N)]
        results = await asyncio.gather(*tasks)
    print(f"=== mode={mode} ===")
    print("OK=%d SWAPPED=%d TIMEOUT=%d" % (
        sum(s == "OK" for _, s, _ in results),
        sum(s == "SWAPPED" for _, s, _ in results),
        sum(s == "TIMEOUT" for _, s, _ in results)))
    for sent, status, detail in results:
        if status != "OK":
            print(f"  {sent}: {status} {detail or ''}")

async def main():
    for mode in sys.argv[1:] or ["same_id", "unique_id"]:
        await run_burst(mode)

asyncio.run(main())

Observed results (mcp 1.27.0, Python 3.12, Linux)

mode                                 OK  SWAPPED  TIMEOUT
same_id (12 concurrent, all id: 1)    0        1       11
unique_id control                    12        0        0

Pairwise variant (A slow, B posted 300ms later, both id: 1, 10 trials): A never receives a response; B receives A's payload in 7/10 trials (B's own in the rest). This exactly matched our production incident, including the response-arity mismatch.

stateless_http=True is immune by construction (per-request transport): same burst is 12/12 OK.

Suggested fix

At minimum, reject a POST whose request id is already in flight on the session with a JSON-RPC -32600 error instead of silently overwriting the routing slot — a protocol violation should fail loudly, not deliver one user's data to another request. (Queueing/serializing duplicate-id requests would also work and keeps the misbehaving-but-widespread client functional.)
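A rough sketch of the reject-on-duplicate option follows. `InFlightRegistry`, `try_register`, and `release` are hypothetical names rather than SDK APIs, and the real change would have to live in the POST handler and its cleanup path. Keys are stringified to mirror the `str(message.root.id)` keying quoted above.

```python
import json
from typing import Optional

INVALID_REQUEST = -32600  # JSON-RPC 2.0 "Invalid Request"


class InFlightRegistry:
    """Tracks request ids currently in flight on one session (illustrative only)."""

    def __init__(self):
        self._in_flight: set = set()

    def try_register(self, request_id) -> Optional[dict]:
        """Return None if the id was free, else a JSON-RPC error to send back."""
        key = str(request_id)
        if key in self._in_flight:
            return {
                "jsonrpc": "2.0",
                "id": request_id,
                "error": {
                    "code": INVALID_REQUEST,
                    "message": f"Request id {request_id!r} is already in flight on this session",
                },
            }
        self._in_flight.add(key)
        return None

    def release(self, request_id) -> None:
        # Call once the response has been delivered (or the request failed).
        self._in_flight.discard(str(request_id))


registry = InFlightRegistry()
first = registry.try_register(1)   # accepted
second = registry.try_register(1)  # rejected while the first is in flight
registry.release(1)
third = registry.try_register(1)   # accepted again after release
print(json.dumps(second))
```

The check runs before any routing slot is written, so the earlier request keeps its stream and the duplicate gets an explicit error instead of someone else's payload. The queueing alternative would replace the error branch with a wait on the earlier request's completion.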

The TypeScript SDK server has the same unguarded pattern (_requestToStreamMapping.set(message.id, streamId)); filing separately there. The client-side id-reuse is also being reported to Anthropic.
