fix(mcp): async tool handlers + pre-computed embeddings — parallel-store hang#318

Open
Huntehhh wants to merge 1 commit into buildingjoshbetter:main from Huntehhh:fix/async-mcp-handlers-parallel-store-hang
Conversation

@Huntehhh
Contributor

Summary

Resolves the 10-15 second Claude Code harness hang when 3+ parallel truememory_store (or truememory_search) MCP calls are issued in a single tool batch. Server-side writes complete in ~60ms — the hang is at the MCP transport + engine-lock layers.

Two-layer fix:

  1. MCP layer: 7 hot-path @mcp.tool() handlers changed def → async def. Engine calls wrapped in asyncio.to_thread() so FastMCP's single event-loop thread stays free for concurrent JSON-RPC requests. Also fixes @tracked to return an async wrapper for coroutine functions (otherwise the async-ification is silently defeated by the existing sync wrapper).

  2. Engine layer: TrueMemoryEngine.add() pre-computes both embeddings OUTSIDE _write_lock. Previously the lock was held during ~10-50ms of model.encode() per store, serializing all concurrent writes. PyTorch releases the GIL inside .encode(), so concurrent stores now overlap on inference; they only contend at the actual INSERTs (μs).
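The MCP-layer shape described above can be sketched as a minimal, self-contained Python example. This is an illustration, not the real truememory code: the `@mcp.tool()` registration is omitted, and `blocking_engine_add`, the handler signature, and the sleep duration are stand-ins.

```python
import asyncio
import functools
import inspect
import time

def tracked(func):
    """Async-aware decorator sketch: without the iscoroutinefunction
    branch, a sync wrapper around an async handler returns an unawaited
    coroutine, silently defeating the async-ification."""
    if inspect.iscoroutinefunction(func):
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            return await func(*args, **kwargs)
        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return sync_wrapper

def blocking_engine_add(content: str) -> str:
    time.sleep(0.05)  # stands in for model.encode() + the actual INSERT
    return f"stored:{content}"

@tracked
async def truememory_store(content: str) -> str:
    # asyncio.to_thread keeps the single event-loop thread free for
    # other concurrent JSON-RPC requests while the blocking call runs.
    return await asyncio.to_thread(blocking_engine_add, content)

async def batch():
    start = time.perf_counter()
    results = await asyncio.gather(*(truememory_store(f"m{i}") for i in range(3)))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(batch())
# three 50 ms calls overlap instead of running back-to-back
```

With sync handlers, the same batch would take at least 150 ms end-to-end; with `to_thread` the three calls overlap in the default executor.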

Changes

| File | Change |
| --- | --- |
| truememory/mcp_server.py | 7 tool handlers def → async def; engine calls via await asyncio.to_thread(...) |
| truememory/engine.py | add() pre-computes content + separation embeddings outside _write_lock |
| truememory/telemetry.py | @tracked detects coroutine functions and returns an async wrapper for them |
| tests/test_health_stats.py | Update test_truememory_stats_includes_health to await the now-async handler |
| tests/test_concurrent_store_hang.py | NEW — 3 regression tests (threaded engine.add(), handler-shape check, asyncio.gather() end-to-end) |

truememory_configure intentionally stays sync — heavy state mutation, called once at setup, not on the hot path.
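The engine-layer lock reordering can be sketched as follows. The class, `FakeModel`, and the second-embedding input are hypothetical stand-ins; only the lock placement mirrors the described fix.

```python
import threading

class FakeModel:
    """Stand-in for the embedding model; the real .encode() takes
    ~10-50 ms per call and releases the GIL inside PyTorch."""
    def encode(self, text: str) -> list[float]:
        return [float(len(text))]

class EngineSketch:
    def __init__(self) -> None:
        self._write_lock = threading.Lock()
        self._model = FakeModel()
        self.rows: list[tuple] = []

    def add(self, content: str) -> int:
        # Fix: compute both embeddings BEFORE taking _write_lock, so the
        # lock covers only the microsecond-scale insert, not inference.
        content_vec = self._model.encode(content)
        separation_vec = self._model.encode(content[::-1])
        with self._write_lock:
            self.rows.append((content, content_vec, separation_vec))
            return len(self.rows)

engine = EngineSketch()
rowid = engine.add("hello")
```

Because the GIL is released during real inference, moving `encode()` out of the critical section lets concurrent `add()` calls overlap on the slow part and contend only on the append/INSERT.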

Test Plan

  • New regression tests pass: pytest tests/test_concurrent_store_hang.py (3/3)
  • Health stats tests pass: pytest tests/test_health_stats.py (11/11)
  • Threading tests pass: pytest tests/test_ensure_connection_threading.py (2/2)
  • Wider suite on Windows: 572 pass / 17 pre-existing failures unrelated to this change (test_platform_compat.py POSIX-path tests, test_spawn_gate.py system-memory dependent, test_cli_help.py locale issues, etc.)
  • Manual verification needed on POSIX — pytest on Linux/macOS to confirm no regression in the Windows-failing tests
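The threaded regression check can be sketched like this (thread count, sleep duration, and `fake_store` are illustrative; the actual test body in test_concurrent_store_hang.py exercises the real engine.add()):

```python
import threading
import time

def fake_store(results: list, i: int) -> None:
    time.sleep(0.05)  # blocking work with the GIL released, like .encode()
    results[i] = i

N = 4
results = [None] * N
start = time.perf_counter()
threads = [threading.Thread(target=fake_store, args=(results, i)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
# pre-fix lock contention would push this toward N * 50 ms;
# overlapped stores finish in roughly one sleep interval
```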

Breaking Changes

The 7 MCP tool handlers are now async def. Transparent for MCP clients (JSON-RPC wire protocol unchanged). Direct Python callers must update:

- result = truememory_store(content="...", user_id="...")
+ result = await truememory_store(content="...", user_id="...")
+ # or, from sync code:  result = asyncio.run(truememory_store(content="...", user_id="..."))

Co-Authored-By: claude-opus-4-7 <wontreply@getfucked.ai>

fix(mcp): async tool handlers + pre-computed embeddings — parallel-store hang

Resolves the 10-15s harness hang when 3+ truememory_store or search MCP
calls fire in parallel. Three layered changes:

1. mcp_server.py — 7 hot-path @mcp.tool() handlers (store / search /
   search_deep / get / forget / stats / entity_profile) changed from
   sync `def` to `async def`. Engine calls run via
   `await asyncio.to_thread(...)` so FastMCP's event-loop thread stays
   free for concurrent JSON-RPC requests. truememory_configure stays
   sync — heavy state mutation, called once at setup.

2. telemetry.py — `@tracked` is now async-aware. Wrapping an `async def`
   in the old sync wrapper produced an unawaited coroutine object that
   silently defeated the async-ification.

3. engine.py — `add()` pre-computes both content + separation embeddings
   OUTSIDE `_write_lock`. Previously the lock was held during the two
   ~10-50ms model.encode() calls, serializing all concurrent stores.
   PyTorch releases the GIL inside .encode(), so concurrent stores can
   now overlap on inference; they only contend at the INSERTs (μs).

Tests:
- tests/test_concurrent_store_hang.py (new): three regression locks —
  threaded engine.add(), MCP handler-shape check, asyncio.gather()
  end-to-end.
- tests/test_health_stats.py: wrap the now-async truememory_stats() in
  asyncio.run().

Co-Authored-By: claude-opus-4-7 <wontreply@getfucked.ai>