Symptom
Every hivemind memory recall is slow whenever queries against the sessions table take more than ~10 seconds, even if the answer is already available from the memory (summaries) table after ~1.5 seconds. Agents see Search failed: Query timeout after 10000ms and either fall back to other tools (hivemind_index + hivemind_read, ~2× more round-trips) or, worse, fail the whole call. User-visible effects: Telegram responses arrive ~15-20s late, Claude Code Grep for memory hits the SDK abort, etc.
Where
Shared across every hivemind agent path. src/shell/grep-core.ts:searchDeeplakeTables is the single primitive. Every consumer routes through it:
| Agent path | Call site |
|---|---|
| Claude Code / codex / hermes: Grep PreToolUse intercept | src/shell/grep-interceptor.ts:131,150 |
| Cursor: grep/rg PreToolUse intercept | src/hooks/cursor/pre-tool-use.ts:1 |
| Direct hook (Claude Code grep-direct path) | src/hooks/grep-direct.ts:330 |
| Openclaw hivemind_search agent tool | openclaw/src/index.ts:792, 908 |
| MCP server memory_search tool (Cline / Roo / Kilo / etc.) | src/mcp/server.ts:81 |
The function's own header confirms the design intent:
Runs both halves in a single UNION ALL query so each grep = one round-trip. … semantic catches conceptual matches that lexical can't express. De-duplicate by path in the outer layer; when a path appears in both halves, the semantic score wins.
In other words, one round-trip is by design, an optimization for fewer Deeplake calls. That assumption breaks once sessions query latency exceeds the SDK timeout.
Why both tables are slow vs fast — measured
Captured live against org=activeloop, ws=hivemind (2026-05-17 02:33 UTC) using raw curl, bypassing the SDK timeout:
SELECT MAX(creation_date) FROM "sessions" → 2.27 s ✓
SELECT * FROM "sessions" LIMIT 0 → 0.20 s ✓ (schema only)
SELECT path, message::text FROM "sessions" WHERE message::text ILIKE → 16.02 s ✗ exceeds 10s SDK timeout
(memory UNION ALL sessions) ILIKE → 10.48 s ✗
So the UNION ALL itself takes ~10.5s, just over the 10s AbortSignal.timeout(10000) in src/deeplake-api.ts:44. The hivemind plugin sees a timeout and reports Search failed: Query timeout after 10000ms, even though Deeplake returns HTTP 200 with valid data on the server side. The memory branch alone would finish in ~1.5s.
Tested with hits from kw = "openclaw" — both tables had matches, both contributed rows in the (otherwise discarded) response.
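The failure mode measured above can be reproduced in miniature. This is an illustrative sketch, not the code in src/deeplake-api.ts: `withTimeout`, `fakeQuery`, and `demo` are hypothetical names, a `setTimeout` race stands in for the `AbortSignal.timeout(10000)` guard, and the delays are scaled down from seconds to milliseconds with a wider margin so the demo is not timing-sensitive.

```typescript
// Hypothetical stand-in for the SDK's timeout guard: reject if `work`
// has not settled within `timeoutMs`.
function withTimeout<T>(work: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Query timeout after ${timeoutMs}ms`)),
      timeoutMs,
    );
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Hypothetical fake query: resolves with `rows` after `latencyMs`.
function fakeQuery(rows: string[], latencyMs: number): Promise<string[]> {
  return new Promise((resolve) => setTimeout(() => resolve(rows), latencyMs));
}

async function demo(): Promise<{ union: string; memoryOnly: string }> {
  const budgetMs = 100; // scaled-down analogue of the 10 s limit

  // Combined query: valid rows exist, but they arrive after the budget,
  // so the caller only ever sees the timeout error.
  const union = await withTimeout(fakeQuery(["memory hit", "session hit"], 150), budgetMs)
    .then((rows) => `ok: ${rows.length} rows`)
    .catch((err: Error) => `failed: ${err.message}`);

  // Memory-only query: same budget, comfortably inside it.
  const memoryOnly = await withTimeout(fakeQuery(["memory hit"], 15), budgetMs)
    .then((rows) => `ok: ${rows.length} rows`)
    .catch((err: Error) => `failed: ${err.message}`);

  return { union, memoryOnly };
}
```

The point of the sketch is that the timeout is enforced purely on the client: the slow query still completes and its rows are simply thrown away.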
Why this matters now
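Two PRs in flight have made the symptom more visible:
- … hivemind_search tool. Lazy/tool-only flow exposes the per-tool-call latency directly to the user instead of hiding it inside a system-prompt build step. Net win (the agent can fall back), but the slow case is now user-visible.
- … process.env.HIVEMIND_QUERY_TIMEOUT_MS to undefined in the openclaw bundle. So the existing env-var override I added to mitigate this no longer works on openclaw. The bump-the-timeout escape hatch is gone; we need a real fix.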
Proposed: tier the search
Move from one-call UNION ALL to a two-phase pattern. Phase A returns immediately on memory hit; Phase B fires only on miss.
    // Sketch: src/shell/grep-core.ts
    export async function searchDeeplakeTables(api, memoryTable, sessionsTable, opts) {
      // Phase A: small/fast table only
      const memoryOpts = { ...opts, scope: "memory-only" };
      const memoryRows = await api.query(buildMemoryOnlySql(memoryOpts));
      if (memoryRows.length >= (opts.minHitThreshold ?? 3) || opts.scope === "memory-only") {
        return memoryRows;
      }

      // Phase B: only on miss / few hits, with a higher per-call timeout
      // (env or config-overridable, separate from the per-table memory timeout).
      const sessionRows = await api.query(buildSessionsOnlySql(opts), { timeoutMs: 30_000 });
      return dedupeByPath([...memoryRows, ...sessionRows]);
    }
Net effect for the common case:
- Memory has a relevant hit → return in ~1.5s instead of ~10.5s
- Memory is sparse → fall through to sessions, no functionality lost
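To make the two outcomes concrete, here is a self-contained toy version of the two-phase flow. It is a sketch under stated assumptions: `FakeApi`, `Row`, and `tieredSearch` are hypothetical names, the tables are in-memory arrays, and case-insensitive substring matching stands in for the real SQL. It only demonstrates that Phase B is skipped when Phase A clears the threshold.

```typescript
interface Row { path: string; text: string }

// Hypothetical in-memory stand-in for the Deeplake API.
// Records which table each call touched so round-trips can be counted.
class FakeApi {
  calls: string[] = [];
  tables: Record<string, Row[]>;

  constructor(tables: Record<string, Row[]>) {
    this.tables = tables;
  }

  async query(table: string, keyword: string): Promise<Row[]> {
    this.calls.push(table);
    const needle = keyword.toLowerCase();
    const rows = this.tables[table] ?? [];
    return rows.filter((r) => r.text.toLowerCase().includes(needle));
  }
}

async function tieredSearch(
  api: FakeApi,
  keyword: string,
  minHitThreshold = 3,
): Promise<Row[]> {
  // Phase A: fast table only.
  const memoryRows = await api.query("memory", keyword);
  if (memoryRows.length >= minHitThreshold) return memoryRows;

  // Phase B: only on a miss or too few hits.
  const sessionRows = await api.query("sessions", keyword);
  return [...memoryRows, ...sessionRows];
}
```

With a threshold of 1 and one matching memory row, `api.calls` ends up as `["memory"]`; with the default threshold of 3 and the same data, it becomes `["memory", "sessions"]`.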
Trade-offs:
- Adds one extra round-trip in the sparse case (acceptable — that case is rare and already slow)
- Score-blending becomes per-phase (current code does semantic > lexical deduping in the outer layer; needs a re-think when results come from two separate queries)
- searchOpts.contentScanOnly regex-path filtering (line 793 in openclaw bundle, line ~620 in source) needs to stay correct across both phases
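One possible way to keep the "semantic score wins" rule once rows arrive from two separate queries is to tag each row with how it matched and resolve collisions at merge time. The sketch below assumes a hypothetical `ScoredRow` shape (`path`, `score`, `kind`); the real row type and scoring in grep-core.ts may differ, and the same-kind tie-break (higher score wins) is an assumption, not something the current code is known to do.

```typescript
type MatchKind = "semantic" | "lexical";

interface ScoredRow { path: string; score: number; kind: MatchKind }

// Merge rows from any number of phases, keeping one row per path.
// A semantic row always beats a lexical row for the same path;
// between two rows of the same kind, the higher score wins (assumed rule).
function dedupeByPath(rows: ScoredRow[]): ScoredRow[] {
  const best = new Map<string, ScoredRow>();
  for (const row of rows) {
    const seen = best.get(row.path);
    if (!seen) {
      best.set(row.path, row);
      continue;
    }
    const upgradesKind = row.kind === "semantic" && seen.kind === "lexical";
    const sameKindBetter = row.kind === seen.kind && row.score > seen.score;
    if (upgradesKind || sameKindBetter) best.set(row.path, row);
  }
  // Map preserves first-insertion order, so paths keep the order in which
  // they were first seen (Phase A rows before Phase B rows).
  return Array.from(best.values());
}
```

Because the function only looks at `path`, `kind`, and `score`, it does not care which phase a row came from, so the same merge works for one query or two.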
Alternative ideas
- Add scope: "memory" | "sessions" | "both" to SearchOptions. Have the agent's tool-call (or the prompt nudge) prefer scope: memory first; the LLM can fall through to scope: sessions if memory is empty. Same effect, smaller refactor, but pushes the decision into the agent instead of the library.
- Server-side fix (Deeplake team). Index message::text for ILIKE or use message_embedding for semantic-only search. Removes the underlying slowness, no client change. But out of our control; long lead time.
- Per-table timeouts in the SDK (HIVEMIND_QUERY_TIMEOUT_MEMORY_MS vs ..._SESSIONS_MS). Doesn't fix UNION ALL waiting for both branches: even if sessions gets a 30s budget, memory results still wait 10s to come back. Only helps if we already split the query.
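A sketch of how the scope option and per-table timeouts could fit together once the query is split. Only the `"memory" | "sessions" | "both"` union and the `HIVEMIND_QUERY_TIMEOUT_MEMORY_MS` name come from the list above. The helpers `tablesFor` and `timeoutFor`, the default values, and the assumption that the sessions variable follows the same naming pattern are all hypothetical. The environment is passed in explicitly rather than read from `process.env`, which sidesteps bundlers that rewrite `process.env.*` lookups.

```typescript
type Scope = "memory" | "sessions" | "both";
type Table = "memory" | "sessions";

// Which tables a given scope touches, fast table first.
function tablesFor(scope: Scope): Table[] {
  if (scope === "both") return ["memory", "sessions"];
  return [scope];
}

// Hypothetical defaults: 10 s mirrors today's limit, 30 s mirrors the
// Phase B budget used in the sketch earlier in this issue.
const DEFAULT_TIMEOUT_MS: Record<Table, number> = {
  memory: 10_000,
  sessions: 30_000,
};

// Resolve a per-table timeout from an env-like map, falling back to the
// default when the variable is unset, empty, or not a positive number.
// The variable name pattern for "sessions" is an assumption.
function timeoutFor(table: Table, env: Record<string, string | undefined>): number {
  const raw = env[`HIVEMIND_QUERY_TIMEOUT_${table.toUpperCase()}_MS`];
  const parsed = Number(raw);
  return raw !== undefined && Number.isFinite(parsed) && parsed > 0
    ? parsed
    : DEFAULT_TIMEOUT_MS[table];
}
```

For example, `timeoutFor("memory", { HIVEMIND_QUERY_TIMEOUT_MEMORY_MS: "2500" })` resolves to 2500, while an unset or malformed value falls back to the table's default.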
Acceptance criteria
- hivemind_search "openclaw" returns hits within ~3s on the same dataset where today's UNION ALL takes ~10.5s
- … --use-semantic callers)
Related
- … message_embedding. If we ever do semantic-only on the sessions table, that gap matters more.
- … HIVEMIND_QUERY_TIMEOUT_MS for openclaw. Once this tiered design lands, that env-var override is moot for the recall path (timeouts move into the SDK call options, not a global env).
- … hivemind_search tool calls; this issue tightens the latency of those tool calls.
Notes for the implementer
searchDeeplakeTables's comment already calls out that the hybrid lexical+semantic branch is "one round-trip by design." That's a meaningful axis to preserve: if we move to two-phase, both the lexical hybrid AND the pure-lexical fast path should still be one round-trip per phase. The constraint relaxes from "one round-trip" to "one round-trip per table."
Telegram round-trip measurement on the live gateway (2026-05-17): a memory-recall question hit hivemind_search first, timed out at 10s, fell back to hivemind_index (succeeded in ~2s) and hivemind_read on a specific summary (~1s), and the agent answered correctly. Total latency ~15s. With tiered search it would be ~3s.