feat(memory): recall_memory tool — deliberate full-store semantic lookup (#47) by mattmezza · Pull Request #49 · mattmezza/mpa

mattmezza · 2026-06-28T21:08:41Z

Closes #47. Phase 2 of #41.

Why

Memory injection (PR #43) always injects a small relevance-ranked top-k into each turn. That covers the obvious facts, but as the store grows top-k can't surface everything — and the agent can't ask for a fact it doesn't suspect it has. The robust shape is hybrid: always-inject top-k plus a deliberate recall tool for the long tail.

What

A recall_memory tool the agent calls on demand to semantically search its full long-term store.

MemoryStore.recall(query, limit, scope) — embeds the query and ranks the entire long-term store (archived rows included) by pure semantic relevance, returning the best matches above a small floor. Falls back to lexical token-overlap when embeddings are disabled, so it always works.
Scope-aware (Two-tier scoped memory — shared owner pool + per-persona private memory #42) — filtered to scope IN ('', <active persona>) exactly like the injection readers, so a persona only recalls shared facts + its own private ones, never another persona's. The scope is the per-turn persona_name; the default identity recalls shared only.
Recall revives matches — reinforces them (access_count / last_accessed) and un-archives any that were archived (a fact the agent looked up and used is warm again).
Tool wiring — read-only (ALWAYS permission, no approval prompt), always retained per persona like load_skill (memory injection is always-on too, so recall exposes nothing extra). Nudged in the memory prompt instruction and the memory skill.
recall_top_k config knob (default 10), beside injection_top_k, hot-reloaded in patch_config.

Notes

Branched off main and rebased onto current main (which now carries scoped memory Two-tier scoped memory — shared owner pool + per-persona private memory #42 + per-turn injection fix(agent): inject memory + reflections per-turn, not in frozen snapshot (#41) #43). An adversarial multi-agent review flagged that an unscoped recall query would have silently become a cross-persona leak on the scope-aware base — fixed by making recall() scope-aware and threading the persona's scope through the tool; covered by a new scope-isolation test.

Tests

uv run pytest — 471 passed. New coverage: semantic ranking, relevance-floor exclusion, archived-row search + un-archive/reinforce, limit cap, lexical fallback, per-persona scope isolation, tool dispatch with scope plumbing, blank-query guard. Docs (docs/memory.mdx, skills/memory.md) updated.

…kup (#47) Phase 2 of #41. Always-on top-k injection covers the obvious facts but can't surface everything as the store grows, and the agent can't ask for a fact it doesn't suspect it has. Pair it with a recall_memory tool the agent calls on demand: - MemoryStore.recall(query, limit, scope): semantic search over the agent's ENTIRE long-term store (archived included), ranked purely by relevance above a small floor; lexical-overlap fallback when embeddings are off. - Scope-aware (#42): filtered to scope IN ('', <persona>) exactly like the injection readers, so a persona only recalls shared + its own private memories, never another persona's. Scope is the per-turn persona_name. - Recalling revives matches — reinforces them and un-archives any that were archived (looked up + matched = warm again). - recall_memory tool: read-only (ALWAYS permission, no prompt), always retained per persona like load_skill (memory injection is always-on too, so recall exposes nothing extra); nudged in the memory prompt. - recall_top_k config knob (default 10), hot-reloaded in patch_config.

…ch (#47) Covers semantic ranking, relevance-floor exclusion, archived-row search + un-archive/reinforce, limit cap, lexical fallback without an embedder, per-persona scope isolation (a persona never recalls another's private rows), tool dispatch with scope plumbing, and the blank-query guard.

#47) The floor-exclusion assertion lived on the embedding path, where the test _HashEmbedder buckets tokens via salted hash() into 64 dims — collisions pushed the unrelated row above the floor for ~20% of PYTHONHASHSEEDs (e.g. seed 42), failing intermittently. Move the exclusion check to the lexical-fallback path, where zero token overlap is deterministic and still guards _RECALL_MIN_RELEVANCE against being lowered to 0. Verified green across seeds 0/1/2/7/42/123/999. Also drop a placeholder (#) markdown link in the memory docs.

Sibling of injection_top_k: surfaces memory.embedding.recall_top_k via the config GET (emb_recall_top_k) and a number input next to the existing top-k field, capped at 25 (the recall hard limit) so a set value is never silently clamped. Saved through the same embedding-settings PATCH, which already hot-applies it to the running store — no restart.

mattmezza added 5 commits June 28, 2026 23:07

docs(memory): document recall_memory tool + recall_top_k (#47)

aa77992

mattmezza merged commit 9b26415 into main Jun 28, 2026
1 check passed

mattmezza deleted the feat/recall-memory branch June 28, 2026 21:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(memory): recall_memory tool — deliberate full-store semantic lookup (#47)#49

feat(memory): recall_memory tool — deliberate full-store semantic lookup (#47)#49
mattmezza merged 5 commits into
mainfrom
feat/recall-memory

mattmezza commented Jun 28, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mattmezza commented Jun 28, 2026

Why

What

Notes

Tests

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant