fix(agent): inject memory + reflections per-turn, not in frozen snapshot (#41) by mattmezza · Pull Request #43 · mattmezza/mpa

mattmezza · 2026-06-27T22:48:19Z

Closes #41.

Problem

In session mode the static system prompt is snapshotted once per session so the cacheable prefix stays stable and the provider's prompt cache hits. But memories and task reflections were baked into that snapshot, so anything extracted mid-session was invisible to the model until /new rebuilt the snapshot. It bites hardest:

- After compaction: a fact extracted early, then compacted out of raw context and absent from the frozen snapshot, becomes genuinely invisible.
- Cross-chat: a fact taught in chat B lands in the global store, but chat A's open session can't see it until A resets.
- Reflections: a lesson from a tool failure is most useful on the next similar task, often in the same session.

Fix (the cheap one)

Move memory + reflections out of the frozen snapshot and into the existing per-turn, never-cached preamble (_turn_preamble), which already injects live date/time + the execution plan onto the current user message.

- Static snapshot keeps only truly static content (persona / skills / instructions); it is still snapshotted once, so the cache stays intact.
- Memory is fetched fresh each turn via memory.format_for_prompt(query=message), which is already query-aware, so it is both fresh and relevance-ranked per turn.
- The preamble rides on the new user turn, which is uncached anyway, so there is no extra cache miss on the static prefix or prior messages. The only cost is the (bounded, top-k) memory block's own tokens, paid on a turn you'd pay for regardless, which is far cheaper than rebuilding the whole prompt.
- Resolved the issue's open question in favor of the preamble for both modes, giving one code path; injection mode also gets per-turn, query-relevant memory.

Reflections are injected fresh too (still query-less / most-recent-N — making them query-aware is the noted follow-up). The recall_memory tool is the issue's explicitly-deferred phase 2 and is not in this PR.
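The split between the frozen snapshot and the per-turn preamble can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `MemoryStore`, `Session`, `build_turn`, the word-overlap ranking, and the `<memories>` / `<now>` tags are all hypothetical stand-ins; only `format_for_prompt(query=...)` and the idea of a turn preamble come from the description above. Reflections and the execution plan are left out for brevity.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass
class MemoryStore:
    """Hypothetical stand-in for the real memory store."""

    facts: List[str] = field(default_factory=list)

    def format_for_prompt(self, query: str, top_k: int = 3) -> str:
        # Toy relevance ranking (an assumption): count words shared with the query.
        words = set(query.lower().split())
        ranked = sorted(
            self.facts,
            key=lambda fact: len(words & set(fact.lower().split())),
            reverse=True,
        )
        lines = "\n".join(f"- {fact}" for fact in ranked[:top_k])
        return f"<memories>\n{lines}\n</memories>" if lines else ""


@dataclass
class Session:
    memory: MemoryStore
    static_prompt: Optional[str] = None  # frozen snapshot, built once

    def system_prompt(self) -> str:
        # Built on first use, then reused verbatim so the cacheable prefix is stable.
        if self.static_prompt is None:
            self.static_prompt = "persona + skills + instructions"
        return self.static_prompt

    def turn_preamble(self, message: str) -> str:
        # Rebuilt on every turn; it rides on the new, uncached user message.
        now = datetime.now(timezone.utc).isoformat(timespec="seconds")
        parts = [f"<now>{now}</now>", self.memory.format_for_prompt(query=message)]
        return "\n".join(part for part in parts if part)

    def build_turn(self, message: str) -> Tuple[str, str]:
        # Returns (system prompt, user message with the preamble prepended).
        return self.system_prompt(), f"{self.turn_preamble(message)}\n\n{message}"
```

In this sketch a fact appended to `memory.facts` between two `build_turn` calls shows up in the second user message while the system prompt string stays identical across both calls, which is the property the fix relies on.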

Self-check

test_mid_session_memory_visible_next_turn_without_new asserts a long-term memory written mid-session appears in the next turn's preamble — without /new — while the static snapshot stays frozen.
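The shape of such a self-check might look like the sketch below. Only the test name is taken from the PR; the body, `FakeMemory`, and `FakeAgent` are illustrative assumptions, and the version here is synchronous for brevity, whereas the commit title suggests the real preamble is async.

```python
from typing import List


class FakeMemory:
    """Hypothetical in-memory stand-in for the long-term store."""

    def __init__(self) -> None:
        self.long_term: List[str] = []

    def format_for_prompt(self, query: str) -> str:
        # No ranking here; the sketch only cares about freshness.
        return "\n".join(self.long_term)


class FakeAgent:
    """Hypothetical agent: snapshot frozen at session start, preamble per turn."""

    def __init__(self, memory: FakeMemory) -> None:
        self.memory = memory
        self.snapshot = "persona / skills / instructions"  # set once, never rebuilt

    def turn(self, message: str) -> str:
        # The preamble is recomputed on every turn from the live store.
        return self.memory.format_for_prompt(query=message)


def test_mid_session_memory_visible_next_turn_without_new() -> None:
    memory = FakeMemory()
    agent = FakeAgent(memory)
    snapshot_before = agent.snapshot

    first = agent.turn("hi")
    assert "espresso" not in first

    # Simulate extraction writing a long-term memory mid-session.
    memory.long_term.append("owner drinks espresso")

    second = agent.turn("what coffee do I like?")
    assert "owner drinks espresso" in second  # visible without /new
    assert agent.snapshot == snapshot_before  # frozen snapshot untouched
```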

Docs

Updated architecture.mdx and memory.mdx to describe per-turn injection (dropped the stale "snapshot holds memories" / "first message is the query" wording).

…hot (#41) Session mode snapshots the static system prompt once per session so the cacheable prefix stays stable. But memories and task reflections were baked into that snapshot, so anything extracted mid-session stayed invisible to the model until /new — hurting most after compaction, across chats, and for reflections meant for the next similar task in the same session. Move both into the per-turn preamble (already an uncached seam carrying the live date/time + execution plan). The static snapshot now holds only truly static content (persona/skills/instructions); memory is fetched fresh each turn via format_for_prompt(query=message), so it is both current and relevance-ranked per turn. Cost is only the block's own (bounded, top-k) tokens on the new turn — far cheaper than rebuilding the whole prefix.

#41) Asserts a long-term memory written mid-session appears in the next turn's preamble — without /new and without rebuilding the frozen session snapshot.

…n cost (#41) Adversarial review: extend the mid-session self-check to insert a task reflection too and assert <task_reflections> appears in the next preamble — covering the third staleness case the issue names. Add a comment flagging that session mode now retrieves per turn (intended; phase-2 recall tool if the store grows huge).

…ate (#42) Memory was global to the owner: every persona read and wrote one pool, so a fitness coach could surface facts only ever told to the finance assistant, and group multi-agent (#30) would silently share private memory.

Add a `scope` column to long_term + short_term ('' = shared owner-level, '<persona>' = private to that persona), mirroring the secrets vault's two-tier shared/scoped model (#19). Additive ALTER TABLE migration: existing rows become shared, the correct default.

- Retrieval (format_for_prompt / get_relevant_long_term / get_short_term) filters scope IN ('', <active persona>); the default identity sees shared only.
- Extraction is tagged: the active persona is plumbed through; the extractor may mark a fact private to that persona, defaulting to shared when unsure.
- Dedup/UPDATE/DELETE candidates are bounded to shared + own scope, so a private fact can never merge into or delete another persona's memory.
- Consolidation promotes per scope; hygiene clusters within a scope only, with no cross-persona merge.

Composes with the per-turn memory seam from the preceding change (#41): scope is resolved from the active persona and passed into the same preamble injection.
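The two-tier filter described in this commit message can be illustrated with a small `sqlite3` sketch. The table layout beyond the `scope` column, the helper names `open_store`, `remember`, and `visible_to`, the persona names, and the sample facts are all assumptions made for the example; the real schema and functions may differ.

```python
import sqlite3
from typing import List

SHARED = ""  # '' marks a shared, owner-level row


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # Legacy shape (assumed): no scope column yet.
    conn.execute("CREATE TABLE long_term (id INTEGER PRIMARY KEY, content TEXT NOT NULL)")
    conn.execute("INSERT INTO long_term (content) VALUES ('owner lives in Zurich')")
    # Additive migration: existing rows pick up the shared default.
    conn.execute("ALTER TABLE long_term ADD COLUMN scope TEXT NOT NULL DEFAULT ''")
    return conn


def remember(conn: sqlite3.Connection, content: str, scope: str = SHARED) -> None:
    conn.execute(
        "INSERT INTO long_term (content, scope) VALUES (?, ?)", (content, scope)
    )


def visible_to(conn: sqlite3.Connection, persona: str = SHARED) -> List[str]:
    # Shared rows plus the active persona's private rows. The default
    # identity (persona == '') therefore sees shared rows only.
    rows = conn.execute(
        "SELECT content FROM long_term WHERE scope IN ('', ?) ORDER BY id",
        (persona,),
    )
    return [content for (content,) in rows]
```

With one legacy row plus a fact stored under `scope="finance"` and another under `scope="fitness"`, `visible_to(conn, "fitness")` returns the legacy row and the fitness fact only, and `visible_to(conn)` returns just the legacy row.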

…42) Show each memory's scope (shared vs the owning persona) in the admin Memory tables and the /memory/long-term + /memory/short-term JSON. Migrate-on-read so a legacy DB gains the column even when no agent is running.
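A migrate-on-read step of this kind could be sketched as below: the read path checks for the column and adds it before querying, so a legacy database is upgraded even if no agent ever ran against it. `ensure_scope_column`, `list_memories`, the `content` column, and the "shared" display label are hypothetical; the commit message does not show the actual admin code.

```python
import sqlite3
from typing import Dict, List

MEMORY_TABLES = {"long_term", "short_term"}


def ensure_scope_column(conn: sqlite3.Connection, table: str) -> bool:
    """Add the scope column if it is missing; return True when a migration ran."""
    # Table names cannot be bound as SQL parameters, so allow-list them.
    if table not in MEMORY_TABLES:
        raise ValueError(f"unexpected table: {table}")
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if "scope" in columns:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN scope TEXT NOT NULL DEFAULT ''")
    return True


def list_memories(conn: sqlite3.Connection, table: str = "long_term") -> List[Dict]:
    ensure_scope_column(conn, table)  # migrate on read
    rows = conn.execute(f"SELECT id, content, scope FROM {table} ORDER BY id")
    # Render the empty scope as "shared" for display (an assumed convention).
    return [
        {"id": row_id, "content": content, "scope": scope or "shared"}
        for row_id, content, scope in rows
    ]
```

Because the check is idempotent, calling it on every read is safe: the second call finds the column and does nothing.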

#42) Covers the hard invariant: a persona's private memory is invisible to other personas and the default identity across retrieval, dedup candidates, and the hygiene pass (no cross-scope merge); legacy rows default to shared.

feat(memory): two-tier scoped memory — shared pool + per-persona private (#42)

mattmezza added 4 commits June 28, 2026 22:28

test(agent): async preamble + mid-session memory visibility self-check (#41)

6dd2442

Asserts a long-term memory written mid-session appears in the next turn's preamble, without /new and without rebuilding the frozen session snapshot.

docs: memory/reflections inject per-turn, not in the snapshot (#41)

c906362

mattmezza force-pushed the feat/per-turn-memory branch from cd73b75 to efd61da June 28, 2026 20:29

mattmezza added 5 commits June 28, 2026 22:30

docs(memory): document two-tier scoped memory (#42)

ae8bd33

Merge pull request #45 from mattmezza/feat/scoped-memory

80e5f37

feat(memory): two-tier scoped memory — shared pool + per-persona private (#42)

mattmezza merged commit 634b16e into main Jun 28, 2026
1 check passed

mattmezza deleted the feat/per-turn-memory branch June 28, 2026 20:40

This was referenced Jun 28, 2026

fix(agent): inject skills index per-turn, not in the frozen snapshot (#46) #48

Merged

feat(memory): recall_memory tool — deliberate full-store semantic lookup (#47) #49

Merged

mattmezza merged 9 commits into main from feat/per-turn-memory