fix(agent): inject memory + reflections per-turn, not in frozen snapshot (#41) #43
Merged
…hot (#41) Session mode snapshots the static system prompt once per session so the cacheable prefix stays stable. But memories and task reflections were baked into that snapshot, so anything extracted mid-session stayed invisible to the model until /new — hurting most after compaction, across chats, and for reflections meant for the next similar task in the same session. Move both into the per-turn preamble (already an uncached seam carrying the live date/time + execution plan). The static snapshot now holds only truly static content (persona/skills/instructions); memory is fetched fresh each turn via format_for_prompt(query=message), so it is both current and relevance-ranked per turn. Cost is only the block's own (bounded, top-k) tokens on the new turn — far cheaper than rebuilding the whole prefix.
#41) Asserts a long-term memory written mid-session appears in the next turn's preamble — without /new and without rebuilding the frozen session snapshot.
…n cost (#41) Adversarial review: extend the mid-session self-check to insert a task reflection too and assert <task_reflections> appears in the next preamble — covering the third staleness case the issue names. Add a comment flagging that session mode now retrieves per turn (intended; phase-2 recall tool if the store grows huge).
cd73b75 to efd61da
…ate (#42)
Memory was global to the owner: every persona read and wrote one pool, so a fitness coach could surface facts only ever told to the finance assistant, and group multi-agent (#30) would silently share private memory.
Add a `scope` column to long_term + short_term ('' = shared owner-level, '<persona>' = private to that persona), mirroring the secrets vault's two-tier shared/scoped model (#19). The migration is an additive ALTER TABLE: existing rows become shared, the correct default.
- Retrieval (format_for_prompt / get_relevant_long_term / get_short_term) filters scope IN ('', <active persona>); the default identity sees shared only.
- Extraction is tagged: the active persona is plumbed through; the extractor may mark a fact private to that persona, defaulting to shared when unsure.
- Dedup/UPDATE/DELETE candidates are bounded to shared + own scope, so a private fact can never merge into or delete another persona's memory.
- Consolidation promotes per scope; hygiene clusters within a scope only, so there is no cross-persona merge.
Composes with the per-turn memory seam from the preceding change (#41): scope is resolved from the active persona and passed into the same preamble injection.
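The migration and the retrieval filter described above can be sketched roughly as follows. This is a minimal illustration using the standard-library `sqlite3` module, not the code from the commit: the `content` column, the helper names `ensure_scope_column` and `visible_memories`, and the sample rows are all assumptions; only the `scope` column, the `long_term` table name, and the `scope IN ('', <active persona>)` filter come from the commit message.

```python
import sqlite3

SHARED = ""  # '' marks shared owner-level memory, as in the commit message


def ensure_scope_column(conn: sqlite3.Connection, table: str) -> None:
    """Additive migration (hypothetical helper): add `scope` only if missing.

    Existing rows pick up the default '' and therefore become shared.
    `table` is a trusted constant here, so interpolating it is acceptable.
    """
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    if "scope" not in columns:
        conn.execute(
            f"ALTER TABLE {table} ADD COLUMN scope TEXT NOT NULL DEFAULT ''"
        )


def visible_memories(conn: sqlite3.Connection, persona: str = SHARED) -> list:
    """Return shared rows plus rows private to `persona` (hypothetical helper).

    With the default identity (persona == ''), only shared rows match.
    """
    rows = conn.execute(
        "SELECT content FROM long_term WHERE scope IN ('', ?) ORDER BY id",
        (persona,),
    )
    return [row[0] for row in rows]


# A legacy database: the table predates the scope column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE long_term (id INTEGER PRIMARY KEY, content TEXT)")
conn.execute("INSERT INTO long_term (content) VALUES ('prefers metric units')")

ensure_scope_column(conn, "long_term")
ensure_scope_column(conn, "long_term")  # second call is a no-op

conn.execute(
    "INSERT INTO long_term (content, scope) VALUES (?, ?)",
    ("salary details", "finance"),
)
conn.execute(
    "INSERT INTO long_term (content, scope) VALUES (?, ?)",
    ("knee injury", "coach"),
)

print(visible_memories(conn))             # default identity: shared only
print(visible_memories(conn, "finance"))  # shared + finance-private
```

Checking for the column before altering keeps the migration safe to run on every start, which is one plausible way to get the "legacy rows default to shared" behaviour without a version table.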
…42) Show each memory's scope (shared vs the owning persona) in the admin Memory tables and the /memory/long-term + /memory/short-term JSON. Migrate-on-read so a legacy DB gains the column even when no agent is running.
#42) Covers the hard invariant: a persona's private memory is invisible to other personas and the default identity across retrieval, dedup candidates, and the hygiene pass (no cross-scope merge); legacy rows default to shared.
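The invariant these tests cover can be illustrated with a small in-memory sketch. Everything here is hypothetical (the `Memory` dataclass, `dedup_candidates`, `hygiene_clusters`), and exact matching on normalized text stands in for whatever similarity measure the real hygiene pass uses; the point is only that scope is part of the grouping key, so two personas' private rows never land in the same cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    """Hypothetical row shape; scope '' means shared."""
    id: int
    content: str
    scope: str = ""


def dedup_candidates(store, new_scope):
    """Rows a new fact in `new_scope` is allowed to merge into, update or delete.

    Bounded to shared + own scope, so another persona's rows are never offered.
    """
    allowed = {"", new_scope}
    return [m for m in store if m.scope in allowed]


def hygiene_clusters(store):
    """Group duplicates within a scope only.

    Assumption: 'duplicate' here means identical text after lowercasing and
    whitespace normalization, a stand-in for real similarity clustering.
    """
    clusters = {}
    for m in store:
        key = (m.scope, " ".join(m.content.lower().split()))
        clusters.setdefault(key, []).append(m)
    return [group for group in clusters.values() if len(group) > 1]


store = [
    Memory(1, "Likes running", "coach"),
    Memory(2, "likes  running", "finance"),  # same text, different persona
    Memory(3, "Prefers metric units"),
    Memory(4, "prefers metric units"),
]

print([m.id for m in dedup_candidates(store, "coach")])         # [1, 3, 4]
print([[m.id for m in g] for g in hygiene_clusters(store)])     # [[3, 4]]
```

Rows 1 and 2 carry the same text but stay separate because their scopes differ, while the two shared rows cluster together.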
feat(memory): two-tier scoped memory — shared pool + per-persona private (#42)
This was referenced Jun 28, 2026
Closes #41.
Problem
In session mode the static system prompt is snapshotted once per session so the cacheable prefix stays stable and the provider's prompt cache hits. But memories and task reflections were baked into that snapshot, so anything extracted mid-session was invisible to the model until `/new` rebuilt the snapshot. It bites hardest after compaction, across chats, and for reflections meant for the next similar task in the same session.
Fix (the cheap one)
Move memory + reflections out of the frozen snapshot and into the existing per-turn, never-cached preamble (`_turn_preamble`), which already injects live date/time + the execution plan onto the current user message.
- Memory is fetched fresh each turn via `memory.format_for_prompt(query=message)`, which is already query-aware, so it is both fresh and per-turn relevance-ranked.
- Reflections are injected fresh too (still query-less / most-recent-N; making them query-aware is the noted follow-up).
The `recall_memory` tool is the issue's explicitly-deferred phase 2 and is not in this PR.
Self-check
`test_mid_session_memory_visible_next_turn_without_new` asserts a long-term memory written mid-session appears in the next turn's preamble, without `/new`, while the static snapshot stays frozen.
Docs
Updated `architecture.mdx` and `memory.mdx` to describe per-turn injection (dropped the stale "snapshot holds memories" / "first message is the query" wording).
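A minimal sketch of the split between the frozen snapshot and the per-turn preamble, under stated assumptions: the `MemoryStore` and `Session` classes, the word-overlap ranking, the `<memories>` tag, and the `build_request` shape are all hypothetical and exist only for illustration. From the PR it takes just the idea that `format_for_prompt(query=message)` is called on every turn while the static prompt is built once.

```python
from datetime import datetime, timezone


class MemoryStore:
    """Hypothetical stand-in for the real memory module."""

    def __init__(self):
        self._facts = []

    def add(self, fact):
        self._facts.append(fact)

    def format_for_prompt(self, query, top_k=3):
        """Return a bounded, query-ranked block (naive word overlap as ranking)."""
        if not self._facts:
            return ""
        words = set(query.lower().split())
        ranked = sorted(
            self._facts,
            key=lambda fact: len(words & set(fact.lower().split())),
            reverse=True,
        )
        lines = "\n".join(f"- {fact}" for fact in ranked[:top_k])
        return f"<memories>\n{lines}\n</memories>"


class Session:
    def __init__(self, memory, static_prompt):
        self.memory = memory
        # Built once per session: persona/skills/instructions only, so the
        # cacheable prefix never changes between turns.
        self.snapshot = static_prompt

    def turn_preamble(self, message):
        """Uncached per-turn seam: live time plus freshly retrieved memory."""
        parts = [f"Current time: {datetime.now(timezone.utc).isoformat()}"]
        block = self.memory.format_for_prompt(query=message)
        if block:
            parts.append(block)
        return "\n".join(parts)

    def build_request(self, message):
        return {
            "system": self.snapshot,
            "user": self.turn_preamble(message) + "\n\n" + message,
        }


memory = MemoryStore()
session = Session(memory, static_prompt="You are a helpful assistant.")

first = session.build_request("plan my week")
memory.add("user trains for a marathon in October")  # extracted mid-session
second = session.build_request("what should my marathon week look like?")
```

In this sketch the second request carries the new fact in its user-side preamble while `system` is byte-for-byte identical to the first request, which is the property the self-check asserts.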