
RFC: tapes memory layer #81

@bdougie

Description

The current search pipeline is unstructured recall: embed nodes, query by similarity, return branches. It answers "what past conversations are similar to X?" but doesn't synthesize or prioritize.

What a memory layer could add

A memory config, modeled after the search config pattern, could introduce structured recall:

Short-term memory: branch-scoped context

  • Recent nodes within the current proxy branch
  • Sliding window or token budget (e.g., last N nodes)
  • Cheap, fast, ephemeral: lives in the worker pool or in-memory storage driver

Long-term memory: persistent, cross-branch knowledge

  • Extracting durable facts/patterns from conversations into a knowledge graph
  • Goes beyond vector similarity
  • Could use a graph store (Cognee-style) or a curated vector collection with metadata
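To make "goes beyond vector similarity" concrete, here is a toy in-memory triple store (all names hypothetical; a real backend like Cognee manages its own schema). The point is that facts carry structure and metadata, so recall can be exact and cross-branch rather than similarity-ranked:

```go
package main

import "fmt"

// Fact is a durable triple extracted from conversations (hypothetical shape).
type Fact struct {
	Subject, Predicate, Object string
	SourceBranch               string // metadata: which branch the fact came from
}

// GraphStore is a toy stand-in for a knowledge graph backend.
type GraphStore struct {
	facts []Fact
}

func (g *GraphStore) Add(f Fact) { g.facts = append(g.facts, f) }

// About returns every fact whose subject matches, regardless of branch —
// the cross-branch recall that per-branch vector search doesn't give you.
func (g *GraphStore) About(subject string) []Fact {
	var out []Fact
	for _, f := range g.facts {
		if f.Subject == subject {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	g := &GraphStore{}
	g.Add(Fact{"user", "prefers", "tabs", "branch-a"})
	g.Add(Fact{"user", "deploys-to", "fly.io", "branch-b"})
	for _, f := range g.About("user") {
		fmt.Printf("%s %s %s (from %s)\n", f.Subject, f.Predicate, f.Object, f.SourceBranch)
	}
}
```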

How it could map to tapes' config pattern

[memory]
provider = "cognee"       # or "local", "graph"
target = "http://..."

[memory.short_term]
enabled = true
window = 10               # last N nodes in current branch

[memory.long_term]
enabled = true
provider = "cognee"       # knowledge graph extraction

This follows the same optional-feature pattern as search:

  • Zero config — proxy works without it
  • Graceful degradation — returns "memory not configured" if missing
  • Driver interface — memory.Driver with Store(), Recall(), Forget()
  • Pluggable backends — local graph, Cognee, or custom

Key difference from search

Search is query-driven (user asks, system retrieves). Memory would be context-driven: the proxy automatically injects relevant recalled context into LLM requests before forwarding them upstream.
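The context-driven injection step might be sketched as below, assuming an OpenAI-style message list; the system-message splice point is an assumption, and tapes may weave recalled context into requests differently:

```go
package main

import "fmt"

// Message is a minimal chat message shape (hypothetical, OpenAI-style).
type Message struct {
	Role, Content string
}

// injectMemory prepends recalled context as a system message before the
// proxy forwards the request upstream. With nothing recalled, the request
// passes through untouched.
func injectMemory(recalled []string, msgs []Message) []Message {
	if len(recalled) == 0 {
		return msgs
	}
	sys := Message{Role: "system", Content: "Relevant context from memory:\n"}
	for _, r := range recalled {
		sys.Content += "- " + r + "\n"
	}
	return append([]Message{sys}, msgs...)
}

func main() {
	msgs := []Message{{Role: "user", Content: "Deploy the app"}}
	out := injectMemory([]string{"user deploys to fly.io"}, msgs)
	fmt.Println(len(out), out[0].Role) // 2 system
}
```

The user never issues a query here; the proxy decides what to recall and inject, which is exactly the inversion relative to search.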
