Give your OpenClaw agent real memory — three-layer, self-maintaining, locally searchable.
Problem
OpenClaw's compaction loses detail. The tech preferences you told your agent last week, the decisions you made, the names you mentioned — gone after compression. The native MEMORY.md is plain text with keyword-only search that mostly misses. Embeddings require remote API calls that cost money and leak data.
Solution
An OpenClaw plugin that stores memories in local chDB (embedded ClickHouse), generates vectors with Qwen3-Embedding-0.6B on-device, and runs semantic search via ClickHouse's native HNSW index. Fully local, zero API cost, zero data leakage.
Three-Layer Memory Model
This is the core of the entire system. The three layers are not tag categories — they are storage tiers with fundamentally different lifecycles, each with its own write rules, injection strategy, decay mechanics, and capacity constraints.
┌─────────────────────────────────────────────────────────────────┐
│ │
│ L0 Working Memory (current focus) │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ "User is building memory system Phase 2, last discussed │ │
│ │ HNSW index config" │ │
│ │ "Follow-up: user said demo due by Friday" │ │
│ └────────────────────────────────────────────────────────────┘ │
│ Always injected · Overwritten after every conversation · ≤500t │
│ │
├──────────────────────── ▲ overwrite ────────────────────────────┤
│ │
│ L1 Episodic Memory (event stream) │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ 03-04 14:30 Decided on Python core + JS thin-shell arch │ │
│ │ 03-04 10:15 Aligned API design with Alice, adopted gRPC │ │
│ │ 03-03 16:00 Researched chDB HNSW index, confirmed 25.8 GA│ │
│ │ 03-01 09:00 Project kickoff, goal: replace sqlite-vec │ │
│ │ ... │ │
│ └────────────────────────────────────────────────────────────┘ │
│ Retrieved on demand · Time-decayed · Compressed after 30d │
│ · 500 entries/month cap │
│ │
├──────── ▲ promote (recurring patterns) ▼ compress (old → sum) ─┤
│ │
│ L2 Semantic Memory (durable knowledge) │
│ ┌────────────────────────────────────────────────────────────┐ │
│ │ [preference] Prefers SwiftUI over UIKit │ │
│ │ [preference] Keep answers concise, skip basic explanations│ │
│ │ [knowledge] iOS developer, based in Singapore │ │
│ │ [person] Alice is the backend lead │ │
│ │ [project] Memory system project, goal: replace native │ │
│ │ [todo] Demo due by April 15 │ │
│ └────────────────────────────────────────────────────────────┘ │
│ Always injected · Rarely changed · Never auto-deleted │
│ · Overwritten only on contradiction │
│ │
└─────────────────────────────────────────────────────────────────┘
Layer Specifications
L0 Working Memory — Current Focus
| Property | Spec |
|---|---|
| Nature | The agent's "scratchpad" — what it's doing right now |
| Capacity | Strictly ≤ 500 tokens (~3–5 sentences) |
| Injection | Always injected in full at the top of the system prompt |
| Write timing | LLM rewrites (overwrites, not appends) at the end of every conversation |
| Decay | None; each rewrite naturally keeps it fresh |
| Persistence | Stored in chDB, but only the latest entry retained (per agent) |
| Typical content | Current task focus, where the last conversation left off, pending follow-ups, interaction style notes |
L2 Semantic Memory — Durable Knowledge

| Property | Spec |
|---|---|
| Nature | Persistent facts about the user — "who the user is" |
| Capacity | No hard limit, but recommended ≤ 200 entries (keep it lean) |
| Injection | Always injected in full into the system prompt (after working memory) |
| Injection budget | ≤ 2000 tokens |
| Write timing | Promoted from episodic / new persistent facts discovered in conversation / manual user input |
| Decay | No automatic decay; the LLM reviews once a month for stale information |
| Deletion | Only on contradiction (delete old + add new) |
| Typical content | User profile, tech preferences, people, project overviews, long-term todos |
Inter-Layer Flow
Conversation input
│
├── direct write ──▶ L0 (overwritten at end of every conversation)
│
├── extract ──▶ L1 (specific events, decisions)
│
└── extract ──▶ L2 (newly discovered persistent facts — less frequent)
L1 ── compress ──▶ L1 summary entry (30+ day old entries merged into monthly summary)
L1 ── promote ──▶ L2 (tag recurs ≥ 3 times, LLM decides, then distills)
L2 ── contradiction override ──▶ L2 (new info conflicts with old → delete old + add new)
L1 ── decay cleanup ──▶ deleted (120 days without access + access_count=0)
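The cleanup and promotion rules above can be sketched as plain predicates. Field names here are assumptions for illustration, not the plugin's actual schema:

```python
from datetime import datetime, timedelta

# Hypothetical predicates for the inter-layer flow rules above;
# field names are illustrative, not the plugin's actual schema.
def should_decay(last_accessed: datetime, access_count: int,
                 now: datetime, decay_days: int = 120) -> bool:
    """L1 decay cleanup: 120 days without access AND never accessed."""
    return access_count == 0 and now - last_accessed > timedelta(days=decay_days)

def should_promote(tag_occurrences: int, threshold: int = 3) -> bool:
    """L1 → L2 promotion candidate: a tag recurs at least 3 times.
    The LLM still makes the final call and distills the entries."""
    return tag_occurrences >= threshold
```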
Context Injection Example
What the agent sees at the start of every conversation:
[Working Memory — Current Focus]
User is building memory system Phase 2, last discussed HNSW index config.
Follow-up: user said demo due by Friday.
[Semantic Memory — User Profile]
- [preference] Prefers SwiftUI over UIKit
- [preference] Keep answers concise, skip basic explanations
- [knowledge] iOS developer, based in Singapore
- [person] Alice is the backend lead
- [project] Memory system project, goal: replace OpenClaw native memory
- [todo] Demo due by April 15
[Relevant Episodic — Events Related to This Conversation]
- 03-04 Decided on Python core + JS thin-shell architecture (score=0.85)
- 03-03 Researched chDB HNSW index, confirmed 25.8 GA (score=0.78)
- 03-01 Project kickoff, goal: replace sqlite-vec (score=0.71)
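Assembling that context block is plain string composition: L0 and L2 are injected wholesale, and retrieved L1 entries are appended only when present. A minimal sketch (the function and its signature are illustrative, not the plugin's API):

```python
# Illustrative assembly of the injected context shown above;
# this helper is NOT the plugin's actual API.
def build_context(working: str, semantic: list[str], episodic: list[str]) -> str:
    parts = ["[Working Memory — Current Focus]", working]
    parts += ["", "[Semantic Memory — User Profile]"]
    parts += [f"- {fact}" for fact in semantic]
    if episodic:  # L1 is retrieval-based, so it may be empty
        parts += ["", "[Relevant Episodic — Events Related to This Conversation]"]
        parts += [f"- {event}" for event in episodic]
    return "\n".join(parts)
```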
Schema

Single memories table; the layer column distinguishes the three tiers. Vectors are stored in an embedding column, and ClickHouse's native HNSW index accelerates semantic search.

| Field | Description |
|---|---|
| id | UUID |
| layer | working / episodic / semantic — the core field that determines lifecycle |
| category | decision / preference / event / person / project / knowledge / todo / insight |
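A plausible DDL for such a table, assuming ClickHouse's `vector_similarity` (HNSW) index and 1024-dimensional Qwen3-Embedding-0.6B vectors. Column names beyond those listed above are assumptions:

```python
# Hypothetical chDB DDL for the memories table; columns other than
# id/layer/category are assumptions, not the plugin's actual schema.
MEMORIES_DDL = """
CREATE TABLE IF NOT EXISTS memories (
    id         UUID DEFAULT generateUUIDv4(),
    layer      Enum('working', 'episodic', 'semantic'),
    category   LowCardinality(String),
    content    String,
    embedding  Array(Float32),
    created_at DateTime DEFAULT now(),
    INDEX idx_vec embedding TYPE vector_similarity('hnsw', 'cosineDistance', 1024)
) ENGINE = MergeTree ORDER BY (layer, created_at)
"""
# Would be executed through chDB, e.g. via a chdb session's query() method.
```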
Retrieval

| Condition | Strategy |
|---|---|
| Embedding available + memories < 50K | Single SQL hybrid (brute-force cosineDistance + keywords) |
| Embedding available + memories ≥ 50K | Two-stage (HNSW recall top-100 → rerank) |
| No embedding | Keywords only + tag matching + time decay |
Note: Retrieval only applies to L1 Episodic. L0 and L2 are injected in full — they bypass retrieval entirely.
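The brute-force hybrid path can be sketched as a single scoring pass in pure Python (the real plugin does this in SQL via `cosineDistance`; the weights and half-life below are illustrative assumptions, not the plugin's defaults):

```python
import math

# Illustrative hybrid scoring for L1 retrieval: cosine similarity
# blended with keyword overlap and exponential time decay.
# Weights and half-life are assumptions, not the plugin's defaults.
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_score(q_vec: list[float], m_vec: list[float],
                 q_words: set, m_words: set,
                 age_days: float, half_life: float = 30.0) -> float:
    sim = cosine(q_vec, m_vec)                      # semantic similarity
    kw = len(q_words & m_words) / max(len(q_words), 1)  # keyword overlap
    decay = 0.5 ** (age_days / half_life)           # recency bonus
    return 0.6 * sim + 0.2 * kw + 0.2 * decay
```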
Extraction

```python
from memory_core import MemoryExtractor

extractor = MemoryExtractor(db, emb)

# End of conversation: auto-extract and write to the appropriate layers
new_ids = extractor.extract(
    messages=[...],
    llm_complete=my_llm_function,
    session_id="sess-001",
)
# The LLM automatically determines which layer (L0/L1/L2) each memory belongs to

# Before compaction: emergency extract → write to L1
new_ids = extractor.emergency_flush(
    context="...", llm_complete=my_llm_function,
)
```
Maintenance

```python
from memory_core import maintenance

maintenance.run_all(db, llm_complete=my_llm, emb=emb)

# Or run individually:
maintenance.cleanup_stale(db, decay_days=120)     # L1: clean long-unaccessed entries
maintenance.purge_deleted(db, days=7)             # Physical deletion
maintenance.compress_episodic(db, my_llm, emb,    # L1: monthly compression
                              month="2026-01")
maintenance.promote_to_semantic(db, my_llm, emb)  # L1 → L2: pattern promotion
maintenance.review_semantic(db, my_llm)           # L2: review stale information
```