Zetta — Persistent Memory for AI Agents

The universal memory layer for LLM agents — open-source, framework-agnostic, Rust-powered.

Give your AI agent a memory. Any agent. Any framework. Any backend.

Zetta is to agent memory what OpenTelemetry is to observability — one standard interface, pluggable backends, and intelligence built-in.

import asyncio
from zetta import Zetta

async def main():
    z = Zetta()  # zero config: SQLite, no API keys
    await z.add("User prefers Python over Java")
    results = await z.recall("what language does the user prefer?")
    print(results)
    # → [MemoryRecord(content="User prefers Python over Java", score=0.94)]

asyncio.run(main())

Works with LangChain, CrewAI, OpenAI Agents SDK, AutoGen, or any custom agent — drop in, no lock-in.

Why Zetta?

Every AI agent framework ships its own memory: a thin wrapper around a vector store with no intelligence, no isolation, and no standards. You end up reinventing the same wheel for every project.

Zetta fixes this:

Problem	Zetta's answer
Every framework re-invents memory	One protocol (`MemoryProtocol`), any framework
Hard to switch vector stores	Swappable backends: SQLite → ChromaDB → FAISS → Neo4j
Plain vector search misses context	Hybrid 4-signal scoring: semantic + BM25 + ACT-R activation + Ebbinghaus decay
Private memories leaking between agents	`PRIVATE / SHARED / GLOBAL` visibility enforced at the SQL level
Memory grows forever, costs pile up	Consolidation engine: merge duplicates, detect conflicts, tier promotion
Slow Python memory operations	Rust core via PyO3: sub-millisecond add and single-digit-millisecond recall (see Benchmarks)

Zetta vs. Alternatives

Feature	Zetta	mem0	Letta/MemGPT	LangChain Memory
Open-source & self-hosted	✅	✅	✅	✅
Zero-config (no API keys)	✅	❌	❌	✅
Hybrid scoring (BM25 + semantic + ACT-R)	✅	❌	❌	❌
Multi-agent scope isolation	✅	partial	❌	❌
Rust performance core	✅	❌	❌	❌
Consolidation + conflict detection	✅	❌	partial	❌
Swappable backends	✅	partial	❌	partial
MCP server built-in	✅	❌	❌	❌
Framework-agnostic protocol	✅	❌	❌	❌

Install

pip install zetta                        # SQLite + hash embedder — zero config
pip install "zetta[embeddings]"          # + sentence-transformers for semantic search
pip install "zetta[chroma]"              # + ChromaDB backend
pip install "zetta[langchain,crewai]"    # + framework integrations
pip install "zetta[all]"                 # everything

Quick Start

Zero-config

import asyncio
from zetta import Zetta

async def main():
    z = Zetta()  # SQLite at ~/.zetta/memory.db

    # Store memories
    await z.add("Alice is the lead engineer on the payments team")
    await z.add("The API uses OAuth2 with JWT tokens")
    await z.add("Deploy happens every Friday at 6pm UTC")

    # Recall by natural language
    results = await z.recall("who works on payments?", top_k=3)
    for r in results:
        print(f"{r.score:.2f}  {r.content}")

asyncio.run(main())

With scope (multi-user / multi-agent)

import asyncio
from zetta import Zetta, Scope
from zetta.types import Visibility

async def main():
    z = Zetta()

    # Private to this agent
    private_scope = Scope(user_id="alice", agent_id="assistant", visibility=Visibility.PRIVATE)
    await z.add("Alice's secret preference: dark mode", scope=private_scope)

    # Shared across Alice's agents
    shared_scope = Scope(user_id="alice", visibility=Visibility.SHARED)
    await z.add("Alice is a Python developer", scope=shared_scope)

    # Global: visible to everyone
    global_scope = Scope(visibility=Visibility.GLOBAL)
    await z.add("The company was founded in 2020", scope=global_scope)

asyncio.run(main())

Framework Integrations

LangChain

from langchain_core.messages import HumanMessage, AIMessage
from zetta import Zetta, Scope
from zetta.integrations.langchain import ZettaChatMessageHistory, ZettaMemory

z = Zetta()
scope = Scope(user_id="alice")

# Chat history
history = ZettaChatMessageHistory(zetta=z, scope=scope, session_id="session-1")
history.add_user_message("What's the capital of France?")
history.add_ai_message("The capital of France is Paris.")

# Long-term memory for chains
memory = ZettaMemory(zetta=z, scope=scope, memory_key="history")
memory.save_context({"input": "I love Paris"}, {"output": "Great choice!"})

CrewAI

from zetta import Zetta, Scope
from zetta.types import Visibility  # needed for Visibility.SHARED below
from zetta.integrations.crewai import ZettaShortTermMemory, ZettaLongTermMemory, ZettaEntityMemory

z = Zetta()
scope = Scope(team_id="my-crew", visibility=Visibility.SHARED)

short_term = ZettaShortTermMemory(zetta=z, scope=scope)
long_term = ZettaLongTermMemory(zetta=z, scope=scope)
entity_mem = ZettaEntityMemory(zetta=z, scope=scope)

# Use as drop-in for crew.memory = True
short_term.save("Meeting concluded: deploy on Friday")
entity_mem.save("Alice leads the payments team", entity="Alice")

OpenAI Agents SDK

from agents import Agent, Runner
from zetta import Zetta, Scope
from zetta.integrations.openai_agents import create_memory_tools

z = Zetta()
scope = Scope(user_id="alice", agent_id="assistant")

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant with persistent memory.",
    tools=create_memory_tools(z, scope),
)

REST Server

Run Zetta as a standalone service:

uvicorn zetta.server:app --host 0.0.0.0 --port 8765

# Store a memory
curl -X POST http://localhost:8765/memories \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers concise answers", "scope": {"user_id": "alice"}}'

# Search
curl -X POST http://localhost:8765/memories/search \
  -H "Content-Type: application/json" \
  -d '{"query": "communication style", "scope": {"user_id": "alice"}, "top_k": 5}'

# Health check
curl http://localhost:8765/health

Interactive docs at http://localhost:8765/docs.
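
The same two calls can be made from Python with only the standard library. This is a minimal sketch: the endpoints and payload fields are taken from the curl examples above, while the helper names (`build_request`, `post_json`, `add_memory`, `search_memories`) are hypothetical and the response is simply returned as parsed JSON, since its exact shape is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765"  # matches the uvicorn command above


def build_request(path, payload):
    """Build (but do not send) a JSON POST request for the Zetta REST server."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path, payload, timeout=5.0):
    """Send the request and return the decoded JSON body (shape is an assumption)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def add_memory(content, user_id):
    # Hypothetical helper mirroring the "Store a memory" curl call.
    return post_json("/memories", {"content": content, "scope": {"user_id": user_id}})


def search_memories(query, user_id, top_k=5):
    # Hypothetical helper mirroring the "Search" curl call.
    payload = {"query": query, "scope": {"user_id": user_id}, "top_k": top_k}
    return post_json("/memories/search", payload)
```

With the server running, `add_memory("User prefers concise answers", "alice")` followed by `search_memories("communication style", "alice")` mirrors the curl session above.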

Memory Intelligence

Hybrid 4-signal scoring

Every recall fuses four signals:

score = w₁·semantic + w₂·bm25 + w₃·activation + w₄·recency

Signal	What it measures
Semantic	Cosine similarity between embeddings
BM25	Keyword overlap (TF-IDF style)
ACT-R activation	How often / recently the memory was accessed
Ebbinghaus recency	Forgetting curve — recent memories score higher
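
To make the fusion concrete, here is a small sketch of how the four signals could be combined. It is not Zetta's actual implementation: the weights, the one-day stability constant, the decay of 0.5, and the logistic squash that maps ACT-R activation into (0, 1) are all assumptions chosen for illustration. The two formulas themselves are the textbook ones: Ebbinghaus retention R = exp(-t / S) and ACT-R base-level activation B = ln(Σ t_j^(-d)).

```python
import math


def ebbinghaus_retention(age_seconds, stability_seconds=86_400.0):
    """Forgetting curve R = exp(-t / S); returns 1.0 for a brand-new memory."""
    return math.exp(-age_seconds / stability_seconds)


def actr_activation(access_ages_seconds, decay=0.5):
    """ACT-R base-level activation ln(sum(t_j ** -d)), squashed into (0, 1).

    access_ages_seconds lists how long ago each access happened.
    The logistic squash is an assumption so the value is comparable
    with the other signals; ages below 1 second are clamped to 1.
    """
    total = sum(max(t, 1.0) ** -decay for t in access_ages_seconds)
    if total <= 0.0:
        return 0.0  # never accessed
    base_level = math.log(total)
    return 1.0 / (1.0 + math.exp(-base_level))


def hybrid_score(semantic, bm25, activation, recency, weights=(0.4, 0.3, 0.15, 0.15)):
    """Weighted sum of four signals, each assumed to be normalized to [0, 1]."""
    w1, w2, w3, w4 = weights
    return w1 * semantic + w2 * bm25 + w3 * activation + w4 * recency


# Two memories with the same semantic and keyword match; the second was
# accessed more often and more recently, so it should rank higher.
stale = hybrid_score(0.8, 0.5, actr_activation([604_800.0]), ebbinghaus_retention(604_800.0))
fresh = hybrid_score(0.8, 0.5, actr_activation([60.0, 3_600.0]), ebbinghaus_retention(60.0))
```

Under these assumed weights, `fresh` scores higher than `stale` even though both match the query equally well, which is the behavior the activation and recency signals are meant to add on top of plain vector search.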

Consolidation

result = await z.consolidate(scope=scope, strategy="aggressive")
# ConsolidationResult(merged=3, promoted=5, demoted=2, conflicts_found=1)

Strategies: default, aggressive, conservative, cleanup
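
As a rough illustration of the "merge duplicates" step only, the sketch below collapses memories whose text is identical after normalization. It is a toy stand-in, not the consolidation engine: the function names are hypothetical, and real consolidation would presumably also use embeddings, conflict detection, and tiering, none of which is shown.

```python
import re


def normalize(text):
    """Lowercase and collapse punctuation/whitespace so near-identical strings match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def merge_duplicates(memories):
    """Return (kept, merged_count); the first occurrence of each normalized text wins."""
    seen = set()
    kept = []
    for memory in memories:
        key = normalize(memory)
        if key in seen:
            continue  # duplicate of something already kept
        seen.add(key)
        kept.append(memory)
    return kept, len(memories) - len(kept)


kept, merged = merge_duplicates(
    ["Deploy on Friday", "deploy on friday.", "Alice leads payments"]
)
```

Here `merged` plays the role of the `merged` counter reported in `ConsolidationResult`.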

Visibility model

PRIVATE  → only exact user_id + agent_id match
SHARED   → any agent of the same user, or same team
GLOBAL   → visible to everyone

Enforced at the SQL level — application code cannot bypass it.
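
The three rules above can be expressed as a single SQL predicate. The sketch below uses Python's built-in `sqlite3` with a hypothetical `memories` table; Zetta's real schema and query are not shown in this README, so the column names and the `visible_contents` helper are assumptions.

```python
import sqlite3

# Hypothetical schema for illustration; Zetta's real tables may differ.
SCHEMA = """
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    user_id TEXT,
    agent_id TEXT,
    team_id TEXT,
    visibility TEXT NOT NULL CHECK (visibility IN ('PRIVATE', 'SHARED', 'GLOBAL'))
)
"""

# GLOBAL rows are always visible; SHARED rows need a matching user or team;
# PRIVATE rows need an exact user_id + agent_id match.
VISIBLE_TO = """
SELECT content FROM memories
WHERE visibility = 'GLOBAL'
   OR (visibility = 'SHARED'
       AND (user_id = :user_id OR team_id = :team_id))
   OR (visibility = 'PRIVATE'
       AND user_id = :user_id AND agent_id = :agent_id)
ORDER BY id
"""


def visible_contents(conn, user_id=None, agent_id=None, team_id=None):
    """Return the contents the given caller is allowed to see."""
    params = {"user_id": user_id, "agent_id": agent_id, "team_id": team_id}
    return [row[0] for row in conn.execute(VISIBLE_TO, params)]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO memories (content, user_id, agent_id, team_id, visibility) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Alice's secret preference: dark mode", "alice", "assistant", None, "PRIVATE"),
        ("Alice is a Python developer", "alice", None, None, "SHARED"),
        ("The company was founded in 2020", None, None, None, "GLOBAL"),
    ],
)
```

Because a comparison with NULL is never true in SQL, a caller who supplies no `user_id` or `team_id` matches only the GLOBAL rows. Putting the filter in the query itself means rows a caller may not see are never returned to application code in the first place.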

Architecture

┌─────────────────────────────────────────────────────┐
│                   Your Agent / App                  │
└────────────────────────┬────────────────────────────┘
                         │  MemoryProtocol ABC
                         ▼
┌─────────────────────────────────────────────────────┐
│                     Zetta Client                    │
│  add · recall · forget · update · share · chain     │
│  consolidate · conflicts · resolve                  │
└──────────┬─────────────────┬───────────────────┬────┘
           │  SmartRouter    │                   │
           ▼                 ▼                   ▼
    ┌──────────┐     ┌──────────────┐    ┌──────────────┐
    │  SQLite  │     │   ChromaDB   │    │  FAISS/Neo4j │
    └──────────┘     └──────────────┘    └──────────────┘

Intelligence layer (Rust core / Python fallback):
  HybridSearchEngine  — BM25 + semantic + activation + recency fusion
  RouterEngine        — EMA-based health-aware backend selection
  ConsolidationEngine — merge duplicates, conflict detection, tier promotion
  MemoryDecay         — Ebbinghaus forgetting curve
  ACT-R Activation    — cognitive activation model
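
"EMA-based health-aware backend selection" can be pictured with the sketch below. It is an assumption-laden toy, not the RouterEngine: the class name `EmaRouter`, the smoothing factor of 0.2, and the policy of preferring untried backends are all made up for illustration. Each backend keeps an exponential moving average of observed latency, and the router picks the healthy backend with the lowest average.

```python
class EmaRouter:
    """Pick the healthy backend with the lowest exponentially averaged latency."""

    def __init__(self, backends, alpha=0.2):
        self.alpha = alpha  # weight given to the newest latency sample
        self.latency_ms = {name: None for name in backends}
        self.healthy = {name: True for name in backends}

    def record(self, backend, latency_ms, ok=True):
        """Record one call; a failed call marks the backend unhealthy."""
        self.healthy[backend] = ok
        if not ok:
            return
        previous = self.latency_ms[backend]
        if previous is None:
            self.latency_ms[backend] = latency_ms
        else:
            self.latency_ms[backend] = self.alpha * latency_ms + (1 - self.alpha) * previous

    def choose(self):
        """Return the best healthy backend; untried backends count as 0 ms."""
        candidates = [name for name, ok in self.healthy.items() if ok]
        if not candidates:
            raise RuntimeError("no healthy backend available")
        return min(candidates, key=lambda name: self.latency_ms[name] or 0.0)


router = EmaRouter(["sqlite", "chroma"])
router.record("sqlite", 2.0)
router.record("chroma", 10.0)
first_choice = router.choose()   # sqlite has the lower average
router.record("sqlite", 100.0)   # one slow call pulls the average up to 21.6 ms
second_choice = router.choose()  # chroma is now the faster option
```

The moving average reacts to sustained slowdowns without discarding history after a single sample; a smaller `alpha` makes the router slower to switch.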

Backends

Backend	Install	Use case
SQLite	built-in	Zero-config, single machine
ChromaDB	`zetta[chroma]`	Local or server-mode vector DB
FAISS	`zetta[faiss]`	High-throughput in-process search
Neo4j	`zetta[neo4j]`	Graph-based relational memory

Protocol

Implement MemoryProtocol to add your own backend or agent:

from zetta.protocol import MemoryProtocol
from zetta.types import MemoryRecord, Scope, ConsolidationResult

class MyMemorySystem(MemoryProtocol):
    async def add(self, content, *, memory_type, scope, metadata, embedding): ...
    async def recall(self, query, *, scope, top_k, memory_types, filters): ...
    async def forget(self, memory_id, *, scope, strategy): ...
    async def update(self, memory_id, *, content, metadata): ...
    # ... 5 more methods

MCP Server

Zetta ships a built-in Model Context Protocol server — connect any MCP-compatible client (Claude Desktop, Cursor, etc.) directly to your memory store:

# stdio transport (Claude Desktop, Cursor)
python -m zetta.mcp

# HTTP transport
python -m zetta.mcp.run_http --port 8766

Tools exposed: remember, recall, forget, update_memory, list_memories, consolidate.

Benchmarks

Run on Apple M3, SQLite backend, HashEmbedder (no GPU, no semantic model):

Operation	p50	p99
`add`	0.35 ms	0.51 ms
`recall` (top-5)	1.75 ms	4.92 ms
Scope isolation	PASS	zero cross-user leakage

With zetta[embeddings] (sentence-transformers): semantic recall precision 94% on standard QA pairs. Run your own: python benchmarks/run_benchmarks.py

Contributing

Contributions welcome! Areas we'd love help with:

🔌 New backends (Pinecone, Weaviate, Qdrant, pgvector)
🤖 New integrations (AutoGen, DSPy, Haystack, Semantic Kernel)
🦀 Rust core improvements (HNSW indexing, SIMD embeddings)
📊 Benchmarks and evals

git clone https://github.com/Manoj-engineer/zetta
cd zetta
python -m venv .venv && source .venv/bin/activate
pip install -e ".[all]"
pytest

License

Apache 2.0

Keywords: agent memory · LLM memory · persistent memory · AI agent framework · LangChain memory ·
CrewAI memory · OpenAI Agents memory · vector store · RAG memory · multi-agent memory ·
memory augmented LLM · mem0 alternative · MemGPT alternative · Letta alternative ·
MCP server · Model Context Protocol · hybrid search · BM25 · semantic search ·
Rust Python · PyO3 · SQLite · ChromaDB · FAISS · Neo4j
