Zetta — Persistent Memory for AI Agents

The universal memory layer for LLM agents — open-source, framework-agnostic, Rust-powered.

Give your AI agent a memory. Any agent. Any framework. Any backend.


Zetta is to agent memory what OpenTelemetry is to observability — one standard interface, pluggable backends, and intelligence built-in.

from zetta import Zetta

z = Zetta()                                    # zero config — SQLite, no API keys
await z.add("User prefers Python over Java")
results = await z.recall("what language does the user prefer?")
# → [MemoryRecord(content="User prefers Python over Java", score=0.94)]

Ships integrations for LangChain, CrewAI, and the OpenAI Agents SDK; any other agent, such as AutoGen, can call the Zetta client directly. Drop in, no lock-in.


Why Zetta?

Most AI agent frameworks ship their own memory, typically a thin wrapper around a vector store with no ranking intelligence, no isolation, and no shared standard. You end up rebuilding the same layer for every project.

Zetta fixes this:

| Problem | Zetta's answer |
| --- | --- |
| Every framework re-invents memory | One protocol (MemoryProtocol), any framework |
| Hard to switch vector stores | Swappable backends: SQLite → ChromaDB → FAISS → Neo4j |
| Plain vector search misses context | Hybrid 4-signal scoring: semantic + BM25 + ACT-R activation + Ebbinghaus decay |
| Private memories leaking between agents | PRIVATE / SHARED / GLOBAL visibility enforced at the SQL level |
| Memory grows forever, costs pile up | Consolidation engine: merge duplicates, detect conflicts, tier promotion |
| Slow Python memory operations | Rust core via PyO3: sub-millisecond add, recall in a few milliseconds (see Benchmarks) |

Zetta vs. Alternatives

Feature Zetta mem0 Letta/MemGPT LangChain Memory
Open-source & self-hosted
Zero-config (no API keys)
Hybrid scoring (BM25 + semantic + ACT-R)
Multi-agent scope isolation partial
Rust performance core
Consolidation + conflict detection partial
Swappable backends partial partial
MCP server built-in
Framework-agnostic protocol

Install

pip install zetta                        # SQLite + hash embedder — zero config
pip install "zetta[embeddings]"          # + sentence-transformers for semantic search
pip install "zetta[chroma]"              # + ChromaDB backend
pip install "zetta[langchain,crewai]"    # + framework integrations
pip install "zetta[all]"                 # everything

Quick Start

Zero-config

import asyncio
from zetta import Zetta

async def main():
    z = Zetta()  # SQLite at ~/.zetta/memory.db

    # Store memories
    await z.add("Alice is the lead engineer on the payments team")
    await z.add("The API uses OAuth2 with JWT tokens")
    await z.add("Deploy happens every Friday at 6pm UTC")

    # Recall by natural language
    results = await z.recall("who works on payments?", top_k=3)
    for r in results:
        print(f"{r.score:.2f}  {r.content}")

asyncio.run(main())

With scope (multi-user / multi-agent)

import asyncio
from zetta import Zetta, Scope
from zetta.types import Visibility

async def main():
    z = Zetta()

    # Private to this agent
    private_scope = Scope(user_id="alice", agent_id="assistant", visibility=Visibility.PRIVATE)
    await z.add("Alice's secret preference: dark mode", scope=private_scope)

    # Shared across Alice's agents
    shared_scope = Scope(user_id="alice", visibility=Visibility.SHARED)
    await z.add("Alice is a Python developer", scope=shared_scope)

    # Global: visible to everyone
    global_scope = Scope(visibility=Visibility.GLOBAL)
    await z.add("The company was founded in 2020", scope=global_scope)

asyncio.run(main())

Framework Integrations

LangChain

from langchain_core.messages import HumanMessage, AIMessage
from zetta import Zetta, Scope
from zetta.integrations.langchain import ZettaChatMessageHistory, ZettaMemory

z = Zetta()
scope = Scope(user_id="alice")

# Chat history
history = ZettaChatMessageHistory(zetta=z, scope=scope, session_id="session-1")
history.add_user_message("What's the capital of France?")
history.add_ai_message("The capital of France is Paris.")

# Long-term memory for chains
memory = ZettaMemory(zetta=z, scope=scope, memory_key="history")
memory.save_context({"input": "I love Paris"}, {"output": "Great choice!"})

CrewAI

from zetta import Zetta, Scope
from zetta.types import Visibility  # needed for Visibility.SHARED below
from zetta.integrations.crewai import ZettaShortTermMemory, ZettaLongTermMemory, ZettaEntityMemory

z = Zetta()
scope = Scope(team_id="my-crew", visibility=Visibility.SHARED)

short_term = ZettaShortTermMemory(zetta=z, scope=scope)
long_term = ZettaLongTermMemory(zetta=z, scope=scope)
entity_mem = ZettaEntityMemory(zetta=z, scope=scope)

# Use as drop-in for crew.memory = True
short_term.save("Meeting concluded: deploy on Friday")
entity_mem.save("Alice leads the payments team", entity="Alice")

OpenAI Agents SDK

from agents import Agent, Runner
from zetta import Zetta, Scope
from zetta.integrations.openai_agents import create_memory_tools

z = Zetta()
scope = Scope(user_id="alice", agent_id="assistant")

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant with persistent memory.",
    tools=create_memory_tools(z, scope),
)

REST Server

Run Zetta as a standalone service:

uvicorn zetta.server:app --host 0.0.0.0 --port 8765

# Store a memory
curl -X POST http://localhost:8765/memories \
  -H "Content-Type: application/json" \
  -d '{"content": "User prefers concise answers", "scope": {"user_id": "alice"}}'

# Search
curl -X POST http://localhost:8765/memories/search \
  -H "Content-Type: application/json" \
  -d '{"query": "communication style", "scope": {"user_id": "alice"}, "top_k": 5}'

# Health check
curl http://localhost:8765/health

Interactive docs at http://localhost:8765/docs.
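
The same calls can be made from Python. The sketch below is a minimal illustration using only the standard library; the endpoint paths and payloads are copied from the curl commands above, while the helper names `build_request` and `post_json` are made up for this example and are not part of Zetta. The response shape is not documented in this README, so the sketch simply decodes whatever JSON comes back.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8765"  # host/port from the uvicorn command above


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request. Hypothetical helper."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 5.0):
    """Send the request and decode the JSON response (shape is an assumption)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Same payloads as the curl examples.
store_req = build_request(
    "/memories",
    {"content": "User prefers concise answers", "scope": {"user_id": "alice"}},
)
search_req = build_request(
    "/memories/search",
    {"query": "communication style", "scope": {"user_id": "alice"}, "top_k": 5},
)
```

With the server running, `post_json("/memories/search", {...})` would perform the search; building the request separately keeps the payload easy to inspect or log before anything is sent.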


Memory Intelligence

Hybrid 4-signal scoring

Every recall fuses four signals:

score = w₁·semantic + w₂·bm25 + w₃·activation + w₄·recency

| Signal | What it measures |
| --- | --- |
| Semantic | Cosine similarity between embeddings |
| BM25 | Keyword overlap (TF-IDF style) |
| ACT-R activation | How often / recently the memory was accessed |
| Ebbinghaus recency | Forgetting curve: recent memories score higher |
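
To make the fusion concrete, here is a rough sketch of how four such signals could be combined. It is not Zetta's implementation: the weights, the decay parameter, the stability constant, and the logistic squashing step are all assumptions chosen for illustration, and the BM25 score is taken as an already-normalized input. The activation term uses the textbook ACT-R base-level form ln(Σ tⱼ^(−d)) and the recency term uses the textbook forgetting curve exp(−t/S).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def actr_activation(ages_seconds, decay=0.5):
    """ACT-R base-level activation: ln(sum of age^-decay over past accesses).

    Ages must be positive; decay=0.5 is an assumed default.
    """
    return math.log(sum(age ** -decay for age in ages_seconds))


def ebbinghaus_retention(age_seconds, stability_seconds=86_400.0):
    """Forgetting curve R = exp(-t / S); the one-day stability is an assumption."""
    return math.exp(-age_seconds / stability_seconds)


def squash(x):
    """Logistic function, used here to map raw activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def hybrid_score(semantic, bm25, activation, recency,
                 weights=(0.4, 0.3, 0.15, 0.15)):
    """Weighted sum of the four signals; the weights are illustrative only."""
    w1, w2, w3, w4 = weights
    return w1 * semantic + w2 * bm25 + w3 * activation + w4 * recency


# One memory, accessed 1 minute and 1 hour ago, last written 1 hour ago.
semantic = cosine_similarity([0.2, 0.9, 0.1], [0.25, 0.8, 0.0])
activation = squash(actr_activation([60.0, 3600.0]))
recency = ebbinghaus_retention(3600.0)
score = hybrid_score(semantic, bm25=0.5, activation=activation, recency=recency)
```

Keeping every signal in the 0 to 1 range before weighting means the weights can be read directly as relative importance.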

Consolidation

result = await z.consolidate(scope=scope, strategy="aggressive")
# ConsolidationResult(merged=3, promoted=5, demoted=2, conflicts_found=1)

Strategies: default, aggressive, conservative, cleanup
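
As an illustration of the "merge duplicates" step only, the sketch below collapses near-duplicate memories using Jaccard similarity over lowercase tokens. This is a stand-in technique picked for clarity; how Zetta's consolidation engine actually detects duplicates, and what each strategy changes, is not described here. The function names and the 0.8 threshold are assumptions.

```python
def tokens(text):
    """Lowercased whitespace tokens as a set."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity between the token sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def merge_duplicates(memories, threshold=0.8):
    """Keep the first memory of each near-duplicate group.

    Returns (kept_memories, merged_count). The threshold is an assumed value;
    a stricter strategy would raise it, a more aggressive one would lower it.
    """
    kept = []
    merged = 0
    for text in memories:
        if any(jaccard(text, existing) >= threshold for existing in kept):
            merged += 1
        else:
            kept.append(text)
    return kept, merged


kept, merged = merge_duplicates([
    "Alice leads the payments team",
    "alice leads the payments team",
    "Deploy happens every Friday at 6pm UTC",
])
```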

Visibility model

PRIVATE  → only exact user_id + agent_id match
SHARED   → any agent of the same user, or same team
GLOBAL   → visible to everyone

Enforced at the SQL level — application code cannot bypass it.
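
One way such a rule could be pushed into SQL is a single WHERE clause that every read goes through. The sketch below uses an in-memory SQLite database; the table layout, column names, and the `visible_to` helper are hypothetical and are not Zetta's actual schema.

```python
import sqlite3

# Hypothetical filter mirroring the three rules above.
VISIBLE_SQL = """
SELECT content FROM memories
WHERE visibility = 'GLOBAL'
   OR (visibility = 'SHARED'
       AND (user_id = :user_id OR team_id = :team_id))
   OR (visibility = 'PRIVATE'
       AND user_id = :user_id AND agent_id = :agent_id)
ORDER BY id
"""


def visible_to(conn, user_id=None, agent_id=None, team_id=None):
    """Return the contents a given caller is allowed to see."""
    params = {"user_id": user_id, "agent_id": agent_id, "team_id": team_id}
    return [row[0] for row in conn.execute(VISIBLE_SQL, params)]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memories (id INTEGER PRIMARY KEY, content TEXT, "
    "user_id TEXT, agent_id TEXT, team_id TEXT, visibility TEXT)"
)
conn.executemany(
    "INSERT INTO memories (content, user_id, agent_id, team_id, visibility) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        ("Alice's secret preference: dark mode", "alice", "assistant", None, "PRIVATE"),
        ("Alice is a Python developer", "alice", None, None, "SHARED"),
        ("The company was founded in 2020", None, None, None, "GLOBAL"),
    ],
)

same_agent = visible_to(conn, user_id="alice", agent_id="assistant")
other_agent = visible_to(conn, user_id="alice", agent_id="planner")
other_user = visible_to(conn, user_id="bob", agent_id="assistant")
```

Because comparisons against NULL are never true in SQL, a caller without a `team_id` cannot match team-shared rows by accident.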



Architecture

┌─────────────────────────────────────────────────────┐
│                   Your Agent / App                  │
└────────────────────────┬────────────────────────────┘
                         │  MemoryProtocol ABC
                         ▼
┌─────────────────────────────────────────────────────┐
│                     Zetta Client                    │
│  add · recall · forget · update · share · chain     │
│  consolidate · conflicts · resolve                  │
└──────────┬─────────────────┬───────────────────┬────┘
           │  SmartRouter    │                   │
           ▼                 ▼                   ▼
    ┌──────────┐     ┌──────────────┐    ┌──────────────┐
    │  SQLite  │     │   ChromaDB   │    │  FAISS/Neo4j │
    └──────────┘     └──────────────┘    └──────────────┘

Intelligence layer (Rust core / Python fallback):
  HybridSearchEngine  — BM25 + semantic + activation + recency fusion
  RouterEngine        — EMA-based health-aware backend selection
  ConsolidationEngine — merge duplicates, conflict detection, tier promotion
  MemoryDecay         — Ebbinghaus forgetting curve
  ACT-R Activation    — cognitive activation model
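
The "EMA-based health-aware backend selection" idea can be sketched as follows: track an exponential moving average of each backend's latency, skip backends whose last call failed, and pick the fastest of the rest. The class name `EmaRouter`, the smoothing factor, and the tie-breaking behavior are assumptions for illustration, not a description of Zetta's RouterEngine.

```python
class EmaRouter:
    """Toy router: lowest EMA latency among healthy backends wins."""

    def __init__(self, backends, alpha=0.2):
        self.alpha = alpha  # smoothing factor (assumed value)
        self.latency_ms = {name: None for name in backends}
        self.healthy = {name: True for name in backends}

    def record(self, name, latency_ms, ok=True):
        """Update health and, for successful calls, the latency average."""
        self.healthy[name] = ok
        if not ok:
            return
        previous = self.latency_ms[name]
        if previous is None:
            self.latency_ms[name] = latency_ms
        else:
            self.latency_ms[name] = (
                self.alpha * latency_ms + (1 - self.alpha) * previous
            )

    def choose(self):
        """Pick a healthy backend; unmeasured ones are tried first."""
        candidates = [name for name, ok in self.healthy.items() if ok]
        if not candidates:
            raise RuntimeError("no healthy backend available")
        return min(
            candidates,
            key=lambda n: self.latency_ms[n] if self.latency_ms[n] is not None else 0.0,
        )


router = EmaRouter(["sqlite", "chroma"])
router.record("sqlite", 1.0)
router.record("chroma", 5.0)
first_choice = router.choose()
```

A small `alpha` makes the router slow to react to a single slow call, while a large one makes it switch backends more eagerly.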

Backends

| Backend | Install | Use case |
| --- | --- | --- |
| SQLite | built-in | Zero-config, single machine |
| ChromaDB | zetta[chroma] | Local or server-mode vector DB |
| FAISS | zetta[faiss] | High-throughput in-process search |
| Neo4j | zetta[neo4j] | Graph-based relational memory |

Protocol

Implement MemoryProtocol to add your own backend or agent:

from zetta.protocol import MemoryProtocol
from zetta.types import MemoryRecord, Scope, ConsolidationResult

class MyMemorySystem(MemoryProtocol):
    async def add(self, content, *, memory_type, scope, metadata, embedding): ...
    async def recall(self, query, *, scope, top_k, memory_types, filters): ...
    async def forget(self, memory_id, *, scope, strategy): ...
    async def update(self, memory_id, *, content, metadata): ...
    # ... 5 more methods
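
To show the general shape of such a class without depending on Zetta, here is a toy in-memory version. It does not subclass the real MemoryProtocol (which has more methods than the four listed); `MiniMemoryProtocol` is a hypothetical stand-in defined inside the example, the keyword names are copied from the signatures above, and the default values and return types are assumptions.

```python
import asyncio
import uuid
from abc import ABC, abstractmethod


class MiniMemoryProtocol(ABC):
    """Hypothetical stand-in: only the four methods shown above."""

    @abstractmethod
    async def add(self, content, *, memory_type=None, scope=None, metadata=None, embedding=None): ...

    @abstractmethod
    async def recall(self, query, *, scope=None, top_k=5, memory_types=None, filters=None): ...

    @abstractmethod
    async def forget(self, memory_id, *, scope=None, strategy=None): ...

    @abstractmethod
    async def update(self, memory_id, *, content=None, metadata=None): ...


class DictMemory(MiniMemoryProtocol):
    """Toy backend: a dict plus word-overlap ranking (no embeddings, no scopes)."""

    def __init__(self):
        self._items = {}

    async def add(self, content, *, memory_type=None, scope=None, metadata=None, embedding=None):
        memory_id = uuid.uuid4().hex
        self._items[memory_id] = {"content": content, "metadata": dict(metadata or {})}
        return memory_id

    async def recall(self, query, *, scope=None, top_k=5, memory_types=None, filters=None):
        words = set(query.lower().split())
        scored = []
        for item in self._items.values():
            overlap = len(words & set(item["content"].lower().split()))
            if overlap:
                scored.append((overlap, item["content"]))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [content for _, content in scored[:top_k]]

    async def forget(self, memory_id, *, scope=None, strategy=None):
        # True if something was actually removed.
        return self._items.pop(memory_id, None) is not None

    async def update(self, memory_id, *, content=None, metadata=None):
        item = self._items[memory_id]
        if content is not None:
            item["content"] = content
        if metadata is not None:
            item["metadata"].update(metadata)


async def demo():
    memory = DictMemory()
    memory_id = await memory.add("The API uses OAuth2 with JWT tokens")
    await memory.add("Deploy happens every Friday at 6pm UTC")
    hits = await memory.recall("how does the API authenticate", top_k=1)
    forgotten = await memory.forget(memory_id)
    return hits, forgotten
```

Running `asyncio.run(demo())` exercises add, recall, and forget in sequence; a real backend would also need to honor `scope` and implement the remaining protocol methods.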

MCP Server

Zetta ships a built-in Model Context Protocol server — connect any MCP-compatible client (Claude Desktop, Cursor, etc.) directly to your memory store:

# stdio transport (Claude Desktop, Cursor)
python -m zetta.mcp

# HTTP transport
python -m zetta.mcp.run_http --port 8766

Tools exposed: remember, recall, forget, update_memory, list_memories, consolidate.


Benchmarks

Run on a MacBook Pro (Apple M3), SQLite backend, HashEmbedder (no GPU, no semantic model):

| Operation | p50 | p99 |
| --- | --- | --- |
| add | 0.35 ms | 0.51 ms |
| recall (top-5) | 1.75 ms | 4.92 ms |
| Scope isolation | PASS | zero cross-user leakage |

With zetta[embeddings] (sentence-transformers): semantic recall precision 94% on standard QA pairs. Run your own: python benchmarks/run_benchmarks.py


Contributing

Contributions welcome! Areas we'd love help with:

  • 🔌 New backends (Pinecone, Weaviate, Qdrant, pgvector)
  • 🤖 New integrations (AutoGen, DSPy, Haystack, Semantic Kernel)
  • 🦀 Rust core improvements (HNSW indexing, SIMD embeddings)
  • 📊 Benchmarks and evals

git clone https://github.com/Manoj-engineer/zetta
cd zetta
python -m venv .venv && source .venv/bin/activate
pip install -e ".[all]"
pytest

License

Apache 2.0


Keywords: agent memory · LLM memory · persistent memory · AI agent framework · LangChain memory · CrewAI memory · OpenAI Agents memory · vector store · RAG memory · multi-agent memory · memory augmented LLM · mem0 alternative · MemGPT alternative · Letta alternative · MCP server · Model Context Protocol · hybrid search · BM25 · semantic search · Rust Python · PyO3 · SQLite · ChromaDB · FAISS · Neo4j
