Commit 4336a1e

docs: update readme
1 parent 7c009e4 commit 4336a1e

3 files changed

+337
-87
lines changed

README.md

Lines changed: 146 additions & 87 deletions
@@ -1,63 +1,17 @@
1-
# MemState Transactional Memory for AI Agents
1+
# MemState - Transactional Memory for AI Agents
22

3-
**Keeps SQL and Vector DBs in sync. No drift. No ghost data. ACID-like consistency for agent state.**
3+
**Agents hallucinate because their memory drifts.**
4+
SQL says one thing, the Vector DB says another. MemState keeps them in sync, always.
5+
6+
> **Mental Model:** MemState extends **database transactions** to your Vector DB.<br>
7+
> One unit. One commit. One rollback.
48
59
[![PyPI version](https://img.shields.io/pypi/v/memstate.svg)](https://pypi.org/project/memstate/)
610
[![PyPI Downloads](https://static.pepy.tech/personalized-badge/memstate?period=total&units=INTERNATIONAL_SYSTEM&left_color=GREY&right_color=GREEN&left_text=downloads)](https://pepy.tech/projects/memstate)
711
[![Python versions](https://img.shields.io/pypi/pyversions/memstate.svg)](https://pypi.org/project/memstate/)
812
[![License](https://img.shields.io/pypi/l/memstate.svg)](https://github.com/scream4ik/MemState/blob/main/LICENSE)
913
[![Tests](https://github.com/scream4ik/MemState/actions/workflows/test.yml/badge.svg)](https://github.com/scream4ik/MemState/actions)
1014

11-
---
12-
13-
## Why MemState exists
14-
15-
AI agents usually store memory in two places:
16-
17-
* **SQL** (structured facts)
18-
* **Vector DB** (semantic search context)
19-
20-
These two **drift** easily:
21-
22-
### ❌ Example of real-world corruption
23-
24-
```python
25-
# Step 1: SQL write succeeds
26-
db.update("user_city", "London")
27-
28-
# Step 2: Vector DB update fails (timeout)
29-
vectors.upsert("User lives in London") # ❌ failed
30-
31-
# Final state:
32-
SQL: London
33-
Vectors: New York
34-
→ Agent retrieves stale context and behaves unpredictably
35-
```
36-
37-
Failures, crashes, retries, malformed payloads — all silently accumulate “ghost vectors” and inconsistent state.
38-
39-
**Vector DBs don't have transactions.
40-
JSON memory has no schema.
41-
Agents drift over time.**
42-
43-
---
44-
45-
## What MemState does
46-
47-
MemState makes all memory operations **atomic**:
48-
49-
```
50-
SQL write + Vector upsert
51-
→ succeed together or rollback together
52-
```
53-
54-
Also provides:
55-
56-
* **Rollback**: undo N steps (SQL + vectors)
57-
* **Type safety**: Pydantic schema validation
58-
* **Append-only Fact Log**: full version history
59-
* **Crash safety**: WAL replay for vector sync
60-
6115
<p align="center">
6216
<img src="https://raw.githubusercontent.com/scream4ik/MemState/main/assets/docs/demo.gif" width="100%" />
6317
<br>
@@ -68,78 +22,160 @@ Also provides:
6822

6923
---
7024

71-
## Minimal example (copy–paste)
25+
## Quick Start
7226

7327
```bash
7428
pip install memstate[chromadb]
7529
```
7630

7731
```python
78-
from memstate import MemoryStore, Fact, SQLiteStorage
32+
from memstate import MemoryStore, Fact, SQLiteStorage, HookError
7933
from memstate.integrations.chroma import ChromaSyncHook
8034
import chromadb
8135

8236
# Storage
83-
sqlite = SQLiteStorage("state.db")
37+
sqlite = SQLiteStorage("agent_memory.db")
8438
chroma = chromadb.Client()
8539

8640
# Hook: sync vectors atomically with SQL
87-
hook = ChromaSyncHook(
88-
client=chroma,
89-
collection_name="memory",
90-
text_field="content",
91-
metadata_fields=["role"]
92-
)
93-
9441
mem = MemoryStore(sqlite)
95-
mem.add_hook(hook)
42+
mem.add_hook(ChromaSyncHook(chroma, "agent_memory", text_field="content", metadata_fields=["role"]))
9643

97-
# Atomic commit: SQL + Vectors
44+
# Multi-step agent workflow
45+
# Each commit is atomic: if the vector DB write fails, the SQL write is rolled back automatically
9846
mem.commit(Fact(
9947
type="profile_update",
10048
payload={"content": "User prefers vegetarian", "role": "preference"}
10149
))
10250

103-
# Rollback: removes SQL row + vector entry
104-
mem.rollback(1)
51+
# Attempt a task that may fail
52+
try:
53+
    mem.commit(Fact(
54+
        type="shopping_list",
55+
        payload={"content": "Generate shopping list based on plan", "role": "task"}
56+
    ))
57+
except HookError as e:
58+
    print("Commit failed, operation rolled back automatically:", e)
59+
60+
# Optional manual rollback of previous step
61+
# mem.rollback(1) # uncomment if you want to undo the last saved fact
10562
```
10663

64+
That's it. Your agent memory is now transactional.
65+
10766
---
10867

109-
## How MemState compares
68+
## LangGraph integration
11069

111-
| Operation | Without MemState | With MemState |
112-
| --------------------------- | -------------------- | ------------------- |
113-
| Vector DB write fails | ❌ SQL+Vector diverge | ✔ auto-rollback |
114-
| Partial workflow crash | ❌ ghost vectors | ✔ consistent |
115-
| LLM outputs malformed JSON | ❌ corrupt state | ✔ schema validation |
116-
| Need to undo last N actions | ❌ impossible | ✔ rollback() |
117-
| Need deterministic behavior | ❌ drift | ✔ ACID-like |
70+
```python
71+
from memstate.integrations.langgraph import MemStateCheckpointer
72+
73+
checkpointer = MemStateCheckpointer(memory=mem)
74+
app = workflow.compile(checkpointer=checkpointer)
75+
```
11876

11977
---
12078

121-
## Ideal for
79+
## Why MemState exists
12280

123-
* **Long-running agents**
124-
* **LangGraph projects** needing reliable state
125-
* **RAG systems** where DB data must match embeddings
126-
* **Local-first setups** (SQLite + Chroma/Qdrant/FAISS)
81+
### The Problem
12782

128-
---
83+
AI agents usually store memory in **two places**:
12984

130-
## LangGraph integration
85+
* **SQL**: structured facts (preferences, task history)
86+
* **Vector DB**: semantic search (embeddings for RAG)
87+
88+
These two stores **drift easily**. Even small failures create inconsistency:
13189

13290
```python
133-
from memstate.integrations.langgraph import MemStateCheckpointer
91+
# Step 1: SQL write succeeds
92+
db.update("user_city", "London")
13493

135-
checkpointer = MemStateCheckpointer(memory=mem)
136-
app = workflow.compile(checkpointer=checkpointer)
94+
# Step 2: Vector DB update fails
95+
vectors.upsert("User lives in London") # ❌ failed
96+
97+
# Final state:
98+
# SQL: London
99+
# Vectors: New York (stale embedding)
137100
```
138101

102+
**Result:** ghost vectors, inconsistent state, unpredictable agent behavior.<br>
103+
Drift accumulates silently: agents keep retrieving outdated or mismatched memory.
104+
105+
---
106+
107+
### Why this happens
108+
109+
Real-world agent pipelines create drift **even with correct code**, because:
110+
111+
* Vector DB upserts are **not atomic**
112+
* Retried writes can produce **duplicates or stale embeddings**
113+
* Async ingestion leads to **race conditions**
114+
* LLM outputs often contain **non-schema JSON**
115+
* Embedding model/version changes create **semantic mismatch**
116+
* SQL writes succeed while vector DB writes fail, so **partial updates persist**
117+
118+
These issues are **invisible until retrieval fails**, making debugging extremely difficult.
119+
MemState prevents this by enforcing **atomic memory operations**: if any part fails, the whole operation is rolled back.
120+
121+
---
122+
123+
## The Solution
124+
125+
MemState makes memory operations **atomic**:
126+
127+
```
128+
SQL write + Vector upsert
129+
→ succeed together or rollback together
130+
```
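
To make "succeed together or rollback together" concrete, here is a minimal sketch of the pattern using only the standard library. It is not MemState's implementation: `FakeVectorStore`, `VectorWriteError`, and `atomic_commit` are hypothetical names invented for illustration, and the vector store is an in-memory stand-in for a real vector DB.

```python
import sqlite3


class VectorWriteError(Exception):
    """Raised by the stand-in vector store to simulate a failed upsert."""


class FakeVectorStore:
    """In-memory stand-in for a vector DB (hypothetical, for illustration only)."""

    def __init__(self):
        self.docs = {}
        self.fail = False  # flip to True to simulate a timeout

    def upsert(self, doc_id, text):
        if self.fail:
            raise VectorWriteError("simulated timeout")
        self.docs[doc_id] = text


def atomic_commit(conn, vectors, doc_id, text):
    """Write to SQL and the vector store; discard the SQL write if the upsert fails."""
    try:
        conn.execute(
            "INSERT OR REPLACE INTO facts (id, content) VALUES (?, ?)",
            (doc_id, text),
        )
        vectors.upsert(doc_id, text)  # may raise VectorWriteError
        conn.commit()                 # both writes succeeded
        return True
    except VectorWriteError:
        conn.rollback()               # drop the uncommitted SQL write
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (id TEXT PRIMARY KEY, content TEXT)")
conn.commit()

vectors = FakeVectorStore()
ok = atomic_commit(conn, vectors, "user_city", "User lives in London")

vectors.fail = True  # the next vector write fails
failed = atomic_commit(conn, vectors, "user_diet", "User is vegetarian")

rows = conn.execute("SELECT id FROM facts ORDER BY id").fetchall()
print(ok, failed, rows, list(vectors.docs))
```

In this sketch the SQL write is held in an open transaction until the vector upsert returns, so a failed upsert leaves neither store changed. A production version would also need to handle the opposite case, where the upsert succeeds but the SQL commit fails.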
131+
132+
## Key features
133+
134+
* **Atomic commits**: SQL and Vector DB stay in sync
135+
* **Rollback**: undo N steps across SQL and vectors
136+
* **Type safety**: Pydantic validation prevents malformed JSON
137+
* **Append-only Fact Log**: full versioned history
138+
* **Crash-safe atomicity**: if a vector DB write fails, the entire memory operation (SQL + vector) is **rolled back**.
139+
No partial writes, no ghost embeddings, no inconsistent checkpoints.
140+
141+
---
142+
143+
## Proof: Benchmark under failure
144+
145+
1000 memory updates with **10% random vector DB failures**:
146+
147+
| METRIC | MANUAL SYNC | MEMSTATE |
148+
| ---------------------- | ----------- | -------- |
149+
| SQL Records | 1000 | 900 |
150+
| Vector Records | 910 | 900 |
151+
| **DATA DRIFT** | **90** | **0** |
152+
| **INCONSISTENCY RATE** | **9.0%** | **0.0%** |
153+
154+
**Why 900 instead of 1000?**
155+
MemState refuses partial writes.<br>
156+
If vector sync fails, SQL is rolled back automatically.
157+
158+
Manual sync produces silent drift.<br>
159+
Drift compounds over time, and stale embeddings keep being retrieved indefinitely.
160+
161+
Full benchmark script: [`benchmarks/`](benchmarks/)
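
The shape of this comparison can be reproduced with a toy simulation. The sketch below is not the script in `benchmarks/`; `simulate` and its parameters are hypothetical, and it only counts records instead of talking to real databases.

```python
import random


def simulate(updates=1000, failure_rate=0.10, seed=0):
    """Count records under manual sync vs. an all-or-nothing commit."""
    rng = random.Random(seed)
    failures = 0
    manual_sql = manual_vec = 0
    atomic_sql = atomic_vec = 0

    for _ in range(updates):
        vector_fails = rng.random() < failure_rate
        failures += vector_fails

        # Manual sync: the SQL write always lands, the vector write only on success.
        manual_sql += 1
        if not vector_fails:
            manual_vec += 1

        # Atomic commit: both writes land, or neither does.
        if not vector_fails:
            atomic_sql += 1
            atomic_vec += 1

    return {
        "failures": failures,
        "manual_drift": manual_sql - manual_vec,
        "atomic_drift": atomic_sql - atomic_vec,
        "atomic_records": atomic_sql,
    }


result = simulate()
print(result)
```

Under these assumptions, manual sync drifts by exactly the number of failed vector writes, while the atomic variant stores fewer records but never diverges. The exact counts depend on the random seed.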
162+
163+
---
164+
165+
## Ideal for
166+
167+
* Long-running agents
168+
* LangGraph workflows
169+
* RAG systems requiring strict DB <-> embedding consistency
170+
* Local-first / offline-first setups (SQLite/Redis + Chroma/Qdrant/FAISS)
171+
* Deterministic, debuggable agentic pipelines
172+
139173
---
140174

141175
## Storage backends
142176

177+
All backends participate in the same atomic commit cycle:
178+
143179
* SQLite (JSON1)
144180
* Redis
145181
* In-memory
@@ -149,16 +185,39 @@ Vector sync hooks: ChromaDB (more coming)
149185

150186
---
151187

188+
## When you don't need it
189+
190+
* Your agent is fully stateless
191+
* You store everything in a single SQL table
192+
* You never update embeddings after creation
193+
194+
If your pipelines depend on RAG or long-term state, consistency **is** required. Most teams realize this only when debugging unpredictable retrieval.
195+
196+
---
197+
152198
## Status
153199

154-
**Alpha.** API stable enough for prototypes and local agents.
155-
Semantic Versioning.
200+
**Early Access.**
201+
Production-ready for local agents and prototypes. The API is stable and follows Semantic Versioning.
202+
203+
Community contributions welcome.
204+
205+
---
206+
207+
## Install
208+
209+
```bash
210+
pip install memstate[chromadb]
211+
```
212+
213+
**[Star the repo](https://github.com/scream4ik/MemState)**
214+
🐛 **[Report issues](https://github.com/scream4ik/MemState/issues)**
156215

157216
---
158217

159218
## License
160219

161-
Licensed under the [Apache 2.0 License](LICENSE).
220+
Apache 2.0 — see [LICENSE](LICENSE)
162221

163222
---
164223

assets/docs/demo.gif

1.12 MB