1- # MemState — Transactional Memory for AI Agents
1+ # MemState - Transactional Memory for AI Agents
22
3- ** Keeps SQL and Vector DBs in sync. No drift. No ghost data. ACID-like consistency for agent state.**
3+ **Agents hallucinate because their memory drifts.**
4+ SQL says one thing, the Vector DB says another. MemState keeps them in sync, always.
5+
6+ > **Mental Model:** MemState extends **database transactions** to your Vector DB.<br>
7+ > One unit. One commit. One rollback.
48
59[![PyPI version](https://img.shields.io/pypi/v/memstate.svg)](https://pypi.org/project/memstate/)
610[![PyPI Downloads](https://static.pepy.tech/personalized-badge/memstate?period=total&units=INTERNATIONAL_SYSTEM&left_color=GREY&right_color=GREEN&left_text=downloads)](https://pepy.tech/projects/memstate)
711[![Python versions](https://img.shields.io/pypi/pyversions/memstate.svg)](https://pypi.org/project/memstate/)
812[![License](https://img.shields.io/pypi/l/memstate.svg)](https://github.com/scream4ik/MemState/blob/main/LICENSE)
913[![Tests](https://github.com/scream4ik/MemState/actions/workflows/test.yml/badge.svg)](https://github.com/scream4ik/MemState/actions)
1014
11- ---
12-
13- ## Why MemState exists
14-
15- AI agents usually store memory in two places:
16-
17- * ** SQL** (structured facts)
18- * ** Vector DB** (semantic search context)
19-
20- These two ** drift** easily:
21-
22- ### ❌ Example of real-world corruption
23-
24- ``` python
25- # Step 1: SQL write succeeds
26- db.update(" user_city" , " London" )
27-
28- # Step 2: Vector DB update fails (timeout)
29- vectors.upsert(" User lives in London" ) # ❌ failed
30-
31- # Final state:
32- SQL : London
33- Vectors: New York
34- → Agent retrieves stale context and behaves unpredictably
35- ```
36-
37- Failures, crashes, retries, malformed payloads — all silently accumulate “ghost vectors” and inconsistent state.
38-
39- ** Vector DBs don't have transactions.
40- JSON memory has no schema.
41- Agents drift over time.**
42-
43- ---
44-
45- ## What MemState does
46-
47- MemState makes all memory operations ** atomic** :
48-
49- ```
50- SQL write + Vector upsert
51- → succeed together or rollback together
52- ```
53-
54- Also provides:
55-
56- * ** Rollback** : undo N steps (SQL + vectors)
57- * ** Type safety** : Pydantic schema validation
58- * ** Append-only Fact Log** : full version history
59- * ** Crash safety** : WAL replay for vector sync
60-
6115<p align="center">
6216 <img src="https://raw.githubusercontent.com/scream4ik/MemState/main/assets/docs/demo.gif" width="100%" />
6317 <br>
@@ -68,78 +22,160 @@ Also provides:
6822
6923---
7024
71- ## Minimal example (copy–paste)
25+ ## Quick Start
7226
7327``` bash
7428pip install memstate[chromadb]
7529```
7630
7731``` python
78- from memstate import MemoryStore, Fact, SQLiteStorage
32+ from memstate import MemoryStore, Fact, SQLiteStorage, HookError
7933from memstate.integrations.chroma import ChromaSyncHook
8034import chromadb
8135
8236# Storage
83- sqlite = SQLiteStorage(" state .db" )
37+ sqlite = SQLiteStorage("agent_memory.db")
8438chroma = chromadb.Client()
8539
8640# Hook: sync vectors atomically with SQL
87- hook = ChromaSyncHook(
88- client = chroma,
89- collection_name = " memory" ,
90- text_field = " content" ,
91- metadata_fields = [" role" ]
92- )
93-
9441mem = MemoryStore(sqlite)
95- mem.add_hook(hook )
42+ mem.add_hook(ChromaSyncHook(chroma, "agent_memory", text_field="content", metadata_fields=["role"]))
9643
97- # Atomic commit: SQL + Vectors
44+ # Multi-step agent workflow
45+ # Each commit is atomic: if the vector DB write fails, the SQL write is rolled back automatically
9846mem.commit(Fact(
9947    type="profile_update",
10048    payload={"content": "User prefers vegetarian", "role": "preference"}
10149))
10250
103- # Rollback: removes SQL row + vector entry
104- mem.rollback(1 )
51+ # Attempt a task that may fail
52+ try:
53+     mem.commit(Fact(
54+         type="shopping_list",
55+         payload={"content": "Generate shopping list based on plan", "role": "task"}
56+     ))
57+ except HookError as e:
58+     print("Commit failed, operation rolled back automatically:", e)
59+
60+ # Optional manual rollback of previous step
61+ # mem.rollback(1)  # uncomment if you want to undo the last saved fact
10562```
10663
64+ That's it. Your agent memory is now transactional.
65+
10766---
10867
109- ## How MemState compares
68+ ## LangGraph integration
11069
111- | Operation | Without MemState | With MemState |
112- | --------------------------- | -------------------- | ------------------- |
113- | Vector DB write fails | ❌ SQL+Vector diverge | ✔ auto-rollback |
114- | Partial workflow crash | ❌ ghost vectors | ✔ consistent |
115- | LLM outputs malformed JSON | ❌ corrupt state | ✔ schema validation |
116- | Need to undo last N actions | ❌ impossible | ✔ rollback() |
117- | Need deterministic behavior | ❌ drift | ✔ ACID-like |
70+ ``` python
71+ from memstate.integrations.langgraph import MemStateCheckpointer
72+
73+ checkpointer = MemStateCheckpointer(memory=mem)
74+ app = workflow.compile(checkpointer=checkpointer)  # workflow: your existing LangGraph graph
75+ ```
11876
11977---
12078
121- ## Ideal for
79+ ## Why MemState exists
12280
123- * ** Long-running agents**
124- * ** LangGraph projects** needing reliable state
125- * ** RAG systems** where DB data must match embeddings
126- * ** Local-first setups** (SQLite + Chroma/Qdrant/FAISS)
81+ ### The Problem
12782
128- ---
83+ AI agents usually store memory in **two places**:
12984
130- ## LangGraph integration
85+ * **SQL**: structured facts (preferences, task history)
86+ * **Vector DB**: semantic search (embeddings for RAG)
87+
88+ These two stores **drift easily**. Even small failures create inconsistency:
13189
13290``` python
133- from memstate.integrations.langgraph import MemStateCheckpointer
91+ # Step 1: SQL write succeeds
92+ db.update("user_city", "London")
13493
135- checkpointer = MemStateCheckpointer(memory = mem)
136- app = workflow.compile(checkpointer = checkpointer)
94+ # Step 2: Vector DB update fails
95+ vectors.upsert("User lives in London")  # ❌ failed
96+
97+ # Final state:
98+ # SQL: London
99+ # Vectors: New York (stale embedding)
137100```
138101
102+ **Result:** ghost vectors, inconsistent state, unpredictable agent behavior.<br>
103+ Drift accumulates silently, and agents keep retrieving outdated or mismatched memory.
104+
105+ ---
106+
107+ ### Why this happens
108+
109+ Real-world agent pipelines create drift **even with correct code**, because:
110+
111+ * Vector DB upserts are **not atomic** with the SQL write
112+ * Retried writes can produce **duplicates or stale embeddings**
113+ * Async ingestion leads to **race conditions**
114+ * LLM outputs often contain **non-schema JSON**
115+ * Embedding model/version changes create **semantic mismatch**
116+ * SQL writes succeed while the vector DB write fails, so **partial updates persist**
117+
118+ These issues are **invisible until retrieval fails**, which makes them hard to debug.
119+ MemState prevents this by enforcing **atomic memory operations**: if any part fails, the whole operation is rolled back.
120+
121+ ---
122+
123+ ## The Solution
124+
125+ MemState makes memory operations **atomic**:
126+
127+ ```
128+ SQL write + Vector upsert
129+ → succeed together or rollback together
130+ ```
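The "succeed together or roll back together" rule can be illustrated without MemState at all. The sketch below is only an illustration under stated assumptions: it uses the standard-library `sqlite3` module, and `FakeVectorStore`, `VectorWriteError`, and `atomic_commit` are hypothetical names invented for this example, not part of the MemState API or its actual implementation.

```python
import sqlite3


class VectorWriteError(Exception):
    """Raised by the stand-in vector store when an upsert fails."""


class FakeVectorStore:
    """Hypothetical in-memory stand-in for a vector DB (not a MemState class)."""

    def __init__(self):
        self.docs = {}
        self.fail = False  # flip to True to simulate a timeout

    def upsert(self, doc_id, text):
        if self.fail:
            raise VectorWriteError("simulated timeout")
        self.docs[doc_id] = text


def atomic_commit(conn, vectors, doc_id, text):
    """Write to SQL and the vector store; undo the SQL write if the upsert fails."""
    try:
        with conn:  # commits on success, rolls back if the block raises
            conn.execute(
                "INSERT OR REPLACE INTO facts (id, content) VALUES (?, ?)",
                (doc_id, text),
            )
            vectors.upsert(doc_id, text)
        return True
    except VectorWriteError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (id TEXT PRIMARY KEY, content TEXT)")
vectors = FakeVectorStore()

ok = atomic_commit(conn, vectors, "user_city", "User lives in London")

vectors.fail = True  # the next vector write times out
failed = atomic_commit(conn, vectors, "user_diet", "User prefers vegetarian")

rows = conn.execute("SELECT id FROM facts ORDER BY id").fetchall()
print(ok, failed, rows, sorted(vectors.docs))
```

After the failed second commit, both stores hold only the first fact, so nothing is left half-written.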
131+
132+ ## Key features
133+
134+ * **Atomic commits**: SQL and Vector DB stay in sync
135+ * **Rollback**: undo N steps across SQL and vectors
136+ * **Type safety**: Pydantic validation rejects malformed payloads
137+ * **Append-only Fact Log**: full versioned history
138+ * **Crash-safe atomicity**: if a vector DB write fails, the entire memory operation (SQL + vector) is **rolled back**.
139+   No partial writes, no ghost embeddings, no inconsistent checkpoints.
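To make "undo N steps" and "append-only log" concrete, here is a toy model in plain Python. It is a sketch under assumptions, not MemState's implementation: `TinyMemory` and its fields are hypothetical, and two dictionaries stand in for the SQL table and the vector index. Rollbacks are recorded as new log entries, so history is never rewritten.

```python
from dataclasses import dataclass, field


@dataclass
class TinyMemory:
    """Hypothetical toy store: an append-only log plus two views kept in step."""

    log: list = field(default_factory=list)      # append-only history
    undo: list = field(default_factory=list)     # (key, previous value) per live commit
    sql: dict = field(default_factory=dict)      # stand-in for structured storage
    vectors: dict = field(default_factory=dict)  # stand-in for the vector index

    def commit(self, key, text):
        self.undo.append((key, self.sql.get(key)))
        self.log.append(("commit", key, text))
        self._apply(key, text)

    def rollback(self, steps=1):
        # Undo the most recent commits, newest first, in both views.
        for _ in range(min(steps, len(self.undo))):
            key, previous = self.undo.pop()
            self.log.append(("rollback", key, previous))
            self._apply(key, previous)

    def _apply(self, key, text):
        # Both views change in the same step, so they cannot disagree.
        if text is None:
            self.sql.pop(key, None)
            self.vectors.pop(key, None)
        else:
            self.sql[key] = text
            self.vectors[key] = text


mem_sketch = TinyMemory()
mem_sketch.commit("city", "New York")
mem_sketch.commit("city", "London")
mem_sketch.commit("diet", "vegetarian")
mem_sketch.rollback(2)  # undoes "diet", then restores city to "New York"
print(mem_sketch.sql, len(mem_sketch.log))
```

Three commits plus two rollback entries leave five records in the log, while both views return to the state after the first commit.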
140+
141+ ---
142+
143+ ## Proof: Benchmark under failure
144+
145+ 1000 memory updates with **10% random vector DB failures**:
146+
147+ | METRIC                 | MANUAL SYNC | MEMSTATE |
148+ | ---------------------- | ----------- | -------- |
149+ | SQL Records            | 1000        | 900      |
150+ | Vector Records         | 910         | 900      |
151+ | **DATA DRIFT**         | **90**      | **0**    |
152+ | **INCONSISTENCY RATE** | **9.0%**    | **0.0%** |
153+
154+ **Why 900 instead of 1000?**
155+ MemState refuses partial writes.<br>
156+ If vector sync fails, SQL is rolled back automatically.
157+
158+ Manual sync produces silent drift.<br>
159+ Drift compounds over time, and stale embeddings keep being retrieved.
160+
161+ Full benchmark script: [`benchmarks/`](benchmarks/)
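The arithmetic behind the table can be reproduced with a small simulation. This is a hedged sketch, not the script in `benchmarks/`: the `run` function, its parameters, and the seed are assumptions made for illustration, and the exact counts depend on the random draws, so they will not necessarily match the table.

```python
import random


def run(updates, failure_rate, atomic, seed=0):
    """Count records in each store after a series of writes with random vector failures."""
    rng = random.Random(seed)
    sql, vectors = set(), set()
    for i in range(updates):
        vector_ok = rng.random() >= failure_rate
        if atomic and not vector_ok:
            continue  # whole operation rolled back: nothing is stored
        sql.add(i)  # without rollback, the SQL write always lands
        if vector_ok:
            vectors.add(i)
    return len(sql), len(vectors)


manual_sql, manual_vec = run(1000, 0.10, atomic=False)
atomic_sql, atomic_vec = run(1000, 0.10, atomic=True)

manual_drift = manual_sql - manual_vec  # rows with no matching vector
atomic_drift = atomic_sql - atomic_vec

print("manual:", manual_sql, manual_vec, manual_drift)
print("atomic:", atomic_sql, atomic_vec, atomic_drift)
```

With manual sync every failed vector write becomes one drifted record. With all-or-nothing writes the two counts always match, at the cost of storing fewer records.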
162+
163+ ---
164+
165+ ## Ideal for
166+
167+ * Long-running agents
168+ * LangGraph workflows
169+ * RAG systems requiring strict DB <-> embedding consistency
170+ * Local-first / offline-first setups (SQLite/Redis + Chroma/Qdrant/FAISS)
171+ * Deterministic, debuggable agentic pipelines
172+
139173---
140174
141175## Storage backends
142176
177+ All backends participate in the same atomic commit cycle:
178+
143179* SQLite (JSON1)
144180* Redis
145181* In-memory
@@ -149,16 +185,39 @@ Vector sync hooks: ChromaDB (more coming)
149185
150186---
151187
188+ ## When you don't need it
189+
190+ * Your agent is fully stateless
191+ * You store everything in a single SQL table
192+ * You never update embeddings after creation
193+
194+ If your pipelines depend on RAG or long-term state, consistency **is** required. Most teams realize this only when debugging unpredictable retrieval.
195+
196+ ---
197+
152198## Status
153199
154- ** Alpha.** API stable enough for prototypes and local agents.
155- Semantic Versioning.
200+ **Early Access.**
201+ Production-ready for local agents and prototypes. API is stable (Semantic Versioning).
202+
203+ Community contributions welcome.
204+
205+ ---
206+
207+ ## Install
208+
209+ ``` bash
210+ pip install memstate[chromadb]
211+ ```
212+
213+ ⭐ **[Star the repo](https://github.com/scream4ik/MemState)**
214+ 🐛 **[Report issues](https://github.com/scream4ik/MemState/issues)**
156215
157216---
158217
159218## License
160219
161- Licensed under the [ Apache 2.0 License ] ( LICENSE ) .
220+ Apache 2.0. See [LICENSE](LICENSE).
162221
163222---
164223