Skip to content

menonpg/soul.py

Repository files navigation

soul.py 🧠

PyPI version PyPI downloads License: MIT Featured on Shipit Product Hunt Amazon Book

Star History

Your AI forgets everything when the conversation ends. soul.py fixes that.

📖 NEW: The book is out! Soul: Building AI Agents That Remember Who They Are — everything here + deep dives on identity, memory patterns, multi-agent coordination, and the philosophy of persistent AI. Get it on Amazon →

from hybrid_agent import HybridAgent

agent = HybridAgent()
agent.ask("My name is Prahlad and I'm building an AI research lab.")

# New process. New session. Memory persists.
agent = HybridAgent()
result = agent.ask("What do you know about me?")
print(result["answer"])
# → "You're Prahlad, building an AI research lab."

No database. No server. Just markdown files and smart retrieval.


▶ Live Demos

Version Demo What it shows
v0.1 soul.themenonlab.com Memory persists across sessions
v1.0 soulv1.themenonlab.com Semantic RAG retrieval
v2.0 soulv2.themenonlab.com Auto query routing: RAG + RLM
Ask Darwin soul-book.themenonlab.com 📖 Book companion — watch routing decisions live

📚 The Book

Soul: Building AI Agents That Remember Who They Are

The complete guide to persistent AI memory. Covers:

  • Why agents forget (and the architectural fix)
  • Identity vs Memory (SOUL.md vs MEMORY.md)
  • RAG vs RLM (when to use each)
  • Multi-agent memory sharing
  • Darwinian evolution of agent identity
  • Working code in every chapter

→ Available on Amazon


Install

pip install soul-agent
pip install soul-agent[anthropic]
pip install soul-agent[openai]
pip install soul-agent[gemini]   # ✅ Now available!

Quickstart

soul init   # creates SOUL.md and MEMORY.md
# v0.1 — simple markdown memory (great starting point)
from soul import Agent
agent = Agent(provider="anthropic")
agent.ask("Remember this.")

# v2.0 — automatic RAG + RLM routing (this repo's default)
from hybrid_agent import HybridAgent
agent = HybridAgent()  # auto-detects best retrieval per query
result = agent.ask("What do you know about me?")
print(result["answer"])
print(result["route"])   # "RAG" or "RLM"

Multi-Provider Support

soul.py works with any LLM provider — no SDK lock-in:

# Anthropic (default)
agent = HybridAgent(provider="anthropic")  # Uses ANTHROPIC_API_KEY

# Google Gemini
agent = HybridAgent(
    provider="gemini",
    chat_model="gemini-2.5-pro",       # or gemini-2.0-flash, gemini-2.5-flash
    router_model="gemini-2.0-flash",   # keep router cheap
)  # Uses GEMINI_API_KEY

# OpenAI
agent = HybridAgent(provider="openai")  # Uses OPENAI_API_KEY

# Local via Ollama
agent = HybridAgent(
    provider="openai-compatible",
    base_url="http://localhost:11434/v1",
    chat_model="llama3.2",
)
Provider Default Model Env Var
anthropic claude-haiku-4-5 ANTHROPIC_API_KEY
gemini gemini-2.0-flash GEMINI_API_KEY
openai gpt-4o-mini OPENAI_API_KEY
openai-compatible llama3.2 OPENAI_API_KEY (optional)

☁️ SoulMate API — Managed Cloud Option

Don't want to manage local files? SoulMate API gives you persistent memory as a service:

from soulmate import SoulMateClient

# Sign up at soulmate-api.themenonlab.com/docs
client = SoulMateClient(
    api_key="sm_live_...",
    anthropic_key="sk-ant-..."  # BYOK — your own Anthropic key
)

# That's it. Memory persists in the cloud.
response = client.ask("My name is Prahlad.")
response = client.ask("What's my name?")  # → "Prahlad"
Local (soul.py) Cloud (SoulMate API)
Files on your machine Managed cloud storage
You control everything Zero infrastructure
Git-versioned memory API-based, instant setup
Free forever Free tier available

Get started: soulmate-api.themenonlab.com/docs


How it works

soul.py uses two markdown files as persistent state:

File Purpose
SOUL.md Identity — who the agent is, how it behaves
MEMORY.md Memory — timestamped log of every exchange

v2.0 adds a query router that automatically dispatches to the right retrieval strategy:

Your query
    ↓
Router (fast LLM call)
    ├── FOCUSED  (~90%) → RAG — vector search, sub-second
    └── EXHAUSTIVE (~10%) → RLM — recursive synthesis, thorough

Architecture based on: RAG + RLM: The Complete Knowledge Base Architecture


Branches

Branch Description Best for
main v2.0 — RAG + RLM hybrid (default) Production use
v2.0-rag-rlm Same as main, versioned Pinning to v2
v1.0-rag RAG only, no RLM Simpler setup
v0.1-stable Pure markdown, zero deps Learning / prototyping

v2.0 API

result = agent.ask("What is my name?")

result["answer"]        # the response
result["route"]         # "RAG" or "RLM"
result["router_ms"]     # router latency
result["retrieval_ms"]  # retrieval latency
result["total_ms"]      # total latency
result["rag_context"]   # retrieved chunks (RAG path)
result["rlm_meta"]      # chunk stats (RLM path)

v2.0 Setup

agent = HybridAgent(
    soul_path="SOUL.md",
    memory_path="MEMORY.md",
    mode="auto",                    # "auto" | "rag" | "rlm"
    qdrant_url="...",               # or set QDRANT_URL env var
    qdrant_api_key="...",           # or QDRANT_API_KEY
    azure_embedding_endpoint="...", # or AZURE_EMBEDDING_ENDPOINT
    azure_embedding_key="...",      # or AZURE_EMBEDDING_KEY
    k=5,                            # RAG retrieval count
)

Falls back to BM25 (keyword) if Qdrant/Azure not configured.


📚 Knowledge Bases + Memory

soul.py isn't just for personal memory — the same architecture works for custom knowledge bases. Combine both in a single agent:

agent = HybridAgent(
    soul_path="SOUL.md",
    memory_path="MEMORY.md",        # Per-user memory
    knowledge_dir="./knowledge",     # Your corpus (docs, products, policies)
)

# Index your knowledge base once
agent.index_knowledge()

# Now the agent searches both pools
agent.ask("What's the return policy?")         # → Knowledge base
agent.ask("What was I asking about earlier?")  # → User memory
agent.ask("Which product fits my needs?")      # → Both

Example use cases:

Agent Type Knowledge Base Memory
Support Bot Product docs, policies, FAQs Customer history, preferences
Research Assistant Paper corpus, methodologies User's focus, papers read
Onboarding Buddy Company handbook, org chart New hire's role, questions
Book Companion Full book content Reader's interests, progress

Darwin (the AI companion for the Soul book) uses exactly this pattern — the entire book indexed as knowledge, plus per-reader conversation memory.

See the Memory Architecture Patterns guide for detailed implementation patterns.


🔌 Framework Integrations

Already using a framework? Drop in soul.py memory with one line:

Framework Package Install
LangChain langchain-soul pip install langchain-soul
LlamaIndex llamaindex-soul pip install llamaindex-soul
CrewAI crewai-soul pip install crewai-soul
# LangChain
from langchain_soul import SoulChatMessageHistory
history = SoulChatMessageHistory(session_id="user-123")

# LlamaIndex
from llamaindex_soul import SoulChatStore
chat_store = SoulChatStore()

# CrewAI
from crewai_soul import SoulMemory
memory = SoulMemory()

Each integration includes:

  • soul-agent — RAG + RLM hybrid retrieval
  • soul-schema — Database semantic layer (auto-document your tables)
  • SoulMate client — Managed cloud option

Why not LangChain / LlamaIndex / MemGPT?

Those are orchestration frameworks. soul.py is a primitive — persistent identity and memory you can drop into anything you're building.

  • No framework lock-in — works with any LLM provider, or with your favorite framework via integrations above
  • Human-readable — SOUL.md and MEMORY.md are plain text
  • Version-controllable — git diff your agent's memories
  • Composable — use just the parts you need

Roadmap

See ROADMAP.md for planned features and how to contribute.


License

MIT

Citation

@software{menon2026soul,
  author = {Menon, Prahlad G.},
  title  = {soul.py: Persistent Identity and Memory for LLM Agents},
  year   = {2026},
  url    = {https://github.com/menonpg/soul.py}
}

About

Persistent identity and memory for any LLM agent — markdown-native, provider-agnostic

Resources

Stars

Watchers

Forks

Packages

 
 
 

Languages