🧩 LangMiddle - Production Middleware for LangGraph

Supercharge your LangGraph agents with plug‑and‑play memory, context management, and chat persistence.

🎯 Why LangMiddle?

Building production LangGraph agents? You need:

  • 💾 Persistent chat history across sessions
  • 🧠 Long-term memory that remembers user preferences and context
  • 🔍 Semantic fact retrieval to inject relevant knowledge
  • 🗑️ Clean message handling without tool noise

LangMiddle delivers all of this with zero boilerplate: just add middleware to your agent.

✨ Key Features

| Feature | Description |
| --- | --- |
| 🚀 Zero Config Start | Works out of the box with in-memory SQLite; no database setup |
| 🔄 Multi-Backend Storage | Switch between SQLite, PostgreSQL, Supabase, Firebase with one parameter |
| 🧠 Semantic Memory | Automatic fact extraction, deduplication, and context injection |
| 📝 Smart Summarization | Auto-compress long conversations while preserving context |
| 🔐 Production Ready | JWT auth, RLS support, type-safe APIs, comprehensive logging |
| ⚡ LangGraph Native | Built for the LangChain/LangGraph v1 middleware pattern |

📦 Installation

Core Package (includes SQLite support):

pip install langmiddle

With Optional Backends:

# For SQLite with vector search (sqlite-vec)
pip install langmiddle[sqlite]

# For PostgreSQL
pip install langmiddle[postgres]

# For Supabase (includes PostgreSQL)
pip install langmiddle[supabase]

# For Firebase
pip install langmiddle[firebase]

# All backends + extras
pip install langmiddle[all]

🚀 Quick Start

Basic Chat Persistence (SQLite)

Get started in 30 seconds:

from langchain.agents import create_agent
from langmiddle.history import ChatSaver, StorageContext

agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[
        ChatSaver()  # Uses in-memory SQLite by default
    ],
)

# Chat history automatically saved!
agent.invoke(
    input={"messages": [{"role": "user", "content": "Hello!"}]},
    context=StorageContext(
        thread_id="conversation-123",
        user_id="user-456"
    )
)

Production Setup (Recommended)

For production apps, use StorageConfig to share settings between middleware components (like ContextEngineer for memory and ChatSaver for history).

from langchain.agents import create_agent
from langmiddle import StorageConfig
from langmiddle.history import ChatSaver, StorageContext
from langmiddle.context import ContextEngineer

# 1. Define shared configuration (e.g., for Supabase)
config = StorageConfig(
    backend="supabase",
    enable_facts=True,
    auto_create_tables=True
)

# 2. Create agent with middleware
agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[
        # Both components use the same config
        ContextEngineer(
            model="openai:gpt-4o",
            embedder="openai:text-embedding-3-small",
            backend=config
        ),
        ChatSaver(backend=config)
    ]
)

# 3. Invoke with context
agent.invoke(
    input={"messages": [{"role": "user", "content": "I'm vegan."}]},
    context=StorageContext(
        thread_id="thread-1",
        user_id="user-1",
        auth_token="your-jwt-token"
    )
)

🔍 How It Works

Message Flow

User Input
    ↓
[ToolRemover] ← Cleans tool messages (optional)
    ↓
[ContextEngineer.before_agent] ← Injects facts + summary
    ↓
🤖 LangGraph Agent
    ↓
[ContextEngineer.after_agent] ← Extracts new facts
    ↓
[ChatSaver] ← Persists conversation
    ↓
Response
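
The flow above can be mimicked with plain functions. This is only a toy model of the diagram, not LangChain's actual middleware dispatch: `run_pipeline`, the hook functions, and the dict-shaped messages are all hypothetical stand-ins.

```python
def run_pipeline(messages, before_hooks, agent, after_hooks):
    """Run before-hooks in order, then the agent, then after-hooks in order."""
    trace = []
    for name, hook in before_hooks:
        messages = hook(messages)
        trace.append(name)
    messages = agent(messages)
    trace.append("agent")
    for name, hook in after_hooks:
        messages = hook(messages)
        trace.append(name)
    return messages, trace


def drop_tool_messages(messages):
    # Stand-in for ToolRemover: discard raw tool output.
    return [m for m in messages if m["role"] != "tool"]


def inject_context(messages):
    # Stand-in for ContextEngineer.before_agent: prepend a remembered fact.
    fact = {"role": "system", "content": "Known fact: user is vegan."}
    return [fact] + messages


def fake_agent(messages):
    # Stand-in for the LLM call.
    return messages + [{"role": "assistant", "content": "Noted."}]


saved = []


def persist(messages):
    # Stand-in for ChatSaver: copy whatever reaches the end of the chain.
    saved.extend(messages)
    return messages


final, trace = run_pipeline(
    [{"role": "user", "content": "Hi"}, {"role": "tool", "content": "raw output"}],
    before_hooks=[
        ("ToolRemover", drop_tool_messages),
        ("ContextEngineer.before_agent", inject_context),
    ],
    agent=fake_agent,
    after_hooks=[
        ("ContextEngineer.after_agent", lambda m: m),  # fact extraction omitted
        ("ChatSaver", persist),
    ],
)
print(trace)
```

The printed trace lists the stages in the same order as the diagram, which makes it easy to see that the tool message never reaches the agent or the saver.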

Fact Lifecycle

Conversation → Extraction → Deduplication → Embedding → Storage
                                ↓                          ↓
                          Query & Retrieve         Relevance Scoring
                                ↓                (recency + access + usage)
                          Context Injection              ↓
                          (adaptive detail)      Combined Score
                                ↓                (70% similarity
                               Agent              + 30% relevance)

Phase 3 Relevance Scoring:

  • Recency (40%): Newer facts score higher (exponential decay over 365 days)
  • Access Frequency (30%): Facts used more often get boosted
  • Usage Feedback (30%): Facts appearing in agent responses are prioritized
  • Adaptive Formatting: High-relevance facts (≥0.8) get full detail, medium (0.5-0.8) compact, low (0.3-0.5) minimal
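
The weights above can be turned into a small worked example. This is a sketch under stated assumptions rather than LangMiddle's implementation: the decay form `exp(-age / 365)`, the normalization of access and usage to the range 0 to 1, and every function name here are hypothetical.

```python
import math


def recency_score(age_days, horizon_days=365.0):
    """Assumed decay: 1.0 for a brand-new fact, about 0.37 at the horizon."""
    return math.exp(-max(age_days, 0.0) / horizon_days)


def relevance(recency, access, usage):
    """Weights from the list above: 40% recency, 30% access, 30% usage."""
    return 0.4 * recency + 0.3 * access + 0.3 * usage


def combined(similarity, relevance_score, similarity_weight=0.7, relevance_weight=0.3):
    """Blend vector similarity with relevance (70% / 30% by default)."""
    return similarity_weight * similarity + relevance_weight * relevance_score


def detail_level(relevance_score):
    """Map a relevance score to the formatting tiers listed above."""
    if relevance_score >= 0.8:
        return "full"
    if relevance_score >= 0.5:
        return "compact"
    if relevance_score >= 0.3:
        return "minimal"
    return "omitted"  # below the injection threshold


# A 30-day-old fact that is accessed often but rarely echoed in responses.
rel = relevance(recency_score(30), access=0.9, usage=0.2)
print(round(rel, 3), detail_level(rel), round(combined(0.85, rel), 3))
```

Because similarity carries 70% of the combined score, an old fact that matches the query closely can still outrank a fresh but loosely related one.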

🛠️ Available Middleware

ChatSaver - Persist Conversations

Automatically save chat histories to your database of choice.

Features:

  • ✅ Multi-backend: SQLite, PostgreSQL, Supabase, Firebase
  • ✅ Automatic deduplication (skips already-saved messages)
  • ✅ Save interval control (every N turns)
  • ✅ Custom state persistence

Example:

from langmiddle.history import ChatSaver
from langmiddle import StorageConfig

# Option 1: Simple string setup
ChatSaver(
    backend="sqlite",
    db_path="./chat.db",
    save_interval=1
)

# Option 2: Shared config object (Recommended)
config = StorageConfig(backend="sqlite", db_path="./chat.db")
ChatSaver(backend=config)
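
To make deduplication and the save interval concrete, here is a minimal sketch built on the standard-library `sqlite3` module. `TinySaver`, its table layout, and the `id` field on each message are assumptions made for illustration; LangMiddle's real schema and logic may differ.

```python
import sqlite3


class TinySaver:
    """Toy stand-in for a chat saver; not LangMiddle's ChatSaver."""

    def __init__(self, db_path=":memory:", save_interval=1):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "thread_id TEXT, message_id TEXT, role TEXT, content TEXT, "
            "PRIMARY KEY (thread_id, message_id))"
        )
        self.save_interval = save_interval
        self.turns = 0

    def _count(self):
        return self.conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0]

    def after_turn(self, thread_id, messages):
        """Save on every Nth turn; return how many rows were newly inserted."""
        self.turns += 1
        if self.turns % self.save_interval != 0:
            return 0
        before = self._count()
        # The primary key makes re-inserting an already-saved message a no-op.
        self.conn.executemany(
            "INSERT OR IGNORE INTO messages VALUES (?, ?, ?, ?)",
            [(thread_id, m["id"], m["role"], m["content"]) for m in messages],
        )
        self.conn.commit()
        return self._count() - before


saver = TinySaver(save_interval=2)
history = [{"id": "m1", "role": "user", "content": "Hello!"}]
r1 = saver.after_turn("thread-1", history)  # turn 1: skipped
history.append({"id": "m2", "role": "assistant", "content": "Hi there."})
r2 = saver.after_turn("thread-1", history)  # turn 2: both messages saved
history.append({"id": "m3", "role": "user", "content": "Bye."})
r3 = saver.after_turn("thread-1", history)  # turn 3: skipped
r4 = saver.after_turn("thread-1", history)  # turn 4: only m3 is new
print(r1, r2, r3, r4)
```

Passing the full history on every turn is safe in this sketch because the composite primary key rejects rows that were already written.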

Supported Backends:

| Backend | Use Case | Auth Required |
| --- | --- | --- |
| SQLite | Development, local apps | ❌ No |
| PostgreSQL | Self-hosted production | ❌ No |
| Supabase | Managed PostgreSQL + RLS | ✅ JWT |
| Firebase | Mobile, real-time apps | ✅ ID token |

ToolRemover - Clean Tool Messages

Remove tool-related clutter from conversation history.

Why? Tool call messages and responses bloat chat history and aren't relevant for long-term storage.

Example:

from langmiddle.history import ChatSaver, ToolRemover

middleware = [
    ToolRemover(when="both"),  # Filter before AND after agent
    ChatSaver()  # Clean messages are saved
]

Options:

  • when="before" - Filter before agent sees messages
  • when="after" - Filter before saving to storage
  • when="both" - Filter in both directions (recommended)
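
What "tool noise" means can be illustrated on plain dictionaries. The message shape (`role`, `content`, `tool_calls`) and the rule for dropping assistant turns that only request a tool are assumptions for this sketch; `strip_tool_noise` is a hypothetical helper, and the filtering rules in ToolRemover itself may differ.

```python
def strip_tool_noise(messages):
    """Drop tool outputs and assistant turns that only request a tool call."""
    cleaned = []
    for m in messages:
        if m.get("role") == "tool":
            continue  # raw tool output
        if m.get("role") == "assistant" and m.get("tool_calls") and not m.get("content"):
            continue  # assistant turn with no text, only a tool request
        cleaned.append(m)
    return cleaned


conversation = [
    {"role": "user", "content": "Search for LangGraph tutorials"},
    {"role": "assistant", "content": "", "tool_calls": [{"name": "search_tool"}]},
    {"role": "tool", "content": "Search results for: LangGraph tutorials"},
    {"role": "assistant", "content": "Here are some tutorials."},
]
for m in strip_tool_noise(conversation):
    print(m["role"], "|", m["content"])
```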

ContextEngineer - Intelligent Memory & Context

The brain of your agent: automatic fact extraction, semantic search, and context injection.

🧠 What It Does

  1. Extracts Facts: Identifies user preferences, goals, and key information
  2. Stores Semantically: Embeds facts for similarity search
  3. Retrieves Contextually: Injects relevant memories based on user queries
  4. Auto-Summarizes: Compresses old conversations to save tokens

🔥 Key Features

| Feature | Description |
| --- | --- |
| Semantic Fact Storage | Vector-based storage with deduplication |
| Smart Extraction | Filters out ephemeral states (e.g., "user understood") |
| Namespace Organization | Hierarchical fact categories (["user", "preferences", "food"]) |
| Automatic Summarization | Configurable token thresholds |
| Atomic Query Breaking | Splits complex queries for better retrieval |
| Relevance Scoring | Dynamic scoring based on recency, access patterns, and usage feedback |
| Adaptive Formatting | Context detail adjusts based on fact relevance |
| Caching | Embeddings and core facts cached for performance |
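
Vector-based storage with deduplication can be sketched with cosine similarity over toy vectors. The three-dimensional vectors, the 0.9 threshold, and the helper names are hypothetical; a real setup would use embeddings from the configured model and whatever threshold the library applies.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def add_fact(store, content, vector, threshold=0.9):
    """Store the fact unless a near-duplicate exists; return True if stored."""
    for fact in store:
        if cosine(fact["vector"], vector) >= threshold:
            return False
    store.append({"content": content, "vector": vector})
    return True


def search(store, query_vector, k=1):
    """Return the contents of the k facts most similar to the query."""
    ranked = sorted(store, key=lambda f: cosine(f["vector"], query_vector), reverse=True)
    return [f["content"] for f in ranked[:k]]


store = []
a = add_fact(store, "User loves spicy Thai food", [1.0, 0.0, 0.1])
b = add_fact(store, "User enjoys spicy Thai dishes", [0.98, 0.05, 0.1])  # near-duplicate
c = add_fact(store, "User's name is Alex", [0.0, 1.0, 0.0])
print(a, b, c, search(store, [0.9, 0.0, 0.2]))
```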

📝 Example Usage

from langchain.agents import create_agent
from langmiddle import StorageConfig
from langmiddle.context import ContextEngineer
from langmiddle.history import StorageContext

# 1. Define configuration
config = StorageConfig(
    backend="supabase",
    auto_create_tables=True,
    enable_facts=True
)

# 2. Initialize middleware
agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[
        ContextEngineer(
            model="openai:gpt-4o",
            embedder="openai:text-embedding-3-small",
            backend=config,  # Pass config object
            max_tokens_before_summarization=5000,
            extraction_interval=3
        )
    ],
)

# 3. Use it!
response = agent.invoke(
    input={"messages": [{"role": "user", "content": "I love spicy Thai food"}]},
    context=StorageContext(
        thread_id="conversation-123",
        user_id="user-456",
        auth_token="your-jwt-token"
    )
)

# Later in the conversation...
response = agent.invoke(
    input={"messages": [{"role": "user", "content": "Recommend a restaurant"}]},
    context=StorageContext(
        thread_id="conversation-123",
        user_id="user-456",
        auth_token="your-jwt-token"
    )
)
# Agent remembers: "user loves spicy Thai food" and uses it for recommendations!

⚙️ Configuration Options

ContextEngineer(
    model="openai:gpt-4o",                    # LLM for extraction & summarization
    embedder="openai:text-embedding-3-small", # Embedding model for semantic search
    backend="supabase",                       # Storage backend

    # Extraction settings
    extraction_interval=3,                    # Extract facts every N turns
    max_tokens_before_extraction=None,        # Or trigger by token count

    # Summarization settings
    max_tokens_before_summarization=5000,     # Auto-summarize at 5k tokens

    # Context injection
    core_namespaces=[                         # Always-loaded fact categories
        ["user", "personal_info"],
        ["user", "preferences"]
    ],

    # Relevance scoring (Phase 3)
    relevance_threshold=0.3,                  # Minimum relevance score to inject
    similarity_weight=0.7,                    # Weight for vector similarity
    relevance_weight=0.3,                     # Weight for relevance score
    enable_adaptive_formatting=True,          # Adjust detail based on relevance

    # Backend configuration
    backend_kwargs={'enable_facts': True}
)
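
One plausible reading of the extraction and summarization settings is sketched below. How the library actually combines the turn interval with the token trigger, and whether the comparisons are inclusive, is not stated above, so treat `should_extract` and `should_summarize` as hypothetical helpers.

```python
def should_extract(turns_since_last, tokens_since_last,
                   extraction_interval=3, max_tokens_before_extraction=None):
    """Assumed rule: extract when either the turn or the token trigger fires."""
    if extraction_interval and turns_since_last >= extraction_interval:
        return True
    if (max_tokens_before_extraction is not None
            and tokens_since_last >= max_tokens_before_extraction):
        return True
    return False


def should_summarize(total_tokens, max_tokens_before_summarization=5000):
    """Assumed rule: summarize once the conversation reaches the token limit."""
    return total_tokens >= max_tokens_before_summarization


print(should_extract(2, 400))                                     # too early
print(should_extract(3, 400))                                     # turn trigger
print(should_extract(1, 2500, max_tokens_before_extraction=2000)) # token trigger
print(should_summarize(5200))
```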

📊 What Gets Stored

Facts Examples:

[
  {
    "content": "User prefers concise and formal answers",
    "namespace": ["user", "preferences", "communication"],
    "intensity": 1.0,
    "confidence": 0.97,
    "language": "en"
  },
  {
    "content": "User's name is Alex",
    "namespace": ["user", "personal_info"],
    "intensity": 0.9,
    "confidence": 0.98,
    "language": "en"
  }
]
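
Because each namespace is a list, selecting "core" facts can be modelled as prefix matching. This sketch assumes prefix semantics for `core_namespaces`, which the text above does not confirm; the third fact and both helper names are made up for the example.

```python
def in_namespace(fact_namespace, prefix):
    """True when the fact's namespace starts with the given prefix."""
    return fact_namespace[:len(prefix)] == prefix


def core_facts(facts, core_namespaces):
    """Keep facts that fall under any of the core namespace prefixes."""
    return [
        f for f in facts
        if any(in_namespace(f["namespace"], p) for p in core_namespaces)
    ]


facts = [
    {"content": "User prefers concise and formal answers",
     "namespace": ["user", "preferences", "communication"]},
    {"content": "User's name is Alex",
     "namespace": ["user", "personal_info"]},
    {"content": "User wants to learn Rust",  # hypothetical extra fact
     "namespace": ["user", "goals"]},
]
core = core_facts(facts, [["user", "personal_info"], ["user", "preferences"]])
for f in core:
    print(f["content"])
```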

What's NOT stored (filtered out by design):

  • ❌ Transient states: "User understands X", "User is satisfied"
  • ❌ Single-use requests: "User wants a code example"
  • ❌ Politeness markers: "User says thank you"
  • ❌ Momentary emotions: "User feels frustrated right now"

💾 Storage Backends

Comparison Guide

| Backend | Best For | Setup Complexity | Scalability | Vector Search | Auth | Cost |
| --- | --- | --- | --- | --- | --- | --- |
| SQLite | Local development, demos, single-user apps | ⭐ Trivial | 🔵 Single machine | ✅ (sqlite-vec) | None | Free |
| PostgreSQL | Self-hosted production, custom infrastructure, full control | ⭐⭐ Medium | 🔵🔵🔵 High (with replication) | ✅ (pgvector) | Custom | Infrastructure cost |
| Supabase | Web apps, multi-tenant SaaS, real-time features | ⭐⭐ Easy | 🔵🔵🔵 High (managed) | ✅ (pgvector) | JWT + RLS | Free tier + usage |
| Firebase | Mobile apps, Google Cloud ecosystem, real-time sync | ⭐⭐ Easy | 🔵🔵🔵 Global (managed) | ❌ | Firebase Auth | Free tier + usage |

🗂️ Backend Configuration

SQLite - Zero-config local storage with vector search

from langmiddle.history import ChatSaver
from langmiddle.storage import ChatStorage

# Basic chat storage (file-based)
ChatSaver(backend="sqlite", db_path="./chat.db")

# With semantic memory (requires sqlite-vec)
# Install: pip install langmiddle[sqlite]
store = ChatStorage.create(
    "sqlite",
    db_path="./chat.db",
    auto_create_tables=True,
    enable_facts=True  # Enables vector similarity search
)

# In-memory (testing/dev)
ChatSaver(backend="sqlite", db_path=":memory:")

Features:

  • ✅ Zero configuration required
  • ✅ Vector similarity search via sqlite-vec
  • ✅ Full Phase 3 relevance scoring
  • ✅ Perfect for local development and demos

No environment variables needed!

PostgreSQL - Self-hosted database

Environment variables (.env):

POSTGRES_CONNECTION_STRING=postgresql://user:password@localhost:5432/dbname

Python code:

from langmiddle.history import ChatSaver
from langmiddle.storage import ChatStorage

# First-time setup: create tables
store = ChatStorage.create(
    "postgres",
    connection_string="postgresql://user:pass@localhost:5432/db",
    auto_create_tables=True
)

# In middleware
ChatSaver(backend="postgres")

📖 PostgreSQL Setup Guide

Supabase - Managed PostgreSQL with RLS

Environment variables (.env):

SUPABASE_URL=https://your-project.supabase.co
SUPABASE_ANON_KEY=your-anon-key

# For table creation (one-time):
SUPABASE_CONNECTION_STRING=postgresql://postgres:[password]@db.[project].supabase.co:5432/postgres

Python code:

from langmiddle.context import ContextEngineer
from langmiddle.storage import ChatStorage

# First-time setup: create tables
store = ChatStorage.create(
    "supabase",
    auto_create_tables=True,
    enable_facts=True  # Enable semantic memory tables
)

# In middleware
ContextEngineer(
    model="openai:gpt-4o",
    embedder="openai:text-embedding-3-small",
    backend="supabase",
    backend_kwargs={'enable_facts': True}
)

Context requirements:

context=StorageContext(
    thread_id="conversation-123",
    user_id="user-456",
    auth_token="jwt-token-from-supabase-auth"  # Required for RLS
)

Firebase - Real-time NoSQL database

Service Account Setup:

  1. Download service account JSON from Firebase Console
  2. Set environment variable OR pass path directly

Option 1: Environment variable

export GOOGLE_APPLICATION_CREDENTIALS="/path/to/firebase-creds.json"

Option 2: Direct path

ChatSaver(
    backend="firebase",
    credentials_path="./firebase-creds.json"
)

Context requirements:

context=StorageContext(
    thread_id="conversation-123",
    user_id="user-456",
    auth_token="firebase-id-token"  # From Firebase Auth
)
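
A small preflight check can catch missing settings before the agent starts. This sketch treats the environment variables listed in the sections above as required, which is an assumption: the library may also accept the same values as arguments, and Firebase credentials can come from either `GOOGLE_APPLICATION_CREDENTIALS` or `credentials_path`. `REQUIRED_ENV` and `missing_env` are hypothetical names.

```python
import os

# Variables named in the backend sections above; treated as required here.
REQUIRED_ENV = {
    "sqlite": [],
    "postgres": ["POSTGRES_CONNECTION_STRING"],
    "supabase": ["SUPABASE_URL", "SUPABASE_ANON_KEY"],
    "firebase": [],  # credentials may be passed via env var or credentials_path
}


def missing_env(backend, environ=None):
    """Return the names of unset or empty variables for the chosen backend."""
    env = os.environ if environ is None else environ
    if backend not in REQUIRED_ENV:
        raise ValueError(f"unknown backend: {backend!r}")
    return [name for name in REQUIRED_ENV[backend] if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
print(missing_env("supabase", {"SUPABASE_URL": "https://your-project.supabase.co"}))
```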

📚 Examples

🎓 Complete Examples

1. Simple Chat Bot with Persistence

from langchain.agents import create_agent
from langmiddle.history import ChatSaver, StorageContext

agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[ChatSaver(backend="sqlite", db_path="./chatbot.db")]
)

# Conversation 1
agent.invoke(
    {"messages": [{"role": "user", "content": "My name is Alice"}]},
    context=StorageContext(thread_id="thread-1", user_id="alice")
)

# Conversation 2 (same thread - history preserved!)
agent.invoke(
    {"messages": [{"role": "user", "content": "What's my name?"}]},
    context=StorageContext(thread_id="thread-1", user_id="alice")
)
# Response: "Your name is Alice"

2. Agent with Tools (Clean History)

from langchain.agents import create_agent
from langmiddle.history import ChatSaver, ToolRemover, StorageContext

def search_tool(query: str) -> str:
    """Return placeholder search results for the query."""
    return f"Search results for: {query}"

agent = create_agent(
    model="openai:gpt-4o",
    tools=[search_tool],
    context_schema=StorageContext,
    middleware=[
        ToolRemover(when="both"),  # Remove tool clutter
        ChatSaver(backend="sqlite", db_path="./agent.db")
    ]
)

agent.invoke(
    {"messages": [{"role": "user", "content": "Search for LangGraph tutorials"}]},
    context=StorageContext(thread_id="thread-1", user_id="user-1")
)
# Only user/assistant messages saved, no tool call noise!

3. Production Agent with Memory (Supabase)

from langchain.agents import create_agent
from langmiddle import StorageConfig
from langmiddle.context import ContextEngineer
from langmiddle.history import StorageContext
from langmiddle.storage import ChatStorage
import os

# 1. Define shared config
config = StorageConfig(
    backend="supabase",
    enable_facts=True,
    auto_create_tables=True
)

# 2. One-time setup (optional, if not using auto_create_tables in config)
if os.getenv("INIT_TABLES"):
    store = ChatStorage.create(**config.to_kwargs())
    print("✅ Tables created!")
    raise SystemExit(0)  # stop here after the one-time setup

# 3. Create agent
agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[
        ContextEngineer(
            model="openai:gpt-4o",
            embedder="openai:text-embedding-3-small",
            backend=config,
            max_tokens_before_summarization=5000,
            extraction_interval=3
        )
    ]
)

# 4. Use in your app
def chat(user_id: str, thread_id: str, message: str, jwt_token: str):
    response = agent.invoke(
        {"messages": [{"role": "user", "content": message}]},
        context=StorageContext(
            thread_id=thread_id,
            user_id=user_id,
            auth_token=jwt_token
        )
    )
    return response["messages"][-1].content  # messages are LangChain message objects, not dicts

# Example usage
chat(
    user_id="user-123",
    thread_id="thread-456",
    message="I prefer vegetarian food and hate spicy dishes",
    jwt_token="eyJ..."
)
# Facts extracted: ["User prefers vegetarian food", "User dislikes spicy dishes"]

chat(
    user_id="user-123",
    thread_id="thread-789",
    message="Recommend a restaurant",
    jwt_token="eyJ..."
)
# Agent uses stored preferences to recommend vegetarian, non-spicy options!

4. Custom Configuration

from langchain.agents import create_agent
from langmiddle.history import StorageContext
from langmiddle.context import (
    ContextEngineer,
    ExtractionConfig,
    SummarizationConfig,
    ContextConfig
)

agent = create_agent(
    model="openai:gpt-4o",
    tools=[],
    context_schema=StorageContext,
    middleware=[
        ContextEngineer(
            model="openai:gpt-4o",
            embedder="openai:text-embedding-3-small",
            backend="supabase",

            # Custom extraction settings
            extraction_config=ExtractionConfig(
                interval=5,              # Extract every 5 turns
                max_tokens=2000,         # Or when 2k tokens accumulated
                prompt="<custom prompt>"  # Override extraction prompt
            ),

            # Custom summarization settings
            summarization_config=SummarizationConfig(
                max_tokens=8000,         # Summarize at 8k tokens
                keep_ratio=0.3,          # Keep last 30% of messages
                prefix="## Summary:\n"  # Custom prefix
            ),

            # Custom context injection
            context_config=ContextConfig(
                core_namespaces=[        # Custom always-loaded categories
                    ["user", "profile"],
                    ["user", "preferences"],
                    ["project", "settings"]
                ]
            ),

            backend_kwargs={'enable_facts': True}
        )
    ]
)
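
The `keep_ratio` and `prefix` settings suggest a split between older messages that get summarized and recent ones kept verbatim. The sketch below shows one way that could work; the rounding rule, the system-role summary message, and the `summarize` callback are assumptions, not the library's documented behavior.

```python
def split_for_summary(messages, keep_ratio=0.3):
    """Return (to_summarize, to_keep); the most recent share stays verbatim."""
    if not messages:
        return [], []
    keep = min(len(messages), max(1, round(len(messages) * keep_ratio)))
    cut = len(messages) - keep
    return messages[:cut], messages[cut:]


def compress(messages, summarize, keep_ratio=0.3, prefix="## Summary:\n"):
    """Replace the older messages with a single prefixed summary message."""
    old, recent = split_for_summary(messages, keep_ratio)
    if not old:
        return list(messages)
    summary = {"role": "system", "content": prefix + summarize(old)}
    return [summary] + recent


history = [{"role": "user", "content": f"message {i}"} for i in range(10)]
# A real summarizer would call an LLM; this stub only counts messages.
compact = compress(history, summarize=lambda msgs: f"{len(msgs)} earlier messages")
print(len(compact), repr(compact[0]["content"]))
```

With ten messages and `keep_ratio=0.3`, the first seven are folded into the summary and the last three are kept as they were.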

🎨 Architecture Highlights

  • 🔌 Modular Design: Mix and match middleware components
  • 🎯 Single Responsibility: Each middleware does one thing well
  • ⚡ Performance: Embedding caching, batch operations, efficient queries
  • 🛡️ Type Safety: Full Pydantic validation and type hints
  • 📊 Observable: Structured logging with operation IDs and metrics
  • 🧪 Testable: Clean abstractions, dependency injection
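
Embedding caching, mentioned under Performance, can be illustrated with a dictionary keyed by a hash of the text. `CachedEmbedder`, the whitespace and case normalization, and the stub embedding function are all hypothetical; LangMiddle's cache may be keyed and bounded differently.

```python
import hashlib


class CachedEmbedder:
    """Wrap an embedding function so repeated texts are embedded only once."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._cache = {}
        self.calls = 0  # number of real embedding calls made

    def embed(self, text):
        # Assumed normalization: ignore surrounding whitespace and letter case.
        key = hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.calls += 1
            self._cache[key] = self._embed_fn(text)
        return self._cache[key]


# Stub embedding: a real one would call the configured embedding model.
embedder = CachedEmbedder(lambda text: [float(len(text))])
v1 = embedder.embed("I love spicy Thai food")
v2 = embedder.embed("  i love spicy thai food ")  # cache hit after normalization
v3 = embedder.embed("Recommend a restaurant")
print(embedder.calls)
```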

🀝 Contributing

We welcome contributions! Here's how you can help:

  • 🐛 Report bugs via GitHub Issues
  • 💡 Request features or improvements
  • 🔧 Submit PRs for bug fixes or new features
  • 📖 Improve docs with examples or clarifications
  • ⭐ Star the repo if LangMiddle helped you!

Development Setup

git clone https://github.com/alpha-xone/langmiddle.git
cd langmiddle
pip install -e ".[dev]"
pytest

📄 License

Apache License 2.0. See LICENSE for details.


🌟 Show Your Support

If LangMiddle saves you time or helps your project, please:

  • ⭐ Star the repo on GitHub
  • 🐦 Share on Twitter/X
  • 💬 Tell others in the LangChain community

Built with ❤️ for the LangGraph ecosystem
