
Commit 25e3149

Merge pull request #1 from redis-developer/filters
Expand filters available for long-term search in the MCP and API servers
2 parents: 1906a7e + 3f2753f

19 files changed (+1159 -462 lines)

.github/workflows/python-tests.yml

Lines changed: 2 additions & 1 deletion
````diff
@@ -34,7 +34,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: [3.12, 3.13]
+        python-version: [3.12] # Not testing with 3.13 at the moment
         redis-version: ['6.2.6-v9', 'latest'] # 8.0-M03 is not working atm

     steps:
@@ -57,6 +57,7 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
+          pip install uv
           uv sync --all-extras

       - name: Run tests
````

README.md

Lines changed: 115 additions & 9 deletions
````diff
@@ -7,27 +7,35 @@ A Redis-powered memory server built for AI agents and applications. It manages b
 - **Short-Term Memory**
   - Storage for messages, token count, context, and metadata for a session
   - Automatically and recursively summarizes conversations
-  - Token limit management based on specific model capabilities
+  - Client model-aware token limit management (adapts to the context window of the client's LLM)
+  - Supports all major OpenAI and Anthropic models

 - **Long-Term Memory**
   - Storage for long-term memories across sessions
-  - Semantic search to retrieve memories, with filters such as topic, entity, etc.
+  - Semantic search to retrieve memories with an advanced filtering system
+  - Filter by session, namespace, topics, entities, timestamps, and more
+  - Supports both exact match and semantic similarity search
   - Automatic topic modeling for stored memories with BERTopic
   - Automatic Entity Recognition using BERT

 - **Other Features**
-  - Support for OpenAI and Anthropic model providers
   - Namespace support for session and long-term memory isolation
   - Both a REST interface and MCP server

 ## System Diagram
 ![System Diagram](diagram.png)

-## Roadmap
-- Long-term memory deduplication
+## Project Status and Roadmap
+
+### Project Status: In Development, Pre-Release
+
+This project is under active development and is **pre-release** software. Think of it as an early beta!
+
+### Roadmap
+
+- Long-term memory deduplication and compaction
 - Configurable strategy for moving session memory to long-term memory
-- Auth hooks
+- Authentication/authorization hooks
 - Use a background task system instead of `BackgroundTask`
+- Separate Redis connections for long-term and short-term memory

 ## REST API Endpoints
````
````diff
@@ -50,6 +58,11 @@ The following endpoints are available:
 - **GET /sessions/{session_id}/memory**
   Retrieves conversation memory for a session, including messages and
   summarized older messages.
+  _Query Parameters:_
+  - `namespace` (string, optional): The namespace to use for the session
+  - `window_size` (int, optional): Number of messages to include in the response (default from config)
+  - `model_name` (string, optional): The client's LLM model name, used to determine the appropriate context window size
+  - `context_window_max` (int, optional): Direct specification of max context window tokens (overrides `model_name`)

 - **POST /sessions/{session_id}/memory**
   Adds messages (and optional context) to a session's memory.
````
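
For illustration, here is a minimal Python sketch of a request that uses the new query parameters; the base URL, port, and session ID are placeholders for this example, not values defined by the project:

```python
import requests  # third-party HTTP client, used here for illustration

BASE_URL = "http://localhost:8000"  # hypothetical; match your deployment

# Request up to 12 messages; if the model's context window is smaller,
# the server clamps the window accordingly.
resp = requests.get(
    f"{BASE_URL}/sessions/session-123/memory",
    params={
        "namespace": "default",
        "window_size": 12,
        "model_name": "gpt-4o-mini",
    },
)
resp.raise_for_status()
print(resp.json())
```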
````diff
@@ -75,6 +88,40 @@ The following endpoints are available:
   }
   ```

+- **POST /long-term-memory/search**
+  Performs semantic search on long-term memories with advanced filtering options.
+  _Request Body Example:_
+  ```json
+  {
+    "text": "Search query text",
+    "limit": 10,
+    "offset": 0,
+    "session_id": {"eq": "session-123"},
+    "namespace": {"eq": "default"},
+    "topics": {"any": ["AI", "Machine Learning"]},
+    "entities": {"all": ["OpenAI", "Claude"]},
+    "created_at": {"gte": 1672527600, "lte": 1704063599},
+    "last_accessed": {"gt": 1704063600},
+    "user_id": {"eq": "user-456"}
+  }
+  ```
+
+  _Filter options:_
+  - Tag filters (`session_id`, `namespace`, `topics`, `entities`, `user_id`):
+    - `eq`: Equals this value
+    - `ne`: Does not equal this value
+    - `any`: Contains any of these values
+    - `all`: Contains all of these values
+  - Numeric filters (`created_at`, `last_accessed`):
+    - `gt`: Greater than
+    - `lt`: Less than
+    - `gte`: Greater than or equal
+    - `lte`: Less than or equal
+    - `eq`: Equals
+    - `ne`: Does not equal
+    - `between`: Between two values

 ## MCP Server Interface
 Agent Memory Server offers an MCP (Model Context Protocol) server interface powered by FastMCP, providing tool-based long-term memory management:
````
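
As a usage sketch for the search endpoint above: the base URL is a placeholder, and the JSON shape shown for `between` is an assumption inferred from the filter list, not confirmed by this commit:

```python
import requests  # third-party HTTP client, used here for illustration

BASE_URL = "http://localhost:8000"  # hypothetical; match your deployment

payload = {
    "text": "preferences about programming languages",
    "limit": 5,
    "topics": {"any": ["Python", "Rust"]},
    # Assumed shape for `between`: a two-element [start, end] range
    # of Unix timestamps.
    "created_at": {"between": [1672527600, 1704063599]},
}

resp = requests.post(f"{BASE_URL}/long-term-memory/search", json=payload)
resp.raise_for_status()
print(resp.json())
```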

````diff
@@ -86,15 +133,29 @@ Agent Memory Server offers an MCP (Model Context Protocol) server interface powered by FastMCP, providing tool-based long-term memory management:
 ### Local Install

-1. Install the package and required dependencies:
+First, you'll need to download this repository. After you've downloaded it, you can install and run the servers.
+
+1. Install the package and required dependencies with pip, ideally into a virtual environment:
 ```bash
 pip install -e .
 ```

-2. Start both the REST API server and MCP server:
+**NOTE:** This project uses `uv` for dependency management, so if you have uv installed, you can run `uv sync` instead of `pip install ...` to install the project's dependencies.
+
+2 (a). The easiest way to start the REST API server and MCP server in SSE mode is to use Docker Compose. See the Docker Compose section of this file for more details.
+
+2 (b). You can also run the REST API and MCP servers directly:
+
+#### REST API
 ```bash
 python -m agent_memory_server.main
 ```
+
+#### MCP Server
+The MCP server can run in either SSE or stdio mode:
+```bash
+python -m agent_memory_server.mcp <sse|stdio>
+```
+
+**NOTE:** With uv, just prefix the command with `uv run`, e.g.: `uv run python -m agent_memory_server.mcp sse`.

 ### Docker Compose
````

````diff
@@ -114,6 +175,51 @@ To start the API using Docker Compose, follow these steps:
 6. To stop the containers, press Ctrl+C in the terminal and then run:
    docker-compose down

+## Using the MCP Server with Claude Desktop, Cursor, etc.
+You can use the MCP server that comes with this project in any application or SDK that supports MCP tools.
+
+### Claude
+<img src="claude.png">
+
+For example, with Claude, use the following configuration:
+```json
+{
+  "mcpServers": {
+    "redis-memory-server": {
+      "command": "uv",
+      "args": [
+        "--directory",
+        "/ABSOLUTE/PATH/TO/REPO/DIRECTORY/agent-memory-server",
+        "run",
+        "python",
+        "-m",
+        "agent_memory_server.mcp",
+        "stdio"
+      ]
+    }
+  }
+}
+```
+**NOTE:** On a Mac, this configuration requires that you install uv with `brew install uv`. Any method that makes the `uv` command globally accessible, so Claude can find it, should work.
+
+### Cursor
+
+<img src="cursor.png">
+
+Cursor's MCP config is similar to Claude's, but it also supports SSE servers, so you can run the server yourself and pass in the URL:
+
+```json
+{
+  "mcpServers": {
+    "redis-memory-server": {
+      "url": "http://localhost:9000/sse"
+    }
+  }
+}
+```

 ## Configuration

 You can configure the service using environment variables:
````
````diff
@@ -164,8 +270,8 @@ python -m pytest
 ```

 ## Known Issues
-- The MCP server from the Python MCP SDK often refuses to shut down with Control-C if it's connected to a client
 - All background tasks run as async coroutines in the same process as the REST API server, using Starlette's `BackgroundTask`
+- ~~The MCP server from the Python MCP SDK often refuses to shut down with Control-C if it's connected to a client~~

 ### Contributing
 1. Fork the repository
````

agent_memory_server/api.py

Lines changed: 57 additions & 8 deletions
````diff
@@ -1,7 +1,10 @@
+from typing import Literal
+
 from fastapi import APIRouter, BackgroundTasks, Depends, HTTPException

 from agent_memory_server import long_term_memory, messages
 from agent_memory_server.config import settings
+from agent_memory_server.llms import get_model_config
 from agent_memory_server.logging import get_logger
 from agent_memory_server.models import (
     AckResponse,
````
````diff
@@ -18,6 +21,32 @@
 logger = get_logger(__name__)

+ModelNameLiteral = Literal[
+    "gpt-3.5-turbo",
+    "gpt-3.5-turbo-16k",
+    "gpt-4",
+    "gpt-4-32k",
+    "gpt-4o",
+    "gpt-4o-mini",
+    "o1",
+    "o1-mini",
+    "o3-mini",
+    "text-embedding-ada-002",
+    "text-embedding-3-small",
+    "text-embedding-3-large",
+    "claude-3-opus-20240229",
+    "claude-3-sonnet-20240229",
+    "claude-3-haiku-20240307",
+    "claude-3-5-sonnet-20240620",
+    "claude-3-7-sonnet-20250219",
+    "claude-3-5-sonnet-20241022",
+    "claude-3-5-haiku-20241022",
+    "claude-3-7-sonnet-latest",
+    "claude-3-5-sonnet-latest",
+    "claude-3-5-haiku-latest",
+    "claude-3-opus-latest",
+]
+
 router = APIRouter()
````
````diff
@@ -54,6 +83,8 @@ async def get_session_memory(
     session_id: str,
     namespace: str | None = None,
     window_size: int = settings.window_size,
+    model_name: ModelNameLiteral | None = None,
+    context_window_max: int | None = None,
 ):
     """
     Get memory for a session.
````
````diff
@@ -62,18 +93,31 @@ async def get_session_memory(
     Args:
         session_id: The session ID
-        window_size: The number of messages to include in the response
         namespace: The namespace to use for the session
+        window_size: The number of messages to include in the response
+        model_name: The client's LLM model name (will determine context window size if provided)
+        context_window_max: Direct specification of the context window max tokens (overrides model_name)

     Returns:
         Conversation history and context
     """
     redis = get_redis_conn()

+    # If context_window_max is explicitly provided, use that
+    if context_window_max is not None:
+        effective_window_size = min(window_size, context_window_max)
+    # If model_name is provided, get its max_tokens from our config
+    elif model_name is not None:
+        model_config = get_model_config(model_name)
+        effective_window_size = min(window_size, model_config.max_tokens)
+    # Otherwise use the default window_size
+    else:
+        effective_window_size = window_size
+
     session = await messages.get_session_memory(
         redis=redis,
         session_id=session_id,
-        window_size=window_size,
+        window_size=effective_window_size,
         namespace=namespace,
     )
     if not session:
````
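
To make the precedence above concrete, here is a small standalone sketch (an illustration, not code from this commit): an explicit `context_window_max` wins over `model_name`'s limit, which wins over the plain default:

```python
def effective_window(
    window_size: int,
    model_max_tokens: int | None = None,
    context_window_max: int | None = None,
) -> int:
    """Clamp the requested window to the tightest applicable limit."""
    if context_window_max is not None:
        return min(window_size, context_window_max)
    if model_max_tokens is not None:
        return min(window_size, model_max_tokens)
    return window_size


assert effective_window(50, context_window_max=20) == 20  # explicit cap wins
assert effective_window(50, model_max_tokens=128) == 50   # cap exceeds request
assert effective_window(50) == 50                         # default passthrough
```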
````diff
@@ -162,13 +206,10 @@
 @router.post("/long-term-memory/search", response_model=LongTermMemoryResultsResponse)
 async def search_long_term_memory(payload: SearchPayload):
     """
-    Run a semantic search on long-term memory
-
-    TODO: Infer topics, entities for `text` and attempt to use them
-    as boosts or filters in the search.
+    Run a semantic search on long-term memory with filtering options.

     Args:
-        payload: Search payload
+        payload: Search payload with filter objects for precise queries

     Returns:
         List of search results
````
````diff
@@ -178,7 +219,15 @@ async def search_long_term_memory(payload: SearchPayload):
     if not settings.long_term_memory:
         raise HTTPException(status_code=400, detail="Long-term memory is disabled")

+    # Extract filter objects from the payload
+    filters = payload.get_filters()
+
+    # Pass text, redis, and filter objects to the search function
     return await long_term_memory.search_long_term_memories(
         redis=redis,
-        **payload.model_dump(exclude_none=True),
+        text=payload.text,
+        distance_threshold=payload.distance_threshold,
+        limit=payload.limit,
+        offset=payload.offset,
+        **filters,
     )
````

agent_memory_server/config.py

Lines changed: 6 additions & 0 deletions
````diff
@@ -24,5 +24,11 @@ class Settings(BaseSettings):
     enable_topic_extraction: bool = True
     enable_ner: bool = True

+    # RedisVL Settings
+    redisvl_distance_metric: str = "COSINE"
+    redisvl_vector_dimensions: str = "1536"
+    redisvl_index_name: str = "memory"
+    redisvl_index_prefix: str = "memory"
+

 settings = Settings()
````
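
Because these are pydantic `BaseSettings` fields, they can be overridden with environment variables. A minimal sketch, assuming pydantic's default case-insensitive mapping of field names to environment variable names:

```python
import os

# Hypothetical overrides; in practice, export these before starting the server.
os.environ["REDISVL_DISTANCE_METRIC"] = "COSINE"
os.environ["REDISVL_VECTOR_DIMENSIONS"] = "1536"
os.environ["REDISVL_INDEX_NAME"] = "memory"
os.environ["REDISVL_INDEX_PREFIX"] = "memory"

from agent_memory_server.config import Settings

settings = Settings()
assert settings.redisvl_index_name == "memory"
```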
