
Commit 25e3149

Merge pull request #1 from redis-developer/filters
Expand filters available for long-term search in the MCP and API servers
2 parents: 1906a7e + 3f2753f

19 files changed (+1159 -462 lines)

.github/workflows/python-tests.yml

Lines changed: 2 additions & 1 deletion
````diff
@@ -34,7 +34,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: [3.12, 3.13]
+        python-version: [3.12] # Not testing with 3.13 at the moment
         redis-version: ['6.2.6-v9', 'latest'] # 8.0-M03 is not working atm

     steps:
@@ -57,6 +57,7 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
+          pip install uv
           uv sync --all-extras

       - name: Run tests
````

README.md

Lines changed: 115 additions & 9 deletions
````diff
@@ -7,27 +7,35 @@ A Redis-powered memory server built for AI agents and applications. It manages b
 - **Short-Term Memory**
   - Storage for messages, token count, context, and metadata for a session
   - Automatically and recursively summarizes conversations
-  - Token limit management based on specific model capabilities
+  - Client model-aware token limit management (adapts to the context window of the client's LLM)
+  - Supports all major OpenAI and Anthropic models

 - **Long-Term Memory**
   - Storage for long-term memories across sessions
-  - Semantic search to retrieve memories, with filters such as topic, entity, etc.
+  - Semantic search to retrieve memories with an advanced filtering system
+  - Filter by session, namespace, topics, entities, timestamps, and more
+  - Supports both exact match and semantic similarity search
   - Automatic topic modeling for stored memories with BERTopic
   - Automatic Entity Recognition using BERT

 - **Other Features**
-  - Support for OpenAI and Anthropic model providers
   - Namespace support for session and long-term memory isolation
   - Both a REST interface and MCP server

 ## System Diagram
 ![System Diagram](diagram.png)

-## Roadmap
-- Long-term memory deduplication
+## Project Status and Roadmap
+
+### Project Status: In Development, Pre-Release
+
+This project is under active development and is **pre-release** software. Think of it as an early beta!
+
+### Roadmap
+
+- Long-term memory deduplication and compaction
 - Configurable strategy for moving session memory to long-term memory
-- Auth hooks
+- Authentication/authorization hooks
 - Use a background task system instead of `BackgroundTask`
+- Separate Redis connections for long-term and short-term memory

 ## REST API Endpoints
````
````diff
@@ -50,6 +58,11 @@ The following endpoints are available:
 - **GET /sessions/{session_id}/memory**
   Retrieves conversation memory for a session, including messages and
   summarized older messages.
+  _Query Parameters:_
+  - `namespace` (string, optional): The namespace to use for the session
+  - `window_size` (int, optional): Number of messages to include in the response (default from config)
+  - `model_name` (string, optional): The client's LLM model name, used to determine the appropriate context window size
+  - `context_window_max` (int, optional): Direct specification of max context window tokens (overrides `model_name`)

 - **POST /sessions/{session_id}/memory**
   Adds messages (and optional context) to a session's memory.
````
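
For illustration, here is a minimal Python sketch of a request that uses the new query parameters; the base URL, port, and session ID are placeholders for this example, not values defined by the project:

```python
import requests  # third-party HTTP client, used here for illustration

BASE_URL = "http://localhost:8000"  # hypothetical; match your deployment

# Request up to 12 messages; if the model's context window is smaller,
# the server clamps the window accordingly.
resp = requests.get(
    f"{BASE_URL}/sessions/session-123/memory",
    params={
        "namespace": "default",
        "window_size": 12,
        "model_name": "gpt-4o-mini",
    },
)
resp.raise_for_status()
print(resp.json())
```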
````diff
@@ -75,6 +88,40 @@ The following endpoints are available:
   }
   ```

+- **POST /long-term-memory/search**
+  Performs semantic search on long-term memories with advanced filtering options.
+  _Request Body Example:_
+  ```json
+  {
+    "text": "Search query text",
+    "limit": 10,
+    "offset": 0,
+    "session_id": {"eq": "session-123"},
+    "namespace": {"eq": "default"},
+    "topics": {"any": ["AI", "Machine Learning"]},
+    "entities": {"all": ["OpenAI", "Claude"]},
+    "created_at": {"gte": 1672527600, "lte": 1704063599},
+    "last_accessed": {"gt": 1704063600},
+    "user_id": {"eq": "user-456"}
+  }
+  ```
+
+  _Filter options:_
+  - Tag filters (`session_id`, `namespace`, `topics`, `entities`, `user_id`):
+    - `eq`: Equals this value
+    - `ne`: Does not equal this value
+    - `any`: Contains any of these values
+    - `all`: Contains all of these values
+  - Numeric filters (`created_at`, `last_accessed`):
+    - `gt`: Greater than
+    - `lt`: Less than
+    - `gte`: Greater than or equal
+    - `lte`: Less than or equal
+    - `eq`: Equals
+    - `ne`: Does not equal
+    - `between`: Between two values

 ## MCP Server Interface
 Agent Memory Server offers an MCP (Model Context Protocol) server interface powered by FastMCP, providing tool-based long-term memory management:
````
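
As a usage sketch for the search endpoint above: the base URL is a placeholder, and the JSON shape shown for `between` is an assumption inferred from the filter list, not confirmed by this commit:

```python
import requests  # third-party HTTP client, used here for illustration

BASE_URL = "http://localhost:8000"  # hypothetical; match your deployment

payload = {
    "text": "preferences about programming languages",
    "limit": 5,
    "topics": {"any": ["Python", "Rust"]},
    # Assumed shape for `between`: a two-element [start, end] range
    # of Unix timestamps.
    "created_at": {"between": [1672527600, 1704063599]},
}

resp = requests.post(f"{BASE_URL}/long-term-memory/search", json=payload)
resp.raise_for_status()
print(resp.json())
```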

````diff
@@ -86,15 +133,29 @@ Agent Memory Server offers an MCP (Model Context Protocol) server interface powered by FastMCP, providing tool-based long-term memory management:
 ### Local Install

-1. Install the package and required dependencies:
+First, you'll need to download this repository. After you've downloaded it, you can install and run the servers.
+
+1. Install the package and required dependencies with pip, ideally into a virtual environment:
 ```bash
 pip install -e .
 ```

-2. Start both the REST API server and MCP server:
+**NOTE:** This project uses `uv` for dependency management, so if you have uv installed, you can run `uv sync` instead of `pip install ...` to install the project's dependencies.
+
+2 (a). The easiest way to start the REST API server and MCP server in SSE mode is to use Docker Compose. See the Docker Compose section of this file for more details.
+
+2 (b). You can also run the REST API and MCP servers directly:
+
+#### REST API
 ```bash
 python -m agent_memory_server.main
 ```
+
+#### MCP Server
+The MCP server can run in either SSE or stdio mode:
+```bash
+python -m agent_memory_server.mcp <sse|stdio>
+```
+
+**NOTE:** With uv, just prefix the command with `uv run`, e.g.: `uv run python -m agent_memory_server.mcp sse`.

 ### Docker Compose
````

````diff
@@ -114,6 +175,51 @@ To start the API using Docker Compose, follow these steps:
 6. To stop the containers, press Ctrl+C in the terminal and then run:
    docker-compose down

+## Using the MCP Server with Claude Desktop, Cursor, etc.
+You can use the MCP server that comes with this project in any application or SDK that supports MCP tools.
+
+### Claude
+<img src="claude.png">
+
+For example, with Claude, use the following configuration:
+```json
+{
+  "mcpServers": {
+    "redis-memory-server": {
+      "command": "uv",
+      "args": [
+        "--directory",
+        "/ABSOLUTE/PATH/TO/REPO/DIRECTORY/agent-memory-server",
+        "run",
+        "python",
+        "-m",
+        "agent_memory_server.mcp",
+        "stdio"
+      ]
+    }
+  }
+}
+```
+**NOTE:** On a Mac, this configuration requires that you install uv with `brew install uv`. Any method that makes the `uv` command globally accessible, so Claude can find it, should work.
+
+### Cursor
+
+<img src="cursor.png">
+
+Cursor's MCP config is similar to Claude's, but it also supports SSE servers, so you can run the server yourself and pass in the URL:
+
+```json
+{
+  "mcpServers": {
+    "redis-memory-server": {
+      "url": "http://localhost:9000/sse"
+    }
+  }
+}
+```

 ## Configuration

 You can configure the service using environment variables:
````
````diff
@@ -164,8 +270,8 @@ python -m pytest
 ```

 ## Known Issues
-- The MCP server from the Python MCP SDK often refuses to shut down with Control-C if it's connected to a client
 - All background tasks run as async coroutines in the same process as the REST API server, using Starlette's `BackgroundTask`
+- ~~The MCP server from the Python MCP SDK often refuses to shut down with Control-C if it's connected to a client~~

 ### Contributing
 1. Fork the repository
````

agent_memory_server/api.py

Lines changed: 57 additions & 8 deletions
````diff
@@ -1,7 +1,10 @@
+from typing import Literal
+
 from fastapi import APIRouter, BackgroundTasks, Depends, HTTPException

 from agent_memory_server import long_term_memory, messages
 from agent_memory_server.config import settings
+from agent_memory_server.llms import get_model_config
 from agent_memory_server.logging import get_logger
 from agent_memory_server.models import (
     AckResponse,
````
````diff
@@ -18,6 +21,32 @@
 logger = get_logger(__name__)

+ModelNameLiteral = Literal[
+    "gpt-3.5-turbo",
+    "gpt-3.5-turbo-16k",
+    "gpt-4",
+    "gpt-4-32k",
+    "gpt-4o",
+    "gpt-4o-mini",
+    "o1",
+    "o1-mini",
+    "o3-mini",
+    "text-embedding-ada-002",
+    "text-embedding-3-small",
+    "text-embedding-3-large",
+    "claude-3-opus-20240229",
+    "claude-3-sonnet-20240229",
+    "claude-3-haiku-20240307",
+    "claude-3-5-sonnet-20240620",
+    "claude-3-7-sonnet-20250219",
+    "claude-3-5-sonnet-20241022",
+    "claude-3-5-haiku-20241022",
+    "claude-3-7-sonnet-latest",
+    "claude-3-5-sonnet-latest",
+    "claude-3-5-haiku-latest",
+    "claude-3-opus-latest",
+]
+
 router = APIRouter()
````
````diff
@@ -54,6 +83,8 @@ async def get_session_memory(
     session_id: str,
     namespace: str | None = None,
     window_size: int = settings.window_size,
+    model_name: ModelNameLiteral | None = None,
+    context_window_max: int | None = None,
 ):
     """
     Get memory for a session.
````
````diff
@@ -62,18 +93,31 @@ async def get_session_memory(
     Args:
         session_id: The session ID
-        window_size: The number of messages to include in the response
         namespace: The namespace to use for the session
+        window_size: The number of messages to include in the response
+        model_name: The client's LLM model name (will determine context window size if provided)
+        context_window_max: Direct specification of the context window max tokens (overrides model_name)

     Returns:
         Conversation history and context
     """
     redis = get_redis_conn()

+    # If context_window_max is explicitly provided, use that
+    if context_window_max is not None:
+        effective_window_size = min(window_size, context_window_max)
+    # If model_name is provided, get its max_tokens from our config
+    elif model_name is not None:
+        model_config = get_model_config(model_name)
+        effective_window_size = min(window_size, model_config.max_tokens)
+    # Otherwise use the default window_size
+    else:
+        effective_window_size = window_size
+
     session = await messages.get_session_memory(
         redis=redis,
         session_id=session_id,
-        window_size=window_size,
+        window_size=effective_window_size,
         namespace=namespace,
     )
     if not session:
````
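
To make the precedence above concrete, here is a small standalone sketch (an illustration, not code from this commit): an explicit `context_window_max` wins over `model_name`'s limit, which wins over the plain default:

```python
def effective_window(
    window_size: int,
    model_max_tokens: int | None = None,
    context_window_max: int | None = None,
) -> int:
    """Clamp the requested window to the tightest applicable limit."""
    if context_window_max is not None:
        return min(window_size, context_window_max)
    if model_max_tokens is not None:
        return min(window_size, model_max_tokens)
    return window_size


assert effective_window(50, context_window_max=20) == 20  # explicit cap wins
assert effective_window(50, model_max_tokens=128) == 50   # cap exceeds request
assert effective_window(50) == 50                         # default passthrough
```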
````diff
@@ -162,13 +206,10 @@
 @router.post("/long-term-memory/search", response_model=LongTermMemoryResultsResponse)
 async def search_long_term_memory(payload: SearchPayload):
     """
-    Run a semantic search on long-term memory
-
-    TODO: Infer topics, entities for `text` and attempt to use them
-    as boosts or filters in the search.
+    Run a semantic search on long-term memory with filtering options.

     Args:
-        payload: Search payload
+        payload: Search payload with filter objects for precise queries

     Returns:
         List of search results
````
````diff
@@ -178,7 +219,15 @@ async def search_long_term_memory(payload: SearchPayload):
     if not settings.long_term_memory:
         raise HTTPException(status_code=400, detail="Long-term memory is disabled")

+    # Extract filter objects from the payload
+    filters = payload.get_filters()
+
+    # Pass text, redis, and filter objects to the search function
     return await long_term_memory.search_long_term_memories(
         redis=redis,
-        **payload.model_dump(exclude_none=True),
+        text=payload.text,
+        distance_threshold=payload.distance_threshold,
+        limit=payload.limit,
+        offset=payload.offset,
+        **filters,
     )
````

agent_memory_server/config.py

Lines changed: 6 additions & 0 deletions
````diff
@@ -24,5 +24,11 @@ class Settings(BaseSettings):
     enable_topic_extraction: bool = True
     enable_ner: bool = True

+    # RedisVL Settings
+    redisvl_distance_metric: str = "COSINE"
+    redisvl_vector_dimensions: str = "1536"
+    redisvl_index_name: str = "memory"
+    redisvl_index_prefix: str = "memory"
+

 settings = Settings()
````
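
Because these are pydantic `BaseSettings` fields, they can be overridden with environment variables. A minimal sketch, assuming pydantic's default case-insensitive mapping of field names to environment variable names:

```python
import os

# Hypothetical overrides; in practice, export these before starting the server.
os.environ["REDISVL_DISTANCE_METRIC"] = "COSINE"
os.environ["REDISVL_VECTOR_DIMENSIONS"] = "1536"
os.environ["REDISVL_INDEX_NAME"] = "memory"
os.environ["REDISVL_INDEX_PREFIX"] = "memory"

from agent_memory_server.config import Settings

settings = Settings()
assert settings.redisvl_index_name == "memory"
```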
