Commit 9890ae2: Configurable model selection
Parent: b4915d7

File tree: 15 files changed, +605 −114 lines changed

.github/workflows/python-tests.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -15,7 +15,7 @@ jobs:
       - name: Set up Python
         uses: actions/setup-python@v4
         with:
-          python-version: '3.11'
+          python-version: '3.12'
           cache: 'pip'
 
       - name: Install dependencies
```

README.md

Lines changed: 82 additions & 31 deletions
````diff
@@ -1,55 +1,106 @@
 # Redis Memory Server
 
-A Python memory server for agents and LLM applications. This application
-provides memory features for LLM conversations, including short-term memory
-(message history) and long-term memory (vector embeddings for semantic search).
+A service that provides memory management for AI applications using Redis.
 
 ## Features
 
-- Short-term memory storage for conversation history
-- Optional long-term memory with semantic search capabilities
-- Automatic context summarization to handle long conversations
-- Integration with OpenAI API (more coming soon)
-- Redis-based storage with vector search
+- Short-term memory management with configurable window size
+- Long-term memory with semantic search capabilities
+- Automatic context summarization using LLMs
+- Support for multiple model providers (OpenAI and Anthropic)
+- Configurable token limits based on selected model
+
+## Configuration
+
+The service can be configured using environment variables:
+
+- `REDIS_URL`: URL for Redis connection (default: `redis://localhost:6379`)
+- `LONG_TERM_MEMORY`: Enable/disable long-term memory (default: `True`)
+- `WINDOW_SIZE`: Maximum number of messages to keep in short-term memory (default: `20`)
+- `OPENAI_API_KEY`: API key for OpenAI
+- `ANTHROPIC_API_KEY`: API key for Anthropic
+- `GENERATION_MODEL`: Model to use for text generation (default: `gpt-4o-mini`)
+- `EMBEDDING_MODEL`: Model to use for text embeddings (default: `text-embedding-3-small`)
+- `PORT`: Port to run the server on (default: `8000`)
+
+## Supported Models
+
+### OpenAI Models
+
+- `gpt-3.5-turbo`: 4K context window
+- `gpt-3.5-turbo-16k`: 16K context window
+- `gpt-4`: 8K context window
+- `gpt-4-32k`: 32K context window
+- `gpt-4o`: 128K context window
+- `gpt-4o-mini`: 128K context window
+
+### Anthropic Models
+
+- `claude-3-opus-20240229`: 200K context window
+- `claude-3-sonnet-20240229`: 200K context window
+- `claude-3-haiku-20240307`: 200K context window
+- `claude-3-5-sonnet-20240620`: 200K context window
+
+**Note**: Embedding operations always use OpenAI models, as Anthropic does not provide embedding API.
 
 ## Installation
 
 1. Clone the repository
-2. Install dependencies:
+2. Install dependencies: `pip install -r requirements.txt`
+3. Set up environment variables (see Configuration section)
+4. Run the server: `python main.py`
+
+## Usage
+
+### Add Messages to Memory
 
 ```
-pip install -r requirements.txt
+POST /sessions/{session_id}/memory
 ```
-3. Set up environment variables:
+
+Request body:
+```json
+{
+    "messages": [
+        {
+            "role": "user",
+            "content": "Hello, how are you?"
+        },
+        {
+            "role": "assistant",
+            "content": "I'm doing well, thank you for asking!"
+        }
+    ],
+    "context": "Optional previous summary"
+}
 ```
-# Required
-REDIS_URL=redis://localhost:6379
 
-# Optional
-PORT=8000
-LONG_TERM_MEMORY=true
-MAX_WINDOW_SIZE=12
-MODEL=gpt-3.5-turbo
+### Get Memory
 
-# For OpenAI
-OPENAI_API_KEY=your_openai_api_key
+```
+GET /sessions/{session_id}/memory
 ```
 
-## Usage
-
-Start the server:
+### Search Memory
 
 ```
-python main.py
+POST /sessions/{session_id}/retrieval
 ```
 
-## API Endpoints
+Request body:
+```json
+{
+    "text": "What was the conversation about?"
+}
+```
+
+## Development
+
+To run tests:
 
-- `GET /health`: Health check endpoint
-- `GET /sessions`: Get a list of session IDs
-- `GET /sessions/{session_id}/memory`: Get memory for a session
-- `POST /sessions/{session_id}/memory`: Add messages to a session
-- `DELETE /sessions/{session_id}/memory`: Delete a session's memory
-- `POST /sessions/{session_id}/retrieval`: Perform semantic search on session memory
+```
+python -m pytest
+```
 
 ## License
 TBD
````
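For a client, the request body shown for `POST /sessions/{session_id}/memory` is easy to assemble programmatically. A minimal sketch; the `build_memory_payload` helper is hypothetical, not part of this repo:

```python
def build_memory_payload(turns, context=None):
    """Build a request body matching the README's memory schema (hypothetical helper).

    `turns` is a list of (role, content) pairs; `context` optionally carries
    a summary of earlier conversation.
    """
    payload = {"messages": [{"role": role, "content": content} for role, content in turns]}
    if context is not None:
        payload["context"] = context
    return payload

payload = build_memory_payload(
    [
        ("user", "Hello, how are you?"),
        ("assistant", "I'm doing well, thank you for asking!"),
    ],
    context="Optional previous summary",
)
```

With a client library such as `requests`, sending it would look like `requests.post(f"{base_url}/sessions/{session_id}/memory", json=payload)`.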

config.py

Lines changed: 1 addition & 0 deletions
```diff
@@ -7,6 +7,7 @@ class Settings(BaseSettings):
     long_term_memory: bool = True
     window_size: int = 20
     openai_api_key: str = os.getenv("OPENAI_API_KEY", "")
+    anthropic_api_key: str = os.getenv("ANTHROPIC_API_KEY", "")
     generation_model: str = "gpt-4o-mini"
     embedding_model: str = "text-embedding-3-small"
     port: int = 8000
```
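The new `anthropic_api_key` field follows the same pattern as `openai_api_key`: the value comes from the environment, with an empty-string fallback. A dependency-free sketch of that precedence, using a plain dataclass with `os.getenv` as a stand-in for the pydantic `BaseSettings` behavior:

```python
import os
from dataclasses import dataclass, field

@dataclass
class SettingsSketch:
    # Mirrors the defaults in config.py; environment variables win when set.
    openai_api_key: str = field(default_factory=lambda: os.getenv("OPENAI_API_KEY", ""))
    anthropic_api_key: str = field(default_factory=lambda: os.getenv("ANTHROPIC_API_KEY", ""))
    generation_model: str = "gpt-4o-mini"
    embedding_model: str = "text-embedding-3-small"

os.environ["ANTHROPIC_API_KEY"] = "test-key"  # simulate a configured key
settings = SettingsSketch()
```

The factories are evaluated at instantiation time, so keys exported before the app constructs its settings object are picked up, and unset keys stay as empty strings for the startup checks in main.py to detect.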

long_term_memory.py

Lines changed: 36 additions & 12 deletions
```diff
@@ -1,13 +1,15 @@
-from typing import List, Type
+from typing import List, Type, Union, Any
 import nanoid
-import numpy as np
 from redis.asyncio import Redis
 from redis.commands.search.query import Query
 from models import (
     MemoryMessage,
     OpenAIClientWrapper,
+    AnthropicClientWrapper,
     RedisearchResult,
     SearchResults,
+    ModelProvider,
+    get_model_config,
 )
 import logging
 
@@ -19,7 +21,7 @@
 async def index_messages(
     messages: List[MemoryMessage],
     session_id: str,
-    openai_client: OpenAIClientWrapper,
+    client: OpenAIClientWrapper,  # Only OpenAI supports embeddings currently
     redis_conn: Redis,
 ) -> None:
     """Index messages in Redis for vector search"""
@@ -28,7 +30,7 @@ async def index_messages(
     contents = [msg.content for msg in messages]
 
     # Get embeddings from OpenAI
-    embeddings = await openai_client.create_embedding(contents)
+    embeddings = await client.create_embedding(contents)
 
     # Index each message with its embedding
     for index, embedding in enumerate(embeddings):
@@ -64,16 +66,18 @@ class Unset:
 async def search_messages(
     query: str,
     session_id: str,
-    openai_client: OpenAIClientWrapper,
+    client: OpenAIClientWrapper,  # Only OpenAI supports embeddings currently
     redis_conn: Redis,
     distance_threshold: float | Type[Unset] = Unset,
     limit: int = 10,
 ) -> SearchResults:
     """Search for messages using vector similarity"""
     try:
         # Get embedding for query
-        query_embedding = await openai_client.create_embedding([query])
+        query_embedding = await client.create_embedding([query])
         vector = query_embedding.tobytes()
+
+        # Set up query parameters
         params = {"vec": vector}
 
         if distance_threshold and distance_threshold is not Unset:
@@ -85,26 +89,46 @@ async def search_messages(
         base_query = Query(
             f"@session:{{{session_id}}}=>[KNN {limit} @vector $vec AS dist]"
         )
+
         q = (
             base_query.return_fields("role", "content", "dist")
             .sort_by("dist", asc=True)
             .paging(0, limit)
             .dialect(2)
         )
 
+        # Execute search
         raw_results = await redis_conn.ft(REDIS_INDEX_NAME).search(
             q,
             query_params=params,  # type: ignore
         )
 
-        # Parse results
-        results = [
-            RedisearchResult(role=doc.role, content=doc.content, dist=doc.dist)
-            for doc in raw_results.docs
-        ]
+        # Parse results safely
+        results = []
+        total_results = 0
+
+        # Check if raw_results has the expected attributes
+        if hasattr(raw_results, "docs") and isinstance(raw_results.docs, list):
+            for doc in raw_results.docs:
+                if (
+                    hasattr(doc, "role")
+                    and hasattr(doc, "content")
+                    and hasattr(doc, "dist")
+                ):
+                    results.append(
+                        RedisearchResult(
+                            role=doc.role, content=doc.content, dist=float(doc.dist)
+                        )
+                    )
+
+            total_results = getattr(raw_results, "total", len(results))
+        else:
+            # Handle the case where raw_results doesn't have the expected structure
+            logger.warning("Unexpected search result format")
+            total_results = 0
 
         logger.info(f"Found {len(results)} results for query in session {session_id}")
-        return SearchResults(total=raw_results.total, docs=results)
+        return SearchResults(total=total_results, docs=results)
     except Exception as e:
         logger.error(f"Error searching messages: {e}")
         raise
```
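The rewritten result parsing defends against search responses that lack `docs`, `total`, or per-document fields. The same pattern can be exercised in isolation with stand-in objects (`SimpleNamespace` simulating redis-py's result object; this is an illustration, not the repo's code):

```python
from types import SimpleNamespace

def parse_results(raw_results):
    """Mirror the hasattr/getattr checks from the new search_messages parsing."""
    results = []
    if hasattr(raw_results, "docs") and isinstance(raw_results.docs, list):
        for doc in raw_results.docs:
            # Skip documents missing any expected field instead of raising.
            if all(hasattr(doc, a) for a in ("role", "content", "dist")):
                results.append(
                    {"role": doc.role, "content": doc.content, "dist": float(doc.dist)}
                )
        total = getattr(raw_results, "total", len(results))
    else:
        total = 0  # unexpected shape: report nothing rather than crash
    return total, results

ok = SimpleNamespace(
    total=1,
    docs=[SimpleNamespace(role="user", content="Hello", dist="0.12")],
)
total, docs = parse_results(ok)
```

Note the `float(doc.dist)` cast: RediSearch returns field values as strings, so the old code silently passed a string distance through to `RedisearchResult`.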

main.py

Lines changed: 68 additions & 2 deletions
```diff
@@ -6,6 +6,7 @@
 from fastapi import FastAPI
 
 import utils
+from models import ModelProvider, MODEL_CONFIGS
 
 load_dotenv()
 
@@ -36,12 +37,58 @@ async def startup_event():
     """Initialize the application on startup"""
     logger.info("Starting Redis Memory Server 🤘")
 
+    # Check for required API keys
+    available_providers = []
+
+    if settings.openai_api_key:
+        available_providers.append(ModelProvider.OPENAI)
+    else:
+        logger.warning("OpenAI API key not set, OpenAI models will not be available")
+
+    if settings.anthropic_api_key:
+        available_providers.append(ModelProvider.ANTHROPIC)
+    else:
+        logger.warning(
+            "Anthropic API key not set, Anthropic models will not be available"
+        )
+
+    # Check if the configured models are available
+    generation_model_config = MODEL_CONFIGS.get(settings.generation_model)
+    embedding_model_config = MODEL_CONFIGS.get(settings.embedding_model)
+
+    if (
+        generation_model_config
+        and generation_model_config.provider not in available_providers
+    ):
+        logger.warning(
+            f"Selected generation model {settings.generation_model} requires {generation_model_config.provider} API key"
+        )
+
+    if (
+        embedding_model_config
+        and embedding_model_config.provider not in available_providers
+    ):
+        logger.warning(
+            f"Selected embedding model {settings.embedding_model} requires {embedding_model_config.provider} API key"
+        )
+
+    # If long-term memory is enabled but OpenAI isn't available, warn user
+    if settings.long_term_memory and ModelProvider.OPENAI not in available_providers:
+        logger.warning(
+            "Long-term memory requires OpenAI for embeddings, but OpenAI API key is not set"
+        )
+
     # Set up RediSearch index if long-term memory is enabled
     if settings.long_term_memory:
         redis = get_redis_conn()
 
-        # For now, just ada support
-        vector_dimensions = 1536
+        # Get embedding dimensions from model config
+        embedding_model_config = MODEL_CONFIGS.get(settings.embedding_model)
+        vector_dimensions = (
+            embedding_model_config.embedding_dimensions
+            if embedding_model_config
+            else 1536
+        )
         distance_metric = "COSINE"
 
         try:
@@ -50,6 +97,25 @@ async def startup_event():
             logger.error(f"Failed to ensure RediSearch index: {e}")
             raise
 
+    # Show available models
+    openai_models = [
+        model
+        for model, config in MODEL_CONFIGS.items()
+        if config.provider == ModelProvider.OPENAI
+        and ModelProvider.OPENAI in available_providers
+    ]
+    anthropic_models = [
+        model
+        for model, config in MODEL_CONFIGS.items()
+        if config.provider == ModelProvider.ANTHROPIC
+        and ModelProvider.ANTHROPIC in available_providers
+    ]
+
+    if openai_models:
+        logger.info(f"Available OpenAI models: {', '.join(openai_models)}")
+    if anthropic_models:
+        logger.info(f"Available Anthropic models: {', '.join(anthropic_models)}")
+
     logger.info(
         "Redis Memory Server initialized",
         window_size=settings.window_size,
```
