Feat: Forgetting mechanism and recency boost #45

Merged
merged 24 commits into from
Aug 13, 2025
Changes from 13 commits
6b44d7e
style: formatting updates from pre-commit
abrookins Aug 8, 2025
24d321d
feat(redis): DB-level recency ranking via RedisVL VectorQuery and ada…
abrookins Aug 8, 2025
576d6c5
feat(redis): use RangeQuery when distance_threshold is provided; Vect…
abrookins Aug 8, 2025
a6d1961
feat(redis): use RedisVL paging with contextlib.suppress; fix loop va…
abrookins Aug 8, 2025
e447288
feat(redis): AggregateQuery with KNN + APPLY + SORTBY for server_side…
abrookins Aug 8, 2025
eac6268
refactor(redis): integrate RecencyAggregationQuery; fix AggregationQu…
abrookins Aug 8, 2025
5e1f7c5
test(redis): add RecencyAggregationQuery and server_side_recency adap…
abrookins Aug 8, 2025
a633044
fix(redis): coerce list fields in Redis aggregate path; add RecencyAg…
abrookins Aug 8, 2025
4c6b1c1
fix: add _parse_list_field to base adapter; tests now pass including …
abrookins Aug 8, 2025
a8fb65c
fix: address PR feedback - improve type checking, extract complex log…
abrookins Aug 11, 2025
453c7b5
feat: expand short recency parameter names to descriptive ones
abrookins Aug 12, 2025
9db522b
feat: complete vectorstore adapter parameter name updates
abrookins Aug 12, 2025
5f849dc
fix: address PR review feedback
abrookins Aug 12, 2025
c8bdeb9
feat: complete client library parameter naming updates
abrookins Aug 12, 2025
5455792
docs: add descriptive parameter examples
abrookins Aug 12, 2025
6c88daf
More variable name fixes
abrookins Aug 12, 2025
83d5abb
refactor: improve code quality and remove duplication in vectorstore …
abrookins Aug 12, 2025
a1a5a4d
refactor: move imports to top of vectorstore_adapter.py module
abrookins Aug 12, 2025
58ee06b
refactor: resolve PR review comments on recency and MCP changes
abrookins Aug 13, 2025
b324f48
merge: resolve conflicts with main branch
abrookins Aug 13, 2025
c1f0729
fix: update test imports after moving recency functions
abrookins Aug 13, 2025
6785bb1
Merge branch 'main' into feature/forgetting-recency
abrookins Aug 13, 2025
aa8c3ea
Remove task memory file
abrookins Aug 13, 2025
be0abca
fix: add robust error handling for LLM response parsing
abrookins Aug 13, 2025
2 changes: 1 addition & 1 deletion .gitignore
@@ -231,5 +231,5 @@ libs/redis/docs/.Trash*
.cursor

*.pyc
ai
.ai
.claude
40 changes: 33 additions & 7 deletions CLAUDE.md
@@ -5,42 +5,68 @@ This project uses Redis 8, which is the redis:8 docker image.
Do not use Redis Stack or other earlier versions of Redis.

## Frequently Used Commands
Get started in a new environment by installing `uv`:
```bash
pip install uv
```

### Project Setup
Get started in a new environment by installing `uv`:
```bash
# Development workflow
pip install uv # Install uv (once)
uv venv # Create a virtualenv (once)
source .venv/bin/activate # Activate the virtualenv (start of terminal session)
uv install --all-extras # Install dependencies
uv sync --all-extras # Sync latest dependencies
```

### Activate the virtual environment
You MUST always activate the virtualenv before running commands:

```bash
source .venv/bin/activate
```

### Running Tests
Always run tests before committing. You MUST have 100% of the tests in the
codebase passing to commit.

Run all tests like this, including tests that require API keys in the
environment:
```bash
uv run pytest --run-api-tests
```

### Linting

```bash
uv run ruff check # Run linting
uv run ruff format # Format code
uv run pytest --run-api-tests # Run all tests

### Managing Dependencies
uv add <dependency> # Add a dependency to pyproject.toml and update lock file
uv remove <dependency> # Remove a dependency from pyproject.toml and update lock file

### Running Servers
# Server commands
uv run agent-memory api # Start REST API server (default port 8000)
uv run agent-memory mcp # Start MCP server (stdio mode)
uv run agent-memory mcp --mode sse --port 9000 # Start MCP server (SSE mode)

### Database Operations
# Database/Redis operations
uv run agent-memory rebuild-index # Rebuild Redis search index
uv run agent-memory migrate-memories # Run memory migrations

### Background Tasks
# Background task management
uv run agent-memory task-worker # Start background task worker
# Schedule a specific task
uv run agent-memory schedule-task "agent_memory_server.long_term_memory.compact_long_term_memories"

### Running All Containers
# Docker development
docker-compose up # Start full stack (API, MCP, Redis)
docker-compose up redis # Start only Redis Stack
docker-compose down # Stop all services
```

### Committing Changes
IMPORTANT: This project uses `pre-commit`. You should run `pre-commit`
before committing:
```bash
36 changes: 35 additions & 1 deletion agent-memory-client/agent_memory_client/client.py
@@ -36,6 +36,7 @@
MemoryRecordResults,
MemoryTypeEnum,
ModelNameLiteral,
RecencyConfig,
SessionListResponse,
WorkingMemory,
WorkingMemoryResponse,
@@ -572,6 +573,7 @@ async def search_long_term_memory(
user_id: UserId | dict[str, Any] | None = None,
distance_threshold: float | None = None,
memory_type: MemoryType | dict[str, Any] | None = None,
recency: RecencyConfig | None = None,
limit: int = 10,
offset: int = 0,
) -> MemoryRecordResults:
@@ -669,13 +671,45 @@ async def search_long_term_memory(
if distance_threshold is not None:
payload["distance_threshold"] = distance_threshold

# Add recency config if provided
if recency is not None:
if recency.recency_boost is not None:
payload["recency_boost"] = recency.recency_boost
if recency.w_sem is not None:
payload["recency_w_sem"] = recency.w_sem
if recency.w_recency is not None:
payload["recency_w_recency"] = recency.w_recency
if recency.wf is not None:
payload["recency_wf"] = recency.wf
if recency.wa is not None:
payload["recency_wa"] = recency.wa
if recency.half_life_last_access_days is not None:
payload["recency_half_life_last_access_days"] = (
recency.half_life_last_access_days
)
if recency.half_life_created_days is not None:
payload["recency_half_life_created_days"] = (
recency.half_life_created_days
)
if recency.server_side_recency is not None:
payload["server_side_recency"] = recency.server_side_recency

try:
response = await self._client.post(
"/v1/long-term-memory/search",
json=payload,
)
response.raise_for_status()
return MemoryRecordResults(**response.json())
data = response.json()
# Some tests may stub json() as an async function; handle awaitable
try:
import inspect

if inspect.isawaitable(data):
data = await data
except Exception:
pass
return MemoryRecordResults(**data)
except httpx.HTTPStatusError as e:
self._handle_http_error(e.response)
raise
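
Note on the recency block in this hunk: each `RecencyConfig` field that is not `None` is forwarded under a payload key, and all fields except `recency_boost` and `server_side_recency` receive a `recency_` prefix. The same mapping can be sketched as a loop. This is only an illustration: `RecencyConfigSketch` is a plain dataclass stand-in for the real pydantic model, and `recency_payload` and `_UNPREFIXED` are hypothetical names that do not appear in this PR.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class RecencyConfigSketch:
    """Stand-in for RecencyConfig (assumption: same field names as in the diff)."""

    recency_boost: Optional[bool] = None
    w_sem: Optional[float] = None
    w_recency: Optional[float] = None
    wf: Optional[float] = None
    wa: Optional[float] = None
    half_life_last_access_days: Optional[float] = None
    half_life_created_days: Optional[float] = None
    server_side_recency: Optional[bool] = None


# Fields sent under their own name; every other field gets a "recency_" prefix.
_UNPREFIXED = {"recency_boost", "server_side_recency"}


def recency_payload(cfg: RecencyConfigSketch) -> dict[str, Any]:
    """Build the recency portion of the search payload, skipping unset fields."""
    payload: dict[str, Any] = {}
    for field in fields(cfg):
        value = getattr(cfg, field.name)
        if value is None:
            continue  # unset fields are omitted so the server applies its defaults
        key = field.name if field.name in _UNPREFIXED else f"recency_{field.name}"
        payload[key] = value
    return payload
```

Omitting unset fields, as the diff does, lets the server-side defaults apply instead of sending explicit nulls.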
24 changes: 24 additions & 0 deletions agent-memory-client/agent_memory_client/models.py
@@ -244,6 +244,30 @@ class MemoryRecordResult(MemoryRecord):
dist: float


class RecencyConfig(BaseModel):
"""Client-side configuration for recency-aware ranking options."""

recency_boost: bool | None = Field(
default=None, description="Enable recency-aware re-ranking"
)
w_sem: float | None = Field(default=None, description="Weight for semantic score")
w_recency: float | None = Field(
default=None, description="Weight for recency composite"
)
wf: float | None = Field(default=None, description="Weight for freshness")
wa: float | None = Field(default=None, description="Weight for age/novelty")
half_life_last_access_days: float | None = Field(
default=None, description="Half-life (days) for last_accessed decay"
)
half_life_created_days: float | None = Field(
default=None, description="Half-life (days) for created_at decay"
)
server_side_recency: bool | None = Field(
default=None,
description="If true, attempt server-side recency ranking (Redis-only)",
)


class MemoryRecordResults(BaseModel):
"""Results from memory search operations"""

42 changes: 42 additions & 0 deletions agent-memory-client/tests/test_client.py
@@ -20,6 +20,7 @@
MemoryRecordResult,
MemoryRecordResults,
MemoryTypeEnum,
RecencyConfig,
WorkingMemoryResponse,
)

@@ -298,6 +299,47 @@ async def test_search_all_long_term_memories(self, enhanced_test_client):
assert mock_search.call_count == 3


class TestRecencyConfig:
@pytest.mark.asyncio
async def test_recency_config_payload(self, enhanced_test_client):
"""Ensure RecencyConfig fields are forwarded in the search payload."""
with patch.object(enhanced_test_client._client, "post") as mock_post:
mock_response = AsyncMock()
mock_response.raise_for_status.return_value = None
mock_response.json.return_value = MemoryRecordResults(
total=0, memories=[], next_offset=None
).model_dump()
mock_post.return_value = mock_response

rc = RecencyConfig(
recency_boost=True,
w_sem=0.7,
w_recency=0.3,
wf=0.6,
wa=0.4,
half_life_last_access_days=7,
half_life_created_days=30,
server_side_recency=True,
)

await enhanced_test_client.search_long_term_memory(
text="q", recency=rc, limit=5
)

# Verify payload contained recency fields
args, kwargs = mock_post.call_args
assert args[0] == "/v1/long-term-memory/search"
body = kwargs["json"]
assert body["recency_boost"] is True
assert body["recency_w_sem"] == 0.7
assert body["recency_w_recency"] == 0.3
assert body["recency_wf"] == 0.6
assert body["recency_wa"] == 0.4
assert body["recency_half_life_last_access_days"] == 7
assert body["recency_half_life_created_days"] == 30
assert body["server_side_recency"] is True


class TestClientSideValidation:
"""Tests for client-side validation methods."""

135 changes: 134 additions & 1 deletion agent_memory_server/api.py
@@ -1,3 +1,5 @@
from typing import Any

import tiktoken
from fastapi import APIRouter, Depends, HTTPException, Query
from mcp.server.fastmcp.prompts import base
@@ -34,6 +36,32 @@
router = APIRouter()


@router.post("/v1/long-term-memory/forget")
async def forget_endpoint(
policy: dict,
namespace: str | None = None,
user_id: str | None = None,
session_id: str | None = None,
limit: int = 1000,
dry_run: bool = True,
pinned_ids: list[str] | None = None,
current_user: UserInfo = Depends(get_current_user),
):
"""Run a forgetting pass with the provided policy. Returns summary data.

This is an admin-style endpoint; auth is enforced by the standard dependency.
"""
return await long_term_memory.forget_long_term_memories(
policy,
namespace=namespace,
user_id=user_id,
session_id=session_id,
limit=limit,
dry_run=dry_run,
pinned_ids=pinned_ids,
)


def _get_effective_token_limit(
model_name: ModelNameLiteral | None,
context_window_max: int | None,
@@ -102,6 +130,54 @@ def _calculate_context_usage_percentages(
return min(total_percentage, 100.0), min(until_summarization_percentage, 100.0)


def _build_recency_params(payload: SearchRequest) -> dict[str, Any]:
"""Build recency parameters dict with backward compatibility.

Prefers new descriptive parameter names over old short names.
"""
# Use new parameter names if available, fall back to old ones, then defaults
semantic_weight = (
payload.recency_semantic_weight
if payload.recency_semantic_weight is not None
else (payload.recency_w_sem if payload.recency_w_sem is not None else 0.8)
)
recency_weight = (
payload.recency_recency_weight
if payload.recency_recency_weight is not None
else (
payload.recency_w_recency if payload.recency_w_recency is not None else 0.2
)
)
freshness_weight = (
payload.recency_freshness_weight
if payload.recency_freshness_weight is not None
else (payload.recency_wf if payload.recency_wf is not None else 0.6)
)
novelty_weight = (
payload.recency_novelty_weight
if payload.recency_novelty_weight is not None
else (payload.recency_wa if payload.recency_wa is not None else 0.4)
)

return {
# Use new descriptive names internally
"semantic_weight": semantic_weight,
"recency_weight": recency_weight,
"freshness_weight": freshness_weight,
"novelty_weight": novelty_weight,
"half_life_last_access_days": (
payload.recency_half_life_last_access_days
if payload.recency_half_life_last_access_days is not None
else 7.0
),
"half_life_created_days": (
payload.recency_half_life_created_days
if payload.recency_half_life_created_days is not None
else 30.0
),
}
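
Note on `_build_recency_params`: every weight follows the same chain, new descriptive name first, then the old short name, then a default. A small helper could express that chain once. This is a sketch only; `first_not_none` is a hypothetical name and is not part of this PR.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def first_not_none(*candidates: Optional[T], default: T) -> T:
    """Return the first candidate that is not None, else the default.

    Uses an explicit `is not None` check (like the diff does) so that a
    legitimate value of 0.0 is kept rather than treated as missing.
    """
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return default


# Hypothetical usage mirroring the semantic-weight fallback in the hunk above:
# semantic_weight = first_not_none(
#     payload.recency_semantic_weight, payload.recency_w_sem, default=0.8
# )
```

The explicit `None` check matters here: writing `a or b or 0.8` would silently replace a weight of `0.0` with the next candidate.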


async def _summarize_working_memory(
memory: WorkingMemory,
model_name: ModelNameLiteral | None = None,
@@ -525,7 +601,64 @@ async def search_long_term_memory(
logger.debug(f"Long-term search kwargs: {kwargs}")

# Pass text and filter objects to the search function (no redis needed for vectorstore adapter)
return await long_term_memory.search_long_term_memories(**kwargs)
# Server-side recency rerank toggle (Redis-only path); defaults to False
server_side_recency = (
payload.server_side_recency
if payload.server_side_recency is not None
else False
)
if server_side_recency:
kwargs["server_side_recency"] = True
kwargs["recency_params"] = _build_recency_params(payload)
return await long_term_memory.search_long_term_memories(**kwargs)

raw_results = await long_term_memory.search_long_term_memories(**kwargs)

# Recency-aware re-ranking of results (configurable)
try:
from datetime import UTC, datetime as _dt

# Decide whether to apply recency boost
recency_boost = (
payload.recency_boost if payload.recency_boost is not None else True
)
if not recency_boost or not raw_results.memories:
return raw_results

now = _dt.now(UTC)
recency_params = {
"w_sem": payload.recency_w_sem
if payload.recency_w_sem is not None
else 0.8,
"w_recency": payload.recency_w_recency
if payload.recency_w_recency is not None
else 0.2,
"wf": payload.recency_wf if payload.recency_wf is not None else 0.6,
"wa": payload.recency_wa if payload.recency_wa is not None else 0.4,
"half_life_last_access_days": (
payload.recency_half_life_last_access_days
if payload.recency_half_life_last_access_days is not None
else 7.0
),
"half_life_created_days": (
payload.recency_half_life_created_days
if payload.recency_half_life_created_days is not None
else 30.0
),
}
ranked = long_term_memory.rerank_with_recency(
raw_results.memories, now=now, params=recency_params
)
# Update last_accessed in background with rate limiting
ids = [m.id for m in ranked if m.id]
if ids:
background_tasks = get_background_tasks()
await background_tasks.add_task(long_term_memory.update_last_accessed, ids)

raw_results.memories = ranked
return raw_results
except Exception:
return raw_results
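
Note on the recency parameters: the implementation of `rerank_with_recency` is not part of the hunks shown here, so the following is only a sketch of how the weights and half-lives could combine. It assumes exponential half-life decay, a semantic similarity of `1 - dist`, and the descriptive key names returned by `_build_recency_params`. The functions `half_life_decay` and `recency_score` are hypothetical, and the actual scoring in the PR may differ.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 86400.0


def half_life_decay(age_days: float, half_life_days: float) -> float:
    """Exponential decay: 1.0 at age 0, 0.5 after one half-life."""
    return 0.5 ** (max(age_days, 0.0) / half_life_days)


def recency_score(
    dist: float,
    created_at: datetime,
    last_accessed: datetime,
    now: datetime,
    params: dict[str, float],
) -> float:
    """Blend semantic similarity with a freshness/novelty recency composite."""
    days_since_access = (now - last_accessed).total_seconds() / SECONDS_PER_DAY
    days_since_created = (now - created_at).total_seconds() / SECONDS_PER_DAY

    # Freshness decays from the last access, novelty from the creation time.
    freshness = half_life_decay(
        days_since_access, params["half_life_last_access_days"]
    )
    novelty = half_life_decay(days_since_created, params["half_life_created_days"])
    recency = (
        params["freshness_weight"] * freshness + params["novelty_weight"] * novelty
    )

    semantic = 1.0 - dist  # assumption: smaller vector distance means more similar
    return params["semantic_weight"] * semantic + params["recency_weight"] * recency


# Defaults taken from the diff: 0.8 / 0.2 / 0.6 / 0.4, half-lives of 7 and 30 days.
DEFAULT_PARAMS = {
    "semantic_weight": 0.8,
    "recency_weight": 0.2,
    "freshness_weight": 0.6,
    "novelty_weight": 0.4,
    "half_life_last_access_days": 7.0,
    "half_life_created_days": 30.0,
}
```

Under these assumptions, two memories at the same vector distance are ordered by how recently they were created and accessed, while the default 0.8 semantic weight keeps similarity as the dominant term.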


@router.delete("/v1/long-term-memory", response_model=AckResponse)