Memory Management Toolkit for AI Agents: Remember Me, Refine Me.
For legacy versions, please refer to the 0.2.x Documentation.
🧠 ReMe is a memory management framework built specifically for AI Agents, offering both file-based and vector-based memory systems.
It addresses two core memory challenges for agents: a limited context window (early information gets truncated or lost in long conversations) and stateless sessions (new conversations cannot inherit history, starting from scratch every time).
ReMe gives agents true memory capability — old conversations are automatically condensed, important information is persistently stored, and relevant context is automatically recalled in future conversations.
Memory as Files, Files as Memory
Treat memory as files: readable, editable, and copyable. The file-based system is implemented by the ReMeLight class, which CoPaw's MemoryManager also builds on.
| Traditional Memory System | File Based ReMe |
|---|---|
| 🗄️ Database storage | 📝 Markdown files |
| 🔒 Invisible | 👀 Always readable |
| ❌ Hard to modify | ✏️ Direct editing |
| 🚫 Hard to migrate | 📦 Copy to migrate |
```
working_dir/
├── MEMORY.md            # Long-term memory: user preferences, project configs, etc.
├── memory/
│   └── YYYY-MM-DD.md    # Daily summary logs: auto-written after conversations
└── tool_result/         # Long tool output cache (auto-managed, auto-cleanup on expiry)
    └── <uuid>.txt
```
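To make the layout concrete, here is a minimal sketch that creates the same directory structure with `pathlib`. The function name is hypothetical and for illustration only; ReMeLight creates and manages this layout itself on `start()`.

```python
from pathlib import Path


def bootstrap_layout(working_dir: str) -> Path:
    """Create the memory file layout shown above (illustrative only;
    ReMeLight does this automatically when it starts)."""
    root = Path(working_dir)
    (root / "memory").mkdir(parents=True, exist_ok=True)       # daily summary logs
    (root / "tool_result").mkdir(parents=True, exist_ok=True)  # long tool output cache
    memory_md = root / "MEMORY.md"
    if not memory_md.exists():
        memory_md.write_text("# Long-term memory\n", encoding="utf-8")
    return root


root = bootstrap_layout(".reme_demo")
print(sorted(p.name for p in root.iterdir()))
```

Because memory is plain files, migrating it to another machine is just copying `working_dir`.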
ReMeLight is the core class of this memory system, providing complete memory management capabilities for AI Agents:
| Method | Function | Key Components |
|---|---|---|
| `start` | 🚀 Start memory system | Initialize file store, file watcher, embedding cache; clean up expired tool result files |
| `close` | 📕 Close and clean up | Clean up tool result files, stop file watcher, save embedding cache |
| `compact_memory` | 📦 Compress history to summary | Compactor: a ReActAgent generates structured context checkpoints |
| `summary_memory` | 📝 Write important memories to files | Summarizer: ReActAgent + file tools (read / write / edit) |
| `compact_tool_result` | ✂️ Compress long tool outputs | ToolResultCompactor: truncate, save to `tool_result/`, keep a file reference in the message |
| `add_async_summary_task` | ⚡ Submit background summary task | `asyncio.create_task`; summarization does not block the main conversation flow |
| `await_summary_tasks` | ⏳ Wait for background tasks | Collect results from all background summary tasks; call before closing to ensure writes complete |
| `memory_search` | 🔍 Semantic memory search | MemorySearch: vector + BM25 hybrid retrieval |
| `get_in_memory_memory` | 🗂️ Create in-memory instance | ReMeInMemoryMemory: token-aware memory management with compression summaries and state serialization |
| `update_params` | ⚙️ Dynamically update runtime params | Runtime adjustment of `max_input_length`, `memory_compact_ratio`, `language` |
```bash
pip install -U reme-ai[as]
```

ReMeLight environment variables configure the embedding and storage backends:
| Environment Variable | Description | Default |
|---|---|---|
| `EMBEDDING_API_KEY` | Embedding service API key | `""` |
| `EMBEDDING_BASE_URL` | Embedding service base URL | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
| `EMBEDDING_MODEL_NAME` | Embedding model name | `""` |
| `EMBEDDING_DIMENSIONS` | Vector dimensions | `1024` |
| `EMBEDDING_CACHE_ENABLED` | Enable embedding cache | `true` |
| `EMBEDDING_MAX_CACHE_SIZE` | Maximum cache entries | `2000` |
| `FTS_ENABLED` | Enable full-text search (BM25) | `true` |
| `MEMORY_STORE_BACKEND` | Storage backend (`auto` / `chroma` / `local`) | `auto` (`local` on Windows, `chroma` otherwise) |
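A quick sketch of setting these variables from Python before constructing ReMeLight. The values below are placeholders, not real credentials; `setdefault` keeps anything already exported in your shell or `.env` file intact.

```python
import os

# Placeholder values for the variables documented above (not real credentials).
env_defaults = {
    "EMBEDDING_API_KEY": "sk-your-key",
    "EMBEDDING_BASE_URL": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "EMBEDDING_DIMENSIONS": "1024",
    "MEMORY_STORE_BACKEND": "auto",
}
for key, value in env_defaults.items():
    # setdefault: never override a value the user already set in the shell / .env
    os.environ.setdefault(key, value)
```

In practice you would set these in a `.env` file or your shell profile rather than in code.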
```python
import asyncio

from agentscope.message import Msg
from reme.reme_light import ReMeLight


async def main():
    reme = ReMeLight(
        working_dir=".reme",         # Memory file storage directory
        max_input_length=128000,     # Model context window (tokens)
        memory_compact_ratio=0.7,    # Trigger compression at max_input_length * 0.7
        language="zh",               # Summary language (zh / "")
        tool_result_threshold=1000,  # Auto-save tool outputs exceeding this character count
        retention_days=7,            # tool_result/ file retention days
    )
    await reme.start()

    messages = [...]

    # 1. Compress long tool outputs (prevent tool results from bloating the context)
    messages = await reme.compact_tool_result(messages)

    # 2. Compress conversation history into a structured summary (triggered when the
    #    context approaches its limit); pass the previous summary for incremental updates
    summary = await reme.compact_memory(messages=messages, previous_summary="")

    # 3. Submit an async summary task in the background (non-blocking,
    #    writes to memory/YYYY-MM-DD.md)
    reme.add_async_summary_task(messages=messages)

    # 4. Semantic memory search (vector + BM25 hybrid retrieval)
    result = await reme.memory_search(query="Python version preference", max_results=5)

    # 5. Get an in-memory instance (ReMeInMemoryMemory, manages a single conversation
    #    context; extends AgentScope InMemoryMemory)
    memory = reme.get_in_memory_memory()
    token_stats = await memory.estimate_tokens()
    print(f"Current context usage: {token_stats['context_usage_ratio']:.1f}%")
    print(f"Message tokens: {token_stats['messages_tokens']}")
    print(f"Estimated total tokens: {token_stats['estimated_tokens']}")

    # 6. Wait for background tasks before closing
    summary_result = await reme.await_summary_tasks()

    # Close ReMeLight
    await reme.close()


if __name__ == "__main__":
    asyncio.run(main())
```

ReMe (vector based) is the core class of the vector-based memory system, supporting unified management of three memory types:
| Memory Type | Purpose | Scoping Parameter |
|---|---|---|
| Personal Memory | Record user preferences and habits | `user_name` |
| Task/Procedural Memory | Record task execution experience, success/failure patterns | `task_name` |
| Tool Memory | Record tool usage experience, parameter optimization | `tool_name` |
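The scoping parameter you pass determines which memory type a call operates on. A small illustrative sketch of that mapping (the function is hypothetical, not part of the ReMe API):

```python
def classify_memory_scope(user_name=None, task_name=None, tool_name=None) -> str:
    """Illustrative only: map the scoping parameter to the memory type it selects,
    following the table above. Not a real ReMe function."""
    if user_name is not None:
        return "personal"
    if task_name is not None:
        return "task"
    if tool_name is not None:
        return "tool"
    raise ValueError("one of user_name / task_name / tool_name is required")
```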
| Method | Function | Description |
|---|---|---|
| `summarize_memory` | 🧠 Memory summarization | Auto-extract and store memories from conversations |
| `retrieve_memory` | 🔍 Memory retrieval | Retrieve relevant memories based on a query |
| `add_memory` | ➕ Add memory | Manually add a memory to the vector store |
| `get_memory` | 📖 Get memory | Get a single memory by ID |
| `update_memory` | ✏️ Update memory | Update existing memory content or metadata |
| `delete_memory` | 🗑️ Delete memory | Delete a specified memory |
| `list_memory` | 📋 List memories | List memories by type, with filtering and sorting |
```python
import asyncio

from reme import ReMe


async def main():
    # Initialize ReMe
    reme = ReMe(
        working_dir=".reme",
        default_llm_config={
            "backend": "openai",
            "model_name": "qwen3.5-plus",
        },
        default_embedding_model_config={
            "backend": "openai",
            "model_name": "text-embedding-v4",
            "dimensions": 1024,
        },
        default_vector_store_config={
            "backend": "local",  # Supports local / chroma / qdrant / elasticsearch
        },
    )
    await reme.start()

    messages = [
        {"role": "user", "content": "Help me write a Python script", "time_created": "2026-02-28 10:00:00"},
        {"role": "assistant", "content": "OK, let me help you", "time_created": "2026-02-28 10:00:05"},
    ]

    # 1. Summarize memories from the conversation (auto-extract user preferences,
    #    task experience, etc.)
    result = await reme.summarize_memory(
        messages=messages,
        user_name="alice",           # Personal memory
        # task_name="code_writing",  # Task memory
    )
    print(f"Summary result: {result}")

    # 2. Retrieve relevant memories
    memories = await reme.retrieve_memory(
        query="Python programming",
        user_name="alice",
        # task_name="code_writing",
    )
    print(f"Retrieval result: {memories}")

    # 3. Manually add a memory
    memory_node = await reme.add_memory(
        memory_content="User prefers concise code style",
        user_name="alice",
    )
    print(f"Added memory: {memory_node}")
    memory_id = memory_node.memory_id

    # 4. Get a single memory by ID
    fetched_memory = await reme.get_memory(memory_id=memory_id)
    print(f"Fetched memory: {fetched_memory}")

    # 5. Update memory content
    updated_memory = await reme.update_memory(
        memory_id=memory_id,
        user_name="alice",
        memory_content="User prefers concise code style with comments",
    )
    print(f"Updated memory: {updated_memory}")

    # 6. List all of the user's memories (supports filtering and sorting)
    all_memories = await reme.list_memory(
        user_name="alice",
        limit=10,
        sort_key="time_created",
        reverse=True,
    )
    print(f"User memory list: {all_memories}")

    # 7. Delete a specified memory
    await reme.delete_memory(memory_id=memory_id)
    print(f"Deleted memory: {memory_id}")

    # 8. Delete all memories (use with caution)
    # await reme.delete_all()

    await reme.close()


if __name__ == "__main__":
    asyncio.run(main())
```

CoPaw's MemoryManager inherits from ReMeLight, integrating memory capabilities into the Agent reasoning flow:
```mermaid
graph TB
    CoPaw["CoPaw MemoryManager\n(inherits ReMeLight)"] -->|pre_reasoning hook| Hook[MemoryCompactionHook]
    CoPaw --> ReMeLight[ReMeLight]
    Hook -->|exceeds threshold| ReMeLight
    ReMeLight --> CompactMemory[compact_memory\nHistory Compression]
    ReMeLight --> SummaryMemory[summary_memory\nWrite Memory to Files]
    ReMeLight --> CompactToolResult[compact_tool_result\nLong Tool Output Compression]
    ReMeLight --> MemSearch[memory_search\nSemantic Search]
    ReMeLight --> InMemory[get_in_memory_memory\nReMeInMemoryMemory]
    CompactMemory --> Compactor[Compactor\nReActAgent]
    SummaryMemory --> Summarizer[Summarizer\nReActAgent + File Tools]
    CompactToolResult --> ToolResultCompactor[ToolResultCompactor\nTruncate + Save to File]
    Summarizer --> FileIO[FileIO\nread / write / edit]
    FileIO --> MemoryFiles[memory/YYYY-MM-DD.md]
    ToolResultCompactor --> ToolResultFiles[tool_result/*.txt]
    MemoryFiles -.->|file changes| FileWatcher[Async File Watcher]
    FileWatcher -->|update index| FileStore[Local Database]
    MemSearch --> FileStore
```
MemoryCompactionHook checks context token usage before each reasoning step, automatically triggering compression when threshold is exceeded:
```mermaid
graph LR
    A[pre_reasoning] --> B{Token exceeds threshold?}
    B -->|No| Z[Continue reasoning]
    B -->|Yes| C[compact_tool_result\nCompress long tool outputs in recent messages]
    C --> D[compact_memory\nGenerate structured context checkpoint]
    D --> E[Mark old messages as COMPRESSED]
    E --> F[add_async_summary_task\nBackground write to memory files]
    F --> Z
```
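The trigger condition in the hook reduces to a simple threshold comparison. A minimal sketch of that check, assuming the `max_input_length` and `memory_compact_ratio` parameters documented earlier (this is a paraphrase of the described behavior, not ReMe's actual source):

```python
def should_compact(
    estimated_tokens: int,
    max_input_length: int = 128_000,
    memory_compact_ratio: float = 0.7,
) -> bool:
    """Sketch of the MemoryCompactionHook trigger described above:
    compress once usage exceeds max_input_length * memory_compact_ratio."""
    return estimated_tokens > max_input_length * memory_compact_ratio


# With the defaults, the threshold is 128000 * 0.7 = 89600 tokens.
print(should_compact(100_000))  # above threshold
print(should_compact(50_000))   # below threshold
```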
Compactor uses ReActAgent to compress conversation history into structured context checkpoints:
| Field | Description |
|---|---|
| `## Goal` | 🎯 Goals the user wants to accomplish (can be multiple) |
| `## Constraints` | ⚙️ Constraints and preferences mentioned by the user |
| `## Progress` | 📈 Completed / in-progress / blocked tasks |
| `## Key Decisions` | 🔑 Decisions made, with brief rationale |
| `## Next Steps` | 🗺️ Next action plan (ordered list) |
| `## Critical Context` | 📌 Key data such as file paths, function names, error messages |
Supports incremental updates: when `previous_summary` is provided, the new conversation is automatically merged with the old summary, preserving historical progress.
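To show the checkpoint shape concretely, here is an illustrative renderer for the six-section format above. In ReMe the checkpoint text is generated by the Compactor's ReActAgent, not by a template function like this one.

```python
# Section order follows the checkpoint field table above.
CHECKPOINT_SECTIONS = [
    "Goal", "Constraints", "Progress",
    "Key Decisions", "Next Steps", "Critical Context",
]


def render_checkpoint(fields: dict) -> str:
    """Assemble a context checkpoint in the six-section markdown format above.
    Illustrative only: the real checkpoint is written by an LLM agent."""
    parts = []
    for section in CHECKPOINT_SECTIONS:
        parts.append(f"## {section}\n{fields.get(section, '(none)')}")
    return "\n\n".join(parts)


print(render_checkpoint({"Goal": "Ship v1", "Next Steps": "1. write tests"}))
```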
ToolResultCompactor solves the problem of context bloat caused by overly long tool outputs:
```mermaid
graph LR
    A[tool_result message] --> B{Content length > threshold?}
    B -->|No| C[Keep as is]
    B -->|Yes| D[Truncate to threshold characters]
    D --> E[Write full content to tool_result/uuid.txt]
    E --> F[Append file reference path to message]
```
Expired files (older than `retention_days`) are automatically cleaned up during `start` / `close` / `compact_tool_result`.
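The truncate-and-save flow can be sketched in a few lines. The function and reference format below are illustrative assumptions, not ReMe's internal API; only the overall flow (pass through short outputs, spill long ones to `tool_result/<uuid>.txt` and keep a reference) matches the diagram above.

```python
import uuid
from pathlib import Path


def compact_tool_output(content: str, threshold: int = 1000,
                        result_dir: str = "tool_result") -> str:
    """Sketch of the truncate-and-save flow above (names are illustrative).
    Outputs at or under the threshold pass through unchanged."""
    if len(content) <= threshold:
        return content
    Path(result_dir).mkdir(exist_ok=True)
    path = Path(result_dir) / f"{uuid.uuid4()}.txt"
    path.write_text(content, encoding="utf-8")  # full output kept on disk
    # Keep a truncated prefix plus a file reference in the message
    return content[:threshold] + f"\n[truncated; full output: {path}]"
```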
Summarizer uses the ReAct + File Tools pattern, letting AI autonomously decide what to write and where:
```mermaid
graph LR
    A[Receive conversation] --> B{Think: What's worth recording?}
    B --> C[Act: read memory/YYYY-MM-DD.md]
    C --> D{Think: How to merge with existing content?}
    D --> E[Act: edit to update file]
    E --> F{Think: Anything missed?}
    F -->|Yes| B
    F -->|No| G[Complete]
```
FileIO provides file operation tools:
| Tool | Function | Use Case |
|---|---|---|
| `read` | Read file content (supports line ranges) | View existing memories, avoid duplicate writes |
| `write` | Overwrite a file | Create new memory files or do major restructuring |
| `edit` | Replace after exact match | Append new content or modify specific sections |
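A minimal sketch of the exact-match `edit` semantics, assuming the common convention that the edit fails unless the target string occurs exactly once (the function name and once-only rule are assumptions, not ReMe's FileIO implementation):

```python
from pathlib import Path


def edit_file_exact(path, old: str, new: str) -> None:
    """Sketch of an exact-match edit tool (illustrative): refuse ambiguous
    edits by requiring `old` to occur exactly once in the file."""
    p = Path(path)
    text = p.read_text(encoding="utf-8")
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match for edit, found {count}")
    p.write_text(text.replace(old, new), encoding="utf-8")
```

The exact-match requirement is what makes `edit` safe for an LLM to use: it either lands precisely or fails loudly, instead of silently rewriting the wrong section.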
ReMeInMemoryMemory extends AgentScope's InMemoryMemory:
| Feature | Description |
|---|---|
| `get_memory` | Filter messages by tag, auto-prepend the compression summary at the head |
| `estimate_tokens` | Precisely estimate current context token usage and utilization |
| `get_history_str` | Generate a human-readable conversation history summary (with token stats) |
| `state_dict` / `load_state_dict` | State serialization / deserialization (session persistence) |
| `mark_messages_compressed` | Mark messages as compressed |
| `get_compressed_summary` | Get the compressed summary content |
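As a rough illustration of what `estimate_tokens` reports, here is a naive stand-in using a ~4 characters/token heuristic. ReMe's actual implementation is described as precise (tokenizer-based); this sketch only shows the shape of the returned stats used in the quick-start example.

```python
def estimate_context_usage(messages, max_input_length: int = 128_000,
                           chars_per_token: int = 4) -> dict:
    """Rough stand-in for estimate_tokens (illustration only; ReMe's version
    is precise). Returns the same kinds of stats printed in the quick-start."""
    messages_tokens = sum(len(m.get("content", "")) // chars_per_token for m in messages)
    return {
        "messages_tokens": messages_tokens,
        "estimated_tokens": messages_tokens,
        "context_usage_ratio": 100.0 * messages_tokens / max_input_length,
    }


stats = estimate_context_usage([{"content": "x" * 400}])
print(f"Current context usage: {stats['context_usage_ratio']:.3f}%")
```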
MemorySearch provides Vector + BM25 hybrid retrieval capabilities:
| Retrieval Method | Advantage | Disadvantage |
|---|---|---|
| Vector Semantic | Captures semantically similar but differently worded content | Weak on exact token matching |
| BM25 Full-text | Excellent for exact token hits | Cannot understand synonyms and paraphrases |
Fusion mechanism: after dual-path recall, a weighted sum is applied (vector 0.7 + BM25 0.3), so both natural-language queries and exact-keyword lookups can hit.
```mermaid
graph LR
    Q[Search Query] --> V[Vector Search × 0.7]
    Q --> B[BM25 × 0.3]
    V --> M[Dedupe + Weighted Fusion]
    B --> M
    M --> R[Top-N Results]
```
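The dedupe + weighted-fusion step above can be sketched as follows. This assumes each retrieval path returns a `{doc_id: normalized_score}` mapping (an assumption for illustration; only the 0.7 / 0.3 weights come from the text):

```python
def hybrid_fuse(vector_hits: dict, bm25_hits: dict,
                w_vec: float = 0.7, w_bm25: float = 0.3, top_n: int = 5):
    """Sketch of dedupe + weighted fusion: a document recalled by both paths
    accumulates both weighted scores, so agreement between paths boosts rank."""
    fused: dict = {}
    for doc_id, score in vector_hits.items():
        fused[doc_id] = fused.get(doc_id, 0.0) + w_vec * score
    for doc_id, score in bm25_hits.items():
        fused[doc_id] = fused.get(doc_id, 0.0) + w_bm25 * score
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

Note that a document found by both paths ("b" scoring on vector and BM25, say) outranks a document found by only one path with a similar single-path score.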
```bash
pip install -U reme-ai
```

API keys are set via environment variables and can be written in a `.env` file in the project root:
| Environment Variable | Description | Example |
|---|---|---|
| `REME_LLM_API_KEY` | LLM API key | `sk-xxx` |
| `REME_LLM_BASE_URL` | LLM base URL | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
| `REME_EMBEDDING_API_KEY` | Embedding API key | `sk-xxx` |
| `REME_EMBEDDING_BASE_URL` | Embedding base URL | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
```mermaid
graph TB
    User[User / Agent] --> ReMe[Vector Based ReMe]
    ReMe --> Summarize[Memory Summarization]
    ReMe --> Retrieve[Memory Retrieval]
    ReMe --> CRUD[CRUD Operations]
    Summarize --> PersonalSum[PersonalSummarizer]
    Summarize --> ProceduralSum[ProceduralSummarizer]
    Summarize --> ToolSum[ToolSummarizer]
    Retrieve --> PersonalRet[PersonalRetriever]
    Retrieve --> ProceduralRet[ProceduralRetriever]
    Retrieve --> ToolRet[ToolRetriever]
    PersonalSum --> VectorStore[Vector Database]
    ProceduralSum --> VectorStore
    ToolSum --> VectorStore
    PersonalRet --> VectorStore
    ProceduralRet --> VectorStore
    ToolRet --> VectorStore
```
- Star & Watch: Star helps more agent developers discover ReMe; Watch keeps you informed about new releases and features.
- Share Your Work: Share what ReMe unlocked for your agent in Issues or Discussions — we'd love to showcase community achievements.
- Need a Feature? Submit a Feature Request, and we'll work with the community to improve it.
- Code Contributions: All forms of code contributions are welcome; please see the Contribution Guide.
- Acknowledgments: Thanks to OpenClaw, Mem0, MemU, CoPaw, and other excellent open-source projects for their inspiration and help.
```bibtex
@software{AgentscopeReMe2025,
  title  = {AgentscopeReMe: Memory Management Kit for Agents},
  author = {ReMe Team},
  url    = {https://reme.agentscope.io},
  year   = {2025}
}
```

This project is open-sourced under the Apache License 2.0; see the LICENSE file for details.
