Merged
65 commits
d9f863e
fix: sqlite list users error (#384)
fridayL Oct 23, 2025
b5ea7e6
feat: introduce async memory add for TreeTextMemory using MemSchedule…
CaralHsi Oct 23, 2025
6efe419
add pm and pref eval scripts (#385)
Nyakult Oct 24, 2025
651e8df
Meger update about scheduler and new api to Dev (#386)
tangg555 Oct 25, 2025
0b2b6ed
Feat/merge inst cplt to dev (#388)
Wang-Daoji Oct 25, 2025
f6e96d5
Feat: add reranker strategies and update configs (#390)
fridayL Oct 27, 2025
e069928
modify code in evaluation (#392)
Wang-Daoji Oct 27, 2025
84adda6
fix bug in pref_mem return (#399)
Wang-Daoji Oct 27, 2025
ce34bd1
add polardb (#395)
wustzdy Oct 27, 2025
83a7c34
feat: fix mode (#400)
lijicode Oct 28, 2025
018d759
Feat: remove long waring for internet and add content for memreader (…
fridayL Oct 28, 2025
3680286
feat: redis for sync history memories and new api of mixture search (…
tangg555 Oct 28, 2025
5ff29d1
memos online api eval scripts and readme (#403)
Nyakult Oct 28, 2025
1f6757d
feat: fix sources (#404)
lijicode Oct 28, 2025
e21f5bb
fix porlar (#406)
lijicode Oct 29, 2025
d79647e
Feat/arms (#402)
CarltonXiang Oct 29, 2025
f8859f1
Hotfix: memos playground prompt reverse (#408)
fridayL Oct 29, 2025
7eb531b
Feat/pref optimize update (#409)
Wang-Daoji Oct 29, 2025
4ed7574
feat: fix polardb graph (#411)
lijicode Oct 29, 2025
fef40e9
feat: async add api (#410)
CaralHsi Oct 29, 2025
6e219c4
use nacos (#407)
lijicode Oct 29, 2025
f74ea76
feat: async add api (#413)
CaralHsi Oct 29, 2025
5923001
revision of mixture api: add conversation turn and reduce 2 stage ran…
tangg555 Oct 29, 2025
a375911
Feat: add recall strategy (#414)
whipser030 Oct 29, 2025
5b8893e
Revert "Feat: add recall strategy " (#415)
CaralHsi Oct 30, 2025
445c597
Feat: add new recall and verify (#416)
fridayL Oct 30, 2025
0765e1c
Feat: remove usage data (#417)
fridayL Oct 30, 2025
39a4f29
feat: add moniter schedule (#419)
CaralHsi Oct 30, 2025
a4d1e7b
feat:turn off graph call (#418)
whipser030 Oct 30, 2025
87e2699
pm & prefEval scripts updates (#421)
Nyakult Oct 30, 2025
81c7ad9
add polardb pool (#420)
wustzdy Oct 30, 2025
25c7642
Feat/pref optimize update (#422)
Wang-Daoji Oct 30, 2025
0e7128e
fix:tree file change Searcher inputs (#423)
whipser030 Oct 30, 2025
aa80863
Feat/pref optimize update (#425)
Wang-Daoji Oct 30, 2025
8be2e80
Fix/query schedule (#424)
CaralHsi Oct 30, 2025
28cf578
fix: message schema bug (#426)
CaralHsi Oct 30, 2025
af89531
fix commit (#427)
wustzdy Oct 30, 2025
9c5d9fb
Feat/pref optimize update (#429)
Wang-Daoji Oct 30, 2025
c7e9af4
Feat/pref optimize update (#431)
Wang-Daoji Oct 31, 2025
387fe8a
Feat/pref optimize update (#432)
Wang-Daoji Nov 3, 2025
4f96241
Merge branch 'main' into dev
CaralHsi Nov 3, 2025
b3ec17a
Feat/add request log (#439)
CarltonXiang Nov 3, 2025
b3b0baa
Feat/standardized preference field (#440)
Wang-Daoji Nov 3, 2025
cef9369
update polardb pool timeout (#441)
wustzdy Nov 3, 2025
fd56f64
feat: fix self-input prompt error (#443)
fridayL Nov 3, 2025
e79a9ab
feat: fix polardb value (#445)
lijicode Nov 3, 2025
9ea42e4
Feat/add request log (#442)
CarltonXiang Nov 3, 2025
0635127
Feat/revert request context (#446)
CarltonXiang Nov 4, 2025
4c8f89a
fix prompt error (#447)
Wang-Daoji Nov 4, 2025
88699f9
fix: fix search config input bug; patch retrieve_utils path set; adju…
whipser030 Nov 4, 2025
fafc747
eval result (#428)
Nyakult Nov 4, 2025
e62f72d
Useless quotes (#450)
wustzdy Nov 4, 2025
f3e7338
Revert "fix prompt error" (#452)
CaralHsi Nov 4, 2025
b8cd27b
Revert "fix: fix search config input bug; patch retrieve_utils path s…
CaralHsi Nov 4, 2025
e07a1b4
Revert "Useless quotes" (#454)
CaralHsi Nov 4, 2025
f67ca36
fix prompt error (#455)
Wang-Daoji Nov 5, 2025
7c4a74c
Feat/remove pref rank prompt (#456)
Wang-Daoji Nov 5, 2025
a0f3a00
fix: fix strategy reader input; code reformat (#457)
whipser030 Nov 5, 2025
65a2daf
Dev ccl1103 (#449)
lijicode Nov 5, 2025
b37939c
Feat/fix bug 1031 (#459)
Wang-Daoji Nov 5, 2025
ccbffae
Fix/no response (#463)
CarltonXiang Nov 6, 2025
4e500a9
doc: Update readme (#458)
Nyakult Nov 6, 2025
5e4f695
Fix/no response (#464)
CarltonXiang Nov 6, 2025
f640855
Fix: fix pg query for group error string (#465)
fridayL Nov 6, 2025
119bbe2
feat: freeze usage update in Searcher (#466)
CaralHsi Nov 6, 2025
1 change: 1 addition & 0 deletions .gitignore
evaluation/.env
!evaluation/configs-example/*.json
evaluation/configs/*
**tree_textual_memory_locomo**
**script.py**
.env
evaluation/scripts/personamem

49 changes: 34 additions & 15 deletions README.md

## 📈 Performance Benchmark

MemOS demonstrates significant improvements over baseline memory solutions in multiple memory tasks,
showcasing its capabilities in **information extraction**, **temporal and cross-session reasoning**, and **personalized preference responses**.

| Model | LOCOMO | LongMemEval | PrefEval-10 | PersonaMem |
|-----------------|-------------|-------------|-------------|-------------|
| **GPT-4o-mini** | 52.75 | 55.4 | 2.8 | 43.46 |
| **MemOS** | **75.80** | **77.80** | **71.90** | **61.17** |
| **Improvement** | **+43.70%** | **+40.43%** | **+2468%** | **+40.75%** |
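The improvement row follows the standard relative-improvement formula. As a quick sanity check (a minimal sketch using three of the columns, with scores taken from the table above):

```python
def relative_improvement(baseline: float, score: float) -> float:
    """Percent improvement of score over baseline: (score - baseline) / baseline * 100."""
    return (score - baseline) / baseline * 100

# Scores from the table above (GPT-4o-mini vs. MemOS)
for name, base, ours in [("LOCOMO", 52.75, 75.80),
                         ("LongMemEval", 55.4, 77.80),
                         ("PersonaMem", 43.46, 61.17)]:
    print(f"{name}: {relative_improvement(base, ours):+.2f}%")
# → LOCOMO: +43.70%
# → LongMemEval: +40.43%
# → PersonaMem: +40.75%
```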

### Detailed Evaluation Results
- We use gpt-4o-mini as the processing and judging LLM, and bge-m3 as the embedding model in the MemOS evaluation.
- The evaluation was conducted with settings aligned as closely as possible across methods. Reproduce the results with our scripts at [`evaluation`](./evaluation).
- Full search and response details are available on Hugging Face: https://huggingface.co/datasets/MemTensor/MemOS_eval_result.
> 💡 **MemOS outperforms all other methods (Mem0, Zep, Memobase, SuperMemory, etc.) across all benchmarks!**

## ✨ Key Features


## 🚀 Getting Started

### ⭐️ MemOS online API
The easiest way to use MemOS. Equip your agent with memory **in minutes**!

Sign up and get started on the [`MemOS dashboard`](https://memos-dashboard.openmem.net/cn/quickstart/?source=landing).


### Self-Hosted Server
1. Clone the repository and install the server dependencies.
```bash
git clone https://github.com/MemTensor/MemOS.git
cd MemOS
pip install -r ./docker/requirements.txt
```

2. Fill in `docker/.env.example` and copy it to `MemOS/.env`.
3. Start the service.
```bash
uvicorn memos.api.server_api:app --host 0.0.0.0 --port 8001 --workers 8
```
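Before launching, it can help to sanity-check that the copied `.env` actually defines the keys the server needs. A minimal sketch, assuming the file uses plain `KEY=VALUE` lines; the required-key list here is illustrative, not the server's actual requirements:

```python
# Illustrative subset -- the real server may require more (or different) keys.
REQUIRED_KEYS = {"OPENAI_API_KEY", "OPENAI_API_BASE", "NEO4J_URI"}

def load_env(path: str) -> dict[str, str]:
    """Parse a simple KEY=VALUE .env file, skipping blanks and # comment lines."""
    env: dict[str, str] = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop inline comments and surrounding quotes.
            value = value.split(" #")[0].strip().strip('"')
            env[key.strip()] = value
    return env

def missing_keys(env: dict[str, str]) -> set[str]:
    """Return required keys that are absent or empty."""
    return REQUIRED_KEYS - {k for k, v in env.items() if v}
```

Running `missing_keys(load_env(".env"))` before starting `uvicorn` surfaces configuration gaps early instead of at first request.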

### Local SDK
Here's a quick example of how to create a **`MemCube`**, load it from a directory, access its memories, and save it.

```python
for item in mem_cube.act_mem.get_all():
mem_cube.dump("tmp/mem_cube")
```

**`MOS`** (Memory Operating System) is a higher-level orchestration layer that manages multiple MemCubes and provides a unified API for memory operations. Here's a quick example of how to use MOS:

```python
from memos.configs.mem_os import MOSConfig
69 changes: 50 additions & 19 deletions docker/.env.example
@@ -1,29 +1,60 @@
# MemOS Environment Variables Configuration
TZ=Asia/Shanghai

MOS_CUBE_PATH="/tmp/data_test" # Path to memory storage (e.g. /tmp/data_test)
MOS_ENABLE_DEFAULT_CUBE_CONFIG="true" # Enable default cube config (true/false)

# OpenAI Configuration
OPENAI_API_KEY="sk-xxx" # Your OpenAI API key
OPENAI_API_BASE="http://xxx" # OpenAI API base URL (default: https://api.openai.com/v1)

# MemOS Chat Model Configuration
MOS_CHAT_MODEL=gpt-4o-mini
MOS_CHAT_TEMPERATURE=0.8
MOS_MAX_TOKENS=8000
MOS_TOP_P=0.9
MOS_TOP_K=50
MOS_CHAT_MODEL_PROVIDER=openai

# Graph DB (Neo4j)
NEO4J_BACKEND=xxx
NEO4J_URI=bolt://xxx
NEO4J_USER=xxx
NEO4J_PASSWORD=xxx
MOS_NEO4J_SHARED_DB=xxx
NEO4J_DB_NAME=xxx

# text memory reorganize
MOS_ENABLE_REORGANIZE=false

# MemOS User Configuration
MOS_USER_ID=root
MOS_SESSION_ID=default_session
MOS_MAX_TURNS_WINDOW=20

# MemRader Configuration
MEMRADER_MODEL=gpt-4o-mini
MEMRADER_API_KEY=sk-xxx
MEMRADER_API_BASE=http://xxx:3000/v1
MEMRADER_MAX_TOKENS=5000

# Embedding & Rerank
EMBEDDING_DIMENSION=1024
MOS_EMBEDDER_BACKEND=universal_api
MOS_EMBEDDER_MODEL=bge-m3
MOS_EMBEDDER_API_BASE=http://xxx
MOS_EMBEDDER_API_KEY=EMPTY
MOS_RERANKER_BACKEND=http_bge
MOS_RERANKER_URL=http://xxx
# Ollama Configuration (for embeddings)
#OLLAMA_API_BASE=http://xxx

# milvus for pref mem
MILVUS_URI=http://xxx
MILVUS_USER_NAME=xxx
MILVUS_PASSWORD=xxx

# pref mem
ENABLE_PREFERENCE_MEMORY=true
RETURN_ORIGINAL_PREF_MEM=true
2 changes: 1 addition & 1 deletion docker/requirements.txt
Expand Up @@ -157,4 +157,4 @@ volcengine-python-sdk==4.0.6
watchfiles==1.1.0
websockets==15.0.1
xlrd==2.0.2
xlsxwriter==3.2.5
8 changes: 7 additions & 1 deletion docs/openapi.json
"type": "string",
"title": "Session Id",
"description": "Session ID for the MOS. This is used to distinguish between different dialogue",
"default": "41bb5e18-252d-4948-918c-07d82aa47086"
},
"chat_model": {
"$ref": "#/components/schemas/LLMConfigFactory",
"description": "Enable parametric memory for the MemChat",
"default": false
},
"enable_preference_memory": {
"type": "boolean",
"title": "Enable Preference Memory",
"description": "Enable preference memory for the MemChat",
"default": false
},
"enable_mem_scheduler": {
"type": "boolean",
"title": "Enable Mem Scheduler",
37 changes: 10 additions & 27 deletions evaluation/.env-example
MODEL="gpt-4o-mini"
OPENAI_API_KEY="sk-***REDACTED***"
OPENAI_BASE_URL="http://***.***.***.***:3000/v1"


# response model
CHAT_MODEL="gpt-4o-mini"
CHAT_MODEL_BASE_URL="http://***.***.***.***:3000/v1"
CHAT_MODEL_API_KEY="sk-***REDACTED***"

# memos
MEMOS_KEY="Token mpg-xxxxx"
PRE_SPLIT_CHUNK=false # pre-split chunks on the client side


MEMOS_URL="http://127.0.0.1:8001"
MEMOS_ONLINE_URL="https://memos.memtensor.cn/api/openmem/v1"

# other memory agents
MEM0_API_KEY="m0-xxx"
ZEP_API_KEY="z_xxx"
MEMU_API_KEY="mu_xxx"
SUPERMEMORY_API_KEY="sm_xxx"
MEMOBASE_API_KEY="xxx"
MEMOBASE_PROJECT_URL="http://***.***.***.***:8019"

42 changes: 36 additions & 6 deletions evaluation/README.md
# Evaluation Memory Framework

This repository provides tools and scripts for evaluating the LoCoMo dataset using various models and APIs.
This repository provides tools and scripts for evaluating the `LoCoMo`, `LongMemEval`, `PrefEval`, and `PersonaMem` datasets using various models and APIs.

## Installation

```

## Configuration

1. Copy the `.env-example` file to `.env`, and fill in the required environment variables according to your environment and API keys.
## Setup MemOS
### Local server
```bash
# modify {project_dir}/.env file and start server
uvicorn memos.api.server_api:app --host 0.0.0.0 --port 8001 --workers 8

# configure {project_dir}/evaluation/.env file
MEMOS_URL="http://127.0.0.1:8001"
```
### Online service
```bash
# get your api key at https://memos-dashboard.openmem.net/cn/quickstart/
# configure {project_dir}/evaluation/.env file
MEMOS_KEY="Token mpg-xxxxx"
MEMOS_ONLINE_URL="https://memos.memtensor.cn/api/openmem/v1"

```

2. Copy the `configs-example/` directory to a new directory named `configs/`, and modify the configuration files inside it as needed. This directory contains model and API-specific settings.
## Supported frameworks
We support `memos-api` and `memos-api-online` in our scripts, and provide unofficial implementations for the following memory frameworks: `zep`, `mem0`, `memobase`, `supermemory`, `memu`.


## Evaluation Scripts

### LoCoMo Evaluation
⚙️ To evaluate the **LoCoMo** dataset using one of the supported memory frameworks, run the following [script](./scripts/run_locomo_eval.sh):

```bash
# Edit the configuration in ./scripts/run_locomo_eval.sh
```

### LongMemEval Evaluation
First prepare the dataset `longmemeval_s` from https://huggingface.co/datasets/x

```bash
./scripts/run_lme_eval.sh
```

### PrefEval Evaluation
Download `benchmark_dataset/filtered_inter_turns.json` from https://github.com/amazon-science/PrefEval/blob/main/benchmark_dataset/filtered_inter_turns.json and save it as `./data/prefeval/filtered_inter_turns.json`.
To evaluate the **PrefEval** dataset, run the following [script](./scripts/run_prefeval_eval.sh):

```bash
# Edit the configuration in ./scripts/run_prefeval_eval.sh
# Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
./scripts/run_prefeval_eval.sh
```

### PersonaMem Evaluation
Get `questions_32k.csv` and `shared_contexts_32k.jsonl` from https://huggingface.co/datasets/bowen-upenn/PersonaMem and save them in `data/personamem/`:
```bash
# Edit the configuration in ./scripts/run_pm_eval.sh
# Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
# If you want to use MIRIX, edit the configuration in ./scripts/personamem/config.yaml
./scripts/run_pm_eval.sh
```
51 changes: 0 additions & 51 deletions evaluation/configs-example/mem_cube_config.json

This file was deleted.
