Commit a5322fe

Nyakult, 2Rant, and fridayL authored
Eval scripts (#377)
* feat: check nodes existence
* feat: use different template for different language input
* feat: use different template for different language input
* fix: eval script
* feat: memos-api eval scripts
* feat: mem reader
* feat: implement prefeval memos-api evaluation scripts
* refactor: format code
* feat: add PersonaMem eval scripts
* docs(evaluation): update PersonaMem eval readme
* feat: memos-api ingest batch message
* feat: refactor search
* feat: refactor search
* update: add api for memory
* feat: add memory api return memory and memory type
* refactor(server): restructure the server routing module to optimize memory management
* format: ruff format code
* feat(server): increase the LLM max token count
* test
* fix: user query embedding for search
* count memory_size by user
* fix(server): fix list flattening issue in the memory reading logic
* feat(nebular): optimize graph database query performance
* refactor(memory): removed the call to the `_refresh_memory_size` method; kept the original logic so it can be restored or refactored later
* feat: remove user idx_memory_user_name
* feat(graph): optimize Nebula graph database query performance
* feat: rollback remove_oldest_memory
* feat: nebula gql add index
* feat: align code
* feat: update memos_api
* feat: update memos_api
* feat: update default options
* feat: memory client
* feat: refactor lme
* feat: memu & supermemory client
* feat: locomo memu
* feat: locomo supermemory
* New 'add' and 'process' modes.
* feat: lme supermemory & memu
* feat: default args
* api and local
* api and local
* memobase fix
* memos fix
* default args
* fix memos-api search data
* prefeval pipeline
* fix lme memos-api
* personamem pipeline
* personamem pipeline
* lme scripts
* align dev
* format code
* refactor: remove old files
* format code
* pm and prefeval pipeline
* format code
* format code
* pm and prefeval pipeline
* pm and prefeval pipeline
* format code
* format code

---------

Co-authored-by: 2Rant <[email protected]>
Co-authored-by: fridayL <[email protected]>
1 parent cdc3bd8 commit a5322fe

31 files changed: +16713 -1776 lines changed

evaluation/.env-example

Lines changed: 10 additions & 1 deletion
@@ -1,3 +1,4 @@
+# memory process model
 MODEL="gpt-4o-mini"
 OPENAI_API_KEY="sk-***REDACTED***"
 OPENAI_BASE_URL="http://***.***.***.***:3000/v1"
@@ -6,10 +7,18 @@ MEM0_API_KEY="m0-***REDACTED***"
 
 ZEP_API_KEY="z_***REDACTED***"
 
+# response model
 CHAT_MODEL="gpt-4o-mini"
 CHAT_MODEL_BASE_URL="http://***.***.***.***:3000/v1"
 CHAT_MODEL_API_KEY="sk-***REDACTED***"
 
+MEMOS_KEY="Token mpg-xxxxx"
+MEMOS_URL="https://apigw-pre.memtensor.cn/api/openmem/v1"
+PRE_SPLIT_CHUNK=false # pre split chunk in client end
+
+MEMOBASE_API_KEY="xxxxx"
+MEMOBASE_PROJECT_URL="http://xxx.xxx.xxx.xxx:8019"
+
 # Configuration Only For Scheduler
 # RabbitMQ Configuration
 MEMSCHEDULER_RABBITMQ_HOST_NAME=rabbitmq-cn-***.cn-***.amqp-32.net.mq.amqp.aliyuncs.com
@@ -29,4 +38,4 @@ MEMSCHEDULER_GRAPHDBAUTH_URI=bolt://localhost:7687
 MEMSCHEDULER_GRAPHDBAUTH_USER=neo4j
 MEMSCHEDULER_GRAPHDBAUTH_PASSWORD=***
 MEMSCHEDULER_GRAPHDBAUTH_DB_NAME=neo4j
-MEMSCHEDULER_GRAPHDBAUTH_AUTO_CREATE=true
+MEMSCHEDULER_GRAPHDBAUTH_AUTO_CREATE=true
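
Note on the new keys: before a run, the variables this diff adds to `.env-example` could be sanity-checked with something like the sketch below. It is an illustrative stand-alone parser, not code from this commit. `parse_env`, `missing_keys`, and `as_bool` are hypothetical helper names, the parser only handles the simple `KEY=value` shapes visible in the diff, and how the evaluation scripts actually load these variables is not shown here.

```python
# Hypothetical pre-flight check for the keys added to .env-example.
# Assumption: values are either quoted, or unquoted with an optional " # comment".

NEW_KEYS = (
    "MEMOS_KEY",
    "MEMOS_URL",
    "PRE_SPLIT_CHUNK",
    "MEMOBASE_API_KEY",
    "MEMOBASE_PROJECT_URL",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and full-line comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        if value[:1] in ('"', "'"):
            # Quoted value: keep only what sits between the matching quotes.
            quote = value[0]
            end = value.find(quote, 1)
            value = value[1:end] if end != -1 else value[1:]
        else:
            # Unquoted value: drop a trailing " # comment".
            value = value.split(" #", 1)[0].strip()
        values[key.strip()] = value
    return values


def missing_keys(env: dict[str, str], required=NEW_KEYS) -> list[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    return [key for key in required if not env.get(key)]


def as_bool(value: str) -> bool:
    """Interpret flag-style values such as PRE_SPLIT_CHUNK=false."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


SAMPLE = '''
MEMOS_KEY="Token mpg-xxxxx"
MEMOS_URL="https://apigw-pre.memtensor.cn/api/openmem/v1"
PRE_SPLIT_CHUNK=false # pre split chunk in client end
MEMOBASE_API_KEY="xxxxx"
'''

if __name__ == "__main__":
    env = parse_env(SAMPLE)
    print(missing_keys(env))  # ['MEMOBASE_PROJECT_URL']
    print(as_bool(env["PRE_SPLIT_CHUNK"]))  # False
```

The sample deliberately omits `MEMOBASE_PROJECT_URL` to show how a missing key is reported. In practice the text would come from reading the real `.env` file.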

evaluation/README.md

Lines changed: 18 additions & 0 deletions
@@ -34,3 +34,21 @@ This repository provides tools and scripts for evaluating the LoCoMo dataset usi
 ```
 
 ✍️ For evaluating OpenAI's native memory feature with the LoCoMo dataset, please refer to the detailed guide: [OpenAI Memory on LoCoMo - Evaluation Guide](./scripts/locomo/openai_memory_locomo_eval_guide.md).
+
+### LongMemEval Evaluation
+First prepare the dataset `longmemeval_s` from https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned
+, and save it as `data/longmemeval/longmemeval_s.json`
+
+```bash
+# Edit the configuration in ./scripts/run_lme_eval.sh
+# Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
+./scripts/run_lme_eval.sh
+```
+
+### prefEval Evaluation
+
+### personaMem Evaluation
+get `questions_32k.csv` and `shared_contexts_32k.jsonl` from https://huggingface.co/datasets/bowen-upenn/PersonaMem and save them at `data/personamem/`
+```bash
+./scripts/run_pm_eval.sh
+```
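
Note on dataset placement: the README additions above name three files that must exist before `run_lme_eval.sh` and `run_pm_eval.sh` are run. A small check along the following lines could confirm they are in place. This is a sketch under assumptions, not part of the commit: `EXPECTED_FILES` and `missing_datasets` are hypothetical names, and the paths are taken as relative to the `evaluation/` directory, which matches the `evaluation/data/personamem/.gitkeep` file added in this commit.

```python
from pathlib import Path

# File locations as stated in the README diff (relative to evaluation/).
EXPECTED_FILES = {
    "LongMemEval": ["data/longmemeval/longmemeval_s.json"],
    "PersonaMem": [
        "data/personamem/questions_32k.csv",
        "data/personamem/shared_contexts_32k.jsonl",
    ],
}


def missing_datasets(root: Path) -> dict[str, list[str]]:
    """Map each benchmark to the expected files that are not present under root."""
    missing: dict[str, list[str]] = {}
    for benchmark, rel_paths in EXPECTED_FILES.items():
        absent = [rel for rel in rel_paths if not (root / rel).is_file()]
        if absent:
            missing[benchmark] = absent
    return missing


if __name__ == "__main__":
    # Assumes the script is started from the repository root.
    report = missing_datasets(Path("evaluation"))
    if not report:
        print("All dataset files found.")
    for benchmark, files in report.items():
        print(f"{benchmark}: missing {', '.join(files)}")
```

The check only verifies that the files exist; it does not validate their contents or download anything.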

evaluation/data/personamem/.gitkeep

Whitespace-only changes.

evaluation/scripts/PrefEval/irrelevant_conv.py

Lines changed: 13870 additions & 0 deletions
Large diffs are not rendered by default.