evaluation/README.md: 4 additions & 3 deletions
@@ -1,6 +1,6 @@
 # Evaluation Memory Framework

-This repository provides tools and scripts for evaluating the LoCoMo dataset using various models and APIs.
+This repository provides tools and scripts for evaluating the `LoCoMo`, `LongMemEval`, `PrefEval`, and `personaMem` datasets using various models and APIs.

 ## Installation
@@ -68,7 +68,8 @@ First prepare the dataset `longmemeval_s` from https://huggingface.co/datasets/x
 ```

 ### PrefEval Evaluation
-To evaluate the **Prefeval** dataset using one of the supported memory frameworks — run the following [script](./scripts/run_prefeval_eval.sh):
+Download `benchmark_dataset/filtered_inter_turns.json` from https://github.com/amazon-science/PrefEval/blob/main/benchmark_dataset/filtered_inter_turns.json and save it as `./data/prefeval/filtered_inter_turns.json`.
+To evaluate the **PrefEval** dataset, run the following [script](./scripts/run_prefeval_eval.sh):

 ```bash
 # Edit the configuration in ./scripts/run_prefeval_eval.sh
@@ -83,4 +84,4 @@ get `questions_32k.csv` and `shared_contexts_32k.jsonl` from https://huggingface
 # Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
 # If you want to use MIRIX, edit the configuration in ./scripts/personamem/config.yaml