**MemOS** is an operating system for Large Language Models (LLMs) that enhances them with long-term memory capabilities. It allows LLMs to store, retrieve, and manage information, enabling more context-aware, consistent, and personalized interactions.
@@ -66,19 +61,12 @@ MemOS demonstrates significant improvements over baseline memory solutions in mu
> 💡 **Temporal reasoning accuracy improved by 159% compared to the OpenAI baseline.**
-
-
### Details of End-to-End Evaluation on LOCOMO
> [!NOTE]
> Comparison of LLM Judge Scores across five major tasks in the LOCOMO benchmark. Each bar shows the mean evaluation score judged by LLMs for a given method-task pair, with standard deviation as error bars. MemOS-0630 consistently outperforms baseline methods (LangMem, Zep, OpenAI, Mem0) across all task types, especially in multi-hop and temporal reasoning scenarios.
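The aggregation described in the note above (a mean judge score per method-task pair, with standard deviation as the error bar) could be computed roughly as follows. This is a minimal sketch, not the repository's evaluation code: the function name `summarize_scores`, the record layout, the method labels, and the sample scores are all hypothetical, and whether the deviation is taken over questions or over repeated runs is an assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_scores(records):
    """Group (method, task, score) records and return {(method, task): (mean, std)}."""
    grouped = defaultdict(list)
    for method, task, score in records:
        grouped[(method, task)].append(score)
    # Population standard deviation; use statistics.stdev for the sample version.
    return {key: (mean(vals), pstdev(vals)) for key, vals in grouped.items()}


# Hypothetical judge scores (1.0 = judged correct, 0.0 = judged incorrect).
# These are illustrative placeholders, not real benchmark results.
records = [
    ("method_a", "temporal", 1.0),
    ("method_a", "temporal", 0.0),
    ("method_a", "temporal", 1.0),
    ("method_a", "temporal", 1.0),
    ("method_b", "temporal", 0.0),
    ("method_b", "temporal", 1.0),
]

summary = summarize_scores(records)
for (method, task), (avg, std) in sorted(summary.items()):
    print(f"{method:10s} {task:10s} mean={avg:.3f} std={std:.3f}")
```

Each resulting `(mean, std)` pair would correspond to one bar and its error bar in a chart like the one the note describes.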
-We welcome contributions from the community! Please read our [contribution guidelines](https://memos.openmem.net/docs/contribution/overview) to get started.
+We welcome contributions from the community! Please read our [contribution guidelines](https://memos-docs.openmem.net/contribution/overview) to get started.
evaluation/README.md (+3 -1: 3 additions, 1 deletion)
@@ -25,10 +25,12 @@ This repository provides tools and scripts for evaluating the LoCoMo dataset usi
## Evaluation Scripts
### LoCoMo Evaluation
-To evaluate the **LoCoMo** dataset using one of the supported memory frameworks — `memos`, `mem0`, or `zep` — run the following command:
+⚙️ To evaluate the **LoCoMo** dataset using one of the supported memory frameworks — `memos`, `mem0`, or `zep` — run the following [script](./scripts/run_locomo_eval.sh):
```bash
# Edit the configuration in ./scripts/run_locomo_eval.sh
# Specify the model and memory backend you want to use (e.g., mem0, zep, etc.)
./scripts/run_locomo_eval.sh
```
+
+✍️ For evaluating OpenAI's native memory feature with the LoCoMo dataset, please refer to the detailed guide: [OpenAI Memory on LoCoMo - Evaluation Guide](./scripts/locomo/openai_memory_locomo_eval_guide.md).