Hi Mem0 team,
I enjoyed reading your paper "Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory".
While reviewing the baseline comparisons, I noticed a discrepancy in Table 1 regarding the reported performance of the A-Mem system.
In Table 1 of the Mem0 paper, the F1 and BLEU-1 scores for A-Mem are listed as follows:
| Question Type | A-Mem F1 | A-Mem BLEU-1 |
|---------------|----------|--------------|
| Single Hop    | 27.02    | 20.09        |
| Multi-Hop     | 12.14    | 12.00        |
| Open Domain   | 44.65    | 37.06        |
| Temporal      | 45.85    | 36.67        |
However, these values appear to contradict the data reported in the original A-Mem paper (A-MEM: Agentic Memory for LLM Agents, Table 1).
Comparing the two tables, it appears that the results for Single Hop, Multi-Hop, and Open Domain may have been swapped or mislabeled in the Mem0 paper. For example, the values listed under Single Hop in Mem0's Table 1 seem to correspond to a different question type in the original A-Mem results (and vice versa).
Reference from A-Mem Paper (Table 1):
Could you please verify whether this is a transcription error introduced while compiling the results?
Thank you!