[Paper Discrepancy] Data Inconsistency in Table 1: A-Mem Baseline Metrics vs. Original Paper #4003

@onford

Hi Mem0 team,

I enjoyed reading your paper "Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory".

While reviewing the baseline comparisons, I noticed a discrepancy in Table 1 regarding the reported performance of the A-Mem system.

In Table 1 of the Mem0 paper, the F1 and BLEU-1 scores for A-Mem are listed as follows:

| Question Type | A-Mem F1 | A-Mem BLEU-1 |
|---|---|---|
| Single Hop | 27.02 | 20.09 |
| Multi-Hop | 12.14 | 12.00 |
| Open Domain | 44.65 | 37.06 |
| Temporal | 45.85 | 36.67 |

However, these values appear to contradict the data reported in the original A-Mem paper (A-MEM: Agentic Memory for LLM Agents, Table 1).

Comparing the two tables, it seems that the results for Single Hop, Multi-Hop, and Open Domain may have been mixed up or mislabeled in the Mem0 paper. For example, the values listed under Single Hop in the Mem0 paper appear to belong to a different category in the original source.

Reference from A-Mem Paper (Table 1):

[Image: Table 1 from the A-Mem paper]
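In case it helps with checking, here is a rough sketch of how the relabeling could be confirmed mechanically. It only contains the Mem0-side numbers quoted above; the `reference` dictionary would need to be filled in by hand from the A-Mem table, and the names `find_relabeling`, `mislabeled`, and the tolerance value are my own assumptions rather than anything from either paper.

```python
from typing import Dict, Optional, Tuple

Scores = Tuple[float, float]  # (F1, BLEU-1)

# A-Mem rows as listed in Table 1 of the Mem0 paper (copied from this issue).
MEM0_REPORTED: Dict[str, Scores] = {
    "Single Hop": (27.02, 20.09),
    "Multi-Hop": (12.14, 12.00),
    "Open Domain": (44.65, 37.06),
    "Temporal": (45.85, 36.67),
}


def find_relabeling(
    reported: Dict[str, Scores],
    reference: Dict[str, Scores],
    tol: float = 0.005,  # assumed tolerance for two-decimal rounding
) -> Dict[str, Optional[str]]:
    """For each reported category, find the reference category with the same scores.

    A value of None means no reference row matched within the tolerance.
    """
    mapping: Dict[str, Optional[str]] = {}
    for rep_label, rep_scores in reported.items():
        mapping[rep_label] = None
        for ref_label, ref_scores in reference.items():
            if all(abs(a - b) <= tol for a, b in zip(rep_scores, ref_scores)):
                mapping[rep_label] = ref_label
                break
    return mapping


def mislabeled(mapping: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Keep only rows whose scores match a differently named reference category."""
    return {rep: ref for rep, ref in mapping.items() if ref is not None and ref != rep}
```

Calling `mislabeled(find_relabeling(MEM0_REPORTED, reference))` with `reference` transcribed from the A-Mem paper would list each Mem0 row label next to the A-Mem category that actually carries those scores. An empty result would mean the labels agree, or that the numbers differ for some reason other than a label swap.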

Could you please check whether this is a transcription error introduced when the results were compiled?

Thank you!
