This repository was archived by the owner on Feb 1, 2025. It is now read-only.

AdLinke, paperwork #70

@Adlinke

Description

Hi,

I was trying to reproduce the results by running your code, but I couldn't get exactly the same precision on SQuAD.
Here is what I got for the bert_large model on SQuAD:
all_samples: 303
list_of_results: 303
global MRR: 0.3018861233236291
global Precision at 10: 0.5676567656765676
global Precision at 1: 0.16831683168316833

However, in the paper, the table shows that there should be 305 samples and the precision should be 17.4%.

At first, I guessed that it was because 2 samples were excluded, since their object labels are outside the common vocabulary. But even after testing without the common vocabulary, I got a global Precision at 1 of 0.1704918, which is still different from the results in the paper.

Is there a way to reproduce the same results as in the paper?
Please correct me if I made any mistakes. Thanks!
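For reference, here is a minimal sketch of how I understand Precision@k and MRR to be computed over the per-sample rank of the gold answer. This is an assumption about the evaluation, not taken from the repository's code; the function names and the rank representation (1-based rank, `None` when the gold answer is missing from the predictions) are my own:

```python
def precision_at_k(ranks, k):
    """Fraction of samples whose gold answer appears within the top k.

    ranks: list of 1-based ranks of the gold answer, or None if the
    gold answer was not among the model's predictions at all.
    """
    hits = sum(1 for r in ranks if r is not None and r <= k)
    return hits / len(ranks)


def mean_reciprocal_rank(ranks):
    """Mean of 1/rank over all samples; missing answers contribute 0."""
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


# Toy example: 4 samples with gold-answer ranks 1, 3, (not found), 12.
ranks = [1, 3, None, 12]
print(precision_at_k(ranks, 1))    # 1 of 4 samples ranked first
print(precision_at_k(ranks, 10))   # 2 of 4 samples in the top 10
print(mean_reciprocal_rank(ranks))
```

Under this reading, the denominator matters: 51 top-1 hits over 303 samples gives 0.1683, while the same hits over 305 samples would give a slightly different number, so whether the 2 out-of-vocabulary samples are counted in the denominator could account for part of the gap.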
