COCO's val/test sets captions incomplete? #8

@armandvilalta

I realized that the validation/test data provided for COCO contains 5000 images but only 5000 captions, whereas 5000 images should come with 25000 captions (5 per image). In fact, the last 4000 images have no corresponding captions. So, if we use only the provided captions, we are evaluating over a 1000-image subset.
The readme file says:
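The arithmetic behind those numbers can be checked with a small sketch. It assumes 5 captions per image and that the captions are listed in image order; the function name `caption_coverage` is hypothetical and not part of this repository.

```python
def caption_coverage(num_images, num_captions, captions_per_image=5):
    """Return (expected_captions, images_covered, images_without_captions).

    Assumes every image should have `captions_per_image` captions and that
    captions are stored sequentially in image order (an assumption, not
    something confirmed by the repository).
    """
    expected = num_images * captions_per_image
    # Only whole groups of captions count as a fully covered image.
    covered = min(num_images, num_captions // captions_per_image)
    return expected, covered, num_images - covered


# Counts reported above: 5000 images, 5000 captions.
expected, covered, missing = caption_coverage(5000, 5000)
print(expected, covered, missing)  # 25000 1000 4000
```

Under these assumptions, 5000 captions cover only the first 1000 images, which matches the 4000 images observed without captions.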

Flickr8K comes with a pre-defined train/dev/test split, while for Flickr30K and MS COCO we use the splits produced by Andrej Karpathy.

While Karpathy's paper indicates:

For MSCOCO we use 5,000 images for both validation and testing

Is the original test actually run over 1000 images, or is the provided caption list incomplete?
Thanks,
Armand
