Data download incomplete #175

@mrpeerat

Hi,

I ran the data download script from https://github.com/TIGER-AI-Lab/VLM2Vec/blob/main/experiments/public/data/download_data.sh

The script finished without errors, but when I checked the data, only about half of it had been downloaded.

For example, running ls -1 data/vlm2vec_train/MMEB-train/images/DocVQA/Train/ | wc -l shows only about 13.9k samples, while there should be around 30k or more.

I have already installed Git LFS.
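
For a slightly more detailed check than a plain file count, here is a minimal Python sketch. It counts the files under a directory and flags any that are empty or that look like Git LFS pointer stubs, which are the small text files left in place of real content when LFS objects have not been fetched. The helper names is_lfs_pointer and summarize are hypothetical and not part of the VLM2Vec repository. Whether the download script leaves pointer stubs at all is an assumption. Unlike ls -1, this walks subdirectories and counts only regular files.

```python
from pathlib import Path

# First line of a Git LFS pointer file, as defined by the Git LFS pointer format.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """Return True if the file starts with the Git LFS pointer header."""
    with path.open("rb") as f:
        return f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


def summarize(directory: str) -> dict:
    """Count regular files, empty files, and LFS pointer stubs under a directory."""
    # rglob("*") walks the tree recursively; directories are filtered out.
    files = [p for p in Path(directory).rglob("*") if p.is_file()]
    return {
        "files": len(files),
        "empty": sum(1 for p in files if p.stat().st_size == 0),
        "lfs_pointers": sum(1 for p in files if is_lfs_pointer(p)),
    }
```

Calling summarize("data/vlm2vec_train/MMEB-train/images/DocVQA/Train/") returns the three counts as a dict. A non-zero lfs_pointers count would suggest the LFS objects were not pulled, while a low files count with no pointers would suggest the files were never created in the first place.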

Is there anything I need to do to get the full data?

Thank you!
