Open
Labels: enhancement (New feature or request)
Description
In the Dockerfile, we call the script `scripts/download_datasets.py`, which downloads all datasets to `/var/tmp/pv211`. The datasets are then shared by all students who use JupyterHub, which saves time and disk space. For example, here we download the ARQMath datasets: 1->2. Here, the students load them: 3->4->5.
Since #3, we've supported BEIR datasets. However, the BEIR datasets are not downloaded in the Dockerfile. Instead, they are saved to and loaded from the `./datasets` directory, so every student has to download them separately and each student keeps a duplicate copy on disk.
Tasks
- Download BEIR datasets to `/var/tmp/pv211` in `scripts/download_datasets.py`.
- Load BEIR datasets from `/var/tmp/pv211` in `pv211_utils.beir.loader`.
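
A minimal sketch of how both tasks could share one helper, assuming a `beir/<name>` subdirectory layout under `/var/tmp/pv211`. The function `ensure_dataset`, its `download` callback, and the subdirectory layout are hypothetical and not part of the existing `pv211_utils` code. The download script would call the helper at image build time, and the loader would call it again to resolve the same path without downloading anything a second time.

```python
from pathlib import Path
from typing import Callable

# Shared directory named in this issue.
SHARED_ROOT = Path("/var/tmp/pv211")


def ensure_dataset(
    name: str,
    download: Callable[[Path], None],
    root: Path = SHARED_ROOT,
) -> Path:
    """Return the shared directory of dataset `name`, downloading only if missing.

    `download` is a hypothetical callback that must create and fill the
    target directory it receives (for BEIR, it could wrap the library's
    own download routine).
    """
    target = root / "beir" / name  # assumed layout, not confirmed by the repo
    if not target.is_dir():
        # Create only the parent; the callback creates the dataset directory.
        target.parent.mkdir(parents=True, exist_ok=True)
        download(target)
    return target
```

With this shape, a second call for the same dataset name returns the existing directory and never invokes the callback, which is what lets all JupyterHub users read a single copy.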