Commit 711bb7c

Update README.md
1 parent 9e2aa8b commit 711bb7c

1 file changed: +11 -4 lines changed

README.md

Lines changed: 11 additions & 4 deletions
@@ -16,17 +16,24 @@ To be completed...
 
 ### Reproducing the results from the paper
 
-0. Clone this repository and install the required libraries by running
+0. Clone this repository and install the required libraries by running.
 
 ```shell
 pip install -e .
 ```
 
-1. Download the [BIOSCAN-5M dataet from the git repo](https://github.com/bioscan-ml/BIOSCAN-5M)
+1. Download the [metadata file](https://drive.google.com/drive/u/0/folders/1TLVw0P4MT_5lPrgjMCMREiP8KW-V4nTb) and copy it into the data folder
+2. Split the metadata file into smaller files according to the different partitions as presented in the [BIOSCAN-5M paper](https://arxiv.org/abs/2406.12723)
 
-2. Pretrain BarcodeMAE
+```shell
+cd data/
+python data_split.py BIOSCAN-5M_Dataset_metadata.tsv
+```
+3. Pretrain BarcodeMAE
 
-To be completed...
+```shell
+
+```
 
 
 ## Citation
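
Note on step 2 of the updated README: `data_split.py` itself is not part of this commit, so its behaviour is not visible here. The following is only a minimal sketch of how such a partition split could look, assuming the metadata TSV has a column naming each row's partition. The column name `split`, the function name `split_metadata`, and the one-file-per-partition output layout are all assumptions for illustration, not taken from the repository.

```python
import csv
from pathlib import Path


def split_metadata(metadata_path, out_dir, column="split"):
    """Write one TSV per distinct value of `column`; return {value: row_count}.

    Hypothetical helper: the real data_split.py may work differently.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    handles, writers, counts = {}, {}, {}
    try:
        with open(metadata_path, newline="", encoding="utf-8") as src:
            reader = csv.DictReader(src, delimiter="\t")
            if column not in (reader.fieldnames or []):
                raise KeyError(f"column {column!r} not found in {metadata_path}")
            for row in reader:
                part = row[column]
                if part not in writers:
                    # First row of a new partition: open its file and write the header.
                    handle = open(out_dir / f"{part}.tsv", "w", newline="", encoding="utf-8")
                    handles[part] = handle
                    writers[part] = csv.DictWriter(
                        handle,
                        fieldnames=reader.fieldnames,
                        delimiter="\t",
                        extrasaction="ignore",
                    )
                    writers[part].writeheader()
                    counts[part] = 0
                writers[part].writerow(row)
                counts[part] += 1
    finally:
        # Close every partition file, even if reading failed part-way.
        for handle in handles.values():
            handle.close()
    return counts
```

The sketch streams the file row by row instead of loading it at once, which may matter for a metadata file covering millions of records. A call such as `split_metadata("BIOSCAN-5M_Dataset_metadata.tsv", "partitions")` would then produce one file per partition value in the (hypothetical) `partitions` directory.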
