Commit fc3fb2e

CItation (#271)
* Add citation
* code block
1 parent e559206 commit fc3fb2e

1 file changed: +16 -2 lines changed

README.md

Lines changed: 16 additions & 2 deletions
@@ -463,11 +463,25 @@ It takes 3.7h to download 18M pictures
 downloading 2 parquet files of 18M items (result 936GB) took 7h24
 average of 1345 image/s
 
-## 190M benchmark
+### 190M benchmark
 
 downloading 190M images from the [crawling at home dataset](https://github.com/rom1504/cah-prepro) took 41h (result 5TB)
 average of 1280 image/s
 
-## 5B benchmark
+### 5B benchmark
 
 downloading 5.8B images from the [laion5B dataset](https://laion.ai/laion-5b-a-new-era-of-open-large-scale-multi-modal-datasets/) took 7 days (result 240TB), average of 9500 sample/s on 10 machines, [technical details](https://rom1504.medium.com/semantic-search-at-billions-scale-95f21695689a)
+
+
+
+## Citation
+```
+@misc{beaumont-2021-img2dataset,
+author = {Romain Beaumont},
+title = {img2dataset: Easily turn large sets of image urls to an image dataset},
+year = {2021},
+publisher = {GitHub},
+journal = {GitHub repository},
+howpublished = {\url{https://github.com/rom1504/img2dataset}}
+}
+```
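
The average rates quoted in the benchmark context lines of this diff can be sanity-checked from the stated totals and durations. The following is a minimal sketch, assuming "M" means 10^6, "B" means 10^9, and the quoted durations are wall-clock time for the whole run. The helper name `throughput` is hypothetical and is not part of img2dataset.

```python
def throughput(items: float, hours: float) -> float:
    """Return the average number of items processed per second."""
    return items / (hours * 3600)


# 190M images in 41h (the README quotes an average of 1280 image/s).
rate_190m = throughput(190e6, 41)

# 5.8B images in 7 days (the README quotes an average of 9500 sample/s).
rate_5b = throughput(5.8e9, 7 * 24)

print(f"190M benchmark: {rate_190m:.0f} image/s")
print(f"5B benchmark: {rate_5b:.0f} sample/s")
```

Under these assumptions both computed rates land within about 1% of the quoted averages, so the figures are consistent with the stated totals and run times.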
