Commit 2cfefad

authored
Update README.md
1 parent a84de50 commit 2cfefad

1 file changed

+31
-35
lines changed

README.md

Lines changed: 31 additions & 35 deletions
@@ -5,42 +5,10 @@ FastDup is a tool for fast detection of duplicate and near duplicate images. Fas
 
 ![alt text](https://github.com/visualdatabase/fastdup/blob/main/gallery/git_main-min.png)
 
-## Quick Installation
-For Python 3.7 and 3.8
-```python
-pip install fastdup
-```
-
-[Install from stable release](INSTALL.md)
-
-
-## Running the code
-
-### Python
-```python
-python3
-import fastdup
-fastdup.run(input_dir="/path/to/your/folder", work_dir="/path/to/your/folder") #main running function
-```
-
-### C++
-```bash
-/usr/bin/fastdup /path/to/your/folder --work_dir="/tmp/fastdup_files"
-```
-
-[Detailed running instructions](RUN.md)
-
-
-
-### Support for s3 cloud/ google storage
-[Detailed instructions](CLOUD.md)
-
-
 ## Results on Key Datasets
-We have thoroughly tested fastdup across various famous computer-vision dataset. Ranging from Academic datasets to Kaggle competitions. A key finding we have made using FastDup is that there are ~1.2M (!) duplicate images on the ImageNet21K dataset, a new unknown result! Full results are below.
+We have thoroughly tested fastdup across various famous visual dataset. Ranging from Academic datasets to Kaggle competitions. A key finding we have made using FastDup is that there are ~1.2M (!) duplicate images on the ImageNet21K dataset, a new unknown result! Full results are below.
 
 ### FastDup is FAST
-
 |Dataset |Total Images |Owner |Image Res |cost [$]|spot cost [$]|processing [sec]|throughput [1/sec]|
 |-----------------------|---------------|-----------------------|--------------|--------|-------|-------|-----|
 |[imagenet21k-resized](https://www.image-net.org/challenges/LSVRC/) |11,582,724 |alibaba/academy |133x200 |4.98 |1.24 |11,561 |1,002|
@@ -58,8 +26,6 @@ We have thoroughly tested fastdup across various famous computer-vision dataset.
 * We run on the full ImageNet dataset (11.5M images) to compare all pairs of images in less than 3 hours WITHOUT a GPU (with Google cloud cost of 5$).
 
 ### FastDup is ACCURATE
-
-
 Dataset| Identical Pairs| Near-Identical Pairs
 -------|----------------------|--------------------
 [imagenet21k-resized](https://www.image-net.org/challenges/LSVRC/) |1,194,059| 53,358
@@ -76,3 +42,33 @@ Dataset| Identical Pairs| Near-Identical Pairs
 [snakeclef2022-fgvc9](https://www.kaggle.com/competitions/snakeclef2022/data) |6,953 |33,128
 [fungiclef2022-fgvc9](https://www.kaggle.com/competitions/fungiclef2022/data) |2,205 |75
 [hotel-id-to-combat-human-trafficking-2022-fgvc9](https://www.kaggle.com/competitions/hotel-id-to-combat-human-trafficking-2022-fgvc9/data)| 3,544 |2,704
+
+## Quick Installation
+For Python 3.7 and 3.8
+```python
+pip install fastdup
+```
+
+[Install from stable release](INSTALL.md)
+
+
+## Running the code
+
+### Python
+```python
+python3
+import fastdup
+fastdup.run(input_dir="/path/to/your/folder", work_dir="/path/to/your/folder") #main running function
+```
+
+### C++
+```bash
+/usr/bin/fastdup /path/to/your/folder --work_dir="/tmp/fastdup_files"
+```
+
+[Detailed running instructions](RUN.md)
+
+
+
+### Support for s3 cloud/ google storage
+[Detailed instructions](CLOUD.md)
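
The throughput column in the "FastDup is FAST" table can be recomputed from the other columns of the imagenet21k-resized row shown in this diff. The sketch below is only an illustrative sanity check: the `throughput` helper is a hypothetical name and is not part of fastdup, and the two constants are copied from that table row.

```python
def throughput(total_images: int, processing_seconds: float) -> float:
    """Return images processed per second (hypothetical helper, not a fastdup API)."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return total_images / processing_seconds


# Values copied from the imagenet21k-resized row of the README table.
IMAGES = 11_582_724
SECONDS = 11_561

rate = throughput(IMAGES, SECONDS)
hours = SECONDS / 3600  # processing time expressed in hours

print(f"{rate:.0f} images/sec over {hours:.2f} hours")
```

Rounded to the nearest integer, the computed rate agrees with the 1,002 images per second listed in the table.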
