Commit a01296e

Add benchmarks README (#49)
1 parent a429ec3 commit a01296e

2 files changed: 43 additions, 0 deletions

_quarto.yml

Lines changed: 1 addition & 0 deletions
@@ -69,6 +69,7 @@ website:
 contents:
 - href: "documentation/index.md"
 - href: "documentation/Building.md"
+- href: "documentation/Benchmarks.md"

 # - section: "Examples"
 # contents:

documentation/Benchmarks.md

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
# Benchmarks

We have implemented a [big-ann-benchmarks](https://big-ann-benchmarks.com) interface for TileDB-Vector-Search,
which is available in the `tiledb` branch of our fork:
- [https://github.com/TileDB-Inc/big-ann-benchmarks/tree/tiledb](https://github.com/TileDB-Inc/big-ann-benchmarks/tree/tiledb)

This interface implements two new algorithms, `tiledb-flat` and `tiledb-ivf-flat`, which are usable within the framework's runner.

## Building

1) Build the Docker image from the `Dockerfile` at the root of this repository:

```
cd tiledb-vector-search
docker build -f Dockerfile . -t tiledb_vs
```

2) Build the TileDB Docker image in the big-ann-benchmarks fork (requires the image from step 1):

```
git clone --branch tiledb https://github.com/TileDB-Inc/big-ann-benchmarks
cd big-ann-benchmarks
docker build -f install/Dockerfile.tiledb . -t billion-scale-benchmark-tiledb
```
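
The two build steps can also be scripted. The following is a minimal Python sketch, not part of the repository: `BUILD_STEPS` and `run_build` are hypothetical names, and it assumes the `tiledb-vector-search` checkout sits in the current directory and that the fork is cloned next to it. The clone is written with `--branch tiledb` because `git clone` expects a repository URL rather than a `/tree/` page URL. By default the sketch only prints the commands.

```python
import shlex
import subprocess

# (working directory, command) pairs following the two build steps.
# The directory layout is an assumption made for this sketch.
BUILD_STEPS = [
    ("tiledb-vector-search",
     ["docker", "build", "-f", "Dockerfile", ".", "-t", "tiledb_vs"]),
    (".",
     ["git", "clone", "--branch", "tiledb",
      "https://github.com/TileDB-Inc/big-ann-benchmarks"]),
    ("big-ann-benchmarks",
     ["docker", "build", "-f", "install/Dockerfile.tiledb", ".",
      "-t", "billion-scale-benchmark-tiledb"]),
]


def run_build(steps=BUILD_STEPS, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    rendered = []
    for cwd, cmd in steps:
        line = f"(cd {cwd} && {shlex.join(cmd)})"
        print(line)
        rendered.append(line)
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            subprocess.run(cmd, cwd=cwd, check=True)
    return rendered


if __name__ == "__main__":
    run_build()  # dry run: shows the commands without running Docker or git
```

Calling `run_build(dry_run=False)` would execute the steps in order, so the second image is only built once the first one exists.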

## Running benchmarks

1) Create a local dataset.

Note: the `create_dataset.py` command downloads remote files the first time it runs,
some of which can total more than 100GB. Use `--skip-data`
to avoid downloading the large base set.

*This* command will download only 7.7MB of data:

```
python create_dataset.py --dataset bigann-10M --skip-data
```
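
Because a full download can exceed 100GB, it may be worth checking free disk space before dropping `--skip-data`. This is a small Python sketch under stated assumptions: `create_dataset_cmd` is a hypothetical helper, the 100GB threshold is taken only from the note above, and `data_dir` stands in for wherever the framework stores its data, which this document does not specify.

```python
import shutil


def create_dataset_cmd(dataset, skip_data=True, data_dir=".",
                       min_free_bytes=100 * 10**9):
    """Build the create_dataset.py command line as an argument list.

    Hypothetical helper: it refuses to build a full-download command when
    the filesystem holding data_dir has less than min_free_bytes free.
    """
    cmd = ["python", "create_dataset.py", "--dataset", dataset]
    if skip_data:
        cmd.append("--skip-data")  # avoid downloading the large base set
    elif shutil.disk_usage(data_dir).free < min_free_bytes:
        raise RuntimeError(
            f"less than {min_free_bytes} bytes free under {data_dir!r}"
        )
    return cmd


if __name__ == "__main__":
    print(" ".join(create_dataset_cmd("bigann-10M")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the root of the big-ann-benchmarks checkout.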

2) Run the benchmarks, choosing either `tiledb-flat` or `tiledb-ivf-flat`:

```
python run.py --dataset bigann-10M --algorithm tiledb-flat
```
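
To compare the two algorithms on the same dataset, the run command can be generated once per algorithm. This is a minimal Python sketch, assuming it is launched from the root of the big-ann-benchmarks checkout; `run_commands` and `run_all` are hypothetical helpers that use only the `--dataset` and `--algorithm` flags shown above.

```python
import subprocess

# The two algorithm names added by the TileDB interface.
ALGORITHMS = ("tiledb-flat", "tiledb-ivf-flat")


def run_commands(dataset="bigann-10M", algorithms=ALGORITHMS):
    """Return one run.py argument list per algorithm."""
    return [
        ["python", "run.py", "--dataset", dataset, "--algorithm", name]
        for name in algorithms
    ]


def run_all(dataset="bigann-10M", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in run_commands(dataset):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all()  # dry run: lists the commands without starting a benchmark
```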
