These instructions describe how to benchmark B-tree, Data Calculator, and AirIndex (auto-tuned index) for the experiments in "AirIndex: Versatile Index Tuning Through Data and Storage".
Please follow the dataset (dataset_setup.md) and query key set (keyset_setup.md) instructions to set up the benchmarking environment. Examples of environment reset scripts are in reload_examples.md. The following assumes that the datasets are under /path/to/data/ and the key sets are under /path/to/keyset/.
cargo build --release
Optionally, you can run the unit tests to check compatibility.
cargo test
For each storage (e.g., NFS) you would like to benchmark on, tune and build indexes for all datasets.
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/manual btree btree build 1 ~/reload_nfs.sh nfs
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_nfs enb step,band_greedy,band_equal build 1 ~/reload_nfs.sh nfs
Afterwards, benchmark over 40 key sets of 1M keys each.
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/manual btree btree benchmark 40 ~/reload_nfs.sh nfs
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_nfs enb step,band_greedy,band_equal benchmark 40 ~/reload_nfs.sh nfs
The measurements will be recorded in sosd_benchmark_out.jsons.
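To sanity-check a benchmark run, it can help to confirm how many measurement records were written. The sketch below assumes that sosd_benchmark_out.jsons holds one JSON record per line; this is a guess based on the .jsons extension and is not confirmed by these instructions. `count_records` is a hypothetical helper name, not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the non-empty lines in a results file.
# Assumption: one JSON record per line (not confirmed by the instructions).
count_records() {
  local file="$1"
  if [ ! -f "$file" ]; then
    echo "missing file: $file" >&2
    return 1
  fi
  # grep -c prints the number of lines matching '.', i.e. non-empty lines.
  # grep exits with status 1 when the count is 0, so mask that status.
  grep -c . "$file" || true
}
```

For example, `count_records sosd_benchmark_out.jsons` prints the number of non-empty lines in the output file.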
To inspect a latency breakdown for the already-built indexes, run the following commands.
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/manual btree btree breakdown 40 ~/reload_nfs.sh nfs
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_nfs enb step,band_greedy,band_equal breakdown 40 ~/reload_nfs.sh nfs
The measurements will be recorded in sosd_breakdown_out.jsons.
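The AirIndex build, benchmark, and breakdown commands above differ only in the action and the numeric argument. When repeating them for another storage, a dry-run helper that prints the three commands can reduce copy-paste mistakes. The sketch below only prints commands and executes nothing. The `airindex_<storage>` output path and `~/reload_<storage>.sh` naming are extrapolated from the nfs examples and are assumptions; `print_airindex_commands` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not execute) the AirIndex commands for one storage.
# Only the nfs values appear in the instructions; the naming pattern for other
# storages is an assumption.
print_airindex_commands() {
  local storage="$1"
  local data_url="file:///path/to/data"
  local keyset_url="file:///path/to/keyset"
  local index_url="file:///path/to/airindex_${storage}"
  # The tilde is kept literal on purpose so the printed line matches the
  # commands above; it expands when the printed line is pasted into a shell.
  local reload_script="~/reload_${storage}.sh"
  local spec action count
  # Each spec is "<action>:<numeric argument>", as used in the commands above.
  for spec in build:1 benchmark:40 breakdown:40; do
    action="${spec%%:*}"
    count="${spec##*:}"
    printf '%s\n' "bash scripts/sosd_experiment.sh $data_url $keyset_url $index_url enb step,band_greedy,band_equal $action $count $reload_script $storage"
  done
}

print_airindex_commands nfs
```

Review the printed lines before running them, since the paths are still the placeholders used throughout these instructions.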
Generate skewed Zipfian key sets by following the key set instructions (keyset_setup.md).
Then run the benchmark script, pointing it at the skewed key sets.
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset/skew file:///path/to/airindex_nfs enb step,band_greedy,band_equal benchmark 40 ~/reload_nfs.sh nfs
Similarly to 5.2, build the AirIndex variants.
bash scripts/sosd_variants.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_variants_index build 1 ~/reload_nfs.sh nfs
Then, benchmark all of them.
bash scripts/sosd_variants.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_variants_index benchmark 40 ~/reload_nfs.sh nfs
Let AirIndex tune indexes over a variety of affine storage profiles. We highly recommend running this on a CPU-rich machine; otherwise, it will take a considerable amount of time.
bash scripts/storage_explore.sh file:///path/to/data file:///path/to/keyset file:///path/to/storage_explore enb
Then, read the index structures.
bash scripts/inspect.sh file:///path/to/data file:///path/to/keyset file:///path/to/storage_explore enb
To measure the build time, run the build script.
bash scripts/scale.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_scalability enb scalability.jsons
To build indexes with varying hyperparameter k, use a different action, buildtopk.
bash scripts/sosd_experiment.sh file:///path/to/data file:///path/to/keyset file:///path/to/airindex_nfs enb step,band_greedy,band_equal buildtopk 1 ~/reload_nfs.sh nfs
To execute Data Calculator's auto-completion, run the following command.
bash scripts/data_calculator_sosd.sh file:///path/to/data file:///path/to/keyset file:///path/to/data_calc autocomplete 1 ~/reload_nfs.sh nfs
Then copy the suggested structure printed at the end (load and number of layers) into scripts/data_calculator_sosd.sh (lines 51-69).
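Before pasting the suggested structure, it may be worth printing the target line range to confirm that it is the block you expect, since line numbers can shift between versions of the script. The sketch below uses only standard awk; `show_lines` is a hypothetical helper name, not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an inclusive line range of a file,
# prefixing each line with its line number.
show_lines() {
  local file="$1" first="$2" last="$3"
  # NR is awk's current line number; print only lines in [first, last].
  awk -v a="$first" -v b="$last" \
    'NR >= a && NR <= b { printf "%d: %s\n", NR, $0 }' "$file"
}
```

For example, `show_lines scripts/data_calculator_sosd.sh 51 69` prints the range referenced above.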
Then build and benchmark, similarly to AirIndex.
bash scripts/data_calculator_sosd.sh file:///path/to/data file:///path/to/keyset file:///path/to/data_calc build 1 ~/reload_nfs.sh nfs
bash scripts/data_calculator_sosd.sh file:///path/to/data file:///path/to/keyset file:///path/to/data_calc benchmark 40 ~/reload_nfs.sh nfs
To benchmark on a skewed workload (6.3), generate skewed key sets and change the key set path accordingly.
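As a concrete illustration of changing the key set path, the sketch below derives a skewed key set URL and prints the resulting Data Calculator benchmark command without running it. It assumes the skewed key sets live in a `skew` subdirectory of the key set path, mirroring the AirIndex skewed benchmark command earlier in these instructions; adjust it if your key sets were generated elsewhere. `skew_keyset_url` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Assumption: skewed key sets are stored under "<keyset path>/skew",
# as in the AirIndex skewed benchmark command.
skew_keyset_url() {
  local keyset_url="$1"
  # Strip one trailing slash, if present, then append /skew.
  printf '%s/skew\n' "${keyset_url%/}"
}

# Print (do not execute) the Data Calculator benchmark command for skewed key sets.
echo "bash scripts/data_calculator_sosd.sh file:///path/to/data $(skew_keyset_url file:///path/to/keyset) file:///path/to/data_calc benchmark 40 ~/reload_nfs.sh nfs"
```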