This repository contains the code for the paper "Distributed Metadata Querying on HPC Systems", accepted at the 32nd IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC).
- `/src` - implementation code for the PHT, range hash, and optimized skip list, plus supporting code and libraries
- `/test` - test code for the individual data structures (PHT, range hash, and others)
- `./generator_*.py` - generate Slurm jobs for different configurations
- `./runner_*.sh` - run the generated Slurm jobs
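The intended workflow is to generate the Slurm job scripts first and then launch them with the matching runner. A minimal sketch, where the `pht` suffix is only an illustrative placeholder for one of the generator/runner pairs in the repository:

```bash
# Illustrative names: substitute the actual generator_*.py / runner_*.sh pair
# for the configuration you want to benchmark.
python3 generator_pht.py   # writes Slurm batch scripts for this configuration
./runner_pht.sh            # submits the generated jobs
```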
We expect these jobs to be run on NERSC Perlmutter. Sample run instructions are given below:
```bash
# Request an interactive allocation (replace "account" with your project account):
salloc --time 01:00:00 --constraint cpu -A account --qos interactive --nodes=2 --ntasks-per-node=4

# Run the test binary; the two invocations below are alternative ways to get
# timely output (srun's --unbuffered vs. line buffering via stdbuf):
srun --output=logs/output.log --unbuffered -n 8 ./build/test/test
srun -n 8 stdbuf -oL ./build/test/test
```
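The commands above assume a test binary at `./build/test/test`. How the project is built is not specified in this README; a minimal sketch, assuming a CMake-based build on Perlmutter:

```bash
# Assumption: a top-level CMakeLists.txt produces build/test/test.
module load PrgEnv-gnu cray-mpich   # same toolchain modules as the batch script below
cmake -B build                       # configure an out-of-source build in ./build
cmake --build build -j               # compile in parallel
```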
Example batch runner:
```bash
#!/bin/bash
#SBATCH -C cpu
#SBATCH -q regular
#SBATCH --job-name=rocks_normal_1nodes_48cores
#SBATCH --nodes=1
#SBATCH --ntasks=48
#SBATCH --time=00:45:00
#SBATCH --output=/pscratch/sd/s/user/out/rocks_normal_32.csv
#SBATCH --error=/pscratch/sd/s/user/err/rocks_normal_32.err
#SBATCH --mail-user=user@gmail.com
#SBATCH --mail-type=ALL
#SBATCH -A m2621

module load PrgEnv-gnu
module load cray-mpich

WORKDIR=/global/u1/s/user/pdc_range_query
DATADIR=/pscratch/sd/s/user/data2/int

# Execute the parallel program once per index type
# (add further types to the list to compare implementations).
for type in boss_pht; do
    srun "$WORKDIR/build/test/test" "$type" int "$DATADIR/normal/1000k"
done
```
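Save the script above and submit it with `sbatch` (the filename `run_rocks_normal.sh` below is just a placeholder):

```bash
sbatch run_rocks_normal.sh   # queue the job; results land in the --output CSV above
squeue -u "$USER"            # monitor the job's state
```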