This project is mainly inspired by tidesdb, written by Alex Padula, but I'll try to add some experimental ideas.
Only Unix-like systems (e.g. macOS, Ubuntu) are targeted. If you want Windows support, please ask by raising a PR.
macOS

    brew install lz4 zstd snappy

Ubuntu / Debian

    sudo apt install liblz4-dev libzstd-dev libsnappy-dev

Fedora / RHEL

    sudo dnf install lz4-devel libzstd-devel snappy-devel

Delete the existing build dir, configure, and build:

    rm -rf build && cmake -S . -B build
    cmake --build build

`cmake --build build` automatically rebuilds only files that changed since the last run.

Run the tests:

    ./build/*_tests

@dborchard has a great collection of educational database projects worth exploring:
- mini-lsm — Structured course for building an LSM-Tree storage engine in Rust (memtables, SSTables, WAL, compaction, MVCC)
- tidesdb — The primary inspiration for kvarkDB
- lsm-tree — LSM Tree demo (directly relevant to kvarkDB's architecture)
- cometkv — Comparing different memtable implementations
- tiny-txn — Serializable Snapshot Isolation transactions
- isolation_levels — Database isolation levels implemented in Go
- colexec-db — Educational vectorized execution engine
- tiny-db — Query engine + storage engine using Calcite and ANTLR
- awesome-dbdev — Curated materials on database development
Niv Dayan (University of Toronto) is a leading researcher on LSM-tree optimization. Notable papers:
- Monkey (SIGMOD 2017) — Optimal bloom filter allocation across LSM levels
- Dostoevsky (SIGMOD 2018) — Better space-time trade-offs via adaptive merging
- Chucky (SIGMOD 2021) — Succinct cuckoo filter for LSM-trees
- Spooky (VLDB 2022) — Correct compaction granularity for LSM-trees
- KV-Tandem (2024) — Modular approach to high-speed LSM storage engines
- Transaction Isolation in Postgres, Explained — Covers SQL92 isolation levels, MVCC, and real-world concurrency tradeoffs (relevant to kvarkDB's planned MVCC support)
- Implement WAL
- Implement Compression
- Implement BloomFilter
- Implement SkipList
- Implement cursor
- Implement memtable
- Implement sstable
- Implement Column families
- Write a minimal key-value db
- Support REPL
- Add Python bindings
- LSM-based with levelled compaction (for now there is only a single level)
- Transaction support
- WAL
- Compression support (LZ4, ZSTD, Snappy)
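
To make the WAL item above a bit more concrete, here is a minimal sketch of how a write-ahead log can work: every put is appended to a log file before it is applied, and on startup the log is replayed to rebuild the in-memory table. This is not kvarkDB's actual code or on-disk format. The function names (`wal_append`, `wal_replay`), the record layout, the length cap, and the use of FNV-1a as a checksum are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <map>
#include <string>

// Hypothetical record layout (an assumption, not kvarkDB's format):
// [u32 key_len][u32 val_len][key bytes][value bytes][u32 checksum]
// All integers are written little-endian.

static const uint32_t kMaxLen = 1u << 20;  // illustrative sanity cap on lengths

// FNV-1a, used here only as a simple stand-in for a real checksum such as CRC32.
static uint32_t fnv1a(const std::string& data) {
    uint32_t h = 2166136261u;
    for (unsigned char c : data) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

static void write_u32(std::ostream& out, uint32_t v) {
    char b[4];
    for (int i = 0; i < 4; ++i) b[i] = static_cast<char>((v >> (8 * i)) & 0xff);
    out.write(b, 4);
}

static bool read_u32(std::istream& in, uint32_t& v) {
    unsigned char b[4];
    if (!in.read(reinterpret_cast<char*>(b), 4)) return false;
    v = 0;
    for (int i = 0; i < 4; ++i) v |= static_cast<uint32_t>(b[i]) << (8 * i);
    return true;
}

// Append one put(key, value) record to the log.
// Note: flush() only hands the bytes to the OS; a durable WAL would also fsync.
void wal_append(const std::string& path, const std::string& key,
                const std::string& value) {
    assert(key.size() <= kMaxLen && value.size() <= kMaxLen);
    std::ofstream out(path, std::ios::binary | std::ios::app);
    write_u32(out, static_cast<uint32_t>(key.size()));
    write_u32(out, static_cast<uint32_t>(value.size()));
    out.write(key.data(), static_cast<std::streamsize>(key.size()));
    out.write(value.data(), static_cast<std::streamsize>(value.size()));
    write_u32(out, fnv1a(key + value));
    out.flush();
}

// Replay the log into an ordered map (standing in for a memtable).
// Replay stops at the first truncated or corrupt record, so a torn final
// write is ignored rather than applied.
std::map<std::string, std::string> wal_replay(const std::string& path) {
    std::map<std::string, std::string> mem;
    std::ifstream in(path, std::ios::binary);
    uint32_t klen = 0, vlen = 0, sum = 0;
    while (read_u32(in, klen) && read_u32(in, vlen)) {
        if (klen > kMaxLen || vlen > kMaxLen) break;
        std::string key(klen, '\0'), value(vlen, '\0');
        if (!in.read(&key[0], klen) || !in.read(&value[0], vlen)) break;
        if (!read_u32(in, sum) || sum != fnv1a(key + value)) break;
        mem[key] = value;  // later records overwrite earlier ones
    }
    return mem;
}
```

Calling `wal_append(path, "a", "1")` a few times and then `wal_replay(path)` returns a map holding the latest value for each key. The checksum at the end of each record is what lets replay detect a partially written tail after a crash.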