Commit 98740d6

docs: readme updated
1 parent 56ce545 commit 98740d6

2 files changed

+368
-6
lines changed

README.md

Lines changed: 367 additions & 1 deletion
@@ -1 +1,367 @@
1-
todo!()
1+
# 🧸 ShorterDB
2+
3+
ShorterDB is a lightweight, embedded key-value store inspired by popular databases like RocksDB and LevelDB. It is designed to provide a simple and extensible architecture for learning and experimentation. While it may not match the performance of production-grade systems, it offers a clear and modular implementation of key-value store concepts, including Write-Ahead Logging (WAL), Memtables, and Sorted String Tables (SSTs).
4+
5+
6+
7+
## Table of Contents
8+
9+
1. [Installation](#installation)
10+
2. [Introduction](#introduction)
11+
3. [Features](#features)
12+
4. [Examples](#examples)
13+
   - [Embedded Database](#embedded-database)
14+
   - [gRPC Server](#grpc-server)
15+
   - [CSV Import with REPL](#csv-import-with-repl)
16+
5. [Code Walkthrough](#code-walkthrough)
17+
   - [Error Handling](#error-handling)
18+
   - [Database Core (`ShorterDB`)](#database-core-shorterdb)
19+
6. [Limitations](#limitations)
20+
7. [Future Work](#future-work)
21+
8. [Architecture Overview](#architecture-overview)
22+
   - [Write-Ahead Log (WAL)](#write-ahead-log-wal)
23+
   - [Memtable](#memtable)
24+
   - [Sorted String Table (SST)](#sorted-string-table-sst)
25+
9. [Contributing](#contributing)
26+
10. [Conclusion](#conclusion)
27+
28+
---
29+
30+
## Installation
31+
32+
To use ShorterDB in your Rust project, add the following to your `Cargo.toml`:
33+
34+
```toml
[dependencies]
shorterdb = "0.1.0"
```
38+
39+
To build the project locally, make sure Rust is installed, then clone the repository and run:
40+
41+
```bash
git clone https://github.com/your-repo/shorterdb.git
cd shorterdb
cargo build
```
46+
47+
---
48+
49+
## Introduction
50+
51+
ShorterDB is a simple key-value store built using a De-LSM architecture. It is designed for educational purposes and provides a modular implementation of database components. The project includes examples for embedded usage, gRPC-based remote access, and CSV imports.
52+
53+
---
54+
55+
## Features
56+
57+
- **Embedded Database**: Use ShorterDB as a lightweight, file-based key-value store.
58+
- **gRPC Server**: Access the database remotely using gRPC.
59+
- **REPL Interface**: Interact with the database in a command-line interface.
60+
- **Write-Ahead Logging (WAL)**: Ensure durability by logging all writes.
61+
- **Memtable**: An in-memory data structure for fast reads and writes.
62+
- **Sorted String Table (SST)**: Persistent storage for key-value pairs.
63+
64+
---
65+
66+
## Examples
67+
68+
### Embedded Database
69+
70+
The [`embedded`](examples/embedded) example demonstrates how to use ShorterDB as an embedded database.
71+
72+
```rust
let mut db = ShorterDB::new(Path::new("./embedded_db")).unwrap();
db.set(b"hello", b"world").unwrap();
let value = db.get(b"hello").unwrap();
assert_eq!(value, Some(b"world".to_vec()));
```
78+
79+
### gRPC Server
80+
81+
The [`grpc`](examples/grpc) example provides a gRPC interface for remote database access.
82+
83+
```rust
#[tonic::async_trait]
impl Basic for DbOperations {
    async fn get(
        &self,
        request: tonic::Request<GetRequest>,
    ) -> Result<tonic::Response<GetResponse>, tonic::Status> {
        let key = request.get_ref().key.clone();
        // Hold the lock for the duration of the lookup.
        let db = self.db.lock().await;
        match db.get(key.as_bytes()) {
            Ok(Some(value)) => {
                // Stored values are raw bytes, so report invalid UTF-8 instead of panicking.
                let value = String::from_utf8(value)
                    .map_err(|_| tonic::Status::internal("Stored value is not valid UTF-8"))?;
                Ok(tonic::Response::new(GetResponse { value }))
            }
            Ok(None) => Err(tonic::Status::not_found("Key not found")),
            Err(_) => Err(tonic::Status::internal("Error reading from the database")),
        }
    }
}
```
97+
98+
### CSV Import with REPL
99+
100+
The [`repl_csv`](examples/repl_csv) example imports data from a CSV file and provides a REPL interface.
101+
102+
```rust
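// Read two-column rows from a CSV file that has no header row.
let file = File::open("data.csv").unwrap();
let mut rdr = ReaderBuilder::new().has_headers(false).from_reader(file);
for result in rdr.records() {
    let record = result.unwrap();
    // Column 0 is used as the key and column 1 as the value.
    let key = record.get(0).unwrap();
    let value = record.get(1).unwrap();
    db.set(key.as_bytes(), value.as_bytes()).unwrap();
}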
```
109+
110+
---
111+
112+
## Code Walkthrough
154+
155+
### Error Handling
156+
157+
ShorterDB uses the `thiserror` crate for error handling. Custom error types are defined in `errors.rs`.
158+
159+
```rust
use std::io;
use thiserror::Error;

#[derive(Error, Debug)]
pub enum ShortDBErrors {
    /// Wraps any I/O failure; `#[from]` lets `?` convert an `io::Error` automatically.
    #[error("{0}")]
    Io(#[from] io::Error),
    #[error("Key not found")]
    KeyNotFound,
    #[error("Value not set")]
    ValueNotSet,
    /// Returned by the Memtable once it reaches its size limit.
    #[error("Flush needed from Memtable")]
    FlushNeededFromMemTable,
}
```
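
To make the role of `#[from]` concrete, here is a rough, std-only sketch of the conversion it provides, written by hand. The names `DbError` and `file_len` are hypothetical and exist only for this illustration; the real enum is the derived one above.

```rust
use std::fmt;
use std::fs::File;
use std::io;

// Hand-written stand-in for the derived enum; `DbError` is a hypothetical name.
#[derive(Debug)]
pub enum DbError {
    Io(io::Error),
    KeyNotFound,
}

impl fmt::Display for DbError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Mirrors `#[error("{0}")]`: delegate to the wrapped error's message.
            DbError::Io(e) => write!(f, "{e}"),
            DbError::KeyNotFound => write!(f, "Key not found"),
        }
    }
}

impl std::error::Error for DbError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            DbError::Io(e) => Some(e),
            DbError::KeyNotFound => None,
        }
    }
}

// The conversion that lets `?` turn an `io::Error` into `DbError::Io`.
impl From<io::Error> for DbError {
    fn from(e: io::Error) -> Self {
        DbError::Io(e)
    }
}

/// Opens a file and reports its length, propagating I/O failures with `?`.
pub fn file_len(path: &str) -> Result<u64, DbError> {
    let file = File::open(path)?; // io::Error -> DbError through `From`
    Ok(file.metadata()?.len())
}
```

Because of the `From` impl, functions that mix file I/O with database logic can return a single error type without explicit `map_err` calls.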
172+
173+
### Database Core (`ShorterDB`)
174+
175+
The `ShorterDB` struct ties together the WAL, Memtable, and SST components. Each `set` call appends the entry to the WAL before applying it to the Memtable.
176+
177+
```rust
pub struct ShorterDB {
    pub(crate) memtable: Memtable,
    pub(crate) wal: WAL,
    pub(crate) sst: SST,
    pub(crate) data_dir: PathBuf,
}

impl ShorterDB {
    pub fn set(&mut self, key: &[u8], value: &[u8]) -> Result<()> {
        let entry = WALEntry {
            key: Bytes::copy_from_slice(key),
            value: Bytes::copy_from_slice(value),
        };
        // Log the write first, then apply it to the in-memory table.
        self.wal.write(&entry)?;
        self.memtable.set(key, value)?;
        Ok(())
    }
}
```
197+
198+
---
199+
200+
## Limitations
247+
248+
- Performance is not optimized for production use.
249+
- Limited concurrency support.
250+
- No advanced features like compression or compaction.
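
To illustrate the concurrency limitation: the gRPC example guards the database with `self.db.lock().await`, so requests are served one at a time. The sketch below shows the same pattern with std types only. `SharedMap` and `concurrent_writes` are hypothetical names, and a `BTreeMap` stands in for the database.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for the database: any type whose writes need `&mut self`.
type SharedMap = BTreeMap<Vec<u8>, Vec<u8>>;

/// Spawns `writers` threads that each insert one key through a shared lock.
pub fn concurrent_writes(writers: u8) -> SharedMap {
    let shared = Arc::new(Mutex::new(SharedMap::new()));
    let handles: Vec<_> = (0..writers)
        .map(|id| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                // Only one thread holds the lock at a time, so writes are serialized.
                let mut store = shared.lock().unwrap();
                store.insert(vec![id], b"value".to_vec());
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // All writers have finished; copy the map out from behind the lock.
    let store = shared.lock().unwrap().clone();
    store
}
```

A single lock keeps the store consistent but means readers also wait behind writers, which is one reason concurrency is listed as limited.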
251+
252+
---
253+
254+
## Future Work
255+
256+
- Add support for compression.
257+
- Implement advanced compaction strategies.
258+
- Improve concurrency and parallelism.
259+
260+
---
261+
262+
## Architecture Overview
263+
264+
ShorterDB is built using a modular architecture that separates concerns into distinct components:
265+
266+
### Write-Ahead Log (WAL)
267+
268+
The WAL provides durability by appending every write to a log file before it is applied to the in-memory `Memtable`, so that entries not yet flushed to an SST can be recovered after a crash.
269+
270+
```rust
pub(crate) struct WAL {
    path: PathBuf,
    file: File,
}

impl WAL {
    /// Appends one entry as: key length, key bytes, value length, value bytes.
    pub(crate) fn write(&mut self, entry: &WALEntry) -> io::Result<()> {
        // `len()` is a `usize`, so the width of the length prefix depends on the platform.
        self.file.write_all(&entry.key.len().to_le_bytes())?;
        self.file.write_all(entry.key.as_ref())?;
        self.file.write_all(&entry.value.len().to_le_bytes())?;
        self.file.write_all(entry.value.as_ref())?;
        // `flush` on a `File` does not force the bytes to disk;
        // that would need `File::sync_data`.
        self.file.flush()?;
        Ok(())
    }
}
```
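
The `write` method above covers only the logging side. Recovery would read the same length-prefixed records back in order. The following is a minimal sketch of that idea, not the project's recovery code: it assumes fixed 8-byte little-endian length prefixes (`u64`) rather than `usize`, and `write_record`, `replay`, and `read_field` are hypothetical names.

```rust
use std::convert::{TryFrom, TryInto};
use std::io::{self, Write};

/// Writes one record as: key length (u64 LE), key, value length (u64 LE), value.
pub fn write_record<W: Write>(out: &mut W, key: &[u8], value: &[u8]) -> io::Result<()> {
    out.write_all(&(key.len() as u64).to_le_bytes())?;
    out.write_all(key)?;
    out.write_all(&(value.len() as u64).to_le_bytes())?;
    out.write_all(value)?;
    Ok(())
}

/// Reads one length-prefixed field starting at `pos`. Returns the field and the
/// offset just past it, or `None` if the input ends before the field is complete.
fn read_field(bytes: &[u8], pos: usize) -> Option<(&[u8], usize)> {
    let len_end = pos.checked_add(8)?;
    let len_bytes: [u8; 8] = bytes.get(pos..len_end)?.try_into().ok()?;
    let len = usize::try_from(u64::from_le_bytes(len_bytes)).ok()?;
    let end = len_end.checked_add(len)?;
    Some((bytes.get(len_end..end)?, end))
}

/// Decodes every complete record in `bytes`, in write order. A truncated
/// trailing record (for example from a crash mid-write) is ignored.
pub fn replay(bytes: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
    let mut records = Vec::new();
    let mut pos = 0;
    while let Some((key, next)) = read_field(bytes, pos) {
        let Some((value, next)) = read_field(bytes, next) else { break };
        records.push((key.to_vec(), value.to_vec()));
        pos = next;
    }
    records
}
```

Replaying the records in order into an empty Memtable would rebuild the in-memory state, with later writes to the same key overriding earlier ones.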
287+
288+
### Memtable
289+
290+
The `Memtable` stores key-value pairs in memory. It uses a `SkipMap`, which keeps keys sorted for efficient lookups, and counts `set` calls. Once the count reaches 256, `set` returns `FlushNeededFromMemTable` to signal that the Memtable should be flushed to an SST.
291+
292+
```rust
pub(crate) struct Memtable {
    pub(crate) memtable: Arc<SkipMap<Bytes, Bytes>>,
    /// Number of `set` calls so far (not bytes).
    pub(crate) size: u64,
}

impl Memtable {
    pub(crate) fn set(&mut self, key: &[u8], value: &[u8]) -> Result<()> {
        self.memtable.insert(Bytes::copy_from_slice(key), Bytes::copy_from_slice(value));
        self.size += 1;
        // The entry above is already stored; the error only signals that a flush is due.
        if self.size >= 256 {
            return Err(ShortDBErrors::FlushNeededFromMemTable);
        }
        Ok(())
    }
}
```
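
One way a caller could react to the flush signal is to swap in a fresh Memtable and queue the full one for writing to disk. The sketch below illustrates that handoff under stated assumptions and is not the project's implementation: `Store` and `FLUSH_THRESHOLD` are hypothetical names, a `BTreeMap` replaces `SkipMap` so that only std is needed, and the swap happens inside `set` instead of being signaled through an error.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Entry-count threshold, matching the 256 used in the excerpt above.
const FLUSH_THRESHOLD: usize = 256;

/// Stand-in store: `BTreeMap` replaces `SkipMap` so the sketch needs only std.
#[derive(Default)]
pub struct Store {
    active: BTreeMap<Vec<u8>, Vec<u8>>,
    /// Full memtables waiting to be written to disk, oldest first.
    pub flush_queue: VecDeque<BTreeMap<Vec<u8>, Vec<u8>>>,
}

impl Store {
    pub fn set(&mut self, key: &[u8], value: &[u8]) {
        self.active.insert(key.to_vec(), value.to_vec());
        // Counting map entries means overwriting a key does not move
        // the store closer to a flush.
        if self.active.len() >= FLUSH_THRESHOLD {
            // Swap in an empty memtable and queue the full one.
            let full = std::mem::take(&mut self.active);
            self.flush_queue.push_back(full);
        }
    }

    pub fn get(&self, key: &[u8]) -> Option<&[u8]> {
        // Newest data first: the active memtable, then queued ones from newest to oldest.
        self.active
            .get(key)
            .or_else(|| self.flush_queue.iter().rev().find_map(|m| m.get(key)))
            .map(|v| v.as_slice())
    }
}
```

Keeping queued memtables readable until they are on disk means a key written just before a swap is still visible to `get`.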
309+
310+
### Sorted String Table (SST)
311+
312+
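The SST is a persistent, sorted, and immutable data structure stored on disk. It is used for long-term storage of key-value pairs. In the flush routine below, each entry of the oldest queued Memtable is written to its own file under the `l0` directory.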
313+
314+
```rust
pub(crate) struct SST {
    pub(crate) dir: PathBuf,
    pub(crate) levels: Vec<PathBuf>,
    pub(crate) queue: VecDeque<Memtable>,
}

impl SST {
    /// Flushes the oldest queued Memtable to disk, one file per key under `l0/`.
    pub(crate) fn set(&mut self) {
        // Panics if the queue is empty.
        let mem = self.queue.pop_front().unwrap();
        for entry in mem.memtable.iter() {
            let key = entry.key();
            let value = entry.value();
            let mut path_of_kv_file = self.dir.clone();
            path_of_kv_file.push("l0");
            path_of_kv_file.push(bytes_to_string(key));
            // `create_new` fails if a file for this key already exists.
            let mut file = File::create_new(&path_of_kv_file).unwrap();
            file.write_all(value).unwrap();
        }
    }
}
```
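
With one file per key, a lookup on disk reduces to opening the file named after the key. The sketch below shows a matching write and read pair under stated assumptions: `put`, `get`, and `key_file_name` are hypothetical helpers, keys are hex-encoded so that arbitrary bytes give a valid file name (the behavior of the project's `bytes_to_string` is not shown in this README), and existing files are overwritten rather than rejected.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hex-encodes a key so arbitrary bytes map to a safe file name.
/// The `k` prefix keeps the name non-empty even for an empty key.
pub fn key_file_name(key: &[u8]) -> String {
    let mut name = String::from("k");
    for b in key {
        name.push_str(&format!("{b:02x}"));
    }
    name
}

fn key_path(dir: &Path, key: &[u8]) -> PathBuf {
    dir.join("l0").join(key_file_name(key))
}

/// Writes `value` to the file for `key`, replacing any older version.
pub fn put(dir: &Path, key: &[u8], value: &[u8]) -> io::Result<()> {
    fs::create_dir_all(dir.join("l0"))?;
    fs::write(key_path(dir, key), value)
}

/// Reads the value for `key`; a missing file means the key is absent.
pub fn get(dir: &Path, key: &[u8]) -> io::Result<Option<Vec<u8>>> {
    match fs::read(key_path(dir, key)) {
        Ok(value) => Ok(Some(value)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

This layout is easy to inspect by hand, at the cost of one file system entry per key and no ordering between keys on disk.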
336+
337+
---
338+
339+
## Contributing
356+
357+
Contributions are welcome! To contribute:
358+
359+
1. Fork the repository.
360+
2. Create a new branch for your feature or bugfix.
361+
3. Submit a pull request with a clear description of your changes.
362+
363+
---
364+
365+
## Conclusion
366+
367+
ShorterDB is a simple and modular key-value store designed for learning and experimentation. While it may not match the performance of production-grade systems, it provides a clear and extensible implementation of database concepts. Explore the [examples](#examples) to get started!
