Commit 172cc7d

Merge pull request #7 from BoakDB/readme
Readme
2 parents 41ad246 + 32d996f commit 172cc7d

13 files changed: +421 -62 lines changed

README.md

Lines changed: 228 additions & 1 deletion
@@ -1 +1,228 @@
1-
# B-Oak-DB
1+
# Disk Manager - B+ Tree Storage Engine
2+
3+
A concurrent, disk-based storage engine written in Java. It implements a B+ tree index on top of a buffer pool with LRU eviction and asynchronous disk I/O.
4+
5+
## Project Overview
6+
7+
This project is a complete database storage engine that provides:
8+
- **B+ Tree Indexing**: Efficient data storage and retrieval with logarithmic time complexity
9+
- **Buffer Pool Management**: Intelligent memory management with LRU eviction policies
10+
- **Concurrent Access Control**: Thread-safe operations with fine-grained locking
11+
- **Asynchronous Disk I/O**: Non-blocking disk operations for improved performance
12+
- **Type System**: Support for multiple data types with efficient serialization
13+
14+
## Architecture Overview
15+
16+
The system follows a layered architecture with clear separation of concerns:
17+
18+
```
19+
┌─────────────────────────────────────────────────────────────┐
20+
│ Application Layer │
21+
├─────────────────────────────────────────────────────────────┤
22+
│ Index Manager │
23+
│ (Collection Management) │
24+
├─────────────────────────────────────────────────────────────┤
25+
│ B+ Tree │
26+
│ (Indexing & Query Processing) │
27+
├─────────────────────────────────────────────────────────────┤
28+
│ Buffer Pool │
29+
│ (Memory Management) │
30+
├─────────────────────────────────────────────────────────────┤
31+
│ Disk Manager │
32+
│ (Persistent Storage) │
33+
└─────────────────────────────────────────────────────────────┘
34+
```
35+
36+
## Core Components
37+
38+
### 1. Disk Manager (`src/diskmanager/`)
39+
40+
**Purpose**: Handles all disk I/O operations with asynchronous request processing.
41+
42+
**Key Features**:
43+
- **Asynchronous I/O**: Requests go through a blocking queue and are served by a thread pool, so callers receive a future instead of blocking on the disk operation
44+
- **File Management**: Automatic file creation and management in the `storage/` directory
45+
- **Page Allocation**: Pre-allocates pages in chunks (1024 pages = 8MB) for better performance
46+
- **Concurrent Access**: Thread-safe operations with per-file resize locks
47+
48+
**Key Classes**:
49+
- `BasicDiskManager`: Main implementation with request queue processing
50+
- `DiskRequest`: Encapsulates read/write requests with completion futures
51+
- `RandomAccessDiskFile`: File abstraction for page-based I/O
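
The queue-plus-future pattern described above can be sketched roughly as follows. This is a minimal illustration, not the project's `BasicDiskManager`: the class `AsyncDiskSketch`, its `Request` type, and the in-memory map that stands in for a file are all hypothetical, and a single worker thread replaces the thread pool to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: callers enqueue requests and get a future back;
// one worker thread drains the queue and completes the futures.
final class AsyncDiskSketch implements AutoCloseable {
    static final int PAGE_SIZE = 8192;

    /** One queued request. A null payload means "read". */
    private static final class Request {
        final long pageId;
        final byte[] dataToWrite;
        final CompletableFuture<byte[]> done = new CompletableFuture<>();

        Request(long pageId, byte[] dataToWrite) {
            this.pageId = pageId;
            this.dataToWrite = dataToWrite;
        }
    }

    private final BlockingQueue<Request> queue = new LinkedBlockingQueue<>();
    // Stands in for the on-disk file; only the worker thread touches it.
    private final Map<Long, byte[]> pages = new HashMap<>();
    private final Thread worker;

    AsyncDiskSketch() {
        worker = new Thread(this::drain, "disk-worker");
        worker.setDaemon(true);
        worker.start();
    }

    CompletableFuture<byte[]> read(long pageId) {
        return submit(new Request(pageId, null));
    }

    CompletableFuture<byte[]> write(long pageId, byte[] data) {
        return submit(new Request(pageId, data.clone()));
    }

    private CompletableFuture<byte[]> submit(Request request) {
        queue.add(request);
        return request.done;
    }

    /** Worker loop: take the next request, perform it, complete its future. */
    private void drain() {
        try {
            while (true) {
                Request r = queue.take();
                if (r.dataToWrite != null) {
                    pages.put(r.pageId, r.dataToWrite);
                    r.done.complete(r.dataToWrite.clone());
                } else {
                    byte[] page = pages.getOrDefault(r.pageId, new byte[PAGE_SIZE]);
                    r.done.complete(page.clone());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // close() was called
        }
    }

    @Override
    public void close() {
        worker.interrupt();
    }
}
```

A caller that needs the result immediately can call `join()` on the returned future; a caller that does not can chain further work with `thenApply` and continue.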
52+
53+
### 2. Buffer Pool (`src/bufferpool/`)
54+
55+
**Purpose**: Manages memory efficiently with caching and eviction policies.
56+
57+
**Key Features**:
58+
- **LRU Eviction**: Least Recently Used replacement policy with configurable K parameter
59+
- **Pin/Unpin Mechanism**: Pages can be pinned to prevent eviction during operations
60+
- **Guard System**: Read/Write guards provide safe concurrent access to pages
61+
- **Dirty Page Management**: Automatic flushing of modified pages to disk
62+
63+
**Key Classes**:
64+
- `BufferPool`: Main buffer pool implementation
65+
- `Frame`: Represents a page in memory with metadata
66+
- `LRU`: Implements the LRU replacement algorithm
67+
- `ReadGuard`/`WriteGuard`: Provide safe concurrent access to pages
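
The README does not spell out the eviction rule, but "LRU with a configurable K parameter" reads like the LRU-K policy. The sketch below assumes that interpretation: each frame keeps its last K access timestamps, pinned frames are never chosen, frames with fewer than K recorded accesses are evicted first, and otherwise the frame whose K-th most recent access is oldest loses. `LruKSketch` and its methods are hypothetical names, not the project's `LRU` class.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical LRU-K replacer with pin counts.
final class LruKSketch {
    private static final class History {
        final Deque<Long> accesses = new ArrayDeque<>(); // oldest first, at most k entries
        int pinCount;
    }

    private final int k;
    private final Map<Integer, History> frames = new HashMap<>();
    private long clock;

    LruKSketch(int k) {
        this.k = k;
    }

    /** Records an access to the frame and pins it so it cannot be evicted. */
    synchronized void pin(int frameId) {
        History h = frames.computeIfAbsent(frameId, id -> new History());
        h.accesses.addLast(++clock);
        if (h.accesses.size() > k) {
            h.accesses.removeFirst();
        }
        h.pinCount++;
    }

    /** Releases one pin; the frame becomes evictable when the count reaches zero. */
    synchronized void unpin(int frameId) {
        History h = frames.get(frameId);
        if (h != null && h.pinCount > 0) {
            h.pinCount--;
        }
    }

    /** Chooses and removes a victim among unpinned frames, or returns empty. */
    synchronized OptionalInt evict() {
        Integer victim = null;
        boolean victimFull = true;
        long victimStamp = Long.MAX_VALUE;
        for (Map.Entry<Integer, History> e : frames.entrySet()) {
            History h = e.getValue();
            if (h.pinCount > 0) {
                continue;
            }
            boolean full = h.accesses.size() == k;
            long stamp = h.accesses.peekFirst(); // k-th most recent access when full
            boolean better = victim == null
                    || (!full && victimFull)
                    || (full == victimFull && stamp < victimStamp);
            if (better) {
                victim = e.getKey();
                victimFull = full;
                victimStamp = stamp;
            }
        }
        if (victim == null) {
            return OptionalInt.empty();
        }
        frames.remove(victim);
        return OptionalInt.of(victim);
    }
}
```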
68+
69+
### 3. B+ Tree (`src/btree/`)
70+
71+
**Purpose**: Primary indexing structure providing efficient data storage and retrieval.
72+
73+
**Key Features**:
74+
- **Optimistic Concurrency**: Uses optimistic locking for better performance
75+
- **Node Splitting**: Automatic node splitting when capacity is exceeded
76+
- **Range Queries**: Support for cursor-based range scanning
77+
- **Composite Keys**: Support for multi-column keys through `Compositekey`
78+
79+
**Key Classes**:
80+
- `Btree`: Main B+ tree implementation with insert/search/delete operations
81+
- `BtreeHeader`: Manages tree metadata (root page ID, height)
82+
- `Cursor`: Provides iterator-style access for range queries
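
How the tree implements optimistic locking is not shown in this commit. As one possible illustration of the idea, the sketch below uses the JDK's `java.util.concurrent.locks.StampedLock`: a reader copies the fields without taking a lock, validates that no writer intervened, and falls back to a real read lock only if validation fails. The class `OptimisticHeaderSketch` and its fields are hypothetical stand-ins for something like the tree header.

```java
import java.util.concurrent.locks.StampedLock;

// Hypothetical sketch of optimistic reads over tree metadata.
final class OptimisticHeaderSketch {
    private final StampedLock lock = new StampedLock();
    private long rootPageId = -1;
    private int height = 0;

    /** Writers always take the exclusive lock. */
    void setRoot(long newRootPageId, int newHeight) {
        long stamp = lock.writeLock();
        try {
            rootPageId = newRootPageId;
            height = newHeight;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    /** Returns {rootPageId, height} as one consistent snapshot. */
    long[] snapshot() {
        long stamp = lock.tryOptimisticRead();
        long root = rootPageId;
        int h = height;
        if (!lock.validate(stamp)) {
            // A writer ran in between: retry under a shared read lock.
            stamp = lock.readLock();
            try {
                root = rootPageId;
                h = height;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return new long[] { root, h };
    }
}
```

The appeal of this style is that uncontended readers never write to shared lock state, which matters for the root and upper levels of a tree that every operation passes through.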
83+
84+
### 4. Page Management (`src/page/`)
85+
86+
**Purpose**: Manages the structure of internal and leaf nodes in the B+ tree.
87+
88+
**Key Features**:
89+
- **Node Types**: Separate implementations for internal and leaf nodes
90+
- **Variable-Length Records**: Efficient storage of variable-sized data
91+
- **Split Operations**: Sophisticated node splitting with sibling redistribution
92+
- **Linked Leaf Nodes**: Leaf nodes are linked for efficient range queries
93+
94+
**Key Classes**:
95+
- `LeafNode`: Stores actual key-value pairs
96+
- `InternalNode`: Stores routing information for tree navigation
97+
- `TreeNodeHeader`: Common header structure for all node types
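
To make the split and sibling-link ideas concrete, here is a small in-memory sketch. It is an assumption-laden simplification: `LeafSketch` holds plain `int` keys in a list rather than variable-length records in a page, and it ignores redistribution with existing siblings. It only shows how a split hands the upper half to a new right sibling, returns the separator key for the parent, and keeps the leaf chain intact for range scans.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory leaf; the real nodes live in 8KB pages.
final class LeafSketch {
    final List<Integer> keys = new ArrayList<>();
    LeafSketch next; // right sibling, or null for the last leaf

    /** Inserts a key while keeping the leaf sorted. */
    void insert(int key) {
        int i = 0;
        while (i < keys.size() && keys.get(i) < key) {
            i++;
        }
        keys.add(i, key);
    }

    /**
     * Moves the upper half of the keys into a new right sibling and links it in.
     * Returns the separator key (the first key of the new sibling).
     * Expects at least two keys in the leaf.
     */
    int split() {
        LeafSketch right = new LeafSketch();
        int mid = keys.size() / 2;
        right.keys.addAll(keys.subList(mid, keys.size()));
        keys.subList(mid, keys.size()).clear();
        right.next = next;
        next = right;
        return right.keys.get(0);
    }

    /** Collects keys in [lo, hi] by walking the sibling links from a starting leaf. */
    static List<Integer> scan(LeafSketch start, int lo, int hi) {
        List<Integer> out = new ArrayList<>();
        for (LeafSketch leaf = start; leaf != null; leaf = leaf.next) {
            for (int k : leaf.keys) {
                if (k > hi) {
                    return out;
                }
                if (k >= lo) {
                    out.add(k);
                }
            }
        }
        return out;
    }
}
```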
98+
99+
### 5. Type System (`src/types/`)
100+
101+
**Purpose**: Provides efficient serialization and comparison for various data types.
102+
103+
**Key Features**:
104+
- **Primitive Types**: Support for Integer, Long, Double, Float, Short, Byte
105+
- **Composite Keys**: Multi-column keys with proper comparison semantics
106+
- **Memory Codecs**: Efficient binary serialization/deserialization
107+
- **Type Safety**: Compile-time type checking with generic templates
108+
109+
**Key Classes**:
110+
- `Compositekey`: Multi-column key implementation
111+
- `Template`: Type definition for keys and values
112+
- `Key`: Individual key component with type-specific codecs
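
As a rough picture of what "binary serialization" and "proper comparison semantics" mean for a multi-column key, the sketch below fixes a two-column key of `(int, long)`. It is not the project's `Compositekey` or `Template` API; the class `CompositeKeySketch` and its layout are assumptions chosen for illustration. Columns are written in order into a `ByteBuffer`, and comparison proceeds column by column, so the first differing column decides.

```java
import java.nio.ByteBuffer;

// Hypothetical fixed-shape composite key: (int, long).
final class CompositeKeySketch implements Comparable<CompositeKeySketch> {
    static final int ENCODED_SIZE = Integer.BYTES + Long.BYTES; // 12 bytes

    final int first;
    final long second;

    CompositeKeySketch(int first, long second) {
        this.first = first;
        this.second = second;
    }

    /** Writes the columns in declaration order at the buffer's current position. */
    void writeTo(ByteBuffer buf) {
        buf.putInt(first).putLong(second);
    }

    /** Reads the columns back in the same order. */
    static CompositeKeySketch readFrom(ByteBuffer buf) {
        int a = buf.getInt();
        long b = buf.getLong();
        return new CompositeKeySketch(a, b);
    }

    /** Lexicographic order: compare the first column, then the second on ties. */
    @Override
    public int compareTo(CompositeKeySketch other) {
        int c = Integer.compare(first, other.first);
        return c != 0 ? c : Long.compare(second, other.second);
    }
}
```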
113+
114+
### 6. Index Manager (`src/indexmanager/`)
115+
116+
**Purpose**: Provides higher-level index management and collection support.
117+
118+
**Key Features**:
119+
- **Collection Management**: Organizes indexes by collection name
120+
- **Index Naming**: Uses "collectionName-fieldName" naming convention
121+
- **Lifecycle Management**: Handles index creation and cleanup
122+
123+
## Configuration
124+
125+
Key system parameters are defined in `src/globals/Globals.java`:
126+
127+
```java
128+
public static final int PAGE_SIZE = 2 * 4096; // 8KB pages
129+
public static final int CLUSTER_PAGE_SIZE = 4 * 4096; // 16KB clusters
130+
public static final int PRE_ALLOCATED_PAGES_COUNT = 1024; // 8MB pre-allocation
131+
public static final long INVALID_PAGE_ID = -1; // Invalid page marker
132+
```
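
The arithmetic behind these constants: one page is 2 × 4096 = 8192 bytes, so 1024 pre-allocated pages occupy 1024 × 8192 = 8,388,608 bytes (8 MiB). The sketch below works through the same numbers in code. It assumes pages are stored back to back starting at file offset 0 and that files grow in whole pre-allocation chunks; `PageMath` and its methods are hypothetical helpers, not part of `Globals`.

```java
// Hypothetical helper illustrating the page-size arithmetic.
final class PageMath {
    static final int PAGE_SIZE = 2 * 4096;                // 8192 bytes
    static final int PRE_ALLOCATED_PAGES_COUNT = 1024;

    /** Byte offset of a page, assuming contiguous pages from offset 0. */
    static long offsetOf(long pageId) {
        return pageId * PAGE_SIZE;
    }

    /** Smallest file length, in whole pre-allocation chunks, that contains the page. */
    static long fileLengthFor(long pageId) {
        long chunkBytes = (long) PRE_ALLOCATED_PAGES_COUNT * PAGE_SIZE; // 8 MiB
        long needed = offsetOf(pageId) + PAGE_SIZE;
        return ((needed + chunkBytes - 1) / chunkBytes) * chunkBytes;
    }
}
```

Under these assumptions, page 1023 is the last page that fits in the first chunk, and touching page 1024 grows the file to two chunks.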
133+
134+
## Performance Characteristics
135+
136+
- **Time Complexity**: O(log n) for search, insert, and delete operations
137+
- **Memory Use**: Bounded by the configurable buffer pool size; pages beyond that limit are evicted by the LRU policy
138+
- **Concurrency**: Optimistic locking with fine-grained page-level locks
139+
- **I/O Efficiency**: Page-based storage with asynchronous disk operations
140+
141+
## Usage Example
142+
143+
```java
144+
// Initialize components
145+
DiskManager diskManager = new BasicDiskManager();
146+
BufferPool bufferPool = new BufferPool(4000, 10, diskManager);
147+
148+
// Create B+ tree
149+
Template keyType = new Template(Integer.class);
150+
Template valueType = new Template(String.class);
151+
Btree btree = new Btree(keyType, valueType, "myindex",
152+
Globals.INVALID_PAGE_ID, bufferPool);
153+
154+
// Insert data
155+
Compositekey key = new Compositekey(keyType);
156+
key.set(0, 42, Integer.class);
157+
Compositekey value = new Compositekey(valueType);
158+
value.set(0, "Hello World", String.class);
159+
btree.insert(key, value);
160+
161+
// Search data
162+
Compositekey result = btree.get(key);
163+
```
164+
165+
## Build and Run
166+
167+
### Prerequisites
168+
- Java 11 or higher
169+
- Maven 3.6 or higher
170+
171+
### Building
172+
```bash
173+
# Compile all source files
174+
javac -d bin $(find src -name "*.java")
175+
176+
# Or use Maven
177+
mvn compile
178+
```
179+
180+
### Running Tests
181+
```bash
182+
# Run with Maven
183+
mvn test
184+
185+
# Or run manually (the JUnit 4 jar must also be on the classpath)
186+
java -cp bin org.junit.runner.JUnitCore test.btree.BtreeTest
187+
```
188+
189+
### Running the Application
190+
```bash
191+
java -cp bin Main
192+
```
193+
194+
## Storage Layout
195+
196+
The system stores data in the `storage/` directory with the following structure:
197+
- Each index is stored as a separate file
198+
- Files use page-based layout (8KB pages)
199+
- Header pages contain metadata (root page ID, tree height)
200+
- Data pages contain either internal nodes or leaf nodes
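
The exact byte layout of the header page is not given here. Purely as an illustration, the sketch below assumes the root page ID occupies the first 8 bytes and the tree height the next 4, with the rest of the 8KB page left as zeros. `HeaderPageSketch` is a hypothetical name and the offsets are assumptions, not the layout used by `BtreeHeader`.

```java
import java.nio.ByteBuffer;

// Hypothetical header-page layout: [rootPageId: 8 bytes][height: 4 bytes][zero padding].
final class HeaderPageSketch {
    static final int PAGE_SIZE = 2 * 4096;

    /** Builds a full page image holding the two metadata fields. */
    static byte[] encode(long rootPageId, int height) {
        ByteBuffer page = ByteBuffer.allocate(PAGE_SIZE); // zero-filled
        page.putLong(rootPageId).putInt(height);
        return page.array();
    }

    /** Reads the root page ID from offset 0. */
    static long rootPageId(byte[] page) {
        return ByteBuffer.wrap(page).getLong(0);
    }

    /** Reads the height from the offset just after the root page ID. */
    static int height(byte[] page) {
        return ByteBuffer.wrap(page).getInt(Long.BYTES);
    }
}
```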
201+
202+
## Thread Safety
203+
204+
The system is designed for high concurrency:
205+
- **Buffer Pool**: Uses fine-grained locking with per-frame locks
206+
- **Disk Manager**: Asynchronous request processing with thread-safe queues
207+
- **B+ Tree**: Optimistic concurrency control with context-based locking
208+
209+
## Testing
210+
211+
Comprehensive test suites are provided:
212+
- **Unit Tests**: Individual component testing
213+
- **Integration Tests**: Cross-component functionality
214+
- **Performance Tests**: Benchmarking with up to 1M operations
215+
- **Concurrency Tests**: Multi-threaded stress testing
216+
217+
## Future Enhancements
218+
219+
Potential areas for improvement:
220+
- **Compression**: Page-level compression for better storage efficiency
221+
- **Logging**: Write-ahead logging for crash recovery
222+
- **Clustering**: Support for clustered indexes
223+
- **String Types**: Enhanced support for variable-length strings
224+
- **Transactions**: ACID transaction support
225+
226+
## License
227+
228+
This project is licensed under the terms specified in the LICENSE file.

src/btree/Btree.java

Lines changed: 2 additions & 0 deletions
@@ -4,6 +4,8 @@
 import bufferpool.ReadGuard;
 import bufferpool.WriteGuard;
 import globals.Globals;
+import indexmanager.Index;
+
 import java.nio.ByteBuffer;
 import java.util.ArrayDeque;
 import java.util.Deque;

src/globals/Globals.java

Lines changed: 4 additions & 6 deletions
@@ -1,12 +1,10 @@
 package globals;
 
-import java.io.File;
-
 public class Globals {
-    public static final int PAGE_SIZE = 2 * 4096; // 4KB
-    public static final int PRE_ALLOCATED_PAGES_COUNT = 1024; // 4Mb
+    public static final int PAGE_SIZE = 2 * 4096; // 8KB
+    public static final int CLUSTER_PAGE_SIZE = 4 * 4096; // 16KB
+    public static final int PRE_ALLOCATED_PAGES_COUNT = 1024; // 8Mb
     public static final long INVALID_PAGE_ID = -1;
     public static final int INVALID_Frame_ID = -1;
-
-    public File logs;
+    public static final String STORAGE_DIR = "storage/";
 }
Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
1+
package indexmanager;
2+
3+
import java.io.Closeable;
4+
import java.io.FileInputStream;
5+
import java.io.FileNotFoundException;
6+
import java.io.FileOutputStream;
7+
import java.io.IOException;
8+
import java.io.ObjectInputStream;
9+
import java.io.ObjectOutputStream;
10+
import java.nio.file.Files;
11+
import java.nio.file.Path;
12+
import java.nio.file.Paths;
13+
import java.util.HashSet;
14+
import java.util.Set;
15+
16+
import globals.*;
17+
18+
public class CollectionMan implements Closeable {
19+
private final static String fileSuffix = ".man";
20+
private static String fileName;
21+
private static String collection;
22+
private static ObjectInputStream reader;
23+
private static ObjectOutputStream writer;
24+
private static Set<String> indexes;
25+
26+
public CollectionMan(String collectionName) throws IOException, ClassNotFoundException{
27+
collection = collectionName;
28+
fileName = collectionName + fileSuffix;
29+
initRW();
30+
}
31+
32+
public void initRW() throws IOException, ClassNotFoundException{
33+
try {
34+
reader = new ObjectInputStream(new FileInputStream(Globals.STORAGE_DIR + fileName));
35+
writer = new ObjectOutputStream(new FileOutputStream(Globals.STORAGE_DIR + fileName));
36+
read();
37+
} catch (FileNotFoundException e) {
38+
indexes = new HashSet<String>();
39+
Path filePath = Paths.get(Globals.STORAGE_DIR + fileName);
40+
Files.createFile(filePath);
41+
reader = new ObjectInputStream(new FileInputStream(Globals.STORAGE_DIR + fileName));
42+
writer = new ObjectOutputStream(new FileOutputStream(Globals.STORAGE_DIR + fileName));
43+
}
44+
}
45+
46+
public void close() throws IOException {
47+
flush();
48+
}
49+
50+
private static void flush() throws IOException {
51+
writer.writeObject(indexes);
52+
}
53+
54+
@SuppressWarnings("unchecked")
55+
private static void read() throws IOException, ClassNotFoundException {
56+
indexes = (Set<String>) reader.readObject();
57+
}
58+
59+
public boolean hasIndex(String indexName) {
60+
return indexes.contains(indexName);
61+
}
62+
63+
public String getIndexForField(String field) {
64+
String exepectedIndexName = collection + "-" + field;
65+
if (indexes.contains(exepectedIndexName)) {
66+
return exepectedIndexName;
67+
}
68+
return null;
69+
}
70+
}
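
A note on the stream handling in `initRW()`: the `ObjectInputStream` constructor reads a stream header, so opening it on a freshly created empty file throws `EOFException`, and `new FileOutputStream(path)` truncates an existing file as soon as it is opened. The sketch below shows one way to sidestep both, under the assumption that the index set only needs to be loaded once and saved on close. `IndexSetStore` and its methods are hypothetical and are not part of this commit.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

// Hypothetical load/save helper: each stream is opened only for the
// duration of one operation and closed by try-with-resources.
final class IndexSetStore {
    /** Returns the stored index names, or an empty set if the file is missing or empty. */
    @SuppressWarnings("unchecked")
    static Set<String> load(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file) || Files.size(file) == 0) {
            return new HashSet<>();
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Set<String>) in.readObject();
        }
    }

    /** Replaces the file contents with the given index names. */
    static void save(Path file, Set<String> indexes) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(new HashSet<>(indexes));
        }
    }
}
```

Keeping the set in an instance field rather than a `static` one would also let several collections be open at once without sharing state.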
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-package btree;
+package indexmanager;
 
 import types.Compositekey;
 

src/indexmanager/IndexManager.java

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
1+
package indexmanager;
2+
3+
import java.io.Closeable;
4+
import java.io.IOException;
5+
import java.util.HashMap;
6+
import java.util.Map;
7+
8+
9+
/**
10+
* index naming would be in this format
11+
* "collectionName-fieldName"
12+
*/
13+
public class IndexManager implements Closeable{
14+
private static Map<String, CollectionMan> collections;
15+
public IndexManager(){
16+
collections = new HashMap<String, CollectionMan>();
17+
}
18+
19+
public void close() {
20+
for(CollectionMan man: collections.values()) {
21+
try {
22+
man.close();
23+
} catch (IOException e) {
24+
e.printStackTrace();
25+
}
26+
}
27+
}
28+
29+
public boolean hasIndex(String collection, String indexName) {
30+
if(!collections.containsKey(collection)) {
31+
return false;
32+
}
33+
34+
CollectionMan man = collections.get(collection);
35+
return man.hasIndex(indexName);
36+
}
37+
38+
public String getIndexForField(String collection, String field) {
39+
if(!collections.containsKey(collection)) {
40+
return null;
41+
}
42+
CollectionMan man = collections.get(collection);
43+
return man.getIndexForField(field);
44+
}
45+
46+
}
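
In the class as committed, nothing adds entries to `collections`, so both lookups return their "not found" value for every input. The sketch below shows, as an assumption about the intended behavior, how a registration method could populate the map lazily and apply the "collectionName-fieldName" convention. `IndexRegistrySketch` and `register` are hypothetical, and the example keeps index names in memory instead of delegating to `CollectionMan`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory registry keyed by collection name.
final class IndexRegistrySketch {
    private final Map<String, Set<String>> collections = new HashMap<>();

    /** Registers an index for a field and returns its "collectionName-fieldName" name. */
    String register(String collection, String field) {
        String indexName = collection + "-" + field;
        collections.computeIfAbsent(collection, c -> new HashSet<>()).add(indexName);
        return indexName;
    }

    boolean hasIndex(String collection, String indexName) {
        Set<String> indexes = collections.get(collection);
        return indexes != null && indexes.contains(indexName);
    }

    /** Returns the index name for the field, or null if none is registered. */
    String getIndexForField(String collection, String field) {
        String expected = collection + "-" + field;
        return hasIndex(collection, expected) ? expected : null;
    }
}
```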
