| 1 | | -# B-Oak-DB |
| 1 | +# Disk Manager - B+ Tree Storage Engine |
| 2 | + |
| 3 | +A concurrent, disk-based storage engine written in Java. It implements a B+ tree index on top of a buffer pool with LRU eviction and asynchronous disk I/O. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +This project is a complete database storage engine that provides: |
| 8 | +- **B+ Tree Indexing**: Efficient data storage and retrieval with logarithmic time complexity |
| 9 | +- **Buffer Pool Management**: Intelligent memory management with LRU eviction policies |
| 10 | +- **Concurrent Access Control**: Thread-safe operations with fine-grained locking |
| 11 | +- **Asynchronous Disk I/O**: Non-blocking disk operations for improved performance |
| 12 | +- **Type System**: Support for multiple data types with efficient serialization |
| 13 | + |
| 14 | +## Architecture Overview |
| 15 | + |
| 16 | +The system follows a layered architecture with clear separation of concerns: |
| 17 | + |
| 18 | +``` |
| 19 | +┌─────────────────────────────────────────────────────────────┐ |
| 20 | +│                      Application Layer                      │ |
| 21 | +├─────────────────────────────────────────────────────────────┤ |
| 22 | +│                        Index Manager                        │ |
| 23 | +│                   (Collection Management)                   │ |
| 24 | +├─────────────────────────────────────────────────────────────┤ |
| 25 | +│                           B+ Tree                           │ |
| 26 | +│                (Indexing & Query Processing)                │ |
| 27 | +├─────────────────────────────────────────────────────────────┤ |
| 28 | +│                         Buffer Pool                         │ |
| 29 | +│                     (Memory Management)                     │ |
| 30 | +├─────────────────────────────────────────────────────────────┤ |
| 31 | +│                        Disk Manager                         │ |
| 32 | +│                    (Persistent Storage)                     │ |
| 33 | +└─────────────────────────────────────────────────────────────┘ |
| 34 | +``` |
| 35 | + |
| 36 | +## Core Components |
| 37 | + |
| 38 | +### 1. Disk Manager (`src/diskmanager/`) |
| 39 | + |
| 40 | +**Purpose**: Handles all disk I/O operations with asynchronous request processing. |
| 41 | + |
| 42 | +**Key Features**: |
| 43 | +- **Asynchronous I/O**: Uses a blocking queue and thread pool for non-blocking disk operations |
| 44 | +- **File Management**: Automatic file creation and management in the `storage/` directory |
| 45 | +- **Page Allocation**: Pre-allocates pages in chunks (1024 pages = 8MB) for better performance |
| 46 | +- **Concurrent Access**: Thread-safe operations with per-file resize locks |
| 47 | + |
| 48 | +**Key Classes**: |
| 49 | +- `BasicDiskManager`: Main implementation with request queue processing |
| 50 | +- `DiskRequest`: Encapsulates read/write requests with completion futures |
| 51 | +- `RandomAccessDiskFile`: File abstraction for page-based I/O |
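The queue-and-future pattern described above can be illustrated with a minimal sketch. It is not the code of `BasicDiskManager` or `DiskRequest`: the class `AsyncDiskSketch`, its nested `Request` type, and the in-memory map standing in for a file are all hypothetical, chosen only to show how a caller can submit a request and wait on a completion future.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: a single worker thread serves page requests from a blocking queue. */
final class AsyncDiskSketch {
    static final int PAGE_SIZE = 2 * 4096; // 8KB, mirroring Globals.PAGE_SIZE

    /** One read or write request; the future completes once the worker has served it. */
    static final class Request {
        final boolean isWrite;
        final long pageId;
        final byte[] buffer;
        final CompletableFuture<Void> done = new CompletableFuture<>();

        Request(boolean isWrite, long pageId, byte[] buffer) {
            this.isWrite = isWrite;
            this.pageId = pageId;
            this.buffer = buffer;
        }
    }

    private final BlockingQueue<Request> queue = new LinkedBlockingQueue<>();
    // Stand-in for the file on disk (assumption). Only the worker thread touches it.
    private final Map<Long, byte[]> pages = new HashMap<>();

    AsyncDiskSketch() {
        Thread worker = new Thread(this::drain, "disk-worker");
        worker.setDaemon(true); // do not keep the JVM alive for the sketch
        worker.start();
    }

    /** Enqueues a request and returns immediately; the caller waits on the future. */
    CompletableFuture<Void> submit(boolean isWrite, long pageId, byte[] buffer) {
        if (buffer.length != PAGE_SIZE) {
            throw new IllegalArgumentException("buffer must be exactly one page");
        }
        Request request = new Request(isWrite, pageId, buffer);
        queue.add(request);
        return request.done;
    }

    private void drain() {
        try {
            while (true) {
                Request r = queue.take(); // blocks until a request arrives
                if (r.isWrite) {
                    pages.put(r.pageId, r.buffer.clone());
                } else {
                    byte[] stored = pages.get(r.pageId);
                    if (stored == null) {
                        Arrays.fill(r.buffer, (byte) 0); // unwritten pages read as zeros
                    } else {
                        System.arraycopy(stored, 0, r.buffer, 0, PAGE_SIZE);
                    }
                }
                r.done.complete(null);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // exit quietly when interrupted
        }
    }
}
```

A caller would write with `disk.submit(true, pageId, buffer).join()` and read with `disk.submit(false, pageId, buffer).join()`. Using a single worker keeps requests in submission order; a thread pool, as the feature list mentions, would need extra care to preserve ordering per page.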
| 52 | + |
| 53 | +### 2. Buffer Pool (`src/bufferpool/`) |
| 54 | + |
| 55 | +**Purpose**: Manages memory efficiently with caching and eviction policies. |
| 56 | + |
| 57 | +**Key Features**: |
| 58 | +- **LRU Eviction**: Least Recently Used replacement policy with configurable K parameter |
| 59 | +- **Pin/Unpin Mechanism**: Pages can be pinned to prevent eviction during operations |
| 60 | +- **Guard System**: Read/Write guards provide safe concurrent access to pages |
| 61 | +- **Dirty Page Management**: Automatic flushing of modified pages to disk |
| 62 | + |
| 63 | +**Key Classes**: |
| 64 | +- `BufferPool`: Main buffer pool implementation |
| 65 | +- `Frame`: Represents a page in memory with metadata |
| 66 | +- `LRU`: Implements the LRU replacement algorithm |
| 67 | +- `ReadGuard`/`WriteGuard`: Provide safe concurrent access to pages |
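To make the interaction between pinning and eviction concrete, here is a simplified sketch. It uses plain LRU rather than the K-parameterised policy the project describes, and the class `PinnedLruSketch` with its method names is hypothetical, not the API of `BufferPool` or `LRU`.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.OptionalLong;

/** Hypothetical sketch: plain LRU eviction that skips pinned pages. */
final class PinnedLruSketch {
    // Access-ordered map: iteration runs from least to most recently used.
    // The value is the page's current pin count.
    private final LinkedHashMap<Long, Integer> pinCounts =
            new LinkedHashMap<>(16, 0.75f, true);

    /** Marks the page as in use; a pinned page is never chosen by evict(). */
    void pin(long pageId) {
        pinCounts.merge(pageId, 1, Integer::sum);
    }

    /** Releases one pin; the page becomes evictable when its count reaches zero. */
    void unpin(long pageId) {
        Integer count = pinCounts.get(pageId);
        if (count == null || count == 0) {
            throw new IllegalStateException("page " + pageId + " is not pinned");
        }
        pinCounts.put(pageId, count - 1);
    }

    /** Removes and returns the least recently used unpinned page, if any. */
    OptionalLong evict() {
        Iterator<Map.Entry<Long, Integer>> it = pinCounts.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Integer> entry = it.next();
            if (entry.getValue() == 0) {
                long victim = entry.getKey();
                it.remove();
                return OptionalLong.of(victim);
            }
        }
        return OptionalLong.empty(); // every resident page is pinned
    }
}
```

If every page is pinned, `evict()` returns an empty result, which is the case a real buffer pool must handle by waiting or failing the request.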
| 68 | + |
| 69 | +### 3. B+ Tree (`src/btree/`) |
| 70 | + |
| 71 | +**Purpose**: Primary indexing structure providing efficient data storage and retrieval. |
| 72 | + |
| 73 | +**Key Features**: |
| 74 | +- **Optimistic Concurrency**: Uses optimistic locking for better performance |
| 75 | +- **Node Splitting**: Automatic node splitting when capacity is exceeded |
| 76 | +- **Range Queries**: Support for cursor-based range scanning |
| 77 | +- **Composite Keys**: Support for multi-column keys through `Compositekey` |
| 78 | + |
| 79 | +**Key Classes**: |
| 80 | +- `Btree`: Main B+ tree implementation with insert/search/delete operations |
| 81 | +- `BtreeHeader`: Manages tree metadata (root page ID, height) |
| 82 | +- `Cursor`: Provides iterator-style access for range queries |
| 83 | + |
| 84 | +### 4. Page Management (`src/page/`) |
| 85 | + |
| 86 | +**Purpose**: Manages the structure of internal and leaf nodes in the B+ tree. |
| 87 | + |
| 88 | +**Key Features**: |
| 89 | +- **Node Types**: Separate implementations for internal and leaf nodes |
| 90 | +- **Variable-Length Records**: Efficient storage of variable-sized data |
| 91 | +- **Split Operations**: Sophisticated node splitting with sibling redistribution |
| 92 | +- **Linked Leaf Nodes**: Leaf nodes are linked for efficient range queries |
| 93 | + |
| 94 | +**Key Classes**: |
| 95 | +- `LeafNode`: Stores actual key-value pairs |
| 96 | +- `InternalNode`: Stores routing information for tree navigation |
| 97 | +- `TreeNodeHeader`: Common header structure for all node types |
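The following sketch shows, under simplifying assumptions, how a leaf split and the sibling links work together for range scans. It keeps integer keys in in-memory lists instead of page bytes, and the names `LeafSplitSketch`, `Leaf`, `split`, and `scan` are hypothetical rather than taken from `LeafNode`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: splitting a leaf and scanning across linked leaves. */
final class LeafSplitSketch {
    static final class Leaf {
        final List<Integer> keys = new ArrayList<>(); // kept in ascending order
        Leaf next;                                    // right sibling, or null
    }

    /**
     * Moves the upper half of the keys into a new right sibling, links it in,
     * and returns the separator key that the parent would store.
     */
    static int split(Leaf left) {
        if (left.keys.size() < 2) {
            throw new IllegalArgumentException("need at least two keys to split");
        }
        int mid = left.keys.size() / 2;
        Leaf right = new Leaf();
        List<Integer> upperHalf = left.keys.subList(mid, left.keys.size());
        right.keys.addAll(upperHalf);
        upperHalf.clear();        // clearing the view removes the keys from left
        right.next = left.next;   // splice the new leaf into the sibling chain
        left.next = right;
        return right.keys.get(0);
    }

    /** Collects all keys in [lo, hi] by following sibling links from start. */
    static List<Integer> scan(Leaf start, int lo, int hi) {
        List<Integer> result = new ArrayList<>();
        for (Leaf leaf = start; leaf != null; leaf = leaf.next) {
            for (int key : leaf.keys) {
                if (key > hi) {
                    return result; // keys are sorted, so nothing further matches
                }
                if (key >= lo) {
                    result.add(key);
                }
            }
        }
        return result;
    }
}
```

Because the scan follows `next` pointers, a range query never has to climb back up to internal nodes once it has found its starting leaf.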
| 98 | + |
| 99 | +### 5. Type System (`src/types/`) |
| 100 | + |
| 101 | +**Purpose**: Provides efficient serialization and comparison for various data types. |
| 102 | + |
| 103 | +**Key Features**: |
| 104 | +- **Primitive Types**: Support for Integer, Long, Double, Float, Short, Byte |
| 105 | +- **Composite Keys**: Multi-column keys with proper comparison semantics |
| 106 | +- **Memory Codecs**: Efficient binary serialization/deserialization |
| 107 | +- **Type Safety**: Compile-time type checking with generic templates |
| 108 | + |
| 109 | +**Key Classes**: |
| 110 | +- `Compositekey`: Multi-column key implementation |
| 111 | +- `Template`: Type definition for keys and values |
| 112 | +- `Key`: Individual key component with type-specific codecs |
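As an illustration of multi-column comparison and binary serialization, the sketch below encodes a fixed two-column key with `java.nio.ByteBuffer`. It does not reproduce `Compositekey` or `Template`; the class `CompositeKeySketch` and its fixed `(int, long)` shape are assumptions made to keep the example short.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: a two-column (int, long) key with a fixed-width codec. */
final class CompositeKeySketch implements Comparable<CompositeKeySketch> {
    static final int ENCODED_SIZE = Integer.BYTES + Long.BYTES; // 12 bytes

    final int first;
    final long second;

    CompositeKeySketch(int first, long second) {
        this.first = first;
        this.second = second;
    }

    /** Serializes the columns in order, big-endian (ByteBuffer's default). */
    byte[] encode() {
        return ByteBuffer.allocate(ENCODED_SIZE)
                .putInt(first)
                .putLong(second)
                .array();
    }

    /** Reads the columns back in the same order they were written. */
    static CompositeKeySketch decode(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new CompositeKeySketch(buffer.getInt(), buffer.getLong());
    }

    /** Compares column by column; later columns only break ties. */
    @Override
    public int compareTo(CompositeKeySketch other) {
        int byFirst = Integer.compare(first, other.first);
        return byFirst != 0 ? byFirst : Long.compare(second, other.second);
    }
}
```

Comparing decoded values rather than raw bytes keeps negative numbers ordered correctly, since two's-complement bytes do not sort in numeric order.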
| 113 | + |
| 114 | +### 6. Index Manager (`src/indexmanager/`) |
| 115 | + |
| 116 | +**Purpose**: Provides higher-level index management and collection support. |
| 117 | + |
| 118 | +**Key Features**: |
| 119 | +- **Collection Management**: Organizes indexes by collection name |
| 120 | +- **Index Naming**: Uses "collectionName-fieldName" naming convention |
| 121 | +- **Lifecycle Management**: Handles index creation and cleanup |
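The naming convention can be sketched as a small registry keyed by the combined name. The class `IndexRegistrySketch` and its methods are hypothetical and only show the idea; the real index manager's API may differ.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: a registry keyed by "collectionName-fieldName". */
final class IndexRegistrySketch<T> {
    private final Map<String, T> indexes = new ConcurrentHashMap<>();

    /** Builds the index name from the collection and field names. */
    static String indexName(String collection, String field) {
        return collection + "-" + field;
    }

    /** Returns the existing index, or creates it once via the factory. */
    T getOrCreate(String collection, String field, Function<String, T> factory) {
        return indexes.computeIfAbsent(indexName(collection, field), factory);
    }

    /** Removes the index; returns true if it existed. */
    boolean drop(String collection, String field) {
        return indexes.remove(indexName(collection, field)) != null;
    }
}
```

One consequence of this convention is that names containing a hyphen can collide: collection `a-b` with field `c` and collection `a` with field `b-c` both map to `a-b-c`.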
| 122 | + |
| 123 | +## Configuration |
| 124 | + |
| 125 | +Key system parameters are defined in `src/globals/Globals.java`: |
| 126 | + |
| 127 | +```java |
| 128 | +public static final int PAGE_SIZE = 2 * 4096; // 8KB pages |
| 129 | +public static final int CLUSTER_PAGE_SIZE = 4 * 4096; // 16KB clusters |
| 130 | +public static final int PRE_ALLOCATED_PAGES_COUNT = 1024; // 8MB pre-allocation |
| 131 | +public static final long INVALID_PAGE_ID = -1; // Invalid page marker |
| 132 | +``` |
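The arithmetic behind these constants can be checked with a short sketch. The constants are copied from the listing above; the class `PageMathSketch`, the offset formula `pageId * PAGE_SIZE`, and the grow-in-whole-chunks policy are assumptions about how a page-based file is typically laid out, not confirmed details of this project.

```java
/** Hypothetical sketch: byte offsets and pre-allocation sizes from the constants above. */
final class PageMathSketch {
    static final int PAGE_SIZE = 2 * 4096;             // 8192 bytes
    static final int PRE_ALLOCATED_PAGES_COUNT = 1024; // pages per chunk

    /** Byte offset of a page, assuming pages are stored back to back from offset 0. */
    static long pageOffset(long pageId) {
        if (pageId < 0) {
            throw new IllegalArgumentException("invalid page id: " + pageId);
        }
        return pageId * PAGE_SIZE;
    }

    /** Size of one pre-allocated chunk: 1024 * 8192 = 8,388,608 bytes (8MB). */
    static long chunkBytes() {
        return (long) PRE_ALLOCATED_PAGES_COUNT * PAGE_SIZE;
    }

    /** File length needed to hold pageId when the file grows in whole chunks. */
    static long fileLengthFor(long pageId) {
        long chunks = pageId / PRE_ALLOCATED_PAGES_COUNT + 1;
        return chunks * chunkBytes();
    }
}
```

For example, page 3 would start at byte 24,576, and the first page that forces a second chunk would be page 1024.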
| 133 | + |
| 134 | +## Performance Characteristics |
| 135 | + |
| 136 | +- **Time Complexity**: O(log n) for search, insert, and delete operations |
| 137 | +- **Memory Usage**: Bounded by the configurable buffer pool size; LRU eviction reclaims frames when the pool is full |
| 138 | +- **Concurrency**: Optimistic locking with fine-grained page-level locks |
| 139 | +- **I/O Efficiency**: Page-based storage with asynchronous disk operations |
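To give the O(log n) claim a concrete scale, the sketch below estimates how many tree levels are needed for a given number of entries. The fanout is an assumed parameter (the real value depends on key and value sizes within an 8KB page), and `TreeHeightSketch` is a hypothetical helper, not part of the project.

```java
/** Hypothetical sketch: levels needed to index n entries at a given fanout. */
final class TreeHeightSketch {
    /**
     * Returns the smallest number of levels such that fanout^levels >= entries,
     * treating every node (leaf or internal) as holding up to fanout slots.
     */
    static int levelsNeeded(long entries, int fanout) {
        if (entries < 1 || fanout < 2) {
            throw new IllegalArgumentException("entries >= 1 and fanout >= 2 required");
        }
        int levels = 1;
        long capacity = fanout; // entries reachable with the current number of levels
        while (capacity < entries) {
            levels++;
            if (capacity > Long.MAX_VALUE / fanout) {
                break; // the next level exceeds the long range, so it covers any input
            }
            capacity *= fanout;
        }
        return levels;
    }
}
```

With an assumed fanout of 256, one million entries fit in three levels, so a lookup touches about three pages, some of which are likely to be cached in the buffer pool.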
| 140 | + |
| 141 | +## Usage Example |
| 142 | + |
| 143 | +```java |
| 144 | +// Initialize components |
| 145 | +DiskManager diskManager = new BasicDiskManager(); |
| 146 | +BufferPool bufferPool = new BufferPool(4000, 10, diskManager); |
| 147 | + |
| 148 | +// Create B+ tree |
| 149 | +Template keyType = new Template(Integer.class); |
| 150 | +Template valueType = new Template(String.class); |
| 151 | +Btree btree = new Btree(keyType, valueType, "myindex", |
| 152 | + Globals.INVALID_PAGE_ID, bufferPool); |
| 153 | + |
| 154 | +// Insert data |
| 155 | +Compositekey key = new Compositekey(keyType); |
| 156 | +key.set(0, 42, Integer.class); |
| 157 | +Compositekey value = new Compositekey(valueType); |
| 158 | +value.set(0, "Hello World", String.class); |
| 159 | +btree.insert(key, value); |
| 160 | + |
| 161 | +// Search data |
| 162 | +Compositekey result = btree.get(key); |
| 163 | +``` |
| 164 | + |
| 165 | +## Build and Run |
| 166 | + |
| 167 | +### Prerequisites |
| 168 | +- Java 11 or higher |
| 169 | +- Maven 3.6 or higher |
| 170 | + |
| 171 | +### Building |
| 172 | +```bash |
| 173 | +# Compile all source files |
| 174 | +javac -d bin $(find src -name "*.java") |
| 175 | + |
| 176 | +# Or use Maven |
| 177 | +mvn compile |
| 178 | +``` |
| 179 | + |
| 180 | +### Running Tests |
| 181 | +```bash |
| 182 | +# Run with Maven |
| 183 | +mvn test |
| 184 | + |
| 185 | +# Or run manually (JUnit 4 and its dependencies must also be on the classpath) |
| 186 | +java -cp bin org.junit.runner.JUnitCore test.btree.BtreeTest |
| 187 | +``` |
| 188 | + |
| 189 | +### Running the Application |
| 190 | +```bash |
| 191 | +java -cp bin Main |
| 192 | +``` |
| 193 | + |
| 194 | +## Storage Layout |
| 195 | + |
| 196 | +The system stores data in the `storage/` directory with the following structure: |
| 197 | +- Each index is stored as a separate file |
| 198 | +- Files use page-based layout (8KB pages) |
| 199 | +- Header pages contain metadata (root page ID, tree height) |
| 200 | +- Data pages contain either internal nodes or leaf nodes |
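A header page of this kind could be encoded as in the sketch below. The byte layout (root page ID in bytes 0-7, tree height in bytes 8-11) is an assumption for illustration; the actual layout used by `BtreeHeader` is not documented here, and `HeaderPageSketch` is a hypothetical name.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: storing root page ID and tree height in a header page. */
final class HeaderPageSketch {
    static final int PAGE_SIZE = 2 * 4096;
    static final long INVALID_PAGE_ID = -1;

    // Assumed layout: bytes 0-7 hold the root page ID, bytes 8-11 the tree height.
    private static final int ROOT_OFFSET = 0;
    private static final int HEIGHT_OFFSET = Long.BYTES;

    /** Builds a full page with the metadata at the start and zeros elsewhere. */
    static byte[] write(long rootPageId, int height) {
        ByteBuffer page = ByteBuffer.allocate(PAGE_SIZE);
        page.putLong(ROOT_OFFSET, rootPageId);
        page.putInt(HEIGHT_OFFSET, height);
        return page.array();
    }

    static long rootPageId(byte[] page) {
        return ByteBuffer.wrap(page).getLong(ROOT_OFFSET);
    }

    static int height(byte[] page) {
        return ByteBuffer.wrap(page).getInt(HEIGHT_OFFSET);
    }
}
```

An empty tree could be represented by writing `INVALID_PAGE_ID` as the root with a height of zero.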
| 201 | + |
| 202 | +## Thread Safety |
| 203 | + |
| 204 | +The system is designed for high concurrency: |
| 205 | +- **Buffer Pool**: Uses fine-grained locking with per-frame locks |
| 206 | +- **Disk Manager**: Asynchronous request processing with thread-safe queues |
| 207 | +- **B+ Tree**: Optimistic concurrency control with context-based locking |
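Per-frame locking with guard objects can be sketched with `ReentrantReadWriteLock` and try-with-resources. This illustrates the general guard pattern only; `GuardedFrameSketch` and its nested guard classes are hypothetical and are not the project's `Frame`, `ReadGuard`, or `WriteGuard`.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch: a frame whose bytes are only reachable through guards. */
final class GuardedFrameSketch {
    private final ReentrantReadWriteLock latch = new ReentrantReadWriteLock();
    private final byte[] data = new byte[2 * 4096];

    /** Holds the shared lock for its lifetime; many readers may coexist. */
    final class ReadGuard implements AutoCloseable {
        private ReadGuard() {
            latch.readLock().lock();
        }

        byte get(int offset) {
            return data[offset];
        }

        @Override
        public void close() {
            latch.readLock().unlock();
        }
    }

    /** Holds the exclusive lock for its lifetime; excludes readers and writers. */
    final class WriteGuard implements AutoCloseable {
        private WriteGuard() {
            latch.writeLock().lock();
        }

        void put(int offset, byte value) {
            data[offset] = value;
        }

        @Override
        public void close() {
            latch.writeLock().unlock();
        }
    }

    ReadGuard read() {
        return new ReadGuard();
    }

    WriteGuard write() {
        return new WriteGuard();
    }

    boolean isWriteLocked() {
        return latch.isWriteLocked();
    }
}
```

Used as `try (GuardedFrameSketch.WriteGuard guard = frame.write()) { guard.put(0, (byte) 5); }`, the lock is released when the block exits, even if an exception is thrown inside it.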
| 208 | + |
| 209 | +## Testing |
| 210 | + |
| 211 | +Comprehensive test suites are provided: |
| 212 | +- **Unit Tests**: Individual component testing |
| 213 | +- **Integration Tests**: Cross-component functionality |
| 214 | +- **Performance Tests**: Benchmarking with up to 1M operations |
| 215 | +- **Concurrency Tests**: Multi-threaded stress testing |
| 216 | + |
| 217 | +## Future Enhancements |
| 218 | + |
| 219 | +Potential areas for improvement: |
| 220 | +- **Compression**: Page-level compression for better storage efficiency |
| 221 | +- **Logging**: Write-ahead logging for crash recovery |
| 222 | +- **Clustering**: Support for clustered indexes |
| 223 | +- **String Types**: Enhanced support for variable-length strings |
| 224 | +- **Transactions**: ACID transaction support |
| 225 | + |
| 226 | +## License |
| 227 | + |
| 228 | +This project is licensed under the terms specified in the LICENSE file. |