| 1 | +# BPlusTreeMap Modularization Plan |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +The current `lib.rs` is 3,138 lines and mixes several concerns in a single file. This plan splits it into focused modules, each grouping functionality that tends to change together and each short enough to be read end-to-end. |
| 6 | + |
| 7 | +## Current Structure Analysis |
| 8 | + |
| 9 | +### Major Components Identified: |
| 10 | + |
| 11 | +1. **Error handling and type definitions** (~200 lines) |
| 12 | +2. **Core BPlusTreeMap struct and basic operations** (~800 lines) |
| 13 | +3. **LeafNode implementation** (~300 lines) |
| 14 | +4. **BranchNode implementation** (~300 lines) |
| 15 | +5. **Iterator implementations** (~400 lines) |
| 16 | +6. **Arena management helpers** (~200 lines) |
| 17 | +7. **Range query optimization** (~200 lines) |
| 18 | +8. **Tree validation and debugging** (~300 lines) |
| 19 | +9. **Tests** (~400 lines) |
| 20 | + |
| 21 | +## Proposed Module Structure |
| 22 | + |
| 23 | +### 1. `src/error.rs` - Error Handling & Types |
| 24 | + |
| 25 | +**Purpose**: All error types, result types, and error handling utilities |
| 26 | +**Size**: ~150 lines |
| 27 | +**Rationale**: Error handling changes together and is referenced throughout |
| 28 | + |
| 29 | +```rust |
| 30 | +// Contents: |
| 31 | +// - BPlusTreeError enum and implementations |
| 32 | +// - Result type aliases (BTreeResult, KeyResult, etc.) |
| 33 | +// - BTreeResultExt trait |
| 34 | +// - Error construction helpers |
| 35 | +``` |
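As a rough illustration of what `error.rs` could contain, here is a minimal sketch. The type and trait names (`BPlusTreeError`, `BTreeResult`, `KeyResult`, `BTreeResultExt`) come from the list above; the variants, the `with_context` method, and the message formats are assumptions made for the example and may not match the existing code.

```rust
use std::fmt;

/// Error type for the tree. The variants here are hypothetical placeholders.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BPlusTreeError {
    KeyNotFound,
    InvalidCapacity(String),
    DataIntegrityError(String),
}

impl fmt::Display for BPlusTreeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::KeyNotFound => write!(f, "key not found"),
            Self::InvalidCapacity(msg) => write!(f, "invalid capacity: {msg}"),
            Self::DataIntegrityError(msg) => write!(f, "data integrity error: {msg}"),
        }
    }
}

impl std::error::Error for BPlusTreeError {}

/// Result aliases so call sites do not repeat the error type.
pub type BTreeResult<T> = Result<T, BPlusTreeError>;
pub type KeyResult<T> = BTreeResult<T>;

/// Extension trait on results. `with_context` is an assumed method name.
pub trait BTreeResultExt<T> {
    /// Prefixes integrity errors with `context`; other errors pass through.
    fn with_context(self, context: &str) -> BTreeResult<T>;
}

impl<T> BTreeResultExt<T> for BTreeResult<T> {
    fn with_context(self, context: &str) -> BTreeResult<T> {
        self.map_err(|err| match err {
            BPlusTreeError::DataIntegrityError(msg) => {
                BPlusTreeError::DataIntegrityError(format!("{context}: {msg}"))
            }
            other => other,
        })
    }
}
```

Keeping the aliases and the extension trait next to the enum means a new variant touches exactly one file, which is the change-locality argument made in the rationale.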
| 36 | + |
| 37 | +### 2. `src/types.rs` - Core Types & Constants |
| 38 | + |
| 39 | +**Purpose**: Fundamental types, constants, and small utility types |
| 40 | +**Size**: ~100 lines |
| 41 | +**Rationale**: Core types are stable and referenced everywhere |
| 42 | + |
| 43 | +```rust |
| 44 | +// Contents: |
| 45 | +// - NodeId type and constants (NULL_NODE, ROOT_NODE) |
| 46 | +// - NodeRef enum |
| 47 | +// - SplitNodeData enum |
| 48 | +// - InsertResult and RemoveResult enums |
| 49 | +// - MIN_CAPACITY and other constants |
| 50 | +``` |
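A simplified sketch of the kind of definitions `types.rs` would hold is shown below. The names `NodeId`, `NULL_NODE`, `ROOT_NODE`, `NodeRef`, and `MIN_CAPACITY` are taken from the list above; the underlying integer type, the constant values, and the helper methods are assumptions, and the real `NodeRef` may carry additional type parameters.

```rust
/// Index of a node inside an arena (assumed to be a plain `u32`).
pub type NodeId = u32;

/// Sentinel meaning "no node" (assumed value).
pub const NULL_NODE: NodeId = u32::MAX;

/// Id reserved for the root node (assumed value).
pub const ROOT_NODE: NodeId = 0;

/// Smallest node capacity the tree accepts (assumed value).
pub const MIN_CAPACITY: usize = 4;

/// Reference to a child node, tagged with the kind of node it points at.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeRef {
    Leaf(NodeId),
    Branch(NodeId),
}

impl NodeRef {
    /// Returns the arena id regardless of node kind.
    pub fn id(self) -> NodeId {
        match self {
            NodeRef::Leaf(id) | NodeRef::Branch(id) => id,
        }
    }

    /// Returns `true` when the reference points at a leaf.
    pub fn is_leaf(self) -> bool {
        matches!(self, NodeRef::Leaf(_))
    }
}
```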
| 51 | + |
| 52 | +### 3. `src/node/mod.rs` - Node Module Root |
| 53 | + |
| 54 | +**Purpose**: Module organization for node-related functionality |
| 55 | +**Size**: ~50 lines |
| 56 | + |
| 57 | +```rust |
| 58 | +// Contents: |
| 59 | +pub mod leaf; |
| 60 | +pub mod branch; |
| 61 | +pub mod operations; |
| 62 | + |
| 63 | +pub use leaf::LeafNode; |
| 64 | +pub use branch::BranchNode; |
| 65 | +``` |
| 66 | + |
| 67 | +### 4. `src/node/leaf.rs` - Leaf Node Implementation |
| 68 | + |
| 69 | +**Purpose**: Complete LeafNode struct and all its operations |
| 70 | +**Size**: ~400 lines |
| 71 | +**Rationale**: Leaf operations change together (insert, delete, split, merge) |
| 72 | + |
| 73 | +```rust |
| 74 | +// Contents: |
| 75 | +// - LeafNode struct definition |
| 76 | +// - Construction methods |
| 77 | +// - Get/insert/delete operations |
| 78 | +// - Split and merge operations |
| 79 | +// - Borrowing operations |
| 80 | +// - Utility methods (is_full, is_underfull, etc.) |
| 81 | +``` |
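To make the scope of `node/leaf.rs` concrete, here is a deliberately small stand-in for a leaf node that covers construction, get, insert, split, and `is_full`. The field layout (two parallel sorted vectors) and the method signatures are assumptions for illustration; the existing `LeafNode` likely also tracks sibling links and returns richer result types.

```rust
/// Simplified leaf: keys kept sorted, values stored at matching indices.
#[derive(Debug)]
pub struct LeafNode<K, V> {
    capacity: usize,
    keys: Vec<K>,
    values: Vec<V>,
}

impl<K: Ord, V> LeafNode<K, V> {
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity,
            keys: Vec::with_capacity(capacity),
            values: Vec::with_capacity(capacity),
        }
    }

    pub fn len(&self) -> usize {
        self.keys.len()
    }

    pub fn is_full(&self) -> bool {
        self.keys.len() >= self.capacity
    }

    /// Binary search within the leaf.
    pub fn get(&self, key: &K) -> Option<&V> {
        self.keys.binary_search(key).ok().map(|i| &self.values[i])
    }

    /// Inserts at the sorted position and returns the previous value if the
    /// key was already present. The caller is expected to split a full leaf.
    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        match self.keys.binary_search(&key) {
            Ok(i) => Some(std::mem::replace(&mut self.values[i], value)),
            Err(i) => {
                self.keys.insert(i, key);
                self.values.insert(i, value);
                None
            }
        }
    }

    /// Moves the upper half of the entries into a new right sibling.
    pub fn split(&mut self) -> Self {
        let mid = self.keys.len() / 2;
        Self {
            capacity: self.capacity,
            keys: self.keys.split_off(mid),
            values: self.values.split_off(mid),
        }
    }
}
```

Because nothing here refers to the tree or the arena, a file shaped like this can be unit tested on its own, which is the point of keeping all leaf operations in one module.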
| 82 | + |
| 83 | +### 5. `src/node/branch.rs` - Branch Node Implementation |
| 84 | + |
| 85 | +**Purpose**: Complete BranchNode struct and all its operations |
| 86 | +**Size**: ~400 lines |
| 87 | +**Rationale**: Branch operations change together and mirror leaf operations |
| 88 | + |
| 89 | +```rust |
| 90 | +// Contents: |
| 91 | +// - BranchNode struct definition |
| 92 | +// - Construction methods |
| 93 | +// - Child navigation operations |
| 94 | +// - Insert/delete operations with child management |
| 95 | +// - Split and merge operations |
| 96 | +// - Rebalancing operations |
| 97 | +``` |
| 98 | + |
| 99 | +### 6. `src/node/operations.rs` - Cross-Node Operations |
| 100 | + |
| 101 | +**Purpose**: Operations that work across both leaf and branch nodes |
| 102 | +**Size**: ~200 lines |
| 103 | +**Rationale**: Shared node operations and utilities |
| 104 | + |
| 105 | +```rust |
| 106 | +// Contents: |
| 107 | +// - Node validation helpers |
| 108 | +// - Cross-node borrowing operations |
| 109 | +// - Node type conversion utilities |
| 110 | +// - Common node operation patterns |
| 111 | +``` |
| 112 | + |
| 113 | +### 7. `src/tree/mod.rs` - Tree Module Root |
| 114 | + |
| 115 | +**Purpose**: Module organization for tree-level functionality |
| 116 | +**Size**: ~50 lines |
| 117 | + |
| 118 | +```rust |
| 119 | +// Contents: |
| 120 | +pub mod core; |
| 121 | +pub mod operations; |
| 122 | +pub mod arena_helpers; |
| 123 | + |
| 124 | +pub use self::core::BPlusTreeMap; // `self::` avoids ambiguity with the built-in `core` crate |
| 125 | +``` |
| 126 | + |
| 127 | +### 8. `src/tree/core.rs` - Core Tree Structure |
| 128 | + |
| 129 | +**Purpose**: BPlusTreeMap struct definition and basic operations |
| 130 | +**Size**: ~300 lines |
| 131 | +**Rationale**: Core tree structure and fundamental operations |
| 132 | + |
| 133 | +```rust |
| 134 | +// Contents: |
| 135 | +// - BPlusTreeMap struct definition |
| 136 | +// - Constructor (new) |
| 137 | +// - Basic get/insert/remove public API |
| 138 | +// - Tree structure management (root handling) |
| 139 | +// - Arena allocation wrappers |
| 140 | +``` |
| 141 | + |
| 142 | +### 9. `src/tree/operations.rs` - Tree Operations Implementation |
| 143 | + |
| 144 | +**Purpose**: Complex tree operations and algorithms |
| 145 | +**Size**: ~600 lines |
| 146 | +**Rationale**: Tree algorithms change together and are complex |
| 147 | + |
| 148 | +```rust |
| 149 | +// Contents: |
| 150 | +// - Recursive insert/delete/get implementations |
| 151 | +// - Tree rebalancing logic |
| 152 | +// - Root collapse/expansion |
| 153 | +// - Tree traversal algorithms |
| 154 | +// - Batch operations |
| 155 | +``` |
| 156 | + |
| 157 | +### 10. `src/tree/arena_helpers.rs` - Arena Management |
| 158 | + |
| 159 | +**Purpose**: Arena allocation and management helpers |
| 160 | +**Size**: ~200 lines |
| 161 | +**Rationale**: Arena operations change together and are performance-critical |
| 162 | + |
| 163 | +```rust |
| 164 | +// Contents: |
| 165 | +// - Arena allocation helpers |
| 166 | +// - Node ID management |
| 167 | +// - Arena statistics |
| 168 | +// - Memory management utilities |
| 169 | +``` |
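The helpers listed above could be built around something like the following arena, shown here as a hedged sketch rather than the existing implementation. `Arena`, its fields, and its method names are hypothetical; the idea illustrated is slot reuse through a free list plus a simple statistic.

```rust
/// Assumed id type; matches the `NodeId` idea from `types.rs`.
pub type NodeId = u32;

/// Slot-based arena: freed slots are recycled before the vector grows.
pub struct Arena<T> {
    slots: Vec<Option<T>>,
    free: Vec<NodeId>,
}

impl<T> Arena<T> {
    pub fn new() -> Self {
        Self { slots: Vec::new(), free: Vec::new() }
    }

    /// Stores `value` and returns its id, reusing a freed slot when possible.
    pub fn allocate(&mut self, value: T) -> NodeId {
        if let Some(id) = self.free.pop() {
            self.slots[id as usize] = Some(value);
            id
        } else {
            self.slots.push(Some(value));
            (self.slots.len() - 1) as NodeId
        }
    }

    /// Removes and returns the value at `id`; `None` if the slot is empty
    /// or out of range, so a double free is harmless.
    pub fn deallocate(&mut self, id: NodeId) -> Option<T> {
        let value = self.slots.get_mut(id as usize)?.take()?;
        self.free.push(id);
        Some(value)
    }

    pub fn get(&self, id: NodeId) -> Option<&T> {
        self.slots.get(id as usize)?.as_ref()
    }

    /// Arena statistic: number of live nodes.
    pub fn allocated_count(&self) -> usize {
        self.slots.len() - self.free.len()
    }
}
```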
| 170 | + |
| 171 | +### 11. `src/iterator/mod.rs` - Iterator Module Root |
| 172 | + |
| 173 | +**Purpose**: Module organization for all iterator types |
| 174 | +**Size**: ~50 lines |
| 175 | + |
| 176 | +```rust |
| 177 | +// Contents: |
| 178 | +pub mod item; |
| 179 | +pub mod range; |
| 180 | +pub mod key_value; |
| 181 | + |
| 182 | +pub use item::ItemIterator; |
| 183 | +pub use range::RangeIterator; |
| 184 | +// etc. |
| 185 | +``` |
| 186 | + |
| 187 | +### 12. `src/iterator/item.rs` - Item Iterator |
| 188 | + |
| 189 | +**Purpose**: ItemIterator and FastItemIterator implementations |
| 190 | +**Size**: ~300 lines |
| 191 | +**Rationale**: Item iteration logic changes together |
| 192 | + |
| 193 | +```rust |
| 194 | +// Contents: |
| 195 | +// - ItemIterator struct and implementation |
| 196 | +// - FastItemIterator struct and implementation |
| 197 | +// - Leaf traversal logic |
| 198 | +// - Iterator state management |
| 199 | +``` |
| 200 | + |
| 201 | +### 13. `src/iterator/range.rs` - Range Iterator |
| 202 | + |
| 203 | +**Purpose**: Range query iterator and optimization |
| 204 | +**Size**: ~300 lines |
| 205 | +**Rationale**: Range operations are complex and change together |
| 206 | + |
| 207 | +```rust |
| 208 | +// Contents: |
| 209 | +// - RangeIterator struct and implementation |
| 210 | +// - Range bounds resolution |
| 211 | +// - Range start position finding |
| 212 | +// - Range optimization helpers |
| 213 | +``` |
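"Range bounds resolution" and "range start position finding" can be illustrated on a single sorted key slice. The function below is a sketch under that simplification: `resolve_range` is a hypothetical name, and the real iterator would first descend the tree to the leaf containing the start key before doing anything like this.

```rust
use std::ops::{Bound, RangeBounds};

/// Maps any range expression onto the half-open index window `[start, end)`
/// of a sorted key slice. An inverted range yields an empty window.
pub fn resolve_range<K: Ord, R: RangeBounds<K>>(keys: &[K], range: &R) -> (usize, usize) {
    // First index whose key satisfies the lower bound.
    let start = match range.start_bound() {
        Bound::Unbounded => 0,
        Bound::Included(k) => keys.partition_point(|x| x < k),
        Bound::Excluded(k) => keys.partition_point(|x| x <= k),
    };
    // One past the last index whose key satisfies the upper bound.
    let end = match range.end_bound() {
        Bound::Unbounded => keys.len(),
        Bound::Included(k) => keys.partition_point(|x| x <= k),
        Bound::Excluded(k) => keys.partition_point(|x| x < k),
    };
    (start, end.max(start))
}
```

Resolving both bounds up front lets the iterator stop on an index comparison instead of comparing keys on every step, which is one plausible form of the optimization mentioned above.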
| 214 | + |
| 215 | +### 14. `src/iterator/key_value.rs` - Key/Value Iterators |
| 216 | + |
| 217 | +**Purpose**: KeyIterator and ValueIterator implementations |
| 218 | +**Size**: ~100 lines |
| 219 | +**Rationale**: Simple wrapper iterators that change together |
| 220 | + |
| 221 | +```rust |
| 222 | +// Contents: |
| 223 | +// - KeyIterator implementation |
| 224 | +// - ValueIterator implementation |
| 225 | +// - Iterator adapter utilities |
| 226 | +``` |
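Since these iterators are described as simple wrappers, a sketch of the pattern is short. The structs below wrap any iterator over `(key, value)` pairs; in the planned layout the inner iterator would be `ItemIterator`, but the generic parameter and the `new` constructors are assumptions made so the example stands alone.

```rust
/// Yields only the key of each `(key, value)` pair from the inner iterator.
pub struct KeyIterator<I> {
    inner: I,
}

impl<I> KeyIterator<I> {
    pub fn new(inner: I) -> Self {
        Self { inner }
    }
}

impl<I, K, V> Iterator for KeyIterator<I>
where
    I: Iterator<Item = (K, V)>,
{
    type Item = K;

    fn next(&mut self) -> Option<K> {
        self.inner.next().map(|(key, _)| key)
    }
}

/// Yields only the value of each `(key, value)` pair from the inner iterator.
pub struct ValueIterator<I> {
    inner: I,
}

impl<I> ValueIterator<I> {
    pub fn new(inner: I) -> Self {
        Self { inner }
    }
}

impl<I, K, V> Iterator for ValueIterator<I>
where
    I: Iterator<Item = (K, V)>,
{
    type Item = V;

    fn next(&mut self) -> Option<V> {
        self.inner.next().map(|(_, value)| value)
    }
}
```

Both adapters hold no state of their own, so any change to item iteration is picked up automatically; that is why the dependency tree lists `key_value.rs` as depending only on `iterator/item`.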
| 227 | + |
| 228 | +### 15. `src/validation.rs` - Tree Validation & Debugging |
| 229 | + |
| 230 | +**Purpose**: Tree invariant checking and debugging utilities |
| 231 | +**Size**: ~400 lines |
| 232 | +**Rationale**: Validation logic changes together and is used for testing |
| 233 | + |
| 234 | +```rust |
| 235 | +// Contents: |
| 236 | +// - Tree invariant checking |
| 237 | +// - Detailed validation methods |
| 238 | +// - Debug utilities |
| 239 | +// - Test helpers |
| 240 | +// - Integrity verification |
| 241 | +``` |
| 242 | + |
| 243 | +### 16. `src/lib.rs` - Public API & Re-exports |
| 244 | + |
| 245 | +**Purpose**: Public API surface and module organization |
| 246 | +**Size**: ~200 lines |
| 247 | +**Rationale**: Clean public interface with comprehensive documentation |
| 248 | + |
| 249 | +```rust |
| 250 | +// Contents: |
| 251 | +// - Module declarations |
| 252 | +// - Public re-exports |
| 253 | +// - Top-level documentation |
| 254 | +// - Usage examples |
| 255 | +// - Public API traits and implementations |
| 256 | +``` |
| 257 | + |
| 258 | +## Module Dependencies |
| 259 | + |
| 260 | +``` |
| 261 | +lib.rs |
| 262 | +├── error.rs (no dependencies) |
| 263 | +├── types.rs (depends on: error) |
| 264 | +├── node/ |
| 265 | +│ ├── mod.rs |
| 266 | +│ ├── leaf.rs (depends on: error, types) |
| 267 | +│ ├── branch.rs (depends on: error, types, node/leaf) |
| 268 | +│ └── operations.rs (depends on: error, types, node/leaf, node/branch) |
| 269 | +├── tree/ |
| 270 | +│ ├── mod.rs |
| 271 | +│ ├── core.rs (depends on: error, types, node/*) |
| 272 | +│ ├── operations.rs (depends on: error, types, node/*, tree/core) |
| 273 | +│ └── arena_helpers.rs (depends on: error, types, node/*) |
| 274 | +├── iterator/ |
| 275 | +│ ├── mod.rs |
| 276 | +│ ├── item.rs (depends on: error, types, tree/core, node/leaf) |
| 277 | +│ ├── range.rs (depends on: error, types, tree/core, iterator/item) |
| 278 | +│ └── key_value.rs (depends on: iterator/item) |
| 279 | +└── validation.rs (depends on: all modules) |
| 280 | +``` |
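The dependency direction in the tree above can be tried out in a single file by using inline modules as stand-ins for the planned files. This is only a sketch: the module names follow the plan, while every struct body, the `InvalidCapacity` variant, the `MIN_CAPACITY` value, and the `new` signature are placeholders assumed for the example.

```rust
pub mod error {
    // error.rs: no dependencies on other modules.
    #[derive(Debug, PartialEq, Eq)]
    pub enum BPlusTreeError {
        InvalidCapacity(usize), // hypothetical variant
    }
}

pub mod types {
    // types.rs: may depend on `error` only.
    pub type NodeId = u32; // assumed representation
    pub const MIN_CAPACITY: usize = 4; // assumed value
}

pub mod node {
    pub mod leaf {
        // node/leaf.rs: depends on `error` and `types`.
        use crate::types::NodeId;

        pub struct LeafNode {
            pub keys: Vec<i32>,
            pub next: Option<NodeId>,
        }
    }
    pub use leaf::LeafNode;
}

pub mod tree {
    pub mod core {
        // tree/core.rs: depends on `error`, `types`, and `node`.
        use crate::error::BPlusTreeError;
        use crate::node::LeafNode;
        use crate::types::MIN_CAPACITY;

        pub struct BPlusTreeMap {
            capacity: usize,
            root: LeafNode,
        }

        impl BPlusTreeMap {
            pub fn new(capacity: usize) -> Result<Self, BPlusTreeError> {
                if capacity < MIN_CAPACITY {
                    return Err(BPlusTreeError::InvalidCapacity(capacity));
                }
                let root = LeafNode { keys: Vec::new(), next: None };
                Ok(Self { capacity, root })
            }

            pub fn capacity(&self) -> usize {
                self.capacity
            }

            pub fn is_empty(&self) -> bool {
                self.root.keys.is_empty()
            }
        }
    }
    // `self::` keeps this path from being confused with the built-in `core` crate.
    pub use self::core::BPlusTreeMap;
}

// lib.rs: re-exports keep the public paths stable for existing callers.
pub use error::BPlusTreeError;
pub use tree::BPlusTreeMap;
```

Every `use` points at a module that appears earlier in the dependency tree, so a violation of the layering would show up as an import that points the wrong way.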
| 281 | + |
| 282 | +## Benefits of This Structure |
| 283 | + |
| 284 | +### 1. **Cohesion**: Related functionality grouped together |
| 285 | + |
| 286 | +- Node operations stay with node implementations |
| 287 | +- Iterator types are grouped but separated by complexity |
| 288 | +- Tree-level operations are separate from node-level operations |
| 289 | + |
| 290 | +### 2. **Human Readability**: Each module can be read end-to-end |
| 291 | + |
| 292 | +- `node/leaf.rs`: Complete leaf node story (~400 lines) |
| 293 | +- `node/branch.rs`: Complete branch node story (~400 lines) |
| 294 | +- `tree/core.rs`: Core tree structure (~300 lines) |
| 295 | +- `tree/operations.rs`: Tree algorithms (~600 lines) |
| 296 | + |
| 297 | +### 3. **Change Locality**: Things that change together are together |
| 298 | + |
| 299 | +- All leaf operations in one place |
| 300 | +- All iterator implementations grouped |
| 301 | +- All error handling centralized |
| 302 | +- All validation logic together |
| 303 | + |
| 304 | +### 4. **Clear Dependencies**: Well-defined module boundaries |
| 305 | + |
| 306 | +- `error.rs` has no dependencies; `types.rs` depends only on `error.rs` |
| 307 | +- Nodes depend only on types and errors |
| 308 | +- Tree depends on nodes |
| 309 | +- Iterators depend on the tree and on leaf nodes |
| 310 | +- Validation depends on everything (for testing) |
| 311 | + |
| 312 | +### 5. **Testability**: Each module can be tested independently |
| 313 | + |
| 314 | +- Node operations can be unit tested |
| 315 | +- Tree operations can be integration tested |
| 316 | +- Iterators can be tested with mock trees |
| 317 | +- Validation provides comprehensive testing utilities |
| 318 | + |
| 319 | +## Migration Strategy |
| 320 | + |
| 321 | +### Phase 1: Extract Stable Components |
| 322 | + |
| 323 | +1. Create `error.rs` and `types.rs` |
| 324 | +2. Update imports throughout codebase |
| 325 | +3. Verify compilation |
| 326 | + |
| 327 | +### Phase 2: Extract Node Implementations |
| 328 | + |
| 329 | +1. Create `node/` module structure |
| 330 | +2. Move `LeafNode` to `node/leaf.rs` |
| 331 | +3. Move `BranchNode` to `node/branch.rs` |
| 332 | +4. Create `node/operations.rs` for shared functionality |
| 333 | + |
| 334 | +### Phase 3: Extract Tree Implementation |
| 335 | + |
| 336 | +1. Create `tree/` module structure |
| 337 | +2. Move core `BPlusTreeMap` to `tree/core.rs` |
| 338 | +3. Move complex algorithms to `tree/operations.rs` |
| 339 | +4. Move arena helpers to `tree/arena_helpers.rs` |
| 340 | + |
| 341 | +### Phase 4: Extract Iterators |
| 342 | + |
| 343 | +1. Create `iterator/` module structure |
| 344 | +2. Move each iterator type to its own file |
| 345 | +3. Organize by complexity and relationships |
| 346 | + |
| 347 | +### Phase 5: Extract Validation |
| 348 | + |
| 349 | +1. Move all validation logic to `validation.rs` |
| 350 | +2. Create comprehensive test utilities |
| 351 | +3. Update test imports |
| 352 | + |
| 353 | +### Phase 6: Clean Up Public API |
| 354 | + |
| 355 | +1. Organize `lib.rs` as clean public interface |
| 356 | +2. Add comprehensive module documentation |
| 357 | +3. Verify all public APIs are properly exposed |
| 358 | + |
| 359 | +## File Size Targets |
| 360 | + |
| 361 | +| Module | Target Lines | Current Estimate | Rationale | |
| 362 | +| ----------------------- | ------------ | ---------------- | ------------------------------ | |
| 363 | +| `error.rs` | 150 | 200 | Error handling | |
| 364 | +| `types.rs` | 100 | 100 | Core types | |
| 365 | +| `node/leaf.rs` | 400 | 300 | Complete leaf implementation | |
| 366 | +| `node/branch.rs` | 400 | 300 | Complete branch implementation | |
| 367 | +| `node/operations.rs` | 200 | 150 | Shared node operations | |
| 368 | +| `tree/core.rs` | 300 | 200 | Core tree structure | |
| 369 | +| `tree/operations.rs` | 600 | 800 | Tree algorithms | |
| 370 | +| `tree/arena_helpers.rs` | 200 | 200 | Arena management | |
| 371 | +| `iterator/item.rs` | 300 | 250 | Item iteration | |
| 372 | +| `iterator/range.rs` | 300 | 200 | Range iteration | |
| 373 | +| `iterator/key_value.rs` | 100 | 50 | Simple iterators | |
| 374 | +| `validation.rs` | 400 | 300 | Validation and testing | |
| 375 | +| `lib.rs` | 200 | 150 | Public API | |
| 376 | + |
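| 377 | +**Total**: ~3,650 lines, excluding the three ~50-line `mod.rs` files (vs current 3,138 lines) |
| 378 | + |
| 379 | +The increase of roughly 500 lines (about 16%) accounts for: |
| 380 | + |
| 381 | +- Module documentation |
| 382 | +- Explicit separation boundaries (imports, re-exports, visibility declarations) |
| 383 | +- Shared helpers introduced while eliminating duplicated code |
| 384 | +- General overhead of organizing code across more files |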
| 385 | + |
| 386 | +## Success Criteria |
| 387 | + |
| 388 | +1. **No single module > 600 lines** |
| 389 | +2. **Each module readable end-to-end in 10-15 minutes** |
| 390 | +3. **Clear module responsibilities** |
| 391 | +4. **Minimal cross-module dependencies** |
| 392 | +5. **All tests pass after migration** |
| 393 | +6. **Public API unchanged** |
| 394 | +7. **Documentation improved** |
| 395 | + |
| 396 | +This modularization should make the codebase easier to navigate and maintain while preserving all existing functionality and the public API. |