46 | 46 | import java.util.concurrent.locks.ReentrantLock; |
47 | 47 |
48 | 48 | /** |
49 | | - * Eviction Manager for the Out-Of-Core stream cache |
50 | | - * This is the base implementation for LRU, FIFO |
| 49 | + * Eviction Manager for the Out-Of-Core (OOC) stream cache. |
| 50 | + * <p> |
| 51 | + * This manager implements a high-performance, thread-safe buffer pool designed |
| 52 | + * to handle intermediate results that exceed available heap memory. It builds on |
| 53 | + * a <b>partitioned eviction</b> strategy to maximize disk throughput and a |
| 54 | + * <b>lock-striped</b> concurrency model to minimize thread contention. |
51 | 55 | * |
52 | | - * Purpose |
53 | | - * ------- |
54 | | - * Provides a bounded cache for matrix blocks produced and consumed by OOC |
55 | | - * streaming operators. When memory pressure exceeds a configured limit, |
56 | | - * blocks are evicted from memory and spilled to disk, and transparently |
57 | | - * restored on demand. |
58 | | - * </b> |
59 | | - * |
60 | | - * Design scope |
61 | | - * ------------ |
62 | | - * - Manages block lifecycle across the states: |
63 | | - * HOT : block in memory |
64 | | - * EVICTING: block spilled to disk |
65 | | - * COLD: block persisted on disk and to be reload when needed |
66 | | - * </b> |
67 | | - * - Guarantees correctness under concurrent get/put operations with: |
68 | | - * * per-block locks |
69 | | - * * explicit eviction state transitions |
70 | | - * </b> |
71 | | - * - Integration with Resettable to support: |
72 | | - * * multiple consumers |
73 | | - * * deterministic replay |
74 | | - * * eviction-safe reuse of shared inputs for tee operator |
75 | | - * </b> |
76 | | - * |
77 | | - * Eviction Strategy |
78 | | - * ----------------- |
79 | | - * - Uses FIFO or LRU ordering at block granularity. |
80 | | - * - Eviction is partition-based: |
81 | | - * * blocks are spilled in batches to a single partition file |
82 | | - * * enables high-throughput sequential disk I/O |
83 | | - * - Each evicted block records a (partitionId, offset) for direct reload. |
84 | | - * </b> |
| 56 | + * <h3>1. Purpose</h3> |
| 57 | + * Provides a bounded cache for {@code MatrixBlock}s produced and consumed by OOC |
| 58 | + * streaming operators (e.g., {@code tsmm}, {@code ba+*}). When memory pressure |
| 59 | + * exceeds a configured limit, blocks are transparently evicted to disk and restored |
| 60 | + * on demand. |
85 | 61 | * |
86 | | - * Disk Layout |
87 | | - * ----------- |
88 | | - * - Spill files are append-only partition files |
89 | | - * - Each partition may contain multiple serialized blocks |
90 | | - * - Metadata remains in-memory while block data can be on disk |
91 | | - * </b> |
92 | | - * |
93 | | - * Concurrency Model |
94 | | - * ----------------- |
95 | | - * - Global cache structure guarded by a cache-level lock. |
96 | | - * - Each block has an independent lock and condition variable. |
97 | | - * - Readers wait when a block is in EVICTING state. |
98 | | - * - Disk I/O is performed outside global locks to avoid blocking producers. |
99 | | - * / |
| 62 | + * <h3>2. Lifecycle Management</h3> |
| 63 | + * Blocks transition atomically through three states to ensure data consistency: |
| 64 | + * <ul> |
| 65 | + * <li><b>HOT:</b> The block is pinned in the JVM heap ({@code value != null}).</li> |
| 66 | + * <li><b>EVICTING:</b> A transition state. The block is currently being written to disk. |
| 67 | + * Concurrent readers must wait on the entry's condition variable.</li> |
| 68 | + * <li><b>COLD:</b> The block is persisted on disk. The heap reference is nulled out |
| 69 | + * to free memory, but the container (metadata) remains in the cache map.</li> |
| 70 | + * </ul> |
| 71 | + * |
| 72 | + * <h3>3. Eviction Strategy (Partitioned I/O)</h3> |
| 73 | + * To mitigate I/O thrashing caused by writing thousands of small blocks: |
| 74 | + * <ul> |
| 75 | + * <li>Eviction is <b>partition-based</b>: Groups of "HOT" blocks are gathered into |
| 76 | + * batches (e.g., 64MB) and written sequentially to a single partition file.</li> |
| 77 | + * <li>This converts random I/O into high-throughput sequential I/O.</li> |
| 78 | + * <li>A separate metadata map tracks the {@code (partitionId, offset)} for every |
| 79 | + * evicted block, allowing random-access reloading.</li> |
| 80 | + * </ul> |
| 81 | + * |
| 82 | + * <h3>4. Data Integrity (Re-hydration)</h3> |
| 83 | + * To prevent index corruption during serialization/deserialization cycles, this manager |
| 84 | + * uses a "re-hydration" model. The {@code IndexedMatrixValue} container is <b>never</b> |
| 85 | + * removed from the cache structure. Eviction only nulls the data payload. Loading |
| 86 | + * restores the data into the existing container, preserving the original {@code MatrixIndexes}. |
| 87 | + * |
| 88 | + * <h3>5. Concurrency Model (Lock-Striping)</h3> |
| 89 | + * <ul> |
| 90 | + * <li><b>Global Lock:</b> A coarse-grained lock guards the cache structure |
| 91 | + * (LinkedHashMap) for insertions and deletions.</li> |
| 92 | + * <li><b>Per-Block Locks:</b> Each cache entry has an independent {@code ReentrantLock}. |
| 93 | + * This allows a reader to load Block A from disk while the evictor writes |
| 94 | + * Block B to disk simultaneously.</li> |
| 95 | + * <li><b>Wait/Notify:</b> Readers attempting to access a block in the {@code EVICTING} |
| 96 | + * state will automatically block until the state transitions to {@code COLD}, |
| 97 | + * preventing race conditions.</li> |
| 98 | + * </ul> |
| 99 | + */ |
| 100 | + |
100 | 101 | public class OOCEvictionManager { |
101 | 102 |
102 | 103 | // Configuration: OOC buffer limit as percentage of heap |
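
To make the block lifecycle and per-block locking described in the new Javadoc (sections 2 and 5) concrete, here is a minimal sketch of what a cache entry could look like. All names (`BlockState`, `CacheEntry`, `awaitReadable`, `reloadFromDisk`) are hypothetical and are not taken from the actual `OOCEvictionManager` implementation; the coarse-grained lock around the cache map is omitted for brevity.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Possible lifecycle states of a cached block (illustrative only). */
enum BlockState { HOT, EVICTING, COLD }

/** One cache entry with its own lock/condition (lock striping at block granularity). */
class CacheEntry {
    final ReentrantLock lock = new ReentrantLock();
    final Condition stateChanged = lock.newCondition();
    BlockState state = BlockState.HOT;
    Object value;             // in-memory payload; null while COLD
    int partitionId = -1;     // spill location: which partition file
    long offset = -1;         // spill location: byte offset inside that file

    /** Returns the payload, waiting out an in-flight eviction and re-hydrating if needed. */
    Object awaitReadable() throws InterruptedException {
        lock.lock();
        try {
            while (state == BlockState.EVICTING)
                stateChanged.await();                        // reader blocks until the spill completes
            if (state == BlockState.COLD)
                value = reloadFromDisk(partitionId, offset); // refill the SAME container
            state = BlockState.HOT;
            return value;
        } finally {
            lock.unlock();
        }
    }

    private Object reloadFromDisk(int part, long off) {
        return new Object();  // placeholder for the actual partition-file read
    }
}
```

The key point is that a reader never observes a half-written block: it either sees the HOT payload directly or waits on the entry's condition variable until the evictor signals the transition to COLD.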
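The partition-based eviction of section 3 could look roughly like the following, reusing the hypothetical `CacheEntry`/`BlockState` types from the sketch above. The file name pattern, the 64MB batch constant, and `serialize` are placeholders, not the real SystemDS spill format.

```java
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.List;

class PartitionEvictor {
    private static final long BATCH_BYTES = 64L * 1024 * 1024; // e.g., a 64MB spill batch
    private int nextPartitionId = 0;

    /** Spills a batch of HOT entries sequentially into one partition file. */
    void evictBatch(List<CacheEntry> candidates) throws IOException {
        int partId = nextPartitionId++;
        long written = 0;
        try (DataOutputStream out = new DataOutputStream(
                new FileOutputStream("ooc-part-" + partId + ".bin"))) {
            for (CacheEntry e : candidates) {
                if (written >= BATCH_BYTES)
                    break;                                 // batch budget reached
                e.lock.lock();
                try {
                    if (e.state != BlockState.HOT)
                        continue;                          // skip blocks already spilled
                    e.state = BlockState.EVICTING;         // readers now wait on the condition
                } finally {
                    e.lock.unlock();
                }
                byte[] payload = serialize(e.value);       // disk I/O outside any global lock
                out.write(payload);                        // sequential append, no seeks
                e.lock.lock();
                try {
                    e.partitionId = partId;                // record (partitionId, offset)
                    e.offset = written;
                    e.value = null;                        // free the heap payload only;
                    e.state = BlockState.COLD;             //   the container stays in the map
                    e.stateChanged.signalAll();            // wake readers blocked in awaitReadable
                } finally {
                    e.lock.unlock();
                }
                written += payload.length;
            }
        }
    }

    private byte[] serialize(Object block) {
        return new byte[0];                                // placeholder for block serialization
    }
}
```

Appending a whole batch to one partition file turns many small random writes into a single sequential write, which is where the throughput gain described in the Javadoc comes from.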
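Finally, the re-hydration model of section 4 amounts to filling the payload back into the existing container rather than creating a new one; this fills in the `reloadFromDisk` placeholder from the first sketch. It assumes the serialized block is self-describing, so only the recorded `(partitionId, offset)` pair is needed to locate it.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

class PartitionLoader {
    /** Restores an evicted block into its existing entry, preserving its metadata. */
    void rehydrate(CacheEntry e) throws IOException {
        e.lock.lock();                                     // per-block lock only; no global lock held
        try (RandomAccessFile raf =
                 new RandomAccessFile("ooc-part-" + e.partitionId + ".bin", "r")) {
            raf.seek(e.offset);                            // jump straight to the recorded offset
            e.value = deserialize(raf);                    // refill the SAME container in place
            e.state = BlockState.HOT;
            e.stateChanged.signalAll();                    // notify any reader racing on the state
        } finally {
            e.lock.unlock();
        }
    }

    private Object deserialize(RandomAccessFile raf) throws IOException {
        return new Object();                               // placeholder; assumes a self-describing format
    }
}
```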