| 1 | +# Snapshot File Seam Evaluation |
| 2 | + |
| 3 | +## Purpose |
| 4 | + |
| 5 | +This document closes `M12-T04` by evaluating whether `snapshot_file` is now clean enough to |
| 6 | +extract as another shared internal runtime module after: |
| 7 | + |
| 8 | +- shared `retire_queue` |
| 9 | +- shared `wal` |
| 10 | +- shared `wal_file` |
| 11 | + |
| 12 | +The question is deliberately narrower than "can snapshot persistence be shared in theory?" The |
| 13 | +question is whether the current three-engine code on `main` justifies a real extraction now. |
| 14 | + |
| 15 | +## Decision |
| 16 | + |
| 17 | +Do not extract a shared `snapshot_file` crate yet. |
| 18 | + |
| 19 | +The seam is real only inside the smaller `quota-core` / `reservation-core` pair. It is not yet a |
| 20 | +clean three-engine runtime boundary. |
| 21 | + |
| 22 | +The correct outcome for this slice is: |
| 23 | + |
| 24 | +- record that `snapshot_file` is not ready for extraction |
| 25 | +- keep each engine's `snapshot_file` local |
| 26 | +- move on to `M13`, the internal engine authoring boundary |
| 27 | + |
| 28 | +## What Is Shared |
| 29 | + |
| 30 | +All three engines share the same high-level persistence discipline: |
| 31 | + |
| 32 | +- one snapshot file per engine |
| 33 | +- temp-file write, sync, rename, and parent-directory sync |
| 34 | +- snapshot bytes loaded before WAL replay |
| 35 | +- fail-closed behavior on decode or integrity errors |
| 36 | + |
| 37 | +That means there is still real family resemblance at the discipline level. |
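
The write discipline in the list above can be sketched in Rust with only the standard library. This is a minimal illustration under stated assumptions, not code taken from any of the three engines: the function name `write_snapshot_atomically` and the `.tmp` suffix are hypothetical, and the parent-directory sync is shown for Unix only.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: write `bytes` to `path` via temp file, sync, rename,
/// and parent-directory sync. Name and temp suffix are assumptions.
pub fn write_snapshot_atomically(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Fall back to the current directory when the path has no parent component.
    let parent = match path.parent() {
        Some(p) if !p.as_os_str().is_empty() => p,
        _ => Path::new("."),
    };
    let file_name = path.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "snapshot path has no file name")
    })?;

    // Temp file lives next to the target so the rename stays on one filesystem.
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = parent.join(tmp_name);

    // 1. Write the full snapshot to the temp file and flush it to disk.
    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(bytes)?;
    tmp.sync_all()?;
    drop(tmp);

    // 2. Replace the live snapshot in one step.
    fs::rename(&tmp_path, path)?;

    // 3. Sync the parent directory so the rename itself is durable (Unix only here).
    #[cfg(unix)]
    {
        let dir = File::open(parent)?;
        dir.sync_all()?;
    }
    Ok(())
}
```

A reader that finds the target path therefore sees either the previous complete snapshot or the new complete one, which is what lets recovery load snapshot bytes before WAL replay without handling partial writes.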
| 38 | + |
| 39 | +## Where The Seam Breaks |
| 40 | + |
| 41 | +### `allocdb-core` uses a simpler file format |
| 42 | + |
| 43 | +`allocdb-core` still stores only encoded snapshot bytes: |
| 44 | + |
| 45 | +- no footer |
| 46 | +- no checksum |
| 47 | +- no explicit max-bytes bound |
| 48 | +- decode-time corruption detection only |
| 49 | + |
| 50 | +That is materially different from the newer engines. |
| 51 | + |
| 52 | +### `quota-core` and `reservation-core` share a stronger format |
| 53 | + |
| 54 | +`quota-core` and `reservation-core` both use the same stronger file-level discipline: |
| 55 | + |
| 56 | +- footer magic |
| 57 | +- persisted payload length |
| 58 | +- CRC32C checksum |
| 59 | +- explicit `max_snapshot_bytes` |
| 60 | +- oversize rejection before decode |
| 61 | + |
| 62 | +Those two modules are close enough to share helpers later, but that is not the same thing as a |
| 63 | +repository-wide extraction candidate. |
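
To make the stronger format concrete, the sketch below seals a payload with a footer and verifies it on load. Everything specific here is an assumption for illustration: the footer layout (magic, then little-endian `u64` length, then little-endian `u32` CRC32C), the magic value `SNAP`, the error variant names, and applying `max_snapshot_bytes` to the whole file. The real engines may differ on each point.

```rust
// Hypothetical footer layout: payload || magic (4) || payload_len u64 LE || crc32c u32 LE.
const FOOTER_MAGIC: [u8; 4] = *b"SNAP"; // placeholder magic, not the engines' value
const FOOTER_LEN: usize = 4 + 8 + 4;

/// Illustrative integrity errors; the variant names are assumptions.
#[derive(Debug, PartialEq, Eq)]
pub enum SnapshotError {
    TooLarge,
    MissingFooter,
    LengthMismatch,
    ChecksumMismatch,
}

/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
pub fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Append the footer to an encoded snapshot payload.
pub fn seal_snapshot(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(payload.len() + FOOTER_LEN);
    out.extend_from_slice(payload);
    out.extend_from_slice(&FOOTER_MAGIC);
    out.extend_from_slice(&(payload.len() as u64).to_le_bytes());
    out.extend_from_slice(&crc32c(payload).to_le_bytes());
    out
}

/// Check size bound, magic, length, and checksum; return the payload on success.
pub fn verify_snapshot(file: &[u8], max_snapshot_bytes: usize) -> Result<&[u8], SnapshotError> {
    // Oversize rejection happens first, before any decode work.
    if file.len() > max_snapshot_bytes {
        return Err(SnapshotError::TooLarge);
    }
    if file.len() < FOOTER_LEN {
        return Err(SnapshotError::MissingFooter);
    }
    let (payload, footer) = file.split_at(file.len() - FOOTER_LEN);
    if footer[..4] != FOOTER_MAGIC {
        return Err(SnapshotError::MissingFooter);
    }
    let mut len_bytes = [0u8; 8];
    len_bytes.copy_from_slice(&footer[4..12]);
    if u64::from_le_bytes(len_bytes) != payload.len() as u64 {
        return Err(SnapshotError::LengthMismatch);
    }
    let mut crc_bytes = [0u8; 4];
    crc_bytes.copy_from_slice(&footer[12..16]);
    if u32::from_le_bytes(crc_bytes) != crc32c(payload) {
        return Err(SnapshotError::ChecksumMismatch);
    }
    Ok(payload)
}
```

Under these assumptions, `verify_snapshot(&seal_snapshot(bytes), bound)` returns the original payload, while a file with no footer at all (the shape `allocdb-core` writes today) fails with `MissingFooter`. That gap is the format difference a shared module would have to either hide or branch on.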
| 64 | + |
| 65 | +### The remaining commonality is below the current file wrapper |
| 66 | + |
| 67 | +The shared part is mostly: |
| 68 | + |
| 69 | +- temp-file naming |
| 70 | +- write, sync, rename, and parent-directory sync |
| 71 | +- footer read/write mechanics for the newer engines |
| 72 | + |
| 73 | +But the live module boundary still mixes those mechanics with engine-specific constructor and error |
| 74 | +surface choices: |
| 75 | + |
| 76 | +- `allocdb-core` has no size-bound constructor argument |
| 77 | +- `quota-core` and `reservation-core` expose integrity-specific error variants |
| 78 | +- the three wrappers are still tied to engine-local snapshot schemas and recovery expectations |
| 79 | + |
| 80 | +That makes a forced crate extraction likely to create awkward generic plumbing rather than reduce |
| 81 | +maintenance cost. |
| 82 | + |
| 83 | +## Why Extraction Is Premature |
| 84 | + |
| 85 | +Extracting now would create a misleading shared layer, in one of two ways: |
| 86 | + |
| 87 | +- it would erase the real `allocdb-core` vs. `quota-core`/`reservation-core` format difference |
| 88 | +- or it would introduce configuration branches that exist mostly to paper over that difference |
| 89 | + |
| 90 | +That is the wrong direction for this roadmap. `M12` is about extracting only what is already |
| 91 | +mechanically shared, not about normalizing divergent modules by force. |
| 92 | + |
| 93 | +The current evidence supports: |
| 94 | + |
| 95 | +- shared `retire_queue` |
| 96 | +- shared `wal` |
| 97 | +- shared `wal_file` |
| 98 | + |
| 99 | +It does not yet support: |
| 100 | + |
| 101 | +- shared `snapshot_file` |
| 102 | + |
| 103 | +## What Would Change The Answer Later |
| 104 | + |
| 105 | +Revisit this seam only if one of these becomes true: |
| 106 | + |
| 107 | +- `allocdb-core` adopts the same footer/checksum/max-bytes discipline as the newer engines |
| 108 | +- repeated snapshot-file fixes land independently in multiple engines |
| 109 | +- a later authoring pass shows the snapshot-file helper boundary can stay below engine-local error |
| 110 | + and schema surfaces |
| 111 | + |
| 112 | +Until then, local duplication is still cheaper than a fake shared abstraction. |
| 113 | + |
| 114 | +## Recommended Next Step |
| 115 | + |
| 116 | +Treat `M12` as complete after this readout. |
| 117 | + |
| 118 | +The next step is `M13`, not more extraction pressure: |
| 119 | + |
| 120 | +1. define the internal engine authoring boundary |
| 121 | +2. write the runtime-vs-engine contract |
| 122 | +3. reassess whether a fourth-engine or reduced-copy proof is still required |