Commit c29d1c7

docs: write runtime vs engine contract

1 parent b66733e commit c29d1c7

3 files changed: +164 −1 lines changed

docs/README.md

Lines changed: 1 addition & 0 deletions
```diff
@@ -29,6 +29,7 @@
 - [Reservation Engine Semantics](./reservation-semantics.md)
 - [Reservation Runtime Seam Evaluation](./reservation-runtime-seam-evaluation.md)
 - [Runtime Extraction Roadmap](./runtime-extraction-roadmap.md)
+- [Runtime vs Engine Contract](./runtime-vs-engine-contract.md)
 - [Snapshot File Seam Evaluation](./snapshot-file-seam-evaluation.md)
 - [Revoke Safety Slice](./revoke-safety-slice.md)
 - [Operator Runbook](./operator-runbook.md)
```

docs/runtime-vs-engine-contract.md

Lines changed: 162 additions & 0 deletions (new file)
# Runtime vs Engine Contract

## Purpose

This document is the focused internal contract for `M13-T01`.

Use it when deciding whether new code belongs in the shared runtime substrate or inside one engine.

The rule is simple:

- if the code only preserves bounded durable execution discipline, it may belong in the shared runtime
- if the code defines domain meaning, it belongs in the engine

## Shared Runtime Contract

The shared runtime exists to preserve trusted substrate behavior across engines.

It owns:

- bounded retirement bookkeeping
- WAL frame encoding, validation, checksums, and torn-tail detection
- append-only WAL file mechanics
- rewrite and truncation file mechanics

It currently maps to:

- `allocdb-retire-queue`
- `allocdb-wal-frame`
- `allocdb-wal-file`
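To make the substrate side of the line concrete, here is a minimal sketch of the frame discipline `allocdb-wal-frame` owns: length-prefixed frames with a version byte and a checksum, where a short read at the tail is classified as a torn tail (a clean truncation point) and a checksum mismatch is classified as corruption. The names, layout, and FNV-1a stand-in checksum are illustrative assumptions, not the real crate API.

```rust
const FRAME_VERSION: u8 = 1;

#[derive(Debug, PartialEq)]
enum Decoded {
    Frame { payload: Vec<u8>, consumed: usize },
    TornTail, // incomplete trailing bytes: stop replay here
    Corrupt,  // checksum or version mismatch: refuse the log
}

// FNV-1a as a stand-in checksum; the real crate may use a stronger one.
fn checksum(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0x811c9dc5u32, |h, b| (h ^ *b as u32).wrapping_mul(0x01000193))
}

// Frame layout: [len: u32 LE][version: u8][checksum: u32 LE][payload].
fn encode(payload: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(9 + payload.len());
    out.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    out.push(FRAME_VERSION);
    out.extend_from_slice(&checksum(payload).to_le_bytes());
    out.extend_from_slice(payload);
    out
}

fn decode(buf: &[u8]) -> Decoded {
    if buf.len() < 9 {
        return Decoded::TornTail; // header itself is incomplete
    }
    let len = u32::from_le_bytes(buf[0..4].try_into().unwrap()) as usize;
    if buf[4] != FRAME_VERSION {
        return Decoded::Corrupt;
    }
    let sum = u32::from_le_bytes(buf[5..9].try_into().unwrap());
    if buf.len() < 9 + len {
        return Decoded::TornTail; // payload cut off mid-write
    }
    let payload = &buf[9..9 + len];
    if checksum(payload) != sum {
        return Decoded::Corrupt;
    }
    Decoded::Frame { payload: payload.to_vec(), consumed: 9 + len }
}

fn main() {
    let frame = encode(b"hello");
    assert!(matches!(decode(&frame), Decoded::Frame { .. }));
    assert_eq!(decode(&frame[..frame.len() - 1]), Decoded::TornTail);
    let mut bad = frame.clone();
    *bad.last_mut().unwrap() ^= 0xff;
    assert_eq!(decode(&bad), Decoded::Corrupt);
}
```

Note that nothing here interprets the payload: the frame layer sees only bytes, lengths, checksums, and versions, which is exactly the boundary the contract draws.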
### Shared runtime may know about

- bytes
- lengths
- checksums
- frame versions
- file descriptors and paths
- bounded queue behavior
- truncation and rewrite discipline
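"Bounded queue behavior" can be sketched in the spirit of `allocdb-retire-queue`: a fixed-capacity FIFO that never grows past its bound and reports refusal instead of silently dropping entries. The type and method names here are illustrative assumptions, not the real crate API.

```rust
use std::collections::VecDeque;

struct RetireQueue<T> {
    cap: usize,
    items: VecDeque<T>,
}

impl<T> RetireQueue<T> {
    fn new(cap: usize) -> Self {
        Self { cap, items: VecDeque::with_capacity(cap) }
    }

    /// Push a retired entry; refuse (rather than grow) when the bound is
    /// hit, so callers must drain before retiring more.
    fn push(&mut self, item: T) -> Result<(), T> {
        if self.items.len() == self.cap {
            return Err(item);
        }
        self.items.push_back(item);
        Ok(())
    }

    fn pop(&mut self) -> Option<T> {
        self.items.pop_front()
    }
}

fn main() {
    let mut q = RetireQueue::new(2);
    assert!(q.push(10u64).is_ok());
    assert!(q.push(11).is_ok());
    assert_eq!(q.push(12), Err(12)); // bound enforced: no silent growth
    assert_eq!(q.pop(), Some(10));   // FIFO drain order
}
```

The queue is generic over `T` on purpose: it must never know whether the entries it holds are leases, holds, or slots.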
### Shared runtime must not know about

- commands
- result codes
- resources, buckets, pools, holds, reservations, or leases
- snapshot schemas
- engine invariants
- replay semantics above raw framing
## Engine Contract

Each engine owns the database-specific meaning layered on top of the substrate.

It owns:

- command surfaces
- domain config
- state-machine invariants
- snapshot schemas
- recovery semantics
- read models and result surfaces

Today that means each engine keeps local ownership of:

- command enums and codecs above raw frame bytes
- snapshot encode/decode
- snapshot file wrappers while formats still differ
- top-level recovery entry points
- logical-slot behavior such as refill, expiry, revoke, reclaim, and fencing
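The first item above is where the semantic line is easiest to see. A hypothetical sketch of an engine-local command enum and codec, the layer that gives meaning to a raw frame payload, might look like this; the variants and wire layout are illustrative only, not the actual `reservation-core` surface:

```rust
#[derive(Debug, PartialEq)]
enum Command {
    Reserve { slot: u32, amount: u64 },
    Release { slot: u32 },
}

// Encode a command into the opaque payload bytes the shared frame layer
// carries. Tag bytes and field layout are engine-defined meaning.
fn encode_command(cmd: &Command) -> Vec<u8> {
    match cmd {
        Command::Reserve { slot, amount } => {
            let mut out = vec![0x01];
            out.extend_from_slice(&slot.to_le_bytes());
            out.extend_from_slice(&amount.to_le_bytes());
            out
        }
        Command::Release { slot } => {
            let mut out = vec![0x02];
            out.extend_from_slice(&slot.to_le_bytes());
            out
        }
    }
}

fn decode_command(bytes: &[u8]) -> Option<Command> {
    let (tag, rest) = bytes.split_first()?;
    match (*tag, rest) {
        (0x01, rest) if rest.len() == 12 => Some(Command::Reserve {
            slot: u32::from_le_bytes(rest[0..4].try_into().ok()?),
            amount: u64::from_le_bytes(rest[4..12].try_into().ok()?),
        }),
        (0x02, rest) if rest.len() == 4 => Some(Command::Release {
            slot: u32::from_le_bytes(rest[0..4].try_into().ok()?),
        }),
        // Unknown tag or malformed length: the engine decides the policy.
        _ => None,
    }
}

fn main() {
    let cmd = Command::Reserve { slot: 7, amount: 64 };
    let bytes = encode_command(&cmd);
    assert_eq!(decode_command(&bytes), Some(cmd));
    assert_eq!(decode_command(&[0xff]), None);
}
```

The shared runtime only ever sees the `Vec<u8>` this codec produces; the enum, the tags, and the rejection policy stay engine-local.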
71+
72+
## Placement Rules
73+
74+
When adding new code, apply these rules in order.
75+
76+
### Rule 1
77+
78+
Start engine-local by default.
79+
80+
Do not begin from "how can this be shared?" Begin from "what engine behavior am I expressing?"
81+
82+
### Rule 2
83+
84+
Move code into shared runtime only if the seam is already proven.
85+
86+
That means at least one of:
87+
88+
- the code is mechanically identical across engines
89+
- the same fix is being repeated in multiple engines
90+
- a new engine slice would clearly avoid copy-paste by using the shared layer
91+
92+
### Rule 3
93+
94+
Keep extraction below the semantic line.
95+
96+
Good shared-runtime candidates:
97+
98+
- durable bytes-on-disk framing
99+
- bounded retirement structures
100+
- file rewrite and truncation helpers
101+
102+
Bad shared-runtime candidates:
103+
104+
- generic state-machine traits
105+
- generic reserve/confirm/release APIs
106+
- generic snapshot schemas
107+
- generic engine config layers
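As a concrete instance of the "file rewrite and truncation helpers" bucket, a good shared-runtime candidate looks like this minimal sketch: an atomic file-rewrite helper that knows only about paths and bytes and carries no engine meaning, so it could plausibly live beside `allocdb-wal-file`. The function name and temp-file suffix are illustrative assumptions.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Rewrite `path` atomically: write the new bytes to a sibling temp file,
/// flush and sync them, then rename over the original so readers never
/// observe a half-written file.
fn atomic_rewrite(path: &Path, bytes: &[u8]) -> io::Result<()> {
    let tmp = path.with_extension("rewrite.tmp");
    let mut f = File::create(&tmp)?;
    f.write_all(bytes)?;
    f.sync_all()?; // durability point before the rename makes it visible
    fs::rename(&tmp, path)?;
    Ok(())
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("atomic_rewrite_demo.bin");
    atomic_rewrite(&path, b"v1")?;
    atomic_rewrite(&path, b"v2")?;
    assert_eq!(fs::read(&path)?, b"v2"); // only the full rewrite is visible
    fs::remove_file(&path)?;
    Ok(())
}
```

A production version would also fsync the parent directory after the rename for full crash safety; the point here is only that nothing in the helper mentions commands, snapshots, or slots.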
108+
109+
### Rule 4
110+
111+
If an extraction needs engine-specific switches, it is not ready.
112+
113+
Examples of bad signals:
114+
115+
- feature flags that mirror engine names
116+
- runtime branches on allocator/quota/reservation semantics
117+
- generic types that only one engine can meaningfully use
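An illustrative anti-pattern for these bad signals (invented for this note, not real repository code): a "shared" helper that branches on engine identity. The moment the substrate asks "which engine am I serving?", it is carrying semantics that belong engine-local, and the extraction is not ready.

```rust
enum Engine {
    Alloc,
    Quota,
    Reservation,
}

// Bad: engine names leak into the shared layer as runtime branches.
// Each arm below is engine meaning; under Rule 4 each belongs in its
// engine, not behind a shared switch.
fn replay_policy(engine: &Engine) -> &'static str {
    match engine {
        Engine::Alloc => "re-fence leases after replay",
        Engine::Quota => "re-run refill after replay",
        Engine::Reservation => "expire stale holds after replay",
    }
}

fn main() {
    assert_eq!(replay_policy(&Engine::Quota), "re-run refill after replay");
}
```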
## Current Map

### Shared now

- `allocdb-retire-queue`
- `allocdb-wal-frame`
- `allocdb-wal-file`

### Deferred

- `snapshot_file`
  - only clean inside the `quota-core` / `reservation-core` pair
- bounded collections beyond `retire_queue`
  - still need a stable multi-engine shape
- recovery helpers above frame/file mechanics
  - still coupled to engine-local replay contracts

### Explicitly engine-local

- `allocdb-core` lease and fencing semantics
- `quota-core` debit and refill semantics
- `reservation-core` hold and expiry semantics
## Authoring Checklist

Before extracting any new module, answer these questions:

1. Is this code below the semantic line?
2. Is the shape already proven across multiple engines?
3. Would extraction reduce copy-paste immediately?
4. Can the shared module avoid engine-specific branches?

If any answer is "no", keep the code local.
## Practical Use

When writing a new engine or engine slice:

1. use the shared runtime only for already-extracted substrate
2. implement new semantics locally
3. copy new runtime-adjacent code locally if the seam is still uncertain
4. extract later only under demonstrated pressure

That keeps the repository honest and keeps future library claims evidence-based.

docs/status.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -217,4 +217,4 @@
 - the next recommended step remains downstream real-cluster e2e work such as `gpu_control_plane`, not more unplanned lease-kernel semantics work; the current deployment slice covers a first in-cluster `StatefulSet` shape, but bootstrap-primary routing, failover/rejoin orchestration, and background maintenance remain operator work, and the current staging unblock path is to publish `skel84/allocdb` from GitHub Actions rather than relying on the local Docker engine
 - PR `#107` merged the `M10` quota-engine proof on `main`, and PRs `#116`, `#117`, and `#118` merged the full `M11` reservation-core chain on `main`: the repository now has a second and third deterministic engine with bounded command sets, logical-slot refill/expiry, and snapshot/WAL recovery proofs
 - PRs `#132`, `#133`, and `#134` merged the first `M12` runtime extractions on `main`: `retire_queue`, `wal`, and `wal_file` are now shared internal substrate instead of copied engine-local modules, while `M12-T04` closed as a defer decision because `snapshot_file` is still only a clean seam inside the `quota-core` / `reservation-core` pair and `allocdb-core` keeps the simpler file format
-- the next roadmap step is now `M13`: define the internal engine authoring boundary in `runtime-extraction-roadmap.md` and stop extraction pressure until that contract is written down; the authoring rule is to keep shared runtime below the semantic line and keep command surfaces, snapshot schemas, recovery entry points, and state-machine meaning engine-local
+- the next roadmap step is now `M13`: define the internal engine authoring boundary in `runtime-extraction-roadmap.md` and stop extraction pressure until that contract is written down; the authoring rule is to keep shared runtime below the semantic line and keep command surfaces, snapshot schemas, recovery entry points, and state-machine meaning engine-local, then publish the focused `runtime-vs-engine-contract` note as the shorter authoring reference for future engine work
```
