|
| 1 | +# Invariants |
| 2 | + |
| 3 | +## State transitions |
| 4 | + |
| 5 | +This section deals with the in-memory state of a running node, the on-disk persistence of a chain, and how they relate to one another with respect to timing and data equivalence. |
| 6 | +The EVM-specific concept of state (e.g. accounts) is orthogonal to the topics in this document. |
| 7 | + |
| 8 | +1. Timing details the ordering guarantees that exist between, for example, in-memory objects and their equivalents on disk. |
| 9 | +2. Data equivalence details how in-memory objects can be recovered from disk after a restart. |
| 10 | + |
| 11 | +### Guiding principles |
| 12 | + |
| 13 | +1. Wherever possible, reuse Ethereum-native concepts i.f.f. there is a clear rationale for equivalence. |
| 14 | + 1. Conversely, introducing a new concept is preferred to ambiguous overloading of an existing one. |
| 15 | +2. Timing guarantees SHOULD be sufficient to avoid race conditions if relied upon and, if insufficient, MUST be documented as such. |
| 16 | + |
| 17 | +### Height mapping |
| 18 | + |
| 19 | +#### Background |
| 20 | + |
| 21 | +The `rawdb` package allows for arbitrary blocks to be stored, only coupling height and hash for canonical blocks; i.e. those in the chain history. |
| 22 | +The canonical block with the greatest height is typically stored as the "head". |
| 23 | +Ethereum consensus has delayed finality and `rawdb` caters for storage of the last-finalized block. |
| 24 | + |
| 25 | +#### Mapping |
| 26 | + |
| 27 | +| SAE | `rawdb` | |
| 28 | +| -------- | --------- | |
| 29 | +| Accepted | Canonical | |
| 30 | +| Executed | Head | |
| 31 | +| Settled | Finalized | |
| 32 | + |
| 33 | +#### Rationale |
| 34 | + |
| 35 | +Accepted and canonical are effectively identical concepts, save for nuanced differences between the consensus mechanisms. |
| 36 | + |
| 37 | +From an API perspective, we treat "latest" (i.e. the chain head) as the last-executed block. |
| 38 | +Mirroring this on disk allows for simple integration with the upstream API implementations. |
| 39 | + |
| 40 | +Although we don't support the "finalized" block label via APIs (using "safe" for settled blocks), the `rawdb` support is otherwise unused in SAE chains. |
| 41 | +As (a) there is no existing, alternative database concept (e.g. `rawdb` support for the latest safe block), and (b) there will be no overloading of concepts (see *Guiding principles*), we reserve `rawdb` functionality intended for the finalized block for the last settled block. |
| 42 | + |
| 43 | +> [!NOTE] |
| 44 | +> These also provide an unambigous inverse, allowing for recovery from disk. |
| 45 | +
|
| 46 | +### Ordering guarantees |
| 47 | + |
| 48 | +The goal of these guarantees is to avoid race conditions. |
| 49 | +These guarantees are not concerned with low-level races typically protected against with locks and atomics, but with the sort that arise due to the interplay of system components like consensus, the execution stream, and APIs. |
| 50 | + |
| 51 | +> [!NOTE] |
| 52 | +> This document is the source of truth and if the code differs there is a bug until proven otherwise. |
| 53 | +> If it is impossible to correct the bug then this document MUST be updated to reflect reality, and an audit of potential race conditions SHOULD be performed. |
| 54 | +
|
| 55 | +#### Definitions |
| 56 | + |
| 57 | +| | | |
| 58 | +| ------ | - |
| 59 | +| $B_n$ | Block at height $n$ |
| 60 | +| $A$ | Set of Accepted blocks |
| 61 | +| $E$ | Set of Executed blocks |
| 62 | +| $S$ | Set of Settled blocks |
| 63 | +| $C$ | Arbitrary condition; e.g. $B_n \in E$ |
| 64 | +| $D(C)$ | Disk artefacts of some $C$ |
| 65 | +| $M(C)$ | Memory artefacts of some $C$ |
| 66 | +| $I(C)$ | Internal indicator of some $C$ |
| 67 | +| $X(C)$ | eXternal indicator of some $C$ |
| 68 | +| $G \implies P$ | Some condition $G$ guarantees another condition $P$ |
| 69 | + |
| 70 | +An internal indicator is any signal of state that can only be accessed by the SAE implementation. |
| 71 | +Examples include the last-accepted, -executed, and -settled block pointers. |
| 72 | + |
| 73 | +An external indicator is any signal of state that can be accessed outside of the SAE implementation, even if in the same process. |
| 74 | +An example is a chain-head subscription, be it in the same binary or over a websocket API. |
| 75 | + |
| 76 | +#### Guarantees |
| 77 | + |
| 78 | +> [!TIP] |
| 79 | +> Guarantees should be thought of as "if I witness the guarantor $G$ then I can assume some prerequisite $P$". |
| 80 | +> This must not be confused with cause and effect as it is the *temporal inverse*: if $G$ guarantees $P$ then $t_P < t_G$ i.e. $P$ *happens before* $G$ in Golang terminology. |
| 81 | +
|
| 82 | +| Guarantor | Prerequisite | Notes | |
| 83 | +| ----------- | ------------------ | ----- | |
| 84 | +| $B_n \in S$ | $B_n \in E$ | Settlement after execution |
| 85 | +| $B_n \in E$ | $B_n \in A$ | Execution after settlement |
| 86 | +| $X(C)$ | $I(C)$ | External indicator after internal indicator |
| 87 | +| $I(C)$ | $M(C)$ | Internal indicator after memory |
| 88 | +| $M(C)$ | $D(C)$ | Memory after disk |
| 89 | +| $D(C)$ | $C$ | Disk after condition |
| 90 | +| $f(B_n)$ | $f(B_{n-1})$ | See below |
| 91 | + |
| 92 | +> [!WARNING] |
| 93 | +> No atomic guarantees are provided with respect to different blocks entering different states. |
| 94 | +> Of note is the lack of atomicity when updating the last-accepted and last-settled internal indicators even though settlement of a block only occurs with acceptance of some later block. |
| 95 | +
|
| 96 | +Realisation of any condition $f(\cdot)$ of a block MUST occur after the *same* condition of the parent block. |
| 97 | +Importantly, this means that code MAY be batched in a way that interleaves state transitions of *different* types. |
| 98 | + |
| 99 | +For example, all blocks settled by acceptance of another block MAY have their in-memory state updated *before* the last-settled block pointer is changed. |
| 100 | +In this case, $M(B_n \in S) \implies M(B_{n-1} \in S)$ (e.g. calls to `Block.MarkSettled()` are strictly ordered) but nothing can be inferred about $I(B_{n-1} \in S)$ until $I(B_n \in S)$ is witnessed (e.g. updating the last-settled pointer MAY occur after all the calls to `MarkSettled()`). |
| 101 | + |
| 102 | +#### Examples |
| 103 | + |
| 104 | +1. If `blocks.Block.Executed()` returns `true` then the block's receipts can be read from the database because $M(B_n \in E) \implies D(B_n \in E)$. |
| 105 | +2. If the atomic pointer to the last-executed block (an internal indicator) is at height $\ge n$ then the post-execution state of block $n$ can be opened because $I(B_{k \ge n} \in E) \implies M(B_k \in E) \implies M(B_n \in E)$. |
0 commit comments