|
| 1 | +# Introduction |
| 2 | + |
| 3 | +This work-in-progress document summarizes the structure and behaviors of the Haskell Leios simulator. |
| 4 | +The desired level of detail is between the code itself and the various work-in-progress Leios specifications. |
| 5 | + |
| 6 | +The Leios node is modeled as a set of threads that maintain shared state via the [stm](https://hackage.haskell.org/package/stm) library, as with the existing `ouroboros-network` and `ouroboros-consensus` implementation libraries. |
| 7 | +This document will primarily describe those threads and the components of their shared state. |
| 8 | + |
| 9 | +TODO also discuss the mini protocol multiplexing and TCP model |
| 10 | + |
| 11 | +# Lifetime of an object |
| 12 | + |
| 13 | +The objects within the simulator include Input Blocks (IBs), Endorser Blocks (EBs) (aka Endorsement Blocks, aka Endorse Blocks), Vote Bundles (VBs), and Ranking Blocks (RBs). |
| 14 | +Certificates are not explicit; for example, a certificate's computational cost is instead associated with the containing RB. |
| 15 | + |
| 16 | +Within a single simulated node, the lifetime of every such object follows a common sequence. |
| 17 | + |
| 18 | +- *Generate*, the duration when the node is generating an object. |
| 19 | +- *Receive*, the moment a node receives (the entirety of) an object from a peer. |
| 20 | + (It is often useful to consider a node to have received an object when it finished generating that object.) |
| 21 | +- *Wait*, the duration when the node cannot yet validate an object (eg a known necessary input is missing). |
| 22 | +- *Validate*, the duration when the node is computing whether the object is valid. |
| 23 | +- *Diffuse*, the duration when the node is offering/sending the object to its peers. |
| 24 | +- *Adopt*, the moment the node updates its state in response to the successful validation. |
| 25 | +- *Prune*/*Forget*, the moment the node updates its state once the object is no longer necessary/relevant. |
| 26 | + |
| 27 | +Only generation and validation consume modeled CPU, and nothing consumes any modeled RAM/disk capacity/bandwidth. |
| 28 | + |
| 29 | +Modeled CPU consumption for a given object happens all at once and at most once. |
| 30 | +For example, the IBs transitively included by an RB do not affect the cost of adopting that RB. |
| 31 | + |
| 32 | +# Threads |
| 33 | + |
| 34 | +The `LeiosProtocol.Short.Node.leiosNode` function constructs the node's set of threads. |
| 35 | + |
| 36 | +## Generate threads |
| 37 | + |
| 38 | +At the onset of each slot, the node generates whichever IBs, EBs, VBs, and RBs are required by the mocked Praos and Leios election lotteries. |
| 39 | + |
| 40 | +### Mocked leader schedules |
| 41 | + |
| 42 | +Different objects arise at different rates, but the simulator reuses some common infrastructure for them. |
| 43 | +In particular, for each slot, each node is given a random number between 0 and 1 (inclusive) (ie `uniformR (0, 1 :: Double)` from the [random](https://hackage-content.haskell.org/package/random-1.3.1/docs/System-Random.html#v:uniformR) package). |
| 44 | +For each object, that number will be mapped to a number of "wins", ie elections, via [Inverse transform sampling](https://en.wikipedia.org/wiki/Inverse_transform_sampling). |
| 45 | + |
| 46 | +The probability distribution of wins is parameterized on the node's stake, which varies per node but not per slot, and on a corresponding protocol parameter, which only varies per kind of object. |
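
The mapping from a uniform sample to a number of wins can be sketched as follows. This is a minimal illustration of inverse transform sampling, not the simulator's code: the names `poissonPmf`, `bernoulliPmf`, and `winsFromSample` are hypothetical, and the convention that small samples map to few wins is an assumption.

```haskell
-- Hypothetical helper names; not taken from the simulator's code.

-- | Probability mass function of Poisson(lambda) at k.
poissonPmf :: Double -> Int -> Double
poissonPmf lambda k =
  exp (negate lambda) * product [lambda / fromIntegral j | j <- [1 .. k]]

-- | Probability mass function of Bernoulli(p) over {0, 1}.
bernoulliPmf :: Double -> Int -> Double
bernoulliPmf p 0 = 1 - p
bernoulliPmf p 1 = p
bernoulliPmf _ _ = 0

-- | Inverse transform sampling: return the smallest k whose cumulative
-- probability is at least the uniform sample u.
winsFromSample :: (Int -> Double) -> Double -> Int
winsFromSample pmf u = go 0 (pmf 0)
  where
    go k cdf
      -- The cap of 1000 only guards against rounding when u is very close to 1.
      | u <= cdf || k >= 1000 = k
      | otherwise = go (k + 1) (cdf + pmf (k + 1))
```

For example, with `Poisson(2)` a sample of 0.5 yields two wins, because the cumulative probabilities of 0, 1, and 2 wins are roughly 0.135, 0.406, and 0.677.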
| 47 | + |
| 48 | +### Generating IBs |
| 49 | + |
| 50 | +The IB election lottery allows for a node to generate multiple IBs in a slot. |
| 51 | +Each opportunity within a slot is called a "subslot", but the IBs required by the subslots of a slot are all generated (in subslot order) at the slot's onset. |
| 52 | + |
| 53 | +The probability distribution of the node's IB elections in a slot is determined by the `inputBlockFrequencyPerSlot` parameter. |
| 54 | + |
| 55 | +- If the rate is ≤1, then the distribution is `Bernoulli(stake*inputBlockFrequencyPerSlot)`. |
| 56 | +- If the rate is >1, then the distribution is `Poisson(stake*inputBlockFrequencyPerSlot)`. |
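
The case split above can be written down directly. The following sketch is an illustration under assumptions: the type `WinDistribution` and the functions `ibWinDistribution` and `expectedWins` are hypothetical names, `rate` stands for `inputBlockFrequencyPerSlot`, and `stake` is assumed to be the node's relative stake in the interval from 0 to 1.

```haskell
-- | Hypothetical description of the per-slot distribution of wins.
data WinDistribution
  = Bernoulli Double -- ^ at most one win, with the given probability
  | Poisson Double -- ^ any number of wins, with the given mean
  deriving (Eq, Show)

-- | Choose the distribution from the rate and the node's relative stake.
ibWinDistribution :: Double -> Double -> WinDistribution
ibWinDistribution rate stake
  | rate <= 1 = Bernoulli (stake * rate)
  | otherwise = Poisson (stake * rate)

-- | Expected number of wins per slot under either distribution.
expectedWins :: WinDistribution -> Double
expectedWins (Bernoulli p) = p
expectedWins (Poisson lambda) = lambda
```

Both branches have the same expected number of wins, `stake * rate`; they differ only in whether more than one win per slot is possible.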
| 57 | + |
| 58 | +Each IB (see `LeiosProtocol.Common.InputBlock`) consists of the following fields. |
| 59 | + |
| 60 | +- A globally unique ID, which for convenience is the ID of the issuing node and an incrementing counter. |
| 61 | +- The slot and subslot of its (implicit) election proof. |
| 62 | +- The hash of an RB. |
| 63 | +- The byte size of the IB header. |
| 64 | +- The byte size of the IB body. |
| 65 | + |
| 66 | +More details for some fields. |
| 67 | + |
| 68 | +- The RB hash is the youngest RB on the node's selection for which the node has already computed the ledger state. |
| 69 | +- The header byte size is the constant `inputBlockHeader`. |
| 70 | +- The body byte size is the constant `inputBlockBodyAvgSize`. |
| 71 | + |
| 72 | +Each generated IB begins diffusing immediately and is adopted immediately. |
| 73 | +If the node should validate its IB before diffusion and adoption, then that cost should be included in the generation cost. |
| 74 | + |
| 75 | +### Generating EBs |
| 76 | + |
| 77 | +The EB leader schedule allows for a node to generate at most one EB in a slot. |
| 78 | +TODO the Short Leios specification requires that all EBs are created at the beginning of Endorse, even if their election slot is not the first slot in the stage. |
| 79 | + |
| 80 | +The probability distribution of the node's EB elections in a slot is determined by the `endorseBlockFrequencyPerStage` parameter. |
| 81 | + |
| 82 | +- If the rate is ≤1, then the distribution is `Bernoulli(stake*endorseBlockFrequencyPerStage)`. |
| 83 | +- If the rate is >1, then the distribution is `Bernoulli(1 - PoissonPmf(0; stake*endorseBlockFrequencyPerStage))`. |
| 84 | + (Note the subtle `min 1 . f` in the definition of `endorseBlockRate`.) |
| 85 | + |
| 86 | +*Remark*. |
| 87 | +Those probability distributions converge as `stake` approaches 0. |
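
The remark can be checked numerically. In this sketch `rate` stands for the rate used in the case split above and `stake` for the node's relative stake; all three function names are hypothetical.

```haskell
-- | Win probability in the low-rate branch: Bernoulli(stake * rate).
lowRateProbability :: Double -> Double -> Double
lowRateProbability rate stake = stake * rate

-- | Win probability in the high-rate branch: one minus the Poisson
-- probability of zero wins, ie 1 - exp(-(stake * rate)).
highRateProbability :: Double -> Double -> Double
highRateProbability rate stake = 1 - exp (negate (stake * rate))

-- | Ratio of the two probabilities; it tends to 1 as stake tends to 0.
probabilityRatio :: Double -> Double -> Double
probabilityRatio rate stake =
  highRateProbability rate stake / lowRateProbability rate stake
```

Since `1 - exp(-x)` expands to `x - x^2/2 + ...`, the ratio is approximately `1 - stake * rate / 2`, which approaches 1 as `stake` shrinks.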
| 88 | + |
| 89 | +Each EB (see `LeiosProtocol.Common.EndorseBlock`) consists of the following fields. |
| 90 | + |
| 91 | +- A globally unique ID, which for convenience is the ID of the issuing node and an incrementing counter. |
| 92 | +- The slot of its (implicit) election proof. |
| 93 | +- The list of IB IDs. |
| 94 | +- The list of EB IDs. |
| 95 | +- The byte size. |
| 96 | + |
| 97 | +More details for some fields. |
| 98 | + |
| 99 | +- An EB from iteration `i` includes the IDs of all IBs that were already adopted, are also from iteration `i`, and arrived before the end of `i`'s Deliver2 stage. |
| 100 | +- If the Leios variant is set to `short`, this EB includes no EB IDs. |
| 101 | +- If the Leios variant is set to `full`, an EB from iteration `i` includes the ID of the best eligible EB from each iteration with any eligible EBs. |
| 102 | + - An eligible EB has already been adopted, has already been certified, and is from an iteration in the closed interval `[i - min i (2 + pipelinesToReferenceFromEB), i-3]`. |
| 103 | + - The best eligible EB from the eligible EBs within a particular iteration has more IBs and on a tie arrived earlier. |
| 104 | +- The byte size is computed as `ebSizeBytesConstant + ebSizeBytesPerIb * (#IBs + #EBs)`. |
| 105 | +- (TODO The field with EB IDs is `endorseBlocksEarlierPipeline`, not `endorseBlocksEarlierStage`. |
| 106 | + The latter is a stub related to equivocation detection; it is always empty during the simulation.) |
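
The iteration interval and the size formula can be made concrete with a small sketch. The function names are hypothetical, the parameters are passed explicitly rather than read from the simulator's configuration, and iterations are assumed to be numbered from 0.

```haskell
-- | Closed interval of iterations whose EBs an EB from iteration i may
-- reference in the `full` variant; Nothing when the interval is empty.
eligibleIterations :: Int -> Int -> Maybe (Int, Int)
eligibleIterations pipelinesToReferenceFromEB i
  | lo <= hi = Just (lo, hi)
  | otherwise = Nothing
  where
    lo = i - min i (2 + pipelinesToReferenceFromEB)
    hi = i - 3

-- | Byte size of an EB from its two size parameters and its reference counts.
ebSizeBytes :: Int -> Int -> Int -> Int -> Int
ebSizeBytes sizeConstant sizePerIb numIbs numEbs =
  sizeConstant + sizePerIb * (numIbs + numEbs)
```

With `pipelinesToReferenceFromEB = 3`, an EB from iteration 10 may reference iterations 5 through 7, while an EB from iteration 2 has no eligible iterations. The `min i` term keeps the lower bound from going below iteration 0.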
| 107 | + |
| 108 | +Each generated EB begins diffusing immediately and is adopted immediately. |
| 109 | +If the node should validate its EB before diffusion and adoption, then that cost should be included in the generation cost. |
| 110 | + |
| 111 | +### Generating VBs |
| 112 | + |
| 113 | +The VB election lottery schedules a node to generate a VB at the onset of exactly one slot within the first `activeVotingStageLength`-many slots of the voting stage of every iteration. |
| 114 | +The probability distribution of the number of votes in that VB is determined by the `votingFrequencyPerStage` parameter. |
| 115 | +The distribution is `Poisson(stake*votingFrequencyPerStage)`. |
| 116 | + |
| 117 | +Each VB (see `LeiosProtocol.Common.VoteMsg`) consists of the following fields. |
| 118 | + |
| 119 | +- A globally unique ID, which for convenience is the ID of the issuing node and an incrementing counter. |
| 120 | +- The slot of its (implicit) election proof. |
| 121 | +- The number of lottery wins in this slot. |
| 122 | +- The list of voted EB IDs. |
| 123 | +- The byte size. |
| 124 | + |
| 125 | +More details for some fields. |
| 126 | + |
| 127 | +- If all votes are considered to have the same weight, then a VB determines `#wins * #EBs`-many unweighted votes. |
| 128 | + Otherwise, a VB determines `#EBs`-many weighted votes. |
| 129 | +- A VB from iteration `i` includes the IDs of all EBs that satisfy the following. |
| 130 | + - The EB must have already been adopted. |
| 131 | + - The EB must also be from iteration `i`. |
| 132 | + - The EB must only include IBs that have already been adopted, are from iteration `i`, and arrived before the end of `i`'s Endorse stage. |
| 133 | + - The EB must include all IBs that have already been adopted, are from iteration `i`, and arrived before the end of `i`'s Deliver1 stage. |
| 134 | + - If the Leios variant is set to `full`, then let X be the EB's included EBs in iteration order; let Y be the EBs this node would have considered eligible if it were to retroactively create an EB for iteration `i` right now, with the only extra restriction being that it ignores EBs that arrived within Δ_hdr of the end of iteration `i`; then `and (zipWith elem X Y)` must be `True`. |
| 135 | + (TODO the `zipWith` is suspicious; whether it would misbehave in various scenarios depends on many implementation details.) |
| 136 | +- The byte size is computed as `voteBundleSizeBytesConstant + voteBundleSizeBytesPerEb * #EBs` (which implies the weighted-vote perspective). |
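
One reason the `zipWith` deserves scrutiny is that it silently truncates to the shorter list. The sketch below only illustrates that property of `zipWith`; the type `EbId`, the function name, and the representation of Y as one list of eligible IDs per iteration are assumptions, and whether the truncation can occur in the simulator depends on details not covered here.

```haskell
-- | Hypothetical EB identifier.
type EbId = Int

-- | The check from the text: each referenced EB must be among the
-- eligible EBs at the same position.
referencesAcceptable :: [EbId] -> [[EbId]] -> Bool
referencesAcceptable xs ys = and (zipWith elem xs ys)
```

For instance, `referencesAcceptable [7, 9] [[7, 8]]` is `True` even though nothing was checked for the reference `9`, because `zipWith` stops at the end of the shorter list.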
| 137 | + |
| 138 | +Each generated VB begins diffusing immediately and is adopted immediately. |
| 139 | +If the node should validate its VB before diffusion and adoption, then that cost should be included in the generation cost. |
| 140 | + |
| 141 | +### Generating RBs |
| 142 | + |
| 143 | +The RB leader schedule allows for a node to generate at most one RB in a slot. |
| 144 | + |
| 145 | +The probability distribution of the node's RB elections in a slot is determined by the `blockFrequencyPerSlot` parameter. |
| 146 | +The distribution is `Bernoulli(stake*blockFrequencyPerSlot)`. |
| 147 | + |
| 148 | +*Remark*. |
| 149 | +That distribution converges to Praos's `Bernoulli(ϕ_stake(blockFrequencyPerSlot))` as `stake` approaches 0. |
| 150 | + |
| 151 | +Each RB (see `LeiosProtocol.Common.RankingBlock`) consists of the following fields. |
| 152 | + |
| 153 | +- The byte size of the RB header. |
| 154 | +- The slot of its (implicit) election proof. |
| 155 | +- The hash of the header content. |
| 156 | +- The hash of the body content. |
| 157 | +- The block number. |
| 158 | +- The hash of its predecessor RB. |
| 159 | +- The byte size of the RB body. |
| 160 | +- A list (TODO which is always length 0 or 1) of EB IDs paired with the IDs and weights of a quorum of votes for that EB. |
| 161 | +- The size of the RB's (implicit) tx payload. |
| 162 | +- The ID of the issuing node. |
| 163 | + |
| 164 | +More details for some fields. |
| 165 | + |
| 166 | +- The RB extends the node's preferred chain. |
| 167 | +- The tx payload is the constant `rankingBlockLegacyPraosPayloadAvgSize`. |
| 168 | +- The EB is the best eligible EB, if any. |
| 169 | + - An eligible EB is certified, from an iteration that doesn't already have a certificate on the extended chain, only references IBs that are already adopted, and is not more than `maxEndorseBlockAgeSlots` slots older than the RB. |
| 170 | + - If the Leios variant is set to `short`, the best of the eligible EBs is oldest, on a tie has more IBs, and on a tie arrived earlier. |
| 171 | + - If the Leios variant is set to `full`, the best of the eligible EBs is youngest, on a tie has more IBs, and on a tie arrived earlier. |
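
The two tie-breaking orders can be expressed as lexicographic comparisons. This is a sketch under assumptions: the `Candidate` record and `bestEligible` are hypothetical, age is measured by slot, and the eligibility filter is taken to have been applied already.

```haskell
import Data.List (minimumBy)
import Data.Ord (Down (..), comparing)

data Variant = Short | Full
  deriving (Eq, Show)

-- | Hypothetical summary of an eligible EB.
data Candidate = Candidate
  { candSlot :: Int -- smaller slot means older EB
  , candNumIbs :: Int -- number of IBs the EB references
  , candArrival :: Double -- arrival time at this node
  }
  deriving (Eq, Show)

-- | Pick the best eligible EB: by age first (oldest for `short`, youngest
-- for `full`), then by more IBs, then by earlier arrival.
bestEligible :: Variant -> [Candidate] -> Maybe Candidate
bestEligible _ [] = Nothing
bestEligible variant cs = Just (minimumBy order cs)
  where
    order = byAge <> comparing (Down . candNumIbs) <> comparing candArrival
    byAge = case variant of
      Short -> comparing candSlot
      Full -> comparing (Down . candSlot)
```

Combining comparison functions with `<>` consults the next criterion only when the previous ones report a tie, which matches the "on a tie" wording.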
| 172 | + |
| 173 | +Each generated RB begins diffusing immediately and is adopted immediately. |
| 174 | +If the node should validate its RB before diffusion and adoption, then that cost should be included in the generation cost. |
| 175 | + |
| 176 | +## Leios diffusion threads |
| 177 | + |
| 178 | +IBs, VBs, and EBs are each diffused via a corresponding instance of the Relay mini protocol and Relay buffer. |
| 179 | +This is a generalization of the TxSubmission mini protocol and the Mempool in `ouroboros-network` and `ouroboros-consensus`. |
| 180 | + |
| 181 | +Each Relay instance involves one thread per inbound connection (aka "peers") and one thread per outbound connection (aka "followers"). |
| 182 | +For an inbound connection, the node (aggressively/rapidly) pulls IB headers (for VBs and EBs, merely an ID paired with a slot) and then selectively pulls the IB bodies (for VBs and EBs, the whole object) it wants in a configurable order/prioritization, which is usually FreshestFirst. |
| 183 | +It is also configurable which of the peers offering the same body the node fetches it from, which is either just the first or all---all can sometimes reduce latency. |
| 184 | +(TODO the real node will likely request from the second peer if the first hasn't yet replied but not the third.) |
| 185 | +For an outbound connection, the roles are switched. |
| 186 | + |
| 187 | +*Remark*. |
| 188 | +RBs do not diffuse via Relay because they form a chain, so one block can't be validated without its predecessors: an otherwise-valid block is invalid if it extends an invalid block. |
| 189 | + |
| 190 | +TODO discuss the other Relay parameters, backpressure, pipelining, etc? |
| 191 | + |
| 192 | +When an IB header arrives, its validation task is enqueued on the model CPU---for VBs and EBs it's just an ID, not a header, so there's no validation. |
| 193 | +Once that finishes, the Relay logic will decide whether it needs to fetch the body. |
| 194 | + |
| 195 | +- An IB body is not fetched if it's older than the slot to which the buffer has already been pruned or if it's already in the buffer. |
| 196 | +- An EB is not fetched if it's older than the slot to which the buffer has already been pruned, it's too old to be included by an RB (see `maxEndorseBlockAgeSlots`), or if it's already in the buffer. |
| 197 | +- A VB is not fetched if it's older than the slot to which the buffer has already been pruned or if it's already in the buffer. |
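
The three fetch rules share a common core, which the following sketch makes explicit. Everything here is hypothetical: the `BufferView` record, the function names, and in particular the boundary conventions (whether the pruning slot and the age limit are inclusive) are assumptions rather than the simulator's definitions.

```haskell
-- | Hypothetical summary of the state a fetch decision depends on.
data BufferView = BufferView
  { prunedBeforeSlot :: Int -- objects from earlier slots have been pruned
  , bufferedIds :: [Int] -- IDs of objects already in the buffer
  }

-- | Common rule for IB bodies and VBs: not pruned and not already buffered.
wantsBody :: BufferView -> Int -> Int -> Bool
wantsBody view objectId objectSlot =
  objectSlot >= prunedBeforeSlot view && objectId `notElem` bufferedIds view

-- | EBs add an age limit relative to the current slot.
wantsEB :: BufferView -> Int -> Int -> Int -> Int -> Bool
wantsEB view maxAgeSlots currentSlot objectId objectSlot =
  wantsBody view objectId objectSlot && currentSlot - objectSlot <= maxAgeSlots
```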
| 198 | + |
| 199 | +Different objects are handled differently when they arrive. |
| 200 | + |
| 201 | +- When an IB that extends the genesis block arrives, its validate-and-adopt task is enqueued on the model CPU. |
| 202 | +- When an IB that extends a non-genesis RB arrives, its validate-and-adopt task is added to `waitingForLedgerStateVar`. |
| 203 | +- When an EB arrives, its validate-and-adopt task is enqueued on the model CPU. |
| 204 | +- When a VB arrives, its validate-and-adopt task is enqueued on the model CPU. |
| 205 | + |
| 206 | +## Praos diffusion threads |
| 207 | + |
| 208 | +TODO it's ChainSync and BlockFetch, but how much of `ouroboros-network` and `ouroboros-consensus` was left out? |
| 209 | + |
| 210 | +- When an RB that extends the genesis block arrives, its validate-and-adopt task is enqueued on the model CPU. |
| 211 | +- When an RB that extends a non-genesis RB and has no tx payload arrives, its validate-and-adopt task is added to `waitingForRBVar`. |
| 212 | +- When an RB that extends a non-genesis RB and has some tx payload arrives, its validate-and-adopt task is added to `waitingForLedgerStateVar`. |
| 213 | + |
| 214 | +## Wait-Validate-Adopt threads |
| 215 | + |
| 216 | +There are three threads that reactively notice when a heretofore missing input becomes available, analogous to out-of-order execution via functional units in a superscalar processor. |
| 217 | + |
| 218 | +- A thread triggered by the adoption of an RB; see `waitingForRBVar`. |
| 219 | +- A thread triggered by the construction of an RB's ledger state; see `waitingForLedgerStateVar`. |
| 220 | + (With some Leios variants, the RB validation no longer necessarily provides a ledger state.) |
| 221 | +- A thread triggered by the adoption of an IB; see `ibsNeededForEBVar`. |
| 222 | + |
| 223 | +There's also a similar, more general thread that models the scheduling of outstanding tasks on a set of CPU cores, since a block cannot be validated until some modeled CPU core is available; see `taskQueue`. |
| 224 | + |
| 225 | +Those threads enable the following tasks to happen as soon as the necessary inputs and some CPU core are available. |
| 226 | +Because those threads use STM to read both the state of pending tasks as well as the state of available inputs, it does not matter if the task or the final input arrives first. |
| 227 | + |
| 228 | +- The node must adopt the preceding RB before validating an RB that has no tx payload. |
| 229 | +- The node must construct the ledger state resulting from the preceding RB before it can validate an RB that has some tx payload. |
| 230 | +- The node must construct the ledger state resulting from the identified RB before it can validate an IB. |
| 231 | +- The node must adopt all transitively included IBs before it can construct the ledger state resulting from an RB with a certified EB. |
| 232 | + (TODO this thread has some complicated and unrealistic logic, since the simulator has no way to acquire "missing" IBs that are no longer diffusing.) |
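
The order-independence described above is a general property of STM's blocking reads, which the following sketch demonstrates with the [stm](https://hackage.haskell.org/package/stm) library. It is not the simulator's logic: `awaitInput`, `runTask`, the list-of-names state, and the `"rb-1"` label are all hypothetical simplifications.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Concurrent.STM
  (STM, TVar, atomically, check, modifyTVar', newTVarIO, readTVar)

-- | Block (via STM retry) until the named input is present.
awaitInput :: TVar [String] -> String -> STM ()
awaitInput availableVar name = do
  available <- readTVar availableVar
  check (name `elem` available)

-- | Run one task that needs the input "rb-1", with the input delivered
-- either before or after the task starts waiting.
runTask :: Bool -> IO String
runTask inputFirst = do
  availableVar <- newTVarIO []
  result <- newEmptyMVar
  let deliver = atomically (modifyTVar' availableVar ("rb-1" :))
      task = do
        atomically (awaitInput availableVar "rb-1")
        putMVar result "validated"
  if inputFirst
    then deliver >> forkIO task >> takeMVar result
    else forkIO task >> deliver >> takeMVar result
```

Both `runTask True` and `runTask False` return `"validated"`: if the input is already present the transaction commits at once, and otherwise `check` makes it retry when `availableVar` changes.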
| 233 | + |
| 234 | +The existence of those threads enables very simple logic for the adoption tasks. |
| 235 | + |
| 236 | +- The node adopts a validated IB by starting to diffuse it, adding its `UTCTime` arrival to `ibDeliveryTimesVar`, and removing the IB from the todo lists in `ibsNeededForEBVar`. |
| 237 | +- The node adopts a validated EB by starting to diffuse it, adding it to `relayEBState`, and adding a corresponding todo list of the not-already-available IBs to `ibsNeededForEBVar`. |
| 238 | +- The node adopts a validated VB by starting to diffuse it and adding it to `votesForEBVar`. |
| 239 | +- The node adopts a validated RB by starting to diffuse it and including it whenever calculating its selection; see `preferredChain`. |
| 240 | + |
| 241 | +*Remark*. |
| 242 | +The "starting to diffuse" element of each step is somewhat hard to see in the code because it's achieved via callbacks. |
| 243 | +The Relay component invokes the given callback when some object arrives, and that invocation includes another callback that starts diffusing the object. |
| 244 | + |
| 245 | +## Pruning threads |
| 246 | + |
| 247 | +- *IBs 1*. |
| 248 | + At the end of the Vote(Send) stage for iteration `i`, the node stops diffusing all IBs from `i`. |
| 249 | + (TODO this should happen at the end of the Endorse stage, but this buffer is being abused as the adoption buffer as well.) |
| 250 | + It also forgets any of those IBs it had adopted, with the exception of their arrival time, which is used when generating VBs. |
| 251 | + See `relayIBState`. |
| 252 | +- *EBs 1*. |
| 253 | + At the end of the Vote(Recv) stage for iteration `i`, the node stops diffusing and completely forgets all EBs from `i` that are not already certified. |
| 254 | + See `relayEBState`, `votesForEBVar`, and `ibsNeededForEBVar`. |
| 255 | +- *VBs* and *IBs 2*. |
| 256 | + At the end of the Vote(Recv) stage for iteration `i`, the node stops diffusing and completely forgets all VBs from `i`, except that certified EBs from `i` remember the ID and multiplicity of the VBs that first met quorum. |
| 257 | + It also forgets the arrival time of IBs from `i`. |
| 258 | + See `relayVoteState` and `ibDeliveryTimesVar`. |
| 259 | +- *EBs 2*. |
| 260 | + If the Leios variant is set to `short`, then `maxEndorseBlockAgeSlots` after the end of the Endorse stage for iteration `i`, the node stops diffusing and forgets all EBs from `i` that were certified but are not included by an RB on the selected chain. |
| 261 | + (TODO these blocks should have stopped diffusing a long time ago, assuming `maxEndorseBlockAgeSlots >> sliceLength`) |
| 262 | + If the Leios variant is set to `full`, the node never forgets a certified EB. |
| 263 | + See `relayEBState`, `votesForEBVar`, and `ibsNeededForEBVar`. |
| 264 | +- The node never forgets an RB. |
| 265 | + |
| 266 | +# State |
| 267 | + |
| 268 | +The `LeiosProtocol.Short.Node.LeiosNodeState` record type declares the state shared by the threads. |
| 269 | + |
| 270 | +## Leios Diffusion state |
| 271 | + |
| 272 | +TODO `relayIBState`, `relayEBState`, `relayVoteState` |
| 273 | + |
| 274 | +## Waiting&Validation state |
| 275 | + |
| 276 | +TODO `ibsNeededForEBVar`, `waitingForRBVar`, `waitingForLedgerStateVar`, `ledgerStateVar`, `ibsValidationActionsVar` |
| 277 | + |
| 278 | +TODO include `taskQueue` |
| 279 | + |
| 280 | +## Adopted IBs state |
| 281 | + |
| 282 | +TODO `relayIBState` abuse |
| 283 | + |
| 284 | +TODO `ibDeliveryTimesVar` |
| 285 | + |
| 286 | +## Adopted EBs state |
| 287 | + |
| 288 | +TODO `relayEBState` abuse |
| 289 | + |
| 290 | +## Adopted VBs state |
| 291 | + |
| 292 | +TODO `votesForEBVar` |
| 293 | + |
| 294 | +## Adopted RBs & Praos Diffusion state |
| 295 | + |
| 296 | +TODO |