Commit f0c6eae

sim-rs: moar documentation
1 parent 9ec2819 commit f0c6eae

5 files changed

+193
-3
lines changed


sim-rs/IMPLEMENTATION.md

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
1+
# Overall Architecture
2+
3+
The simulation is made of two crates. The sim-core crate actually runs the simulation, producing an in-memory stream of events. The sim-cli crate is mostly a simple CLI wrapper around sim-core; it lightly processes that event stream, logs interesting stats about it, and saves it to disk. The sim uses tokio as an asynchronous runtime, and sim-cli runs it as a multithreaded program.
4+
5+
The primary actors of the simulation are "nodes". A node can be either a stake pool or a relay; relays are simply stake pools without any stake. Nodes communicate with each other by sending messages over a simulated network, which is also implemented as an actor.
6+
7+
The simulation is not fully deterministic: if two events can happen at the same time, either could be simulated "first". For instance, if a node learned about a new transaction from three peers at the same instant, it could choose to request that transaction from any of the three.
8+
9+
## Clock
10+
11+
The simulation uses a virtual clock decoupled from real time. Time passes at the same rate for every node (no actor is ever "ahead" of any others). When an actor has finished simulating the current moment, it can either `wait_until` a specific timestamp or `wait_forever`, and the clock only advances once all actors are waiting for time to pass. Actors can also stop waiting at any time.
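The advance rule above can be sketched as a small single-threaded model. This is only an illustration under stated assumptions: `VirtualClock`, `Wait`, and `try_advance` are hypothetical names, timestamps are plain integers, and the real sim-core clock coordinates asynchronous tokio tasks rather than a vector of states.

```rust
// Hypothetical sketch: not the sim-core clock, which coordinates async tasks.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Wait {
    Running,    // still simulating the current moment
    Until(u64), // called wait_until(timestamp)
    Forever,    // called wait_forever()
}

struct VirtualClock {
    now: u64,
    actors: Vec<Wait>,
}

impl VirtualClock {
    fn new(actor_count: usize) -> Self {
        Self { now: 0, actors: vec![Wait::Running; actor_count] }
    }

    fn wait_until(&mut self, actor: usize, timestamp: u64) {
        self.actors[actor] = Wait::Until(timestamp);
    }

    fn wait_forever(&mut self, actor: usize) {
        self.actors[actor] = Wait::Forever;
    }

    /// Advances to the earliest requested timestamp, but only if every actor
    /// is waiting. Returns None if an actor is still running or if nobody
    /// asked for a specific timestamp.
    fn try_advance(&mut self) -> Option<u64> {
        if self.actors.iter().any(|w| *w == Wait::Running) {
            return None;
        }
        let next = self
            .actors
            .iter()
            .filter_map(|w| match w {
                Wait::Until(t) => Some(*t),
                _ => None,
            })
            .min()?;
        self.now = self.now.max(next);
        // Wake every actor whose deadline has been reached.
        for w in self.actors.iter_mut() {
            if let Wait::Until(t) = *w {
                if t <= self.now {
                    *w = Wait::Running;
                }
            }
        }
        Some(self.now)
    }
}

fn main() {
    let mut clock = VirtualClock::new(3);
    clock.wait_until(0, 50);
    clock.wait_until(1, 20);
    assert_eq!(clock.try_advance(), None); // actor 2 is still running
    clock.wait_forever(2);
    assert_eq!(clock.try_advance(), Some(20)); // earliest deadline wins
    println!("clock advanced to {}", clock.now);
}
```

Because time only moves when nobody is running, no actor can observe a timestamp that another actor has not reached yet.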
12+
13+
The clock has configurable timestamp resolution, which is implemented by rounding up the timestamp passed to `wait_until`. This improves performance by causing more events to happen simultaneously, letting the sim take advantage of multiple cores.
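As a rough illustration of the rounding, the helper below rounds a timestamp up to the next multiple of the resolution. The function name `round_up` and the use of integer time units are assumptions made for this sketch; the source only states that timestamps passed to `wait_until` are rounded up.

```rust
/// Rounds `timestamp` up to the next multiple of `resolution`
/// (both in the same, arbitrary time unit). Hypothetical helper.
fn round_up(timestamp: u64, resolution: u64) -> u64 {
    assert!(resolution > 0, "resolution must be positive");
    let remainder = timestamp % resolution;
    if remainder == 0 {
        timestamp
    } else {
        timestamp + (resolution - remainder)
    }
}

fn main() {
    // With a resolution of 1000 units, two nearby wake-ups land on the same
    // instant and can be simulated simultaneously.
    assert_eq!(round_up(12_345, 1_000), 13_000);
    assert_eq!(round_up(12_900, 1_000), 13_000);
    // Timestamps already on a boundary are unchanged.
    assert_eq!(round_up(13_000, 1_000), 13_000);
    println!("ok");
}
```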
14+
15+
## Network
16+
17+
Nodes communicate with each other over the network through explicitly-configured connections. A connection has latency (in ms) and bandwidth (in bytes per second). Every connection tracks bandwidth independently, so a node with 10 connections at 1MiB/s effectively has 10MiB/s of bandwidth. We also track bandwidth independently in each direction; if there's a 1MiB/s connection between A and B, A can send 1MiB/s to B while B sends 1MiB/s to A.
18+
19+
Every message is tagged with a mini-protocol; within a connection, bandwidth is split equally between all active mini-protocols. If there's a 1MiB/s connection between A and B, and A is sending a transaction and an input block to B, 512KiB/s is dedicated to the TX and 512KiB/s is dedicated to the IB (and no bandwidth is dedicated to EBs/votes/RBs). Messages with the same mini-protocol are sent serially; messages with different mini-protocols are sent in parallel.
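The equal split can be checked with a little arithmetic: a 1 MiB/s connection shared by two active mini-protocols gives each of them 524,288 bytes/s (512 KiB/s). The helper name below is hypothetical and the sketch ignores everything except the division.

```rust
// Illustrative only: `per_protocol_bandwidth` is a hypothetical helper.
const MIB: u64 = 1024 * 1024;

/// Bandwidth (bytes/s) available to each active mini-protocol on one
/// direction of one connection; idle mini-protocols get no share.
fn per_protocol_bandwidth(connection_bps: u64, active_protocols: u64) -> u64 {
    if active_protocols == 0 {
        0
    } else {
        connection_bps / active_protocols
    }
}

fn main() {
    // A is sending a TX and an IB to B over a 1 MiB/s connection.
    let share = per_protocol_bandwidth(MIB, 2);
    assert_eq!(share, 512 * 1024);
    println!("{share} bytes/s per mini-protocol");
}
```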
20+
The network is implemented as an actor, running on its own dedicated task and communicating with other actors through message passing.
21+
22+
We do not need to explicitly support pipelining, since mini-protocols are not implemented as state machines and network activity is fire-and-forget.
23+
24+
When a message is sent over a connection, the delay is computed by applying the bandwidth first, followed by the latency. On a connection with 10MiB/s of bandwidth and 20ms latency, a 1 MiB message will be delayed by 100ms due to bandwidth, plus 20ms due to latency. Multiple messages can be in flight over a connection at a time; all bandwidth will be applied to the first message in the queue until it has been fully "sent".
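The worked example above (1 MiB over 10 MiB/s with 20ms latency) can be reproduced with a short calculation. `message_delay` is a hypothetical name, and the sketch assumes the message has the connection's full bandwidth to itself.

```rust
use std::time::Duration;

const MIB: u64 = 1024 * 1024;

/// Delay for one message that has the whole connection to itself:
/// bandwidth is applied first, then latency is added. Hypothetical helper.
fn message_delay(message_bytes: u64, bandwidth_bps: u64, latency: Duration) -> Duration {
    // Integer arithmetic in nanoseconds avoids floating-point rounding.
    let nanos = message_bytes as u128 * 1_000_000_000 / bandwidth_bps as u128;
    Duration::from_nanos(nanos as u64) + latency
}

fn main() {
    let delay = message_delay(MIB, 10 * MIB, Duration::from_millis(20));
    // 100ms due to bandwidth plus 20ms due to latency.
    assert_eq!(delay, Duration::from_millis(120));
    println!("{delay:?}");
}
```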
25+
26+
27+
# Protocol implementation
28+
29+
## Mini-Protocols
30+
31+
Every mini-protocol uses roughly the same implementation. To propagate some resource X throughout the network, there are three messages involved:
32+
33+
1. `AnnounceX`
34+
2. `RequestX`
35+
3. `X`
36+
37+
When a node first receives or creates X, it sends an `AnnounceX` message to its consumers. When a node receives an `AnnounceX` message, it may decide whether or not to request the resource; if it does, it sends a `RequestX` message in response. When a node receives a `RequestX` message, it responds to the sender with an `X` message. This is not technically a pull-based protocol, but it has similar performance properties to one.
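A minimal sketch of the announce-request-send exchange follows. The `Message` enum, the `Node` struct, and the `handle` method are hypothetical stand-ins; resources are reduced to integer ids, and re-announcing a received resource to consumers is left out.

```rust
use std::collections::HashSet;

// Hypothetical message and node types, with resources reduced to integer ids.
#[derive(Debug, PartialEq)]
enum Message {
    Announce(u64), // AnnounceX
    Request(u64),  // RequestX
    Resource(u64), // X
}

#[derive(Default)]
struct Node {
    have: HashSet<u64>,
    requested: HashSet<u64>,
}

impl Node {
    /// Handles one incoming message and returns the reply for its sender, if any.
    fn handle(&mut self, msg: Message) -> Option<Message> {
        match msg {
            Message::Announce(id) => {
                // Request only resources we neither hold nor already asked for.
                if self.have.contains(&id) || !self.requested.insert(id) {
                    None
                } else {
                    Some(Message::Request(id))
                }
            }
            Message::Request(id) => self.have.contains(&id).then(|| Message::Resource(id)),
            Message::Resource(id) => {
                // A real node would now announce the resource to its own consumers.
                self.have.insert(id);
                None
            }
        }
    }
}

fn main() {
    let mut producer = Node::default();
    producer.have.insert(7);
    let mut consumer = Node::default();

    let request = consumer.handle(Message::Announce(7)).unwrap();
    let resource = producer.handle(request).unwrap();
    assert_eq!(consumer.handle(resource), None);
    assert!(consumer.have.contains(&7));
    println!("resource 7 delivered");
}
```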
38+
39+
The only resources which break this announce-request-send pattern are input blocks. Input block headers and bodies are propagated separately. More information about this is in the "Input Blocks" section of the Leios variant document.
40+
41+
The `relay-strategy` configuration option controls what a node does when two producers have both announced the same resource. If this option is `"request-from-first"`, the node will only request the resource from the first producer to announce it. If the option is `"request-from-all"`, the node will request the resource from every peer which announces it until one of them has successfully delivered it. Most simulations have been run with this set to `"request-from-first"`.
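The two strategies differ only in how a node reacts to a second announcement of the same resource. The sketch below captures that decision; the enum, the parser, and `should_request` are hypothetical names, and only the two option strings come from the configuration described above.

```rust
// Hypothetical types; only the two option strings come from the configuration.
#[derive(Clone, Copy, Debug, PartialEq)]
enum RelayStrategy {
    RequestFromFirst,
    RequestFromAll,
}

fn parse_relay_strategy(value: &str) -> Option<RelayStrategy> {
    match value {
        "request-from-first" => Some(RelayStrategy::RequestFromFirst),
        "request-from-all" => Some(RelayStrategy::RequestFromAll),
        _ => None,
    }
}

/// Decides whether to send a request to a peer that has just announced a
/// resource. `delivered` is true once any peer has delivered the resource;
/// `requests_sent` counts the requests already sent for it.
fn should_request(strategy: RelayStrategy, delivered: bool, requests_sent: usize) -> bool {
    if delivered {
        return false;
    }
    match strategy {
        RelayStrategy::RequestFromFirst => requests_sent == 0,
        RelayStrategy::RequestFromAll => true,
    }
}

fn main() {
    let first = parse_relay_strategy("request-from-first").unwrap();
    let all = parse_relay_strategy("request-from-all").unwrap();
    // A second producer announces a resource that is already on request.
    assert!(!should_request(first, false, 1));
    assert!(should_request(all, false, 1));
    // Once delivered, neither strategy requests it again.
    assert!(!should_request(all, true, 2));
    println!("ok");
}
```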
42+
43+
## Transactions
44+
45+
Each transaction has
46+
* A unique integer id
47+
* A size in bytes
48+
* A randomly-assigned shard
49+
* An "input_id", representing a single TXO which the transaction consumed
50+
* An "overcollateralization factor", representing how much extra collateral the TX includes (only used by Leios)
51+
52+
All transactions are "produced" by a single actor called the `TransactionProducer`. This actor samples from an exponential distribution to control how long to wait between producing new transactions, and a log-normal distribution to control the size in bytes (both distributions are configurable). Each new transaction is sent to a randomly selected node, which then "generates" it and propagates it to its peers.
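The sampling loop can be sketched with only the standard library. Everything below is an assumption made for illustration: the tiny SplitMix64 generator, the inverse-transform and Box-Muller samplers, the `NewTx` struct, and the parameter values. The actual simulation may well use dedicated random-number crates and different parameterizations.

```rust
// Hypothetical sketch: the PRNG and all names below are illustrative.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform sample in [0, 1).
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Inverse-transform sample from an exponential distribution.
fn sample_exponential(rng: &mut SplitMix64, rate: f64) -> f64 {
    -(1.0 - rng.next_f64()).ln() / rate
}

/// Box-Muller sample from a log-normal distribution.
fn sample_log_normal(rng: &mut SplitMix64, mu: f64, sigma: f64) -> f64 {
    let u1 = 1.0 - rng.next_f64(); // in (0, 1], so ln(u1) is finite
    let u2 = rng.next_f64();
    let z = (-2.0 * u1.ln()).sqrt() * (2.0 * std::f64::consts::PI * u2).cos();
    (mu + sigma * z).exp()
}

struct NewTx {
    id: u64,
    delay_ms: f64,      // wait before this TX is produced
    size_bytes: u64,    // sampled size
    target_node: usize, // node that "generates" the TX
}

fn produce(rng: &mut SplitMix64, count: u64, node_count: usize) -> Vec<NewTx> {
    (0..count)
        .map(|id| NewTx {
            id,
            delay_ms: sample_exponential(rng, 0.01), // assumed rate: 1 TX per 100ms
            size_bytes: (sample_log_normal(rng, 6.8, 1.1).ceil() as u64).max(1),
            target_node: (rng.next_u64() % node_count as u64) as usize,
        })
        .collect()
}

fn main() {
    let mut rng = SplitMix64(42);
    for tx in produce(&mut rng, 5, 10) {
        println!(
            "tx {} after {:.1}ms, {} bytes, sent to node {}",
            tx.id, tx.delay_ms, tx.size_bytes, tx.target_node
        );
    }
}
```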
53+
54+
Note that in the Rust sim, transactions travel in the same direction as blocks (from producer to consumer). All topologies used for testing are fully connected, so this hasn’t resulted in any lost transactions.
55+
56+
Transaction production can be disabled by setting `simulate_transactions` to false in config. When transactions are disabled, the `TransactionProducer` will do nothing, and nodes will not randomly generate or propagate transactions. The simulation will produce "fake" transactions to fill any IBs/EBs/RBs as needed, so that these blocks still have realistic sizes.
57+
58+
## Variants
59+
There are three implementations of Leios in this simulation. Similar variants tend to use the same implementations.
60+
61+
* [Leios](./implementations/CLASSIC_LEIOS.md)
62+
* Short Leios: IBs contain transactions, EBs reference IBs
63+
* "Full" Leios: IBs contain (or reference) transactions, EBs reference IBs and older EBs
64+
* [Stracciatella](./implementations/STRACCIATELLA.md): EBs reference transactions, as well as other EBs
65+
* [Linear](./implementations/LINEAR_LEIOS.md): EBs contain (or reference) transactions, and do not reference other EBs. New EBs are produced whenever new blocks are produced.

sim-rs/README.md

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,8 @@
22

33
This directory contains a (very heavily WIP) simulation of the Leios protocol. It produces a stream of events which can be used to visualize or analyze the behavior of Simplified Leios.
44

5+
For more information about the simulation, see [./IMPLEMENTATION.md](./IMPLEMENTATION.md).
6+
57
## Running the project
68

79
```sh
```

sim-rs/implementations/CLASSIC_LEIOS.md

Lines changed: 88 additions & 0 deletions
@@ -0,0 +1,88 @@
1+
## (Classic) Leios Implementation
2+
3+
This implementation covers Short Leios and "Full" Leios variants.
4+
It's used for the following values of `leios-variant`:
5+
- `short`
6+
- `full`
7+
- `full-with-tx-references`
8+
9+
## Mempool(s)
10+
11+
This implementation has two mempools: one for Praos, and one for Leios. The Praos mempool is used when building ranking blocks, and the Leios mempool is used when building input or endorser blocks.
12+
13+
When a node first receives a transaction, it will check whether that transaction conflicts with something already in the Leios and Praos mempools; it adds the transaction to whichever mempool does not conflict. When a node sees an IB or EB with a transaction, it removes that transaction from the Leios mempool. When a node sees an RB which references a transaction, it removes that transaction from both Praos and Leios mempools.
14+
15+
When the `leios-mempool-aggressive-pruning` option is on, nodes will take TX conflicts into account when removing TXs from their Leios mempool. When an input or endorser block contains a transaction which consumes an input, any other transactions which consume that input will be filtered from the mempool.
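The difference between plain removal and aggressive pruning can be sketched as follows. The `Tx` and `Mempool` types and their methods are hypothetical; a conflict is modelled, as in the description above, as two transactions consuming the same input.

```rust
use std::collections::HashSet;

// Hypothetical types: a conflict is two TXs consuming the same input.
#[derive(Clone, Debug, PartialEq)]
struct Tx {
    id: u64,
    input_id: u64,
}

#[derive(Default)]
struct Mempool {
    txs: Vec<Tx>,
}

impl Mempool {
    /// Adds the TX unless it conflicts with one already in the mempool.
    fn try_add(&mut self, tx: Tx) -> bool {
        if self.txs.iter().any(|t| t.input_id == tx.input_id) {
            return false;
        }
        self.txs.push(tx);
        true
    }

    /// Removes TXs included in a block. With `aggressive` set, also removes
    /// TXs that consume an input already consumed by a TX in the block.
    fn on_block(&mut self, block_txs: &[Tx], aggressive: bool) {
        let ids: HashSet<u64> = block_txs.iter().map(|t| t.id).collect();
        let inputs: HashSet<u64> = block_txs.iter().map(|t| t.input_id).collect();
        self.txs
            .retain(|t| !ids.contains(&t.id) && !(aggressive && inputs.contains(&t.input_id)));
    }
}

fn main() {
    let mut mempool = Mempool::default();
    assert!(mempool.try_add(Tx { id: 1, input_id: 10 }));
    assert!(mempool.try_add(Tx { id: 2, input_id: 20 }));
    assert!(!mempool.try_add(Tx { id: 3, input_id: 20 })); // conflicts with TX 2

    // An IB arrives containing TX 3, which this node never stored.
    let block = [Tx { id: 3, input_id: 20 }];
    mempool.on_block(&block, false);
    assert_eq!(mempool.txs.len(), 2); // plain removal leaves TX 2 in place
    mempool.on_block(&block, true);
    assert_eq!(mempool.txs, vec![Tx { id: 1, input_id: 10 }]);
    println!("ok");
}
```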
16+
17+
Mempools do not have a maximum size.
18+
19+
## Sampling from mempools
20+
21+
When building a ranking block, the node will select transactions from its Praos mempool. This can be disabled by setting `praos-fallback-enabled` to `false`.
22+
23+
When building an input block, the node will select transactions from its Leios mempool. Input blocks contain an "RB reference", and the ledger state used by that IB is computed from that RB. The node will not select transactions which conflict with any transactions already in the ledger state.
24+
25+
## Input Blocks
26+
27+
Input blocks are produced on a schedule. They contain transactions sampled from the Leios mempool.
28+
29+
Input blocks belong to a pipeline, based on the slot they were produced for. Input blocks for a pipeline could theoretically be produced at any slot during that pipeline, but in practice we always configure them to be produced in the first slot. A single node can produce multiple input blocks in one pipeline.
30+
31+
Input blocks are disseminated through the network using a Freshest First strategy. Input block headers (which are small) are distributed throughout the network automatically; nodes will download IB headers as soon as they are announced, and track the header arrival times.
32+
33+
When a peer announces that it has an IB body, the node will add that IB to a queue of IBs to download from that peer. The queue is a priority queue, ordered by arrival timestamp. Once a node has downloaded the header to an IB, it will add that IB to a queue of IBs to fetch; that queue is a priority queue ordered by arrival time of the IB header. The node will download one IB body from each peer at a time, and will download each body from at most one peer.
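A per-peer fetch queue of this kind can be sketched with the standard library's `BinaryHeap`. The text does not state the direction of the ordering, so the sketch assumes that "Freshest First" means the IB whose header arrived most recently is fetched first; `PendingIb` and its fields are hypothetical names.

```rust
use std::collections::BinaryHeap;

// Hypothetical queue entry. The derived ordering compares `header_arrival_ms`
// first, so the max-heap pops the most recently arrived header first.
// ASSUMPTION: "Freshest First" means newest header arrival wins.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct PendingIb {
    header_arrival_ms: u64,
    ib_id: u64,
}

fn main() {
    let mut queue = BinaryHeap::new();
    queue.push(PendingIb { header_arrival_ms: 100, ib_id: 1 });
    queue.push(PendingIb { header_arrival_ms: 300, ib_id: 2 });
    queue.push(PendingIb { header_arrival_ms: 200, ib_id: 3 });

    // One body is downloaded from this peer at a time, freshest first.
    let order: Vec<u64> = std::iter::from_fn(|| queue.pop()).map(|ib| ib.ib_id).collect();
    assert_eq!(order, vec![2, 3, 1]);
    println!("{order:?}");
}
```

If the intended order is oldest first, wrapping the entries in `std::cmp::Reverse` flips the heap without other changes.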
34+
35+
## Endorser Blocks
36+
37+
Endorser blocks are produced on a schedule. They contain references to input blocks, as well as (in Full Leios) older endorser blocks. Endorser blocks belong to a pipeline based on the slot they were produced for; they will always be produced in the first slot of that pipeline. A single node will only ever produce a single endorser block in one pipeline.
38+
39+
The contents of endorser blocks are deterministic. When a node produces an endorser block, it includes references to all input blocks produced in the same pipeline. In Full Leios, it also references endorser blocks from older pipelines (though skipping the two most recent pipelines). When choosing an endorser block from a pipeline, a node only considers EBs which
40+
* Have received enough votes
41+
* Contain only IBs which the node has already seen
42+
If more than one EB in a single pipeline meets these requirements, ties are broken by, in order:
43+
1. Whichever EB references the most transactions through IBs
44+
2. Whichever EB has the most votes
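The eligibility filter and the two tiebreakers map naturally onto a filter followed by a lexicographic maximum. The `EbCandidate` struct, its fields, and the vote threshold value are hypothetical and only serve the illustration.

```rust
// Hypothetical summary of what a node knows about an EB from an older pipeline.
struct EbCandidate {
    id: u64,
    votes: u32,
    txs_via_ibs: u32,   // transactions referenced through the EB's IBs
    all_ibs_seen: bool, // true if the node has seen every referenced IB
}

/// Picks the EB to reference from one pipeline: filter by eligibility, then
/// prefer the most transactions, then the most votes.
fn choose_eb(candidates: &[EbCandidate], vote_threshold: u32) -> Option<&EbCandidate> {
    candidates
        .iter()
        .filter(|eb| eb.votes >= vote_threshold && eb.all_ibs_seen)
        .max_by_key(|eb| (eb.txs_via_ibs, eb.votes))
}

fn main() {
    let candidates = [
        EbCandidate { id: 1, votes: 70, txs_via_ibs: 40, all_ibs_seen: true },
        EbCandidate { id: 2, votes: 90, txs_via_ibs: 40, all_ibs_seen: true },
        EbCandidate { id: 3, votes: 95, txs_via_ibs: 99, all_ibs_seen: false },
        EbCandidate { id: 4, votes: 10, txs_via_ibs: 99, all_ibs_seen: true },
    ];
    // EB 3 references an unseen IB and EB 4 lacks votes; EBs 1 and 2 tie on
    // transactions, so the vote count decides.
    let chosen = choose_eb(&candidates, 60).unwrap();
    assert_eq!(chosen.id, 2);
    println!("chose EB {}", chosen.id);
}
```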
45+
46+
## Voting for Endorser Blocks
47+
48+
The Rust simulation propagates votes through "vote bundles", which are not described in the Leios spec. A vote bundle is a message containing all votes produced by a single node in a single pipeline. At the beginning of each pipeline, nodes run VRF lotteries to determine how many votes they can produce in that pipeline. Nodes are allowed to use the "same" VRF lottery win to vote for multiple EBs; if a node has 3 EBs to vote for in a given slot and the right to produce 4 votes, it will produce a vote bundle with 4 votes for each EB.
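The "same lottery win counts for every EB" rule can be made concrete with a small sketch. `VoteBundle` and `build_vote_bundle` are hypothetical names, and the VRF lottery itself is replaced by a plain `lottery_wins` argument.

```rust
// Hypothetical bundle type: one message per node per pipeline.
#[derive(Debug, PartialEq)]
struct VoteBundle {
    voter: u64,
    pipeline: u64,
    votes: Vec<(u64, u32)>, // (EB id, number of votes)
}

/// Builds the bundle for one pipeline. `lottery_wins` stands in for the
/// result of the VRF lottery; every eligible EB receives that many votes.
fn build_vote_bundle(
    voter: u64,
    pipeline: u64,
    lottery_wins: u32,
    eligible_ebs: &[u64],
) -> Option<VoteBundle> {
    if lottery_wins == 0 || eligible_ebs.is_empty() {
        return None; // nothing to send
    }
    let votes = eligible_ebs.iter().map(|&eb| (eb, lottery_wins)).collect();
    Some(VoteBundle { voter, pipeline, votes })
}

fn main() {
    // 3 EBs to vote for and the right to produce 4 votes.
    let bundle = build_vote_bundle(1, 12, 4, &[100, 101, 102]).unwrap();
    assert_eq!(bundle.votes, vec![(100, 4), (101, 4), (102, 4)]);
    println!("{bundle:?}");
}
```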
49+
50+
A node will only vote for an EB if it satisfies all of the following rules:
51+
* For every IB referenced by the EB,
52+
* The node has downloaded and validated that IB
53+
* That IB was not equivocated
54+
* The IB’s header was received "in time" according to equivocation rules
55+
* That IB was produced in the right pipeline
56+
* (For variants where the IB holds references to TXs instead of full TXs) the node has downloaded and validated every TX in the IB
57+
* If our variant is a "full" Leios,
58+
* For every EB referenced by the EB,
59+
* That referenced EB has received enough votes
60+
* That referenced EB belongs to a valid earlier pipeline
61+
* For every valid earlier pipeline with a certified EB
62+
* Our EB references a certified EB from that pipeline
63+
64+
A node considers an EB to be "certified" as soon as that node has seen some threshold of votes for it.
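Read as a predicate, the voting rules above look roughly like the sketch below. It assumes the per-IB and per-EB checks have already been computed into boolean flags; all struct, field, and function names are hypothetical.

```rust
// Hypothetical, pre-computed facts about the blocks an EB references.
struct IbStatus {
    validated: bool,
    equivocated: bool,
    header_in_time: bool,
    right_pipeline: bool,
    txs_validated: bool, // only relevant when IBs hold TX references
}

struct EbRef {
    pipeline: u64,
    certified: bool,
    valid_earlier_pipeline: bool,
}

fn ib_ok(ib: &IbStatus, ib_holds_tx_refs: bool) -> bool {
    ib.validated
        && !ib.equivocated
        && ib.header_in_time
        && ib.right_pipeline
        && (!ib_holds_tx_refs || ib.txs_validated)
}

/// `certified_earlier_pipelines` lists every valid earlier pipeline for which
/// the node knows a certified EB.
fn should_vote(
    ibs: &[IbStatus],
    ib_holds_tx_refs: bool,
    full_leios: bool,
    eb_refs: &[EbRef],
    certified_earlier_pipelines: &[u64],
) -> bool {
    if !ibs.iter().all(|ib| ib_ok(ib, ib_holds_tx_refs)) {
        return false;
    }
    if !full_leios {
        return true; // the EB-reference rules only apply to "full" variants
    }
    let refs_ok = eb_refs.iter().all(|r| r.certified && r.valid_earlier_pipeline);
    let coverage_ok = certified_earlier_pipelines
        .iter()
        .all(|p| eb_refs.iter().any(|r| r.pipeline == *p && r.certified));
    refs_ok && coverage_ok
}

fn main() {
    let good_ib = IbStatus {
        validated: true,
        equivocated: false,
        header_in_time: true,
        right_pipeline: true,
        txs_validated: true,
    };
    let refs = [EbRef { pipeline: 3, certified: true, valid_earlier_pipeline: true }];
    // Pipeline 4 also has a certified EB, but our EB does not reference one.
    assert!(!should_vote(&[good_ib], false, true, &refs, &[3, 4]));
    println!("ok");
}
```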
65+
66+
## Ranking Blocks
67+
A ranking block contains transactions, as well as an optional endorsement. The endorsement references a single EB which
68+
* Has enough votes to be "certified"
69+
* Is not older than a configurable max age
70+
* Is not from a pipeline already represented by the chain.
71+
If more than one EB matches all of these criteria, ties are broken by, in order:
72+
1. The age of the EB (older EBs take priority in Short Leios, newer EBs in Full Leios)
73+
2. The number of TXs in the EB
74+
3. The number of votes for the EB
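The endorsement choice can be sketched as a filter over the three criteria followed by a lexicographic maximum over the three tiebreakers. The `EbInfo` struct, the slot-based notion of age, and all parameter names are assumptions made for the example.

```rust
// Hypothetical summary of an EB as seen by the RB producer.
struct EbInfo {
    id: u64,
    pipeline: u64,
    slot: u64, // slot the EB was produced for; used here to measure age
    txs: u32,
    votes: u32,
}

/// Picks the EB to endorse, or None if no EB qualifies.
fn choose_endorsement<'a>(
    candidates: &'a [EbInfo],
    vote_threshold: u32,
    current_slot: u64,
    max_age_slots: u64,
    pipelines_on_chain: &[u64],
    full_leios: bool,
) -> Option<&'a EbInfo> {
    candidates
        .iter()
        .filter(|eb| eb.votes >= vote_threshold)
        .filter(|eb| current_slot.saturating_sub(eb.slot) <= max_age_slots)
        .filter(|eb| !pipelines_on_chain.contains(&eb.pipeline))
        .max_by_key(|eb| {
            let age = current_slot.saturating_sub(eb.slot) as i64;
            // Older EBs win in Short Leios, newer EBs win in Full Leios.
            let age_key = if full_leios { -age } else { age };
            (age_key, eb.txs, eb.votes)
        })
}

fn main() {
    let candidates = [
        EbInfo { id: 1, pipeline: 1, slot: 10, txs: 5, votes: 80 },
        EbInfo { id: 2, pipeline: 2, slot: 20, txs: 3, votes: 90 },
        EbInfo { id: 3, pipeline: 3, slot: 25, txs: 9, votes: 40 }, // not certified
    ];
    let short = choose_endorsement(&candidates, 60, 30, 100, &[], false).unwrap();
    let full = choose_endorsement(&candidates, 60, 30, 100, &[], true).unwrap();
    assert_eq!((short.id, full.id), (1, 2));
    println!("short endorses EB {}, full endorses EB {}", short.id, full.id);
}
```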
75+
76+
## CPU model
77+
|Task name in logs|Task name in code|When does it run|What happens when it completes|CPU cost|
78+
|---|---|---|---|---|
79+
|`ValTX`|`TransactionValidated`|After a transaction has been received from a peer.|That TX is announced to other peers.|`tx-validation-cpu-time-ms`|
80+
|`GenRB`|`RBBlockGenerated`|After a new ranking block has been generated.|That RB is announced to peers.|`rb-generation-cpu-time-ms` + `cert-generation-cpu-time-ms-constant` + `cert-generation-cpu-time-ms-per-node` for each node that voted for the endorsed EB|
81+
|`ValRB`|`RBBlockValidated`|After a ranking block has been received.|That RB body is announced to peers and (potentially) accepted as the tip of the chain.|`rb-body-legacy-praos-payload-validation-cpu-time-ms-constant` + `rb-body-legacy-praos-payload-validation-cpu-time-ms-per-byte` for each byte of TX + `cert-generation-cpu-time-ms-constant` + `cert-generation-cpu-time-ms-per-node` for each node that voted for the endorsed EB|
82+
|`GenIB`|`IBBlockGenerated`|After a new IB has been generated.|That IB is announced to peers.|`ib-generation-cpu-time-ms`|
83+
|`ValIH`|`IBHeaderValidated`|After an IB header has been received from a peer.|The IB header is announced to peers, and the body is queued for download.|`ib-head-validation-cpu-time-ms`|
84+
|`ValIB`|`IBBlockValidated`|After an IB has been received from a peer.|The IB body is announced to peers, and the Leios mempool is updated.|`ib-body-validation-cpu-time-ms-constant` + `ib-body-validation-cpu-time-ms-per-byte` for each byte of TX|
85+
|`GenEB`|`EBBlockGenerated`|After a new EB has been generated.|That EB is announced to peers.|`eb-generation-cpu-time-ms`|
86+
|`ValEB`|`EBBlockValidated`|After an EB's body has been received from a peer.|That EB is announced to peers.|`eb-validation-cpu-time-ms`|
87+
|`GenVote`|`VTBundleGenerated`|After a vote bundle has been generated.|That vote bundle is announced to peers.|`vote-generation-cpu-time-ms-constant` + `vote-generation-cpu-time-ms-per-ib` for each IB in the EB|
88+
|`ValVote`|`VTBundleValidated`|After a vote bundle has been received from a peer.|The votes in that bundle are stored, and the bundle is propagated to peers.|`vote-validation-cpu-time-ms` for each EB voted for (in parallel)|
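Two rows of the table can be turned into arithmetic to show how the constant and per-unit costs combine. The `CpuConfig` struct is a hypothetical stand-in whose fields mirror the configuration keys, and the numeric values are made up for the example. Treating an RB without an endorsement as having no certificate cost is also an assumption.

```rust
// Hypothetical config struct; fields mirror the configuration keys in the table.
struct CpuConfig {
    ib_body_validation_cpu_time_ms_constant: f64,
    ib_body_validation_cpu_time_ms_per_byte: f64,
    rb_generation_cpu_time_ms: f64,
    cert_generation_cpu_time_ms_constant: f64,
    cert_generation_cpu_time_ms_per_node: f64,
}

/// `ValIB`: constant cost plus a per-byte cost for each byte of TX.
fn val_ib_cost_ms(cfg: &CpuConfig, tx_bytes: u64) -> f64 {
    cfg.ib_body_validation_cpu_time_ms_constant
        + cfg.ib_body_validation_cpu_time_ms_per_byte * tx_bytes as f64
}

/// `GenRB`: RB generation plus certificate generation for the endorsed EB.
/// ASSUMPTION: an RB without an endorsement pays no certificate cost.
fn gen_rb_cost_ms(cfg: &CpuConfig, voters_for_endorsed_eb: Option<u64>) -> f64 {
    let cert = match voters_for_endorsed_eb {
        Some(voters) => {
            cfg.cert_generation_cpu_time_ms_constant
                + cfg.cert_generation_cpu_time_ms_per_node * voters as f64
        }
        None => 0.0,
    };
    cfg.rb_generation_cpu_time_ms + cert
}

fn main() {
    // Made-up values chosen to keep the arithmetic easy to follow.
    let cfg = CpuConfig {
        ib_body_validation_cpu_time_ms_constant: 50.0,
        ib_body_validation_cpu_time_ms_per_byte: 0.25,
        rb_generation_cpu_time_ms: 1.0,
        cert_generation_cpu_time_ms_constant: 50.0,
        cert_generation_cpu_time_ms_per_node: 0.5,
    };
    assert_eq!(val_ib_cost_ms(&cfg, 1000), 300.0); // 50 + 0.25 * 1000
    assert_eq!(gen_rb_cost_ms(&cfg, Some(100)), 101.0); // 1 + 50 + 0.5 * 100
    assert_eq!(gen_rb_cost_ms(&cfg, None), 1.0);
    println!("ok");
}
```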

sim-rs/LINEAR_LEIOS.md renamed to sim-rs/implementations/LINEAR_LEIOS.md

Lines changed: 14 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# Linear Leios (rust simulation)
1+
# Linear Leios Implementation
22

33
To run Linear Leios with entire transactions stored in EBs, set `leios-variant` to `linear`.
44
To run Linear Leios with transaction references stored in EBs, set `leios-variant` to `linear-with-tx-references`.
@@ -15,7 +15,7 @@ When a node receives an RB body, it immediately removes all referenced/conflicti
1515

1616
When a node receives an EB body, it runs lightweight validation and then propagates the body to peers. After this lightweight validation, it runs more expensive complete validation (presumably at the TX level) before voting.
1717

18-
When voting, a node runs a VRF lottery to decide how many times it can vote for that EB; if it has any votes, it will transmit them to all peers. If the EB has been certified after L_vote + L_diff slots have passed, the node removes all of its transactions from the mempool (under the assumption that the EB will make it on-chain).
18+
When voting, a node runs a VRF lottery to decide how many times it can vote for that EB; if it has any votes, it will transmit them to all peers. If the EB has been certified after `L_vote` + `L_diff` slots have passed, the node removes all of its transactions from the mempool (under the assumption that the EB will make it on-chain).
1919

2020
## New parameters
2121

@@ -29,8 +29,19 @@ When voting, a node runs a VRF lottery to decide how many times it can vote for
2929
|`eb-body-validation-cpu-time-ms-per-byte`|The time taken to validate the transactions in an EB _after_ we propagate it to peers.|50.0|
3030
|`vote-generation-cpu-time-ms-per-tx`|A per-transaction CPU cost to apply when generating new vote bundles.|0|
3131

32+
## CPU model
33+
|Task name in logs|Task name in code|When does it run|What happens when it completes|CPU cost|
34+
|---|---|---|---|---|
35+
|`ValTX`|`TransactionValidated`|After a transaction has been received from a peer.|That TX is announced to other peers.|`tx-validation-cpu-time-ms`|
36+
|`GenRB`|`RBBlockGenerated`|After a new ranking block has been generated.|That RB and its EB are announced to peers.|`rb-generation-cpu-time-ms` and `eb-generation-cpu-time-ms` (in parallel)|
37+
|`ValRH`|`RBHeaderValidated`|After a ranking block header has been received.|That RB is announced to peers.<br/>The referenced EB is queued to be downloaded when available.|`rb-head-validation-cpu-time-ms`|
38+
|`ValRB`|`RBBlockValidated`|After a ranking block body has been received.|That RB body is announced to peers and (potentially) accepted as the tip of the chain.|`rb-body-legacy-praos-payload-validation-cpu-time-ms-constant` + `rb-body-legacy-praos-payload-validation-cpu-time-ms-per-byte` for each byte of TX|
39+
|`ValEH`|`EBHeaderValidated`|After an EB header has been received and validated.|That EB is announced to peers, and body validation begins in the background.|`eb-header-validation-cpu-time-ms`|
40+
|`ValEB`|`EBBlockValidated`|After an EB's body has been validated.|If eligible, the node will vote for that EB.|`eb-body-validation-cpu-time-ms-constant` + `eb-body-validation-cpu-time-ms-per-byte` for each byte of TX|
41+
|`GenVote`|`VTBundleGenerated`|After a vote bundle has been generated.|That vote bundle is announced to peers.|`vote-generation-cpu-time-ms-constant` + `vote-generation-cpu-time-ms-per-tx` for each TX in the EB|
42+
|`ValVote`|`VTBundleValidated`|After a vote bundle has been received from a peer.|The votes in that bundle are stored, and the bundle is propagated to peers.|`vote-validation-cpu-time-ms`|
43+
3244
## Not yet implemented
3345
- Freshest first delivery is not implemented for EBs, though EBs are created infrequently enough that this likely doesn't matter.
34-
- We are not yet applying voting rules; if you’re allowed to vote, you will always vote.
3546
- We are not yet accounting for equivocation.
3647
- Nodes are supposed to wait until the diffuse stage to vote for an EB, but they currently vote as soon as they can.
