Commit 2761871

Merge branch 'main' into conformance-testing
2 parents 452b490 + c1d11de commit 2761871

38 files changed: +1683 -680 lines changed

Logbook.md

Lines changed: 108 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -1,7 +1,91 @@
11
# Leios logbook
22

3+
## 2025-02-16
4+
5+
### DeltaQ Update
6+
7+
The `topology-checker` has been upgraded with a new option to perform a ΔQSD analysis:
8+
it will extract inter-node latencies from the given topology, classify them into near/far components, and use these to build a parameterised ΔQ model.
9+
This model is then fitted against the distribution of minimal completion latencies as obtained with the Dijkstra algorithm, averaging over the distributions obtained by using each node as starting point.
10+
The resulting completion outcome is plotted to the terminal together with the extracted completion latency distribution to give a visual impression of the quality of the model fit.
11+
12+
Learnings from this exercise:
13+
14+
- latencies within the topologies examined (from the topology generator as well as the “realistic” set from the Rust simulation) very clearly consist of differing near/far components (there’s a “knee” in the graphs)
15+
- latency-weighted Dijkstra shortest paths are _extremely long_ in terms of hop count, much longer than I expected (mean 4–5, max 8 for the topology-100; min 8, max 20 for the “realistic” topology)
16+
- composing a ΔQ model as a sequence of some number of near hops (with early exit probabilities) and some number of far hops (also with early exit) yields models that roughly fit the overall shape, which is dominated by the `far` component of latencies, but there always is a significant deviation at low latencies
17+
- the deviation is that observed completion starts out much slower than the model would predict, so the model goes faster first, then “pauses” for 100ms or so, crossing the observed CDF again, then catching up — thereafter the high latency behaviour works out quite well
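
The hop-count observation above can be reproduced in miniature. The following is a minimal sketch, not the `topology-checker` implementation: the function name `shortest_paths` and the four-node topology are hypothetical, chosen so that three cheap "near" links beat one expensive "far" link.

```python
import heapq


def shortest_paths(graph, source):
    """Latency-weighted Dijkstra. Returns {node: (latency_ms, hops)}.

    Paths are compared by latency first; hop count only breaks ties.
    """
    best = {source: (0.0, 0)}
    queue = [(0.0, 0, source)]
    while queue:
        latency, hops, node = heapq.heappop(queue)
        if (latency, hops) > best[node]:
            continue  # stale queue entry, a better path was already found
        for neighbour, link_ms in graph[node].items():
            candidate = (latency + link_ms, hops + 1)
            if neighbour not in best or candidate < best[neighbour]:
                best[neighbour] = candidate
                heapq.heappush(queue, (*candidate, neighbour))
    return best


# Hypothetical toy topology: 5 ms "near" links and one 100 ms "far" link.
graph = {
    "a": {"b": 5.0, "d": 100.0},
    "b": {"a": 5.0, "c": 5.0},
    "c": {"b": 5.0, "d": 5.0},
    "d": {"c": 5.0, "a": 100.0},
}

paths_from_a = shortest_paths(graph, "a")
# Completion latencies seen from "a", sorted as they would be for a CDF.
completion_latencies = sorted(
    latency for node, (latency, _) in paths_from_a.items() if node != "a"
)
```

In this toy case the fastest route from `a` to `d` takes three hops even though a direct link exists, which is the same effect that inflates hop counts in the real topologies.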
18+
19+
My hope had been to use a model that can be understood behaviourally, not just statistically, so that the resource usage tracking features of the `delta_q` library could be brought to bear.
20+
**This has not yet been achieved.**
21+
While the timeliness graph can be made to match to some degree, doing this results in a ΔQ expression for which at least I no longer understand the load multiplication factors that should be applied — in other words, how many peer connections are supposedly used at each step of the process.
22+
The remaining part of this week’s plan was to be able to use the fitted model to obtain a formula for creating such models algebraically;
23+
this has been put on hold because it seems easier to just generate a topology with the desired properties and then use the `topology-checker` to get the corresponding ΔQSD model.
24+
25+
In any case, the `topology-checker` now outputs the fitted ΔQSD model in the syntax needed for the `delta_q` web app, so that you can directly play with the results.
26+
27+
## 2025-02-14
28+
29+
### Formal methods
30+
31+
- Added a conformance testing client for the executable Short Leios specification;
32+
the client is tested against a model that is also built on the executable Short Leios specification.
33+
- Merged executable specification for Simplified Leios into main
34+
35+
### Haskell simulation
36+
37+
- Updated config defaults for block sizes and timings, PR waiting for
38+
additional reviews by research.
39+
- Added support for idealized simulation conditions
40+
- realism features that can be individually dropped:
41+
- requesting block body from a single peer.
42+
- tcp congestion window modeling
43+
- also supports unlimited bandwidth links.
44+
- mini-protocol multiplexing
45+
- see data/simulation/config-idealised.yaml
46+
- Started work on comparison to idealised diffusion report.
47+
- simulation final output includes `raw` field containing the
48+
accumulated data and simulation parameters.
49+
- other stats can be computed from this field.
50+
- implemented extraction of block diffusion cdf for required
51+
percentiles.
52+
- TODO: expose it as a command that takes `raw` field as input
53+
- small gnuplot script to plot multiple cdfs at once (y axis in logscale).
54+
55+
### Rust simulation
56+
57+
- Optimized decoding the CBOR stream in the visualization
58+
- Added total TX count to the visualization's view of blocks
59+
- Added total CPU time to TaskFinished events
60+
361
## 2025-02-13
462

63+
### Brainstorming succinct schemes for Leios BLS key registration and witnessing
64+
65+
Recall that we have the following situation/requirements:
66+
67+
- We want to evolve the BLS keys and have forward security.
68+
- We don't want a registration process (commitment) that involves a big message.
69+
- We also don't want the proof of possession to involve a large message.
70+
- Individual votes must be small and contain no redundancy: i.e., we don't want to include a large witness for the proof of possession.
71+
72+
Here's a snapshot of one recent proposal, but much discussion is underway.
73+
74+
- Every 90 days, as part of its operational certificate each SPO includes a commitment to Leios keys: let's say that Leios keys evolve every epoch or KES period, so we'd need 18 or 60 commitments, respectively. This commitment is only 124 bytes if KZG commitments are used.
75+
- Before the start of a new key evolution period, each SPO diffuses a message opening their keys for the new period.
76+
- This message would have to be 316 bytes:
77+
- 28 bytes for the pool ID
78+
- 96 bytes for the public key
79+
- 96 bytes for the opening
80+
- 2 * 48 = 96 bytes for the proof of possession
81+
- 316 bytes/pool * 3000 pools = 948 kB would have to be stored permanently, so that syncing from genesis is possible.
82+
- This is a lot of data to squeeze into RBs.
83+
- However, it's not clear if it is safe to put it into IBs or a non-RB block.
84+
- Instead of storing this on the ledger, could a single SNARK attest to the following?
85+
- Input is (ID of key evolution period, pool ID, public key)
86+
- Output is whether the proof of possession exists.
87+
- Certificates are really small because we've already recorded the proofs of possession at the start of the key evolution period.
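
The size arithmetic in the proposal above can be checked with a few lines. This is only a restatement of the numbers in the list; the 5-day epoch and 1.5-day KES period are assumptions that reproduce the 18 and 60 commitments quoted above.

```python
# Field sizes of the key-opening message, in bytes (from the list above).
POOL_ID_BYTES = 28
PUBLIC_KEY_BYTES = 96
OPENING_BYTES = 96
PROOF_OF_POSSESSION_BYTES = 2 * 48

message_bytes = (
    POOL_ID_BYTES + PUBLIC_KEY_BYTES + OPENING_BYTES + PROOF_OF_POSSESSION_BYTES
)

# Storage needed if every pool's opening is kept on the ledger.
POOLS = 3000
total_bytes = message_bytes * POOLS
total_kb = total_bytes / 1000

# Commitments per 90-day operational certificate period.
# Assumption: 5-day epochs and 1.5-day (36 h) KES periods.
OPCERT_PERIOD_DAYS = 90
commitments_per_epoch_evolution = OPCERT_PERIOD_DAYS / 5
commitments_per_kes_evolution = OPCERT_PERIOD_DAYS / 1.5

print(message_bytes, total_kb)
print(commitments_per_epoch_evolution, commitments_per_kes_evolution)
```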
88+
589
### Certificate CPU benchmarks as a function of number of voters
690

791
In support of the Haskell and Rust simulations, we've benchmarked certificate operations as a function of the number of voters. A realistic distribution of stake is used in these measurements.
@@ -19,6 +103,17 @@ Serialization and deserialization likely also exhibit the same trend.
19103

20104
A recipe for parallelizing parts of the certificate operations has been added to the [Specification for BLS certificates](crypto-benchmarks.rs/Specification.md).
21105

106+
### Rust simulation
107+
108+
Updated event format to more closely match standards:
109+
110+
- Timestamps are in seconds instead of nanoseconds.
111+
- CPU subtasks have a duration attached to the "started" event, and no "finished" event.
112+
113+
Started tracking vote bundle sizes to display in visualization.
114+
115+
Added support for CBOR output (with identical schema to JSON output).
116+
22117
## 2025-02-12
23118

24119
### Added BLS crypto to CI
@@ -28,6 +123,12 @@ The CI job [crypto-benchmarks-rs](.github/workflows/crypto-benchmarks-rs.yaml) d
28123
- Runs the tests for the BLS reference implementation
29124
- Runs the BLS vote and certificate benchmarks
30125

126+
### Rust simulation
127+
128+
Minor build fixes (specify an MSRV, use a fixed toolchain version in CI)
129+
130+
Visualization now displays a breakdown of the size of each block, as well as total bytes sent/received by each node.
131+
31132
## 2025-02-11
32133

33134
### Reference implementation and benchmarking BLS certificates
@@ -39,6 +140,7 @@ The [BLS benchmarking Rust code for Leios](crypto-benchmarks.rs/) was overhauled
39140
- Benchmarks for the inputs to the Haskell, Rust, and DeltaQ simulations of Leios.
40141
- CBOR serialization and deserialization of Leios messages.
41142
- Command-line interface (with example) for trying out Leios's cryptography: create and verify votes, certificates, etc.
143+
42144
* Document specifying the algorithms and tabulating benchmark results.
43145

44146
Note that this BLS scheme is just one viable option for Leios. Ongoing work on ALBA, MUSEN, and SNARKs might result in schemes superior to this BLS approach. The key drawback is the need for periodic registration of ephemeral keys. Overall, this scheme provides the following:
@@ -47,6 +149,12 @@ Note that this BLS scheme is just one viable option for Leios. Ongoing work and
47149
- Certificate generation and verification in 90 ms and 130 ms, respectively.
48150
- Votes smaller than 200 bytes.
49151

152+
### Rust simulation
153+
154+
Added support for `ib-diffusion-strategy` (freshest-first, oldest-first, or peer-order). Unlike the Haskell sim, this doesn't affect EB or vote diffusion; nodes can download an unlimited number of EBs or vote bundles from any given peer.
155+
156+
Added support for `relay-strategy`: it affects TXs, IBs, EBs, votes, and RBs.
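
One way to read the three `ib-diffusion-strategy` values is as different orderings of the IBs a peer has announced. The sketch below is an interpretation of the strategy names, not code from either simulator; the `Announcement` type, its fields, and `request_order` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    block_id: str
    slot: int           # slot in which the IB was produced (assumed field)
    arrival_index: int  # position in the peer's announcement order (assumed)


def request_order(announcements, strategy):
    """Return announced IBs in the order they would be requested."""
    if strategy == "freshest-first":
        # Most recently produced blocks are requested first.
        return sorted(announcements, key=lambda a: a.slot, reverse=True)
    if strategy == "oldest-first":
        return sorted(announcements, key=lambda a: a.slot)
    if strategy == "peer-order":
        # Keep whatever order the peer announced the blocks in.
        return sorted(announcements, key=lambda a: a.arrival_index)
    raise ValueError(f"unknown strategy: {strategy}")


announced = [
    Announcement("ib-1", slot=12, arrival_index=0),
    Announcement("ib-2", slot=10, arrival_index=1),
    Announcement("ib-3", slot=15, arrival_index=2),
]

freshest = [a.block_id for a in request_order(announced, "freshest-first")]
oldest = [a.block_id for a in request_order(announced, "oldest-first")]
peer = [a.block_id for a in request_order(announced, "peer-order")]
```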
157+
50158
## 2025-02-07
51159

52160
### Haskell simulation

data/simulation/config.default.yaml

Lines changed: 91 additions & 23 deletions
Original file line number | Diff line number | Diff line change
@@ -42,29 +42,54 @@ tx-max-size-bytes: 16384
4242
# Ranking Block Configuration
4343
################################################################################
4444

45+
# 1/leios-stage-length-slots, targeting one RB per pipeline.
46+
# Also, 20s is the current rate of Praos blocks.
4547
rb-generation-probability: 5.0e-2
46-
rb-generation-cpu-time-ms: 300.0
47-
rb-head-validation-cpu-time-ms: 1.0
48-
rb-head-size-bytes: 32
48+
# Eng. team targets 1kB as worst case upper bound.
49+
# Actual size fairly close.
50+
rb-head-size-bytes: 1024
4951
rb-body-max-size-bytes: 90112
52+
# Note: certificate generation/validation is not included in the
53+
# timings here, see cert-* fields.
54+
rb-generation-cpu-time-ms: 1.0
55+
rb-head-validation-cpu-time-ms: 1.0
5056

57+
# On average, no Txs directly embedded in blocks.
58+
rb-body-legacy-praos-payload-avg-size-bytes: 0
5159
rb-body-legacy-praos-payload-validation-cpu-time-ms-constant: 50.0
60+
# the -per-byte component is meant to be using size as a (bad)
61+
# proxy for the complexity of the Txs included.
5262
rb-body-legacy-praos-payload-validation-cpu-time-ms-per-byte: 0.0005
53-
rb-body-legacy-praos-payload-avg-size-bytes: 0
5463

5564
################################################################################
5665
# Input Block Configuration
5766
################################################################################
5867

5968
ib-generation-probability: 5.0
60-
ib-generation-cpu-time-ms: 300.0
6169
ib-shards: 1
62-
ib-head-size-bytes: 32
70+
71+
# ProducerId 32
72+
# SlotNo 64
73+
# VRF proof 80
74+
# Body hash 32
75+
# RB Ref 32
76+
# Signature 64
77+
# Total 304
78+
#
79+
# NOTE: using a KES Signature (like for Praos headers)
80+
# would instead more than double the total to 668.
81+
# And even 828 including Op Cert.
82+
ib-head-size-bytes: 304
83+
# 100kB, using praos max size as ballpark estimate.
84+
ib-body-avg-size-bytes: 102400
85+
ib-body-max-size-bytes: 327680
86+
# Here we also use praos blocks as ballpark estimate.
87+
# Sec 2.3 Forging, of the benchmark cluster report, lists
88+
# * Slot start to announced: 0.12975s
89+
ib-generation-cpu-time-ms: 130.0
6390
ib-head-validation-cpu-time-ms: 1.0
6491
ib-body-validation-cpu-time-ms-constant: 50.0
6592
ib-body-validation-cpu-time-ms-per-byte: 0.0005
66-
ib-body-max-size-bytes: 327680
67-
ib-body-avg-size-bytes: 327680
6893
ib-diffusion-strategy: "freshest-first"
6994

7095
# Haskell prototype relay mini-protocol parameters.
@@ -76,11 +101,27 @@ ib-diffusion-max-window-size: 100
76101
# Endorsement Block Configuration
77102
################################################################################
78103

79-
eb-generation-probability: 5.0
80-
eb-generation-cpu-time-ms: 300.0
81-
eb-validation-cpu-time-ms: 1.0
104+
# We want one per pipeline, but not too many.
105+
eb-generation-probability: 1.5
106+
# ProducerId 32
107+
# SlotNo 64
108+
# VRF proof 80
109+
# Signature 64
110+
# Total 240
111+
#
112+
# See Note about signatures on ib-head-size-bytes.
82113
eb-size-bytes-constant: 32
114+
# IB hash
83115
eb-size-bytes-per-ib: 32
116+
# Collecting the IBs to reference and cryptography are the main tasks.
117+
# A comparable task is maybe mempool snapshotting.
118+
# Sec 2.3 Forging, of the benchmark cluster report, lists
119+
# * Mempool snapshotting: 0.07252s
120+
# 75ms then seems a generous estimate for eb generation.
121+
eb-generation-cpu-time-ms: 75.0
122+
# Validating signature and vrf proof, as in other headers.
123+
eb-validation-cpu-time-ms: 1.0
124+
84125
eb-diffusion-strategy: "peer-order"
85126

86127
# Haskell prototype relay mini-protocol parameters.
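
The header-size comments for IBs and EBs above are plain sums of per-field sizes. A small check, using only the field names and numbers from those comments (the dictionary names are hypothetical):

```python
# Per-field sizes as listed in the config comments.
ib_header_fields = {
    "ProducerId": 32,
    "SlotNo": 64,
    "VRF proof": 80,
    "Body hash": 32,
    "RB Ref": 32,
    "Signature": 64,
}
eb_header_fields = {
    "ProducerId": 32,
    "SlotNo": 64,
    "VRF proof": 80,
    "Signature": 64,
}

ib_head_size_bytes = sum(ib_header_fields.values())
eb_head_size_bytes = sum(eb_header_fields.values())

print(ib_head_size_bytes, eb_head_size_bytes)
```

The IB total matches the new `ib-head-size-bytes: 304`. The EB comment totals 240, while `eb-size-bytes-constant` is left at 32 in this hunk; the sketch only checks the comment's arithmetic and does not say which value is intended.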
@@ -92,13 +133,32 @@ eb-diffusion-max-window-size: 100
92133
# Vote Configuration
93134
################################################################################
94135

136+
# Cryptography related values taken from [vote-spec](crypto-benchmarks.rs/Specification.md)
137+
# using weighted averages of 80% persistent and 20% non-persistent.
138+
139+
# vote-spec#Committee and quorum size
140+
#
141+
# Note: this is used as the expected amount of total weight of
142+
# generated votes in the sims.
95143
vote-generation-probability: 500.0
96-
vote-generation-cpu-time-ms-constant: 1.0
97-
vote-generation-cpu-time-ms-per-ib: 1.0
98-
vote-validation-cpu-time-ms: 3.0
99-
vote-threshold: 150
100-
vote-bundle-size-bytes-constant: 32
101-
vote-bundle-size-bytes-per-eb: 32
144+
# vote-spec#"Committee and quorum size"
145+
# 60% of `vote-generation-probability`
146+
vote-threshold: 300
147+
# vote-spec#"Generate vote" 0.8*135e-3 + 0.2*280e-3
148+
vote-generation-cpu-time-ms-constant: 164e-3
149+
# No benchmark yet.
150+
vote-generation-cpu-time-ms-per-ib: 0
151+
# vote-spec#"Verify vote" 0.8*670e-3 + 0.2*1.4
152+
vote-validation-cpu-time-ms: 816e-3
153+
# The `Vote` structure counted in the -per-eb already identifies slot
154+
# (in Eid) and voter. We can assume a vote bundle is all for the same
155+
# voter and slot, so for non-persistent voters we could factor their
156+
# PoolKeyHash (28 bytes) here, but that is for 20% of cases.
157+
# More relevant if EB generation is very high.
158+
vote-bundle-size-bytes-constant: 0
159+
# vote-spec#Votes 0.8*90 + 0.2*164
160+
vote-bundle-size-bytes-per-eb: 105
161+
102162
vote-diffusion-strategy: "peer-order"
103163

104164
# Haskell prototype relay mini-protocol parameters.
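
The vote parameters in this hunk are weighted averages over 80% persistent and 20% non-persistent voters. The sketch below recomputes them from the figures quoted in the comments; the helper name `weighted` is hypothetical.

```python
PERSISTENT_SHARE = 0.8
NON_PERSISTENT_SHARE = 0.2


def weighted(persistent_value, non_persistent_value):
    """Average a per-voter figure over the assumed 80/20 voter mix."""
    return (
        PERSISTENT_SHARE * persistent_value
        + NON_PERSISTENT_SHARE * non_persistent_value
    )


# Inputs are the numbers quoted in the config comments (ms and bytes).
vote_generation_ms = weighted(135e-3, 280e-3)
vote_validation_ms = weighted(670e-3, 1.4)
vote_bytes_per_eb = weighted(90, 164)

# Quorum: 60% of the expected total vote weight.
vote_generation_probability = 500.0
vote_threshold = 0.6 * vote_generation_probability

print(round(vote_generation_ms, 3), round(vote_validation_ms, 3))
print(round(vote_bytes_per_eb), round(vote_threshold))
```

The byte figure comes out at 104.8, which the config rounds up to 105.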
@@ -110,9 +170,17 @@ vote-diffusion-max-window-size: 100
110170
# Certificate Configuration
111171
################################################################################
112172

113-
cert-generation-cpu-time-ms-constant: 50.0
114-
cert-generation-cpu-time-ms-per-node: 1.0
115-
cert-validation-cpu-time-ms-constant: 50.0
116-
cert-validation-cpu-time-ms-per-node: 1.0
117-
cert-size-bytes-constant: 32
118-
cert-size-bytes-per-node: 32
173+
# vote-spec#"certificate bytes"
174+
cert-size-bytes-constant: 136
175+
# vote-spec#"certificate bytes" ((80/8) + 76 * (100 - 80))/100
176+
cert-size-bytes-per-node: 15
177+
178+
# For certificate timings we have bulk figures for realistic scenarios,
179+
# so we do not attempt to give -per-node (i.e. per-voter) timings.
180+
#
181+
# vote-spec#"Generate certificate"
182+
cert-generation-cpu-time-ms-constant: 90.0
183+
cert-generation-cpu-time-ms-per-node: 0
184+
# vote-spec#"Verify certificate"
185+
cert-validation-cpu-time-ms-constant: 130.0
186+
cert-validation-cpu-time-ms-per-node: 0
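
The per-voter certificate size above follows from the formula in its comment. The sketch below evaluates that formula and then applies the two size parameters to an example committee; `cert_size_bytes` and the 500-voter example are hypothetical, and the linear size model is an assumption about how the simulations combine the `-constant` and `-per-node` values.

```python
# Formula from the comment: ((80/8) + 76 * (100 - 80)) / 100
per_node_exact = ((80 / 8) + 76 * (100 - 80)) / 100

CERT_SIZE_BYTES_CONSTANT = 136
CERT_SIZE_BYTES_PER_NODE = round(per_node_exact)


def cert_size_bytes(voters):
    """Assumed linear model: constant part plus a per-voter part."""
    return CERT_SIZE_BYTES_CONSTANT + CERT_SIZE_BYTES_PER_NODE * voters


example_size = cert_size_bytes(500)
print(per_node_exact, CERT_SIZE_BYTES_PER_NODE, example_size)
```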
