input-output-hk
diff --git a/‎Logbook.md‎
Lines changed: 108 additions & 0 deletions b/‎Logbook.md‎
diff --git a/‎data/simulation/config.default.yaml‎
Lines changed: 91 additions & 23 deletions b/‎data/simulation/config.default.yaml‎
@@ -1,7 +1,91 @@
 # Leios logbook
 
+## 2025-02-16
+
+### DeltaQ Update
+
+The `topology-checker` has been upgraded with a new option to perform a ΔQSD analysis:
+it will extract inter-node latencies from the given topology, classify them into near/far components, and use these to build a parameterised ΔQ model.
+This model is then fitted against the distribution of minimal completion latencies as obtained with the Dijkstra algorithm, averaging over the distributions obtained by using each node as starting point.
+The resulting completion outcome is plotted to the terminal together with the extracted completion latency distribution to give a visual impression of the quality of the model fit.
+
+Learnings from this exercise:
+
+- latencies within the topologies examined (from the topology generator as well as the “realistic” set from the Rust simulation) very clearly consist of differing near/far components (there’s a “knee” in the graphs)
+- latency-weighted Dijkstra shortest paths are _extremely long_ in terms of hop count, much longer than I expected (mean 4–5, max 8 for the topology-100; min 8, max 20 for the “realistic” topology)
+- composing a ΔQ model as a sequence of some number of near hops (with early exit probabilities) and some number of far hops (also with early exit) yields models that roughly fit the overall shape, which is dominated by the `far` component of latencies, but there always is a significant deviation at low latencies
+- the deviation is that observed completion starts out much slower than the model predicts: the model runs ahead at first, then “pauses” for 100ms or so, crossing the observed CDF again, and then catches up; thereafter the high-latency behaviour works out quite well
+
+My hope had been to use a model that can be understood behaviourally, not just statistically, so that the resource usage tracking features of the `delta_q` library could be brought to bear.
+**This has not yet been achieved.**
+While the timeliness graph can be made to match to some degree, doing this results in a ΔQ expression for which at least I no longer understand the load multiplication factors that should be applied — in other words, how many peer connections are supposedly used at each step of the process.
+The remaining part of this week’s plan was to be able to use the fitted model to obtain a formula for creating such models algebraically;
+this has been put on hold because it seems easier to just generate a topology with the desired properties and then use the `topology-checker` to get the corresponding ΔQSD model.
+
+In any case, the `topology-checker` now outputs the fitted ΔQSD model in the syntax needed for the `delta_q` web app, so that you can directly play with the results.
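To make the Dijkstra step concrete, here is a minimal Python sketch of the kind of computation described above. It is not the `topology-checker` implementation: the toy topology, the function names, and the choice to count the source node as completed at latency 0 are all assumptions made for illustration. It computes latency-weighted shortest paths from every node, records hop counts, and averages the per-source completion distributions.

```python
import heapq
from statistics import mean

# Hypothetical toy topology (not from the repository): link latencies in ms,
# with "near" links inside a cluster (5 ms) and one "far" link (100 ms).
toy_topology = {
    "a": {"b": 5.0, "c": 5.0},
    "b": {"a": 5.0, "c": 5.0, "d": 100.0},
    "c": {"a": 5.0, "b": 5.0},
    "d": {"b": 100.0, "e": 5.0},
    "e": {"d": 5.0},
}


def shortest_paths(graph, source):
    """Latency-weighted Dijkstra; returns {node: (latency_ms, hop_count)}."""
    best = {source: (0.0, 0)}
    queue = [(0.0, 0, source)]
    while queue:
        latency, hops, node = heapq.heappop(queue)
        if (latency, hops) > best[node]:
            continue  # stale queue entry, a better path was already found
        for peer, link_ms in graph[node].items():
            candidate = (latency + link_ms, hops + 1)
            if peer not in best or candidate < best[peer]:
                best[peer] = candidate
                heapq.heappush(queue, (candidate[0], candidate[1], peer))
    return best


def completion_cdf(graph, times_ms):
    """Fraction of nodes reached by each time, averaged over all start nodes."""
    per_source = [shortest_paths(graph, s) for s in graph]
    n = len(graph)
    return [
        mean([sum(lat <= t for lat, _ in paths.values()) / n for paths in per_source])
        for t in times_ms
    ]


def max_hops(graph):
    """Longest hop count over all latency-weighted shortest paths."""
    return max(h for s in graph for _, h in shortest_paths(graph, s).values())


print(completion_cdf(toy_topology, [5, 100, 110]))
print(max_hops(toy_topology))
```

Even in this tiny example the averaged CDF has a plateau between the near and far latencies, which is the "knee" mentioned in the learnings.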
+
+## 2025-02-14
+
+### Formal methods
+
+- Added a conformance testing client for the executable Short Leios specification;
+  the client is tested against a model that also uses the executable Short Leios specification.
+- Merged executable specification for Simplified Leios into main
+
+### Haskell simulation
+
+- Updated config defaults for block sizes and timings, PR waiting for
+  additional reviews by research.
+- Added support for idealized simulation conditions
+  - realism features that can be individually dropped:
+    - requesting block body from a single peer.
+    - tcp congestion window modeling
+      - also supports unlimited bandwidth links.
+    - mini-protocol multiplexing
+  - see data/simulation/config-idealised.yaml
+- Started work on comparison to idealised diffusion report.
+  - simulation final output includes `raw` field containing the
+    accumulated data and simulation parameters.
+    - other stats can be computed from this field.
+  - implemented extraction of block diffusion cdf for required
+    percentiles.
+    - TODO: expose it as a command that takes `raw` field as input
+  - small gnuplot script to plot multiple cdfs at once (y axis in logscale).
+
+### Rust simulation
+
+- Optimized decoding the CBOR stream in the visualization
+- Added total TX count to the visualization's view of blocks
+- Added total CPU time to TaskFinished events
+
 ## 2025-02-13
 
+### Brainstorming succinct schemes for Leios BLS key registration and witnessing
+
+Recall that we have the following situation/requirements:
+
+- We want to evolve the BLS keys and have forward security.
+- We don't want a registration process (commitment) that involves a big message.
+- We also don't want the proof of possession to involve a large message.
+- Individual votes must be small and contain no redundancy: i.e., we don't want to include a large witness for the proof of possession.
+
+Here's a snapshot of one recent proposal, but much discussion is underway.
+
+- Every 90 days, as part of its operational certificate each SPO includes a commitment to Leios keys: let's say that Leios keys evolve every epoch or KES period, so we'd need 18 or 60 commitments, respectively.  This commitment is only 124 bytes if KZG commitments are used.
+- Before the start of a new key evolution period, each SPO diffuses a message opening their keys for the new period.
+  - This message would have to be 316 bytes:
+    - 28 bytes for the pool ID
+    - 96 bytes for the public key
+    - 96 bytes for the opening
+    - 2 * 48 = 96 bytes for the proof of possession
+  - 316 bytes/pool * 3000 pools = 948 kB would have to be stored permanently, so that syncing from genesis is possible.
+    - This is a lot of data to squeeze into RBs.
+    - However, it's not clear if it is safe to put it into IBs or a non-RB block.
+  - Instead of storing this on the ledger, could a single SNARK attest to the following?
+    - Input is (ID of key evolution period, pool ID, public key)
+    - Output is whether the proof of possession exists.
+- Certificates are really small because we've already recorded the proofs of possession at the start of the key evolution period.
+
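The size and count figures in the proposal above can be re-derived with a few lines of Python. This is only an arithmetic check; the 5-day epoch and 36-hour KES period are assumptions not stated in the entry, chosen because they reproduce the quoted 18 and 60 commitments.

```python
# Field sizes in bytes, as quoted in the proposal above.
opening_message_fields = {
    "pool_id": 28,
    "public_key": 96,
    "opening": 96,
    "proof_of_possession": 2 * 48,
}
pools = 3000

message_bytes = sum(opening_message_fields.values())
stored_bytes = message_bytes * pools

# Assumptions (not stated in the entry): 5-day epochs, 1.5-day KES periods.
commitment_window_days = 90
epoch_days = 5
kes_period_days = 1.5

commitments_if_per_epoch = commitment_window_days // epoch_days
commitments_if_per_kes_period = int(commitment_window_days / kes_period_days)

print(message_bytes, "bytes per opening message")
print(stored_bytes / 1000, "kB stored per key evolution period")
print(commitments_if_per_epoch, "or", commitments_if_per_kes_period, "commitments")
```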
 ### Certificate CPU benchmarks as a function of number of voters
 
 In support of the Haskell and Rust simulations, we've benchmarked certificate operations as a function of the number of voters. A realistic distribution of stake is used in these measurements.
@@ -19,6 +103,17 @@ Serialization and deserialization likely also exhibit the same trend.
 
 A recipe for parallelizing parts of the certificate operations has been added to the [Specification for BLS certificates](crypto-benchmarks.rs/Specification.md).
 
+### Rust simulation
+
+Updated event format to more closely match standards:
+
+- Timestamps are in seconds instead of nanoseconds.
+- CPU subtasks have a duration attached to the "started" event, and no "finished" event.
+
+Started tracking vote bundle sizes to display in visualization.
+
+Added support for CBOR output (with identical schema to JSON output).
+
 ## 2025-02-12
 
 ### Added BLS crypto to CI
@@ -28,6 +123,12 @@ The CI job [crypto-benchmarks-rs](.github/workflows/crypto-benchmarks-rs.yaml) d
 - Runs the tests for the BLS reference implementation
 - Runs the BLS vote and certificate benchmarks
 
+### Rust simulation
+
+Minor build fixes (specify a MSRV, use a fixed toolchain version in CI)
+
+Visualization now displays a breakdown of the size of each block, as well as total bytes sent/received by each node.
+
 ## 2025-02-11
 
 ### Reference implementation and benchmarking BLS certificates
@@ -39,6 +140,7 @@ The [BLS benchmarking Rust code for Leios](crypto-benchmarks.rs/) was overhauled
 - Benchmarks for the inputs to the Leios and Haskell, Rust, and DeltaQ simulations.
 - CBOR serialization and deserialization of Leios messages.
 - Command-line interface (with example) for trying out Leios's cryptography: create and verify votes, certificates, etc.
+
 * Document specifying the algorithms and tabulating benchmark results.
 
 Note that this BLS scheme is just one viable option for Leios. Ongoing work and ALBA, MUSEN, and SNARKs might result in schemes superior to this BLS approach. The key drawback is the need for periodic registration of ephemeral keys. Overall, this scheme provides the following:
@@ -47,6 +149,12 @@ Note that this BLS scheme is just one viable option for Leios. Ongoing work and
 - Certificate generation and verification in 90 ms and 130 ms, respectively.
 - Votes smaller than 200 bytes.
 
+### Rust simulation
+
+Added support for `ib-diffusion-strategy` (freshest-first, oldest-first, or peer-order). Unlike the Haskell sim, this doesn't affect EB or vote diffusion; nodes can download an unlimited number of EBs or vote bundles from any given peer.
+
+Added support for `relay-strategy`: it affects TXs, IBs, EBs, votes, and RBs.
+
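As a rough illustration of what the three `ib-diffusion-strategy` values could mean for the order in which a node requests IBs from a peer, here is a small Python sketch. It is not taken from the Rust simulation: the `Announcement` type, the `order_requests` function, and the reading of each strategy name (newest slot first, oldest slot first, order of announcement) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Announcement:
    """Hypothetical record of an IB announced by a peer."""
    ib_id: str
    slot: int  # slot in which the IB was produced


def order_requests(announcements, strategy):
    """Return announcements in the order they would be requested.

    Assumed semantics, inferred from the strategy names only:
    - freshest-first: highest slot first
    - oldest-first:   lowest slot first
    - peer-order:     keep the order in which the peer announced them
    """
    if strategy == "freshest-first":
        return sorted(announcements, key=lambda a: a.slot, reverse=True)
    if strategy == "oldest-first":
        return sorted(announcements, key=lambda a: a.slot)
    if strategy == "peer-order":
        return list(announcements)
    raise ValueError(f"unknown strategy: {strategy}")


announced = [Announcement("x", 3), Announcement("y", 7), Announcement("z", 5)]
for name in ("freshest-first", "oldest-first", "peer-order"):
    print(name, [a.ib_id for a in order_requests(announced, name)])
```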
 ## 2025-02-07
 
 ### Haskell simulation
 
@@ -42,29 +42,54 @@ tx-max-size-bytes: 16384
 # Ranking Block Configuration
 ################################################################################
 
+# 1/leios-stage-length-slots, targeting one RB per pipeline.
+# Also 20s is current rate of praos blocks.
 rb-generation-probability: 5.0e-2
-rb-generation-cpu-time-ms: 300.0
-rb-head-validation-cpu-time-ms: 1.0
-rb-head-size-bytes: 32
+# Eng. team targets 1kB as worst case upper bound.
+# Actual size fairly close.
+rb-head-size-bytes: 1024
 rb-body-max-size-bytes: 90112
+# Note: certificate generation/validation is not included in the
+# timings here, see cert-* fields.
+rb-generation-cpu-time-ms: 1.0
+rb-head-validation-cpu-time-ms: 1.0
 
+# On average, no Txs directly embedded in blocks.
+rb-body-legacy-praos-payload-avg-size-bytes: 0
 rb-body-legacy-praos-payload-validation-cpu-time-ms-constant: 50.0
+# the -per-byte component is meant to be using size as a (bad)
+# proxy for the complexity of the Txs included.
 rb-body-legacy-praos-payload-validation-cpu-time-ms-per-byte: 0.0005
-rb-body-legacy-praos-payload-avg-size-bytes: 0
 
 ################################################################################
 # Input Block Configuration
 ################################################################################
 
 ib-generation-probability: 5.0
-ib-generation-cpu-time-ms: 300.0
 ib-shards: 1
-ib-head-size-bytes: 32
+
+# ProducerId  32
+# SlotNo      64
+# VRF proof   80
+# Body hash   32
+# RB Ref      32
+# Signature   64
+# Total       304
+#
+# NOTE: using a KES Signature (like for Praos headers)
+#       would instead more than double the total to 668.
+#       And even 828 including Op Cert.
+ib-head-size-bytes: 304
+# 100kB, using praos max size as ballpark estimate.
+ib-body-avg-size-bytes: 102400
+ib-body-max-size-bytes: 327680
+# Here we also use praos blocks as ballpark estimate.
+# Sec 2.3 Forging, of the benchmark cluster report, lists
+#   * Slot start to announced: 0.12975s
+ib-generation-cpu-time-ms: 130.0
 ib-head-validation-cpu-time-ms: 1.0
 ib-body-validation-cpu-time-ms-constant: 50.0
 ib-body-validation-cpu-time-ms-per-byte: 0.0005
-ib-body-max-size-bytes: 327680
-ib-body-avg-size-bytes: 327680
 ib-diffusion-strategy: "freshest-first"
 
 # Haskell prototype relay mini-protocol parameters.
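The IB header size and body validation cost in this hunk can be checked with a short Python sketch. It assumes, based only on the field names, that the simulations compute body validation time as the constant plus the per-byte rate times the body size; the function name is hypothetical.

```python
# Header field sizes as listed in the comment above `ib-head-size-bytes`.
ib_header_fields = {
    "producer_id": 32,
    "slot_no": 64,
    "vrf_proof": 80,
    "body_hash": 32,
    "rb_ref": 32,
    "signature": 64,
}
ib_head_size_bytes = sum(ib_header_fields.values())


def ib_body_validation_ms(body_bytes, constant_ms=50.0, per_byte_ms=0.0005):
    """Assumed cost model: constant + per-byte * size (defaults from the config)."""
    return constant_ms + per_byte_ms * body_bytes


avg_ms = ib_body_validation_ms(102400)  # ib-body-avg-size-bytes
max_ms = ib_body_validation_ms(327680)  # ib-body-max-size-bytes

print(ib_head_size_bytes, "header bytes")
print(round(avg_ms, 2), "ms for an average body")
print(round(max_ms, 2), "ms for a maximum-size body")
```

Under that assumed cost model the per-byte term roughly doubles the validation time for an average body and quadruples it for a maximum-size one.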
@@ -76,11 +101,27 @@ ib-diffusion-max-window-size: 100
 # Endorsement Block Configuration
 ################################################################################
 
-eb-generation-probability: 5.0
-eb-generation-cpu-time-ms: 300.0
-eb-validation-cpu-time-ms: 1.0
+# We want one per pipeline, but not too many.
+eb-generation-probability: 1.5
+# ProducerId  32
+# SlotNo      64
+# VRF proof   80
+# Signature   64
+# Total       240
+#
+# See Note about signatures on ib-head-size-bytes.
 eb-size-bytes-constant: 32
+# IB hash
 eb-size-bytes-per-ib: 32
+# Collecting the IBs to reference and cryptography are the main tasks.
+# A comparable task is maybe mempool snapshotting.
+# Sec 2.3 Forging, of the benchmark cluster report, lists
+#   * Mempool snapshotting: 0.07252s
+# 75ms then seems a generous estimate for eb generation.
+eb-generation-cpu-time-ms: 75.0
+# Validating signature and vrf proof, as in other headers.
+eb-validation-cpu-time-ms: 1.0
+
 eb-diffusion-strategy: "peer-order"
 
 # Haskell prototype relay mini-protocol parameters.
@@ -92,13 +133,32 @@ eb-diffusion-max-window-size: 100
 # Vote Configuration
 ################################################################################
 
+# Cryptography related values taken from [vote-spec](crypto-benchmarks.rs/Specification.md)
+# using weighted averages of 80% persistent and 20% non-persistent.
+
+# vote-spec#Committee and quorum size
+#
+# Note: this is used as the expected amount of total weight of
+# generated votes in the sims.
 vote-generation-probability: 500.0
-vote-generation-cpu-time-ms-constant: 1.0
-vote-generation-cpu-time-ms-per-ib: 1.0
-vote-validation-cpu-time-ms: 3.0
-vote-threshold: 150
-vote-bundle-size-bytes-constant: 32
-vote-bundle-size-bytes-per-eb: 32
+# vote-spec#"Committee and quorum size"
+# 60% of `vote-generation-probability`
+vote-threshold: 300
+# vote-spec#"Generate vote" 0.8*135e-3 + 0.2*280e-3
+vote-generation-cpu-time-ms-constant: 164e-3
+# No benchmark yet.
+vote-generation-cpu-time-ms-per-ib: 0
+# vote-spec#"Verify vote" 0.8*670e-3 + 0.2*1.4
+vote-validation-cpu-time-ms: 816e-3
+# The `Vote` structure counted in the -per-eb already identifies slot
+# (in Eid) and voter. We can assume a vote bundle is all for the same
+# voter and slot, so for non-persistent voters we could factor their
+# PoolKeyHash (28bytes) here, but that is for 20% of cases.
+# More relevant if EB generation is very high.
+vote-bundle-size-bytes-constant: 0
+# vote-spec#Votes 0.8*90 + 0.2*164
+vote-bundle-size-bytes-per-eb: 105
+
 vote-diffusion-strategy: "peer-order"
 
 # Haskell prototype relay mini-protocol parameters.
@@ -110,9 +170,17 @@ vote-diffusion-max-window-size: 100
 # Certificate Configuration
 ################################################################################
 
-cert-generation-cpu-time-ms-constant: 50.0
-cert-generation-cpu-time-ms-per-node: 1.0
-cert-validation-cpu-time-ms-constant: 50.0
-cert-validation-cpu-time-ms-per-node: 1.0
-cert-size-bytes-constant: 32
-cert-size-bytes-per-node: 32
+# vote-spec#"certificate bytes"
+cert-size-bytes-constant: 136
+# vote-spec#"certificate bytes" ((80/8) + 76 * (100 - 80))/100
+cert-size-bytes-per-node: 15
+
+# For certificate timings we have bulk figures for realistic scenarios,
+# so we do not attempt to give -per-node (i.e. per-voter) timings.
+#
+# vote-spec#"Generate certificate"
+cert-generation-cpu-time-ms-constant: 90.0
+cert-generation-cpu-time-ms-per-node: 0
+# vote-spec#"Verify certificate"
+cert-validation-cpu-time-ms-constant: 130.0
+cert-validation-cpu-time-ms-per-node: 0
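The vote and certificate values above come from the arithmetic quoted in the config comments. The following Python sketch simply re-evaluates those expressions; the helper name `weighted` is made up for illustration, and the inputs are the figures given in the comments rather than values re-read from the specification.

```python
PERSISTENT_SHARE = 0.8      # share of persistent voters, per the config comment
NON_PERSISTENT_SHARE = 0.2  # share of non-persistent voters


def weighted(persistent_value, non_persistent_value):
    """Weighted average over persistent and non-persistent voters."""
    return (PERSISTENT_SHARE * persistent_value
            + NON_PERSISTENT_SHARE * non_persistent_value)


vote_generation_ms = weighted(135e-3, 280e-3)   # vote-generation-cpu-time-ms-constant
vote_validation_ms = weighted(670e-3, 1.4)      # vote-validation-cpu-time-ms
vote_bytes_per_eb = round(weighted(90, 164))    # vote-bundle-size-bytes-per-eb
vote_threshold = round(0.60 * 500.0)            # 60% of vote-generation-probability

# cert-size-bytes-per-node, evaluated exactly as written in the comment.
cert_bytes_per_node = round((80 / 8 + 76 * (100 - 80)) / 100)

print(round(vote_generation_ms, 3), round(vote_validation_ms, 3))
print(vote_bytes_per_eb, vote_threshold, cert_bytes_per_node)
```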