Commit 2e4b28e

CDDL (#420)
* docs(CDDL): added base definitions for blocks, votes & certificates and sharding-specific definitions
1 parent fb750be commit 2e4b28e

File tree

906 files changed

+73950
-1034
lines changed

Logbook.md

Lines changed: 160 additions & 1 deletion
@@ -1,5 +1,164 @@
11
# Leios logbook
22

3+
## 2025-06-20
4+
5+
### CDDL Specification Draft
6+
7+
Created initial CDDL specifications for core Leios components:
8+
9+
- Input Blocks with VRF lottery and single IB/slot limits
10+
- Endorser Blocks as new aggregation block type
11+
- Ranking Blocks as Conway extension with optional certificates
12+
- BLS voting system with persistent/non-persistent voters and key registration
13+
- Follows crypto-benchmarks implementation approach, maintains Conway CDDL compatibility
14+
- First draft establishing foundational structures - incomplete but covers common base components
15+
- Upcoming iterations will add detailed specifications for design variants (full sharding, overcollateralization, protocol extensions)
16+
17+
### Formal methods
18+
19+
- Added support for `Late IB inclusion` to the formal spec of Full-Short Leios
20+
- Profiled `leios-trace-verifier`: about 60% of the time is spent in garbage collection. Switching to `--nonmoving-gc` improves performance significantly.
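
As a minimal sketch of how the garbage-collector switch mentioned above might be tried, the helper below appends GHC runtime-system options to an arbitrary command. The function name `run_with_nonmoving_gc` is hypothetical, the verifier's own arguments are not shown because they are not given here, and the sketch assumes the executable was linked so that it accepts RTS options on the command line (for example with `-rtsopts`).

```shell
# Hypothetical helper: append GHC RTS options to any command line.
# --nonmoving-gc selects the non-moving collector for the oldest generation;
# -s prints a runtime and GC summary on exit, handy for before/after comparison.
run_with_nonmoving_gc() {
  "$@" +RTS --nonmoving-gc -s -RTS
}

# Example (the verifier arguments are placeholders, not its real CLI):
# run_with_nonmoving_gc leios-trace-verifier <verifier arguments>
```

Comparing the `-s` summary with and without `--nonmoving-gc` is one way to check whether the share of time spent in garbage collection actually drops for a given trace.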
21+
22+
### Rust simulation
23+
24+
- Added support for generating conflicting transactions. The probability of conflicts is controlled with the `tx-conflict-fraction` setting, at a global or per-node level.
25+
- Added support for overcollateralization: transactions will randomly be generated with enough extra collateral to be included in multiple IB shards, controlled by the `tx-overcollateralization-factor-distribution` setting.
26+
- Added support for explicitly-weighted TX production. The `tx-generation-weight` setting can be used to make specific relays generate more transactions than others.
27+
- Stake pools no longer generate transactions (by default)
28+
29+
### Draft of second technical report
30+
31+
The initial draft of the [Leios Technical Report 2](docs/technical-report-2.md) now contains two major sections on the Leios mini protocols and the realism of the Haskell simulator.
32+
33+
1. Network specification
34+
1. Relay mini protocol
35+
2. Fetch mini protocol
36+
3. CatchUp mini protocol
37+
2. Haskell simulation realism
38+
* Six scenarios are analyzed
39+
40+
## 2025-06-19
41+
42+
### Metrics section in CIP
43+
44+
The performance of a protocol like Leios can be characterized in terms of its efficient use of resources, its total use of resources, the probabilities of negative outcomes due to the protocol's design, and the resilience to adverse conditions. Metrics measuring such performance depend upon the selection of protocol parameters, the network topology, and the submission of transactions. The table below summarizes key metrics for evaluating Leios as a protocol and individual scenarios (parameters, network, and load). A discussion of these metrics was added to [the draft Leios CIP](docs/cip/README.md#metrics).
45+
46+
| Category   | Metric                                                    | Measurement                                                                                                     |
| ---------- | --------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- |
| Efficiency | Spatial efficiency, $`\epsilon_\text{spatial}`$           | Ratio of total transaction size to persistent storage                                                           |
|            | Temporal efficiency, $`\epsilon_\text{temporal}(s)`$      | Time to include transaction on ledger                                                                           |
|            | Network efficiency, $`\epsilon_\text{network}`$           | Ratio of total transaction size to node-averaged network usage                                                  |
| Protocol   | TX collision, $`p_\text{collision}`$                      | Probability of a transaction being included in two IBs                                                          |
|            | TX inclusion, $`\tau_\text{inclusion}`$                   | Mean number of slots for a transaction being included in any IB                                                 |
|            | Voting failure, $`p_\text{noquorum}`$                     | Probability of sortition failure to elect sufficient voters for a quorum                                        |
| Resource   | Network egress, $`q_\text{egress}`$                       | Rate of bytes transmitted by a node                                                                             |
|            | Disk usage, $`q_\text{disk}`$                             | Rate of persistent bytes stored by a node                                                                       |
|            | I/O operations, $`\bar{q}_\text{iops}(b)`$                | Mean number of I/O operations per second, where each operation writes a filesystem block of $`b`$ bytes         |
|            | Mean CPU usage, $`\bar{q}_\text{vcpu}`$                   | Mean virtual CPU cores used by a node                                                                           |
|            | Peak CPU usage, $`\hat{q}_\text{vcpu}`$                   | Maximum virtual CPU cores used by a node over a one-slot window                                                 |
| Resilience | Bandwidth, $`\eta_\text{bandwidth}(b)`$                   | Fractional loss in throughput at finite bandwidth $`b`$                                                         |
|            | Adversarial stake, $`\eta_\text{adversary}(s)`$           | Fractional loss in throughput due to adversarial stake of $`s`$                                                 |
| Fees       | Collateral paid for success, $`\kappa_\text{success}(c)`$ | Average collateral paid for a successful transaction when it conflicts with a fraction $`c`$ of the memory pool |
|            | Collateral paid for failure, $`\kappa_\text{failure}(c)`$ | Average collateral paid for a failed transaction when it conflicts with a fraction $`c`$ of the memory pool     |
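
The two ratio-type efficiency metrics in the table can be written out explicitly. The symbols $`B_\text{tx}`$ (total transaction size), $`B_\text{disk}`$ (persistent storage used), and $`\bar{B}_\text{net}`$ (node-averaged network usage) are notation assumed here for illustration only; they are not taken from the CIP draft.

```math
\epsilon_\text{spatial} = \frac{B_\text{tx}}{B_\text{disk}},
\qquad
\epsilon_\text{network} = \frac{B_\text{tx}}{\bar{B}_\text{net}}
```

Under this reading, a spatial efficiency of 0.8 would mean that 10 GB of transactions occupy $`10 / 0.8 = 12.5`$ GB of persistent storage.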
63+
64+
## 2025-06-18
65+
66+
### Bandwidth measurements
67+
68+
Because bandwidth between nodes has been identified as a critical resource that limits Leios throughput, we conducted an unscientific experiment, using `iperf3` for bidirectional measurements between locations in North America and Europe:
69+
70+
| Client                   | Server         | Send Mbps | Receive Mbps |
|:-------------------------|:---------------|----------:|-------------:|
| OVH Canada               | OVH Poland     |       219 |          217 |
| OVH Canada               | OVH Oregon USA |       363 |          360 |
| OVH Oregon USA           | OVH Poland     |       142 |          144 |
| CenturyLink Colorado USA | OVH Poland     |       147 |          145 |
| CenturyLink Colorado USA | OVH Oregon USA |       418 |          412 |
| CenturyLink Colorado USA | OVH Canada     |        97 |           95 |
| CenturyLink Colorado USA | OVH Virginia   |       311 |          309 |
| CenturyLink Colorado USA | AWS Oregon USA |       826 |          824 |
| AWS Oregon USA           | OVH Oregon USA |       973 |          972 |
| AWS Oregon USA           | OVH Poland     |       141 |          138 |
| AWS Oregon USA           | OVH Canada     |       329 |          327 |
| OVH Virginia USA         | OVH Oregon USA |       369 |          367 |
| OVH Virginia USA         | OVH Poland     |       231 |          229 |
| OVH Virginia USA         | OVH Canada     |       469 |          467 |
86+
87+
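The OVH machines are inexpensive instances, the AWS machine is an `r5a.4xlarge`, and CenturyLink is a local ISP. Overall, roughly 100 Mbps looks like a conservative lower bound, although the slowest pair (CenturyLink Colorado to OVH Canada) measured slightly less, at 95-97 Mbps.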
88+
89+
We also made some empirical measurements where a machine in one data center connects to four others to measure simultaneous bandwidth. It looks like there is a 5-20% reduction in individual link speed, as compared to only one connection at a time. (For most of the pairs, it is closer to 5%, and the 20% is an outlier.) However, we don't know what would happen if we measured with 25 simultaneous connections.
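
A minimal sketch of how such measurements could be reproduced and compared follows. The exact `iperf3` options used for the table are not recorded here, so the `--bidir` invocation (available in iperf3 3.7 and later) is an assumption, and both function names are hypothetical.

```shell
# Server side (run on the remote machine first): iperf3 -s

# Client side: measure send and receive simultaneously against host $1
# for ${2:-10} seconds. Assumes iperf3 >= 3.7 for --bidir.
measure_bidir() {
  iperf3 -c "$1" --bidir -t "${2:-10}"
}

# Percentage drop in link speed when a link shares the machine with other
# simultaneous connections: $1 = Mbps measured alone, $2 = Mbps measured concurrently.
pct_reduction() {
  awk -v solo="$1" -v multi="$2" 'BEGIN { printf "%.1f\n", 100 * (solo - multi) / solo }'
}
```

For example, `pct_reduction 200 190` prints `5.0`, which corresponds to the low end of the 5-20% range quoted above; the input numbers are illustrative, not measurements.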
90+
91+
## 2025-06-17
92+
93+
### Mini-mainnet experiments
94+
95+
The 750-node [pseudo-mainnet](data/simulation/pseudo-mainnet/topology-v2.yaml) network was used in Haskell and Rust experiments to study the limits of transaction and IB throughput for realistic scenarios up to 300 TPS and 32 IB/s.
96+
97+
- [Analysis results](analysis/sims/2025w24/analysis.ipynb)
98+
- [Slides](analysis/sims/2025w24/summary.pdf)
99+
- Findings:
100+
- The 750-node mini-mainnet is a suitable replacement for the 10,000-node pseudo-mainnet, in that either topology would result in similar performance measurements and resource recommendations.
101+
- The Haskell and Rust simulations substantially agree for mini-mainnet simulations.
102+
- Key metrics from these simulations:
103+
- Block propagation takes less than 1 second, which is consistent with empirical observations from pooltool.io. Note that this has implications for our discussion of the IB-concurrency period.
104+
- With 1 Gb/s links/NICs, the protocol can support 25 MB/s throughput before it starts degrading.
105+
- Mean time from mempool to ledger is about 150 seconds for transactions.
106+
- Disk-space efficiency is about 80%.
107+
- About 20% of network traffic is wasted.
108+
- Even at 300 tx/s, a 6-core VM is sufficient for peak demand, but average demand is less than 2 cores.
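
To put the throughput finding in the same units as the link speed, the conversion below assumes decimal units and reads "MB/s" as megabytes per second and "Gb/s" as gigabits per second; if the figures were meant differently, the ratio changes accordingly.

```math
25\ \text{MB/s} \times 8\ \tfrac{\text{bit}}{\text{byte}} = 200\ \text{Mb/s} = 0.2 \times 1\ \text{Gb/s}
```

Under that assumption, degradation begins when the useful payload reaches about one fifth of the nominal link capacity.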
109+
110+
### Added features to simulation trace processor
111+
112+
The [`leios-trace-processor`](analysis/sims/trace-processor/) now reports message sizes for bandwidth-usage analysis.
113+
114+
### Pseudo-mainnet experiments
115+
116+
The 10,000-node [pseudo-mainnet](data/simulation/pseudo-mainnet/ReadMe.md) network was used in Rust experiments to study the limits of transaction and IB throughput for realistic scenarios up to 300 TPS and 32 IB/s.
117+
118+
- [Analysis results](analysis/sims/2025w23/analysis.ipynb)
119+
- [Slides](analysis/sims/2025w23/summary.pdf)
120+
- Findings:
121+
- Transactions took an average of 100 seconds to travel from the memory pool to the ledger.
122+
- Disk and network usage were approximately 80% efficient.
123+
- Even at high TPS, six CPU cores were sufficient to handle peak load.
124+
- Block propagation time averaged under one second.
125+
126+
## 2025-06-15
127+
128+
### Reduced memory footprint for analyzing simulation traces
129+
130+
The [`leios-trace-processor`](analysis/sims/trace-processor/) was refactored in order to dramatically reduce the memory footprint of analyzing large simulation traces.
131+
132+
## 2025-06-12
133+
134+
### "Miniature mainnet" topology
135+
136+
Because the 10,000-node [pseudo-mainnet](data/simulation/pseudo-mainnet/topology-v1.ipynb) runs so slowly and consumes so much memory in the Rust and Haskell simulations, it is impractical for repeated use and for large experiments. We therefore created a smaller 750-node topology that faithfully mimics mainnet: it has nearly the same diameter as mainnet and a very similar stake distribution and edge degree.
137+
138+
- Methodology: [topology-v2.ipynb](data/simulation/pseudo-mainnet/topology-v2.ipynb)
139+
- Network: [topology-v2.yaml](data/simulation/pseudo-mainnet/topology-v2.yaml)
140+
- Metrics: [topology-v2.md](data/simulation/pseudo-mainnet/topology-v2.md)
141+
142+
| Metric | Value |
|--------|-------|
| Total nodes | 750 |
| Block producers | 216 |
| Relay nodes | 534 |
| Total connections | 19314 |
| Network diameter | 5 hops |
| Average connections per node | 25.75 |
| Clustering coefficient | 0.332 |
| Average latency | 64.8 ms |
| Maximum latency | 578.3 ms |
| Stake-weighted latency | 0.0 ms |
| Bidirectional connections | 1463 |
| Asymmetry ratio | 84.85% |
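
Two of the derived rows can be cross-checked from the raw counts. The sketch below assumes that "total connections" counts directed edges and that each bidirectional connection accounts for two of them; that interpretation is an inference that happens to reproduce the reported values, not something stated in the topology notebook. The function name `topology_check` is hypothetical.

```shell
# Recompute derived topology metrics from the counts in the table.
topology_check() {
  nodes=750    # total nodes
  edges=19314  # total (assumed directed) connections
  bidir=1463   # bidirectional connections (each assumed to use two directed edges)
  awk -v n="$nodes" -v e="$edges" -v b="$bidir" 'BEGIN {
    printf "average connections per node: %.2f\n", e / n
    printf "asymmetry ratio: %.2f%%\n", 100 * (1 - 2 * b / e)
  }'
}
topology_check
# average connections per node: 25.75
# asymmetry ratio: 84.85%
```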
156+
157+
### Rust simulation
158+
159+
- Fixed a bad assertion triggered by the extreme load of the 10k-node network
160+
- Added support for configurable timestamp resolution
161+
3162
## 2025-06-11
4163

5164
### Additional data analyses in Leios trace processor
@@ -79,7 +238,7 @@ Here are a few personal (@bwbush) observations, reflections, and conclusions on
79238
5. It appears that front running can best be eliminated (at the ledger level, but not at the mempool level) by strictly ordering transactions by their IB's slot and VRF.
80239
- Other IB and EB ordering proposals create complexity in the ledger rules and would be difficult to fully analyze for vulnerabilities.
81240

82-
## Rust simulation
241+
### Rust simulation
83242

84243
Implemented random sampling of transactions from the Leios mempool. When transaction traffic is high enough that IBs are completely full, it should ensure that different IBs contain different transactions when possible.
85244

analysis/sims/.gitignore

Lines changed: 2 additions & 0 deletions
@@ -2,11 +2,13 @@
22
.ipynb_checkpoints/
33
ols
44
sim-cli
5+
leios-trace-processor
56
tmp/
67
plots/
78
results/
89
stdout
910
stderr
1011
*.csv.gz
1112
*.log.gz
13+
*.log.xz
1214
index.html

analysis/sims/2025w23/analysis.ipynb

Lines changed: 3700 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
#!/usr/bin/env nix-shell
#!nix-shell -i bash -p gzip

set -e

mkdir -p results/{tps3x,ibs}

for d in ibs tps3x
do
  for f in cpus lifecycle resources receipts
  do
    DIR=$(find $d -type f -name $f.csv.gz \( -not -empty \) -printf %h\\n -quit)
    HL=$(sed -n -e '1p' "$DIR/case.csv")
    HR=$(zcat "$DIR/$f.csv.gz" | sed -n -e '1p')
    (
      echo "$HL,$HR"
      for g in $(find $d -type f -name $f.csv.gz \( -not -empty \) -printf %h\\n)
      do
        if [ ! -e "$g/stderr" ]
        then
          echo "Skipping $g because it has no stderr." >> /dev/stderr
        elif [ -s "$g/stderr" ]
        then
          echo "Skipping $g because its stderr is not empty." >> /dev/stderr
        else
          BL=$(sed -n -e '2p' "$g/case.csv")
          zcat "$g/$f.csv.gz" | sed -e "1d;s/^/$BL,/;s/null/NA/g"
        fi
      done
    ) | gzip -9c > results/$d/$f.csv.gz &
  done
done
wait
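
The core of the script above is the `sed` expression that drops each result file's header, prefixes every row with the case's parameter values, and rewrites `null` as `NA`. A minimal, self-contained sketch of that single step is shown below; the function name `merge_demo`, the column names, and the sample values are all hypothetical stand-ins.

```shell
# Hypothetical miniature of the merge step: one case row plus one tiny result file.
merge_demo() {
  # Stands in for line 2 of a case.csv file.
  BL='rust,0.5'
  # The printf stands in for: zcat "$g/$f.csv.gz"
  # The sed program drops the header line, prefixes the case columns,
  # and replaces every "null" with "NA".
  printf 'node,cpu\nn1,null\n' | sed -e "1d;s/^/$BL,/;s/null/NA/g"
}
merge_demo
# rust,0.5,n1,NA
```

Because the case values are spliced directly into the `sed` program, this approach presumably relies on them never containing `/` or `&`; values with those characters would need escaping first.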

analysis/sims/2025w23/git.hash

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
6e7e493b682602d4df70157aac309bff0ca9b64e
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
1+
simulator,ibps
2+
rust,0.5
