Commit 73c3b6b

Latencies for ASNs (#316)

* Generated latency-matrix data files for ASN and country
* Recipe for reproducing or updating datasets
* Added summary statistics and graphics
* Updated logbook
1 parent c446918 commit 73c3b6b

8 files changed

+19964
-0
lines changed

Logbook.md

Lines changed: 10 additions & 0 deletions
@@ -1,5 +1,13 @@
# Leios logbook

## 2025-05-05

### Internet latencies

In support of building a more realistic topology and latency graph for the pseudo-mainnet simulations, we have processed 2.6 billion `ping` measurements from the [RIPE Atlas](https://www.ripe.net/analyse/internet-measurements/ripe-atlas/). Combined with Cardano mainnet node-location telemetry, these measurements provide realistic network delays for the simulated network topology.

![ASN to ASN RTT statistics](data/internet/asn-to-asn.svg)

## 2025-05-02

### Long-term profitability of Praos
@@ -31,6 +39,7 @@ We have analyzed Cardano `mainnet` statistics regarding the Reserve, rewards, tr
- 19.3% block space utilization

### Rust simulation

- Minor visual improvements to visualizer
- Updated logic around "late IB inclusion" Leios extension to match Giorgos' revisions

@@ -47,6 +56,7 @@ An experimental Delta-QSD expression was created for computing the delay between
As many sections of the [Leios CIP](docs/cip/README.md) have been drafted as can be done pending resolution of outstanding discussions of changes in the Full Leios protocol. The document uses the standard CIP template and provides evidence-based arguments for the need and viability of Leios.

### Rust simulation

- Publicly hosted visualization as part of the Leios docs
- Added a "transactions" view to the visualization, showing a graph of TXs in different states over time
- Fixed crash when running long simulations

data/internet/ReadMe.md

Lines changed: 309 additions & 0 deletions
@@ -0,0 +1,309 @@
# Latency map of the internet

Here we process data from the [RIPE Atlas](https://www.ripe.net/analyse/internet-measurements/ripe-atlas/), which is one of the largest internet measurement networks. The result is a data table of ASN-to-ASN round-trip times (RTT) for pings. Each [Autonomous System](https://en.wikipedia.org/wiki/Autonomous_system_(Internet)) is identified by an ASN (autonomous system number).

These data have been used for the latencies in the mock-up of a synthetic version of the Cardano `mainnet`, for use in simulation studies of Ouroboros protocol design and development.


## Data files

- [asn\_rtt\_stat.csv.gz](asn_rtt_stat.csv.gz): ASN-to-ASN latencies
- [country\_rtt\_stat.csv.gz](country_rtt_stat.csv.gz): country-to-country latencies
- [intra\_rtt\_stat.csv.gz](intra_rtt_stat.csv.gz): intra-ASN latencies


## Data dictionary

Latencies are assumed to be symmetrical, so the data files only record the "upper triangle" of the latency matrix.

| Field          | Units    | Description                                                        |
|----------------|----------|--------------------------------------------------------------------|
| `asn1`, `asn2` | n/a      | The autonomous system numbers between which latencies are measured |
| `cty1`, `cty2` | n/a      | The countries between which latencies are measured                 |
| `rtt_cnt`      | unitless | The number of pings in the raw dataset.                            |
| `rtt_avg`      | ms       | The mean round-trip time (RTT) for pings between the two locations. |
| `rtt_std`      | ms       | The sample standard deviation of the RTT ping measurements.        |
| `rtt_min`      | ms       | The minimum of the RTT ping measurements in the sample.            |
| `rtt_max`      | ms       | The maximum of the RTT ping measurements in the sample.            |

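Because only the upper triangle is stored, a consumer of these files has to order each pair before looking it up. The following is a minimal sketch of such a lookup, assuming the column layout in the table above; the sample rows and the helper names `load_stats` and `lookup` are hypothetical and exist only for illustration.

```python
import csv
import io

# Hypothetical sample rows in the documented column layout. The values are
# invented for illustration and are not taken from the real dataset.
SAMPLE = """asn1,asn2,rtt_cnt,rtt_avg,rtt_std,rtt_min,rtt_max
100,200,12,35.5,4.2,28.1,47.9
100,100,7,3.1,0.8,2.0,4.9
"""


def load_stats(text):
    """Index the rows by their ordered (asn1, asn2) pair."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = (int(row["asn1"]), int(row["asn2"]))
        table[key] = {
            name: float(row[name])
            for name in ("rtt_avg", "rtt_std", "rtt_min", "rtt_max")
        }
    return table


def lookup(table, a, b):
    """Return the statistics for a pair regardless of argument order."""
    # The files store the smaller ASN first, so normalize the key the same way.
    return table.get((min(a, b), max(a, b)))


stats = load_stats(SAMPLE)
print(lookup(stats, 200, 100))
```

The real files are gzip-compressed, so in practice the text would come from something like `gzip.open(path, "rt")` rather than an inline string.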

## Summary statistics


### ASN to ASN

![ASN to ASN RTT statistics](asn-to-asn.svg)


### Country to country

![Country to country RTT statistics](cty-to-cty.svg)


### Intra-ASN

|    Mean | Standard deviation | Minimum |     Maximum |
|--------:|-------------------:|--------:|------------:|
| 80.4 ms |           103.5 ms |    0 ms | 249625.7 ms |


## Data processing

This section provides a reproducible recipe for creating or updating the data files.


### Schema

Use PostgreSQL to process the data.

First we create a schema for data processing.

```sql
create schema asn;
```


### Ping measurements

Download the `ping` measurements for several days from the RIPE Atlas. The origin, destination, and RTT measurements are extracted from these files.

```bash
# For each hourly dump: keep IPv4 records and emit one "src,dst,rtt" line per reply.
for d in 2025-03-29 2025-03-30 2025-03-31 2025-04-18 2025-04-19
do
  for t in 0000 0100 0200 0300 0400 0500 0600 0700 0800 0900 1000 1100 1200 1300 1400 1500 1600 1700 1800 1900 2000 2100 2200 2300
  do
    curl -sS "https://data-store.ripe.net/datasets/atlas-daily-dumps/${d}/ping-${d}T${t}.bz2" \
      | bunzip2 -c \
      | jq -r '
          select(.result)
          | select(.src_addr)
          | select(.dst_addr)
          | select(.src_addr | contains("."))
          | select(.dst_addr | contains("."))
          | .src_addr as $src
          | .dst_addr as $dst
          | [
              .result[]
              | select(.rtt)
              | ($src + "," + $dst + "," + (.rtt | tostring))
            ]
          | .[]
        ' \
      | pigz -9c \
      > "ping-${d}T${t}.gz"
    sleep 10s
  done
done
```
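To make the filter easier to follow, here is a rough Python equivalent applied to a single record. It is a sketch only: the field names `src_addr`, `dst_addr`, `result`, and `rtt` are the ones used in the `jq` filter above, while the sample record is hypothetical and heavily simplified, and the function name `extract` is invented for illustration.

```python
import json


def extract(line):
    """Return 'src,dst,rtt' strings for one JSON record, IPv4 addresses only."""
    rec = json.loads(line)
    src = rec.get("src_addr")
    dst = rec.get("dst_addr")
    results = rec.get("result")
    # Skip records that lack any of the three fields.
    if src is None or dst is None or not results:
        return []
    # A dot in the address is used as a cheap test for IPv4.
    if "." not in src or "." not in dst:
        return []
    return [
        f"{src},{dst},{reply['rtt']}"
        for reply in results
        if isinstance(reply, dict) and reply.get("rtt") is not None
    ]


# Hypothetical record: two replies carry an RTT, one does not.
SAMPLE = json.dumps({
    "src_addr": "192.0.2.1",
    "dst_addr": "198.51.100.7",
    "result": [{"rtt": 12.5}, {"x": "*"}, {"rtt": 13.25}],
})

for row in extract(SAMPLE):
    print(row)
```

Each emitted line matches the three-column layout (source, destination, RTT) that the recipe loads into the database.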

Create a table for the `ping` measurements.

```sql
create table asn.ping (
  src inet
, dst inet
, rtt real
);
```

Load the data extracts into the table.

```bash
for f in ping-*T*.gz
do
  echo "$f"
  zcat "$f" > tmp.csv
  psql -c "\\copy asn.ping from 'tmp.csv' csv"
done
rm tmp.csv
```

```sql
select count(*) from asn.ping;
```

```console
   count
------------
 2480314130
(1 row)
```


### IP ranges for ASNs

The [iptoasn](https://iptoasn.com) web site provides a table of the IP ranges corresponding to each ASN. (Note that the RIPE Atlas's *probe* data file does not contain sufficient information for reconstructing destination ASNs in the `ping` data, so we use this alternative dataset.)

Download the IP to ASN correspondence table.

```bash
curl -sS -o ip2asn-v4.tsv.gz https://iptoasn.com/data/ip2asn-v4.tsv.gz
# Convert the tab-separated file to CSV, quoting the free-text label column.
zcat ip2asn-v4.tsv.gz \
  | gawk '
      BEGIN {
        FS="\t"
        OFS=","
        print "start", "end", "asn", "country", "label"
      }
      {
        print $1, $2, $3, $4, "\"" $5 "\""
      }
    ' \
  > ip2asn.csv
```
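The conversion above wraps the label column in quotes but does not escape quotes that may occur inside a label. As an alternative sketch, not the method used to produce the published files, Python's `csv` module performs the same TSV-to-CSV conversion with standard quote escaping. The input line below is hypothetical, and `tsv_to_csv` is an illustrative name.

```python
import csv
import io


def tsv_to_csv(tsv_text):
    """Convert five-column tab-separated text to CSV with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["start", "end", "asn", "country", "label"])
    for line in tsv_text.splitlines():
        if line:
            # csv.writer quotes fields containing commas or quotes and
            # doubles any embedded quote characters.
            writer.writerow(line.split("\t")[:5])
    return out.getvalue()


# Hypothetical input line with a comma and quotes in the label.
SAMPLE = '192.0.2.0\t192.0.2.255\t64500\tAA\tEXAMPLE "NET", Inc.\n'

print(tsv_to_csv(SAMPLE), end="")
```

PostgreSQL's CSV mode reads doubled quotes inside a quoted field as a literal quote, so output in this form can be loaded with the same `\copy ... csv header` command.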

Load that download into a table.

```sql
create table asn.ip2asn (
  ip_start inet
, ip_end inet
, asn bigint
, country varchar(8)
, organization text
);

\copy asn.ip2asn from 'ip2asn.csv' csv header
```

```console
INSERT 0 56
```


### Known IP addresses

For efficiency, we create a table of all of the distinct IP addresses seen in the `ping` data, so that the costly `BETWEEN` query against the IP ranges runs only once per address.

```sql
create table asn.addr as
  select src as ip from asn.ping
  union
  select dst from asn.ping
;
```

```console
SELECT 73754
```


### Correspondence tables for known IP addresses

Now identify which ASN(s) correspond to each IP address and note the country code.

```sql
create table asn.ips as
  select
      ip
    , asn
    , country
    from asn.addr
    inner join asn.ip2asn
      on ip between ip_start and ip_end
;
```

```console
SELECT 82710
```
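The range join can be pictured with a small Python sketch. This is an illustration under stated assumptions rather than part of the recipe: the ranges use documentation-reserved prefixes and private-use ASNs, the country codes are placeholders, and `match` is a hypothetical helper. Like the SQL join, it returns every range that contains the address, so one address can map to more than one ASN.

```python
import ipaddress

# Hypothetical (ip_start, ip_end, asn, country) rows for illustration only.
RANGES = [
    ("192.0.2.0", "192.0.2.255", 64500, "AA"),
    ("198.51.100.0", "198.51.100.255", 64501, "BB"),
]


def to_int(addr):
    """Convert a dotted-quad IPv4 address to an integer for comparison."""
    return int(ipaddress.IPv4Address(addr))


def match(ip, ranges=RANGES):
    """Return (asn, country) for every range with ip_start <= ip <= ip_end."""
    value = to_int(ip)
    return [
        (asn, country)
        for start, end, asn, country in ranges
        if to_int(start) <= value <= to_int(end)
    ]


print(match("192.0.2.17"))
print(match("203.0.113.5"))
```

A linear scan is adequate for a sketch; with the full range table, a sorted structure or the database join shown above is the practical choice.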


## RTT statistics

Summarize the `ping` statistics. We tabulate the minimum, maximum, mean, and standard deviation so that we can later sample from a truncated Gaussian distribution when generating synthetic data.
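
As a hedged illustration of how these four statistics could drive such sampling, the sketch below draws from a Gaussian and rejects values outside the observed range. Rejection sampling is one simple way to truncate a Gaussian and is an assumption here, not necessarily what the simulations use; `sample_rtt` and its parameters are hypothetical names.

```python
import random


def sample_rtt(avg, std, lo, hi, rng=random, max_tries=1000):
    """Draw an RTT from a Gaussian(avg, std) truncated to [lo, hi]."""
    # PostgreSQL's stddev() yields NULL for a single measurement, so a
    # missing or zero deviation falls back to the clamped mean.
    if not std or lo >= hi:
        return min(max(avg, lo), hi)
    for _ in range(max_tries):
        x = rng.gauss(avg, std)
        if lo <= x <= hi:
            return x
    # Give up after max_tries rejections and return the clamped mean.
    return min(max(avg, lo), hi)


# Hypothetical statistics for one ASN pair, in milliseconds.
rng = random.Random(0)
samples = [sample_rtt(35.5, 4.2, 28.1, 47.9, rng=rng) for _ in range(5)]
print(samples)
```

Passing a seeded `random.Random` instance keeps simulation runs reproducible.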


### ASNs

```sql
create table asn.asn_stat as
  select
      -- Order each pair so that (a, b) and (b, a) fall into the same group.
      least(src.asn, dst.asn) as asn1
    , greatest(src.asn, dst.asn) as asn2
    , count(*) as rtt_cnt
    , avg(rtt) as rtt_avg
    , stddev(rtt) as rtt_std
    , min(rtt) as rtt_min
    , max(rtt) as rtt_max
    from asn.ping as ping
    -- Map the source and destination addresses to their ASNs.
    inner join asn.ips as src
      on src.ip = ping.src
    inner join asn.ips as dst
      on dst.ip = ping.dst
    group by 1, 2
;

\copy asn.asn_stat to 'asn_rtt_stat.csv' csv header
```

```console
COPY 422372
```

```bash
gzip -9v asn_rtt_stat.csv
```
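
For readers who want to check the grouping logic on a small sample without a database, the following Python sketch mirrors the aggregation. It is an illustration only: the addresses, ASNs, and RTTs are hypothetical, and `summarize` is an invented helper name. `statistics.stdev` computes the sample standard deviation, which is also what PostgreSQL's `stddev` returns.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize(pings, asn_of):
    """Group RTTs by unordered ASN pair and compute summary statistics."""
    groups = defaultdict(list)
    for src, dst, rtt in pings:
        # An address may map to several ASNs, as in the inner joins.
        for a in asn_of.get(src, []):
            for b in asn_of.get(dst, []):
                groups[(min(a, b), max(a, b))].append(rtt)
    return {
        pair: {
            "rtt_cnt": len(rtts),
            "rtt_avg": mean(rtts),
            # The sample standard deviation is undefined for one value.
            "rtt_std": stdev(rtts) if len(rtts) > 1 else None,
            "rtt_min": min(rtts),
            "rtt_max": max(rtts),
        }
        for pair, rtts in groups.items()
    }


# Hypothetical measurements in both directions between two ASNs.
PINGS = [("192.0.2.1", "198.51.100.7", 10.0), ("198.51.100.7", "192.0.2.1", 20.0)]
ASN_OF = {"192.0.2.1": [64500], "198.51.100.7": [64501]}

print(summarize(PINGS, ASN_OF))
```

Both directions land in the single group `(64500, 64501)`, which is why the output files contain only the upper triangle of the matrix.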


### Countries

```sql
create table asn.cty_stat as
  select
      least(src.country, dst.country) as cty1
    , greatest(src.country, dst.country) as cty2
    , count(*) as rtt_cnt
    , avg(rtt) as rtt_avg
    , stddev(rtt) as rtt_std
    , min(rtt) as rtt_min
    , max(rtt) as rtt_max
    from asn.ping as ping
    inner join asn.ips as src
      on src.ip = ping.src
    inner join asn.ips as dst
      on dst.ip = ping.dst
    group by 1, 2
;

\copy asn.cty_stat to 'country_rtt_stat.csv' csv header
```

```console
COPY 6827
```

```bash
gzip -9v country_rtt_stat.csv
```


### Intra-ASN

```sql
create table asn.intra_stat as
  select
      count(*) as rtt_cnt
    , avg(rtt) as rtt_avg
    , stddev(rtt) as rtt_std
    , min(rtt) as rtt_min
    , max(rtt) as rtt_max
    from asn.ping as ping
    inner join asn.ips as src
      on src.ip = ping.src
    inner join asn.ips as dst
      on dst.ip = ping.dst
    -- Note: this filter compares country codes, so it selects pings within the
    -- same country; use "src.asn = dst.asn" to select pings within a single ASN.
    where src.country = dst.country
;

\copy asn.intra_stat to 'intra_rtt_stat.csv' csv header
```

```console
COPY 1
```

```bash
gzip -9v intra_rtt_stat.csv
```
