Commit 93280ff

fix: invert speed percentile labels
IQB needs to answer: "Can 95% of users perform this use case?"
This requires checking if 95% of users meet ALL requirements.

Example scenario
----------------

Gaming requirements: speed ≥ 10 Mbit/s AND latency ≤ 15 ms

Two networks, with raw percentile data (before this commit):

**Network Foo:**
- p95 speed = 30 Mbit/s
- p5 speed = 12 Mbit/s
- p95 latency = 7 ms
- p5 latency = 2 ms

**Network Bar:**
- p95 speed = 30 Mbit/s
- p5 speed = 8 Mbit/s
- p95 latency = 20 ms
- p5 latency = 5 ms

The percentile asymmetry problem
---------------------------------

Percentile definitions (by convention):
- pX = "X% of users have this value OR LESS"

For Foo latency (p95 = 7ms, threshold ≤ 15ms):
- 95% have ≤ 7ms
- Since 7ms < 15ms threshold, we KNOW 95% pass ✓ DEFINITIVE

For Bar latency (p95 = 20ms, threshold ≤ 15ms):
- 95% have ≤ 20ms
- But we DON'T know what % have ≤ 15ms
- Could be 94% at 1ms + 1% at 18ms → 94% pass
- Could be 10% at 1ms + 85% at 18ms → 10% pass
- p95 alone is AMBIGUOUS ✗

For Foo speed (p95 = 30 Mbit/s, threshold ≥ 10 Mbit/s):
- 95% have ≤ 30 Mbit/s (unhelpful direction!)
- But we DON'T know what % have ≥ 10 Mbit/s
- Could be 94% at 25 Mbit/s + 1% at 8 Mbit/s → 99% pass (counting the 5% at or above p95)
- Could be 50% at 25 Mbit/s + 45% at 8 Mbit/s → 55% pass (counting the 5% at or above p95)
- p95 alone is AMBIGUOUS ✗

For Foo speed (p5 = 12 Mbit/s, threshold ≥ 10 Mbit/s):
- 5% have ≤ 12 Mbit/s, therefore 95% have > 12 Mbit/s
- Since 12 Mbit/s > 10 Mbit/s threshold, we KNOW 95% pass ✓ DEFINITIVE

For Bar speed (p5 = 8 Mbit/s, threshold ≥ 10 Mbit/s):
- 95% have > 8 Mbit/s
- But we DON'T know what % have ≥ 10 Mbit/s
- p5 alone is AMBIGUOUS ✗

Pattern discovered
------------------

For "lower is better" metrics (latency, packet loss):
- Raw pX gives a DEFINITIVE answer for X% coverage when: pX ≤ threshold
- Useful percentile for IQB: p95

For "higher is better" metrics (speed):
- Raw p(100-X) gives a DEFINITIVE answer for X% coverage when: p(100-X) ≥ threshold
- For 95% coverage this means: p5 ≥ threshold
- Useful percentile for IQB: p5

The asymmetry in code (before this commit):

```Python
if latency_p95 <= threshold:
    latency_passes = True  # Definitive
else:
    latency_passes = None  # Ambiguous, need more data

if speed_p5 >= threshold:
    speed_passes = True  # Definitive
else:
    speed_passes = None  # Ambiguous, need more data
```

This is error-prone: it is easy to accidentally use speed_p95 instead of speed_p5.

Solution: Invert speed labels at data generation

In BigQuery queries (data/query_*.sql), swap speed percentile labels:
- OFFSET(5) raw → labeled as "download_p95" in JSON
- OFFSET(95) raw → labeled as "download_p5" in JSON

After inversion, "p95" uniformly means "performance that gives us a definitive answer when it meets the threshold":

```Python
if speed_p95 >= threshold:  # p95 is actually p5 raw (inverted)
    speed_passes = True
else:
    speed_passes = None

if latency_p95 <= threshold:  # p95 is actually p95 raw (not inverted)
    latency_passes = True
else:
    latency_passes = None
```

Why invert at data generation, not at runtime?

1. Availability guarantee: Never have to check "does p5 exist?" before attempting to answer the 95% coverage question
2. Single source of truth: Inversion logic lives in SQL queries only, not scattered across Python code
3. Self-documenting data: JSON files contain "quality percentiles" where pX always means "X% achieve this or better"
4. Simpler cache/calculator code: No metric-specific logic needed

Trade-off: Speed percentile labels no longer match raw statistical definitions (the p95 label contains the raw p5 value). This is extensively documented in SQL comments and cache.py docstrings.

See data/query_downloads.sql for detailed explanation and examples.

Based on a comment by https://github.com/sermpezis/ inside a notes document. Hopefully I interpreted it correctly, otherwise TIL.
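The ambiguity argument in the commit message can be checked numerically. The following is a minimal sketch, not code from this commit: `raw_percentile` and `pass_fraction` are hypothetical helpers, the nearest-rank percentile definition is an assumption (BigQuery's APPROX_QUANTILES is approximate and may behave differently), and both populations are synthetic. It builds two latency populations that share a raw p95 of 20 ms while very different fractions of users meet the ≤ 15 ms threshold.

```python
def raw_percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct * n / 100)
    return ordered[rank - 1]


def pass_fraction(latencies_ms, threshold_ms):
    """Fraction of users whose latency is at or below the threshold."""
    return sum(1 for v in latencies_ms if v <= threshold_ms) / len(latencies_ms)


# Two synthetic populations of 100 users each (latency in ms, assumed data).
mostly_fast = [1] * 94 + [20] * 6
mostly_slow = [1] * 10 + [20] * 90

for name, population in (("mostly_fast", mostly_fast), ("mostly_slow", mostly_slow)):
    # Same raw p95, very different share of users under the 15 ms threshold.
    print(name, raw_percentile(population, 95), pass_fraction(population, 15))
```

Both populations report the same p95, so a p95 above the threshold cannot by itself say how many users pass.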
1 parent 014e0a5 commit 93280ff

7 files changed: +189 -114 lines changed

data/br_2024_10.json

Lines changed: 33 additions & 33 deletions
@@ -10,48 +10,48 @@
   },
   "metrics": {
     "download_throughput_mbps": {
-      "p1": 0.15979623373499155,
-      "p5": 0.9501991252036766,
-      "p10": 3.101174869710966,
-      "p25": 15.0340700432778,
-      "p50": 51.9831305263177,
-      "p75": 158.38962702858973,
-      "p90": 330.3352983503099,
-      "p95": 456.0950392154999,
-      "p99": 696.5613392781584
+      "p1": 696.8102585656382,
+      "p5": 456.24926276472667,
+      "p10": 329.99833241434123,
+      "p25": 158.08221295858434,
+      "p50": 52.08522919032269,
+      "p75": 15.052793601948656,
+      "p90": 3.1272008078708624,
+      "p95": 0.9523541336337032,
+      "p99": 0.16179491817039293
     },
     "upload_throughput_mbps": {
-      "p1": 0.042563080079753776,
-      "p5": 0.07560071683921148,
-      "p10": 0.08980854096320207,
-      "p25": 5.545812099052701,
-      "p50": 30.78175191467136,
-      "p75": 88.37694460346944,
-      "p90": 181.64033113619195,
-      "p95": 255.97876412741525,
-      "p99": 394.3416893812533
+      "p1": 393.6290249806801,
+      "p5": 256.00644187498716,
+      "p10": 181.60570721295196,
+      "p25": 88.42259005024358,
+      "p50": 30.73281812980941,
+      "p75": 5.55669981856058,
+      "p90": 0.08981257546856133,
+      "p95": 0.07559917542134865,
+      "p99": 0.043266831359173155
     },
     "latency_ms": {
-      "p1": 1.394,
-      "p5": 3.637,
-      "p10": 4.958,
-      "p25": 9.079,
-      "p50": 19.953,
-      "p75": 52.065,
-      "p90": 184.738,
-      "p95": 234.072,
-      "p99": 273.0
+      "p1": 1.39,
+      "p5": 3.643,
+      "p10": 4.953,
+      "p25": 9.073,
+      "p50": 19.957,
+      "p75": 52.024,
+      "p90": 184.68,
+      "p95": 234.185,
+      "p99": 273.544
     },
     "packet_loss": {
       "p1": 0.0,
       "p5": 0.0,
       "p10": 0.0,
-      "p25": 1.1042755272820004e-05,
-      "p50": 0.004822712745559209,
-      "p75": 0.05811090765473097,
-      "p90": 0.13649207990035975,
-      "p95": 0.1987869577393624,
-      "p99": 0.3652163739953438
+      "p25": 1.0923089245161829e-05,
+      "p50": 0.0048178544016059515,
+      "p75": 0.058124757325470816,
+      "p90": 0.13651986085946857,
+      "p95": 0.1985210573594862,
+      "p99": 0.3680648144679889
     }
   }
 }

data/de_2024_10.json

Lines changed: 30 additions & 30 deletions
@@ -10,48 +10,48 @@
   },
   "metrics": {
     "download_throughput_mbps": {
-      "p1": 0.22367850581560372,
-      "p5": 1.262769802856182,
-      "p10": 3.4166592054870026,
-      "p25": 13.817824595534129,
-      "p50": 45.24430302103892,
-      "p75": 100.56946051210859,
-      "p90": 248.78115747983244,
-      "p95": 377.8657642766346,
-      "p99": 741.7983223940372
+      "p1": 741.3863770285967,
+      "p5": 377.9433173862602,
+      "p10": 248.65806704804896,
+      "p25": 100.59657604456656,
+      "p50": 45.262074301346765,
+      "p75": 13.80200458802345,
+      "p90": 3.432561194292282,
+      "p95": 1.2581497389555467,
+      "p99": 0.22552302324036846
     },
     "upload_throughput_mbps": {
-      "p1": 0.04798033204768874,
-      "p5": 0.07565187888251705,
-      "p10": 0.19852741925194242,
-      "p25": 3.5715003423978087,
-      "p50": 17.172955392453527,
-      "p75": 36.63458526768415,
-      "p90": 53.192909502396375,
-      "p95": 101.34444079000329,
-      "p99": 285.7324202068485
+      "p1": 285.715497004709,
+      "p5": 101.84982169389747,
+      "p10": 53.243619429234855,
+      "p25": 36.62105866176215,
+      "p50": 17.1805215736349,
+      "p75": 3.556625227971489,
+      "p90": 0.19786786217149757,
+      "p95": 0.07565274320492381,
+      "p99": 0.04880458855925971
     },
     "latency_ms": {
-      "p1": 0.438,
-      "p5": 3.433,
-      "p10": 6.787,
+      "p1": 0.448,
+      "p5": 3.481,
+      "p10": 6.78,
       "p25": 11.589,
       "p50": 17.712,
-      "p75": 26.382,
-      "p90": 38.489,
-      "p95": 57.061,
-      "p99": 305.85
+      "p75": 26.381,
+      "p90": 38.464,
+      "p95": 57.313,
+      "p99": 304.595
     },
     "packet_loss": {
       "p1": 0.0,
       "p5": 0.0,
       "p10": 0.0,
       "p25": 0.0,
-      "p50": 0.00034573047467282084,
-      "p75": 0.016581558328885995,
-      "p90": 0.07073353719313655,
-      "p95": 0.11517449630011735,
-      "p99": 0.2521127443846117
+      "p50": 0.0003440108877366934,
+      "p75": 0.016605967886425984,
+      "p90": 0.07071063900421407,
+      "p95": 0.11531058751234509,
+      "p99": 0.25114064520339185
     }
   }
 }

data/query_downloads.sql

Lines changed: 43 additions & 8 deletions
@@ -1,15 +1,47 @@
 SELECT
     client.Geo.CountryCode as country_code,
     COUNT(*) as sample_count,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(1)] as download_p1,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(5)] as download_p5,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(10)] as download_p10,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(25)] as download_p25,
+
+    -- ============================================================================
+    -- PERCENTILE LABELING CONVENTION FOR IQB QUALITY ASSESSMENT
+    -- ============================================================================
+    -- IQB Policy: p95 means "95% of users achieve this performance or better"
+    --
+    -- This allows threshold comparison: "Can 95% of users perform this use case?"
+    --
+    -- For "lower is better" metrics (latency, packet loss):
+    -- - Raw p95 = high value = "95% of users have ≤ 95ms latency"
+    -- - This directly answers: "Can 95% of users achieve ≤ threshold?"
+    -- - Label: OFFSET(95) → latency_p95 (no inversion needed) ✓
+    --
+    -- For "higher is better" metrics (throughput):
+    -- - Raw p95 = high value = "95% of users have ≤ 625 Mbit/s speed"
+    -- - But we need: "95% of users have ≥ X Mbit/s speed"
+    -- - Solution: Use p5 raw = "95% of users have ≥ 2.76 Mbit/s"
+    -- - Mathematical inversion: p(X)_quality = p(100-X)_raw
+    -- - Label: OFFSET(5) → download_p95 (inverted!) ✓
+    --
+    -- Example:
+    -- Requirement: ≥ 30 Mbit/s download, ≤ 200ms latency for 95% of users
+    -- Network A: download_p95 = 10 Mbit/s, latency_p95 = 100ms → FAIL (10 < 30)
+    -- Network B: download_p95 = 33 Mbit/s, latency_p95 = 50ms → PASS
+    -- ============================================================================
+
+    -- Download throughput (higher is better - INVERTED LABELS!)
+    -- ⚠️ OFFSET(99) = top speed = worst of top 1% → labeled as p1
+    -- ⚠️ OFFSET(5) = 5th percentile raw = 95% have MORE → labeled as p95
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(99)] as download_p1,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(95)] as download_p5,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(90)] as download_p10,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(75)] as download_p25,
     APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(50)] as download_p50,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(75)] as download_p75,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(90)] as download_p90,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(95)] as download_p95,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(99)] as download_p99,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(25)] as download_p75,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(10)] as download_p90,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(5)] as download_p95,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(1)] as download_p99,
+
+    -- Latency/MinRTT (lower is better - no inversion)
+    -- Raw percentiles directly represent "X% of users have ≤ this latency"
     APPROX_QUANTILES(a.MinRTT, 100)[OFFSET(1)] as latency_p1,
     APPROX_QUANTILES(a.MinRTT, 100)[OFFSET(5)] as latency_p5,
     APPROX_QUANTILES(a.MinRTT, 100)[OFFSET(10)] as latency_p10,
@@ -19,6 +51,9 @@ SELECT
     APPROX_QUANTILES(a.MinRTT, 100)[OFFSET(90)] as latency_p90,
     APPROX_QUANTILES(a.MinRTT, 100)[OFFSET(95)] as latency_p95,
     APPROX_QUANTILES(a.MinRTT, 100)[OFFSET(99)] as latency_p99,
+
+    -- Packet Loss Rate (lower is better - no inversion)
+    -- Raw percentiles directly represent "X% of users have ≤ this loss rate"
     APPROX_QUANTILES(a.LossRate, 100)[OFFSET(1)] as loss_p1,
     APPROX_QUANTILES(a.LossRate, 100)[OFFSET(5)] as loss_p5,
     APPROX_QUANTILES(a.LossRate, 100)[OFFSET(10)] as loss_p10,
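The label swap in this query follows a single rule, sketched here in Python purely as an illustration. The names `raw_offset`, `QUALITY_LABELS`, and the `higher_is_better` flag are hypothetical and do not appear in the commit: a quality label pX reads raw OFFSET(100 - X) for "higher is better" metrics and raw OFFSET(X) otherwise.

```python
# Percentile labels present in the JSON files shown in this commit.
QUALITY_LABELS = (1, 5, 10, 25, 50, 75, 90, 95, 99)


def raw_offset(quality_pct, higher_is_better):
    """Return the raw APPROX_QUANTILES offset stored under quality label p<quality_pct>."""
    if not 0 <= quality_pct <= 100:
        raise ValueError("percentile must be between 0 and 100")
    # Higher-is-better metrics are inverted; lower-is-better metrics are not.
    return 100 - quality_pct if higher_is_better else quality_pct


# Rebuild the label -> offset pairs for download throughput and latency.
download = {f"download_p{p}": raw_offset(p, higher_is_better=True) for p in QUALITY_LABELS}
latency = {f"latency_p{p}": raw_offset(p, higher_is_better=False) for p in QUALITY_LABELS}
```

Under this rule `download_p95` maps to offset 5 and `latency_p95` maps to offset 95, matching the column aliases in the query.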

data/query_uploads.sql

Lines changed: 25 additions & 8 deletions
@@ -1,15 +1,32 @@
 SELECT
     client.Geo.CountryCode as country_code,
     COUNT(*) as sample_count,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(1)] as upload_p1,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(5)] as upload_p5,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(10)] as upload_p10,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(25)] as upload_p25,
+
+    -- ============================================================================
+    -- PERCENTILE LABELING CONVENTION FOR IQB QUALITY ASSESSMENT
+    -- ============================================================================
+    -- IQB Policy: p95 means "95% of users achieve this performance or better"
+    --
+    -- Upload throughput is "higher is better", so:
+    -- - Raw p95 = "95% of users have ≤ X Mbit/s"
+    -- - But we need: "95% of users have ≥ Y Mbit/s"
+    -- - Solution: Use p5 raw (inverted labels)
+    --
+    -- See query_downloads.sql for detailed explanation and examples.
+    -- ============================================================================
+
+    -- Upload throughput (higher is better - INVERTED LABELS!)
+    -- ⚠️ OFFSET(99) = top speed = worst of top 1% → labeled as p1
+    -- ⚠️ OFFSET(5) = 5th percentile raw = 95% have MORE → labeled as p95
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(99)] as upload_p1,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(95)] as upload_p5,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(90)] as upload_p10,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(75)] as upload_p25,
     APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(50)] as upload_p50,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(75)] as upload_p75,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(90)] as upload_p90,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(95)] as upload_p95,
-    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(99)] as upload_p99
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(25)] as upload_p75,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(10)] as upload_p90,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(5)] as upload_p95,
+    APPROX_QUANTILES(a.MeanThroughputMbps, 100)[OFFSET(1)] as upload_p99
 FROM
     `measurement-lab.ndt.unified_uploads`
 WHERE

data/us_2024_10.json

Lines changed: 32 additions & 32 deletions
@@ -10,48 +10,48 @@
   },
   "metrics": {
     "download_throughput_mbps": {
-      "p1": 0.37354810526833476,
-      "p5": 2.7494108827310177,
-      "p10": 7.6575433038007406,
-      "p25": 29.94873577502137,
-      "p50": 96.36533017831101,
-      "p75": 268.1810327939917,
-      "p90": 474.1768162996085,
-      "p95": 625.4494125653449,
-      "p99": 893.2782851912168
+      "p1": 891.8792927991478,
+      "p5": 625.7114881036548,
+      "p10": 474.3016262507926,
+      "p25": 268.42685523419607,
+      "p50": 96.34596771774324,
+      "p75": 29.912972037815766,
+      "p90": 7.635142044576455,
+      "p95": 2.7455472138124843,
+      "p99": 0.38098156744742
     },
     "upload_throughput_mbps": {
-      "p1": 0.06279911698366483,
-      "p5": 0.15105079102447938,
-      "p10": 1.0130561597157441,
-      "p25": 8.030055616329323,
-      "p50": 20.95814566696693,
-      "p75": 65.73945359925672,
-      "p90": 223.9767416770114,
-      "p95": 370.4336035390081,
-      "p99": 813.7319533731953
+      "p1": 816.2955641497589,
+      "p5": 369.1587515169648,
+      "p10": 224.4207438060576,
+      "p25": 65.7807279310557,
+      "p50": 20.964743814950857,
+      "p75": 8.038127970369711,
+      "p90": 1.0064079692417476,
+      "p95": 0.1521468780337755,
+      "p99": 0.06268772417401364
     },
     "latency_ms": {
-      "p1": 0.16,
-      "p5": 0.808,
-      "p10": 2.886,
-      "p25": 7.778,
-      "p50": 16.124,
-      "p75": 30.0,
-      "p90": 51.303,
-      "p95": 80.55,
-      "p99": 251.545
+      "p1": 0.159,
+      "p5": 0.802,
+      "p10": 2.894,
+      "p25": 7.783,
+      "p50": 16.128,
+      "p75": 30.002,
+      "p90": 51.178,
+      "p95": 80.743,
+      "p99": 252.507
     },
     "packet_loss": {
       "p1": 0.0,
       "p5": 0.0,
       "p10": 0.0,
       "p25": 0.0,
-      "p50": 0.000516724336793541,
-      "p75": 0.019090240380880846,
-      "p90": 0.07332944466732425,
-      "p95": 0.12018590164702943,
-      "p99": 0.253111989432024
+      "p50": 0.0005198544633386031,
+      "p75": 0.01908806003333987,
+      "p90": 0.07333410102683499,
+      "p95": 0.1205355721342832,
+      "p99": 0.2548062180643292
     }
   }
 }

library/src/iqb/cache.py

Lines changed: 23 additions & 0 deletions
@@ -70,6 +70,29 @@ def get_data(
         Raises:
             FileNotFoundError: If requested data is not available in cache.
             ValueError: If requested percentile is not available in cached data.
+
+        ⚠️ PERCENTILE INTERPRETATION (CRITICAL!)
+        =========================================
+        IQB Policy: pX means "X% of users achieve this performance or better"
+
+        This enables threshold comparison: "Can 95% of users meet requirement?"
+
+        For "lower is better" metrics (latency, packet loss):
+        - Raw p95 = "95% of users have ≤ 80ms latency"
+        - Directly usable: latency_p95 ≤ threshold? ✓
+        - No inversion needed
+
+        For "higher is better" metrics (throughput):
+        - Raw p95 = "95% of users have ≤ 625 Mbit/s speed" ✗
+        - We need: "95% of users have ≥ X Mbit/s"
+        - Solution: Use p5 raw = "95% have more than this"
+        - Mathematical inversion: p(X)_quality = p(100-X)_raw
+        - Example: OFFSET(5) raw → labeled as "download_p95" in JSON
+
+        This inversion happens in BigQuery (see data/query_*.sql),
+        so this cache code treats all percentiles uniformly.
+        When you request percentile=95, you ALWAYS get
+        "performance that 95% of users achieve or better".
         """
         # Design Note
         # -----------
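With quality percentiles, a caller can apply one rule per metric and combine the results. This is a hedged sketch rather than the library's API: `meets_threshold`, `use_case_passes`, and the requirement tuples are hypothetical, and the values are the Foo and Bar networks from the commit message expressed with the inverted labels (so `speed_p95` holds the raw p5).

```python
import operator


def meets_threshold(quality_p95, threshold, higher_is_better):
    """True if 95% of users definitely meet the threshold, None if p95 alone cannot tell."""
    compare = operator.ge if higher_is_better else operator.le
    return True if compare(quality_p95, threshold) else None


def use_case_passes(network, requirements):
    """True only when every per-metric check is definitive, otherwise None."""
    results = [
        meets_threshold(network[metric], threshold, higher_is_better)
        for metric, (threshold, higher_is_better) in requirements.items()
    ]
    return True if all(result is True for result in results) else None


# Gaming: speed >= 10 Mbit/s AND latency <= 15 ms (from the commit message).
gaming = {"speed_p95": (10, True), "latency_p95": (15, False)}
foo = {"speed_p95": 12, "latency_p95": 7}   # raw p5 speed = 12, raw p95 latency = 7
bar = {"speed_p95": 8, "latency_p95": 20}   # raw p5 speed = 8, raw p95 latency = 20

print(use_case_passes(foo, gaming), use_case_passes(bar, gaming))
```

The only metric-specific input is the comparison direction. Note that this checks each metric separately and does not establish that the same 95% of users satisfy every requirement at once.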

library/tests/iqb/cache_test.py

Lines changed: 3 additions & 3 deletions
@@ -104,9 +104,9 @@ def test_get_data_with_different_percentile(self, data_dir):
         assert "m-lab" in data_p50
         data_p50 = data_p50["m-lab"]
 
-        # p95 should be higher than p50 for throughput metrics
-        assert data_p95["download_throughput_mbps"] > data_p50["download_throughput_mbps"]
-        assert data_p95["upload_throughput_mbps"] > data_p50["upload_throughput_mbps"]
+        # p95 should be lower than p50 for throughput metrics (inverted labels: lower is worse)
+        assert data_p95["download_throughput_mbps"] < data_p50["download_throughput_mbps"]
+        assert data_p95["upload_throughput_mbps"] < data_p50["upload_throughput_mbps"]
 
         # p95 should be higher than p50 for latency (higher is worse)
         assert data_p95["latency_ms"] > data_p50["latency_ms"]
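The flipped assertions reflect a general property of the relabelled data: as the quality label grows, throughput should not increase and latency should not decrease. A small sketch of that check follows, using a few post-commit values from data/us_2024_10.json. The helper `is_quality_ordered` is hypothetical and is not part of the test suite.

```python
def is_quality_ordered(percentiles, higher_is_better):
    """Check that values get no better as the quality label pX increases."""
    labels = sorted(percentiles, key=lambda label: int(label.lstrip("p")))
    values = [percentiles[label] for label in labels]
    pairs = list(zip(values, values[1:]))
    if higher_is_better:
        # Throughput: a larger label must hold an equal or smaller value.
        return all(a >= b for a, b in pairs)
    # Latency and loss: a larger label must hold an equal or larger value.
    return all(a <= b for a, b in pairs)


download = {"p5": 625.7114881036548, "p50": 96.34596771774324, "p95": 2.7455472138124843}
latency = {"p5": 0.802, "p50": 16.128, "p95": 80.743}

print(is_quality_ordered(download, higher_is_better=True))
print(is_quality_ordered(latency, higher_is_better=False))
```

A check like this would flag a data file generated with raw (non-inverted) throughput labels, because those values increase with the label.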
