| 1 | +# CDN Hit Rate Optimization Experiments |
| 2 | + |
| 3 | +## Baseline |
| 4 | +- **Date**: 2024-12-30 |
| 5 | +- **Metric**: CDN hit rate average across 16K-256K cache sizes |
| 6 | +- **Goal**: 58.30% |
| 7 | +- **Current**: 57.90% |
| 8 | + |
| 9 | +## Parameters Under Test |
| 10 | +| Parameter | Current Value | Description | |
| 11 | +|-----------|---------------|-------------| |
| 12 | +| smallQueueRatio | 900 (90%) | Small queue size as per-mille of capacity | |
| 13 | +| maxFreq | 2 | Frequency counter cap for eviction | |
| 14 | +| ghostCapMultiplier | 8x | Ghost queue capacity multiplier | |
| 15 | +| demotionThreshold | peakFreq >= 1 | Threshold for demotion from main to small | |
| 16 | +| evictionThreshold | freq < 2 | Threshold for eviction from small queue | |
| 17 | + |
| 18 | +--- |
| 19 | + |
| 20 | +## Experiment 1: Smaller Small Queue (80% instead of 90%) |
| 21 | + |
| 22 | +**Hypothesis**: CDN traces have scan patterns. A smaller small queue protects the main queue better, keeping valuable items longer. |
| 23 | + |
| 24 | +**Change**: `smallQueueRatio = 800` (from 900) |
| 25 | + |
| 26 | +**Results**: |
| 27 | +``` |
| 28 | +| Cache | 16K | 32K | 64K | 128K | 256K | Avg | |
| 29 | +|---------------|--------|--------|--------|--------|--------|---------| |
| 30 | +| multicache | 55.46% | 57.09% | 58.47% | 59.59% | 60.55% | 58.23% | |
| 31 | + |
| 32 | +Delta: +0.33% (57.90% → 58.23%) |
| 33 | +``` |
| 34 | + |
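The average and delta above can be rechecked from the per-size figures. This is a small Go sketch of that arithmetic; the helper name `mean` is illustrative, and the numbers are copied from the results table and the baseline.

```go
package main

import "fmt"

// mean returns the arithmetic mean of xs; xs must be non-empty.
func mean(xs []float64) float64 {
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

func main() {
	// Hit rates for 16K, 32K, 64K, 128K, 256K from Experiment 1.
	exp1 := []float64{55.46, 57.09, 58.47, 59.59, 60.55}
	const baseline = 57.90

	avg := mean(exp1)
	fmt.Printf("avg=%.2f%% delta=%+.2f\n", avg, avg-baseline)
	// prints: avg=58.23% delta=+0.33
}
```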
| 35 | +**Verdict**: ✓ IMPROVED - Closer to goal but not quite there |
| 36 | + |
| 37 | +--- |
| 38 | + |
| 39 | +## Experiment 2: Higher maxFreq (3 instead of 2) |
| 40 | + |
| 41 | +**Hypothesis**: Raising the frequency counter cap lets the cache distinguish more access levels, which might help filter out one-hit wonders. |
| 42 | + |
| 43 | +**Change**: `maxFreq = 3` (from 2) |
| 44 | + |
| 45 | +**Results**: |
| 46 | +``` |
| 47 | +CDN Avg: 57.90% |
| 48 | +Delta: 0.00% (no change) |
| 49 | +``` |
| 50 | + |
| 51 | +**Verdict**: ✗ NO EFFECT |
| 52 | + |
| 53 | +**Note**: Setting `maxFreq = 1` creates an infinite loop in eviction: items with freq=1 get promoted instead of evicted, so `evictFromSmall` never returns true. A warning comment was added. |
| 54 | + |
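Beyond a warning comment, one option would be to reject the problematic value when the cache is configured. The Go sketch below assumes a hypothetical `Config` struct and `Validate` method; neither is taken from the actual code, and the only rule encoded is the one observed in the note above.

```go
package main

import (
	"errors"
	"fmt"
)

// Config is a hypothetical parameter bundle used only for this illustration.
type Config struct {
	MaxFreq int // frequency counter cap
}

// errMaxFreqTooLow describes the failure mode reported for maxFreq = 1.
var errMaxFreqTooLow = errors.New("maxFreq must be >= 2: with maxFreq = 1 eviction from the small queue never completes")

// Validate rejects values of MaxFreq known to make eviction loop forever.
func (c Config) Validate() error {
	if c.MaxFreq < 2 {
		return errMaxFreqTooLow
	}
	return nil
}

func main() {
	for _, mf := range []int{1, 2, 3} {
		err := Config{MaxFreq: mf}.Validate()
		fmt.Printf("maxFreq=%d valid=%t\n", mf, err == nil)
	}
}
```

Failing fast at construction time turns a hang into an immediate, descriptive error, at the cost of one extra check.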
| 55 | +--- |
| 56 | + |
| 57 | +## Experiment 3: Larger Ghost Queue (12x instead of 8x) |
| 58 | + |
| 59 | +**Hypothesis**: CDN has high churn (~768K unique keys for 2M ops). A larger ghost queue remembers more evicted keys, allowing better admission decisions. |
| 60 | + |
| 61 | +**Change**: `ghostCap = size * 12` (from `size * 8`) |
| 62 | + |
| 63 | +**Results**: |
| 64 | +``` |
| 65 | +CDN Avg: 57.90% |
| 66 | +Delta: 0.00% (no change) |
| 67 | +``` |
| 68 | + |
| 69 | +**Verdict**: ✗ NO EFFECT |
| 70 | + |
| 71 | +--- |
| 72 | + |
| 73 | +## Experiment 4: Higher Demotion Threshold (peakFreq >= 2 instead of >= 1) |
| 74 | + |
| 75 | +**Hypothesis**: Only demoting items with higher historical frequency from main to small might keep the small queue cleaner. |
| 76 | + |
| 77 | +**Change**: `if e.peakFreq.Load() >= 2` instead of `>= 1` in `evictFromMain` |
| 78 | + |
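To make the effect of the threshold concrete, here is a minimal Go sketch of the demotion check in isolation. The `entry` type, the `atomic.Uint32` field type, and the `shouldDemote` helper are assumptions for illustration; only the `peakFreq.Load() >= N` comparison comes from the change described above.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// entry is a hypothetical stand-in for the cache's entry type.
type entry struct {
	peakFreq atomic.Uint32 // highest frequency the entry has reached (assumed type)
}

// shouldDemote reports whether an entry leaving the main queue would be
// moved back to the small queue rather than dropped, for a given threshold.
func shouldDemote(e *entry, threshold uint32) bool {
	return e.peakFreq.Load() >= threshold
}

func main() {
	e := &entry{}
	e.peakFreq.Store(1)

	// An entry with peakFreq 1 is demoted under the baseline threshold (>= 1)
	// but not under the Experiment 4 threshold (>= 2).
	fmt.Println(shouldDemote(e, 1), shouldDemote(e, 2))
	// prints: true false
}
```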
| 79 | +**Results**: |
| 80 | +``` |
| 81 | +CDN Avg: 57.79% |
| 82 | +Delta: -0.11% (hurt performance) |
| 83 | +``` |
| 84 | + |
| 85 | +**Verdict**: ✗ WORSE - Demotion helps CDN |
| 86 | + |
| 87 | +--- |
| 88 | + |
| 89 | +## Experiment 5: Combined - 80% Small Queue + 6x Ghost |
| 90 | + |
| 91 | +**Hypothesis**: Combining the winning 80% small queue with a smaller ghost might further improve CDN. |
| 92 | + |
| 93 | +**Changes**: |
| 94 | +- `smallQueueRatio = 800` |
| 95 | +- `ghostCap = size * 6` |
| 96 | + |
| 97 | +**Results**: |
| 98 | +``` |
| 99 | +CDN Avg: 58.23% |
| 100 | +Delta: +0.33% (same as Exp 1) |
| 101 | +``` |
| 102 | + |
| 103 | +**Verdict**: ~ NEUTRAL - Ghost size change had no effect on top of 80% small queue |
| 104 | + |
| 105 | +--- |
| 106 | + |
| 107 | +## Bonus Experiment: 75% Small Queue |
| 108 | + |
| 109 | +**Hypothesis**: If 80% helped, maybe 75% helps more. |
| 110 | + |
| 111 | +**Change**: `smallQueueRatio = 750` |
| 112 | + |
| 113 | +**Results**: |
| 114 | +``` |
| 115 | +CDN Avg: 58.34% |
| 116 | +Delta: +0.44% |
| 117 | +Goal: 58.30% ✓ ACHIEVED |
| 118 | +``` |
| 119 | + |
| 120 | +**Verdict**: ✓ ACHIEVED CDN GOAL |
| 121 | + |
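| 122 | +**Caveat**: At this stage the setting was judged to hurt the overall hitrate average (goal: 59.00%), so it was not adopted globally. Note that 58.34% is the CDN average, not the overall average; the ratio search later in this document measures the overall average at 750 as 60.24%. |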
| 123 | + |
| 124 | +--- |
| 125 | + |
| 126 | +## Summary |
| 127 | + |
| 128 | +| Experiment | CDN Avg | Delta | Meets Goal? | |
| 129 | +|------------|---------|-------|-------------| |
| 130 | +| Baseline | 57.90% | - | ✗ | |
| 131 | +| Exp 1: Small Queue 80% | 58.23% | +0.33% | ✗ | |
| 132 | +| Exp 2: maxFreq=3 | 57.90% | 0.00% | ✗ | |
| 133 | +| Exp 3: Ghost 12x | 57.90% | 0.00% | ✗ | |
| 134 | +| Exp 4: Demotion >= 2 | 57.79% | -0.11% | ✗ | |
| 135 | +| Exp 5: 80% small + 6x ghost | 58.23% | +0.33% | ✗ | |
| 136 | +| **Bonus: Small Queue 75%** | **58.34%** | **+0.44%** | **✓** | |
| 137 | + |
| 138 | +## Key Findings |
| 139 | + |
| 140 | +1. **Small queue ratio is the key lever for CDN**: Reducing it from 90% to 75-80% raises the CDN hit rate by 0.33-0.44 points, which is consistent with the hypothesis that a smaller small queue protects the main queue better. |
| 141 | + |
| 142 | +2. **Ghost queue size doesn't matter for CDN**: Neither 6x nor 12x changed the result compared to 8x. |
| 143 | + |
| 144 | +3. **maxFreq=3 vs 2 doesn't matter for CDN**: Raising the frequency counter cap produced no measurable change on this workload. |
| 145 | + |
| 146 | +4. **Demotion helps CDN**: Raising the demotion threshold to `peakFreq >= 2`, so that fewer items are demoted, cost 0.11 points, suggesting that giving items a second chance in the small queue is valuable. |
| 147 | + |
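| 148 | +5. **Trade-off assumed at this stage**: The 75% small queue meets the CDN goal (58.34%) but was believed to miss the overall hitrate average goal (59.00%), on the assumption that the 90% setting was already tuned for the average across all workloads. The binary search below tests that assumption directly. |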
| 149 | + |
| 150 | +--- |
| 151 | + |
| 152 | +## Binary Search: Optimal smallQueueRatio for Overall Hitrate |
| 153 | + |
| 154 | +**Goal**: Find smallQueueRatio that maximizes overall average hitrate across all 9 workloads. |
| 155 | + |
| 156 | +**Method**: Binary search using the benchmark with `SUITES=hitrate` |
| 157 | + |
| 158 | +| Ratio | Overall Avg | Notes | |
| 159 | +|-------|-------------|-------| |
| 160 | +| 950 | 58.84% | Worse | |
| 161 | +| 900 | 59.40% | Baseline | |
| 162 | +| 850 | 59.82% | Better | |
| 163 | +| 800 | 60.03% | Better | |
| 164 | +| 750 | 60.24% | Better | |
| 165 | +| 700 | 60.38% | Better | |
| 166 | +| 650 | 60.48% | Better | |
| 167 | +| 600 | 60.55% | Better | |
| 168 | +| 550 | 60.61% | Better | |
| 169 | +| 500 | 60.64% | Better | |
| 170 | +| 450 | 60.66% | Better | |
| 171 | +| **400** | **60.68%** | **Optimal** | |
| 172 | +| 375 | 60.68% | Plateau | |
| 173 | +| 350 | 60.68% | Plateau | |
| 174 | +| 325 | 60.68% | Plateau | |
| 175 | +| 300 | 60.67% | Decline starts | |
| 176 | +| 250 | 60.64% | Worse | |
| 177 | + |
| 178 | +**Finding**: Optimal plateau at 325-400 (all achieve 60.68%). Selected 400 as the final value. |
| 179 | + |
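The plateau can be read off the table mechanically. The following Go sketch, with hypothetical names `point`, `sweep`, and `plateau`, scans the measured ratios and returns the best overall average together with every ratio that reaches it. The data is copied from the table above; comparing floats with `==` is acceptable here only because the values are the reported two-decimal figures, not computed results.

```go
package main

import "fmt"

// point is one measured (ratio, overall average) pair from the table.
type point struct {
	ratio int
	avg   float64
}

// sweep holds the measurements in the same order as the table.
var sweep = []point{
	{950, 58.84}, {900, 59.40}, {850, 59.82}, {800, 60.03}, {750, 60.24},
	{700, 60.38}, {650, 60.48}, {600, 60.55}, {550, 60.61}, {500, 60.64},
	{450, 60.66}, {400, 60.68}, {375, 60.68}, {350, 60.68}, {325, 60.68},
	{300, 60.67}, {250, 60.64},
}

// plateau returns the highest average and all ratios that achieve it.
func plateau(points []point) (best float64, ratios []int) {
	for _, p := range points {
		switch {
		case len(ratios) == 0 || p.avg > best:
			best, ratios = p.avg, []int{p.ratio} // new maximum: restart the list
		case p.avg == best:
			ratios = append(ratios, p.ratio) // ties extend the plateau
		}
	}
	return best, ratios
}

func main() {
	best, ratios := plateau(sweep)
	fmt.Printf("best=%.2f%% ratios=%v\n", best, ratios)
	// prints: best=60.68% ratios=[400 375 350 325]
}
```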
| 180 | +**Improvement**: +1.28% absolute (59.40% → 60.68%) |
| 181 | + |
| 182 | +## Final Recommendations |
| 183 | + |
| 184 | +1. **Changed smallQueueRatio from 900 to 400** (90% → 40% small queue) |
| 185 | + - Improves overall hitrate from 59.40% to 60.68% (+1.28%) |
| 186 | + - CDN: 58.63% (was 57.90%, +0.73%) |
| 187 | + - All workloads improved |