investigation/QUEUELIST_BENCHMARK_RESULTS.md
Lines changed: 39 additions & 15 deletions
@@ -2,13 +2,16 @@
## Overview
-Created comprehensive BenchmarkDotNet benchmarks for QueueList to simulate the 5000-element append scenario as used in CheckDeclarations. Tested 5 implementations:
+Created comprehensive BenchmarkDotNet benchmarks for QueueList to simulate the 5000-element append scenario as used in CheckDeclarations. Tested 8 implementations:
**Key Insight**: V1/V2 perform nearly identically (~1% difference, within margin of error). Array-based V3 is **4.2x faster** but allocates **27x more memory**.
+| Implementation | Mean (ms) | Ratio | Allocated | Alloc Ratio |
+- **V5 (DList with lazy cached list) is the WINNER**: **4.1x faster** than baseline with only **1.6x more memory** (best speed/memory trade-off)
+- V6 (DList native) is slightly slower but uses even less memory (1.46x)
+- V3/V7 (array-based) are equally fast but use 8x more memory
+- V1/V2 perform nearly identically (~1% difference, within margin of error)
## Analysis
@@ -46,18 +56,32 @@ This is closest to real CheckDeclarations usage:
2. **AppendOptimized overhead**: Creating intermediate merged lists has cost without benefit for single-element case
3. **No structural sharing**: Each operation creates new objects, so optimization can't amortize
-### Why V3 (Array) is Fastest
+### Why V5 (DList with Caching) is Best
+
+1. **O(1) append**: DList composition is constant time
+2. **Lazy materialization**: List is only computed when needed for iteration
+3. **Balanced trade-off**: 4.1x speedup with only 1.6x memory overhead
+4. **Good for append-heavy + periodic iteration**: Perfect fit for the CheckDeclarations pattern
+
+### Why V6 (DList Native) is Also Good
+
+1. **Even less memory**: 1.46x allocation overhead
+2. **Still very fast**: 4.0x speedup over baseline
+3. **Trade-off**: Slightly slower iteration (materializes on every access)
+
+### Why V3/V7 (Array/ImmutableArray) Are Fast But Costly

1. **Contiguous memory**: Better cache locality
2. **Direct indexing**: No list traversal overhead
3. **Simple iteration**: Array enumeration is highly optimized
-4. **Trade-off**: 27-38x more memory allocation
+4. **Trade-off**: 8x more memory allocation
### Recommendations

1. **For this PR**: The AppendOptimized/caching changes don't help and should be reverted
-2. **Future work**: Consider array-backed implementation if willing to accept higher memory usage
-3. **Real solution**: Architectural change to avoid O(n²) iterations in CombineCcuContentFragments
+2. **Best alternative**: **V5 (DList with lazy cached list)** - 4.1x faster with only 1.6x memory overhead
+3. **Memory-conscious alternative**: V6 (DList native) - 4.0x faster with only 1.46x memory overhead
+4. **Future work**: Consider implementing DList-based QueueList for real performance gains
## Benchmark Categories
@@ -68,4 +92,4 @@ The benchmark includes 5 categories:
4. **Combined**: Realistic scenario with periodic operations
5. **AppendQueueList**: Appending QueueList objects (not single elements)
-All results confirm: **Current optimizations (V1/V2) provide no measurable benefit** over the baseline for the actual usage pattern.
+All results confirm: **Current optimizations (V1/V2) provide no measurable benefit** over the baseline for the actual usage pattern. **DList-based implementations (V5/V6) show real performance gains** with acceptable memory overhead.
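
Reviewer note: the diff describes V5 as a DList with O(1) append and a lazily cached list, and V6 as a DList that materializes on every access. The benchmarked code itself is not shown here, so the following is only a minimal sketch of that idea, written in Python purely for illustration. The class `DList`, the `cache_enabled` flag, the `materializations` counter, and the helper `build` are all hypothetical names, not taken from the PR.

```python
from typing import Any, List, Optional, Tuple

_EMPTY = object()  # sentinel: this node stores no element


class DList:
    """Append-friendly immutable sequence (illustrative sketch only)."""

    materializations = 0  # class-wide counter, present only to make the cost visible

    def __init__(self, item: Any = _EMPTY,
                 left: Optional["DList"] = None,
                 right: Optional["DList"] = None,
                 cache_enabled: bool = True) -> None:
        self._item = item
        self._left = left
        self._right = right
        self._cache_enabled = cache_enabled  # True ~ "V5-like", False ~ "V6-like" (assumption)
        self._cache: Optional[Tuple[Any, ...]] = None

    def append_one(self, item: Any) -> "DList":
        # O(1): allocates one leaf and one concat node; no elements are copied.
        leaf = DList(item=item, cache_enabled=self._cache_enabled)
        return DList(left=self, right=leaf, cache_enabled=self._cache_enabled)

    def items(self) -> Tuple[Any, ...]:
        # Lazy materialization: the flat sequence is built only when requested.
        if self._cache is not None:
            return self._cache
        DList.materializations += 1
        out: List[Any] = []
        stack = [self]  # explicit stack avoids recursion on deep left-nested trees
        while stack:
            node = stack.pop()
            if node._cache is not None:
                out.extend(node._cache)  # reuse a previously materialized prefix
            elif node._item is not _EMPTY:
                out.append(node._item)
            else:
                # Push right first so the left subtree is visited first.
                if node._right is not None:
                    stack.append(node._right)
                if node._left is not None:
                    stack.append(node._left)
        result = tuple(out)
        if self._cache_enabled:
            self._cache = result
        return result


def build(n: int, cache_enabled: bool) -> DList:
    """Append n single elements, mimicking the append-heavy scenario."""
    q = DList(cache_enabled=cache_enabled)
    for i in range(n):
        q = q.append_one(i)
    return q
```

With `cache_enabled=True`, repeated calls to `items()` on the same value walk the tree once and then return the stored tuple; with `cache_enabled=False`, every call walks the tree again. That is the iteration cost the diff attributes to V6, traded against not holding the extra materialized copy in memory.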