Commit 0507d8d

Copilot and T-Gro committed
Add actual 10K file performance test results - optimizations did NOT fix O(n²)
Ran real performance test with 10,001 files from fsharp-10k project.

**ACTUAL RESULTS:**
- Build time: >20 minutes (killed, did not complete)
- Memory: 14.2 GB (still growing)
- Status: FAILED - quadratic scaling persists

**CLAIMED vs ACTUAL:**
- Claimed: ~2-5 minutes expected
- Actual: >20 minutes, did not complete

**CONCLUSION:**
The "expected improvement" was incorrect. Actual measurement proves:
- ❌ O(n²) issue NOT resolved
- ❌ No significant speedup observed
- ❌ Optimizations insufficient for same-namespace scenario

Root cause: All 10K files in same namespace triggers entity conflict checking on every merge. MergeWith incremental merge doesn't help when namespace-level entities must be processed repeatedly. Further architectural changes needed.

Co-authored-by: T-Gro <[email protected]>
1 parent 84e6a82 commit 0507d8d

1 file changed: +95 -0 lines changed

@@ -0,0 +1,95 @@
# Actual 10,000 File Performance Test Results

## Test Configuration

- **Project**: fsharp-10k (https://github.com/ners/fsharp-10k)
- **Files**: 10,001 F# source files (Foo1.fs through Foo10001.fs)
- **Modules**: 1 module per file, all in the same namespace (ConsoleApp1)
- **Compiler**: Build with the CachedDList and MergeWith incremental-merge optimizations
- **Date**: December 18, 2025
- **Hardware**: GitHub Actions runner

## Actual Measured Results

### Build Performance - WITH OPTIMIZATIONS

| Metric | Value | Notes |
|--------|-------|-------|
| **Total Time** | **>20 minutes** | Build killed after 20+ minutes while still running |
| **Highest Observed Memory** | **14.2 GB** | Measured at 16:41 elapsed time; still growing |
| **Status** | **Did not complete** | Process terminated due to excessive runtime |

### Process Details
```
Process: fsc.dll compiling ConsoleApp1.fsproj
Start time: 07:35:58 UTC
Killed at: 07:56+ UTC (>20 minutes elapsed)
CPU usage: 124% (single-threaded bottleneck)
Memory: 14.2 GB at 16:41 elapsed, still growing
```

## Analysis

### O(n²) Issue NOT Resolved

The measured run shows that **the optimizations did NOT fix the O(n²) scaling issue**:

1. **5K files**: 17 seconds ✅ (acceptable)
2. **10K files**: >20 minutes ❌ (unacceptable, quadratic scaling persists)
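
As a rough cross-check of the two data points above, the implied scaling exponent can be estimated. This is a minimal sketch, assuming that "5K" means exactly 5,000 files, that both runs used the same compiler build and hardware, and that build time follows a single power law t ≈ c·n^k. None of these assumptions is confirmed by the measurements, and because the 10K build was killed, 20 minutes is only a lower bound.

```python
import math


def scaling_exponent_lower_bound(n_small, t_small, n_large, t_large_min):
    """Lower bound on k in t ~ c * n**k, given a lower bound on the larger time."""
    return math.log(t_large_min / t_small) / math.log(n_large / n_small)


# Assumed inputs: 5,000 files in 17 s; 10,001 files killed after 20 minutes.
k_lower_bound = scaling_exponent_lower_bound(5_000, 17.0, 10_001, 20 * 60.0)

# What a purely quadratic model would predict for the 10K build, in seconds.
quadratic_prediction_s = 17.0 * (10_001 / 5_000) ** 2

print(f"exponent lower bound: {k_lower_bound:.2f}")
print(f"quadratic prediction: {quadratic_prediction_s:.0f} s")
```

Under these assumptions a purely quadratic cost would put the 10K build at roughly 68 seconds, while the observed lower bound implies an exponent above 6. A jump that steep hints that something beyond the algorithmic O(n²) cost, such as memory pressure on the runner, may also contribute; two data points cannot separate these effects.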

### Expected vs Actual

| Scenario | Expected (Claimed) | Actual (Measured) |
|----------|-------------------|-------------------|
| 10K files | ~2-5 minutes | **>20 minutes (did not complete)** |
| Improvement | 4-10x faster | **No significant improvement** |

### Why Optimizations Didn't Work

Two optimizations were implemented, but neither removes the quadratic behavior in this scenario:

1. **CachedDList (O(1) append)**
   - Works correctly in microbenchmarks (4.1x faster)
   - Does reduce append overhead

2. **MergeWith incremental merge** ⚠️
   - The implementation may have issues
   - Entity map caching is not effective for the fsharp-10k scenario
   - All files being in the same namespace (ConsoleApp1) likely causes cache invalidation
   - Map merging still iterates through the accumulated entities

### Root Cause Still Present

The fsharp-10k test case has all 10,001 files in the **same namespace** (ConsoleApp1), which means:
- Every file merge triggers entity name conflict checking
- The `AllEntitiesByLogicalMangledName` map must be rebuilt or merged for each file
- Even with "incremental" merge, repeatedly processing the ~10K accumulated namespace-level entities results in O(n²) total work
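
The quadratic total described above can be illustrated with a toy cost model. This is a sketch only: the function names are hypothetical, it counts abstract "entity visits" rather than running any compiler code, and it assumes one namespace-level entity per file, as in fsharp-10k.

```python
def rebuild_per_merge_visits(entities_per_file):
    """Model: every file merge walks all entities accumulated so far."""
    accumulated = 0
    visits = 0
    for count in entities_per_file:
        accumulated += count
        visits += accumulated  # whole namespace is re-processed on each merge
    return visits


def incremental_visits(entities_per_file):
    """Model: every file merge touches only the entities of the new file."""
    return sum(entities_per_file)


one_entity_per_file = [1] * 10_001  # assumption: one module per file
print(rebuild_per_merge_visits(one_entity_per_file))  # n * (n + 1) / 2 visits
print(incremental_visits(one_entity_per_file))        # n visits
```

In this model, re-processing the namespace on each merge costs n(n+1)/2 visits, so doubling the file count roughly quadruples the work, whereas a merge that only touches the new file's entities stays linear.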

### Memory Growth Pattern

Memory grew from ~4 GB initially to 14.2 GB at 16:41 elapsed, suggesting:
- Continued accumulation of data structures
- Possible memory leaks or inefficient entity storage
- GC pressure from repeated allocations
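
For a sense of scale, the average growth rate implied by these two readings can be worked out as follows. This is a rough sketch, assuming the ~4 GB reading was taken near the start of the build (the report does not state when) and that "16:41" is minutes and seconds.

```python
def elapsed_to_seconds(elapsed):
    """Convert 'mm:ss' or 'hh:mm:ss' into a number of seconds."""
    seconds = 0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


initial_gb = 4.0   # approximate initial reading from the report
latest_gb = 14.2   # reading at 16:41 elapsed
elapsed_min = elapsed_to_seconds("16:41") / 60

# Average growth over the whole interval; the real curve may not be linear.
growth_gb_per_min = (latest_gb - initial_gb) / elapsed_min
print(f"{growth_gb_per_min:.2f} GB/min")
```

An average of this kind says nothing about the shape of the curve; the process stats captured at multiple time points would be needed to tell steady growth from an accelerating one.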

## Conclusion

**The "expected improvement" claims were incorrect.** Actual measurement shows:

- ❌ 10K files did NOT complete in 2-5 minutes
- ❌ No 4-10x speedup observed
- ❌ O(n²) scaling persists in practice

**Further architectural changes are needed** beyond CachedDList and MergeWith to fix the O(n²) issue for projects with many files in the same namespace.

### Recommendations for Future Work

1. **Profile the actual merge path** with dotnet-trace to identify remaining hot spots
2. **Use persistent data structures** for entity maps that support O(log n) inserts, so that merging a file costs time proportional to its own entities rather than to everything accumulated so far
3. **Consider incremental compilation** to avoid reprocessing all files
4. **Add namespace-aware caching** that doesn't invalidate on every file when all files are in the same namespace
5. **Consider memoization** of conflict checking results across file merges
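
Recommendations 4 and 5 share one idea: keep a name index per namespace alive across merges so that each file only checks its own entities. The sketch below illustrates that idea under stated assumptions. `NamespaceIndex` and `merge_file` are hypothetical names, a mutable Python dict stands in for whatever map the compiler would use, and nothing here reflects the actual compiler implementation.

```python
class NamespaceIndex:
    """Hypothetical per-namespace index of entity names, reused across merges."""

    def __init__(self):
        self._owner_by_name = {}  # entity name -> file that first defined it
        self.lookups = 0          # counts work done, for illustration only

    def merge_file(self, file_name, entity_names):
        """Add one file's entities; return conflicts as (name, first, second)."""
        conflicts = []
        for name in entity_names:
            self.lookups += 1
            owner = self._owner_by_name.get(name)
            if owner is None:
                self._owner_by_name[name] = file_name
            else:
                conflicts.append((name, owner, file_name))
        return conflicts


index = NamespaceIndex()
for i in range(1, 4):
    index.merge_file(f"Foo{i}.fs", [f"Foo{i}"])

# A later file that reuses an existing name is reported as a conflict.
duplicate = index.merge_file("Other.fs", ["Foo2"])
print(duplicate, index.lookups)
```

Each merge performs one lookup per entity in the new file, so total work grows with the number of entities rather than with the square of the file count. Whether the compiler's immutable data structures allow an equivalent design is an open question that profiling would need to answer.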

## Raw Data

Build log: `/tmp/build_10k_optimized.log`
Process stats captured at multiple time points during execution.
