Create plan for the benchmark

jozefizso · jozefizso · commit 18f27f4c8cb9 · 2025-11-16T14:56:00.000+01:00
diff --git a/.github/prompts/plan-NetOfficeBenchmark.md b/.github/prompts/plan-NetOfficeBenchmark.md
@@ -0,0 +1,261 @@
+# Benchmarking Plan for RemoveObjectFromList Performance (Issue #221)
+
+## Overview
+This plan addresses the performance issue reported in [Issue #221](https://github.com/NetOfficeFw/NetOffice/issues/221) regarding `Core.RemoveObjectFromList` performance bottleneck.
+
+## Problem Statement
+The current implementation uses `List<ICOMObject>` for `_globalObjectList`, resulting in O(n²) complexity when disposing parent objects with multiple children. Each `Remove()` operation is O(n), and disposing N children results in cumulative O(n²) time complexity.
+
+## Current State Analysis
+
+### Code Location
+- File: `Source/NetOffice/Core.cs`
+- Field: `_globalObjectList` (line 96)
+- Method: `RemoveObjectFromList()` (lines 1346-1367)
+
+### Current Implementation
+```csharp
+private List<ICOMObject> _globalObjectList = new List<ICOMObject>();
+
+internal void RemoveObjectFromList(ICOMObject proxy, IEnumerable<ICOMObject> ownerPath)
+{
+    lock (_globalObjectList)
+    {
+        removed = _globalObjectList.Remove(proxy); // O(n) operation
+    }
+}
+```
+
+### Testing Infrastructure
+- **Framework:** NUnit
+- **Location:** `/Users/izso/Developer/NetOffice/Source/NetOffice.Tests/`
+- **Current State:** No benchmarking infrastructure exists
+- **Need:** Create new benchmark project with BenchmarkDotNet
+
+## Benchmark Plan
+
+### 1. Infrastructure Setup
+
+**Create New Benchmark Project:**
+- Project name: `NetOffice.Benchmarks`
+- Location: `/Users/izso/Developer/NetOffice/Benchmarks/` or `/Users/izso/Developer/NetOffice/Tests/Benchmarks/`
+- Dependencies:
+  - BenchmarkDotNet (latest stable)
+  - NetOffice.Core (project reference)
+- Target frameworks: Match main project (.NET Framework, .NET Core, etc.)
+
+### 2. Benchmark Scenarios
+
+Test with varying object counts: **10, 100, 1,000, 10,000 objects**
+
+#### Scenario A: Sequential Removal
+**Purpose:** Simulate disposing a parent with N children (worst-case for List)
+
+**Test:**
+```csharp
+[Benchmark]
+public void SequentialRemoval_List()
+{
+    // Add N objects to List
+    // Remove all objects one by one
+    // Measures O(n²) complexity
+}
+```
+
+**Expected Outcome:** Performance degrades quadratically with List, linearly with HashSet
+
+#### Scenario B: Bulk Disposal
+**Purpose:** Test `DisposeAllCOMProxies()` behavior
+
+**Test:**
+```csharp
+[Benchmark]
+public void BulkDisposal_List()
+{
+    // Add N objects
+    // Remove from end in while loop (current pattern)
+}
+```
+
+**Expected Outcome:** Better than sequential for List, similar for both
+
+#### Scenario C: Mixed Operations
+**Purpose:** Real-world usage with add/remove interleaved
+
+**Test:**
+```csharp
+[Benchmark]
+public void MixedOperations()
+{
+    // 70% adds, 30% removes
+    // Simulate realistic COM object lifecycle
+}
+```
+
+**Expected Outcome:** Tests practical performance under lock contention
+
+#### Scenario D: Memory Allocation
+**Purpose:** Compare memory overhead and GC pressure
+
+**Test:**
+```csharp
+[MemoryDiagnoser]
+[Benchmark]
+public void MemoryFootprint()
+{
+    // Track allocations for List vs HashSet
+    // Measure GC collections
+}
+```
+
+**Expected Outcome:** HashSet has higher base memory but better scaling
+
+### 3. Implementation Variants to Compare
+
+#### Variant 1: Current Implementation (Baseline)
+```csharp
+private List<ICOMObject> _globalObjectList = new List<ICOMObject>();
+```
+- Time: O(n) per removal
+- Memory: O(n)
+- Thread-safety: Manual locking required
+
+#### Variant 2: HashSet (Proposed)
+```csharp
+private HashSet<ICOMObject> _globalObjectList = new HashSet<ICOMObject>();
+```
+- Time: O(1) per removal
+- Memory: O(n) with higher constant factor
+- Thread-safety: Manual locking required
+- **Caveat:** Requires proper `GetHashCode()` and `Equals()` implementation on ICOMObject
+
+#### Variant 3: Dictionary (Alternative)
+```csharp
+private Dictionary<IntPtr, ICOMObject> _globalObjectList = new Dictionary<IntPtr, ICOMObject>();
+```
+- Time: O(1) per removal by key
+- Memory: O(n)
+- Thread-safety: Manual locking required
+- **Benefit:** Can key by COM pointer for guaranteed uniqueness
+
+#### Variant 4: ConcurrentDictionary (Thread-safe)
+```csharp
+private ConcurrentDictionary<IntPtr, ICOMObject> _globalObjectList = new ConcurrentDictionary<IntPtr, ICOMObject>();
+```
+- Time: O(1) per removal (average)
+- Memory: O(n) with higher overhead
+- Thread-safety: Built-in lock-free operations
+- **Benefit:** Reduces lock contention
+
+### 4. Metrics to Capture
+
+**Performance Metrics:**
+- Mean execution time
+- Median execution time
+- 99th percentile (p99)
+- Standard deviation
+- Operations per second
+
+**Memory Metrics:**
+- Total allocated bytes
+- Gen 0/1/2 GC collections
+- Memory footprint per object count
+
+**Scalability Metrics:**
+- Time complexity verification (plot time vs N)
+- Memory scaling (plot memory vs N)
+
+**Configuration:**
+```csharp
+[MemoryDiagnoser]
+[SimpleJob(RuntimeMoniker.Net48)]
+[SimpleJob(RuntimeMoniker.Net60)]
+[SimpleJob(RuntimeMoniker.Net80)]
+public class RemoveObjectBenchmark
+{
+    [Params(10, 100, 1000, 10000)]
+    public int ObjectCount;
+}
+```
+
+### 5. Implementation Steps
+
+1. ✅ Research existing test/benchmark infrastructure
+2. ⏳ Design benchmark scenarios (this document)
+3. ⏳ Create `NetOffice.Benchmarks` project
+4. ⏳ Implement benchmark classes for all variants
+5. ⏳ Create mock ICOMObject implementation for testing
+6. ⏳ Run benchmarks and collect data
+7. ⏳ Generate comparison charts/tables
+8. ⏳ Document findings and recommendations
+9. ⏳ Update Issue #221 with results
+
+### 6. Expected Deliverables
+
+**Code:**
+- `NetOffice.Benchmarks` project
+- `RemoveObjectListBenchmark.cs` - Main benchmark class
+- `CoreVariants.cs` - Different implementation approaches
+- `MockCOMObject.cs` - Test harness
+
+**Documentation:**
+- `BenchmarkResults.md` - Raw benchmark output
+- `PerformanceAnalysis.md` - Analysis and recommendations
+- Charts/graphs showing performance comparison
+- Memory profiling results
+
+**GitHub:**
+- Comment on Issue #221 with findings
+- Recommendation for implementation change
+- Pull request (if proceeding with fix)
+
+### 7. Success Criteria
+
+**Validation Goals:**
+1. Confirm O(n²) vs O(n) complexity difference
+2. Demonstrate measurable performance improvement (target: >10x for N=10,000)
+3. Verify no regression in memory usage (acceptable: <2x increase)
+4. Ensure thread-safety is maintained
+5. Validate that `ICOMObject.Equals()` and `GetHashCode()` work correctly
+
+**Decision Points:**
+- If HashSet shows >10x improvement with <2x memory increase → Recommend implementation
+- If thread contention is significant → Consider ConcurrentDictionary
+- If ICOMObject equality is unreliable → Use Dictionary with IntPtr key
+
+### 8. Risks and Considerations
+
+**Risks:**
+1. **Equality Implementation:** ICOMObject must implement proper equality for HashSet
+2. **Breaking Changes:** Behavior change if list order was relied upon
+3. **Memory Overhead:** HashSet has higher memory overhead than List
+4. **COM Interop Complexity:** Performance may vary with real COM objects
+
+**Mitigations:**
+1. Verify ICOMObject.Equals() implementation before testing
+2. Check codebase for any order-dependent code
+3. Measure memory increase and document trade-offs
+4. Consider hybrid approach if needed
+
+### 9. Future Enhancements
+
+If benchmarking infrastructure proves valuable:
+- Benchmark other performance-critical paths
+- Add continuous performance monitoring to CI/CD
+- Create performance regression tests
+- Profile memory leaks in COM object lifecycle
+
+## References
+
+- Issue: https://github.com/NetOfficeFw/NetOffice/issues/221
+- Code: `Source/NetOffice/Core.cs:96, 1346-1367`
+- BenchmarkDotNet: https://benchmarkdotnet.org/
+- Related: Performance analysis of .NET collections
+
+## Notes
+
+This plan assumes:
+- Modern .NET SDK is available for BenchmarkDotNet
+- ICOMObject has stable equality/hash implementation
+- Benchmark runs can be performed on development machine
+- Results will be validated across different .NET runtime versions