Skip to content

Commit 4091a41

Browse files
fix(security): replace 10 generic catch blocks with specific handlers
Fixed MessageProcessingService.cs (4 instances) and MetricsService.cs (6 instances) Refs: CodeQL cs/catch-of-all-exceptions
1 parent 94afeb3 commit 4091a41

File tree

10 files changed

+3378
-10
lines changed

10 files changed

+3378
-10
lines changed

ALERT_ANALYSIS_SUMMARY.md

Lines changed: 121 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,121 @@
1+
# CodeQL Alert Analysis - Updated Summary
2+
3+
**Date**: October 31, 2025
4+
**Total Alerts**: 362 (360 open + 2 fixed)
5+
6+
## ✅ Analysis Complete
7+
8+
After filtering out stale alerts from Docker scans and build artifacts, here's what needs to be fixed:
9+
10+
### Alert Breakdown
11+
12+
| Priority | Count | Estimated Effort | Issues |
13+
|----------|-------|------------------|---------|
14+
| **HIGH** | **81** | **3-4 days** | Resource leaks, null safety, error handling |
15+
| **MEDIUM** | **16** | **4 hours** | Performance (Dictionary access) |
16+
| **LOW** | **43** | **4-6 hours** | Code quality improvements |
17+
| **STALE** | **220** | **2 min (script)** | Docker/build artifacts to dismiss |
18+
| **TOTAL** | **360** | **~5 days** | - |
19+
20+
### High Priority Issues (81 alerts)
21+
22+
1. **Generic Catch Blocks (39)** - Makes debugging difficult
23+
- Most common in: `MessageProcessingService.cs`, `SandboxExecutorService.cs`, `LeasePullService.cs`
24+
- Fix: Replace `catch (Exception ex)` with specific exception types
25+
- Effort: 1.5 days
26+
27+
2. **Resource Disposal (38)** - Memory leaks
28+
- 37 × `cs/local-not-disposed`
29+
- 1 × `cs/missed-using-statement`
30+
- Fix: Add `using` statements for IDisposable objects
31+
- Effort: 1 day
32+
33+
3. **Null Dereference (4)** - Potential crashes
34+
- Fix: Add null checks or use `?.` operator
35+
- Effort: 2 hours
36+
37+
### Medium Priority (16 alerts)
38+
39+
4. **Inefficient Dictionary Access (16)** - Performance
40+
- Pattern: `if (dict.ContainsKey(key)) { var val = dict[key]; }`
41+
- Fix: Use `dict.TryGetValue(key, out var val)` instead
42+
- Effort: 4 hours
43+
44+
### Low Priority (43 alerts)
45+
46+
5. **Path.Combine (22)** - Cross-platform compatibility
47+
6. **Useless Assignments (8)** - Dead code
48+
7. **LINQ Optimizations (5)** - Code clarity
49+
8. **Useless Casts (6)** - Code clarity
50+
9. **Nested Ifs (2)** - Readability
51+
52+
## Top 10 Files with Most Issues
53+
54+
| File | Alert Count | Priority |
55+
|------|-------------|----------|
56+
| `tests/ControlPlane.Api.Tests/MTlsIntegrationTests.cs` | 17 | Mix |
57+
| `tests/ControlPlane.Api.Tests/InvoiceClassifierAgentTests.cs` | 11 | Mix |
58+
| `tests/Node.Runtime.Tests/Services/MessageProcessingServiceTests.cs` | 10 | High |
59+
| `src/Node.Runtime/Services/SandboxExecutorService.cs` | 9 | **High** |
60+
| `src/Node.Runtime/Services/LeasePullService.cs` | 7 | **High** |
61+
| `src/ControlPlane.Api/Services/MetricsService.cs` | 6 | Medium |
62+
| `tests/Node.Runtime.Tests/InvoiceClassifierIntegrationTests.cs` | 5 | Mix |
63+
| `tests/ControlPlane.Api.Tests/LeaseServiceLogicTests.cs` | 5 | Mix |
64+
| `src/Node.Runtime/Program.cs` | 5 | **High** |
65+
| `tests/Node.Runtime.Tests/Integration/DLQHandlingIntegrationTests.cs` | 5 | Mix |
66+
67+
## Recommended Action Plan
68+
69+
### Step 1: Clean Up (5 minutes)
70+
```powershell
71+
.\dismiss-stale-alerts.ps1
72+
```
73+
- Dismisses 220 stale alerts
74+
- Reduces open alerts: 360 → 140
75+
76+
### Step 2: High Priority Fixes (3-4 days)
77+
Focus on production services first:
78+
1. `SandboxExecutorService.cs` (9 alerts)
79+
2. `LeasePullService.cs` (7 alerts)
80+
3. `MessageProcessingService.cs` (see detailed examples in docs)
81+
4. `Program.cs` files
82+
83+
### Step 3: Medium Priority (4 hours)
84+
- Fix Dictionary access patterns across codebase
85+
86+
### Step 4: Low Priority (4-6 hours)
87+
- Bulk fixes for code quality
88+
- Can be done incrementally
89+
90+
### Step 5: Prevention
91+
- Create CodeQL config to exclude build artifacts
92+
- Enable Dependabot
93+
- Add to CI/CD checks
94+
95+
## Files Created
96+
97+
-`SECURITY_REMEDIATION_PLAN.md` - Complete remediation guide (updated)
98+
-`CODEQL_CLEANUP_GUIDE.md` - Why stale alerts exist and how to fix
99+
-`docs/SECURITY_FIX_EXAMPLES.md` - Code examples for each issue type
100+
-`dismiss-stale-alerts.ps1` - Automated cleanup script
101+
-`valid-alerts.xml` - Filtered list of 140 real issues
102+
103+
## Next Steps
104+
105+
1. **Review** the remediation plan
106+
2. **Run** `dismiss-stale-alerts.ps1` to clean up stale alerts
107+
3. **Start** with high-priority fixes in production services
108+
4. **Track** progress using the todo list
109+
5. **Configure** CodeQL exclusions to prevent future stale alerts
110+
111+
## Success Metrics
112+
113+
- ✅ Stale alerts reduced from 220 to 0
114+
- ⏳ High priority alerts: 81 → 0 (target: 2 weeks)
115+
- ⏳ All alerts: 140 → 0 (target: 3 weeks)
116+
- ⏳ Dependabot enabled
117+
- ⏳ CodeQL config updated
118+
119+
---
120+
121+
**Ready to start?** The remediation plan is solid and all tools are ready! 🚀

CODEQL_CLEANUP_GUIDE.md

Lines changed: 182 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,182 @@
1+
# CodeQL Alert Cleanup Summary
2+
3+
**Date**: October 31, 2025
4+
**Issue**: 360 open CodeQL alerts, but most reference non-existent files
5+
6+
## Problem Identified
7+
8+
Your repository shows **360 open CodeQL security alerts**, but investigation reveals:
9+
10+
| Category | Count | Status |
11+
|----------|-------|--------|
12+
| **Total Open Alerts** | 360 | - |
13+
| **Stale Alerts (non-existent files)** | 220 | ❌ Need dismissal |
14+
| **Valid Alerts (actual source code)** | 140 | ✅ Need remediation |
15+
| **Fixed Alerts** | 2 | ✅ Already resolved |
16+
17+
## Root Cause
18+
19+
CodeQL scanned more than just your source code:
20+
21+
### 1. Docker Container Scans (154 alerts)
22+
**Files**: `dylan-mccarthy/scalable-process-agent-system/control-plane` and `node-runtime`
23+
24+
These alerts came from scanning **compiled Docker container images** instead of source code. When you build Docker images, CodeQL scanned the running containers or image layers.
25+
26+
**Why this happened**: Your CI/CD pipeline likely runs CodeQL after building containers, and it picked up code inside the containers.
27+
28+
### 2. Build Artifacts (62 alerts)
29+
**Files**: `src/*/obj/Release/net9.0/...`
30+
31+
These are **auto-generated files** from .NET compilation:
32+
- `Protos/LeaseService.cs` (30 alerts) - gRPC generated code
33+
- `LeaseService.cs` (29 alerts) - gRPC client/server code
34+
- `RegexGenerator.g.cs` (3 alerts) - C# source generator output
35+
36+
**Why this happened**: CodeQL scanned the `obj/` build output folders.
37+
38+
### 3. Node.js Dependencies (2 alerts)
39+
**Files**: `usr/local/lib/node_modules/npm/...`
40+
41+
These are **npm package files** from the admin-ui or container Node.js installation.
42+
43+
**Why this happened**: CodeQL scanned node_modules or container filesystem.
44+
45+
### 4. Container Runtime Paths (2 alerts)
46+
**Files**: `app/ControlPlane.Api.deps.json`, `app/Node.Runtime.deps.json`
47+
48+
These are **.NET dependency manifests** from inside running containers (`/app` folder).
49+
50+
**Why this happened**: CodeQL scanned container filesystem at runtime.
51+
52+
## Solution
53+
54+
### Step 1: Dismiss Stale Alerts
55+
56+
Run the provided cleanup script:
57+
58+
```powershell
59+
.\dismiss-stale-alerts.ps1
60+
```
61+
62+
This will:
63+
- ✅ Dismiss all 220 stale alerts with appropriate reasons
64+
- ✅ Preserve the 140 legitimate source code alerts
65+
- ✅ Show progress and summary
66+
67+
**Estimated time**: 2-3 minutes (with API rate limiting)
68+
69+
### Step 2: Prevent Future Stale Alerts
70+
71+
Update your CodeQL configuration to exclude these paths:
72+
73+
Create `.github/codeql/codeql-config.yml`:
74+
75+
```yaml
76+
name: "CodeQL Config"
77+
78+
paths-ignore:
79+
# Exclude build artifacts
80+
- '**/obj/**'
81+
- '**/bin/**'
82+
83+
# Exclude Node modules
84+
- '**/node_modules/**'
85+
- 'usr/local/lib/**'
86+
87+
# Exclude container paths
88+
- 'app/**'
89+
90+
# Exclude Docker image paths
91+
- 'dylan-mccarthy/**'
92+
93+
paths:
94+
# Only scan actual source code
95+
- 'src/**'
96+
- 'tests/**'
97+
- '.github/**'
98+
99+
queries:
100+
- uses: security-and-quality
101+
```
102+
103+
Then update `.github/workflows/code-quality.yml`:
104+
105+
```yaml
106+
- name: Initialize CodeQL
107+
uses: github/codeql-action/init@v3
108+
with:
109+
languages: csharp
110+
queries: security-and-quality
111+
config-file: .github/codeql/codeql-config.yml # Add this line
112+
```
113+
114+
### Step 3: Re-analyze Remaining Alerts
115+
116+
After cleanup, re-run the alert analysis to categorize the 140 valid alerts:
117+
118+
```powershell
119+
# Refresh alert data
120+
gh api --paginate "/repos/dylan-mccarthy/Scalable-Process-Agent-System/code-scanning/alerts?per_page=100&state=open" --jq '.[] | {number, rule: .rule.id, file: .most_recent_instance.location.path, line: .most_recent_instance.location.start_line}' > valid-alerts.json
121+
122+
# Group by rule
123+
Get-Content valid-alerts.json | ConvertFrom-Json | Group-Object rule | Select-Object Name, Count | Sort-Object Count -Descending
124+
```
125+
126+
This will give you the **actual** code quality issues to address.
127+
128+
## Expected Outcome
129+
130+
After running the cleanup:
131+
132+
```
133+
Before: 360 open alerts (220 stale + 140 valid)
134+
After: 140 open alerts (all valid source code issues)
135+
Cleanup: 220 alerts dismissed with reasons
136+
```
137+
138+
The 140 remaining alerts will be legitimate code quality issues in your actual source files that should be addressed through the security remediation plan.
139+
140+
## Why GitHub Didn't Auto-Close These
141+
142+
GitHub CodeQL **does not automatically close alerts** when:
143+
- Files are deleted from the repository
144+
- Files are in ignored paths (gitignore)
145+
- Alerts are from previous scans of different file paths
146+
147+
You must manually dismiss stale alerts or configure exclusions to prevent them.
148+
149+
## Next Steps
150+
151+
1. ✅ Run `dismiss-stale-alerts.ps1` to clean up 220 stale alerts
152+
2. ✅ Create `.github/codeql/codeql-config.yml` to exclude build artifacts
153+
3. ✅ Update workflow to use config file
154+
4. ✅ Re-run analysis on the 140 remaining valid alerts
155+
5. ✅ Use updated `SECURITY_REMEDIATION_PLAN.md` for the real issues
156+
157+
## Verification
158+
159+
After running the cleanup script, verify the results:
160+
161+
```powershell
162+
# Check current open alert count
163+
gh api /repos/dylan-mccarthy/Scalable-Process-Agent-System/code-scanning/alerts `
164+
--jq '[.[] | select(.state == "open")] | length'
165+
166+
# Expected: 140 (down from 360)
167+
```
168+
169+
## Questions?
170+
171+
- **Q: Will this delete legitimate security findings?**
172+
A: No, the script only dismisses alerts for files that don't exist in the repository.
173+
174+
- **Q: Can I undo the dismissals?**
175+
A: Yes, dismissed alerts can be reopened from the GitHub Security tab.
176+
177+
- **Q: Why are Docker images being scanned?**
178+
A: Your CI/CD pipeline may be running CodeQL after Docker build. Consider moving CodeQL earlier in the pipeline.
179+
180+
---
181+
182+
**Ready to proceed?** Run the cleanup script and then we'll tackle the real 140 code quality issues! 🚀

0 commit comments

Comments
 (0)