Skip to content

Commit 6af2eee

Browse files
author
Eric Oliver
committed
plan to implement cleanup
1 parent 83be0f6 commit 6af2eee

File tree

1 file changed

+229
-0
lines changed

1 file changed

+229
-0
lines changed
Lines changed: 229 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,229 @@
1+
# CLI Process Exit Cleanup - Architectural Solution
2+
3+
## Executive Summary
4+
5+
The reviewer's feedback is **absolutely correct**. The `process.exit(0)` call at line 354 in `src/cli/cli-entry.ts` is a band-aid solution that was added when the CLI was hanging and not exiting naturally. This prevents proper cleanup of resources and is technically debt that needs to be addressed.
6+
7+
## Root Cause Validation
8+
9+
### Evidence of Band-Aid Solution
10+
11+
1. **Banner text**: "Press Ctrl+C twice to **force exit**" (banner.ts:23)
12+
2. **Existing force exit logic**: Timeout mechanism already exists (index.ts:210)
13+
3. **Test comments**: "Complete the task to avoid hanging test" (BatchProcessor.test.ts:250)
14+
4. **Recovery mentions**: "hanging operations" (RecoveryManager.ts:72)
15+
16+
### Resources Keeping Event Loop Alive
17+
18+
Our analysis identified exactly what's preventing natural process exit:
19+
20+
```typescript
21+
// setInterval timers keeping process alive:
22+
- SessionManager.autoSaveTimer (line 330)
23+
- CLIMcpService.healthChecker (line 507)
24+
- MemoryOptimizer.gcInterval (line 51)
25+
- CLIUIService.spinner intervals (line 378)
26+
27+
// EventEmitters with active listeners:
28+
- SessionManager extends EventEmitter
29+
- BatchProcessor extends EventEmitter
30+
- NonInteractiveModeService extends EventEmitter
31+
32+
// Active network connections:
33+
- SSE/WebSocket MCP connections
34+
- StdioMcpConnection child processes
35+
36+
// Various setTimeout calls throughout codebase
37+
```
38+
39+
## Recommended Architecture
40+
41+
### Phase 1: Create Centralized CleanupManager
42+
43+
```mermaid
44+
graph TD
45+
A[CLI Entry Point] --> B[CleanupManager.performShutdown]
46+
B --> C[Timer Cleanup]
47+
B --> D[MCP Service Cleanup]
48+
B --> E[Performance Monitoring Cleanup]
49+
B --> F[Event Listener Cleanup]
50+
51+
C --> C1[Clear setInterval timers]
52+
C --> C2[Clear setTimeout timers]
53+
C --> C3[Cancel pending promises]
54+
55+
D --> D1[GlobalCLIMcpService.dispose]
56+
D --> D2[Force close SSE connections]
57+
D --> D3[SIGTERM → SIGKILL child processes]
58+
59+
E --> E1[memoryOptimizer.stopMonitoring]
60+
E --> E2[Clear GC intervals]
61+
E --> E3[Stop performance timers]
62+
63+
F --> F1[Remove all EventEmitter listeners]
64+
F --> F2[Close browser sessions]
65+
F --> F3[Emergency recovery cleanup]
66+
```
67+
68+
### Phase 2: Implementation Strategy
69+
70+
```typescript
71+
// New CleanupManager service
72+
export class CleanupManager {
73+
private static instance: CleanupManager
74+
private cleanupTasks: Array<() => Promise<void>> = []
75+
private isShuttingDown = false
76+
77+
static getInstance(): CleanupManager {
78+
if (!this.instance) {
79+
this.instance = new CleanupManager()
80+
}
81+
return this.instance
82+
}
83+
84+
registerCleanupTask(task: () => Promise<void>): void {
85+
this.cleanupTasks.push(task)
86+
}
87+
88+
async performShutdown(timeoutMs: number = 10000): Promise<void> {
89+
if (this.isShuttingDown) return
90+
this.isShuttingDown = true
91+
92+
getCLILogger().debug("[CleanupManager] Starting graceful shutdown...")
93+
94+
try {
95+
// Execute all cleanup tasks with timeout
96+
await Promise.race([
97+
Promise.allSettled(this.cleanupTasks.map((task) => task())),
98+
new Promise((_, reject) => setTimeout(() => reject(new Error("Cleanup timeout")), timeoutMs)),
99+
])
100+
101+
getCLILogger().debug("[CleanupManager] Graceful shutdown completed")
102+
103+
// Allow event loop to drain naturally
104+
setImmediate(() => {
105+
if (!process.exitCode) {
106+
process.exitCode = 0
107+
}
108+
})
109+
} catch (error) {
110+
getCLILogger().warn("[CleanupManager] Cleanup timeout, forcing exit:", error)
111+
// Last resort - but with proper logging
112+
setTimeout(() => process.exit(process.exitCode || 0), 1000)
113+
}
114+
}
115+
}
116+
```
117+
118+
### Phase 3: Replace process.exit(0) with Proper Cleanup
119+
120+
**Current problematic code (cli-entry.ts:352-354):**
121+
122+
```typescript
123+
// Exit successfully after non-interactive execution completes
124+
getCLILogger().debug("[cli-entry] Non-interactive execution completed successfully, exiting...")
125+
process.exit(0) // ❌ BAND-AID - Prevents cleanup!
126+
```
127+
128+
**Proposed replacement:**
129+
130+
```typescript
131+
// Perform proper cleanup before exit
132+
getCLILogger().debug("[cli-entry] Non-interactive execution completed, performing cleanup...")
133+
134+
try {
135+
const cleanupManager = CleanupManager.getInstance()
136+
137+
// Register essential cleanup tasks
138+
cleanupManager.registerCleanupTask(async () => {
139+
if (options.mcpAutoConnect) {
140+
const { GlobalCLIMcpService } = await import("./services/GlobalCLIMcpService")
141+
const globalMcpService = GlobalCLIMcpService.getInstance()
142+
await globalMcpService.dispose()
143+
}
144+
})
145+
146+
cleanupManager.registerCleanupTask(async () => {
147+
memoryOptimizer.stopMonitoring()
148+
})
149+
150+
cleanupManager.registerCleanupTask(async () => {
151+
// Clear any remaining timers/intervals
152+
// This would be implemented in each service
153+
})
154+
155+
// Perform graceful shutdown
156+
await cleanupManager.performShutdown()
157+
158+
getCLILogger().debug("[cli-entry] Cleanup completed successfully")
159+
} catch (error) {
160+
getCLILogger().error("[cli-entry] Cleanup failed:", error)
161+
process.exitCode = 1
162+
}
163+
164+
// Process will exit naturally once event loop drains
165+
```
166+
167+
## Implementation Priority
168+
169+
### High Priority (Immediate)
170+
171+
1. **Create CleanupManager service** - Centralized cleanup coordination
172+
2. **Replace process.exit(0)** - Remove the band-aid solution
173+
3. **Add MCP disposal cleanup** - Ensure GlobalCLIMcpService.dispose() is called
174+
4. **Add timer cleanup** - Clear setInterval/setTimeout keeping process alive
175+
176+
### Medium Priority
177+
178+
1. **Enhanced MCP disposal** - Force terminate stubborn child processes
179+
2. **EventEmitter cleanup** - Remove all listeners before exit
180+
3. **Browser session cleanup** - Close any open browser sessions
181+
4. **Performance monitoring cleanup** - Stop all monitoring services
182+
183+
### Low Priority
184+
185+
1. **Comprehensive testing** - Regression tests for hanging scenarios
186+
2. **Debug logging enhancement** - Better visibility into cleanup process
187+
3. **Timeout tuning** - Optimize cleanup timeouts for performance
188+
189+
## Benefits of This Approach
190+
191+
### ✅ Technical Benefits
192+
193+
- **Removes technical debt** - Eliminates the band-aid `process.exit(0)`
194+
- **Prevents resource leaks** - Proper cleanup of all resources
195+
- **Enables natural exit** - Process terminates when event loop drains
196+
- **Maintains reliability** - Fallback force-exit if cleanup hangs
197+
198+
### ✅ User Experience Benefits
199+
200+
- **Faster CLI response** - No hanging while waiting for cleanup
201+
- **Cleaner automation** - Reliable exit codes for scripts
202+
- **Better error handling** - Graceful shutdown with proper logging
203+
- **Reduced memory usage** - No lingering processes or connections
204+
205+
## Risk Mitigation
206+
207+
### Potential Risks
208+
209+
- Cleanup might take too long in some scenarios
210+
- Some services might resist graceful shutdown
211+
- Changes could affect interactive mode
212+
213+
### Mitigation Strategies
214+
215+
- **Graduated timeouts** - Quick cleanup with fallback force-exit
216+
- **Comprehensive testing** - Test both batch and interactive modes
217+
- **Rollback plan** - Keep existing logic as emergency fallback
218+
- **Monitoring** - Log cleanup performance to identify issues
219+
220+
## Success Criteria
221+
222+
- ✅ CLI exits naturally without `process.exit(0)`
223+
- ✅ All MCP connections disposed properly
224+
- ✅ No hanging timers or intervals
225+
- ✅ Clean exit codes (0 for success, 1 for error)
226+
- ✅ Total cleanup time < 10 seconds
227+
- ✅ No regression in interactive mode functionality
228+
229+
This architectural approach transforms the band-aid solution into a robust, maintainable cleanup system that addresses the root causes of the hanging issue.

0 commit comments

Comments
 (0)