-
Notifications
You must be signed in to change notification settings - Fork 4
fix: improve orphaned process cleanup to handle child processes safely #119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Add port-based cleanup for LSP (2087) and Jupyter (8888) servers - Add isDeepnoteRelatedProcess() to verify processes are from deepnote-venvs - Expand process detection to include pylsp and jupyter child processes - Prevent killing external Jupyter/LSP servers not managed by extension - Fix 'Address already in use' errors from orphaned LSP servers The cleanup now uses a two-stage safety check: 1. Verify process is deepnote-related (from deepnote-venvs directory) 2. Verify process is orphaned (parent no longer exists) This prevents interference with user's manual Jupyter servers or other tools while still cleaning up stuck processes from crashed VSCode sessions.
📝 WalkthroughWalkthroughAdds robust, platform-aware orphaned-process cleanup to the Deepnote server starter. New private helpers: isDeepnoteRelatedProcess, isProcessAlive, killProcessGracefully, and cleanupProcessesByPort. cleanupOrphanedProcesses now performs port-based cleanup (port 2087 and 8888–8895) before a broader scan, filters candidates by command-line ownership, distinguishes main server processes via lock files from child processes, chooses kill strategy based on ownership/orphan status, and ensures lock-file removal and non-blocking startup behavior. No public API changes. Sequence Diagram(s)sequenceDiagram
participant Startup as Server Startup
participant PortCleaner as cleanupProcessesByPort
participant Scanner as Broad Scanner
participant Verifier as isDeepnoteRelatedProcess / isProcessAlive
participant Killer as killProcessGracefully
participant Locker as Lock-file manager
Startup->>PortCleaner: Check ports 2087, 8888–8895
PortCleaner->>PortCleaner: Discover listeners (netstat / lsof / ss)
PortCleaner->>Verifier: Verify deepnote-related & alive
alt listener is orphaned
PortCleaner->>Killer: Attempt graceful kill -> escalate
Killer->>Locker: Remove lock file if present
else listener retained
PortCleaner-->>Startup: Skip (owned/active)
end
Startup->>Scanner: Broad process enumeration (ps / tasklist)
Scanner->>Verifier: Filter by command line & liveness
Scanner->>Locker: Check lock-file ownership
alt process is orphaned or non-owner child
Scanner->>Killer: Attempt graceful kill -> escalate
Killer->>Locker: Cleanup lock files
else skip
Scanner-->>Startup: Skip (owner/active)
end
Killer-->>Startup: Report killed/skipped and final status
Possibly related PRs
Suggested reviewers
Pre-merge checks❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: ASSERTIVE Plan: Pro 📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (3)**/*.node.ts📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Files:
**/*.{ts,tsx}📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Files:
src/kernels/**/*.ts📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)
Files:
🪛 ast-grep (0.39.6)src/kernels/deepnote/deepnoteServerStarter.node.ts[warning] 848-848: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns. (regexp-from-variable) ⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #119 +/- ##
=====================================
Coverage 72% 72%
=====================================
Files 536 536
Lines 40837 40837
Branches 4990 4990
=====================================
Hits 29596 29596
Misses 9578 9578
Partials 1663 1663 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts (1)
782-819: Gate by isDeepnoteRelatedProcess before any orphan checks/kills.Prevents accidental kills of non-Deepnote orphaned Python daemons when Windows pushes all python.exe PIDs.
Apply:
// Check each process to determine if it should be killed for (const pid of candidatePids) { + // Final safety gate: ensure the PID is truly Deepnote-related + const deepnoteRelated = await this.isDeepnoteRelatedProcess(pid); + if (!deepnoteRelated) { + pidsToSkip.push({ pid, reason: 'not deepnote-related' }); + continue; + } // Check if there's a lock file for this PID (only main server processes have lock files) const lockData = await this.readLockFile(pid);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts(7 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.node.ts
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use
*.node.tsfor Desktop-specific implementations that require full file system access and Python environments
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Usel10n.t()for user-facing strings
Use typed error classes fromsrc/platform/errors/when throwing or handling errors
Use theILoggerservice instead ofconsole.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation withCancellationToken
**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility (public/protected/private) and then alphabetically
Do not add the Microsoft copyright header to new files
Use Uri.joinPath() to construct file paths instead of manual string concatenation
Add a blank line after groups of const declarations and before return statements for readability
Separate third-party imports from local file imports
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
src/kernels/**/*.ts
📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)
src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
🧬 Code graph analysis (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts (1)
src/platform/logging/index.ts (1)
logger(35-48)
- Use PowerShell Get-CimInstance on Windows (WMIC is deprecated) - Add WMIC fallback for older Windows systems - Add -ww flag to ps on Unix to prevent command line truncation - Use regex path matching for deepnote-venvs detection - Prevent false positives with proper path boundary matching
- Add isProcessAlive() helper to check process status cross-platform - Implement graceful kill on Unix (SIGTERM → SIGKILL escalation) - Implement graceful kill on Windows (taskkill → taskkill /F escalation) - Deduplicate PIDs before processing to avoid double-kills - Add ss fallback when lsof is unavailable on Unix - Filter for LISTEN sockets only using lsof -sTCP:LISTEN - Wait 1 second after SIGTERM before escalating to SIGKILL - Improve error handling and logging throughout cleanup process
- Change from single port 8888 to range 8888-8895 - Jupyter increments port when 8888 is busy, so we need to check multiple ports - Add timing measurement for port-based cleanup phase - Log cleanup duration for performance monitoring The loop adds minimal overhead as each port check is fast when no process is listening. Timing is logged at debug level for monitoring.
- Windows tasklist CSV output never includes command lines - Changed to two-step approach: collect all python.exe/pythonw.exe PIDs first - Then resolve each PID's command line using isDeepnoteRelatedProcess() - Only verified deepnote-related processes are added to candidate list - Kill logic remains gated by isDeepnoteRelatedProcess + orphan checks - Fixes false negatives where deepnote processes were missed on Windows
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts (1)
521-563: Use PowerShell CIM for PPID; WMIC is deprecated/missing on many systems.Prefer CIM, fallback to WMIC; then check parent existence. Keeps orphaning accurate on Win10/11.
Apply:
- if (process.platform === 'win32') { - // Windows: use WMIC to get parent process ID - const result = await processService.exec( - 'wmic', - ['process', 'where', `ProcessId=${pid}`, 'get', 'ParentProcessId'], - { throwOnStdErr: false } - ); - - if (result.stdout) { - const lines = result.stdout - .split('\n') - .filter((line) => line.trim() && !line.includes('ParentProcessId')); - if (lines.length > 0) { - const ppid = parseInt(lines[0].trim(), 10); - if (!isNaN(ppid)) { - // PPID of 0 means orphaned - if (ppid === 0) { - return true; - } - - // Check if parent process exists - const parentCheck = await processService.exec( - 'tasklist', - ['/FI', `PID eq ${ppid}`, '/FO', 'CSV', '/NH'], - { throwOnStdErr: false } - ); - - // Normalize and check stdout - const stdout = (parentCheck.stdout || '').trim(); - - // Parent is missing if: - // 1. stdout is empty - // 2. stdout starts with "INFO:" (case-insensitive) - // 3. stdout contains "no tasks are running" (case-insensitive) - if (stdout.length === 0 || /^INFO:/i.test(stdout) || /no tasks are running/i.test(stdout)) { - return true; // Parent missing, process is orphaned - } - - // Parent exists - return false; - } - } - } - } else { + if (process.platform === 'win32') { + // Windows: prefer PowerShell CIM, fallback to WMIC + let ppid: number | undefined; + try { + const ps = await processService.exec( + 'powershell.exe', + ['-NoProfile', '-Command', `(Get-CimInstance Win32_Process -Filter "ProcessId=${pid}").ParentProcessId`], + { throwOnStdErr: false } + ); + const raw = (ps.stdout || '').trim(); + const parsed = parseInt(raw, 10); + if (!isNaN(parsed)) ppid = parsed; + } catch {} + if (ppid === undefined) { + const wmic = await processService.exec( + 'wmic', + ['process', 'where', `ProcessId=${pid}`, 'get', 'ParentProcessId'], + { throwOnStdErr: false } + ); + const lines = (wmic.stdout || '') + .split('\n') + .map((l) => l.trim()) + .filter((l) => l && !/^ParentProcessId/i.test(l)); + const parsed = lines.length ? parseInt(lines[0], 10) : NaN; + if (!isNaN(parsed)) ppid = parsed; + } + if (ppid !== undefined) { + if (ppid === 0) return true; + const parentCheck = await processService.exec( + 'tasklist', + ['/FI', `PID eq ${ppid}`, '/FO', 'CSV', '/NH'], + { throwOnStdErr: false } + ); + const stdout = (parentCheck.stdout || '').trim(); + if (stdout.length === 0 || /^"?INFO:/i.test(stdout) || /no tasks are running/i.test(stdout)) { + return true; + } + return false; + } + } else {
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.node.ts
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use
*.node.tsfor Desktop-specific implementations that require full file system access and Python environments
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Usel10n.t()for user-facing strings
Use typed error classes fromsrc/platform/errors/when throwing or handling errors
Use theILoggerservice instead ofconsole.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation withCancellationToken
**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility (public/protected/private) and then alphabetically
Do not add the Microsoft copyright header to new files
Use Uri.joinPath() to construct file paths instead of manual string concatenation
Add a blank line after groups of const declarations and before return statements for readability
Separate third-party imports from local file imports
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
src/kernels/**/*.ts
📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)
src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
🧬 Code graph analysis (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts (2)
src/platform/logging/index.ts (1)
logger(35-48)build/launchWebUtils.js (1)
port(20-24)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build & Test
🔇 Additional comments (3)
src/kernels/deepnote/deepnoteServerStarter.node.ts (3)
593-645: Process-origin detection looks solid.PS CIM fallback + ps -ww + venv regex = robust. Please sanity‑test on Windows 11 and macOS to confirm long argv and WMIC‑absent cases.
673-699: Graceful kill helper is good.TERM then KILL with a short wait is fine.
851-853: Nice: port sweep and Windows Python PID discovery.Range scan 8888–8895 and 2‑step Windows candidate build look good.
Also applies to: 912-920
- Windows tasklist substring match was brittle and unreliable - Changed to use /FO CSV option for structured output - Parse CSV rows to extract PID from second column - Use exact numeric equality comparison instead of substring match - Skip INFO: messages and empty lines in output - Handle parse errors safely with try-catch per line - Return false on any failure while preserving throwOnStdErr: false - More robust and accurate process existence detection on Windows
- Add -nP flag to lsof invocations to avoid slow DNS/service lookups - Clean up lock files after successful process kills in port cleanup logic - Windows: cleanup lock files after both graceful and force kills - Unix: cleanup lock files after graceful kill completes - Uses existing deleteLockFile() helper which safely checks existence - Prevents stale lock files from accumulating after cleanup operations - Improves performance by avoiding unnecessary DNS reverse lookups
- Unix branch was using only string matching to add PIDs to candidates - Now validates each PID with isDeepnoteRelatedProcess() before adding - Added re-verification in kill/lock loop before any action - Matches Windows two-stage verification pattern - Prevents false positives from ps output string matching - Ensures only verified deepnote-related processes are considered - Adds safety check in case process changes between detection and action
- Changed from immediate force kill to graceful termination first - Windows: try taskkill without /F first, escalate to /F on failure - Unix: use existing killProcessGracefully helper (SIGTERM then SIGKILL) - Matches the graceful termination pattern from port-based cleanup - Preserves throwOnStdErr: false behavior throughout - Gives processes a chance to clean up before force termination - More consistent behavior across all cleanup code paths
- pylsp: Python Language Server Protocol (LSP server) - pythonw: Windows Python executable without console window - tlnp: ss command flag for TCP listening numeric ports
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts (1)
520-563: Replace WMIC + localized tasklist checks with PowerShell CIM for orphan detection.WMIC is deprecated and tasklist “INFO:” parsing is locale‑sensitive. Use CIM for PPID and parent liveness to harden this path.
Apply:
- if (process.platform === 'win32') { - // Windows: use WMIC to get parent process ID - const result = await processService.exec( - 'wmic', - ['process', 'where', `ProcessId=${pid}`, 'get', 'ParentProcessId'], - { throwOnStdErr: false } - ); - if (result.stdout) { - const lines = result.stdout - .split('\n') - .filter((line) => line.trim() && !line.includes('ParentProcessId')); - if (lines.length > 0) { - const ppid = parseInt(lines[0].trim(), 10); - if (!isNaN(ppid)) { - // PPID of 0 means orphaned - if (ppid === 0) { - return true; - } - - // Check if parent process exists - const parentCheck = await processService.exec( - 'tasklist', - ['/FI', `PID eq ${ppid}`, '/FO', 'CSV', '/NH'], - { throwOnStdErr: false } - ); - const stdout = (parentCheck.stdout || '').trim(); - if (stdout.length === 0 || /^INFO:/i.test(stdout) || /no tasks are running/i.test(stdout)) { - return true; // Parent missing, process is orphaned - } - return false; - } - } - } - } else { + if (process.platform === 'win32') { + // Windows: prefer CIM for PPID and parent liveness + let ppid = NaN; + try { + const p = await processService.exec( + 'powershell.exe', + ['-NoProfile','-Command', `(Get-CimInstance Win32_Process -Filter "ProcessId=${pid}").ParentProcessId`], + { throwOnStdErr: false } + ); + ppid = parseInt((p.stdout || '').trim(), 10); + } catch { + ppid = NaN; + } + if (!isNaN(ppid)) { + if (ppid === 0) return true; // no parent + try { + const parent = await processService.exec( + 'powershell.exe', + ['-NoProfile','-Command', `(Get-CimInstance Win32_Process -Filter "ProcessId=${ppid}") | Select-Object -ExpandProperty ProcessId`], + { throwOnStdErr: false } + ); + const exists = (parent.stdout || '').trim().length > 0; + return !exists; + } catch { + return true; // if we can’t query parent, treat as orphan for safety + } + } + } else {As per coding guidelines.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
cspell.json(1 hunks)src/kernels/deepnote/deepnoteServerStarter.node.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.node.ts
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use
*.node.tsfor Desktop-specific implementations that require full file system access and Python environments
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Usel10n.t()for user-facing strings
Use typed error classes fromsrc/platform/errors/when throwing or handling errors
Use theILoggerservice instead ofconsole.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation withCancellationToken
**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility (public/protected/private) and then alphabetically
Do not add the Microsoft copyright header to new files
Use Uri.joinPath() to construct file paths instead of manual string concatenation
Add a blank line after groups of const declarations and before return statements for readability
Separate third-party imports from local file imports
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
src/kernels/**/*.ts
📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)
src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
🧬 Code graph analysis (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts (1)
src/platform/logging/index.ts (1)
logger(35-48)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Build & Package Extension
- GitHub Check: Build & Test
🔇 Additional comments (1)
cspell.json (1)
49-55: LGTM.New words align with usage (pylsp/pythonw/ss -tlnp). No issues.
- Changed from substring match to regex-based port extraction - Prevents false positives (e.g., port 8888 matching 88880) - Parses LocalAddress column and extracts port number explicitly - Handles both IPv4 (0.0.0.0:8888) and IPv6 ([::]:8888) formats - Uses exact numeric equality comparison instead of string contains - More robust and accurate port-based process detection
- Some ss builds don't support 'sport = :port' filter syntax - Detect filter failure (empty stdout, stderr mentions 'filter') - Fallback to unfiltered 'ss -tlnp' and parse full output - Use regex to match exact port with word boundary (e.g., :8888\b) - Extract pid=NUMBER from matching lines only - Prevents false positives (port 8888 won't match 88880) - Maintains Set<number> for automatic PID deduplication - Preserves throwOnStdErr: false throughout - Handles null/empty output safely
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
saltenasl
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't work, i force quit code which then meant kernel never started for me anymore until i manually killed the process running on port 2087 and then it started working again. The error was in starting LSP via toolkit (I haven't copied it unfortunately)
Here's a comprehensive pull request description:
Summary
Implements comprehensive orphaned process cleanup to prevent "Address already in use" errors when restarting VSCode after crashes or force quits. The cleanup system safely detects and terminates orphaned deepnote-toolkit processes (including LSP servers and Jupyter servers) from previous sessions while preserving active processes.
Problem
When VSCode crashes or is force-quit while a Deepnote notebook is open, the deepnote-toolkit server processes (main server, Python LSP server on port 2087, Jupyter server on port 8888+) become orphaned and continue holding their ports. On restart, attempting to start new servers fails with "Address already in use" errors, requiring manual process cleanup.
Solution
Implemented a multi-layered orphaned process cleanup system that runs on extension activation:
1. Port-Based Cleanup
lsof(withssfallback) on Unix,netstaton Windows2. Process Name-Based Cleanup
deepnote_toolkit server(main server)pylspon port 2087 (LSP server child process)jupyterwith deepnote context (Jupyter child process)3. Two-Stage Safety Verification
Every process must pass both checks before being killed:
Deepnote-Related Check (
isDeepnoteRelatedProcess):ps -wwwith full command linedeepnote-venvsdirectory or containsdeepnote_toolkit/[\\/](deepnote-venvs)[\\/]/iOrphaned Check (
isProcessOrphaned):4. Graceful Termination
All kill operations attempt graceful termination before force kill:
Windows:
taskkill /T /PID(graceful)taskkill /F /T /PIDon failure/Tflag kills entire process treeUnix:
5. Lock File Management
Key Features
Safety
/Tflag)Performance
-nPflag on lsof (no DNS/service lookups, ~10-50x faster)Set<number>Robustness
-wwflag on ps (prevents command line truncation)Technical Details
Windows Process Detection
Unix Process Detection
Process Liveness Check
tasklist /FO CSVoutputkill -0 ${pid}(signal 0 = existence check)Testing
Tested scenarios:
Performance Impact
Breaking Changes
None. This is purely additive functionality that runs on extension activation.
Summary by CodeRabbit
Bug Fixes
Chores