-
Notifications
You must be signed in to change notification settings - Fork 4
fix: auto-kill orphaned processes (isolated to kill only those started in the given window) #21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughAdds a lock-file–based ownership mechanism to manage Deepnote server processes. DeepnoteServerStarter now creates a per-window sessionId and lockFileDir, writes a JSON lock file after server start, deletes it on stop/dispose, and performs session-aware orphan cleanup. Orphan detection uses OS-specific parent-process checks (Unix via PPID; Windows via ParentProcessId/tasklist). Cleanup only terminates processes verified as orphaned; otherwise skips and logs reasons. New private methods handle lock-file lifecycle and orphan detection. The IDeepnoteServerStarter interface now adds an explicit dispose(): Promise and no longer extends vscode.Disposable. Sequence Diagram(s)sequenceDiagram
autonumber
actor User as VS Code Window
participant Starter as DeepnoteServerStarter
participant OS as OS/Process Table
participant FS as LockFileDir (FS)
rect rgba(227,242,253,0.6)
note over Starter,FS: Server startup
User->>Starter: startServerImpl()
Starter->>OS: launch server process
OS-->>Starter: PID
Starter->>FS: writeLockFile(PID, {sessionId, pid, timestamp})
FS-->>Starter: OK
Starter-->>User: ready
end
rect rgba(232,245,233,0.6)
note over Starter,FS: Server stop/dispose
User->>Starter: stopServerImpl()/dispose()
Starter->>FS: deleteLockFile(PID)
FS-->>Starter: OK
Starter-->>User: done
end
sequenceDiagram
autonumber
participant Starter as DeepnoteServerStarter
participant FS as LockFileDir (FS)
participant OS as OS/Process Table
rect rgba(255,249,196,0.6)
note over Starter,OS: Orphan cleanup sweep
Starter->>OS: list candidate PIDs
OS-->>Starter: [pid1, pid2, ...]
loop for each PID
Starter->>FS: readLockFile(PID)
alt lock exists and sessionId != current
Starter->>OS: isProcessOrphaned(PID)
alt orphaned
Starter->>OS: kill(PID)
OS-->>Starter: killed
Starter->>FS: deleteLockFile(PID)
else not orphaned
Starter-->>Starter: skip (log reason)
end
else no lock or same session
Starter->>OS: isProcessOrphaned(PID)
alt orphaned
Starter->>OS: kill(PID)
OS-->>Starter: killed
Starter->>FS: deleteLockFile(PID)
else not orphaned
Starter-->>Starter: skip (log reason)
end
end
end
end
note over OS,Starter: OS checks
OS-->>Starter: Unix: PPID exists? / Windows: ParentProcessId + tasklist
Pre-merge checks❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md(1 hunks)src/kernels/deepnote/deepnoteServerStarter.node.ts(10 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.node.ts
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use
*.node.tsfor Desktop-specific implementations that require full file system access and Python environments
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Usel10n.t()for user-facing strings
Use typed error classes fromsrc/platform/errors/when throwing or handling errors
Use theILoggerservice instead ofconsole.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation withCancellationToken
**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility, then alphabetically
Separate third-party imports from local file imports
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
src/kernels/**/*.ts
📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)
src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
🧬 Code graph analysis (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts (2)
src/platform/common/uuid.ts (1)
generateUuid(10-61)src/platform/logging/index.ts (1)
logger(35-48)
🪛 markdownlint-cli2 (0.18.1)
ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md
3-3: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
6-6: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
9-9: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
14-14: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
18-18: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
28-28: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
32-32: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
42-42: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
43-43: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
49-49: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
53-53: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
54-54: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
60-60: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
72-72: Trailing spaces
Expected: 0 or 2; Actual: 3
(MD009, no-trailing-spaces)
80-80: Trailing spaces
Expected: 0 or 2; Actual: 6
(MD009, no-trailing-spaces)
97-97: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
100-100: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
101-101: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
108-108: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
109-109: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
114-114: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
122-122: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts(10 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.node.ts
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use
*.node.tsfor Desktop-specific implementations that require full file system access and Python environments
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Usel10n.t()for user-facing strings
Use typed error classes fromsrc/platform/errors/when throwing or handling errors
Use theILoggerservice instead ofconsole.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation withCancellationToken
**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility, then alphabetically
Separate third-party imports from local file imports
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
src/kernels/**/*.ts
📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)
src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
Previously, cleanupOrphanedProcesses() would force-kill every process
matching 'deepnote_toolkit server', which could terminate active servers
from other VS Code windows.
This commit implements a sophisticated cleanup mechanism that only kills
genuine orphans by:
1. Session Management:
- Each VS Code window gets a unique session ID (UUID)
- Lock files stored in OS temp dir track server ownership
- Lock files contain: sessionId, pid, timestamp
2. Lock File Lifecycle:
- Created when server starts successfully
- Deleted when server stops normally
- Deleted when orphaned process is killed
- Cleaned up during extension disposal
3. Orphan Detection:
- Verifies parent process exists before killing
- Unix: uses 'ps -o ppid=' and checks if PPID is 1 or parent missing
- Windows: uses 'wmic' to get parent PID and verifies existence
- Only kills if process is truly orphaned (no parent or parent is init)
4. Smart Cleanup Logic:
- Checks lock file for each candidate process
- If lock file exists with different session ID:
* Verifies process is orphaned before killing
* Skips if parent exists (active in another window)
- If no lock file exists:
* Still verifies orphan status (backward compatibility)
* Skips if parent exists
- Comprehensive logging of all decisions
Benefits:
- Multi-window safe: won't kill servers from other VS Code windows
- Robust orphan detection using OS-level parent process verification
- Full audit trail with detailed logging
- Automatic cleanup of stale lock files
- Backward compatible with processes from older versions
Added documentation in ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md
The Windows orphan detection was incorrectly treating the tasklist message 'INFO: No tasks are running which match the specified criteria.' as indicating a live parent process, when it actually means the parent process doesn't exist. Changes: - Normalize and trim stdout from tasklist before checking - Return true (orphaned) when stdout is empty - Return true (orphaned) when stdout starts with 'INFO:' (case-insensitive) - Return true (orphaned) when stdout contains 'no tasks are running' (case-insensitive) - Return true (orphaned) when ppid === 0 - Return false (not orphaned) only when parent process actually exists This ensures that processes whose parents have terminated are correctly identified as orphaned and cleaned up, while processes with active parents in other VS Code windows are preserved.
859f89c to
608ca12
Compare
| }); | ||
|
|
||
| // Clean up any orphaned deepnote-toolkit processes from previous sessions | ||
| this.cleanupOrphanedProcesses().catch((ex) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this happen in getOrStartServer or dispose instead? Normally, you don't want to run business logic in the constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good practice, I definitely agree. We could have a separate method we'd call after instantiation.
Having it in the getOrStartServer does not seem like a good idea according to AI:
Current approach (cleanup in constructor):
- ✅ Runs once at extension startup
- ✅ Cleans up orphans before any server is needed
- ✅ Simpler - happens automatically when the class is instantiated
- ❌ Runs even if user never opens a Deepnote file
- ❌ Adds to extension activation time
Alternative approach (cleanup in getOrStartServer):
- ✅ Only runs when actually needed (lazy)
- ✅ Doesn't slow down extension activation if Deepnote isn't used
- ❌ Adds latency to first server start
- ❌ More complex - need to track if cleanup has run
- ❌ Could run multiple times if called concurrently before first completion
Current Approach (Constructor) vs. Moving to getOrStartServer()
I would recommend keeping it in the constructor for these reasons:
-
It's a singleton - The class is registered as
addSingleton, so the constructor only runs once during extension activation, not repeatedly. -
Cleanup is fast - The cleanup operation is relatively quick (just checking processes and killing orphans). It's not a heavy operation that would significantly impact extension activation time.
-
Better user experience - When a user opens a Deepnote file, they want the server to start quickly. Having cleanup already done means no unexpected delay on first use.
-
Simpler code - No need to track whether cleanup has run or handle race conditions if multiple files are opened simultaneously before cleanup completes.
-
Cleaner separation of concerns - Constructor handles initialization/cleanup,
getOrStartServer()focuses on server management.
However, if you wanted to move it to getOrStartServer(), you would need to:
- Add a flag like
private cleanupCompleted = false - Add a promise to track in-flight cleanup:
private cleanupPromise?: Promise<void> - Check/await this promise at the start of
getOrStartServer() - Handle concurrent calls properly
My recommendation: Keep it in the constructor. The current design is cleaner and the performance impact is negligible since:
- It only runs once (singleton)
- It runs in the background (fire-and-forget with
.catch()) - It doesn't block extension activation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I moved it to activate() and registered this in the service manager that will automatically call it on start. Tested and it logs the messages from the logic so it seems to work fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome 🎉
… pattern - Move initialization work from constructor to activate() method - Implement IExtensionSyncActivationService interface - Register as activation service in serviceRegistry - Follow codebase convention of minimal constructors with explicit activation - Update documentation to reflect new activation pattern
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md(1 hunks)src/kernels/deepnote/deepnoteServerStarter.node.ts(5 hunks)src/notebooks/serviceRegistry.node.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.node.ts
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use
*.node.tsfor Desktop-specific implementations that require full file system access and Python environments
Files:
src/notebooks/serviceRegistry.node.tssrc/kernels/deepnote/deepnoteServerStarter.node.ts
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Usel10n.t()for user-facing strings
Use typed error classes fromsrc/platform/errors/when throwing or handling errors
Use theILoggerservice instead ofconsole.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation withCancellationToken
**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility, then alphabetically
Separate third-party imports from local file imports
Files:
src/notebooks/serviceRegistry.node.tssrc/kernels/deepnote/deepnoteServerStarter.node.ts
src/kernels/**/*.ts
📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)
src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations
Files:
src/kernels/deepnote/deepnoteServerStarter.node.ts
🧠 Learnings (2)
📚 Learning: 2025-09-03T12:55:29.175Z
Learnt from: CR
PR: deepnote/vscode-extension#0
File: .github/instructions/ipywidgets.instructions.md:0-0
Timestamp: 2025-09-03T12:55:29.175Z
Learning: Applies to src/notebooks/controllers/ipywidgets/serviceRegistry.{node,web}.ts : Register DI services in platform-specific registries (serviceRegistry.node.ts vs serviceRegistry.web.ts)
Applied to files:
src/notebooks/serviceRegistry.node.ts
📚 Learning: 2025-09-12T12:52:09.726Z
Learnt from: CR
PR: deepnote/vscode-deepnote#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T12:52:09.726Z
Learning: Applies to src/notebooks/deepnote/deepnoteActivationService.ts : deepnoteActivationService.ts wires VSCode activation
Applied to files:
src/notebooks/serviceRegistry.node.ts
🧬 Code graph analysis (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts (2)
src/kernels/deepnote/types.ts (3)
IDeepnoteServerStarter(90-90)IDeepnoteServerStarter(91-116)DeepnoteServerInfo(118-122)src/platform/common/uuid.ts (1)
generateUuid(10-61)
🪛 markdownlint-cli2 (0.18.1)
ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md
22-22: Fenced code blocks should be surrounded by blank lines
(MD031, blanks-around-fences)
🔇 Additional comments (3)
src/kernels/deepnote/deepnoteServerStarter.node.ts (3)
299-377: LGTM! Graceful shutdown implementation.The enhanced disposal flow correctly:
- Waits for pending ops with timeout
- Attempts SIGTERM before SIGKILL
- Tracks exit promises with timeouts
- Cleans up lock files after process termination
449-523: Orphan detection correctly implemented.Platform-specific checks properly handle:
- Windows: PPID via wmic, parent existence via tasklist, handles INFO: message
- Unix: PPID via ps -o ppid=, parent check via ps -o pid= (headerless)
- Conservative fallback returns false on errors
Past review fixes incorporated correctly.
560-572: Verify PID extraction patterns
- Windows: run
tasklist /FI "IMAGENAME eq python.exe" /FO CSV /NHlocally and confirm lines like"python.exe","12345",…; adjust the regex to /"python.exe","(\d+)"/i if casing varies.- Unix: splitting on whitespace and using the second field correctly extracts the PID from
ps aux.
Replaces #19
Fully vibe coded via Augment and Coderabbit. It seems to work for me.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation