fix: auto-kill orphaned processes (isolated to kill only those started in the given window) #21

jankuca · 2025-10-05T15:06:29Z

Replaces #19

Fully vibe coded via Augment and Coderabbit. It seems to work for me.

Summary by CodeRabbit

New Features
- Session-aware lock files to track server ownership across windows and explicit server disposal to improve lifecycle handling.
- Cross-platform orphan-process detection to avoid killing active servers.
Bug Fixes
- Prevents accidental termination of active servers in other windows.
- More reliable startup/shutdown by handling stale processes and removing stale locks.
Documentation
- Added design doc describing orphan cleanup lifecycle and edge cases.

coderabbitai · 2025-10-05T15:07:04Z

📝 Walkthrough

Walkthrough

Adds a lock-file–based ownership mechanism to manage Deepnote server processes. DeepnoteServerStarter now creates a per-window sessionId and lockFileDir, writes a JSON lock file after server start, deletes it on stop/dispose, and performs session-aware orphan cleanup. Orphan detection uses OS-specific parent-process checks (Unix via PPID; Windows via ParentProcessId/tasklist). Cleanup only terminates processes verified as orphaned; otherwise skips and logs reasons. New private methods handle lock-file lifecycle and orphan detection. The IDeepnoteServerStarter interface now adds an explicit dispose(): Promise and no longer extends vscode.Disposable.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User as VS Code Window
  participant Starter as DeepnoteServerStarter
  participant OS as OS/Process Table
  participant FS as LockFileDir (FS)

  rect rgba(227,242,253,0.6)
  note over Starter,FS: Server startup
  User->>Starter: startServerImpl()
  Starter->>OS: launch server process
  OS-->>Starter: PID
  Starter->>FS: writeLockFile(PID, {sessionId, pid, timestamp})
  FS-->>Starter: OK
  Starter-->>User: ready
  end

  rect rgba(232,245,233,0.6)
  note over Starter,FS: Server stop/dispose
  User->>Starter: stopServerImpl()/dispose()
  Starter->>FS: deleteLockFile(PID)
  FS-->>Starter: OK
  Starter-->>User: done
  end

sequenceDiagram
  autonumber
  participant Starter as DeepnoteServerStarter
  participant FS as LockFileDir (FS)
  participant OS as OS/Process Table

  rect rgba(255,249,196,0.6)
  note over Starter,OS: Orphan cleanup sweep
  Starter->>OS: list candidate PIDs
  OS-->>Starter: [pid1, pid2, ...]
  loop for each PID
    Starter->>FS: readLockFile(PID)
    alt lock exists and sessionId != current
      Starter->>OS: isProcessOrphaned(PID)
      alt orphaned
        Starter->>OS: kill(PID)
        OS-->>Starter: killed
        Starter->>FS: deleteLockFile(PID)
      else not orphaned
        Starter-->>Starter: skip (log reason)
      end
    else no lock or same session
      Starter->>OS: isProcessOrphaned(PID)
      alt orphaned
        Starter->>OS: kill(PID)
        OS-->>Starter: killed
        Starter->>FS: deleteLockFile(PID)
      else not orphaned
        Starter-->>Starter: skip (log reason)
      end
    end
  end
  end

  note over OS,Starter: OS checks
  OS-->>Starter: Unix: PPID exists? / Windows: ParentProcessId + tasklist

Pre-merge checks

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title succinctly captures the primary change of auto-killing orphaned processes limited to those started in the given window, reflecting the session-aware lock-file cleanup introduced in the PR. It is clear, specific, and avoids generic terms, giving reviewers an accurate summary of the fix.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a996ac and d2ef82a.

📒 Files selected for processing (2)

ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md (1 hunks)
src/kernels/deepnote/deepnoteServerStarter.node.ts (10 hunks)

🧰 Additional context used

📓 Path-based instructions (3)

**/*.node.ts

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use *.node.ts for Desktop-specific implementations that require full file system access and Python environments

Files:

src/kernels/deepnote/deepnoteServerStarter.node.ts

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Use l10n.t() for user-facing strings
Use typed error classes from src/platform/errors/ when throwing or handling errors
Use the ILogger service instead of console.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation with CancellationToken

**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility, then alphabetically
Separate third-party imports from local file imports

Files:

src/kernels/deepnote/deepnoteServerStarter.node.ts

src/kernels/**/*.ts

📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)

src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations

Files:

src/kernels/deepnote/deepnoteServerStarter.node.ts

🧬 Code graph analysis (1)

src/kernels/deepnote/deepnoteServerStarter.node.ts (2)

src/platform/common/uuid.ts (1)

generateUuid (10-61)

src/platform/logging/index.ts (1)

logger (35-48)

🪛 markdownlint-cli2 (0.18.1)

ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md

3-3: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

6-6: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

9-9: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

14-14: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

18-18: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

28-28: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

32-32: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

42-42: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

43-43: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

49-49: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

53-53: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

54-54: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

60-60: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

72-72: Trailing spaces
Expected: 0 or 2; Actual: 3

(MD009, no-trailing-spaces)

80-80: Trailing spaces
Expected: 0 or 2; Actual: 6

(MD009, no-trailing-spaces)

97-97: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

100-100: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

101-101: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

108-108: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

109-109: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

114-114: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

122-122: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md

src/kernels/deepnote/deepnoteServerStarter.node.ts

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d2ef82a and 10a4d66.

📒 Files selected for processing (1)

src/kernels/deepnote/deepnoteServerStarter.node.ts (10 hunks)

🧰 Additional context used

📓 Path-based instructions (3)

**/*.node.ts

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use *.node.ts for Desktop-specific implementations that require full file system access and Python environments

Files:

src/kernels/deepnote/deepnoteServerStarter.node.ts

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Use l10n.t() for user-facing strings
Use typed error classes from src/platform/errors/ when throwing or handling errors
Use the ILogger service instead of console.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation with CancellationToken

**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility, then alphabetically
Separate third-party imports from local file imports

Files:

src/kernels/deepnote/deepnoteServerStarter.node.ts

src/kernels/**/*.ts

📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)

src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations

Files:

src/kernels/deepnote/deepnoteServerStarter.node.ts

src/kernels/deepnote/deepnoteServerStarter.node.ts

Previously, cleanupOrphanedProcesses() would force-kill every process matching 'deepnote_toolkit server', which could terminate active servers from other VS Code windows. This commit implements a sophisticated cleanup mechanism that only kills genuine orphans by: 1. Session Management: - Each VS Code window gets a unique session ID (UUID) - Lock files stored in OS temp dir track server ownership - Lock files contain: sessionId, pid, timestamp 2. Lock File Lifecycle: - Created when server starts successfully - Deleted when server stops normally - Deleted when orphaned process is killed - Cleaned up during extension disposal 3. Orphan Detection: - Verifies parent process exists before killing - Unix: uses 'ps -o ppid=' and checks if PPID is 1 or parent missing - Windows: uses 'wmic' to get parent PID and verifies existence - Only kills if process is truly orphaned (no parent or parent is init) 4. Smart Cleanup Logic: - Checks lock file for each candidate process - If lock file exists with different session ID: * Verifies process is orphaned before killing * Skips if parent exists (active in another window) - If no lock file exists: * Still verifies orphan status (backward compatibility) * Skips if parent exists - Comprehensive logging of all decisions Benefits: - Multi-window safe: won't kill servers from other VS Code windows - Robust orphan detection using OS-level parent process verification - Full audit trail with detailed logging - Automatic cleanup of stale lock files - Backward compatible with processes from older versions Added documentation in ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md

The Windows orphan detection was incorrectly treating the tasklist message 'INFO: No tasks are running which match the specified criteria.' as indicating a live parent process, when it actually means the parent process doesn't exist. Changes: - Normalize and trim stdout from tasklist before checking - Return true (orphaned) when stdout is empty - Return true (orphaned) when stdout starts with 'INFO:' (case-insensitive) - Return true (orphaned) when stdout contains 'no tasks are running' (case-insensitive) - Return true (orphaned) when ppid === 0 - Return false (not orphaned) only when parent process actually exists This ensures that processes whose parents have terminated are correctly identified as orphaned and cleaned up, while processes with active parents in other VS Code windows are preserved.

…ection

Artmann · 2025-10-05T17:40:57Z

src/kernels/deepnote/deepnoteServerStarter.node.ts

+        });
+
+        // Clean up any orphaned deepnote-toolkit processes from previous sessions
+        this.cleanupOrphanedProcesses().catch((ex) => {


Should this happen in getOrStartServer or dispose instead? Normally, you don't want to run business logic in the constructor.

That's a good practice, I definitely agree. We could have a separate method we'd call after instantiation.

Having it in the getOrStartServer does not seem like a good idea according to AI:

Current approach (cleanup in constructor):

✅ Runs once at extension startup

✅ Cleans up orphans before any server is needed

✅ Simpler - happens automatically when the class is instantiated

❌ Runs even if user never opens a Deepnote file

❌ Adds to extension activation time

Alternative approach (cleanup in getOrStartServer):

✅ Only runs when actually needed (lazy)

✅ Doesn't slow down extension activation if Deepnote isn't used

❌ Adds latency to first server start

❌ More complex - need to track if cleanup has run

❌ Could run multiple times if called concurrently before first completion

Current Approach (Constructor) vs. Moving to getOrStartServer()

I would recommend keeping it in the constructor for these reasons:

It's a singleton - The class is registered as addSingleton, so the constructor only runs once during extension activation, not repeatedly.

Cleanup is fast - The cleanup operation is relatively quick (just checking processes and killing orphans). It's not a heavy operation that would significantly impact extension activation time.

Better user experience - When a user opens a Deepnote file, they want the server to start quickly. Having cleanup already done means no unexpected delay on first use.

Simpler code - No need to track whether cleanup has run or handle race conditions if multiple files are opened simultaneously before cleanup completes.

Cleaner separation of concerns - Constructor handles initialization/cleanup, getOrStartServer() focuses on server management.

However, if you wanted to move it to getOrStartServer(), you would need to:

Add a flag like private cleanupCompleted = false

Add a promise to track in-flight cleanup: private cleanupPromise?: Promise<void>

Check/await this promise at the start of getOrStartServer()

Handle concurrent calls properly

My recommendation: Keep it in the constructor. The current design is cleaner and the performance impact is negligible since:

It only runs once (singleton)

It runs in the background (fire-and-forget with .catch())

It doesn't block extension activation

Ok I moved it to activate() and registered this in the service manager that will automatically call it on start. Tested and it logs the messages from the logic so it seems to work fine.

Awesome 🎉

… pattern - Move initialization work from constructor to activate() method - Implement IExtensionSyncActivationService interface - Register as activation service in serviceRegistry - Follow codebase convention of minimal constructors with explicit activation - Update documentation to reflect new activation pattern

coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 608ca12 and af07a23.

📒 Files selected for processing (3)

ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md (1 hunks)
src/kernels/deepnote/deepnoteServerStarter.node.ts (5 hunks)
src/notebooks/serviceRegistry.node.ts (1 hunks)

🧰 Additional context used

📓 Path-based instructions (3)

**/*.node.ts

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use *.node.ts for Desktop-specific implementations that require full file system access and Python environments

Files:

src/notebooks/serviceRegistry.node.ts
src/kernels/deepnote/deepnoteServerStarter.node.ts

**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Use l10n.t() for user-facing strings
Use typed error classes from src/platform/errors/ when throwing or handling errors
Use the ILogger service instead of console.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation with CancellationToken

**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility, then alphabetically
Separate third-party imports from local file imports

Files:

src/notebooks/serviceRegistry.node.ts
src/kernels/deepnote/deepnoteServerStarter.node.ts

src/kernels/**/*.ts

📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)

src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations

Files:

src/kernels/deepnote/deepnoteServerStarter.node.ts

🧠 Learnings (2)

📚 Learning: 2025-09-03T12:55:29.175Z

Learnt from: CR
PR: deepnote/vscode-extension#0
File: .github/instructions/ipywidgets.instructions.md:0-0
Timestamp: 2025-09-03T12:55:29.175Z
Learning: Applies to src/notebooks/controllers/ipywidgets/serviceRegistry.{node,web}.ts : Register DI services in platform-specific registries (serviceRegistry.node.ts vs serviceRegistry.web.ts)

Applied to files:

src/notebooks/serviceRegistry.node.ts

📚 Learning: 2025-09-12T12:52:09.726Z

Learnt from: CR
PR: deepnote/vscode-deepnote#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T12:52:09.726Z
Learning: Applies to src/notebooks/deepnote/deepnoteActivationService.ts : deepnoteActivationService.ts wires VSCode activation

Applied to files:

src/notebooks/serviceRegistry.node.ts

🧬 Code graph analysis (1)

src/kernels/deepnote/deepnoteServerStarter.node.ts (2)

src/kernels/deepnote/types.ts (3)

IDeepnoteServerStarter (90-90)

IDeepnoteServerStarter (91-116)

DeepnoteServerInfo (118-122)

src/platform/common/uuid.ts (1)

generateUuid (10-61)

🪛 markdownlint-cli2 (0.18.1)

ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md

22-22: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🔇 Additional comments (3)

src/kernels/deepnote/deepnoteServerStarter.node.ts (3)

299-377: LGTM! Graceful shutdown implementation.

The enhanced disposal flow correctly:

Waits for pending ops with timeout

Attempts SIGTERM before SIGKILL

Tracks exit promises with timeouts

Cleans up lock files after process termination

449-523: Orphan detection correctly implemented.

Platform-specific checks properly handle:

Windows: PPID via wmic, parent existence via tasklist, handles INFO: message

Unix: PPID via ps -o ppid=, parent check via ps -o pid= (headerless)

Conservative fallback returns false on errors

Past review fixes incorporated correctly.

560-572: Verify PID extraction patterns

Windows: run tasklist /FI "IMAGENAME eq python.exe" /FO CSV /NH locally and confirm lines like "python.exe","12345",…; adjust the regex to /"python.exe","(\d+)"/i if casing varies.

Unix: splitting on whitespace and using the second field correctly extracts the PID from ps aux.

ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md

jankuca changed the base branch from main to jk/fix/kill-orphaned October 5, 2025 15:06

coderabbitai bot requested changes Oct 5, 2025

View reviewed changes

ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md Show resolved Hide resolved

src/kernels/deepnote/deepnoteServerStarter.node.ts Outdated Show resolved Hide resolved

coderabbitai bot requested changes Oct 5, 2025

View reviewed changes

src/kernels/deepnote/deepnoteServerStarter.node.ts Outdated Show resolved Hide resolved

coderabbitai bot approved these changes Oct 5, 2025

View reviewed changes

jankuca changed the base branch from jk/fix/kill-orphaned to main October 5, 2025 15:47

jankuca added 5 commits October 5, 2025 18:29

fix: auto-kill orphaned processes

c147723

Fix parent process existence check to use ps -o pid= for accurate det…

9c85cd2

…ection

Update docs to reflect ps -o pid= fix for parent process check

608ca12

jankuca force-pushed the jk/fix/kill-orphaned-isolated branch from 859f89c to 608ca12 Compare October 5, 2025 16:29

jankuca marked this pull request as ready for review October 5, 2025 16:35

FilipPyrek approved these changes Oct 5, 2025

View reviewed changes

Artmann reviewed Oct 5, 2025

View reviewed changes

Artmann approved these changes Oct 7, 2025

View reviewed changes

coderabbitai bot requested changes Oct 7, 2025

View reviewed changes

ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md Show resolved Hide resolved

jankuca merged commit 5f38928 into main Oct 7, 2025
3 checks passed

jankuca deleted the jk/fix/kill-orphaned-isolated branch October 7, 2025 10:29

coderabbitai bot approved these changes Oct 7, 2025

View reviewed changes

coderabbitai bot mentioned this pull request Oct 27, 2025

fix: improve orphaned process cleanup to handle child processes safely #119

Closed

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: auto-kill orphaned processes (isolated to kill only those started in the given window) #21

fix: auto-kill orphaned processes (isolated to kill only those started in the given window) #21

Uh oh!

jankuca commented Oct 5, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Oct 5, 2025 •

edited

Loading

Walkthrough

Sequence Diagram(s)

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Artmann Oct 5, 2025

Uh oh!

jankuca Oct 7, 2025

Uh oh!

jankuca Oct 7, 2025

Uh oh!

Artmann Oct 7, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

fix: auto-kill orphaned processes (isolated to kill only those started in the given window) #21

fix: auto-kill orphaned processes (isolated to kill only those started in the given window) #21

Uh oh!

Conversation

jankuca commented Oct 5, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Oct 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Sequence Diagram(s)

Pre-merge checks

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Artmann Oct 5, 2025

Choose a reason for hiding this comment

Uh oh!

jankuca Oct 7, 2025

Choose a reason for hiding this comment

Current Approach (Constructor) vs. Moving to getOrStartServer()

Uh oh!

jankuca Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

Artmann Oct 7, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

jankuca commented Oct 5, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 5, 2025 •

edited

Loading