Skip to content

Conversation

@jankuca
Copy link
Contributor

@jankuca jankuca commented Oct 5, 2025

Replaces #19

Fully vibe coded via Augment and Coderabbit. It seems to work for me.

Summary by CodeRabbit

  • New Features

    • Session-aware lock files to track server ownership across windows and explicit server disposal to improve lifecycle handling.
    • Cross-platform orphan-process detection to avoid killing active servers.
  • Bug Fixes

    • Prevents accidental termination of active servers in other windows.
    • More reliable startup/shutdown by handling stale processes and removing stale locks.
  • Documentation

    • Added design doc describing orphan cleanup lifecycle and edge cases.

@jankuca jankuca changed the base branch from main to jk/fix/kill-orphaned October 5, 2025 15:06
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 5, 2025

📝 Walkthrough

Walkthrough

Adds a lock-file–based ownership mechanism to manage Deepnote server processes. DeepnoteServerStarter now creates a per-window sessionId and lockFileDir, writes a JSON lock file after server start, deletes it on stop/dispose, and performs session-aware orphan cleanup. Orphan detection uses OS-specific parent-process checks (Unix via PPID; Windows via ParentProcessId/tasklist). Cleanup only terminates processes verified as orphaned; otherwise skips and logs reasons. New private methods handle lock-file lifecycle and orphan detection. The IDeepnoteServerStarter interface now adds an explicit dispose(): Promise and no longer extends vscode.Disposable.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User as VS Code Window
  participant Starter as DeepnoteServerStarter
  participant OS as OS/Process Table
  participant FS as LockFileDir (FS)

  rect rgba(227,242,253,0.6)
  note over Starter,FS: Server startup
  User->>Starter: startServerImpl()
  Starter->>OS: launch server process
  OS-->>Starter: PID
  Starter->>FS: writeLockFile(PID, {sessionId, pid, timestamp})
  FS-->>Starter: OK
  Starter-->>User: ready
  end

  rect rgba(232,245,233,0.6)
  note over Starter,FS: Server stop/dispose
  User->>Starter: stopServerImpl()/dispose()
  Starter->>FS: deleteLockFile(PID)
  FS-->>Starter: OK
  Starter-->>User: done
  end
Loading
sequenceDiagram
  autonumber
  participant Starter as DeepnoteServerStarter
  participant FS as LockFileDir (FS)
  participant OS as OS/Process Table

  rect rgba(255,249,196,0.6)
  note over Starter,OS: Orphan cleanup sweep
  Starter->>OS: list candidate PIDs
  OS-->>Starter: [pid1, pid2, ...]
  loop for each PID
    Starter->>FS: readLockFile(PID)
    alt lock exists and sessionId != current
      Starter->>OS: isProcessOrphaned(PID)
      alt orphaned
        Starter->>OS: kill(PID)
        OS-->>Starter: killed
        Starter->>FS: deleteLockFile(PID)
      else not orphaned
        Starter-->>Starter: skip (log reason)
      end
    else no lock or same session
      Starter->>OS: isProcessOrphaned(PID)
      alt orphaned
        Starter->>OS: kill(PID)
        OS-->>Starter: killed
        Starter->>FS: deleteLockFile(PID)
      else not orphaned
        Starter-->>Starter: skip (log reason)
      end
    end
  end
  end

  note over OS,Starter: OS checks
  OS-->>Starter: Unix: PPID exists? / Windows: ParentProcessId + tasklist
Loading

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title succinctly captures the primary change of auto-killing orphaned processes limited to those started in the given window, reflecting the session-aware lock-file cleanup introduced in the PR. It is clear, specific, and avoids generic terms, giving reviewers an accurate summary of the fix.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4a996ac and d2ef82a.

📒 Files selected for processing (2)
  • ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md (1 hunks)
  • src/kernels/deepnote/deepnoteServerStarter.node.ts (10 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.node.ts

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use *.node.ts for Desktop-specific implementations that require full file system access and Python environments

Files:

  • src/kernels/deepnote/deepnoteServerStarter.node.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Use l10n.t() for user-facing strings
Use typed error classes from src/platform/errors/ when throwing or handling errors
Use the ILogger service instead of console.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation with CancellationToken

**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility, then alphabetically
Separate third-party imports from local file imports

Files:

  • src/kernels/deepnote/deepnoteServerStarter.node.ts
src/kernels/**/*.ts

📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)

src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations

Files:

  • src/kernels/deepnote/deepnoteServerStarter.node.ts
🧬 Code graph analysis (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts (2)
src/platform/common/uuid.ts (1)
  • generateUuid (10-61)
src/platform/logging/index.ts (1)
  • logger (35-48)
🪛 markdownlint-cli2 (0.18.1)
ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md

3-3: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


6-6: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


9-9: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


14-14: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


18-18: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


28-28: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


32-32: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


42-42: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


43-43: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


49-49: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


53-53: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


54-54: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


60-60: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


72-72: Trailing spaces
Expected: 0 or 2; Actual: 3

(MD009, no-trailing-spaces)


80-80: Trailing spaces
Expected: 0 or 2; Actual: 6

(MD009, no-trailing-spaces)


97-97: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


100-100: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


101-101: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


108-108: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


109-109: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


114-114: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)


122-122: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below

(MD022, blanks-around-headings)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d2ef82a and 10a4d66.

📒 Files selected for processing (1)
  • src/kernels/deepnote/deepnoteServerStarter.node.ts (10 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.node.ts

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use *.node.ts for Desktop-specific implementations that require full file system access and Python environments

Files:

  • src/kernels/deepnote/deepnoteServerStarter.node.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Use l10n.t() for user-facing strings
Use typed error classes from src/platform/errors/ when throwing or handling errors
Use the ILogger service instead of console.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation with CancellationToken

**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility, then alphabetically
Separate third-party imports from local file imports

Files:

  • src/kernels/deepnote/deepnoteServerStarter.node.ts
src/kernels/**/*.ts

📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)

src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations

Files:

  • src/kernels/deepnote/deepnoteServerStarter.node.ts

@jankuca jankuca changed the base branch from jk/fix/kill-orphaned to main October 5, 2025 15:47
Previously, cleanupOrphanedProcesses() would force-kill every process
matching 'deepnote_toolkit server', which could terminate active servers
from other VS Code windows.

This commit implements a sophisticated cleanup mechanism that only kills
genuine orphans by:

1. Session Management:
   - Each VS Code window gets a unique session ID (UUID)
   - Lock files stored in OS temp dir track server ownership
   - Lock files contain: sessionId, pid, timestamp

2. Lock File Lifecycle:
   - Created when server starts successfully
   - Deleted when server stops normally
   - Deleted when orphaned process is killed
   - Cleaned up during extension disposal

3. Orphan Detection:
   - Verifies parent process exists before killing
   - Unix: uses 'ps -o ppid=' and checks if PPID is 1 or parent missing
   - Windows: uses 'wmic' to get parent PID and verifies existence
   - Only kills if process is truly orphaned (no parent or parent is init)

4. Smart Cleanup Logic:
   - Checks lock file for each candidate process
   - If lock file exists with different session ID:
     * Verifies process is orphaned before killing
     * Skips if parent exists (active in another window)
   - If no lock file exists:
     * Still verifies orphan status (backward compatibility)
     * Skips if parent exists
   - Comprehensive logging of all decisions

Benefits:
- Multi-window safe: won't kill servers from other VS Code windows
- Robust orphan detection using OS-level parent process verification
- Full audit trail with detailed logging
- Automatic cleanup of stale lock files
- Backward compatible with processes from older versions

Added documentation in ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md
The Windows orphan detection was incorrectly treating the tasklist message
'INFO: No tasks are running which match the specified criteria.' as
indicating a live parent process, when it actually means the parent
process doesn't exist.

Changes:
- Normalize and trim stdout from tasklist before checking
- Return true (orphaned) when stdout is empty
- Return true (orphaned) when stdout starts with 'INFO:' (case-insensitive)
- Return true (orphaned) when stdout contains 'no tasks are running' (case-insensitive)
- Return true (orphaned) when ppid === 0
- Return false (not orphaned) only when parent process actually exists

This ensures that processes whose parents have terminated are correctly
identified as orphaned and cleaned up, while processes with active
parents in other VS Code windows are preserved.
@jankuca jankuca force-pushed the jk/fix/kill-orphaned-isolated branch from 859f89c to 608ca12 Compare October 5, 2025 16:29
@jankuca jankuca marked this pull request as ready for review October 5, 2025 16:35
});

// Clean up any orphaned deepnote-toolkit processes from previous sessions
this.cleanupOrphanedProcesses().catch((ex) => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this happen in getOrStartServer or dispose instead? Normally, you don't want to run business logic in the constructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good practice, I definitely agree. We could have a separate method we'd call after instantiation.

Having it in the getOrStartServer does not seem like a good idea according to AI:


Current approach (cleanup in constructor):

  • ✅ Runs once at extension startup
  • ✅ Cleans up orphans before any server is needed
  • ✅ Simpler - happens automatically when the class is instantiated
  • ❌ Runs even if user never opens a Deepnote file
  • ❌ Adds to extension activation time

Alternative approach (cleanup in getOrStartServer):

  • ✅ Only runs when actually needed (lazy)
  • ✅ Doesn't slow down extension activation if Deepnote isn't used
  • ❌ Adds latency to first server start
  • ❌ More complex - need to track if cleanup has run
  • ❌ Could run multiple times if called concurrently before first completion

Current Approach (Constructor) vs. Moving to getOrStartServer()

I would recommend keeping it in the constructor for these reasons:

  1. It's a singleton - The class is registered as addSingleton, so the constructor only runs once during extension activation, not repeatedly.

  2. Cleanup is fast - The cleanup operation is relatively quick (just checking processes and killing orphans). It's not a heavy operation that would significantly impact extension activation time.

  3. Better user experience - When a user opens a Deepnote file, they want the server to start quickly. Having cleanup already done means no unexpected delay on first use.

  4. Simpler code - No need to track whether cleanup has run or handle race conditions if multiple files are opened simultaneously before cleanup completes.

  5. Cleaner separation of concerns - Constructor handles initialization/cleanup, getOrStartServer() focuses on server management.

However, if you wanted to move it to getOrStartServer(), you would need to:

  • Add a flag like private cleanupCompleted = false
  • Add a promise to track in-flight cleanup: private cleanupPromise?: Promise<void>
  • Check/await this promise at the start of getOrStartServer()
  • Handle concurrent calls properly

My recommendation: Keep it in the constructor. The current design is cleaner and the performance impact is negligible since:

  • It only runs once (singleton)
  • It runs in the background (fire-and-forget with .catch())
  • It doesn't block extension activation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I moved it to activate() and registered this in the service manager that will automatically call it on start. Tested and it logs the messages from the logic so it seems to work fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome 🎉

… pattern

- Move initialization work from constructor to activate() method
- Implement IExtensionSyncActivationService interface
- Register as activation service in serviceRegistry
- Follow codebase convention of minimal constructors with explicit activation
- Update documentation to reflect new activation pattern
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 608ca12 and af07a23.

📒 Files selected for processing (3)
  • ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md (1 hunks)
  • src/kernels/deepnote/deepnoteServerStarter.node.ts (5 hunks)
  • src/notebooks/serviceRegistry.node.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.node.ts

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use *.node.ts for Desktop-specific implementations that require full file system access and Python environments

Files:

  • src/notebooks/serviceRegistry.node.ts
  • src/kernels/deepnote/deepnoteServerStarter.node.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Inject interfaces, not concrete classes
Avoid circular dependencies
Use l10n.t() for user-facing strings
Use typed error classes from src/platform/errors/ when throwing or handling errors
Use the ILogger service instead of console.log
Preserve error details while scrubbing PII in messages and telemetry
Include the Microsoft copyright header in source files
Prefer async/await and handle cancellation with CancellationToken

**/*.{ts,tsx}: Order class members (methods, fields, properties) first by accessibility, then alphabetically
Separate third-party imports from local file imports

Files:

  • src/notebooks/serviceRegistry.node.ts
  • src/kernels/deepnote/deepnoteServerStarter.node.ts
src/kernels/**/*.ts

📄 CodeRabbit inference engine (.github/instructions/kernel.instructions.md)

src/kernels/**/*.ts: Use event-driven updates (EventEmitter) for state changes
Monitor and dispose pending promises to prevent leaks during teardown
Use WeakRef/weak references for notebook/object backreferences to avoid memory leaks
Respect CancellationToken in all async operations

Files:

  • src/kernels/deepnote/deepnoteServerStarter.node.ts
🧠 Learnings (2)
📚 Learning: 2025-09-03T12:55:29.175Z
Learnt from: CR
PR: deepnote/vscode-extension#0
File: .github/instructions/ipywidgets.instructions.md:0-0
Timestamp: 2025-09-03T12:55:29.175Z
Learning: Applies to src/notebooks/controllers/ipywidgets/serviceRegistry.{node,web}.ts : Register DI services in platform-specific registries (serviceRegistry.node.ts vs serviceRegistry.web.ts)

Applied to files:

  • src/notebooks/serviceRegistry.node.ts
📚 Learning: 2025-09-12T12:52:09.726Z
Learnt from: CR
PR: deepnote/vscode-deepnote#0
File: CLAUDE.md:0-0
Timestamp: 2025-09-12T12:52:09.726Z
Learning: Applies to src/notebooks/deepnote/deepnoteActivationService.ts : deepnoteActivationService.ts wires VSCode activation

Applied to files:

  • src/notebooks/serviceRegistry.node.ts
🧬 Code graph analysis (1)
src/kernels/deepnote/deepnoteServerStarter.node.ts (2)
src/kernels/deepnote/types.ts (3)
  • IDeepnoteServerStarter (90-90)
  • IDeepnoteServerStarter (91-116)
  • DeepnoteServerInfo (118-122)
src/platform/common/uuid.ts (1)
  • generateUuid (10-61)
🪛 markdownlint-cli2 (0.18.1)
ORPHAN_PROCESS_CLEANUP_IMPLEMENTATION.md

22-22: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🔇 Additional comments (3)
src/kernels/deepnote/deepnoteServerStarter.node.ts (3)

299-377: LGTM! Graceful shutdown implementation.

The enhanced disposal flow correctly:

  • Waits for pending ops with timeout
  • Attempts SIGTERM before SIGKILL
  • Tracks exit promises with timeouts
  • Cleans up lock files after process termination

449-523: Orphan detection correctly implemented.

Platform-specific checks properly handle:

  • Windows: PPID via wmic, parent existence via tasklist, handles INFO: message
  • Unix: PPID via ps -o ppid=, parent check via ps -o pid= (headerless)
  • Conservative fallback returns false on errors

Past review fixes incorporated correctly.


560-572: Verify PID extraction patterns

  • Windows: run tasklist /FI "IMAGENAME eq python.exe" /FO CSV /NH locally and confirm lines like "python.exe","12345",…; adjust the regex to /"python.exe","(\d+)"/i if casing varies.
  • Unix: splitting on whitespace and using the second field correctly extracts the PID from ps aux.

@jankuca jankuca merged commit 5f38928 into main Oct 7, 2025
3 checks passed
@jankuca jankuca deleted the jk/fix/kill-orphaned-isolated branch October 7, 2025 10:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants