fix: dispose a torn down NB controller's kernel #17097

kevineger · 2025-10-17T18:55:11Z

It's my first time looking at any of this code and the lifecycle of everything is a little challenging to discover at a glance. Sorry if this is overlooking anything obvious! This passes unit tests and I verified manually it solves the issue I reported.

Problem

If a server/kernel is removed and another is then created for the active notebook editor, cell executions against the new kernel fail. The issue, with clear repro steps and the corresponding logs: #17094.

Why

The root cause is improper resource cleanup during the disposal of a VSCodeNotebookController. When a JupyterServerProvider is removed, the extension correctly disposes of the associated controllers. However, the dispose method for the controller did not explicitly dispose of the underlying IKernel instance that it had created and managed.

This left the kernel in an orphaned or "zombie" state. When a new server and controller were subsequently created for the same notebook, it'd try to use the old, still-lingering kernel, which was already in a started state, leading to the "Cannot call start again" error.
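The failure mode described above can be sketched in isolation. This is only an illustration of the reasoning in this description, not the extension's code: `SketchKernel`, `SketchKernelCache`, and `replaceServer` are hypothetical names, and the assumption that kernels are cached per notebook URI is made purely for the example.

```typescript
// Hypothetical stand-in for IKernel: start() may only be called once.
class SketchKernel {
    private started = false;
    disposed = false;

    start(): void {
        if (this.started) {
            throw new Error('Cannot call start again');
        }
        this.started = true;
    }

    dispose(): void {
        this.disposed = true;
    }
}

// Hypothetical per-notebook kernel cache keyed by notebook URI (an assumption).
class SketchKernelCache {
    private readonly kernels = new Map<string, SketchKernel>();

    getOrCreate(notebookUri: string): SketchKernel {
        let kernel = this.kernels.get(notebookUri);
        if (!kernel) {
            kernel = new SketchKernel();
            this.kernels.set(notebookUri, kernel);
        }
        return kernel;
    }

    disposeFor(notebookUri: string): void {
        this.kernels.get(notebookUri)?.dispose();
        this.kernels.delete(notebookUri);
    }
}

// Simulates: run a cell, remove the server, add a new one, run a cell again.
// Returns the error message of the second start, or undefined if it succeeds.
function replaceServer(cleanUpOnDispose: boolean): string | undefined {
    const cache = new SketchKernelCache();
    const uri = 'file:///demo.ipynb';
    cache.getOrCreate(uri).start(); // the first controller starts its kernel
    if (cleanUpOnDispose) {
        cache.disposeFor(uri); // controller disposal also disposes the kernel
    }
    try {
        cache.getOrCreate(uri).start(); // the second controller tries to start
        return undefined;
    } catch (err) {
        return (err as Error).message;
    }
}
```

Without cleanup, the second `start()` hits the lingering, already-started kernel and throws; with cleanup, a fresh kernel is created and starts normally.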

Fix

When a controller is disposed, clean up any kernels it was responsible for. This is done by iterating through all currently open NB docs and disposing any kernels whose connection metadata IDs match.

Fixes #17094

kevineger · 2025-10-17T18:57:54Z

@microsoft-github-policy-service agree company="Google"

DonJayamanne

This isn't the right solution, as it will kill all kernels.
E.g. assume you have a kernel for Python 3.14 and the id is 314.
Now one can have two different notebooks running code against 314.
They are two different instances and processes running against Python 3.14 (i.e. Kernel Id = 314).

Just because we close the first notebook, we shouldn't be killing the kernel instance running against the other notebook.
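To make the objection concrete, here is a minimal sketch of matching purely on connection id. The `RunningKernel` shape and `disposeByConnectionId` are hypothetical and only mirror the scenario in this comment; they are not the extension's types.

```typescript
// Hypothetical record of a running kernel; not the extension's real types.
interface RunningKernel {
    notebookUri: string;
    connectionId: string;
    disposed: boolean;
}

// The approach under review: dispose every kernel whose connection id matches.
function disposeByConnectionId(kernels: RunningKernel[], connectionId: string): void {
    for (const kernel of kernels) {
        if (kernel.connectionId === connectionId) {
            kernel.disposed = true;
        }
    }
}

// Two notebooks, two separate processes, same kernel id 314.
const runningKernels: RunningKernel[] = [
    { notebookUri: 'first.ipynb', connectionId: '314', disposed: false },
    { notebookUri: 'second.ipynb', connectionId: '314', disposed: false }
];

// Closing first.ipynb disposes its controller, which matches on id '314'.
disposeByConnectionId(runningKernels, '314');

// Notebooks whose kernel was disposed as a result.
const affected: string[] = runningKernels.filter((k) => k.disposed).map((k) => k.notebookUri);
```

Here `affected` contains both notebooks, so the kernel for `second.ipynb` is lost even though that notebook was never closed.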

DonJayamanne · 2025-10-18T08:09:24Z

src/notebooks/controllers/vscodeNotebookController.ts

        );
        this.isDisposed = true;
+        // Dispose all kernels associated with this controller
+        workspace.notebookDocuments.forEach((doc) => {


Controller instances are not shared across notebooks.
We cannot dispose all other kernels associated with the same kernel.

kevineger · 2025-10-22

The latest commit filters to the associatedDocuments. Is that sufficient? If there's a better way, please let me know!
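For reference, a rough sketch of what filtering to a controller's own documents could look like. Everything here is hypothetical (`NotebookDoc`, `ScopedKernel`, `ScopedController`, and the use of a `Set` for the associated documents); it illustrates the idea of scoping disposal to one controller instance and is not the code in the commit.

```typescript
// Hypothetical minimal shapes, assumed for illustration only.
interface NotebookDoc {
    uri: string;
}

class ScopedKernel {
    disposed = false;
    dispose(): void {
        this.disposed = true;
    }
}

class ScopedController {
    // Documents this controller instance was selected for (assumed to be a Set here).
    private readonly associatedDocuments = new Set<NotebookDoc>();
    private readonly openDocuments: () => NotebookDoc[];
    private readonly kernels: Map<NotebookDoc, ScopedKernel>;

    constructor(openDocuments: () => NotebookDoc[], kernels: Map<NotebookDoc, ScopedKernel>) {
        this.openDocuments = openDocuments;
        this.kernels = kernels;
    }

    associate(doc: NotebookDoc): void {
        this.associatedDocuments.add(doc);
    }

    dispose(): void {
        // Walk the open documents, but only touch the ones tied to this controller.
        for (const doc of this.openDocuments()) {
            if (!this.associatedDocuments.has(doc)) {
                continue;
            }
            this.kernels.get(doc)?.dispose();
            this.kernels.delete(doc);
        }
        this.associatedDocuments.clear();
    }
}

// Usage: two open notebooks, only the first is associated with the controller.
function disposeScopedDemo(): [boolean, boolean] {
    const first: NotebookDoc = { uri: 'first.ipynb' };
    const second: NotebookDoc = { uri: 'second.ipynb' };
    const firstKernel = new ScopedKernel();
    const secondKernel = new ScopedKernel();
    const kernels = new Map<NotebookDoc, ScopedKernel>([
        [first, firstKernel],
        [second, secondKernel]
    ]);
    const controller = new ScopedController(() => [first, second], kernels);
    controller.associate(first);
    controller.dispose();
    return [firstKernel.disposed, secondKernel.disposed];
}
```

In this sketch only the kernel for the associated document is disposed; whether that matches the real lifecycle depends on how the extension tracks kernels per controller.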

Fixes microsoft#17094 It's my first time looking at any of this code and the lifecycle of everything is a little challenging to discover at a glance. This change passes unit tests, and I verified manually that it solves the issue. I'll lay out the problem and solution independently below to make reviewing easier. If a server/kernel is removed and another is then created for the active notebook editor, cell executions against the new kernel fail. The issue, with clear repro steps and the corresponding logs: microsoft#17094. The root cause is improper resource cleanup during the disposal of a `VSCodeNotebookController`. When a `JupyterServerProvider` is removed, the extension correctly disposes of the associated controllers. However, the `dispose` method for the controller did not explicitly dispose of the underlying `IKernel` instance that it had created and managed. This left the kernel in an orphaned or "zombie" state. When a new server and controller were subsequently created for the same notebook, it'd try to use the old, still-lingering kernel, which was already in a started state, leading to the "Cannot call start again" error. The fix enhances the dispose method within the `VSCodeNotebookController`. The new logic ensures that when a controller is disposed, it actively cleans up any kernels it was responsible for. It iterates through all currently open NB docs and disposes any kernels whose connection metadata IDs match.

kevineger · 2025-10-20T21:23:29Z

Thanks for taking a look and sorry I was off-base @DonJayamanne. I'm having a bit of a hard time understanding the life-cycle of everything, with all the wrapping and indirection.

Using the same repro as listed in the issue, if I create a 3rd party Jupyter server, execute code, delete it (provider returns []) and then create another:

this.controller here when the initial cell execution is kicked off uses the new/updated controller.
But kernel.controller.id !== controller.id and so a second execution is started here using the old, disposed controller.

I tried taking out that line but all the CellExecution logic still maintains a tie to the disposed controller.

I'll try and spend some more time building an understanding of all this, but if you have any pointers or recommendations for a fix that'll save me time 👍.


kevineger mentioned this pull request Oct 17, 2025

Jupyter API: A resolved server cannot be used on a notebook which had a server that was removed. NB controller disposed. #17094

Open

DonJayamanne requested changes Oct 18, 2025


kevineger requested a review from DonJayamanne October 20, 2025 21:23

fix: dispose kernels associated with a controller

45a2704

This change modifies the `VSCodeNotebookController`'s `dispose` method to ensure that when a controller is disposed, it disposes the kernels that are directly associated with that specific controller instance.

kevineger force-pushed the nb-kernel-lifecycle branch from cf6f4ab to 45a2704 Compare October 22, 2025 00:52
