
Support multiple backends #596

Merged
andrii-i merged 74 commits into jupyter-server:main from andrii-i:multiple-backends
Feb 9, 2026

Collaborator

@andrii-i andrii-i commented Dec 10, 2025

Summary

Update Jupyter Scheduler to support scheduling and running arbitrary file types beyond notebooks. Multiple backends can now run simultaneously, allowing different file types to be handled by their respective backends within the same UI.

Introduces the BaseBackend abstraction, which lets backend authors declare capabilities (supported file types, output formats, scheduler/executor classes) and have them discovered automatically via entry points. Previously, custom backends could be defined as sets of classes; this PR formalizes that pattern into a single BaseBackend class.

Key Features

New BaseBackend abstraction

from jupyter_scheduler.base_backend import BaseBackend

class MyBackend(BaseBackend):
    id = "my_backend"
    name = "My Custom Backend"
    description = "Execute notebooks on my infrastructure"
    scheduler_class = "my_package.scheduler:MyScheduler"
    execution_manager_class = "my_package.executors:MyExecutionManager"
    file_extensions = ["ipynb"]
    output_formats = [{"id": "ipynb", "label": "Notebook"}]

Available backends discovery via new REST API endpoint

  • New GET /scheduler/backends endpoint returns available backends with their capabilities

Backends Provided by Default with Jupyter Scheduler

  • jupyter_server_nb - Execute notebooks via nbconvert (a refactoring of the existing notebook execution logic)
  • jupyter_server_py - Execute Python scripts via local subprocess

Backend Discovery via Entry Points

Backends are registered using Python entry points in pyproject.toml:

[project.entry-points."jupyter_scheduler.backends"]
my_backend = "my_package.backends:MyBackend"

Backend Selection UI

  • Backend picker dropdown in "Create Job" form (when multiple backends support the file type)
  • Shows backend name and description
  • Auto-selects based on file extension

Configuration Options

Route legacy jobs (pre-3.0 UUID-only IDs) to a specific backend

c.SchedulerApp.legacy_job_backend = "jupyter_server_nb"

Set preferred backend per file extension

c.SchedulerApp.preferred_backends = {"ipynb": "k8s_backend"}
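
To make the selection rules concrete, here is a minimal sketch of how an explicit choice, preferred_backends, and the file extension might combine. BackendInfo and select_backend are hypothetical names, and the precedence order (explicit choice, then configured preference, then first matching backend) is an assumption rather than the PR's confirmed behavior.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Dict, List, Optional


@dataclass
class BackendInfo:
    """Hypothetical stand-in for the capabilities a backend declares."""

    id: str
    file_extensions: List[str] = field(default_factory=list)


def select_backend(
    path: str,
    backends: List[BackendInfo],
    preferred: Optional[Dict[str, str]] = None,
    requested: Optional[str] = None,
) -> str:
    """Pick a backend ID for `path` (illustrative logic, not the PR's code)."""
    ext = PurePosixPath(path).suffix.lstrip(".").lower()
    candidates = [b.id for b in backends if ext in b.file_extensions]
    if not candidates:
        raise ValueError(f"No backend supports '.{ext}' files")
    # 1. An explicit choice (e.g. from the backend picker) wins.
    if requested is not None:
        if requested not in candidates:
            raise ValueError(f"Backend '{requested}' does not support '.{ext}' files")
        return requested
    # 2. Otherwise use the configured preference for this extension, if valid.
    choice = (preferred or {}).get(ext)
    if choice in candidates:
        return choice
    # 3. Fall back to the first backend that supports the extension.
    return candidates[0]
```

For example, with both jupyter_server_nb and a k8s_backend supporting ipynb, passing preferred={"ipynb": "k8s_backend"} would select k8s_backend unless the caller requests another backend explicitly.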

Code Changes

New files:

  • jupyter_scheduler/base_backend.py - Base class for backends
  • jupyter_scheduler/backend_registry.py - Registry for managing backends
  • jupyter_scheduler/backend_utils.py - Backend discovery logic via entry points
  • jupyter_scheduler/job_id.py - Job ID encoding (backend_id:uuid) logic
  • src/util/backend-utils.ts - backend utilities

Modified:

  • extension.py - backend discovery and initialization
  • handlers.py - logic to route requests to correct backend
  • create-job.tsx - added backend picker UI
  • handler.ts - added new /scheduler/backends API endpoint

User-facing changes

Screenshot 2025-12-09 at 9 34 09 PM

Backwards-incompatible changes

None.

Pre-v3 jobs (created before multiple-backend support) have UUID-only IDs and are routed to legacy_job_backend, which by default is set to the pre-v3 local notebook execution logic. Without additional backends installed, a default installation therefore works identically (single jupyter_server_nb backend).
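
The routing described above can be sketched as follows, assuming the backend_id:uuid format listed for job_id.py. The function names encode_job_id and parse_job_id are hypothetical and may not match the actual module.

```python
import uuid
from typing import Tuple

# Mirrors the c.SchedulerApp.legacy_job_backend default described in this PR.
LEGACY_JOB_BACKEND = "jupyter_server_nb"


def encode_job_id(backend_id: str, job_uuid: str) -> str:
    """Build a 'backend_id:uuid' job ID (hypothetical helper)."""
    if ":" in backend_id:
        raise ValueError(f"Backend ID cannot contain ':': '{backend_id}'")
    return f"{backend_id}:{job_uuid}"


def parse_job_id(job_id: str, legacy_backend: str = LEGACY_JOB_BACKEND) -> Tuple[str, str]:
    """Split a job ID into (backend_id, uuid).

    IDs without a ':' are treated as legacy UUID-only IDs and routed
    to the legacy backend.
    """
    backend_id, sep, job_uuid = job_id.partition(":")
    if not sep:
        return legacy_backend, job_id
    return backend_id, job_uuid


if __name__ == "__main__":
    new_id = encode_job_id("jupyter_server_py", str(uuid.uuid4()))
    print(parse_job_id(new_id))             # routed to jupyter_server_py
    print(parse_job_id(str(uuid.uuid4())))  # legacy ID, routed to jupyter_server_nb
```

Because UUIDs never contain a colon, splitting on the first colon is enough to tell the two ID shapes apart, which is also why backend IDs must not contain one.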

Testing

  • All pre-existing tests pass
  • Added new tests, including Playwright E2E tests for the multi-backend UI with multiple mocked backends

@andrii-i andrii-i added the enhancement New feature or request label Dec 10, 2025
@andrii-i andrii-i force-pushed the multiple-backends branch 8 times, most recently from 694b782 to b0d7a63 Compare December 12, 2025 16:23
Collaborator

@dlqqq dlqqq left a comment

@andrii-i Thanks for working on this over the last month! This is an impressive amount of code. 💪

I left some code review feedback on the backend below. However, I haven't reviewed the backend or done local testing. Let's talk more about this PR Thursday (will miss standup tomorrow). Hopefully that will give @JGuinegagne time to review as well.

Regardless of whether we release this in v2 or v3, it would be great to publish a pre-release before publishing an official release. There are a lot of changes here, and I think we will need a bug bash to validate all of them.

Collaborator Author

andrii-i commented Jan 28, 2026

Thank you for the review, David. Added "Fixes #597, part of #599" to the top of the PR description to clarify that this is planned for release as part of v3.

Collaborator

@JGuinegagne JGuinegagne left a comment

Lots of minor suggestions/sanity-checks.

Caveat: I'm not familiar with this codebase, some comments may be irrelevant or missing context. Feel free to use judgement on what to address.

@andrii-i andrii-i requested review from JGuinegagne and dlqqq February 5, 2026 15:11
Collaborator

@JGuinegagne JGuinegagne left a comment

LGTM w/ optional comments.

Evaluate whether to return error details for 5xx (disallowed in a normal service, but may be okay in a client application like jupyter-scheduler).

- if expected_message != message:
-     return False
- return True
+ # Test utilities module (currently empty - add helpers here as needed)
Collaborator

non-blocking: remove module?


  config: BackendConfig
- scheduler: BaseScheduler
+ scheduler: Any # BaseScheduler at runtime, but Any to support test mocks
Collaborator

hum, compromising typing integrity for the sake of unit tests doesn't sound right...

if cfg.id in seen_ids:
    raise ValueError(f"Duplicate backend ID: '{cfg.id}'")
if ":" in cfg.id:
    raise ValueError(f"Backend ID cannot contain ':': '{cfg.id}'")
Collaborator

optional: performing two validation operations in the same loop, this is a bit odd.
Any way we could use pydantic models to validate the ID regex?

pydantic most likely supports disallowing duplicates in list.
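
One way to separate the two checks, sketched here with the standard library only rather than pydantic, is to validate the ID format first and uniqueness second. The pattern below is an assumption: the PR only states that ':' is disallowed, and rejecting empty IDs is added purely for illustration. validate_backend_ids is a hypothetical name.

```python
import re
from collections import Counter
from typing import Iterable, List

# Assumed ID pattern for illustration: one or more characters, none of them ':'.
BACKEND_ID_PATTERN = re.compile(r"[^:]+")


def validate_backend_ids(ids: Iterable[str]) -> List[str]:
    """Validate backend IDs in two independent passes (hypothetical helper)."""
    ids = list(ids)
    # Pass 1: every ID must match the format on its own.
    for backend_id in ids:
        if not BACKEND_ID_PATTERN.fullmatch(backend_id):
            raise ValueError(f"Backend ID cannot be empty or contain ':': '{backend_id}'")
    # Pass 2: IDs must be unique across the whole list.
    duplicates = sorted(i for i, count in Counter(ids).items() if count > 1)
    if duplicates:
        raise ValueError(f"Duplicate backend ID(s): {duplicates}")
    return ids
```

The same split would carry over to a pydantic model, with a field-level constraint for the format and a model-level validator for uniqueness.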


- def get_scheduler(self, job_id: str):
+ def get_scheduler(self, job_id: str) -> BaseScheduler:
      """Get scheduler for a job ID. Raises HTTPError(400) if backend unavailable."""
Collaborator

non-blocking: I know it's not from this PR, but docstring formalities would call for:

"""Return scheduler for a job ID.

Raises:
    HTTPError(400) if backend unavailable.

"""Resolve backend from payload['backend'] or auto-select by file extension."""
backend_id = payload.get("backend")
"""Resolve backend from payload['backend_id'] or auto-select by file extension."""
backend_id = payload.get("backend_id")
Collaborator

optional: can backend_id possibly be nullish here?

  except Exception as e:
      self.log.exception(e)
-     raise HTTPError(500, "Unexpected error occurred during creation of job.") from e
+     raise HTTPError(500, f"Unexpected error during creation of job: {e}") from e
Collaborator

optional: per DRY principle, consider exploring context manager for error handling
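
A minimal sketch of what such a context manager could look like is shown below. The HTTPError class here is a local stand-in so the example runs without Tornado, and unexpected_errors_as_500 is a hypothetical name; in the handlers it would wrap tornado.web.HTTPError and self.log instead.

```python
import logging
from contextlib import contextmanager


class HTTPError(Exception):
    """Minimal stand-in for tornado.web.HTTPError so the sketch runs without Tornado."""

    def __init__(self, status_code: int, log_message: str = ""):
        super().__init__(f"HTTP {status_code}: {log_message}")
        self.status_code = status_code
        self.log_message = log_message


@contextmanager
def unexpected_errors_as_500(log: logging.Logger, action: str):
    """Log any unexpected exception and re-raise it as a 500 HTTPError."""
    try:
        yield
    except HTTPError:
        # Already an HTTP error (e.g. a 400 raised on purpose): pass it through.
        raise
    except Exception as e:
        log.exception(e)
        raise HTTPError(500, f"Unexpected error during {action}: {e}") from e


if __name__ == "__main__":
    logging.basicConfig()
    try:
        with unexpected_errors_as_500(logging.getLogger("demo"), "creation of job"):
            raise KeyError("input_uri")
    except HTTPError as err:
        print(err.status_code, err.log_message)
```

Each handler body would then shrink to a single with block, keeping the logging and the error message format in one place.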

  if result.returncode != 0:
      raise RuntimeError(
-         f"Script exited with code {result.returncode}\nstderr: {result.stderr[:500]}"
+         f"Script exited with code {result.returncode}. See 'Errors' output for full error trace."
Collaborator

in that case, it would help to provide the file path.
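
A possible shape for that suggestion, as a self-contained sketch: run the script with subprocess and include its path in the error. run_script is a hypothetical helper, not the executor code in this PR, and the wording of the message is only an example.

```python
import os
import subprocess
import sys
import tempfile


def run_script(path: str) -> str:
    """Run a Python script in a subprocess and return its stdout.

    Raises RuntimeError naming the script path if the exit code is non-zero.
    """
    result = subprocess.run([sys.executable, path], capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"Script '{path}' exited with code {result.returncode}. "
            "See 'Errors' output for full error trace."
        )
    return result.stdout


if __name__ == "__main__":
    # Demo: a throwaway script that fails with exit code 3.
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "failing.py")
        with open(script, "w") as f:
            f.write("import sys\nsys.exit(3)\n")
        try:
            run_script(script)
        except RuntimeError as err:
            print(err)
```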

  await page.waitForSelector('text=Saving Completed', { state: 'hidden' });
- await scheduler.assertSnapshot(FILENAMES.CREATE_JOB_VIEW);
+ // Flaky: file names and timestamps vary by environment
+ // await scheduler.assertSnapshot(FILENAMES.CREATE_JOB_VIEW);
Collaborator

remove?

Collaborator

@dlqqq dlqqq left a comment

Thank you for addressing our feedback Andrii!

I'm approving this PR because the main branch will continue to evolve, and I don't see any issues worth blocking on right now. The testing you've added inspires confidence that this new feature works, so I think it's fine to merge for now.

Before the v3.0 release, we should revisit the architecture as a team, remove any unnecessary / excessively complex components, and simplify as much as possible to minimize maintenance burden. We should also think more deeply about schema changes for the local scheduler database, and whether we should add a DB migration script for users.

Collaborator Author

andrii-i commented Feb 7, 2026

Thanks for the review and approval @dlqqq. Would be happy to have additional discussions.

whether we should add a DB migration script for users

The current update_db_schema function handles automatic column additions via ALTER TABLE - new nullable columns are added transparently on startup. That said, happy to discuss if we need something more robust.
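
For readers unfamiliar with the approach, the general idea of adding missing nullable columns on startup can be sketched with the standard library's sqlite3 module. This is not the project's update_db_schema implementation; add_missing_columns, the jobs table, and the backend_id column are hypothetical names used only for illustration.

```python
import sqlite3
from typing import Dict, List


def add_missing_columns(conn: sqlite3.Connection, table: str, wanted: Dict[str, str]) -> List[str]:
    """Add any column from `wanted` ({name: SQL type}) that `table` lacks.

    Identifiers are interpolated into SQL, so pass only trusted names.
    Returns the names of the columns that were added.
    """
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    existing = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')}
    added = []
    for name, sql_type in wanted.items():
        if name not in existing:
            # Without NOT NULL or DEFAULT, existing rows get NULL in the new column.
            conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{name}" {sql_type}')
            added.append(name)
    conn.commit()
    return added


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE jobs (job_id TEXT PRIMARY KEY)")
    print(add_missing_columns(conn, "jobs", {"backend_id": "TEXT"}))
```

This kind of additive change covers new nullable columns; renames, type changes, or backfills are the cases where a dedicated migration script would be needed.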

@andrii-i andrii-i merged commit 955883c into jupyter-server:main Feb 9, 2026
6 checks passed
@andrii-i andrii-i deleted the multiple-backends branch February 9, 2026 17:33