Support multiple backends#596
Conversation
ed1708d to
05779aa
Compare
694b782 to
b0d7a63
Compare
…end definition and discovery
for more information, see https://pre-commit.ci
…se database_manager_class to detect SQL storage needs
for more information, see https://pre-commit.ci
b0d7a63 to
3e48af6
Compare
bb7cd2e to
b092f48
Compare
dlqqq
left a comment
There was a problem hiding this comment.
@andrii-i Thanks for working on this over the last month! This is an impressive amount of code. 💪
I left some code review feedback on the backend below. However, I haven't reviewed the backend or done local testing. Let's talk more about this PR Thursday (will miss standup tomorrow). Hopefully that will give @JGuinegagne time to review as well.
Regardless of we release this in v2 or v3, it would be great to publish a pre-release before publishing an official release. There are a lot of changes here, and I think we will need a bug bash to validate all of them.
JGuinegagne
left a comment
There was a problem hiding this comment.
Lots of minor suggestions/sanity-checks.
Caveat: I'm not familiar with this codebase, some comments may be irrelevant or missing context. Feel free to use judgement on what to address.
…backend_id for legacy jobs at the backend, update snapshots
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
c8b5c07 to
3c10450
Compare
for more information, see https://pre-commit.ci
… support it logic more readable
for more information, see https://pre-commit.ci
JGuinegagne
left a comment
There was a problem hiding this comment.
LGTM w/ optional comments.
Evaluate whether to return error details for 5xx (disallowed in a normal service, but may be okay in a client application like jupyter-scheduler).
| if expected_message != message: | ||
| return False | ||
| return True | ||
| # Test utilities module (currently empty - add helpers here as needed) |
There was a problem hiding this comment.
non-blocking: remove module?
|
|
||
| config: BackendConfig | ||
| scheduler: BaseScheduler | ||
| scheduler: Any # BaseScheduler at runtime, but Any to support test mocks |
There was a problem hiding this comment.
hum, compromising typing integrity for the sake of unit tests doesn't sound right...
| if cfg.id in seen_ids: | ||
| raise ValueError(f"Duplicate backend ID: '{cfg.id}'") | ||
| if ":" in cfg.id: | ||
| raise ValueError(f"Backend ID cannot contain ':': '{cfg.id}'") |
There was a problem hiding this comment.
optional: performing two validation operations in the same loop, this is a bit odd.
Any way we could use pydantic models to validate the ID regex?
pydantic most likely supports disallowing duplicates in list.
|
|
||
| def get_scheduler(self, job_id: str): | ||
| def get_scheduler(self, job_id: str) -> BaseScheduler: | ||
| """Get scheduler for a job ID. Raises HTTPError(400) if backend unavailable.""" |
There was a problem hiding this comment.
non-blocking: I know it's not from this PR, but docstring formalities would call for:
"""Return scheduler for a job ID.
Raises:
HTTPError(400) if backend unavailable.| """Resolve backend from payload['backend'] or auto-select by file extension.""" | ||
| backend_id = payload.get("backend") | ||
| """Resolve backend from payload['backend_id'] or auto-select by file extension.""" | ||
| backend_id = payload.get("backend_id") |
There was a problem hiding this comment.
optional: can backend_id possibly be nullish here?
| except Exception as e: | ||
| self.log.exception(e) | ||
| raise HTTPError(500, "Unexpected error occurred during creation of job.") from e | ||
| raise HTTPError(500, f"Unexpected error during creation of job: {e}") from e |
There was a problem hiding this comment.
optional: per DRY principle, consider exploring context manager for error handling
| if result.returncode != 0: | ||
| raise RuntimeError( | ||
| f"Script exited with code {result.returncode}\nstderr: {result.stderr[:500]}" | ||
| f"Script exited with code {result.returncode}. See 'Errors' output for full error trace." |
There was a problem hiding this comment.
in that case, it would help to provide the file path.
| await page.waitForSelector('text=Saving Completed', { state: 'hidden' }); | ||
| await scheduler.assertSnapshot(FILENAMES.CREATE_JOB_VIEW); | ||
| // Flaky: file names and timestamps vary by environment | ||
| // await scheduler.assertSnapshot(FILENAMES.CREATE_JOB_VIEW); |
dlqqq
left a comment
There was a problem hiding this comment.
Thank you for addressing our feedback Andrii!
I'm approving this PR because the main branch will continue to evolve, and I don't think see any issues worth blocking on right now. The testing you've added inspires confidence that this new feature works, so I think it's fine to merge for now.
Before the v3.0 release, we should revisit the architecture as a team, remove any unnecessary / excessively complex components, and simplify as much as possible to minimize maintenance burden. We should also think more deeply about schema changes for the local scheduler database, and whether we should add a DB migration script for users.
|
Thanks for the review and approval @dlqqq. Would be happy to have additional discussions.
The current |
Summary
Update Jupyter Scheduler to support scheduling and running arbitrary file types beyond notebooks. Multiple backends can now run simultaneously, allowing different file types to be handled by their respective backends within the same UI.
Introduces the
BaseBackendabstraction, allowing backend authors to define capabilities declaratively (supported file types, output formats, scheduler/executor classes) with automatic discovery via entry points. Custom backends before could have been defined as sets of classes; this PR formalizes the pattern into aBaseBackendclass with declarative configuration and automatic discovery via entry points.Key Features
New
BaseBackendabstractionAvailable backends discovery via new REST API endpoint
GET /scheduler/backendsendpoint returns available backends with their capabilitiesBackends Provided by Default with Jupyter Scheduler
jupyter_server_nb- Execute notebooks via nbconvert (refactoring of an existing notebook execution logic)jupyter_server_py- Execute Python scripts via local subprocessBackend Discovery via Entry Points
Backends are registered using Python entry points in pyproject.toml:
Backend Selection UI
Configuration Options
Route legacy jobs (pre-3.0 UUID-only IDs) to a specific backend
Set preferred backend per file extension
Code Changes
New files:
Modified:
User-facing changes
Backwards-incompatible changes
None.
Pre-v3 pre-multiple-backend jobs have UUID-only IDs and are routed to
legacy_job_backendwhich be default is set to pre-v3 local notebook execution logic. So without installing additional backends default installation works identically (single jupyter_server_nb backend).Testing