Add jupyter-fs integration with projspec chips and scan-url backend #2
- Bump package version in package.json.
- Add jupyter-fs integration details to README, including automatic detection and per-resource scanning.
- Implement new API endpoint for scanning fsspec URLs in routes.py.
- Introduce JfsChipsWidget for displaying projspec chips in jupyter-fs sidebars.
- Update ProjspecPanel and related components to support jupyter-fs resources.
- Modify various components to handle optional paths and improve error handling for remote filesystems.

…t scanning
- Introduce a new helper function _scan_url to run projspec.Project in a worker thread.
- Update the post method in ScanUrlRouteHandler to use async/await for improved I/O handling.
- Ensure project data is returned as JSON after scanning the fsspec URL.

- Add validation to ensure 'url' is a string and 'subpath' is either a string or null, returning a 400 error for invalid inputs.
- Implement a mechanism to use a server-configured allowed URL for path construction, discarding any client-supplied query parameters.
- Introduce a new helper function _redact_url_credentials to safely log URLs by redacting embedded passwords.
- Update _normalize_url to ensure consistent URL comparison by stripping query parameters and normalizing paths.
- Add unit tests for URL normalization, allowlist checking, and credential redaction to improve code coverage and reliability.

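The request validation described in this commit could look roughly like the following. This is a minimal sketch, not the code in routes.py: `validate_body` and its `(status, payload)` return shape are hypothetical names chosen for illustration, and only the rules stated above ('url' must be a string, 'subpath' must be a string or null, otherwise 400) are taken from the commit message.

```python
import json


def validate_body(raw: bytes):
    """Hypothetical helper: return (status_code, payload_or_error_message)."""
    try:
        body = json.loads(raw)
    except ValueError:  # covers JSONDecodeError and UnicodeDecodeError
        return 400, "request body must be valid JSON"
    if not isinstance(body, dict):
        return 400, "request body must be a JSON object"

    url = body.get("url")
    subpath = body.get("subpath")  # a missing key is treated like null here
    if not isinstance(url, str):
        return 400, "'url' must be a string"
    if subpath is not None and not isinstance(subpath, str):
        return 400, "'subpath' must be a string or null"
    return 200, {"url": url, "subpath": subpath}


ok = validate_body(b'{"url": "s3://bucket/prefix"}')
bad_url = validate_body(b'{"url": 5}')
bad_subpath = validate_body(b'{"url": "s3://bucket/prefix", "subpath": ["x"]}')
```

Rejecting wrong types up front means later string operations on `url` and `subpath` cannot raise and surface as a 500.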
- Add validation in ScanUrlRouteHandler to return a 422 error when no jupyter-fs resources are configured.
- Improve URL normalization by ensuring percent-encoded characters in netloc and paths are decoded.
- Update _redact_url_credentials to handle various URL formats for better security logging.
- Introduce comprehensive unit tests for resource extraction and URL handling, including edge cases for missing fields and percent-encoded paths.
- Enhance error logging in fetchJfsResources to provide clearer feedback on network and parsing issues.

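To make the normalization, allowlist, and redaction steps from the last two commits concrete, here is a hedged sketch using only the standard library. The function names mirror the helpers mentioned above but drop the leading underscore to signal that these are illustrative reimplementations, not the PR's code; `find_allowed` is a hypothetical name.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Drop query/fragment, decode percent-escapes, strip trailing slashes."""
    parts = urlsplit(url)  # urlsplit already lowercases the scheme
    netloc = unquote(parts.netloc)
    path = unquote(parts.path).rstrip("/")
    return urlunsplit((parts.scheme, netloc, path, "", ""))


def find_allowed(url: str, allowed: list):
    """Return the server-configured URL that matches, or None.

    Returning the configured URL (not the client's) is what discards any
    client-supplied query parameters.
    """
    target = normalize_url(url)
    for candidate in allowed:
        if normalize_url(candidate) == target:
            return candidate
    return None


def redact_url_credentials(url: str) -> str:
    """Replace an embedded password with '***' so the URL is safe to log."""
    parts = urlsplit(url)
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if not at or ":" not in userinfo:
        return url  # no password present
    user = userinfo.split(":", 1)[0]
    netloc = f"{user}:***@{hostport}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


allowed = ["s3://bucket/prefix"]
match = find_allowed("s3://bucket/prefix?evil=creds", allowed)
redacted = redact_url_credentials("sftp://alice:s3cret@host.example:22/data")
```

In this sketch the comparison happens on normalized forms, while the value handed onward is always the untouched server-side entry.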
- Updated ScanUrlRouteHandler to avoid unquoting subpath, preventing potential double-decoding attacks.
- Added a unit test to ensure that double-encoded traversal attempts are correctly blocked, maintaining security against path traversal vulnerabilities.
- Improved handling of subpath validation to reject malformed inputs without crashing the application.

- Implement a two-layer traversal check to block both raw and single-encoded traversal attempts, improving security against path traversal vulnerabilities.
- Update unit tests to verify that single-encoded dot segments are correctly identified and rejected, while allowing legitimate folder names containing '%' characters.
- Refactor subpath normalization to ensure consistent handling of valid inputs.

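A two-layer check of the kind described here might be sketched as below. `is_safe_subpath` is a hypothetical name, and the exact rules in the PR may differ. The assumption in this sketch is that the subpath is inspected once as received and once after a single percent-decode, and is never decoded before use, so folder names that merely contain '%' pass unchanged.

```python
from urllib.parse import unquote


def _has_parent_segment(path: str) -> bool:
    """True if any '/'-separated segment is exactly '..'."""
    return any(seg == ".." for seg in path.replace("\\", "/").split("/"))


def is_safe_subpath(subpath: str) -> bool:
    """Hypothetical two-layer traversal check.

    Layer 1 inspects the raw string; layer 2 inspects it after one round of
    percent-decoding, which catches '%2e%2e' style segments. The caller is
    expected to keep using the raw (undecoded) subpath afterwards.
    """
    if _has_parent_segment(subpath):
        return False
    if _has_parent_segment(unquote(subpath)):
        return False
    return True


results = {
    "data/2024": is_safe_subpath("data/2024"),
    "100%/reports": is_safe_subpath("100%/reports"),
    "../etc": is_safe_subpath("../etc"),
    "a/%2e%2e/b": is_safe_subpath("a/%2e%2e/b"),
    "%2E%2E%2Fsecret": is_safe_subpath("%2E%2E%2Fsecret"),
}
```

`unquote` leaves malformed escapes such as a bare '%' untouched, which is why a name like `100%` survives both layers.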
- Deleted a test case that checked for a 400 response when a subpath normalizes to '.', as it was deemed unnecessary.
- This change streamlines the test suite while maintaining coverage for critical validation scenarios.

The MutationObserver was dropping its sidebar-level observation when narrowing to the breadcrumb element. When tree-finder replaced the breadcrumb during navigation (e.g. clicking root after visiting a subdirectory), the observer missed the replacement because it was only watching the now-detached old element. Keep the sidebar observation permanently active so structural changes are always detected, and defer the breadcrumb re-read by one task to let tree-finder finish rendering the replacement element.
Made-with: Cursor
Pull request overview
Extends the projspec JupyterLab extension to support project discovery on remote/virtual filesystems surfaced via jupyter-fs, keeping chips/panel in sync across local and jupyter-fs file browser tabs and adding a backend endpoint to scan fsspec URLs safely.
Changes:
- Add a new backend `/jupyter-projspec/scan-url` endpoint to scan allowed jupyter-fs resources (with traversal protection + credential redaction).
- Introduce a `ScanSource` union (`local` | `jfs`) and update panel/chips to scan either local paths or jupyter-fs URLs.
- Inject projspec chips into jupyter-fs sidebars and sync the right panel to the active left sidebar tab.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| style/base.css | Adds empty-state styling and new CSS hooks for jupyter-fs sidebar chip injection. |
| src/widgets/ProjspecPanel.ts | Switches panel state from a path string to a `ScanSource` … |
| src/widgets/JfsChipsWidget.ts | New widget that renders chips in jupyter-fs sidebars and tracks breadcrumb navigation via MutationObserver. |
| src/types.ts | Adds ScanSource union + helpers for equality, display, endpoint selection, and request init building. |
| src/tokens.ts | Introduces a JupyterLab token to share panel state/mappings between plugins with explicit activation ordering. |
| src/index.ts | Splits into main + jupyter-fs integration plugin; injects chips into jupyter-fs sidebars and syncs panel to active tab. |
| src/components/SpecItem.tsx | Updates path prop to nullable to disable make-related UI for non-local sources. |
| src/components/ProjspecPanelComponent.tsx | Refactors scanning to use ScanSource and adds a null-source empty state (no scan). |
| src/components/ProjspecChips.tsx | Adds optional scanUrl prop and routes scans through scan-url POST for jupyter-fs. |
| src/components/ProjectView.tsx | Propagates nullable path down to spec items. |
| src/components/ArtifactsView.tsx | Makes “make” actions conditional on path !== null and prevents calls when unavailable. |
| src/api.ts | Adds client helper to fetch jupyter-fs resources from /jupyterfs/resources and compute sidebar IDs. |
| package.json | Bumps extension version to 0.3.0. |
| jupyter_projspec/tests/test_routes.py | Adds extensive unit/integration test coverage for URL normalization, allowlisting, traversal protection, and scan-url validation. |
| jupyter_projspec/routes.py | Adds /scan-url handler with allowlist validation, traversal protection, URL normalization, and credential redaction. |
| README.md | Documents jupyter-fs integration and the new scan-url endpoint. |
`injectChips` creates a container and attaches a `JfsChipsWidget`, but there is no disposal/cleanup path (e.g., removing the injected container and deleting `sidebarId` from `injected`/`sidebarIdToUrl`) if the sidebar widget is later disposed/recreated. Adding a `chipsWidget.disposed.connect(...)` cleanup, as the local file browser chips do, would prevent orphaned DOM nodes and allow reinjection after re-creation.
chipsWidget.disposed.connect(() => {
  injected.delete(sidebarId);
  sidebarIdToUrl.delete(sidebarId);
  if (container.parentNode) {
    container.parentNode.removeChild(container);
  }
});
@patch("jupyter_projspec.routes._get_jfs_resource_urls",
       return_value=["s3://bucket/prefix"])
async def test_query_param_injection_blocked(self, _mock_jfs, jp_fetch):
    """A URL with injected query params matching an allowed URL must still pass
    the allowlist (query params stripped) and must NOT forward those params."""
    # The handler finds the match and uses the clean server URL, so it won't
    # 403. It will proceed to scan and fail (s3 needs real credentials),
    # but the important assertion is no 403 and no 500 crash.
    with pytest.raises(Exception) as exc_info:
        await jp_fetch(
            "jupyter-projspec", "scan-url",
            method="POST",
            body=json.dumps({
                "url": "s3://bucket/prefix?evil=creds",
            }).encode(),
        )
    code = exc_info.value.response.code
    assert code != 403, "Should not 403 — URL matches the allowlist"
    # 500 is acceptable here: the handler correctly passed the allowlist
    # and attempted a real S3 scan (no credentials in test env), confirming
    # injected query params were discarded and the clean URL was used.
test_query_param_injection_blocked currently triggers a real scan of an s3://... URL (only allowlist is patched), which risks slow/flaky tests due to network/credential/provider behavior. Patch _scan_url (or projspec.Project) in this test to fail fast deterministically while still asserting that the allowlist match succeeds and injected query params are discarded.
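The fail-fast pattern suggested in this comment can be illustrated in isolation. The sketch below does not use the real handler or `jp_fetch`; `routes`, `handle`, and the in-memory allowlist are hypothetical stand-ins. It shows how patching the scan function with `unittest.mock.patch.object` lets a test assert both that the allowlist matched and that the clean URL, without the injected query string, reached the scanner.

```python
from types import SimpleNamespace
from unittest.mock import patch


def _real_scan(url: str) -> dict:
    # Stand-in for the real scan, which would touch the network.
    raise RuntimeError("network access is not available in tests")


# Hypothetical miniature of the module under test.
routes = SimpleNamespace(_scan_url=_real_scan, allowed=["s3://bucket/prefix"])


def handle(url: str):
    """Toy handler: strip the query, check the allowlist, then scan."""
    clean = url.split("?", 1)[0]
    if clean not in routes.allowed:
        return 403, None
    return 200, routes._scan_url(clean)


# Patch the scanner so the test is deterministic and never leaves the process.
with patch.object(routes, "_scan_url", return_value={"specs": {}}) as mock_scan:
    status, body = handle("s3://bucket/prefix?evil=creds")
    forbidden_status, _ = handle("s3://elsewhere/prefix")

scanned_args = mock_scan.call_args.args
```

Because the mock records its arguments, the test can check the discarded query parameters directly instead of inferring them from a 500 response.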
"""Run projspec.Project() in a worker thread (blocking I/O safe).

Uses the shared _executor. This does not compete with make commands
because make is only available for local paths (the UI disables make
buttons for jfs sources), so there is no thread-pool starvation risk.
In ScanUrlRouteHandler, _scan_url uses the shared _executor and the docstring claims it “does not compete with make commands”. That’s not true in practice: users can still run local make requests while jupyter-fs scans are in-flight, and remote scans can be long-running (network I/O), potentially consuming threads and delaying/queuing make work. Consider using a dedicated executor for scan-url (or a tighter max_workers / separate concurrency limit) and update the comment accordingly.
-    """Run projspec.Project() in a worker thread (blocking I/O safe).
-    Uses the shared _executor. This does not compete with make commands
-    because make is only available for local paths (the UI disables make
-    buttons for jfs sources), so there is no thread-pool starvation risk.
+    """Construct a projspec.Project for the given fsspec URL and return its dict.
+    This function is intended to be run in a worker thread / executor by the
+    caller, since it may perform blocking I/O.
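One way to act on the dedicated-executor suggestion is sketched below. The pool sizes, the `_scan_executor` and `_make_executor` names, and the stand-in `blocking_job` are assumptions for illustration; the PR's actual `_executor` setup is not shown in this conversation.

```python
import asyncio
import threading
from concurrent.futures import ThreadPoolExecutor

# Separate pools so long-running remote scans cannot occupy the threads
# that local make commands rely on. Sizes are illustrative assumptions.
_make_executor = ThreadPoolExecutor(max_workers=4, thread_name_prefix="make")
_scan_executor = ThreadPoolExecutor(max_workers=2, thread_name_prefix="scan-url")


def blocking_job(label: str) -> str:
    """Stand-in for blocking work; reports which pool thread ran it."""
    return f"{label}:{threading.current_thread().name}"


async def scan_url(url: str) -> str:
    loop = asyncio.get_running_loop()
    # The blocking call runs on the scan pool, keeping the event loop free.
    return await loop.run_in_executor(_scan_executor, blocking_job, url)


async def run_make(target: str) -> str:
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_make_executor, blocking_job, target)


async def main():
    return await asyncio.gather(scan_url("memory://demo"), run_make("build"))


scan_result, make_result = asyncio.run(main())
```

With two pools, a burst of slow scans can queue behind `max_workers=2` without delaying make requests, at the cost of a few extra idle threads.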
Motivation
Project discovery only worked with the local file browser. Many workflows involve remote filesystems (S3, Samba, SFTP) accessed through jupyter-fs. This PR extends projspec to work with jupyter-fs resources.
What changes
- `/scan-url` backend endpoint scans remote filesystems via fsspec URLs, validated against configured jupyter-fs resources
- `ScanSource` discriminated union (`local` | `jfs`) separates the two scan paths in the frontend
Test plan
pytest jupyter_projspec/tests/test_routes.py