Feat/sandbox by distributedstatemachine · Pull Request #329 · one-covenant/basilica

distributedstatemachine · 2026-01-10T16:54:31Z

Summary

Brief description of what this PR does.

Related Issues

Closes #(issue number)

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Documentation update
Performance improvement
Code refactoring

Changes Made

List the main changes in this PR:

Testing

How Has This Been Tested?

Describe the tests you ran to verify your changes.

Unit tests pass (cargo test)
Integration tests pass
Manual testing completed

Test Configuration

OS:
Rust version:
Bittensor version (if applicable):

Checklist

My code follows the project's style guidelines
I have run cargo fmt to format my code
I have run cargo clippy and addressed all warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
Any dependent changes have been merged and published

Additional Context

Add any other context about the PR here.

Summary by CodeRabbit

New Features
- Full Sandbox API: lifecycle, code execution, file I/O, snapshots, Git, LSP, and resource/GPU controls exposed across SDKs.
- Quick-start factory functions and global configuration for defaults.
Documentation
- Extensive sandbox guides, quick-starts, and API reference added to SDK docs.
Tests
- End-to-end test harness and local K3d-based test workflow for sandbox verification and automation.

coderabbitai · 2026-01-10T16:54:47Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

Walkthrough

Adds a full Sandbox subsystem across Python and Rust SDKs (types, client, lifecycle, exec, files, git, LSP), exposes it in package/crate public APIs, and adds local K3d-based manifests and end-to-end test automation for sandbox validation.

Changes

Cohort / File(s)	Summary
Python SDK — public surface `crates/basilica-sdk-python/python/basilica/__init__.py`	Exports sandbox types and helpers (Sandbox, SandboxState, exceptions, ExecResult, FileInfo, Git results, LSP types), global _Config, `configure`/`get_config`, and quick-start factories (`python_sandbox`, `javascript_sandbox`, `js_sandbox`).
Python SDK — implementation `crates/basilica-sdk-python/python/basilica/sandbox.py`	Implements Sandbox client with lifecycle (create/get/refresh/delete/wait), file I/O, exec/run, Git ops, LSP methods, dataclasses for results/types, namespaced helpers (files/process/git), and errors.
Rust SDK — public API wiring `crates/basilica-sdk/src/lib.rs`	Re-exports new `sandbox` module and `sandbox_types`, and exposes `BasilicaClient`/`ClientBuilder` in crate root docs.
Rust SDK — implementation `crates/basilica-sdk/src/sandbox.rs`	Adds full Rust Sandbox client and rich type model: SandboxConfig/ResourceSpec/GpuSpec, Sandbox client (create/from_id/status/wait/exec/read/write/list/git/snapshot/lsp/delete), HTTP client wiring, serde payloads, and unit tests.
K3d Kubernetes manifests `scripts/localtest/k3d-manifests/api-deploy.yaml`, `scripts/localtest/k3d-manifests/operator-deploy.yaml`	Adds local K3d manifests for API and Operator: ConfigMap, ServiceAccount, ClusterRole/Binding, Deployments (probes, env, resources), Service, and Ingress for local testing.
E2E test scripts `scripts/localtest/sandbox-e2e.sh`, `scripts/localtest/sandbox-k3d-e2e.sh`	Adds comprehensive API/kubectl E2E harness and a K3d orchestration script: prereqs, CRD/RBAC setup, create/exec/files/snapshot tests, cluster/image build/push, status/logs, and cleanup commands.
Build automation `justfile`	Adds Justfile targets wrapping the K3d E2E script (sandbox-setup, sandbox-test, sandbox-api-test, sandbox-all, sandbox-status, sandbox-logs, sandbox-cleanup).
Docs `crates/basilica-sdk-python/README.md`, `crates/basilica-sdk-python/sandbox.md`	Adds a comprehensive Sandbox section and a long sandbox.md guide with quick starts, API reference, examples, and best practices.

Sequence Diagram(s)

sequenceDiagram
    participant User as User
    participant PySDK as Python SDK<br/>(Sandbox)
    participant API as Basilica API
    participant K8s as Kubernetes<br/>(Operator)
    participant Container as Sandbox<br/>Container

    User->>PySDK: Sandbox.create(language="python")
    PySDK->>API: POST /sandboxes
    API->>K8s: Create BasilicaSandbox CR
    K8s->>Container: Provision/Start pod
    Container-->>K8s: Pod Ready
    K8s-->>API: Status: Ready
    API-->>PySDK: sandbox_id,status
    PySDK-->>User: Sandbox instance

    User->>PySDK: sandbox.run(code)
    PySDK->>API: POST /sandboxes/{id}/run
    API->>Container: Execute code
    Container-->>API: ExecResult
    API-->>PySDK: ExecResult
    PySDK-->>User: stdout/stderr

sequenceDiagram
    participant Tester as Tester
    participant K3dScript as K3d E2E Script
    participant Docker as Docker Registry
    participant K3d as K3d Cluster
    participant Operator as Basilica Operator
    participant API as Basilica API
    participant E2ETest as Sandbox E2E Tests

    Tester->>K3dScript: ./sandbox-k3d-e2e.sh setup
    K3dScript->>Docker: Build & push images
    Docker-->>K3dScript: Images available
    K3dScript->>K3d: Create cluster & apply manifests
    K3d->>Operator: Deploy operator
    K3d->>API: Deploy API
    K3dScript->>Tester: Setup complete

    Tester->>K3dScript: ./sandbox-k3d-e2e.sh test
    K3dScript->>E2ETest: Run sandbox-e2e.sh
    E2ETest->>API: create/exec/files/snapshot requests
    API->>Operator: orchestrate sandboxes
    Operator->>K3d: create sandbox pods
    K3d->>E2ETest: test results
    E2ETest-->>K3dScript: Tests complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

refactor(scripts): rebuild localnet as composable Docker environment #350: Overlapping local infra/K3d manifests and test orchestration changes that touch the same E2E/local testing surface.

Suggested reviewers

epappas

Poem

🐰 A sandbox blossoms, code to play,
Python, Rust — both pave the way,
Snapshots, git, LSP in tow,
K3d tests make confidence grow,
Hop!—features ready, off we go!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name	Status	Explanation	Resolution
Title check	❓ Inconclusive	The title 'Feat/sandbox' is vague and generic, using a convention-based prefix without describing the actual changes or their scope.	Use a more descriptive title that summarizes the main change, such as 'Add Sandbox API with client library and local testing infrastructure' or similar that conveys what was actually implemented.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/sandbox

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🤖 Fix all issues with AI agents

In @crates/basilica-sdk-python/python/basilica/sandbox.py:
- Line 515: The raise statement is using an unnecessary f-string in
SandboxError; replace the f-string raise SandboxError(f"Timeout waiting for
sandbox to become ready") with a plain string raise SandboxError("Timeout
waiting for sandbox to become ready") in the sandbox readiness/wait loop (look
for the raise referencing SandboxError and the timeout message) so the message
is a normal string literal without the f-prefix.
- Around line 307-318: The state property fails to match API PascalCase values;
normalize the incoming string before constructing SandboxState. In the state
property use a normalized form of self._state (e.g., convert PascalCase to
lowercase or call .lower()) when creating SandboxState(self._state...), keep the
ValueError fallback to SandboxState.FAILED, and no change needed to is_ready
since it relies on the corrected state property.
- Around line 1323-1341: The delete() method calls response.json() which will
raise a JSONDecodeError for empty/204 responses; update sandbox.delete to avoid
parsing the body unconditionally by checking response.status_code or
response.content first and only calling response.json() when there is a
non-empty body, returning None otherwise; ensure you still set self._state =
"Terminated" after a successful delete and preserve the existing error handling
(raise SandboxError on RequestException).

In @crates/basilica-sdk/src/sandbox.rs:
- Around line 676-691: from_id currently sets the sandbox's language to an empty
string which causes lsp_init to pick an empty language via
language.unwrap_or(&self.language); change from_id to accept a language argument
(e.g., add parameter language: impl Into<String>) and set Self.language to that
value, updating the Sandbox::from_id signature and callers accordingly so
reconnecting sandboxes retain correct language information; alternatively, if
you prefer runtime discovery, implement a fetch of sandbox status inside from_id
to populate language from the server before returning.

In @scripts/localtest/sandbox-e2e.sh:
- Around line 100-119: The script assumes a local basilica-backend at hardcoded
path ($SCRIPT_DIR/../../../basilica-backend) which may be missing; update the
logic around crd_file and operator_dir to accept configurable environment
variables (e.g., BASILICA_BACKEND_DIR) and fall back to the existing relative
path, validate existence before use, and improve the error message to instruct
how to obtain the dependency (git submodule URL or manual clone) when neither
the CRD file nor operator are found; update references to crd_file and
operator_dir and ensure the kubectl/cargo fallback only runs when operator_dir
exists.

🧹 Nitpick comments (10)

scripts/localtest/k3d-manifests/operator-deploy.yaml (2)
28-31: Consider scoping down wildcard permissions.

The wildcard permissions (resources: ["*"], verbs: ["*"]) grant overly broad access. While acceptable for local testing, consider specifying explicit resource types (e.g., basilicasandboxes, basilicasandboxes/status) for better security posture.
♻️ Proposed fix
 # BasilicaSandbox CRD management
 - apiGroups: ["basilica.ai"]
-  resources: ["*"]
-  verbs: ["*"]
+  resources: ["basilicasandboxes", "basilicasandboxes/status"]
+  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
86-118: Add security context for better security posture.

Even for local testing, consider adding a securityContext to prevent privilege escalation and avoid running as root. This helps catch potential security issues early.
♻️ Proposed fix
       serviceAccountName: basilica-operator
       containers:
       - name: operator
         image: localhost:5050/basilica-operator:latest
         imagePullPolicy: Always
+        securityContext:
+          allowPrivilegeEscalation: false
+          runAsNonRoot: true
+          runAsUser: 1000
+          capabilities:
+            drop:
+              - ALL
         args:
scripts/localtest/sandbox-k3d-e2e.sh (1)
637-637: Consider declaring and assigning separately to follow best practices.

Shellcheck flags SC2155 because combining declaration and assignment with command substitution can mask return values. While date +%s is unlikely to fail, separating them is a defensive practice.
♻️ Proposed fix
     log_step "Running manual sandbox test..."
     
-    # Create a test sandbox directly via kubectl
-    local sandbox_name="test-sandbox-$(date +%s)"
+    local sandbox_name
+    sandbox_name="test-sandbox-$(date +%s)"
     
     log_info "Creating sandbox: $sandbox_name"
scripts/localtest/k3d-manifests/api-deploy.yaml (2)
88-106: Consider adding a security context to harden the container.

The static analysis flags that the container lacks explicit security settings. For a local testing manifest this is acceptable, but adding a security context would align with production best practices and prevent accidental privilege escalation.
🛡️ Suggested security context
       containers:
       - name: api
         image: localhost:5050/basilica-api:latest
         imagePullPolicy: Always
+        securityContext:
+          runAsNonRoot: true
+          allowPrivilegeEscalation: false
+          readOnlyRootFilesystem: true
+          capabilities:
+            drop:
+              - ALL
         ports:
         - name: http
           containerPort: 8080
155-158: Consider using ingressClassName instead of deprecated annotation.

The kubernetes.io/ingress.class annotation is deprecated since Kubernetes 1.18. While it still works in K3d, consider using the ingressClassName spec field for forward compatibility.
♻️ Suggested change
   annotations:
-    # K3s uses Traefik by default
-    kubernetes.io/ingress.class: traefik
+    # K3s uses Traefik by default
 spec:
+  ingressClassName: traefik
   rules:
scripts/localtest/sandbox-e2e.sh (2)
84-90: Separate variable declaration from command substitution to avoid masking failures.

As flagged by Shellcheck (SC2155), combining local with command substitution masks the command's exit status. With set -e, this could cause silent failures.
♻️ Suggested pattern
     # Check API health
-    local health=$(api_call GET "/health" 2>/dev/null || echo "{}")
+    local health
+    health=$(api_call GET "/health" 2>/dev/null || echo "{}")
     if echo "$health" | jq -e '.status == "ok" or .status == "healthy"' &>/dev/null; then
Apply this pattern throughout the script where local is combined with command substitution (lines 151, 167, 196, 198-199, 236, 249, 278, 280, 292, 295, 308, 313, 403, 427, 439).
304-322: Cleanup reports success without verifying deletion succeeded.

The cleanup function logs success regardless of whether the DELETE API call actually succeeded. Consider checking the response status.
♻️ Suggested improvement
 cleanup() {
     log_info "Cleaning up test sandboxes..."
     
     if [ -n "$SANDBOX_ID" ]; then
-        local response=$(api_call DELETE "/api/v1/sandboxes/$SANDBOX_ID")
-        log_success "Deleted sandbox: $SANDBOX_ID"
+        if api_call DELETE "/api/v1/sandboxes/$SANDBOX_ID" >/dev/null 2>&1; then
+            log_success "Deleted sandbox: $SANDBOX_ID"
+        else
+            log_warn "Failed to delete sandbox: $SANDBOX_ID"
+        fi
     fi
crates/basilica-sdk/src/sandbox.rs (2)
600-612: Empty response handling could be more explicit.

Parsing "null" to handle empty responses works for types like Option<T> and (), but may fail unexpectedly for types requiring actual data. Consider returning an error or documenting this behavior.
♻️ Alternative approach
         if response.status().is_success() {
-            // Handle empty responses (for delete, write, etc.)
             let text = response.text().await.map_err(ApiError::HttpClient)?;
             if text.is_empty() {
-                // Return default value for unit type responses
-                serde_json::from_str("null").map_err(|e| ApiError::Internal {
-                    message: format!("Failed to parse response: {}", e),
-                })
+                // For empty responses, try to deserialize from empty JSON object
+                // This works for structs with all optional/default fields
+                serde_json::from_str("{}").or_else(|_| {
+                    serde_json::from_str("null")
+                }).map_err(|e| ApiError::Internal {
+                    message: format!("Failed to parse empty response: {}", e),
+                })
             } else {
1461-1494: LSP notification methods return Result<()> but never fail.

Methods like lsp_did_open, lsp_did_change, and lsp_shutdown swallow errors with .unwrap_or_default() but still return Result<()>. This is semantically correct for "fire-and-forget" LSP notifications but could be simplified.

Either return () directly since these never fail, or log/trace errors before discarding them for debugging purposes. The current approach works but the Result wrapper provides no value.
crates/basilica-sdk-python/python/basilica/sandbox.py (1)

1257-1298: LSP document versions are hardcoded.

The lsp_did_open always uses version 1 and lsp_did_change always uses version 2. LSP servers expect monotonically increasing versions. If lsp_did_change is called multiple times, the version will be stale.

Consider tracking document versions per file, or documenting that users should manage versions themselves for complex workflows. For basic use cases, this limitation is acceptable.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7728c7b and 95741b0.

📒 Files selected for processing (9)

crates/basilica-sdk-python/python/basilica/__init__.py
crates/basilica-sdk-python/python/basilica/sandbox.py
crates/basilica-sdk/src/lib.rs
crates/basilica-sdk/src/sandbox.rs
justfile
scripts/localtest/k3d-manifests/api-deploy.yaml
scripts/localtest/k3d-manifests/operator-deploy.yaml
scripts/localtest/sandbox-e2e.sh
scripts/localtest/sandbox-k3d-e2e.sh

🧰 Additional context used

🧬 Code graph analysis (2)

crates/basilica-sdk-python/python/basilica/__init__.py (2)

crates/basilica-sdk-python/python/basilica/sandbox.py (17)

Sandbox (266-1357)

SandboxState (38-48)

SandboxError (234-237)

SandboxNotFound (240-245)

SandboxNotReady (248-254)

ExecutionError (257-263)

ExecResult (79-89)

FileInfo (93-100)

Snapshot (104-111)

NetworkIsolation (51-56)

GpuSpec (60-66)

ResourceSpec (70-75)

GitCloneResult (115-122)

GitStatusResult (126-135)

GitCommitResult (139-145)

GitPushResult (149-155)

GitPullResult (159-166)

crates/basilica-sdk-python/python/basilica/_basilica.pyi (1)

GpuSpec (76-85)

crates/basilica-sdk-python/python/basilica/sandbox.py (1)

crates/basilica-sdk/src/sandbox.rs (26)

success (256-258)

language (699-701)

is_ready (61-63)

create (655-674)

get (567-584)

post (586-618)

wait_until_ready (711-738)

run (787-801)

exec (741-755)

read_file (823-845)

write_file (848-870)

list_files (873-895)

create_snapshot (898-913)

git_clone (936-966)

git_status (984-1000)

git_commit (1020-1046)

git_push (1066-1093)

git_pull (1111-1138)

lsp_init (1164-1212)

lsp_completion (1232-1276)

lsp_hover (1295-1379)

lsp_did_open (1466-1494)

lsp_definition (1399-1459)

lsp_shutdown (1528-1539)

delete (620-637)

delete (1542-1546)

🪛 Checkov (3.2.334)

scripts/localtest/k3d-manifests/operator-deploy.yaml

[medium] 23-52: Minimize wildcard use in Roles and ClusterRoles

(CKV_K8S_49)

[medium] 66-118: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)

[medium] 66-118: Minimize the admission of root containers

(CKV_K8S_23)

scripts/localtest/k3d-manifests/api-deploy.yaml

[medium] 68-131: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)

[medium] 68-131: Minimize the admission of root containers

(CKV_K8S_23)

🪛 Ruff (0.14.10)

crates/basilica-sdk-python/python/basilica/sandbox.py

374-374: Avoid specifying long messages outside the exception class

(TRY003)

410-410: Avoid specifying long messages outside the exception class

(TRY003)

460-460: Abstract raise to an inner function

(TRY301)

466-466: Avoid specifying long messages outside the exception class

(TRY003)

482-482: Abstract raise to an inner function

(TRY301)

489-489: Avoid specifying long messages outside the exception class

(TRY003)

509-509: Avoid specifying long messages outside the exception class

(TRY003)

511-511: Avoid specifying long messages outside the exception class

(TRY003)

515-515: Avoid specifying long messages outside the exception class

(TRY003)

515-515: f-string without any placeholders

Remove extraneous f prefix

(F541)

566-566: Avoid specifying long messages outside the exception class

(TRY003)

621-621: Avoid specifying long messages outside the exception class

(TRY003)

650-650: Abstract raise to an inner function

(TRY301)

650-650: Avoid specifying long messages outside the exception class

(TRY003)

657-657: Avoid specifying long messages outside the exception class

(TRY003)

677-677: Avoid specifying long messages outside the exception class

(TRY003)

708-708: Avoid specifying long messages outside the exception class

(TRY003)

737-737: Avoid specifying long messages outside the exception class

(TRY003)

787-787: Avoid specifying long messages outside the exception class

(TRY003)

823-823: Avoid specifying long messages outside the exception class

(TRY003)

866-866: Avoid specifying long messages outside the exception class

(TRY003)

908-908: Avoid specifying long messages outside the exception class

(TRY003)

950-950: Avoid specifying long messages outside the exception class

(TRY003)

997-997: Avoid specifying long messages outside the exception class

(TRY003)

1037-1037: Avoid specifying long messages outside the exception class

(TRY003)

1056-1056: Consider moving this statement to an else block

(TRY300)

1058-1058: Avoid specifying long messages outside the exception class

(TRY003)

1119-1119: Avoid specifying long messages outside the exception class

(TRY003)

1177-1177: Consider moving this statement to an else block

(TRY300)

1179-1179: Avoid specifying long messages outside the exception class

(TRY003)

1236-1236: Consider moving this statement to an else block

(TRY300)

1238-1238: Avoid specifying long messages outside the exception class

(TRY003)

1341-1341: Avoid specifying long messages outside the exception class

(TRY003)

🪛 Shellcheck (0.11.0)

scripts/localtest/sandbox-k3d-e2e.sh

[warning] 637-637: Declare and assign separately to avoid masking return values.

(SC2155)

scripts/localtest/sandbox-e2e.sh

[warning] 85-85: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 151-151: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 167-167: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 196-196: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 198-198: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 199-199: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 236-236: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 249-249: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 278-278: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 280-280: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 292-292: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 295-295: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 308-308: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 313-313: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 403-403: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 427-427: Declare and assign separately to avoid masking return values.

(SC2155)

[warning] 439-439: Declare and assign separately to avoid masking return values.

(SC2155)

🔇 Additional comments (12)

crates/basilica-sdk/src/lib.rs (1)

7-45: LGTM! Well-structured sandbox API surface.

The documentation example is clear and helpful, and the module organization follows Rust best practices. The sandbox_types re-export namespace provides a clean way to access all sandbox-related types.

justfile (1)

406-451: LGTM! Well-organized sandbox testing commands.

The commands follow a consistent pattern and provide comprehensive coverage of the sandbox testing lifecycle (setup, test, status, logs, cleanup). The implementation matches the existing justfile conventions.

scripts/localtest/sandbox-k3d-e2e.sh (1)

1-834: Comprehensive and well-structured E2E testing script.

The script demonstrates excellent practices:

Clear prerequisite checks with actionable guidance

Robust fallback strategies for missing files

Good error handling and user feedback

Proper signal handling (Ctrl+C trap)

Well-organized command-line interface

crates/basilica-sdk-python/python/basilica/__init__.py (1)

138-198: LGTM! Clean sandbox API integration.

The imports are well-organized, and the use of aliases (SandboxGpuSpec, SandboxResourceSpec) elegantly avoids naming conflicts with existing types. All sandbox-related types are properly exposed in __all__.

scripts/localtest/k3d-manifests/api-deploy.yaml (1)

1-30: LGTM for local testing configuration.

The ConfigMap and deployment are well-documented with clear comments indicating this is for local testing only. Auth is intentionally disabled for local development, and the exec agent image points to the local registry appropriately.

scripts/localtest/sandbox-e2e.sh (1)

324-351: LGTM - Well-structured E2E test orchestration.

The run_all_tests function has proper error handling with cleanup on failure. The test sequence (create → exec → files → snapshot → cleanup) covers the core sandbox lifecycle comprehensively.

crates/basilica-sdk/src/sandbox.rs (3)

40-64: LGTM - Well-designed state enum with helper methods.

The SandboxState enum with is_terminal() and is_ready() helpers provides a clean API. Including Executing in is_ready() is correct since commands can still be issued during execution.

710-738: LGTM - Robust polling implementation.

The wait_until_ready method correctly handles the three exit conditions: ready state, terminal state (with error), and timeout. The 1-second poll interval is appropriate for sandbox readiness checks.

1554-1731: LGTM - Comprehensive unit tests for data types.

The tests thoroughly cover configuration building, state helpers, and serialization/deserialization for Git result types. The test for skip_serializing_if behavior (line 1728-1729) is particularly valuable.

crates/basilica-sdk-python/python/basilica/sandbox.py (3)

398-423: LGTM - Proper session handling in factory method.

The create() classmethod correctly creates a one-off request with explicit headers since the Session doesn't exist until after the instance is created. Subsequent operations correctly use self._session.

1343-1354: LGTM - Proper context manager implementation.

The context manager correctly waits for readiness on entry and cleans up on exit. Silencing deletion errors in __exit__ is the right pattern to avoid masking the original exception that caused the context to exit.

1-34: LGTM - Well-documented module with comprehensive docstrings.

The module docstring provides clear usage examples that demonstrate the core Sandbox workflow. The structure mirrors the Rust SDK appropriately.

crates/basilica-sdk-python/python/basilica/sandbox.py

crates/basilica-sdk/src/sandbox.rs

scripts/localtest/sandbox-e2e.sh

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In @crates/basilica-sdk-python/python/basilica/sandbox.py:
- Line 757: Update the docstring for the "depth" parameter in the relevant
function/method in basilica.sandbox.py to remove the incorrect "defaults to 1"
claim and instead state that depth is optional and only sent when provided, with
the actual default determined by the API server (e.g., "Optional clone depth; if
omitted, the server's default is used"). Also mention that the SDK only includes
the depth value in the request when explicitly provided.
- Around line 299-305: The Sandbox class creates a requests.Session
(self._session) but never closes it; add a public close() method that calls
self._session.close() and update the context-manager methods on Sandbox
(implement or modify __enter__ to return self and __exit__ to call close()) so
users can use "with Sandbox(...)" safely; optionally implement __del__ to call
close() as a fallback to ensure sockets are released when the object is
garbage-collected.
- Line 515: The raise statement uses an unnecessary f-string: replace raise
SandboxError(f"Timeout waiting for sandbox to become ready") with either a plain
string literal raise SandboxError("Timeout waiting for sandbox to become ready")
or include the sandbox identifier for context (e.g. raise SandboxError(f"Timeout
waiting for sandbox {sandbox_id} to become ready")), updating the variable name
to the actual sandbox id variable used in the surrounding function where
SandboxError is raised.

🧹 Nitpick comments (3)

crates/basilica-sdk-python/python/basilica/sandbox.py (3)
1276-1298: LSP document versioning is hardcoded, limiting multiple change notifications.

lsp_did_open uses version: 1 and lsp_did_change uses version: 2. For proper LSP compliance, document versions should increment with each change. This works for single-change scenarios but will confuse LSP servers expecting monotonically increasing versions across multiple edits.

Consider tracking versions per document or accepting version as a parameter.
Suggested approach
-    def lsp_did_change(self, file: str, content: str) -> None:
+    def lsp_did_change(self, file: str, content: str, version: int = 2) -> None:
         """
         Notify LSP server that a file was changed.

         Args:
             file: File path (relative to workspace)
             content: New file content
+            version: Document version (should increment with each change)
         """
         request = {
             "method": "textDocument/didChange",
             "params": {
-                "textDocument": {"uri": f"file://{file}", "version": 2},
+                "textDocument": {"uri": f"file://{file}", "version": version},
                 "contentChanges": [{"text": content}],
             },
         }
257-264: ExecutionError exception is defined but never raised in the codebase.

The ExecutionError exception is exported as part of the public API but is never instantiated or raised anywhere. Both run() and exec() methods catch requests.RequestException and raise SandboxError instead. Either use this specialized exception for execution failures or remove it if not intended for public use.

181-188: Remove unused LspError dataclass.

The LspError dataclass is never instantiated or used anywhere in the codebase. LSP error responses are handled by raising SandboxError directly (lines 997, 1037). Remove this unused class.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 95741b0 and 27fe9d1.

📒 Files selected for processing (1)

crates/basilica-sdk-python/python/basilica/sandbox.py

🧰 Additional context used

🪛 Ruff (0.14.10)

crates/basilica-sdk-python/python/basilica/sandbox.py

374-374: Avoid specifying long messages outside the exception class

(TRY003)

410-410: Avoid specifying long messages outside the exception class

(TRY003)

460-460: Abstract raise to an inner function

(TRY301)

466-466: Avoid specifying long messages outside the exception class

(TRY003)

482-482: Abstract raise to an inner function

(TRY301)

489-489: Avoid specifying long messages outside the exception class

(TRY003)

509-509: Avoid specifying long messages outside the exception class

(TRY003)

511-511: Avoid specifying long messages outside the exception class

(TRY003)

515-515: Avoid specifying long messages outside the exception class

(TRY003)

515-515: f-string without any placeholders

Remove extraneous f prefix

(F541)

566-566: Avoid specifying long messages outside the exception class

(TRY003)

621-621: Avoid specifying long messages outside the exception class

(TRY003)

650-650: Abstract raise to an inner function

(TRY301)

650-650: Avoid specifying long messages outside the exception class

(TRY003)

657-657: Avoid specifying long messages outside the exception class

(TRY003)

677-677: Avoid specifying long messages outside the exception class

(TRY003)

708-708: Avoid specifying long messages outside the exception class

(TRY003)

737-737: Avoid specifying long messages outside the exception class

(TRY003)

787-787: Avoid specifying long messages outside the exception class

(TRY003)

823-823: Avoid specifying long messages outside the exception class

(TRY003)

866-866: Avoid specifying long messages outside the exception class

(TRY003)

908-908: Avoid specifying long messages outside the exception class

(TRY003)

950-950: Avoid specifying long messages outside the exception class

(TRY003)

997-997: Avoid specifying long messages outside the exception class

(TRY003)

1037-1037: Avoid specifying long messages outside the exception class

(TRY003)

1056-1056: Consider moving this statement to an else block

(TRY300)

1058-1058: Avoid specifying long messages outside the exception class

(TRY003)

1119-1119: Avoid specifying long messages outside the exception class

(TRY003)

1177-1177: Consider moving this statement to an else block

(TRY300)

1179-1179: Avoid specifying long messages outside the exception class

(TRY003)

1236-1236: Consider moving this statement to an else block

(TRY300)

1238-1238: Avoid specifying long messages outside the exception class

(TRY003)

1341-1341: Avoid specifying long messages outside the exception class

(TRY003)

🔇 Additional comments (7)

crates/basilica-sdk-python/python/basilica/sandbox.py (7)

1-36: LGTM!

The module docstring provides clear, actionable usage examples covering the main workflows (create, run, exec, file operations, cleanup). Imports are appropriate for the module's needs.

38-57: LGTM!

Enums are well-defined with clear docstrings and appropriate values for sandbox lifecycle states and network isolation modes.

59-112: LGTM!

Dataclasses are well-structured with appropriate defaults, type annotations, and the convenient ExecResult.success property.

320-423: LGTM!

The create method has clear parameter documentation, proper environment variable fallbacks, input validation for the API key, and correctly handles the wait behavior. The 300-second default for wait_timeout aligns with the PR objective to allow more time for sandbox cold starts.

517-628: LGTM!

Both run and exec methods properly check sandbox readiness before execution, use appropriate HTTP timeouts with buffer, and consistently map response data to ExecResult.

630-738: LGTM!

File operations and snapshot creation are properly implemented with appropriate timeouts, error handling, and response parsing.

1343-1354: LGTM!

The context manager implementation properly waits for readiness on entry and cleans up the sandbox on exit. Ignoring deletion errors in __exit__ is appropriate to prevent masking the original exception.

crates/basilica-sdk-python/python/basilica/sandbox.py

coderabbitai

Actionable comments posted: 4

🤖 Fix all issues with AI agents

In @crates/basilica-sdk-python/python/basilica/sandbox.py:
- Line 738: The raised SandboxError uses an unnecessary f-string prefix: change
the line raising SandboxError in the sandbox readiness code (raise
SandboxError(f"Timeout waiting for sandbox to become ready")) to remove the f
prefix so it reads raise SandboxError("Timeout waiting for sandbox to become
ready"); ensure no other f-string placeholders are expected in the SandboxError
message and run linters/tests to confirm F541 is resolved.
- Around line 508-518: Remove the earlier, incomplete context-manager methods by
deleting the first definitions of __enter__ and __exit__ in the Sandbox class
(the ones that simply return self and swallow all exceptions), so the later,
more robust implementations of __enter__ (which calls wait_until_ready()) and
__exit__ (which only catches SandboxError) remain; locate the duplicate method
names __enter__ and __exit__ and remove the initial pair to avoid the F811
redefinition.
- Around line 29-35: The package imports the requests module in basilica.sandbox
(import requests) but pyproject.toml declares an empty dependencies list; add
"requests" to the dependencies array in pyproject.toml (specify a compatible
version or use "*" / minimal supported version constraint) so requests is
installed at runtime and remove the ImportError risk.
- Around line 334-377: The wrapper methods SandboxGit.clone/push/pull pass
auth_token but the underlying Sandbox methods git_clone, git_push, and git_pull
lack that parameter; update the Sandbox class by adding an auth_token:
Optional[str] = None parameter to the git_clone, git_push, and git_pull method
signatures and ensure each method includes the auth_token in the request
payload/body (e.g., add auth_token to the dict sent to the API/runner) and
forwards it through any internal calls so the wrapper no longer passes an
unexpected argument.

🧹 Nitpick comments (3)

crates/basilica-sdk-python/python/basilica/__init__.py (1)
175-210: Global configuration singleton is not thread-safe.

The _Config class and _config singleton use mutable shared state without synchronization. In multi-threaded applications, concurrent calls to configure() and property access could lead to race conditions.

For an SDK that may be used in async/threaded contexts, consider using a threading.Lock for safe access, or document that configure() should only be called during initialization.
🔧 Optional: Thread-safe configuration
+import threading
+
 class _Config:
     """Global configuration singleton for Basilica SDK."""
     
     def __init__(self):
         self._api_url: Optional[str] = None
         self._api_key: Optional[str] = None
+        self._lock = threading.Lock()
     
     @property
     def api_url(self) -> str:
         """Get API URL from config or environment."""
-        if self._api_url:
-            return self._api_url
-        return os.environ.get("BASILICA_API_URL", DEFAULT_API_URL)
+        with self._lock:
+            if self._api_url:
+                return self._api_url
+            return os.environ.get("BASILICA_API_URL", DEFAULT_API_URL)
crates/basilica-sdk-python/python/basilica/sandbox.py (2)
272-278: Consider returning False explicitly in the else block.

The static analyzer suggests moving the return True into an else block for clearer control flow (TRY300). However, the current logic is correct—this is a minor style suggestion.
♻️ Optional: Clearer control flow
     def exists(self, path: str) -> bool:
         """Check if a file exists."""
         try:
             self._sandbox.read_file(self._resolve_path(path))
-            return True
         except SandboxError:
             return False
+        else:
+            return True
1344-1402: Consider adding a small delay or retry for diagnostics.

lsp_diagnostics() calls lsp_did_open() to trigger diagnostics, then immediately requests them. Some language servers may need time to analyze the file before diagnostics are available. Consider documenting this behavior or adding an optional delay parameter.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 27fe9d1 and 199ff02.

📒 Files selected for processing (3)

crates/basilica-sdk-python/README.md
crates/basilica-sdk-python/python/basilica/__init__.py
crates/basilica-sdk-python/python/basilica/sandbox.py

✅ Files skipped from review due to trivial changes (1)

crates/basilica-sdk-python/README.md

🧰 Additional context used

🧬 Code graph analysis (1)

crates/basilica-sdk-python/python/basilica/sandbox.py (1)

crates/basilica-sdk/src/sandbox.rs (19)

success (256-258)

write_file (848-870)

list_files (873-895)

run (787-801)

exec (741-755)

git_clone (936-966)

status (704-708)

git_status (984-1000)

git_commit (1020-1046)

git_push (1066-1093)

git_pull (1111-1138)

language (699-701)

lsp_init (1164-1212)

lsp_completion (1232-1276)

lsp_hover (1295-1379)

lsp_did_open (1466-1494)

lsp_definition (1399-1459)

lsp_did_change (1501-1525)

lsp_shutdown (1528-1539)

🪛 Ruff (0.14.10)

crates/basilica-sdk-python/python/basilica/sandbox.py

276-276: Consider moving this statement to an else block

(TRY300)

516-518: try-except-pass detected, consider logging the exception

(S110)

516-516: Do not catch blind exception: Exception

(BLE001)

591-593: Avoid specifying long messages outside the exception class

(TRY003)

629-629: Avoid specifying long messages outside the exception class

(TRY003)

683-683: Abstract raise to an inner function

(TRY301)

689-689: Avoid specifying long messages outside the exception class

(TRY003)

705-705: Abstract raise to an inner function

(TRY301)

712-712: Avoid specifying long messages outside the exception class

(TRY003)

732-732: Avoid specifying long messages outside the exception class

(TRY003)

734-734: Avoid specifying long messages outside the exception class

(TRY003)

738-738: Avoid specifying long messages outside the exception class

(TRY003)

738-738: f-string without any placeholders

Remove extraneous f prefix

(F541)

789-789: Avoid specifying long messages outside the exception class

(TRY003)

844-844: Avoid specifying long messages outside the exception class

(TRY003)

873-873: Abstract raise to an inner function

(TRY301)

873-873: Avoid specifying long messages outside the exception class

(TRY003)

880-880: Avoid specifying long messages outside the exception class

(TRY003)

900-900: Avoid specifying long messages outside the exception class

(TRY003)

931-931: Avoid specifying long messages outside the exception class

(TRY003)

960-960: Avoid specifying long messages outside the exception class

(TRY003)

1010-1010: Avoid specifying long messages outside the exception class

(TRY003)

1046-1046: Avoid specifying long messages outside the exception class

(TRY003)

1089-1089: Avoid specifying long messages outside the exception class

(TRY003)

1131-1131: Avoid specifying long messages outside the exception class

(TRY003)

1173-1173: Avoid specifying long messages outside the exception class

(TRY003)

1220-1220: Avoid specifying long messages outside the exception class

(TRY003)

1260-1260: Avoid specifying long messages outside the exception class

(TRY003)

1279-1279: Consider moving this statement to an else block

(TRY300)

1281-1281: Avoid specifying long messages outside the exception class

(TRY003)

1342-1342: Avoid specifying long messages outside the exception class

(TRY003)

1400-1400: Consider moving this statement to an else block

(TRY300)

1402-1402: Avoid specifying long messages outside the exception class

(TRY003)

1459-1459: Consider moving this statement to an else block

(TRY300)

1461-1461: Avoid specifying long messages outside the exception class

(TRY003)

1564-1564: Avoid specifying long messages outside the exception class

(TRY003)

1566-1566: Redefinition of unused __enter__ from line 508

(F811)

1572-1572: Redefinition of unused __exit__ from line 512

(F811)

🔇 Additional comments (9)

crates/basilica-sdk-python/python/basilica/__init__.py (2)

138-162: LGTM!

The sandbox module imports and type aliases are cleanly organized. The aliasing pattern (GpuSpec as SandboxGpuSpec, ResourceSpec as SandboxResourceSpec) helps avoid naming conflicts with existing types in the module.

339-386: LGTM - sensible quick-start defaults.

The factory functions provide good ergonomics with runtime="container" as a sensible default for quick starts (vs. "firecracker" in Sandbox.create()). The alias js_sandbox = javascript_sandbox is a nice convenience.

crates/basilica-sdk-python/python/basilica/sandbox.py (7)

38-167: LGTM - well-structured data classes.

The enum and dataclass definitions are clean and comprehensive. The ExecResult.success property is a nice convenience.

238-317: LGTM - namespace wrappers improve DX.

The SandboxFiles, SandboxProcess, and SandboxGit namespace classes provide a clean, discoverable API surface. The automatic /workspace prefix for relative paths in SandboxFiles is a sensible default.

442-467: LGTM - good lazy initialization pattern for namespaces.

The __init__ properly sets up the requests session with auth headers, and the namespace objects (_files, _process, _git) are lazily initialized via properties—avoiding unnecessary object creation.

533-642: Sandbox creation logic is well-structured.

The create() classmethod properly:

Falls back through config → environment variables for credentials

Builds the request payload conditionally

Optionally waits for ready state

One minor observation: if api_key is an empty string from config.api_key, the or "" fallback on line 589 would still result in an empty string, and the check on line 590 would catch it. This is correct behavior.

740-796: LGTM - run() implementation is correct.

The method properly:

Checks readiness and refreshes if needed

Builds the request with proper env var format

Uses appropriate timeout (execution timeout + 30s buffer)

Returns a properly constructed ExecResult

1566-1577: Keep this context manager implementation (remove the earlier duplicate).

This implementation is preferred because:

__enter__ calls wait_until_ready() if not ready, ensuring the sandbox is usable

__exit__ catches only SandboxError (not bare Exception), which is more specific

This addresses the static analysis concerns about catching blind exceptions (BLE001/S110).

1179-1220: LGTM - LSP initialization is well-implemented.

The lsp_init() method properly extracts server capabilities from the response and provides a clean LspCapabilities dataclass with both parsed boolean flags and the raw capabilities dict for advanced use cases.

crates/basilica-sdk-python/python/basilica/sandbox.py

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@crates/basilica-sdk-python/SANDBOX.md`:
- Line 252: Update the example call to match the exec() signature by replacing
the incorrect parameter name cwd with workdir; locate the sandbox example where
sb.process.exec(["ls", "-la"], cwd="/workspace") is used and change it to
sb.process.exec(["ls", "-la"], workdir="/workspace") so it aligns with the
exec() method and API reference.

🧹 Nitpick comments (3)

crates/basilica-sdk-python/SANDBOX.md (3)

3-3: Optional: Consider using a proper heading instead of emphasis.

The emphasized text could be converted to a proper heading for better semantic HTML and markdown compliance.

📝 Proposed fix

-**Secure, Isolated Code Execution for AI Agents and Untrusted Code**
+## Secure, Isolated Code Execution for AI Agents and Untrusted Code

Alternatively, if emphasis is intentional for styling, you can suppress this markdownlint warning.

Based on static analysis hints.

81-87: Optional: Add language identifier to the state diagram code block.

Adding a language identifier (e.g., text or mermaid) improves rendering and satisfies markdown linting rules.

📝 Proposed fix

-```
+```text
 Creating → Initializing → Ready ⇄ Executing → Terminating → Terminated
                             ↓
                        Snapshotting
                             ↓
                           Failed

</details>

Based on static analysis hints.

---

`976-997`: **Consider improving the REPL example with proper error handling.**

The `SandboxREPL` example manually calls `__enter__` and `__exit__`, which is less robust than using a context manager. Additionally, there's no error handling if sandbox creation fails.


<details>
<summary>♻️ Proposed improvement</summary>

```diff
 class SandboxREPL:
     def __init__(self):
-        self.sandbox = python_sandbox().__enter__()
+        self._context = python_sandbox()
+        self.sandbox = self._context.__enter__()
         self._history = []
     
     def execute(self, code: str) -> str:
         self._history.append(code)
         result = self.sandbox.run(code)
         return result.stdout if result.success else f"Error: {result.stderr}"
     
     def close(self):
-        self.sandbox.__exit__(None, None, None)
+        self._context.__exit__(None, None, None)
+
+    def __enter__(self):
+        return self
+    
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        self.close()
 
 # Usage
-repl = SandboxREPL()
-print(repl.execute("x = 42"))
-print(repl.execute("print(x * 2)"))  # 84
-repl.close()
+with SandboxREPL() as repl:
+    print(repl.execute("x = 42"))
+    print(repl.execute("print(x * 2)"))  # 84

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 199ff02 and 9a5f1e4.

📒 Files selected for processing (1)

crates/basilica-sdk-python/SANDBOX.md

🧰 Additional context used

🪛 markdownlint-cli2 (0.18.1)

crates/basilica-sdk-python/SANDBOX.md

3-3: Emphasis used instead of a heading

(MD036, no-emphasis-as-heading)

81-81: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (3)

crates/basilica-sdk-python/SANDBOX.md (3)

1049-1049: Python version compatibility note.

The example uses dict[str, str] syntax which is available in Python 3.9+. Since the documentation specifies Python 3.10+ as the requirement (line 54), this is compatible. However, for readers on exactly Python 3.10, consider noting that Dict[str, str] from typing is the alternative for earlier versions.

The syntax is correct for the stated Python 3.10+ requirement.

1-1122: Excellent comprehensive documentation for the Basilica Sandbox SDK.

This documentation provides thorough coverage of the Sandbox API with clear examples, security best practices, and practical use cases. The structure is logical, progressing from quick start to advanced topics. The examples are generally accurate and demonstrate real-world usage patterns effectively.

Key strengths:

Clear quick start that gets users productive immediately

Comprehensive API reference with detailed method signatures

Excellent security best practices section with good/bad examples

Practical examples for AI agents, code evaluation, and multi-language support

Thorough troubleshooting section

369-375: The documentation is accurate. The git operations (status, commit, push, pull) all have /workspace/repo as the default path parameter in the implementation, matching the claim in SANDBOX.md. The examples will work correctly as documented.

_{✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.}

crates/basilica-sdk-python/sandbox.md

- Add Sandbox class with create(), get(), run(), exec() methods - Add file operations (read_file, write_file, list_files) - Add snapshot support - Add context manager for automatic cleanup - Export sandbox types from basilica package

- Add sandbox-e2e.sh for testing sandbox functionality locally - Test sandbox creation, exec, files, and snapshot - Include CRD setup for K3s/kubectl environments - Support individual test commands or full test suite

Sandbox Module: - Created sandbox.rs with Sandbox struct and HTTP client - SandboxConfig for configuration with builder pattern - ExecResult, FileInfo, SnapshotInfo, SandboxStatus DTOs - Methods: create, status, wait_until_ready, exec, run, read_file, write_file - Methods: list_files, create_snapshot, delete, websocket_url Features: - Independent HTTP client (not tied to BasilicaClient internals) - Network isolation options (None, Egress, Full) - GPU and resource specifications - Unit tests for config, state, and serialization

- Use generated basilica-sandbox.yaml from basilica-backend - Fallback to runtime CRD generation if file not found - Apply RBAC manifests alongside CRD

K3d Script (sandbox-k3d-e2e.sh): - Automated K3d cluster lifecycle (create/delete) - Local registry setup at localhost:5050 - Image building for exec-agent - Infrastructure deployment (CRDs, RBAC, NetworkPolicies) - Multiple test modes: test, api-test, manual-test - Status and logs commands for debugging K8s Manifests (k3d-manifests/): - operator-deploy.yaml: Minimal operator deployment with RBAC - api-deploy.yaml: API deployment with config, ingress Sandbox E2E Updates: - Added kubectl-only test commands (kubectl-all, kubectl-create, etc) - Tests work without API via direct kubectl exec - Supports both API and kubectl testing modes Usage: ./sandbox-k3d-e2e.sh all # Setup cluster and run tests ./sandbox-k3d-e2e.sh test # Run kubectl-based tests ./sandbox-k3d-e2e.sh cleanup # Delete cluster

Commands added to justfile: - just sandbox-setup - Setup K3d cluster - just sandbox-test - Run kubectl-based tests - just sandbox-api-test - Run API-based tests - just sandbox-all - Setup + test - just sandbox-status - Show cluster status - just sandbox-logs - Show operator/API logs - just sandbox-cleanup - Delete cluster

- Add LSP methods to Python SDK Sandbox class: - lsp_init, lsp_completion, lsp_hover - lsp_diagnostics, lsp_definition, lsp_shutdown - Add LSP structs and methods to Rust SDK - Add LSP-related dataclasses and types - Export LSP types from basilica package

The API returns snake_case states like 'creating', 'initializing', 'ready'. Updated SandboxState enum values to match.

- Add 'runtime' parameter: container (default), firecracker, gvisor - Pass runtime to API in create request

Allow more time for sandbox cold starts under high concurrency. This prevents timeout errors when warm pool is exhausted and new sandboxes need to be provisioned on-demand.

…nomics - Add context manager support (__enter__/__exit__) for automatic cleanup - Add basilica.configure() for global configuration singleton - Add python_sandbox(), js_sandbox() factory functions for quick starts - Add namespaced API: sandbox.files, sandbox.process, sandbox.git - Auto-prefix /workspace for relative paths in file operations - Update README with comprehensive sandbox documentation Example usage: with python_sandbox() as sb: sb.files.write('app.py', code) # -> /workspace/app.py sb.process.run('print(1 + 1)') # Auto-deleted

Add SANDBOX.md with detailed documentation covering: - Quick start and installation - Core concepts (states, isolation levels) - Creating sandboxes (factory functions, full config) - Code execution (run, exec, namespaced APIs) - File operations with /workspace auto-prefixing - Git operations (clone, commit, push, pull) - LSP support (completion, hover, diagnostics, definition) - Snapshots (create, restore, auto-snapshot) - Security & isolation (network modes, runtimes, resource limits) - GPU support - Error handling - Configuration (global, per-sandbox, env vars) - Best practices - Complete API reference - Real-world examples (AI agents, REPL, code evaluation) - Troubleshooting guide

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@crates/basilica-sdk-python/sandbox.md`:
- Around line 81-87: Add a language specifier to the fenced code block that
shows the state diagram in sandbox.md: change the opening triple-backtick to
include a language token (use "text" or "plaintext") so the block becomes
```text; this addresses MD040 by marking the diagram as plain text while leaving
the diagram content unchanged.

In `@crates/basilica-sdk/src/sandbox.rs`:
- Around line 847-868: The client fails when the server returns a literal null
body because write_file expects to deserialize into EmptyResponse; update the
call in write_file to accept an optional/nullable response instead of a concrete
struct—e.g., deserialize the post result into Option<EmptyResponse> or
serde_json::Value (or the unit type) and ignore/handle None so a "null" body
won't error; refer to the write_file method and the EmptyResponse type and
adjust how SandboxHttpClient::post's return is consumed accordingly.

In `@scripts/localtest/sandbox-k3d-e2e.sh`:
- Around line 23-26: The script currently uses set -e and resolves BACKEND_DIR
with a failing cd (BACKEND_DIR="$(cd "$SCRIPT_DIR/../../../basilica-backend" &&
pwd)") which will cause an immediate exit before your later friendly check;
change the logic so you first test the path existence (e.g., check the directory
at "$SCRIPT_DIR/../../../basilica-backend") and only then resolve BACKEND_DIR
using cd && pwd, and if the check fails emit the friendly error and exit; also
enable safer failure semantics by adding set -o pipefail next to set -e so piped
commands fail correctly.

🧹 Nitpick comments (6)

scripts/localtest/k3d-manifests/operator-deploy.yaml (1)
85-118: Add a securityContext to harden the operator container.

Even for local testing manifests, adding a security context serves as a good template and prevents accidental privilege escalation. Checkov flags CKV_K8S_20 and CKV_K8S_23.
🔧 Proposed fix
     spec:
       serviceAccountName: basilica-operator
+      securityContext:
+        runAsNonRoot: true
       containers:
       - name: operator
         image: localhost:5050/basilica-operator:latest
         imagePullPolicy: Always
+        securityContext:
+          allowPrivilegeEscalation: false
+          readOnlyRootFilesystem: true
+          runAsNonRoot: true
         args:
         - "--namespace"
         - "default"
scripts/localtest/k3d-manifests/api-deploy.yaml (2)
86-130: Add a securityContext to harden the API container.

Same as the operator manifest — add pod and container security context for defense-in-depth.
🔧 Proposed fix
     spec:
       serviceAccountName: basilica-api
+      securityContext:
+        runAsNonRoot: true
       containers:
       - name: api
         image: localhost:5050/basilica-api:latest
         imagePullPolicy: Always
+        securityContext:
+          allowPrivilegeEscalation: false
+          readOnlyRootFilesystem: true
+          runAsNonRoot: true
155-157: kubernetes.io/ingress.class annotation is deprecated.

Use spec.ingressClassName: traefik instead. The annotation was deprecated in Kubernetes 1.18+.
🔧 Proposed fix
   annotations:
-    # K3s uses Traefik by default
-    kubernetes.io/ingress.class: traefik
+    {}
 spec:
+  ingressClassName: traefik
   rules:
scripts/localtest/sandbox-k3d-e2e.sh (2)
637-637: Declare and assign separately to avoid masking return values (SC2155).
🔧 Proposed fix
-    local sandbox_name="test-sandbox-$(date +%s)"
+    local sandbox_name
+    sandbox_name="test-sandbox-$(date +%s)"
267-275: build_images silently skips operator/API builds, which may confuse setup users.

setup calls build_images, which only builds the exec-agent. Combined with deploy_infrastructure also skipping operator/API deployment (line 566-568), the setup command produces a cluster with CRDs and RBAC but no running workloads. The all command (setup + test) then runs tests that may expect an operator. Consider either documenting this limitation more prominently in the help output, or logging a more prominent warning during setup.
crates/basilica-sdk/src/lib.rs (1)
39-45: Replace the sandbox_types wrapper module with direct pub use re-exports.

The pub mod sandbox_types creates an unnecessary namespace layer since no code references this module. The existing re-export pattern in this file (see lines 35-37 for client, error, jobs, types) uses direct pub use statements at the crate root. The sandbox types should follow the same pattern for consistency.
♻️ Suggested change
-// Re-export sandbox types under sandbox:: namespace
-pub mod sandbox_types {
-    pub use crate::sandbox::{
-        EnvVar, ExecResult, FileInfo, GpuSpec, NetworkIsolation, ResourceSpec, Sandbox,
-        SandboxConfig, SandboxState, SandboxStatus, SnapshotInfo,
-    };
-}
+// Re-export commonly used sandbox types at the crate root
+pub use sandbox::{
+    EnvVar, ExecResult, FileInfo, GpuSpec, NetworkIsolation, ResourceSpec, Sandbox,
+    SandboxConfig, SandboxState, SandboxStatus, SnapshotInfo,
+};

crates/basilica-sdk-python/sandbox.md

crates/basilica-sdk/src/sandbox.rs

scripts/localtest/sandbox-k3d-e2e.sh

coderabbitai bot reviewed Jan 10, 2026

View reviewed changes

coderabbitai bot reviewed Jan 12, 2026

View reviewed changes

crates/basilica-sdk-python/python/basilica/sandbox.py Show resolved Hide resolved

crates/basilica-sdk-python/python/basilica/sandbox.py Outdated Show resolved Hide resolved

coderabbitai bot reviewed Jan 12, 2026

View reviewed changes

coderabbitai bot reviewed Jan 16, 2026

View reviewed changes

crates/basilica-sdk-python/sandbox.md Outdated Show resolved Hide resolved

distributedstatemachine and others added 15 commits February 10, 2026 18:13

feat(localtest): add sandbox E2E testing script

e418df6

- Add sandbox-e2e.sh for testing sandbox functionality locally - Test sandbox creation, exec, files, and snapshot - Include CRD setup for K3s/kubectl environments - Support individual test commands or full test suite

chore(sandbox): update e2e script to use generated CRD

0f96e4a

- Use generated basilica-sandbox.yaml from basilica-backend - Fallback to runtime CRD generation if file not found - Apply RBAC manifests alongside CRD

fix(sdk-python): Use lowercase state values to match API response

ee08206

The API returns snake_case states like 'creating', 'initializing', 'ready'. Updated SandboxState enum values to match.

feat(sdk): Add runtime parameter to Sandbox.create()

6693fd8

- Add 'runtime' parameter: container (default), firecracker, gvisor - Pass runtime to API in create request

feat(sdk): Make Firecracker the default runtime

cbb3ddb

chore: Fix trailing newlines in k3d manifests

26508e1

feat(sdk): increase default wait_timeout from 120s to 300s

935a823

Allow more time for sandbox cold starts under high concurrency. This prevents timeout errors when warm pool is exhausted and new sandboxes need to be provisioned on-demand.

chore: rename SANDBOX.md to sandbox.md

b6d765b

distributedstatemachine force-pushed the feat/sandbox branch from c518cb2 to b6d765b Compare February 10, 2026 14:51

coderabbitai bot reviewed Feb 10, 2026

View reviewed changes

crates/basilica-sdk-python/sandbox.md Outdated Show resolved Hide resolved

crates/basilica-sdk/src/sandbox.rs Show resolved Hide resolved

scripts/localtest/sandbox-k3d-e2e.sh Outdated Show resolved Hide resolved

distributedstatemachine added 2 commits February 11, 2026 20:52

feat(sdk): add snapshot list/get/delete helpers

61c290c

fix(sandbox): address CodeRabbit review findings

511b42f

Conversation

distributedstatemachine commented Jan 10, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Related Issues

Type of Change

Changes Made

Testing

How Has This Been Tested?

Test Configuration

Checklist

Additional Context

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

distributedstatemachine commented Jan 10, 2026 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Jan 10, 2026 •

edited

Loading