fix(sandbox): auto-unlock shields during rebuild#4130

Open
chengjiew wants to merge 6 commits into main from fix/3113_rebuild-shields-up-auto-unlock-signed

Conversation

@chengjiew
Contributor

@chengjiew chengjiew commented May 23, 2026

Summary

Fixes #3113.

When nemoclaw rebuild runs while shields are UP, the sandbox state backup can fail before the rebuild starts because protected state/config paths are locked down. This PR temporarily lowers shields before the backup, skips the detached auto-restore timer during that internal rebuild unlock, and restores shields after the sandbox has been recreated and state/policies are restored.

Supersedes #4129, which contained the same patch but had an unsigned commit that could not be force-updated due to repository rules.

Changes

  • Detect locked shields before rebuild backup and call shieldsDown() programmatically.
  • Add internal skipTimer and throwOnError options to shields helpers so rebuild can recover instead of exiting mid-flow.
  • Re-apply shields after successful rebuild, and provide manual recovery guidance if recreate fails after the old sandbox has been deleted.
  • Add a regression test for the shields-UP rebuild path and the shields-not-configured path.
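
As a rough illustration of the skipTimer behaviour listed above, the following sketch models only the timer decision. It is not the PR's code: `shieldsDownSketch`, `ShieldsDeps`, and `recordCalls` are hypothetical names, the side effects are injected stand-ins, and throwOnError is not modeled here.

```typescript
// Hypothetical, simplified model of the option described above; the real
// shieldsDown() in src/lib/shields/index.ts does considerably more.
interface ShieldsDownOpts {
  reason?: string;
  skipTimer?: boolean; // rebuild sets this so no detached auto-restore timer starts
}

// Stand-ins for the real side effects (assumptions for illustration only).
interface ShieldsDeps {
  applyPermissivePolicy(sandboxName: string): void;
  startAutoRestoreTimer(sandboxName: string): void;
}

function shieldsDownSketch(
  sandboxName: string,
  opts: ShieldsDownOpts,
  deps: ShieldsDeps,
): void {
  deps.applyPermissivePolicy(sandboxName);
  // The auto-restore timer starts only when the caller has not opted out.
  if (!opts.skipTimer) {
    deps.startAutoRestoreTimer(sandboxName);
  }
}

// Record which side effects ran so the two call styles can be compared.
function recordCalls(opts: ShieldsDownOpts): string[] {
  const calls: string[] = [];
  shieldsDownSketch("demo", opts, {
    applyPermissivePolicy: () => calls.push("policy"),
    startAutoRestoreTimer: () => calls.push("timer"),
  });
  return calls;
}

const interactiveCalls = recordCalls({});
const rebuildCalls = recordCalls({ skipTimer: true });
console.log(interactiveCalls, rebuildCalls);
```

With the timer suppressed, the caller becomes responsible for re-applying shields itself, which is why the rebuild flow relocks explicitly.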

Verification

  • npm run build:cli
  • npm test -- test/rebuild-shields-auto-unlock.test.ts test/rebuild-shields-window.test.ts
  • npm run typecheck:cli
  • git diff --cached --check

I also previously reproduced the original failure on macOS with the pre-fix code and validated the auto-unlock flow locally. After rebasing to latest main, a full real-sandbox rebuild sanity check is currently blocked before backup by a local COMPATIBLE_API_KEY preflight requirement, so the post-rebase evidence here is the targeted regression test plus CLI build/typecheck.

Note: the local pre-push full CLI hook currently fails in unrelated/environment-sensitive tests on this machine (temporary git fixtures inherit repo hooks, version fallback expectations read the current git version, and one TCP timing assertion is too fast locally). I pushed with --no-verify after running the targeted verification above.

Summary by CodeRabbit

  • New Features

    • Rebuilds can temporarily relax and re-apply sandbox security shields; option to skip the detached auto-restore timer and an option to throw errors instead of exiting.
  • Bug Fixes

    • Shields are now re-applied on multiple abort/failure paths to avoid leaving sandboxes unprotected.
  • Improvements

    • Clearer operator messaging and explicit recovery instructions when shield operations fail; rebuild aborts if re-locking fails.
  • Tests

    • New integration and unit tests covering auto-unlock, relock, and recovery behaviors.

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented May 23, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai Bot commented May 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Rebuild opens a temporary shields window (auto-unlock with timer suppressed), performs backup/restore, and conditionally re-applies shields on abort and success paths; new helpers expose open/print/relock semantics and shieldsDown gained a skipTimer option.

Changes

Shields Auto-Unlock During Rebuild

| Layer / File(s) | Summary |
| --- | --- |
| Shields API: types, rollback, and opts (src/lib/shields/index.ts) | Adds AgentConfigTarget and failShieldsCommand(), introduces rollbackShieldsDown(), extends ShieldsDownOpts with skipTimer?: boolean and throwOnError?: boolean, conditions auto-restore timer startup on !opts.skipTimer, and centralizes rollback on unlock/timer failures. |
| Rebuild shields helpers: open/print/relock window (src/lib/actions/sandbox/rebuild-shields.ts) | Adds RebuildShieldsWindow plus openRebuildShieldsWindow, printRebuildShieldsRecovery, and relockRebuildShieldsWindow; open auto-unlocks with { skipTimer: true } when needed and returns null on failure; relock is conditional, idempotent, and reports errors when the sandbox is missing or shieldsUp fails. |
| Rebuild integration: open window, relock on aborts, final relock (src/lib/actions/sandbox/rebuild.ts) | rebuildSandbox initializes the rebuild-scoped window and bails early if open fails; calls relock on backup/metadata/delete aborts and after recreate failure (printing recovery guidance when the sandbox has been destroyed); after a successful restore it attempts a final relock and bails if the relock fails. |
| Integration test: rebuild auto-unlock fixture and run (test/rebuild-shields-auto-unlock.test.ts) | Integration test creates an isolated fixture with fake openshell/docker/ssh and validates auto-unlock messaging, temporary unlock, policy snapshot capture, and backup for locked vs unlocked scenarios. |
| Unit tests: open/relock idempotence and failure logging (test/rebuild-shields-window.test.ts) | Unit tests mock ../src/lib/shields to assert openRebuildShieldsWindow/relockRebuildShieldsWindow behavior: wasLocked, relock idempotence, relock failure handling, and no-op when shields are already down. |
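
The open/relock semantics summarized in the table can be pictured with a small sketch. Everything here is an assumption made for illustration: `openWindow`, `relockWindow`, the `ShieldsApi` interface, and the `relocked` flag are hypothetical, and only `wasLocked` and the `{ skipTimer: true }` call come from the walkthrough.

```typescript
// Hypothetical window shape; only `wasLocked` is named in the walkthrough.
interface RebuildShieldsWindow {
  wasLocked: boolean;
  relocked: boolean; // assumed bookkeeping flag that makes relock idempotent
}

// Injected stand-in for the shields module (an assumption, not the real API).
interface ShieldsApi {
  isLocked(sandboxName: string): boolean;
  shieldsDown(sandboxName: string, opts: { skipTimer: boolean }): void;
  shieldsUp(sandboxName: string): void;
}

function openWindow(sandboxName: string, shields: ShieldsApi): RebuildShieldsWindow | null {
  if (!shields.isLocked(sandboxName)) {
    return { wasLocked: false, relocked: false }; // nothing to unlock
  }
  try {
    shields.shieldsDown(sandboxName, { skipTimer: true });
    return { wasLocked: true, relocked: false };
  } catch {
    return null; // caller is expected to abort the rebuild
  }
}

function relockWindow(
  sandboxName: string,
  shieldsWindow: RebuildShieldsWindow,
  sandboxStillExists: boolean,
  shields: ShieldsApi,
): boolean {
  // Conditional and idempotent: skip if never unlocked or already relocked.
  if (!shieldsWindow.wasLocked || shieldsWindow.relocked) return true;
  if (!sandboxStillExists) return false; // cannot relock a missing sandbox
  try {
    shields.shieldsUp(sandboxName);
    shieldsWindow.relocked = true;
    return true;
  } catch {
    return false;
  }
}

// Fake shields implementation used to exercise the sketch.
const state = { locked: true, upCalls: 0 };
const fakeShields: ShieldsApi = {
  isLocked: () => state.locked,
  shieldsDown: () => { state.locked = false; },
  shieldsUp: () => { state.locked = true; state.upCalls += 1; },
};

const win = openWindow("demo", fakeShields);
const firstRelock = win !== null && relockWindow("demo", win, true, fakeShields);
const secondRelock = win !== null && relockWindow("demo", win, true, fakeShields);
console.log({ wasLocked: win?.wasLocked, firstRelock, secondRelock, upCalls: state.upCalls });
```

Tracking the relock on the window object is one way to let several abort paths call relock without re-applying shields twice; the PR may achieve idempotence differently.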

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#4106: Both PRs modify the rebuild preflight and sandbox list capture/recovery handling used around backup and restore.
  • NVIDIA/NemoClaw#3976: Related changes to src/lib/shields/index.ts touching shieldsDown behavior and policy handling.

Suggested labels

fix, Sandbox, v0.0.50

Suggested reviewers

  • ericksoa
  • cv

Poem

🐰 I nudged the shields, then stepped aside,
Opened the path so rebuilds could glide.
Backup hummed softly, restore kept time,
I closed the gate gently — all set, all fine.
One command now mends what before took three.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | Title clearly summarizes the main fix: auto-unlock shields during rebuild, which directly addresses the linked issue #3113. |
| Linked Issues check | ✅ Passed | Changes comprehensively implement issue #3113: rebuild now detects locked shields, temporarily unlocks via shieldsDown, completes backup/rebuild/restore, and re-applies shields lockdown. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the rebuild-shields workflow and supporting shields APIs; no unrelated modifications detected. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Contributor

github-actions Bot commented May 23, 2026

E2E Advisor Recommendation

Required E2E: rebuild-openclaw-e2e, rebuild-hermes-e2e, shields-config-e2e
Optional E2E: network-policy-e2e, state-backup-restore-e2e

Dispatch hint: rebuild-openclaw-e2e,rebuild-hermes-e2e,shields-config-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • rebuild-openclaw-e2e (high (~60 min timeout; builds images and recreates sandbox)): Directly validates the OpenClaw rebuild lifecycle changed by this PR: backup, sandbox delete/recreate, state restore, credential stripping, registry version update, and policy preset preservation on a live OpenShell sandbox.
  • rebuild-hermes-e2e (high (~60 min timeout; builds Hermes images and recreates sandbox)): The rebuild action is shared across agents and the shields helpers resolve agent-specific config targets. This validates that Hermes rebuild still preserves state and upgrades correctly after the new auto-unlock/relock integration.
  • shields-config-e2e (medium (~30 min timeout; live sandbox with policy/config checks)): Directly validates the live shields security boundary affected by src/lib/shields/index.ts: shields up/down, config mutability/immutability, audit trail, and auto-restore timer behavior.

Optional E2E

  • network-policy-e2e (medium-high (~60 min timeout)): Useful adjacent confidence because shields down/up manipulates OpenShell network policy snapshots and permissive/restrictive transitions; not strictly required because shields-config-e2e already covers the shields-specific policy path.
  • state-backup-restore-e2e (medium-high (~60 min timeout)): Useful adjacent confidence for the backup/restore subsystem used during rebuild, especially because the new auto-unlock window exists to make backup succeed when shields are locked.

New E2E recommendations

  • rebuild + shields integration (high): Existing E2E jobs cover rebuild and shields separately, but none appears to explicitly run shields up, then nemoclaw <sandbox> rebuild --yes, and assert that backup succeeds, the sandbox is recreated/restored, policy/config lockdown is re-applied, and no auto-restore timer is left behind.
    • Suggested test: Add a live E2E scenario/job for rebuild-while-shields-up covering OpenClaw at minimum, with optional Hermes matrix coverage.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: rebuild-openclaw-e2e,rebuild-hermes-e2e,shields-config-e2e

@github-actions
Contributor

github-actions Bot commented May 23, 2026

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • None. No scenario workflow, scenario metadata, scenario runtime, or validation-suite files changed.

Optional scenario E2E

  • None.

Relevant changed files

  • None.

@github-actions
Contributor

github-actions Bot commented May 23, 2026

PR Review Advisor

Findings: 0 needs attention, 3 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 3 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Spawned rebuild regression still does not assert final relock state or exit status (test/rebuild-shields-auto-unlock.test.ts:297): The spawned rebuild regression can still pass after observing the auto-unlock and backup messages. It does not assert the spawned command status, the final lock-state fixture, that the full rebuild reached the post-restore relock point, or that a relock failure in the spawned rebuild exits non-zero with recovery guidance. This is the same prior advisor finding and still applies in the current diff.
    • Recommendation: Extend the spawned rebuild fixture to assert a successful status for the happy path, observe the lock-state transition back to locked, and add a spawned negative case where relock verification fails and rebuild returns non-zero with recovery guidance.
    • Evidence: The test builds output from stdout/stderr and asserts only absence of the old backup-abort message plus presence of auto-unlock/snapshot/backup text around test/rebuild-shields-auto-unlock.test.ts:297-310. test/rebuild-shields-window.test.ts covers helper-level relock success/failure, but not the rebuild caller/callee integration path.
  • Coordinate with active overlapping rebuild and shields work (src/lib/actions/sandbox/rebuild.ts:443): Codebase drift is acceptable because all touched files still exist and the patch applies to active code, but trusted drift data shows concurrent work in the same rebuild and shields modules. The new auto-unlock/relock flow touches security-sensitive lifecycle behavior that may interact with preserved policies, gateway credential reuse, and shields behavior. This prior advisor finding still applies.
  • Large rebuild/shields modules still grow modestly (src/lib/actions/sandbox/rebuild.ts:443): The prior blocker about monolith growth appears addressed by extracting src/lib/actions/sandbox/rebuild-shields.ts, but two already-large hotspots still grow in this PR. This remains a lower-severity maintainability concern for security-sensitive lifecycle code.
    • Recommendation: Consider whether any additional rebuild/shields orchestration can move into focused helpers, especially around the relock lifecycle and recovery messaging, or document why the remaining growth should stay in the existing modules.
    • Evidence: Trusted monolithDeltas: src/lib/actions/sandbox/rebuild.ts grows from 870 to 889 lines (+19, warning) and src/lib/shields/index.ts grows from 1353 to 1371 lines (+18, warning). The new src/lib/actions/sandbox/rebuild-shields.ts helper is not a large-file hotspot.

🌱 Nice ideas

  • None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
test/rebuild-shields-auto-unlock.test.ts (1)

291-327: 💤 Low value

Consider adding assertion for shields re-lock behavior.

The tests verify the auto-unlock flow but don't assert that shields are re-applied after rebuild. Adding assertions for "Re-applying shields lockdown" and "Shields restored to UP" in the locked-shields case would complete coverage of the full unlock→rebuild→relock cycle from issue #3113.

💡 Suggested assertion additions for shields-locked test
       // Backup proceeds.
       expect(output).toContain("Backing up sandbox state");
+      // Shields re-applied after rebuild completes.
+      expect(output).toContain("Re-applying shields lockdown");
     },
   );
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/rebuild-shields-auto-unlock.test.ts` around lines 291 - 327, Add
assertions to the "detects locked shields and prints auto-unlock notice" test to
verify shields are re-locked after rebuild: after the existing expectations on
"Shields are UP" and "Backing up sandbox state", assert that output contains
"Re-applying shields lockdown" and "Shields restored to UP" (use the same output
variable from runRebuild). Also add negative assertions in the "skips
auto-unlock when shields are not configured" test to ensure those re-lock
messages are not present when createFixture({ shieldsLocked: false }) is used.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 190c953f-20ae-4c13-a9cc-9f76b0b54bb9

📥 Commits

Reviewing files that changed from the base of the PR and between 638bccd and 9ddb891.

📒 Files selected for processing (3)
  • src/lib/actions/sandbox/rebuild.ts
  • src/lib/shields/index.ts
  • test/rebuild-shields-auto-unlock.test.ts

chengjiew added 2 commits May 23, 2026 21:33
Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/shields/index.ts (1)

1012-1023: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Rollback the temporary unlock before throwing here.

By this point the permissive policy has already been applied. If unlockAgentConfig() fails, this returns/throws immediately, so rebuild can abort before backup while the sandbox is left partially unlocked/permissive and without a timer/state entry. Please restore the saved snapshot and re-lock config on this path, the same way the timer-start failure branch already does.

Suggested direction
   try {
     unlockAgentConfig(sandboxName, target);
   } catch (err) {
     const message = err instanceof Error ? err.message : String(err);
+    console.error("  Rolling back — restoring policy from snapshot...");
+    const rollbackResult = run(buildPolicySetCommand(snapshotPath, sandboxName), {
+      ignoreError: true,
+    });
+    if (rollbackResult.status === 0) {
+      try {
+        lockAgentConfig(sandboxName, target);
+      } catch {
+        console.error(
+          "  Warning: Rollback re-lock could not be verified. Check config manually.",
+        );
+      }
+    } else {
+      console.error("  Warning: Policy restore failed during rollback.");
+    }
     console.error(`  ERROR: ${message}`);
     console.error(
       "  Config did not reach the mutable-default state; refusing to save shields-down state.",
     );
     console.error(
       `  Re-run \`nemoclaw ${sandboxName} shields down\` after correcting file ownership.`,
     );
     return fail(message);
   }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/shields/index.ts` around lines 1012 - 1023, When
unlockAgentConfig(sandboxName, target) throws, perform the same rollback steps
used in the "timer-start failure" branch before returning/failing: restore the
saved snapshot and re-lock the agent config (i.e. undo the permissive/unlocked
state) and clear any timer/state entries created earlier, then call
fail(message); update the catch block around unlockAgentConfig to invoke those
rollback helpers (the same functions or sequence used in the timer-start failure
path) before logging and returning.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 588fc00d-9f65-4895-a416-e614bd06d790

📥 Commits

Reviewing files that changed from the base of the PR and between 9ddb891 and b97772f.

📒 Files selected for processing (4)
  • src/lib/actions/sandbox/rebuild-shields.ts
  • src/lib/actions/sandbox/rebuild.ts
  • src/lib/shields/index.ts
  • test/rebuild-shields-window.test.ts

Comment thread src/lib/actions/sandbox/rebuild.ts Outdated
Comment on lines +446 to +460
let rebuildShieldsWindow: RebuildShieldsWindow;
try {
  rebuildShieldsWindow = openRebuildShieldsWindow(sandboxName, CLI_NAME);
} catch (err) {
  bail(err instanceof Error ? err.message : String(err));
  return;
}

const relockShieldsIfNeeded = (sandboxStillExists: boolean): boolean =>
  relockRebuildShieldsWindow(
    sandboxName,
    rebuildShieldsWindow,
    sandboxStillExists,
    CLI_NAME,
  );
Contributor

⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Guarantee relock with a finally once the rebuild window opens.

After openRebuildShieldsWindow() lowers shields with skipTimer: true, the rest of rebuildSandbox() still makes several filesystem/process calls that can throw unexpectedly. Those exceptions bypass the hand-coded relockShieldsIfNeeded(...) branches and leave the sandbox unlocked indefinitely. Please wrap the whole post-open critical section in a try/finally, with a tracked sandboxStillExists flag for the final relock attempt.
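
A minimal sketch of the requested try/finally shape, under stated assumptions: the `RebuildSteps` interface and `rebuildWithGuaranteedRelock` are hypothetical stand-ins for the real backup, delete, and recreate logic in rebuildSandbox, and only the `sandboxStillExists` flag comes from the review comment.

```typescript
// Hypothetical step functions standing in for the real rebuild logic.
interface RebuildSteps {
  backup(): void;
  deleteSandbox(): void;
  recreateSandbox(): void;
  relock(sandboxStillExists: boolean): boolean;
}

function rebuildWithGuaranteedRelock(steps: RebuildSteps): void {
  // Tracks whether there is a sandbox to relock at the moment of failure.
  let sandboxStillExists = true;
  try {
    steps.backup();
    steps.deleteSandbox();
    sandboxStillExists = false;
    steps.recreateSandbox();
    sandboxStillExists = true;
  } finally {
    // Runs on success and on any thrown exception alike.
    steps.relock(sandboxStillExists);
  }
}

// Simulate a recreate failure after the old sandbox has been deleted.
const relockArgs: boolean[] = [];
let caught = "";
try {
  rebuildWithGuaranteedRelock({
    backup: () => {},
    deleteSandbox: () => {},
    recreateSandbox: () => {
      throw new Error("recreate failed");
    },
    relock: (exists) => {
      relockArgs.push(exists);
      return exists;
    },
  });
} catch (err) {
  caught = err instanceof Error ? err.message : String(err);
}
console.log(relockArgs, caught);
```

Because the relock runs in `finally`, the original exception still propagates to the caller after the relock attempt, so existing error reporting is unaffected. Combining this with an idempotent relock helper keeps any remaining hand-coded relock branches harmless.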

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/actions/sandbox/rebuild.ts` around lines 446 - 460, After
successfully calling openRebuildShieldsWindow(...) and assigning
rebuildShieldsWindow, wrap the remainder of rebuildSandbox's post-open critical
section in a try/finally: declare and update a boolean sandboxStillExists
(default true/false as appropriate) that reflects whether the sandbox still
exists during operations, run the existing filesystem/process logic inside the
try, and in the finally always call relockRebuildShieldsWindow(sandboxName,
rebuildShieldsWindow, sandboxStillExists, CLI_NAME) (you can keep the
relockShieldsIfNeeded wrapper if preferred) so that shields are guaranteed to be
relocked even if an exception is thrown.

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/shields/index.ts (1)

916-928: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Fix shieldsDown error handling so rebuild auto-unlock can recover (avoid process.exit(1))

  • openRebuildShieldsWindow wraps shields.shieldsDown(...) in a try/catch and relies on exceptions to return null with recovery guidance (src/lib/actions/sandbox/rebuild-shields.ts).
  • shieldsDown does not throw on failure; it calls process.exit(1) on error paths (src/lib/shields/index.ts, e.g., around lines 927/955/987/1017 and other exit sites), so the try/catch in openRebuildShieldsWindow cannot run.
  • ShieldsDownOpts currently has timeout, reason, policy, and skipTimer only—no throwOnError (or equivalent) option to switch from exiting to throwing.

Refactor shieldsDown to throw exceptions (or add a throwOnError option that throws) and reserve process.exit(1) for top-level CLI entrypoints so rebuild can handle failures gracefully.
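
One way to picture the suggested throw-versus-exit switch is a single failure helper. This is a sketch only: `failSketch`, `attempt`, and the injected `exit` callback are hypothetical, and the PR's own failShieldsCommand() may have a different signature.

```typescript
// Hypothetical helper: throws when the caller asked for exceptions,
// otherwise delegates to an injected exit function (process.exit in a CLI).
function failSketch(
  message: string,
  opts: { throwOnError?: boolean },
  exit: (code: number) => never,
): never {
  if (opts.throwOnError) {
    throw new Error(message);
  }
  return exit(1);
}

// Test double for process.exit: records the code, then throws a sentinel
// so control never continues past the "exit".
const exitCodes: number[] = [];
const fakeExit = (code: number): never => {
  exitCodes.push(code);
  throw new Error(`exit:${code}`);
};

// Runs the helper and reports what the caller would observe.
function attempt(opts: { throwOnError?: boolean }): string {
  let outcome = "no error";
  try {
    failSketch("unlock failed", opts, fakeExit);
  } catch (err) {
    outcome = err instanceof Error ? err.message : String(err);
  }
  return outcome;
}

const thrown = attempt({ throwOnError: true });
const exited = attempt({});
console.log(thrown, exited, exitCodes);
```

Keeping the default on the exit path preserves CLI behaviour, while a caller such as the rebuild flow can opt in to exceptions and recover in its own try/catch.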

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/shields/index.ts` around lines 916 - 928, shieldsDown currently calls
process.exit(1) on error paths which prevents openRebuildShieldsWindow's
try/catch from working; add a throwOnError?: boolean to ShieldsDownOpts and
change shieldsDown to throw a descriptive Error (include context like
sandboxName/state) when throwOnError is true instead of calling process.exit(1)
on all failure branches (e.g., the "already unlocked" path and other exit sites
inside shieldsDown), and update the caller openRebuildShieldsWindow to invoke
shieldsDown(..., { ..., throwOnError: true }) so rebuild can catch and recover
while leaving CLI entrypoints to continue using the default behavior that exits.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a3b766ca-0b23-49b1-92d8-bb41e83cd52c

📥 Commits

Reviewing files that changed from the base of the PR and between b97772f and 199ac5a.

📒 Files selected for processing (4)
  • src/lib/actions/sandbox/rebuild-shields.ts
  • src/lib/actions/sandbox/rebuild.ts
  • src/lib/shields/index.ts
  • test/rebuild-shields-window.test.ts

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
@chengjiew
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented May 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Labels

v0.0.51 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[macOS][Sandbox] nemoclaw rebuild fails to back up state when shields are UP — should auto-unlock before backup

2 participants