
fix(channels): sync session policy preset and registry policy preset #4013

Merged
ericksoa merged 6 commits into main from fix/channels-sync-policy-preset-kinds
May 22, 2026

Conversation

@hunglp6d
Contributor

@hunglp6d hunglp6d commented May 21, 2026

Summary

channels add <channel> updates the registry's applied preset list but not the onboard session's recorded preset selection. On a subsequent rebuild, the resume-mode onboard reads the stale session, diffs it against the registry, and narrows the channel's preset away just before the new sandbox boots — so the channel bridge starts up against a policy that briefly denies its upstream API, and downstream SDK state (notably Slack Bolt's per-event authorize path) stays wedged even after Step 5.5 of rebuild reapplies the preset from the backup manifest. This PR keeps session.policyPresets in sync with the registry whenever channels add / channels remove mutates a built-in preset, so rebuild's resume step no longer sees a divergence to "correct."
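
The divergence described above can be illustrated with a small sketch. It assumes, hypothetically, that resume-mode onboard treats the session's recorded list as the desired state and removes any registry preset missing from it. The function name `presetsNarrowedOnResume` and the sample preset lists are illustrative and are not taken from the NemoClaw code.

```typescript
// Hypothetical model of the resume-time diff: presets that the registry has
// applied but the session never recorded are treated as drift and removed.
function presetsNarrowedOnResume(
  sessionPresets: readonly string[],
  registryPresets: readonly string[],
): string[] {
  const recorded = new Set(sessionPresets);
  return registryPresets.filter((name) => !recorded.has(name));
}

// Before the fix: `channels add slack` updated only the registry.
const staleSession = ["npm"];
const registryAfterAdd = ["npm", "slack"];
console.log(presetsNarrowedOnResume(staleSession, registryAfterAdd)); // "slack" would be narrowed away

// After the fix: the session mirrors the registry, so nothing is narrowed.
const syncedSession = ["npm", "slack"];
console.log(presetsNarrowedOnResume(syncedSession, registryAfterAdd));
```

Under this model, keeping the two lists identical is enough to make the resume step a no-op for the channel's preset.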

Related Issue

Fixes #4012

Changes

  • src/lib/actions/sandbox/policy-channel.ts
    • New helper syncSessionPolicyPresetsForChannel(sandboxName, channelName, action) that mirrors policies.applyPreset / policies.removePreset into session.policyPresets via onboardSession.updateSession.
    • Called from applyChannelPresetIfAvailable after a successful apply (action "add"), and from removeChannelPresetIfPresent after a successful remove (action "remove"). Failure paths leave the session untouched so it cannot diverge from the registry on a half-applied state.
  • test/channels-add-preset.test.ts
    • Extended buildPreamble to stub onboardSession.loadSession / updateSession with configurable session state and configurable load/save failure injection. Existing tests are unchanged because the new options all default to a happy-path session.
    • New describe block channels add/remove keeps session.policyPresets in sync with registry with four cases pinning down the invariants:
      1. successful add appends the channel name to session.policyPresets exactly once;
      2. session belonging to a different sandbox is left untouched (registry still gets the preset);
      3. missing session file does not abort channels add;
      4. session save failure is non-fatal — channels add still completes and prompts rebuild.
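
The helper's invariants listed above can be sketched as follows. This is a minimal illustration, not the PR's implementation: the `SessionStore` interface, `createMemoryStore`, and `syncSessionPolicyPresetsSketch` are hypothetical stand-ins, and the in-memory store replaces whatever persistence `onboardSession` really uses.

```typescript
type SyncAction = "add" | "remove";

interface OnboardSession {
  sandboxName: string;
  policyPresets: string[];
}

// Hypothetical stand-in for the onboard session module.
interface SessionStore {
  loadSession(): OnboardSession | null;
  updateSession(mutator: (session: OnboardSession) => void): void;
}

// In-memory store used only for this illustration.
function createMemoryStore(initial: OnboardSession | null): SessionStore {
  let current = initial;
  return {
    loadSession: () => current,
    updateSession: (mutator) => {
      if (current === null) throw new Error("no session to update");
      const draft = { ...current, policyPresets: [...current.policyPresets] };
      mutator(draft);
      current = draft;
    },
  };
}

// Mirrors a successful registry mutation into the session.
// Returns true only when the session was actually written.
function syncSessionPolicyPresetsSketch(
  store: SessionStore,
  sandboxName: string,
  presetName: string,
  action: SyncAction,
): boolean {
  const session = store.loadSession();
  // Missing session or a session for another sandbox: leave it untouched.
  if (session === null || session.sandboxName !== sandboxName) return false;
  const present = session.policyPresets.includes(presetName);
  // Already in the desired state: skip the write (no duplicates on add).
  if ((action === "add") === present) return false;
  try {
    store.updateSession((s) => {
      s.policyPresets =
        action === "add"
          ? [...s.policyPresets, presetName]
          : s.policyPresets.filter((name) => name !== presetName);
    });
    return true;
  } catch (err) {
    // Save failure is non-fatal: the registry mutation already succeeded.
    const reason = err instanceof Error ? err.message : String(err);
    console.warn(`session sync skipped: ${reason}`);
    return false;
  }
}
```

A caller would invoke the sketch only after the registry apply or remove has reported success, so a half-applied registry state never reaches the session.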

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Hung Le <hple@nvidia.com>

Summary by CodeRabbit

  • Bug Fixes
    • Preset add/remove now reliably syncs into onboard sessions when the session matches, removes presets when present (including non-built-in cases), skips foreign or missing sessions, and won’t abort the registry operation if session persistence fails.
  • Tests
    • Expanded regression suite covering channels/policy add/remove (built-in and from-file), session synchronization, call ordering, missing-session behavior, and resilience to session-update errors.

@copy-pr-bot

copy-pr-bot Bot commented May 21, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai Bot commented May 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e26d087b-07a1-4324-85be-a499133ffafb

📥 Commits

Reviewing files that changed from the base of the PR and between e256af3 and 9fcdcc9.

📒 Files selected for processing (1)
  • test/policies.test.ts

📝 Walkthrough

Walkthrough

Adds a best-effort helper that mirrors registry preset add/remove into the onboard session's session.policyPresets, wires it into policy and channel apply/remove paths, and expands tests to validate matching, foreign, missing-session, and update-failure scenarios.

Changes

Session Synchronization for Channel and Policy Operations

| Layer / File(s) | Summary |
| --- | --- |
| Session synchronization helper and integration points (src/lib/actions/sandbox/policy-channel.ts) | Adds syncSessionPolicyPresetsWithRegistry(sandboxName, presetName, action) and calls it from addSandboxPolicy, applyExternalPreset, applyChannelPresetIfAvailable, removeChannelPresetIfPresent, and removeSandboxPolicy to mirror registry adds/removes into session.policyPresets. |
| Policy add/remove script tests (test/policy-add-remove-session-sync.test.ts, test/policies.test.ts) | Script-based tests that stub the policy engine and onboard session to assert apply, applyPresetContent, and remove calls and corresponding session.policyPresets mutations across success, failure, foreign-sandbox, and no-session cases. |
| Test harness: session simulation stubs (test/channels-add-preset.test.ts) | buildPreamble() accepts session scenario options and stubs onboardSession.loadSession/updateSession, tracks sessionUpdates and removedCalls, and exports getSessionState() for assertions. |
| Channels add/remove session sync tests (test/channels-add-preset.test.ts) | New regression suite asserting channels add and channels remove keep session.policyPresets synchronized for matching sandbox, do not mutate foreign sessions, skip when no onboard session exists, and do not abort when updateSession throws. |

Sequence Diagram(s)

sequenceDiagram
  participant CLI
  participant PolicyModule as policy-channel
  participant PolicyEngine as policies
  participant SessionSync as syncSessionPolicyPresetsWithRegistry
  participant OnboardSession as onboardSession
  CLI->>PolicyModule: add/remove preset
  PolicyModule->>PolicyEngine: applyPreset / applyPresetContent / removePreset
  PolicyEngine-->>PolicyModule: success
  PolicyModule->>SessionSync: sync add/remove
  SessionSync->>OnboardSession: loadSession()
  OnboardSession-->>SessionSync: session data (or error)
  SessionSync->>OnboardSession: updateSession(mutator) if required
  OnboardSession-->>SessionSync: saved (or error logged)
  SessionSync-->>PolicyModule: complete
  PolicyModule-->>CLI: operation complete

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#4001: Overlapping updates to src/lib/actions/sandbox/policy-channel.ts handling session preset removal flows.

Suggested labels

Sandbox, NemoClaw CLI

Suggested reviewers

  • ericksoa
  • cv

Poem

🐰 I hopped through sessions, presets in my paws,

When registries change I sync without pause,
Channels and policies now hum in one voice,
Tests pin the sync — no surprise, that's my choice,
Rebuilds keep their state and the bot finds its poise.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: syncing session policy presets with registry policy presets for channels. |
| Linked Issues check | ✅ Passed | The PR implementation addresses all coding objectives from issue #4012: it syncs registry-applied/removed presets into session.policyPresets to eliminate transient policy mismatches during rebuild. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to the policy-channel sync mechanism and related tests; no unrelated modifications detected. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/channels-sync-policy-preset-kinds

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Contributor

github-actions Bot commented May 21, 2026

E2E Advisor Recommendation

Required E2E: channels-add-remove-e2e, network-policy-e2e, onboard-resume-e2e
Optional E2E: messaging-providers-e2e, rebuild-openclaw-e2e

Dispatch hint: channels-add-remove-e2e,network-policy-e2e,onboard-resume-e2e

Auto-dispatched E2E: channels-add-remove-e2e, network-policy-e2e, onboard-resume-e2e via nightly-e2e.yaml at 9fcdcc9783934eb58b3883c83f6c3414249b2b6f (nightly run)

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • channels-add-remove-e2e (high): Directly exercises nemoclaw <sandbox> channels add telegram, rebuild, provider/channel state, matching policy preset presence, egress to api.telegram.org, then channels remove plus rebuild and verifies the preset is removed. This is the closest existing E2E for the changed channel preset/session-sync path.
  • network-policy-e2e (medium): Directly validates live policy-add behavior, dry-run, hot reload, whitelist enforcement, and deny-by-default semantics. The PR changes policy-add/policy-remove control flow and session mirroring for network policy presets, which is security-boundary behavior.
  • onboard-resume-e2e (high): Validates onboard session file handling and resume after interruption at the policy step. The PR reads and mutates onboardSession.policyPresets so stale/invalid session behavior should be checked against a real resume flow.

Optional E2E

  • messaging-providers-e2e (high): Adjacent confidence for the full messaging provider/placeholder/L7-proxy chain across Telegram, Discord, Slack, and WhatsApp. Useful because channel preset synchronization affects messaging bridge egress, but channels-add-remove-e2e is the more direct merge gate.
  • rebuild-openclaw-e2e (high): Adjacent confidence for OpenClaw rebuild state survival. The changed code is intended to preserve policy intent through rebuild, though this existing job is broader and not specifically policy-focused.

New E2E recommendations

  • policy-add-remove session sync across rebuild (high): Existing network-policy-e2e validates live policy-add, and channels-add-remove-e2e validates channel presets across rebuild, but there is no focused E2E for standalone policy-add <preset>, policy-add --from-file, and policy-remove <preset> followed by rebuild/onboard-resume reconciliation. That is the exact new behavior added for session.policyPresets.
    • Suggested test: Add a policy session/rebuild E2E that onboards a sandbox, runs built-in policy-add, custom policy-add --from-file, rebuilds and asserts policy-list plus egress remain consistent, then runs policy-remove, rebuilds again, and asserts the preset is not restored from session state.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: channels-add-remove-e2e,network-policy-e2e,onboard-resume-e2e

@github-actions
Contributor

github-actions Bot commented May 21, 2026

PR Review Advisor

Recommendation: blocked
Confidence: high
Analyzed HEAD: 9fcdcc9783934eb58b3883c83f6c3414249b2b6f
Findings: 2 blocker(s), 4 warning(s), 0 suggestion(s)

This is an automated advisory review. A human maintainer must make the final merge decision.

Limitations: This review used the provided trusted deterministic context and the supplied diff; no tests, package-manager commands, workflows, or PR scripts were executed. PR title/body/comments/issue text were treated as untrusted evidence and mapped only to diff/test evidence. The git diff in the prompt was truncated for large test files; the review relied on the provided deterministic context and visible diff hunks for production line evidence. Required E2E jobs were recommended and auto-dispatched for the current head SHA, but no passed result for those job names at 9fcdcc9 was present in the provided context. Open PR overlap exists on test/channels-add-preset.test.ts and test/policies.test.ts; a maintainer should reconcile drift before merging. Human maintainer review remains required; this advisory result must not be treated as approval or a merge decision.

Workflow run

Full advisor summary

PR Review Advisor

Base: origin/main
Head: HEAD
Analyzed SHA: 9fcdcc9783934eb58b3883c83f6c3414249b2b6f
Recommendation: blocked
Confidence: high

The fix direction and unit coverage look plausible, but merge should wait on current-head required E2E evidence and the enforced sandbox policy-channel monolith growth blocker.

Gate status

  • CI: pass — 5 required status context(s) completed with no failures. Non-required contexts still pending: 4; failed: 0.
  • Mergeability: warning — mergeStateStatus=UNKNOWN
  • Review threads: pass — 1 review thread(s), all resolved.
  • Risky code tested: pass — No risky code areas detected by path heuristics.

🔴 Blockers

  • Required E2E pass evidence is missing for the current head SHA (src/lib/actions/sandbox/policy-channel.ts:733): The PR changes sandbox/network-policy lifecycle behavior by syncing registry policy preset mutations into onboard session state. The E2E Advisor required channels-add-remove-e2e, network-policy-e2e, and onboard-resume-e2e for head SHA 9fcdcc9, but the provided context only shows auto-dispatch at this SHA and no passing results for those required jobs. Older or cancelled selective results do not cover this head.
    • Recommendation: Wait for passing channels-add-remove-e2e, network-policy-e2e, and onboard-resume-e2e results that explicitly target 9fcdcc9 before merge consideration.
    • Evidence: E2E Advisor comment: Required E2E: channels-add-remove-e2e, network-policy-e2e, onboard-resume-e2e; Auto-dispatched via nightly-e2e.yaml at 9fcdcc9. No provided selective result shows these jobs passing for 9fcdcc9.
  • Security-sensitive policy-channel monolith grew beyond enforced budget (src/lib/actions/sandbox/policy-channel.ts:797): The sandbox policy-channel file grew by 53 lines. Trusted monolith analysis marks growth of 20 or more lines in this current monolith as a blocker, and recent history shows active refactoring pressure in this file.
    • Recommendation: Extract the session sync helper and related policy/channel glue into a focused helper/module, or offset the growth with equivalent cleanup in the same PR before merge consideration.
    • Evidence: monolithDeltas: src/lib/actions/sandbox/policy-channel.ts baseLines=1050 headLines=1103 delta=53 severity=blocker.

🟡 Warnings

  • Session load failures are silently ignored (src/lib/actions/sandbox/policy-channel.ts:802): syncSessionPolicyPresetsWithRegistry catches onboardSession.loadSession() errors and returns without logging. Best-effort behavior after a successful registry mutation is reasonable, but complete silence makes corrupt or unreadable session state indistinguishable from an intentionally missing session, which can hide the stale-session condition involved in the bug path.
    • Recommendation: Log a low-noise warning for load failures while keeping them non-fatal, or document why unreadable session state must remain silent. Add a negative test for the sessionLoadThrows path if silence is intentional.
    • Evidence: Diff shows try { session = onboardSession.loadSession(); } catch { return; }. The test harness exposes sessionLoadThrows, but the added channel/policy tests cover missing session and save failure, not load failure.
  • Active PR overlap should be reconciled before merge (test/channels-add-preset.test.ts:1): This PR modifies files also touched by other open PRs, increasing drift risk in actively changing CLI/channel/policy test areas.
  • Mergeability is not resolved: GitHub reports mergeStateStatus=UNKNOWN for the current head. Required CI is green in the provided gate context, but mergeability still needs a resolved GitHub state before merge consideration.
    • Recommendation: Wait for GitHub mergeability calculation to settle and verify no branch-protection or conflict condition appears for head SHA 9fcdcc9.
    • Evidence: Trusted gateStatus.mergeability: status=warning, evidence=mergeStateStatus=UNKNOWN.
  • Unit tests mock the live rebuild and policy enforcement path (test/policy-add-remove-session-sync.test.ts:1): The added tests validate helper invariants for built-in policy-add, custom policy-add, policy-remove, channel add/remove, foreign sessions, missing sessions, and save failures. They do not execute a real rebuild/resume, gateway/L7 policy proxy, sandbox boot, or messaging provider authorization path.
    • Recommendation: Keep the unit tests, but require current-head E2E pass evidence. Consider adding the E2E Advisor's suggested focused policy-add/remove session-sync-across-rebuild E2E as follow-up coverage.
    • Evidence: Trusted testDepth verdict is e2e_required for src/lib/actions/sandbox/policy-channel.ts; E2E Advisor recommends channels-add-remove-e2e, network-policy-e2e, and onboard-resume-e2e.
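
The first warning above (silent session load failures) could be addressed along the lines of the following sketch. It is an assumption-laden illustration: `loadSessionBestEffort`, the injected `loadSession` callback, and the warning text are hypothetical, and the real helper may choose a different logger or message.

```typescript
interface OnboardSession {
  sandboxName: string;
  policyPresets: string[];
}

// Hypothetical wrapper: keep load failures non-fatal, but leave a trace so an
// unreadable session can be told apart from an intentionally missing one.
function loadSessionBestEffort(
  loadSession: () => OnboardSession | null,
  warn: (message: string) => void = console.warn,
): OnboardSession | null {
  try {
    return loadSession(); // null means "no session", which is not an error
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    warn(`onboard session unreadable, skipping preset sync: ${reason}`);
    return null;
  }
}

// Usage: a throwing loader yields null plus exactly one warning.
const warnings: string[] = [];
const loaded = loadSessionBestEffort(
  () => {
    throw new Error("corrupt session file");
  },
  (message) => warnings.push(message),
);
console.log(loaded, warnings.length);
```

Injecting the `warn` callback also makes the sessionLoadThrows path straightforward to assert in a negative test.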

🔵 Suggestions

  • None.

Acceptance coverage

  • partial — [Channels] Channel bot stops responding after channels add <channel> + rebuild (verified on slack; likely affects any messaging channel): The patch targets channel/policy preset session synchronization and adds regression tests for session.policyPresets alignment, but no current-head required E2E pass evidence is provided.
  • partial — When I enable a messaging channel on an already-onboarded sandbox via nemoclaw <sandbox> channels add <channel> followed by nemoclaw <sandbox> rebuild, the rebuild completes successfully, status and policy-list show the channel's preset is active, and the channel reports socket mode connected (or its equivalent) in the logs — but the bot never replies when I @-mention it.: applyChannelPresetIfAvailable now syncs session.policyPresets after successful policy apply, and channel tests assert Slack add appends the preset. The diff does not run rebuild/status/policy-list/socket-mode/mention behavior.
  • unknown — If I instead export the channel's tokens before running bash install.sh (so the channel is configured during the initial onboard), the same channel app/bot/tokens work fine and the bot replies as expected.: Installer and initial-onboard token behavior are not changed or validated by the diff. Tests use dummy token environment variables only.
  • partial — I verified this on Slack, but I suspect it applies to any messaging channel that has a matching built-in network policy preset (discord, telegram, whatsapp, wechat).: The helper is preset-name generic and channel tests cover Slack plus existing Telegram/Discord/WhatsApp paths, but no live multi-provider E2E pass is shown for the current head SHA.
  • met — The channels add code path that wires up the policy is shared across all of them — I just haven't reproduced on the other channels myself.: applyChannelPresetIfAvailable is the shared channel preset apply path and now calls syncSessionPolicyPresetsWithRegistry after successful built-in preset apply.
  • partial — Expected: adding a channel via channels add <channel> + rebuild after onboard should leave the bot in the same working state as configuring that channel during the initial onboard.: The code keeps session.policyPresets aligned with successful registry mutations after channels add, addressing the described stale resume-session divergence. Current-head E2E evidence is still missing for equivalent live bot behavior.
  • partial — Actual: the channel connects (real-time transport is up, periodic auth checks succeed) but every incoming event is dropped.: The fix targets policy/session divergence that can cause policy-denied event authorization, but unit tests do not execute real event processing or messaging provider behavior.
  • partial — On Slack specifically I see bolt-app Authorization of incoming event did not succeed. No listeners will be called. followed by a 403 / policy_denied underneath.: The fix is intended to keep the Slack policy preset in session state after successful channel add. No Slack Bolt authorization path is exercised by the diff or proven by current-head E2E.
  • unknown — 1. On a clean host with no SLACK_* env vars set, run bash install.sh and let onboarding complete with the default presets (no slack).: No installer/onboard E2E evidence for the current head SHA is present. Tests do not run bash install.sh.
  • partial — 2. Export Slack credentials in the same shell:
    export SLACK_APP_TOKEN=xapp-...
    export SLACK_BOT_TOKEN=xoxb-...
    export SLACK_ALLOWED_USERS=U0...,U0...: The channel test harness supplies dummy SLACK_APP_TOKEN and SLACK_BOT_TOKEN values, but it does not validate SLACK_ALLOWED_USERS or real Slack credential handling.
  • met — 3. Add the slack channel: nemoclaw my-assistant channels add slack: New tests call addSandboxChannel("test-sb", { channel: "slack" }) and assert session policy presets include slack after successful preset apply.
  • missing — 4. Apply: nemoclaw my-assistant rebuild and accept the prompt. Wait for ✓ Sandbox 'my-assistant' rebuilt successfully.: Tests only observe the non-interactive prompt/rebuild marker; they do not execute a real rebuild. Required current-head E2E pass evidence is missing.
  • missing — 5. Verify everything looks healthy: nemoclaw my-assistant status → shows Policies: …, slack: No status command or live sandbox status output is exercised in the added tests or posted as current-head E2E evidence.
  • missing — 5. Verify everything looks healthy: nemoclaw my-assistant policy-list → slack is ● (active): No policy-list command or live gateway/registry policy display is exercised in the added tests or posted as current-head E2E evidence.
  • missing — 6. In the Slack workspace where the app is installed, @-mention the bot in any channel it has been invited to.: No real Slack workspace or bot mention is exercised by the diff.
  • missing — 7. Observe: bot does not reply.: The PR does not include live regression proof that the bot replies after the fix; unit tests cannot validate message reply behavior.
  • unknown — For comparison, doing the same on a fresh host but with SLACK_APP_TOKEN / SLACK_BOT_TOKEN / SLACK_ALLOWED_USERS exported before step 1 produces a working bot that replies to mentions.: No comparative initial-onboard Slack E2E evidence is available for this PR.
  • unknown — OS: Ubuntu (Linux x86_64): No current-head Docker-backed channels/network/onboard E2E pass is posted for this environment.
  • unknown — Node.js: v22.22.3 (installed by NemoClaw installer via nvm): The unit tests run under the CI/test process.execPath and do not validate installer-provisioned Node.js v22.22.3.
  • unknown — Docker: Docker Engine 27.x: No current-head Docker-backed required E2E result is present.

Security review

  • pass — 1. Secrets and Credentials: No production secrets, API keys, passwords, or credential files were added. Test token strings such as xoxb-test-1234-5678, xapp-1-test-1234-5678, test-telegram-token, and test-discord-token are dummy values in isolated test harnesses.
  • pass — 2. Input Validation and Data Sanitization: The production change does not add a new external parsing surface. Preset/channel names continue to flow through existing preset lookup and channel validation before registry/session mutation. Custom preset loading remains delegated to existing policies.loadPresetFromFile.
  • pass — 3. Authentication and Authorization: No new endpoints or authentication decisions are introduced. The session sync helper checks session.sandboxName against the target sandbox before mutating policyPresets, reducing cross-sandbox resume-state corruption risk.
  • pass — 4. Dependencies and Third-Party Libraries: No new dependencies, package manager configuration, external registries, installers, or third-party libraries were introduced.
  • warning — 5. Error Handling and Logging: Session update failures are logged and treated as non-fatal after registry success, which is appropriate. However, loadSession failures are silently swallowed, which can hide corrupt or unreadable session state relevant to the rebuild/resume bug path.
  • pass — 6. Cryptography and Data Protection: Not applicable — no cryptographic operations, algorithms, key handling, encryption, or data protection primitives are changed.
  • warning — 7. Configuration and Security Headers: The change modifies sandbox network policy preset state synchronization. The intended effect is security-positive consistency between registry and onboard session, but current-head E2E pass evidence is missing to verify real policy proxy behavior and ensure no stale allow-list or policy bypass side effects.
  • warning — 8. Security Testing: Unit tests cover built-in policy-add success/failure, custom policy-add, policy-remove, Slack channel add/remove, foreign session, missing session, and save failure paths. They mock runtime/gateway/rebuild behavior, and required channels-add-remove-e2e, network-policy-e2e, and onboard-resume-e2e passes are missing for head 9fcdcc9.
  • warning — 9. Holistic Security Posture: The patch likely improves policy consistency and avoids session/registry divergence, but it touches sandbox/network-policy lifecycle code, grows an enforced monolith, has active file overlap with other PRs, and lacks required current-head E2E proof for rebuild/policy enforcement behavior.

Test / E2E status

  • Test depth: e2e_required — Runtime/sandbox/infrastructure paths need real execution coverage: src/lib/actions/sandbox/policy-channel.ts. The added unit tests are useful for helper invariants, but they mock registry, policy engine, gateway, OpenShell, session persistence, and rebuild behavior and cannot prove the original Slack/rebuild policy-denial bug is fixed in a real sandbox.
  • E2E Advisor: missing
  • Required E2E jobs: channels-add-remove-e2e, network-policy-e2e, onboard-resume-e2e
  • Missing for analyzed SHA: channels-add-remove-e2e, network-policy-e2e, onboard-resume-e2e

✅ What looks good

  • The fix is narrowly targeted: session.policyPresets is synchronized only after policies.applyPreset, policies.applyPresetContent, or policies.removePreset reports success, avoiding session divergence on half-applied policy operations.
  • The helper avoids corrupting another sandbox's resume state by checking session.sandboxName before mutating session.policyPresets.
  • The resolved CodeRabbit concern about syncing after failed applyPreset is addressed by gating sync on a truthy apply result.
  • Channel add/remove tests now cover matching sandbox, foreign sandbox, missing session, and session save failure cases.
  • Standalone policy add/remove tests cover built-in preset add, built-in add failure, custom --from-file, remove, foreign session, and missing session behavior.
  • The helper avoids duplicate preset entries and unnecessary session writes when the desired state already holds.
  • No new dependencies, workflow changes, Dockerfile changes, production secrets, or new shell-string execution paths were introduced by this PR.
  • Codebase drift check confirms the patched files still exist on this branch; no rename hints were reported.
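
The gating described in the first and third points above (sync only after the policy engine reports success) can be sketched as follows. The wrapper name `applyThenSync` and its callback parameters are hypothetical, and the sketch assumes the apply call returns a boolean success flag, as the review discussion suggests.

```typescript
// Hypothetical wrapper: run the registry mutation first and mirror it into
// the session only when the mutation reports success.
function applyThenSync(
  applyPreset: () => boolean,
  syncSession: () => void,
): boolean {
  const applied = applyPreset(); // a throw propagates, which also skips the sync
  if (!applied) {
    return false; // registry unchanged, so the session must stay unchanged too
  }
  syncSession();
  return true;
}

// Usage: only the successful apply triggers a session sync.
let syncCalls = 0;
applyThenSync(() => false, () => { syncCalls += 1; });
applyThenSync(() => true, () => { syncCalls += 1; });
console.log(syncCalls); // 1
```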

Review completeness

  • This review used the provided trusted deterministic context and the supplied diff; no tests, package-manager commands, workflows, or PR scripts were executed.
  • PR title/body/comments/issue text were treated as untrusted evidence and mapped only to diff/test evidence.
  • The git diff in the prompt was truncated for large test files; the review relied on the provided deterministic context and visible diff hunks for production line evidence.
  • Required E2E jobs were recommended and auto-dispatched for the current head SHA, but no passed result for those job names at 9fcdcc9 was present in the provided context.
  • Open PR overlap exists on test/channels-add-preset.test.ts and test/policies.test.ts; a maintainer should reconcile drift before merging.
  • Human maintainer review remains required; this advisory result must not be treated as approval or a merge decision.
  • Human maintainer review required: yes

@hunglp6d hunglp6d self-assigned this May 21, 2026
@hunglp6d hunglp6d added the bug, fix, and VRDC labels May 21, 2026
@hunglp6d hunglp6d marked this pull request as ready for review May 22, 2026 10:28
@hunglp6d hunglp6d added the v0.0.50 Release target label May 22, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
test/channels-add-preset.test.ts (1)

384-530: ⚡ Quick win

Add a channels remove sync regression in this suite.

This block validates only add-path sync, but the suite contract and fix scope include remove-path session synchronization too. Please add a remove test that asserts successful preset removal updates session.policyPresets and keeps best-effort behavior on missing/foreign session.

Suggested test shape
 describe("channels add/remove keeps session.policyPresets in sync with registry", () => {
+  it("removes the channel preset from session.policyPresets after a successful remove", () => {
+    // arrange session with ["npm", "slack"], invoke removeSandboxChannel("slack")
+    // assert sessionUpdates length === 1 and finalSession.policyPresets === ["npm"]
+  });
 });
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/channels-add-preset.test.ts` around lines 384 - 530, Add a new test in
the same "channels add/remove keeps session.policyPresets in sync with registry"
suite that exercises the remove-path: call
ctx.channelModule.removeSandboxChannel("test-sb", { channel: "slack" }) in a
script built with buildPreamble and assert that (1) when the onboard session
belongs to "test-sb" the remove produces exactly one session update
(ctx.sessionUpdates.length === 1) and finalSession.policyPresets no longer
contains "slack", (2) when the session belongs to a different sandbox the
registry still sees the applied call (ctx.appliedCalls includes { sandboxName:
"test-sb", presetName: "slack" }) but ctx.sessionUpdates remains [], and (3)
mirror the existing error best-effort cases: when sessionMissing is true the
call still applies to registry and when sessionUpdateThrows is true the flow
still completes and prompts rebuild (ctx.callOrder includes "promptAndRebuild");
place checks analogous to the add tests using ctx.getSessionState,
ctx.sessionUpdates, ctx.appliedCalls, ctx.callOrder, sessionMissing, and
sessionUpdateThrows to locate behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/actions/sandbox/policy-channel.ts`:
- Around line 163-165: policies.applyPreset(sandboxName, answer) may fail but
the code always calls syncSessionPolicyPresetsWithRegistry(sandboxName, answer,
"add"); change the flow so you capture the boolean/returned result of
policies.applyPreset(...) and only call
syncSessionPolicyPresetsWithRegistry(...) when that result is truthy/successful;
if applyPreset returns false or throws, skip the sync (and optionally
surface/log the failure) to avoid diverging the session from the registry.
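
A minimal sketch of the guarded flow this inline comment describes, under stated assumptions: `PresetDeps` and `applyPresetAndSync` are hypothetical names, and `applyPreset` is modeled as a synchronous call returning a boolean, which may not match the real `policies.applyPreset` signature. The session sync runs only after a successful apply, and a failing sync is treated as best-effort so it cannot undo the registry change.

```typescript
// Hypothetical dependency shape. The real policies module and the session
// sync helper live in the repository and may have different signatures.
interface PresetDeps {
  applyPreset(sandboxName: string, presetName: string): boolean;
  syncSession(sandboxName: string, presetName: string, action: "add" | "remove"): void;
}

// Apply a preset to the registry, then mirror it into the session only when
// the apply succeeded. Returns true when the registry was updated.
function applyPresetAndSync(
  deps: PresetDeps,
  sandboxName: string,
  presetName: string,
): boolean {
  let applied = false;
  try {
    applied = deps.applyPreset(sandboxName, presetName);
  } catch (err) {
    // A throwing apply is handled like a failed apply: no session sync.
    console.error(`applyPreset threw for ${presetName}:`, err);
    return false;
  }
  if (!applied) {
    // Skip the sync so the session cannot record a preset the registry lacks.
    return false;
  }
  try {
    deps.syncSession(sandboxName, presetName, "add");
  } catch (err) {
    // Best-effort: the registry is already updated, so only report the error.
    console.error(`session sync failed for ${presetName}:`, err);
  }
  return true;
}

// Example: a failed apply leaves the session untouched.
const demoSynced: string[] = [];
const demoDeps: PresetDeps = {
  applyPreset: () => false,
  syncSession: (_sandbox, preset) => {
    demoSynced.push(preset);
  },
};
applyPresetAndSync(demoDeps, "test-sb", "slack");
console.log(demoSynced.length);
```

Returning the apply result to the caller keeps the decision to prompt for a rebuild separate from whether the session write succeeded.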

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5ad07f06-6721-42a5-ae19-e35a62a7b79b

📥 Commits

Reviewing files that changed from the base of the PR and between ef84117 and 20a4a61.

📒 Files selected for processing (3)
  • src/lib/actions/sandbox/policy-channel.ts
  • test/channels-add-preset.test.ts
  • test/policy-add-remove-session-sync.test.ts

Comment thread src/lib/actions/sandbox/policy-channel.ts Outdated
@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26282580925
Target ref: 20a4a615209adcfa01b0c572b7d3e52f573ca008
Workflow ref: main
Requested jobs: channels-add-remove-e2e,network-policy-e2e
Summary: 2 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| channels-add-remove-e2e | ✅ success |
| network-policy-e2e | ✅ success |

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26296041350
Target ref: e256af3a9ca2a77e2368c7ea2321f72479c7fd4e
Workflow ref: main
Requested jobs: channels-add-remove-e2e,network-policy-e2e
Summary: 0 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| channels-add-remove-e2e | ⚠️ cancelled |
| network-policy-e2e | ⚠️ cancelled |

Contributor

@ericksoa ericksoa left a comment

Approved after topper. The add path now syncs session policy presets only after successful applyPreset and remove-path sync coverage is in place; the follow-up test mock mirrors the production applyPreset success contract. Local focused validations passed and live checks are green at 9fcdcc9.

@ericksoa ericksoa merged commit aac4073 into main May 22, 2026
29 checks passed
@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26296440535
Target ref: 9fcdcc9783934eb58b3883c83f6c3414249b2b6f
Workflow ref: main
Requested jobs: channels-add-remove-e2e,network-policy-e2e,onboard-resume-e2e
Summary: 3 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| channels-add-remove-e2e | ✅ success |
| network-policy-e2e | ✅ success |
| onboard-resume-e2e | ✅ success |

miyoungc added a commit that referenced this pull request May 22, 2026
Audit found the v0.0.49 release notes promised behaviors that did not ship
or were never implemented. Realign to the actual code on main.

- Drop the EXDEV runtime-deps claim: #3820 was reverted by #4051 in this
  release window, so the behavior is not present.
- Drop the "skip broad permission repair" claim: no corresponding commit
  in v0.0.48..v0.0.49.
- Rewrite the gateway probe classifier list in release-notes.mdx and
  commands.mdx to match the real states emitted by
  src/lib/status-command-deps.ts (named gateway unreachable / present
  but not Connected / pointing at a different name / not configured).
  The previous "non-JSON health response" example did not exist in code.
- Expand the channel-removal bullet to describe #4001's user-visible
  teardown (durable QR-paired state wipe, abort-on-failure, config.json
  re-sync) in addition to the existing #4013 sync.
- Add bullets for user-visible PRs that were merged in the release
  window but missing from the notes: #3854 (restricted dmesg in debug
  output), #3866 (shields status and logs --tail UX), #3984 (Hermes
  messaging policy scoping), and #4011 (Docker group security note).

Regenerated nemoclaw-user-overview and nemoclaw-user-reference skills
from the updated docs via scripts/docs-to-skills.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cv pushed a commit that referenced this pull request May 22, 2026
## Summary
Refreshes the NemoClaw docs for the v0.0.49 hardening release, including
release notes, command reference updates, troubleshooting guidance,
version metadata, and regenerated user skills.

## Changes
- #3796, #3854, #3863, #3866, #3984, #4001, #4011, #4013, #4020, #4022,
#4023, #4060, #4062 -> `docs/about/release-notes.mdx`: Adds the v0.0.49
hardening release summary covering gateway reliability,
status/doctor/shields and debug UX, OpenClaw compatibility, messaging
channel teardown, Hermes policy scoping, snapshots, source installs and
Docker group security note, GPU preflight, CLI usage, E2E, and CI
improvements.
- #3796 -> `docs/manage-sandboxes/backup-restore.mdx` and
`docs/reference/commands.mdx`: Documents `snapshot restore --to`
overwrite protection and the `--force` opt-in.
- #3863, #4013, #4020, #4023 -> `docs/reference/commands.mdx`: Documents
missing channel argument usage, sandbox-scoped custom preset matching,
session policy preset sync, and gateway failure classification (uses the
real probe states from `src/lib/status-command-deps.ts`).
- #4022, #4060, #4062 -> `docs/reference/troubleshooting.mdx`: Adds
guidance for gateway-down `connect`, source checkout OpenShell
bootstrapping, WDDM placeholder GPU names, and Jetson sandbox GPU
passthrough.
- Release prep -> `docs/project.json`, `docs/versions1.json`,
`.agents/skills/nemoclaw-user-*`: Bumps docs metadata to 0.0.49 and
refreshes generated user skills from the Fern docs.

## Type of Change
- [ ] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [x] Doc only (includes code sample changes)

## Verification
- [x] `npx prek run --all-files` passes
- [ ] `npm test` passes
- [ ] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [x] Docs updated for user-facing behavior changes
- [ ] `make docs` builds without warnings (doc changes only)
- [x] Doc pages follow the [style
guide](https://github.com/NVIDIA/NemoClaw/blob/main/docs/CONTRIBUTING.md)
(doc changes only)
- [ ] New doc pages include SPDX header and frontmatter (new pages only)

`make docs` was attempted locally but did not complete because `npm`
returned `403 Forbidden` while fetching `fern-api` from
`registry.npmjs.org` in the sandboxed environment.

---
Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

## Summary by CodeRabbit

* **Documentation**
  * Released v0.0.49 with reliability and compatibility improvements
    including faster gateway failure diagnostics and safer snapshot
    restore behavior
  * Enhanced snapshot restore documentation with `--to` cloning and
    `--force` overwrite requirements
  * Expanded troubleshooting guides for source installs, GPU setup, and
    gateway recovery
  * Clarified Docker group access requirements and improved CLI command
    reference

* **Chores**
  * Version bumped to 0.0.49

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Labels

bug Something isn't working fix v0.0.50 Release target VRDC Issues and PRs submitted by NVIDIA VRDC test team.

Development

Successfully merging this pull request may close these issues.

[Channels] Channel bot stops responding after channels add <channel> + rebuild (verified on slack; likely affects any messaging channel)
