fix(cli): make channels add atomic when policy preset is missing by laitingsheng · Pull Request #4577 · NVIDIA/NemoClaw

laitingsheng · 2026-05-31T11:31:03Z

Summary

nemoclaw <sandbox> channels add <channel> reached process.exit(0) and wrote the channel into sandboxes.json even when the matching policy preset YAML was missing on disk, leaving the sandbox advertising a bridge that had no network policy. The add flow now parses the preset YAML (file present and network_policies: section yields at least one entry via parsePresetPolicyKeys) before --dry-run returns, before snapshotting registry/credentials state, and before token acquisition, and rolls back the gateway/registry/credential mutations when the post-write preset application fails. Rollback is best-effort: gateway-side steps that fail (provider detach/delete, or re-upsert during a re-add restore) print a Rollback could not fully clean <surfaces> warning so the operator can clean up manually.
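
The ordering described above can be sketched as follows. This is a minimal illustration, not the code in `addSandboxChannel`: `PresetLoader`, `PolicyKeyParser`, `AddDeps`, `preflightPreset`, and `addChannelSketch` are hypothetical names, and the stubs stand in for `policies.loadPreset` and `parsePresetPolicyKeys`, whose real signatures are not shown in this description.

```typescript
// Hypothetical stand-ins for policies.loadPreset and parsePresetPolicyKeys;
// the real signatures are not shown in this PR description.
type PresetLoader = (channel: string) => string | null;
type PolicyKeyParser = (presetContent: string) => string[];

interface AddDeps {
  loadPreset: PresetLoader;
  parsePolicyKeys: PolicyKeyParser;
}

interface AddOutcome {
  exitCode: number;
  steps: string[]; // ordered record of what the flow did
  error?: string;
}

// Returns an error message when the preset is unusable, otherwise null.
function preflightPreset(channel: string, deps: AddDeps): string | null {
  const content = deps.loadPreset(channel);
  if (content === null) {
    return `policy preset for '${channel}' is missing or unloadable`;
  }
  let keys: string[];
  try {
    keys = deps.parsePolicyKeys(content);
  } catch (err) {
    return `policy preset for '${channel}' could not be parsed: ${String(err)}`;
  }
  if (keys.length === 0) {
    return `policy preset for '${channel}' has no entries under network_policies:`;
  }
  return null;
}

function addChannelSketch(channel: string, dryRun: boolean, deps: AddDeps): AddOutcome {
  const steps: string[] = ["preflight"];
  const error = preflightPreset(channel, deps);
  if (error !== null) {
    // Abort before any prompt, snapshot, or gateway contact.
    return { exitCode: 1, steps, error };
  }
  if (dryRun) {
    // Dry run stops here, after the preset has been validated.
    return { exitCode: 0, steps };
  }
  steps.push("snapshot-state", "acquire-token", "write-gateway-and-registry", "apply-preset");
  return { exitCode: 0, steps };
}

// Usage with in-memory stubs: only telegram has a preset.
const presetStubs: Record<string, string> = { telegram: "network_policies:\n  telegram: {}\n" };
const stubDeps: AddDeps = {
  loadPreset: (channel) => presetStubs[channel] ?? null,
  parsePolicyKeys: (content) => (content.includes("  telegram:") ? ["telegram"] : []),
};
const missingPreset = addChannelSketch("whatsapp", true, stubDeps);
const dryRunOk = addChannelSketch("telegram", true, stubDeps);
```

In this sketch `missingPreset.exitCode` is 1 even with `dryRun` set, because the preflight runs first, while `dryRunOk` stops after the preflight with exit code 0.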

Related Issue

Fixes #4548

Changes

addSandboxChannel runs the policies.loadPreset(canonical) + policies.parsePresetPolicyKeys(...) preflight before --dry-run returns, before snapshotting registry/credentials state, and before token acquisition, so a missing, unloadable, or unparseable preset (no usable entries under network_policies:) exits non-zero with a restore-and-re-run hint on stderr without prompting the operator or contacting the gateway.
The non-QR branch applies the preset after applyChannelAddToGatewayAndRegistry. On apply failure the new rollbackChannelAdd helper handles two cases. A fresh add clears credentials then calls applyChannelRemoveToGatewayAndRegistry (best-effort, prints residual warning before returning). A failed re-add of an already-enabled channel restores the prior registry entry first (so a later gateway-upsert exit cannot orphan the channel), prints the gateway-providers residual warning, then restores on-disk credentials and best-effort re-upserts the prior bridge providers.
applyChannelRemoveToGatewayAndRegistry grew an optional { bestEffort: true } mode that downgrades each gateway-side process.exit(1) to a log-and-continue so the registry rollback finishes even when openshell detach or delete fails; returns { ok, residual }.
applyChannelPresetIfAvailable no longer short-circuits via policies.listPresets(); the filesystem scan conflated "channel has no preset" with "preset file missing on disk", which is what let the original bug pass silently.
Docs (docs/manage-sandboxes/messaging-channels.mdx, docs/reference/commands.mdx): describe the missing-or-unparseable-preset abort, the best-effort fresh-add rollback with explicit residual warning, the re-add restoration semantics, and narrow --dry-run to "validate channel name and matching policy preset without prompting for credentials, contacting the gateway, or rebuilding"; one sentence per line per docs style guide.
test/channels-add-preset.test.ts: added presetFileMissing, presetMissingNetworkPolicies, presetMalformedYaml fixture toggles plus regression tests covering each abort/rollback path (missing YAML for both telegram and whatsapp/QR, missing network_policies: section, malformed YAML body, applyPreset-after-write rollback, rollback resilience when openshell detach fails, prior-config restoration on a failed re-add, and registry restoration ordering when the re-upsert during re-add rollback itself throws); the source-of-truth test now parses every preset via parsePresetPolicyKeys instead of just checking file existence.
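
The best-effort removal mode in the list above can be illustrated with a small sketch. Only the `{ ok, residual }` return shape, the `gateway-providers` residual name, and the `Rollback could not fully clean` warning come from this description; `RollbackSteps`, `RemoveOutcome`, and `removeChannelBestEffort` are hypothetical, and the real helper may order or report its steps differently.

```typescript
// Hypothetical shapes; only { ok, residual } and "gateway-providers"
// are taken from the PR text.
interface RemoveOutcome {
  ok: boolean;
  residual: string[];
}

interface RollbackSteps {
  detachProvider: () => void; // throws on failure
  deleteProvider: () => void; // throws on failure
  removeRegistryEntry: () => void;
  warn: (message: string) => void;
}

function removeChannelBestEffort(steps: RollbackSteps): RemoveOutcome {
  let gatewayFailed = false;
  for (const gatewayStep of [steps.detachProvider, steps.deleteProvider]) {
    try {
      gatewayStep();
    } catch (err) {
      // Log and continue instead of exiting, so later steps still run.
      steps.warn(`gateway step failed: ${String(err)}`);
      gatewayFailed = true;
    }
  }
  // The registry rollback runs even when a gateway step failed.
  steps.removeRegistryEntry();
  const residual: string[] = gatewayFailed ? ["gateway-providers"] : [];
  if (residual.length > 0) {
    steps.warn(`Rollback could not fully clean ${residual.join(", ")}`);
  }
  return { ok: !gatewayFailed, residual };
}

// Usage: detach fails, delete and the registry rollback still run.
const rollbackCalls: string[] = [];
const rollbackWarnings: string[] = [];
const rollbackOutcome = removeChannelBestEffort({
  detachProvider: () => {
    throw new Error("openshell detach failed");
  },
  deleteProvider: () => {
    rollbackCalls.push("deleteProvider");
  },
  removeRegistryEntry: () => {
    rollbackCalls.push("removeRegistryEntry");
  },
  warn: (message) => {
    rollbackWarnings.push(message);
  },
});
```

The point of the sketch is that a failed gateway step is recorded rather than fatal, so the registry entry is still removed and the operator sees which surface needs manual cleanup.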

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
npm run docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

Summary by CodeRabbit

New Features
- Added an exported helper to parse channel preset policy keys.
Bug Fixes
- Channel adds now require a valid built-in network policy preset; missing or malformed presets abort early (exit code 1) before prompting or persisting credentials. If preset application fails after registry write, staged tokens and registry/bridge changes are rolled back and residual gateway items are warned with guidance to restore the preset YAML and re-run.
Refactor
- Rollback and provider cleanup now support best-effort behavior, collecting failures and reporting residuals instead of hard exiting.
Documentation
- Updated guides and command reference to reflect preset validation, rollback/credential behavior, residual warnings, and restore-and-retry instructions.
Tests
- Added/updated tests for preset loading failures, apply-failure aborts, credential rollback, residual warnings, and preset coverage.

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai · 2026-05-31T11:31:14Z

📝 Walkthrough

The CLI now requires the channel's built-in network policy preset to load and validates parsed network_policies before finalizing channel add. Missing/malformed presets or preset-apply failures abort with exit 1 and trigger credential/bridge/registry rollback; tests and docs updated accordingly.

Changes

Preset Application Prerequisite for Channel Add

| Layer / File(s) | Summary |
| --- | --- |
| Rollback helper refactor (best-effort and residual reporting) `src/lib/actions/sandbox/policy-channel.ts` | `applyChannelRemoveToGatewayAndRegistry` now accepts `{ bestEffort?: boolean }`, gates detach/delete on gateway reachability, collects residual cleanup items (e.g., `"gateway-providers"`), and returns `{ ok, residual }`. |
| Preset load/check, snapshot, and abort on failure `src/lib/actions/sandbox/policy-channel.ts` | `addSandboxChannel` preloads `policies.loadPreset` and validates parsed `network_policies`; missing/malformed preset causes `process.exit(1)` with a restore-and-re-run hint. For non-QR (token) flows, failed `applyChannelPresetIfAvailable()` clears staged tokens, snapshots prior registry/provider credential state, and triggers `rollbackChannelAdd` which either restores prior registry/provider credentials or removes the new bridge via the refactored removal helper (best-effort), warning on residuals, then exits `1`. |
| Provider upsert: best-effort option and tests `src/lib/onboard/providers.ts`, `src/lib/onboard/providers.test.ts` | `upsertMessagingProviders()` gains `options.bestEffort` to accumulate failures and throw an aggregated error instead of immediately exiting; tests verify best-effort mode throws and does not call `process.exit`. |
| Policy export `src/lib/policy/index.ts` | Exports `parsePresetPolicyKeys(presetContent)` so callers can inspect preset keys (e.g., `network_policies`). |
| Tests: harness flags, credential tracking, and negative-paths `test/channels-add-preset.test.ts` | Test harness gains flags to simulate missing/malformed presets; credential store stubs record `savedCredentialKeys` and `deletedCredentialKeys`; new tests assert early aborts for missing/malformed presets (including QR-only WhatsApp), rollback sequencing when `applyPreset` fails (credential deletion and registry rollback), residual gateway-provider warnings on detach failures, and a source-of-truth test ensuring a preset YAML exists for every known channel. |
| Documentation: channels add behavior `docs/manage-sandboxes/messaging-channels.mdx`, `docs/reference/commands.mdx` | Docs require the built-in network policy preset YAML for `channels add`, state the preset is applied before rebuild, and instruct restoring the preset YAML and re-running `channels add` when load/apply fails; document rollback that removes bridge providers, the `messagingChannels` registry entry, and persisted credentials and exits without prompting for rebuild. |
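
The best-effort upsert behavior summarized in the table can be sketched like this. It is an illustration under stated assumptions: `upsertProvidersSketch` and `UpsertOptions` are hypothetical, the provider names are made up for the example, and the non-best-effort branch rethrows instead of exiting the process so the sketch stays testable.

```typescript
// Hypothetical sketch; the real upsertMessagingProviders signature is not shown here.
interface UpsertOptions {
  bestEffort?: boolean;
}

function upsertProvidersSketch(
  providerNames: string[],
  upsertOne: (name: string) => void, // throws on failure
  options: UpsertOptions = {},
): void {
  const failures: string[] = [];
  for (const name of providerNames) {
    try {
      upsertOne(name);
    } catch (err) {
      if (!options.bestEffort) {
        // The summary says the default path exits immediately;
        // this sketch rethrows instead.
        throw err;
      }
      // Best-effort mode records the failure and keeps going.
      failures.push(`${name}: ${String(err)}`);
    }
  }
  if (failures.length > 0) {
    throw new Error(`upsert failed for ${failures.length} provider(s): ${failures.join("; ")}`);
  }
}

// Usage: the first provider fails, the second is still attempted,
// and the caller receives one aggregated error.
const attemptedProviders: string[] = [];
let aggregatedMessage = "";
try {
  upsertProvidersSketch(
    ["slack-bot", "slack-app"], // illustrative names only
    (name) => {
      attemptedProviders.push(name);
      if (name === "slack-bot") {
        throw new Error("gateway unreachable");
      }
    },
    { bestEffort: true },
  );
} catch (err) {
  aggregatedMessage = String(err);
}
```

Throwing one aggregated error lets a rollback caller catch and log it without the helper terminating the process partway through the restore.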

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

NVIDIA/NemoClaw#4582: Overlaps src/lib/actions/sandbox/policy-channel.ts changes around token persistence and gating before registry writes.

Suggested labels

bug, enhancement: policy, v0.0.56

Suggested reviewers

ericksoa
cv
cjagwani

Poem

🐇 I found a YAML where presets slept,
A channel half-configured, promises kept.
Restore the file, then run the add again,
Rollback clears crumbs and rights the train.
Hop, patch, retry — channels whole again.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main fix: making channels add atomic when the policy preset is missing, which directly addresses the core issue. |
| Linked Issues check | ✅ Passed | The PR comprehensively addresses all coding requirements from `#4548`: preset validation before mutations, non-zero exit on preset failures, rollback on apply failure, and atomic operation semantics. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the PR objective: preset validation helpers, best-effort cleanup modes, rollback logic, and comprehensive test/doc updates for atomicity. |

github-actions · 2026-05-31T11:32:54Z

E2E Advisor Recommendation

Required E2E: channels-add-remove-e2e, messaging-providers-e2e
Optional E2E: token-rotation-e2e, network-policy-e2e, ubuntu-repo-cloud-openclaw-telegram

Dispatch hint: channels-add-remove-e2e,messaging-providers-e2e

Auto-dispatched E2E: channels-add-remove-e2e, messaging-providers-e2e via nightly-e2e.yaml at 82fb0467c0459df78b3e9115529058aac9bf432d — nightly run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

channels-add-remove-e2e (high (~75 min timeout)): Direct regression coverage for nemoclaw <sandbox> channels add telegram: verifies provider registration, matching policy preset application before rebuild, post-rebuild Telegram egress through the L7 policy, and cleanup on remove. This is the closest existing E2E for the changed policy-channel.ts add/remove and policy alignment behavior.
messaging-providers-e2e (high (~75 min timeout)): Validates the full messaging provider/placeholder/L7-proxy chain for Telegram, Discord, Slack, WeChat, plus the QR-only WhatsApp no-provider path. This should catch regressions from the new preset preflight, provider upsert behavior, and QR-only channel add policy requirement.

Optional E2E

token-rotation-e2e (medium-high (~45 min timeout)): Useful adjacent confidence because providers.ts and providerCredentialHashes behavior are touched, and the PR changes re-add/rotation rollback semantics. Existing coverage focuses on onboard-driven provider rotation rather than the new channel-add rollback path, so this is helpful but not the primary merge gate.
network-policy-e2e (high): Broad confidence for policy preset application and egress enforcement. The PR exports parsePresetPolicyKeys and changes channel policy fail-closed behavior, but the direct channel-policy gate is better covered by channels-add-remove-e2e; this broader suite is optional unless reviewers want extra policy-engine confidence.
ubuntu-repo-cloud-openclaw-telegram (high (~90 min timeout)): Typed scenario coverage for onboarding with Telegram messaging enabled, including provider attachment, placeholder/no-secret checks, and bridge reachability. Useful as an alternate/adjacent scenario-runner check for the shared provider helper changes.

New E2E recommendations

messaging-channel rollback (high): The new rollback paths for channels add after a successful provider/registry write but failed policy apply are covered by unit-style tests, but no existing E2E appears to sabotage policy application in a live CLI/gateway flow and assert no rebuild prompt plus registry/credential/provider cleanup or residual warnings.
- Suggested test: Add a channels-add-preset-rollback E2E that forces policy-add/preset application failure after bridge provider registration, then asserts credentials and messagingChannels are rolled back, rebuild is not prompted, and residual gateway-provider warnings are actionable.
policy preset fail-closed validation (medium): The PR requires missing or malformed built-in channel preset YAML to abort before token prompts, provider upsert, registry writes, or rebuild prompts. Existing E2Es validate the happy path with present presets, not the fail-closed missing/malformed preset path.
- Suggested test: Add an E2E that temporarily hides or corrupts a channel preset YAML in an isolated checkout and runs nemoclaw <sandbox> channels add telegram --dry-run and non-dry-run to verify fail-fast behavior and no side effects.

Dispatch hint

Workflow: .github/workflows/nightly-e2e.yaml
jobs input: channels-add-remove-e2e,messaging-providers-e2e

github-actions · 2026-05-31T11:32:55Z

E2E Scenario Advisor Recommendation

Required scenario E2E: ubuntu-repo-cloud-openclaw-token-rotation, ubuntu-repo-cloud-openclaw-telegram
Optional scenario E2E: ubuntu-repo-cloud-openclaw-slack, ubuntu-repo-cloud-hermes-slack

Dispatch required scenario E2E:

gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-token-rotation
gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-telegram

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: medium

Required scenario E2E

ubuntu-repo-cloud-openclaw-token-rotation: Primary targeted scenario for provider update/token-rotation behavior. The PR changes messaging provider upsert behavior and channel re-add rollback paths, which are closest to the token-rotation messaging scenario surface.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-token-rotation
ubuntu-repo-cloud-openclaw-telegram: Primary token-based messaging scenario for OpenClaw. The PR changes channel policy preset validation/application and provider registration helpers; the Telegram scenario exercises a token-backed messaging provider with the common messaging suite on the default OpenClaw onboarding path.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-telegram

Optional scenario E2E

ubuntu-repo-cloud-openclaw-slack: Optional adjacent coverage for the multi-provider Slack path, including the Slack bot/app provider naming and registration shape affected by the shared provider helper changes.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-openclaw-slack
ubuntu-repo-cloud-hermes-slack: Optional adjacent coverage for the same Slack messaging/provider surface under Hermes onboarding, useful if reviewers want cross-agent confidence beyond the primary OpenClaw target.
- Dispatch: gh workflow run e2e-scenarios.yaml --ref <pr-head-ref> --field scenarios=ubuntu-repo-cloud-hermes-slack

Relevant changed files

src/lib/actions/sandbox/policy-channel.ts
src/lib/onboard/providers.ts
src/lib/policy/index.ts

github-actions · 2026-05-31T11:35:38Z

PR Review Advisor

Findings: 2 needs attention, 6 worth checking, 0 nice ideas
Since last review: 1 prior item resolved, 4 still apply, 1 new item found

Review findings

🛠️ Needs attention

Restore disabledChannels on failed re-add rollback (src/lib/actions/sandbox/policy-channel.ts:946): The already-enabled rollback snapshots and restores messagingChannels, providerCredentialHashes, and staged credentials, but it does not snapshot or restore disabledChannels. applyChannelAddToGatewayAndRegistry removes the channel from disabledChannels before the fallible policy apply step. If an operator re-adds or rotates a channel that was configured but stopped, a policy apply failure can leave the channel enabled for the next rebuild even though rollback claimed to restore the prior configuration.
- Recommendation: Snapshot prior disabledChannels alongside priorMessagingChannels and restore it in the wasAlreadyEnabled rollback branch. Add a regression test where the channel is present in both messagingChannels and disabledChannels before re-add, applyPreset fails, and disabledChannels still contains the channel afterward.
- Evidence: applyChannelAddToGatewayAndRegistry filters disabledChannels to remove the channel during the add update. rollbackChannelAdd's wasAlreadyEnabled branch calls registry.updateSandbox with messagingChannels and providerCredentialHashes only; registry.updateSandbox uses Object.assign, so omitted disabledChannels keeps the post-add value.
Offset the policy-channel.ts monolith growth (src/lib/actions/sandbox/policy-channel.ts:385): This PR adds preset preflight, best-effort gateway/registry rollback, residual-state handling, and re-add restoration to an already-large sandbox action file at a high-risk credential/gateway/policy boundary. The deterministic drift check reports policy-channel.ts grew by 117 lines and remains over 1300 lines.
- Recommendation: Extract the new channel-add preflight and rollback behavior into a focused helper module, or offset the growth with nearby cleanup before merge.
- Evidence: The diff adds bestEffort support to applyChannelRemoveToGatewayAndRegistry, adds preset preflight in addSandboxChannel, and adds rollbackChannelAdd in src/lib/actions/sandbox/policy-channel.ts.
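
The `disabledChannels` finding above turns on how an `Object.assign` merge treats omitted keys. A minimal sketch, assuming a registry entry with only the two fields named in the finding (`SandboxEntrySketch` and `updateSandboxSketch` are hypothetical stand-ins for the real registry types):

```typescript
// Hypothetical registry entry; field names follow the finding above.
interface SandboxEntrySketch {
  messagingChannels: string[];
  disabledChannels: string[];
}

// Mirrors the Object.assign merge described in the evidence:
// keys absent from the patch keep their current value.
function updateSandboxSketch(
  entry: SandboxEntrySketch,
  patch: Partial<SandboxEntrySketch>,
): SandboxEntrySketch {
  return Object.assign({}, entry, patch);
}

const priorEntry: SandboxEntrySketch = {
  messagingChannels: ["telegram"],
  disabledChannels: ["telegram"], // configured but stopped
};

// The add step re-enables the channel before the fallible policy apply.
const afterAdd = updateSandboxSketch(priorEntry, {
  disabledChannels: priorEntry.disabledChannels.filter((c) => c !== "telegram"),
});

// Rollback that omits disabledChannels keeps the post-add value.
const partialRollback = updateSandboxSketch(afterAdd, {
  messagingChannels: priorEntry.messagingChannels,
});
// partialRollback.disabledChannels is []

// Rollback that restores the snapshot puts the channel back.
const fullRollback = updateSandboxSketch(afterAdd, {
  messagingChannels: priorEntry.messagingChannels,
  disabledChannels: priorEntry.disabledChannels,
});
// fullRollback.disabledChannels is ["telegram"]
```

Snapshotting `disabledChannels` next to `messagingChannels`, as the recommendation suggests, is what makes the second rollback restore the stopped state.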

🔎 Worth checking

Source-of-truth review needed: Best-effort fresh channel-add rollback after post-write policy apply failure: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: addSandboxChannel calls persistChannelTokens and applyChannelAddToGatewayAndRegistry before applyChannelPresetIfAvailable, then rollbackChannelAdd calls applyChannelRemoveToGatewayAndRegistry with { bestEffort: true }.
Source-of-truth review needed: Already-enabled channel re-add rollback: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: rollbackChannelAdd's wasAlreadyEnabled branch updates messagingChannels and providerCredentialHashes, clears/re-saves env credentials when available, attempts best-effort provider re-upsert, and always reports gateway-providers residual.
Source-of-truth review needed: bestEffort provider upsert helper: The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: upsertMessagingProviders collects failures and throws when options.bestEffort is true; rollbackChannelAdd catches and logs this in the already-enabled restore path.
Complete the source-of-truth plan for best-effort rollback (src/lib/actions/sandbox/policy-channel.ts:983): The fresh token-backed add path still creates a transient mixed state by staging credentials, upserting OpenShell gateway providers, and writing the sandbox registry before the fallible policy apply step. The new rollback improves cleanup and operator visibility, but gateway detach/delete failures can leave residual providers or credentials behind after the registry is rolled back.
- Recommendation: Prefer making the invalid state impossible by moving fallible policy work before gateway/provider/registry mutation where feasible. If best-effort rollback remains, document why the source cannot be fixed in this PR, what real-boundary regression proves it cannot silently regress, and when the workaround can be removed.
- Evidence: addSandboxChannel calls persistChannelTokens and applyChannelAddToGatewayAndRegistry before applyChannelPresetIfAvailable. On failure it calls rollbackChannelAdd, which invokes applyChannelRemoveToGatewayAndRegistry with { bestEffort: true } and may print Rollback could not fully clean gateway-providers.
Keep re-add rollback tied to the gateway credential source of truth (src/lib/actions/sandbox/policy-channel.ts:962): For an already-enabled channel, rollback restores registry/hash state and only re-saves prior credentials that were available from process.env. NemoClaw cannot read the old secret back from the OpenShell gateway, which the credential storage docs identify as the credential system of record. If the prior value is not env-staged, or if the env contains the new rotation token, rollback cannot prove the gateway provider returned to the prior credential.
- Recommendation: Prefer a gateway/source-of-truth snapshot and restore transaction, or reorder the flow so provider overwrite cannot happen before all fallible policy work succeeds. If best-effort rollback remains, add a negative test for prior registry/hash state with no prior env value or a different new env token, and keep operator guidance explicit that gateway verification is required.
- Evidence: rollbackChannelAdd snapshots priorCreds using getCredential, clears current tokens, re-saves snapshot.priorCreds only when present, then attempts onboardProviders.upsertMessagingProviders with bestEffort. docs/security/credential-storage.mdx states credentials are not persisted to host disk and the OpenShell gateway is the only system of record.
Add targeted runtime validation for the policy/gateway/registry boundary (test/channels-add-preset.test.ts:63): The unit coverage is much stronger and covers the linked WhatsApp missing-preset repro plus rollback branches, but most changed behavior is still exercised through monkeypatched policy, registry, gateway, provider, and OpenShell modules. The modified runtime path spans real preset files, policy parsing/application, OpenShell gateway provider operations, registry writes, CLI exit behavior, and rebuild prompting.
- Recommendation: Add or identify a targeted runtime/integration validation that hides whatsapp.yaml in a controlled install/source tree and verifies non-zero exit, no registry write, no provider work, no policy preset application, and no rebuild prompt. Where feasible, also cover a post-write applyPreset failure through the real provider/registry boundary.
- Evidence: buildPreamble stubs policies.loadPreset/applyPreset, registry, recoverNamedGatewayRuntime, upsertMessagingProviders, and runOpenshell. The deterministic test-depth context recommends runtime validation for the changed runtime/sandbox/infrastructure paths.
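
The credential limitation raised in the re-add finding above can be shown with a small in-memory model. This is a sketch under assumptions: the `Map` stands in for env-staged credentials, `TELEGRAM_BOT_TOKEN` is an illustrative key name, and `snapshotCredential` and `restoreCredential` are hypothetical helpers rather than the PR's `getCredential`-based code.

```typescript
// Hypothetical in-memory store standing in for env-staged credentials.
type CredentialStoreSketch = Map<string, string>;

interface CredentialSnapshot {
  key: string;
  prior: string | undefined;
}

function snapshotCredential(store: CredentialStoreSketch, key: string): CredentialSnapshot {
  return { key, prior: store.get(key) };
}

// Clears the staged token, then re-saves the prior value only when one was captured.
function restoreCredential(store: CredentialStoreSketch, snapshot: CredentialSnapshot): boolean {
  store.delete(snapshot.key);
  if (snapshot.prior === undefined) {
    return false;
  }
  store.set(snapshot.key, snapshot.prior);
  return true;
}

// Case 1: no prior value was staged, so rollback has nothing to restore.
const emptyStore: CredentialStoreSketch = new Map();
const emptySnapshot = snapshotCredential(emptyStore, "TELEGRAM_BOT_TOKEN");
emptyStore.set("TELEGRAM_BOT_TOKEN", "new-token"); // staged by the failed add
const restoredFromEmpty = restoreCredential(emptyStore, emptySnapshot);

// Case 2: the store already holds the rotation token when the snapshot is taken,
// so "restoring" re-saves the new token rather than the old one.
const rotatedStore: CredentialStoreSketch = new Map([["TELEGRAM_BOT_TOKEN", "new-token"]]);
const rotatedSnapshot = snapshotCredential(rotatedStore, "TELEGRAM_BOT_TOKEN");
const restoredFromRotated = restoreCredential(rotatedStore, rotatedSnapshot);
```

In both cases the local store ends up in a consistent state, but neither proves the gateway provider holds the prior secret, which is why the finding asks for explicit operator guidance or a gateway-side snapshot.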

🌱 Nice ideas

None.

This is an automated advisory review. A human maintainer must make the final merge decision.

github-actions · 2026-05-31T11:47:59Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26711436094
Target ref: 63507a34e88a1a390dbb4c36b569abdc9c58f6b1
Workflow ref: main
Requested jobs: channels-add-remove-e2e
Summary: 1 passed, 0 failed, 0 skipped

| Job | Result |
| --- | --- |
| channels-add-remove-e2e | ✅ success |

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

github-actions · 2026-05-31T11:56:47Z

🌿 Preview your docs: https://nvidia-preview-pr-4577.docs.buildwithfern.com/nemoclaw

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/manage-sandboxes/messaging-channels.mdx`:
- Around line 167-168: The two-sentence paragraph describing `channels add`
should be split so each sentence is on its own line and the em dash before the
remediation instruction should be replaced with a period; edit the block that
starts with "`channels add` requires the matching built-in network policy preset
YAML..." so the first sentence ends after "sandbox is never left advertising a
channel without a matching network policy." and the second sentence begins with
"If the file is present..." and ends after "rebuild prompt.", then convert the
em dash in the remediation sentence after "apply itself fails" into a period and
put "restore the preset YAML and re-run `nemoclaw <sandbox> channels add
<channel>`." on its own line.

In `@docs/reference/commands.mdx`:
- Line 726: The sentence in the docs line about NemoClaw packs multiple
sentences onto one line and uses an em dash; split that paragraph into separate
lines with one sentence per line and replace the em dash with a period, e.g.
break the long line starting with "NemoClaw requires..." into three separate
sentences/lines describing (1) missing preset YAML behavior, (2) behavior when
the file is present and apply occurs, and (3) recovery instruction starting with
"restore the preset YAML and re-run `nemoclaw <name> channels add <channel>`".

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0fd7308a-cb1a-4806-aa4d-dd5969adb849

📥 Commits

Reviewing files that changed from the base of the PR and between 63507a3 and 1f9359a.

📒 Files selected for processing (4)

docs/manage-sandboxes/messaging-channels.mdx
docs/reference/commands.mdx
src/lib/actions/sandbox/policy-channel.ts
test/channels-add-preset.test.ts

🚧 Files skipped from review as they are similar to previous changes (1)

src/lib/actions/sandbox/policy-channel.ts

github-actions · 2026-05-31T12:14:45Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26711992585
Target ref: 1f9359a2c7d958231f4d2c6d1981d1f4fa190e48
Workflow ref: main
Requested jobs: channels-add-remove-e2e,messaging-providers-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
channels-add-remove-e2e	✅ success
messaging-providers-e2e	✅ success

… style Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

docs/manage-sandboxes/messaging-channels.mdx (1)

168-170: 💤 Low value

Consider varying sentence structure to improve readability.

Three consecutive sentences begin with "If", creating a repetitive pattern. While grammatically correct, varying the structure would improve flow.

💡 Possible rewrite

-If the file is missing, the command exits with a non-zero status before any token prompt, registry write, or rebuild prompt, so the sandbox is never left advertising a channel without a matching network policy.
-If the file is present, `channels add` applies the preset to the sandbox automatically before the rebuild so the bridge has egress to its upstream API.
-If the apply itself fails, the command exits with a non-zero status without prompting for rebuild.
+When the file is missing, the command exits with a non-zero status before any token prompt, registry write, or rebuild prompt, so the sandbox is never left advertising a channel without a matching network policy.
+When the file is present, `channels add` applies the preset to the sandbox automatically before the rebuild so the bridge has egress to its upstream API.
+Should the apply itself fail, the command exits with a non-zero status without prompting for rebuild.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/manage-sandboxes/messaging-channels.mdx` around lines 168 - 170, Reword
the three consecutive sentences starting with "If" to vary sentence openings and
improve flow: keep the same facts about the missing file (command exits non-zero
before prompts so sandbox isn't left advertising a channel), the present file
behavior (`channels add` auto-applies the preset before rebuild so bridge has
egress), and the failed-apply behavior (command exits non-zero without prompting
for rebuild), but change sentence structure — e.g., start one sentence with the
consequence, another with the condition using a different phrasing, and the
third as a dependent clause — updating the lines that contain "If the file is
missing", "`channels add` applies", and "If the apply itself fails".

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/manage-sandboxes/messaging-channels.mdx`:
- Line 168: Rewrite the passive sentence in the given paragraph to active voice:
replace "so the sandbox is never left advertising a channel without a matching
network policy" with an active construction such as "so the command prevents the
sandbox from advertising a channel without a matching network policy." Update
the sentence that begins "If the file is missing, the command exits..." in the
messaging-channels.mdx content to use this active phrasing while preserving
meaning.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 070d214f-3545-41cc-80ef-5431b1b49e8a

📥 Commits

Reviewing files that changed from the base of the PR and between 1f9359a and 2f7f685.

📒 Files selected for processing (4)

docs/manage-sandboxes/messaging-channels.mdx
docs/reference/commands.mdx
src/lib/actions/sandbox/policy-channel.ts
test/channels-add-preset.test.ts

🚧 Files skipped from review as they are similar to previous changes (2)

src/lib/actions/sandbox/policy-channel.ts
docs/reference/commands.mdx

…ay rollback Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

github-actions · 2026-05-31T12:45:44Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26712645489
Target ref: 2f7f685579b215710cacec721745ccd06c5d1bbd
Workflow ref: main
Requested jobs: channels-add-remove-e2e,messaging-providers-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
channels-add-remove-e2e	✅ success
messaging-providers-e2e	✅ success

coderabbitai

🧹 Nitpick comments (1)

docs/manage-sandboxes/messaging-channels.mdx (1)

170-170: ⚡ Quick win

Add article before "rebuild" for consistency.

The phrase "without prompting for rebuild" treats "rebuild" as an uncountable noun, but elsewhere in this document "rebuild" consistently takes an article as a countable noun: "the rebuild" (lines 169, 172, 177, 236, 238), "a manual rebuild" (line 184).

📝 Suggested fix

-When the apply step itself fails after the registry write, NemoClaw rolls back the bridge providers, the `messagingChannels` entry, and the persisted credentials, then exits without prompting for rebuild.
+When the apply step itself fails after the registry write, NemoClaw rolls back the bridge providers, the `messagingChannels` entry, and the persisted credentials, then exits without prompting for a rebuild.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/manage-sandboxes/messaging-channels.mdx` at line 170, The sentence
containing "without prompting for rebuild" should include the definite article
to match the document's consistent usage; update that phrase to "without
prompting for the rebuild" in the same sentence that starts "When the apply step
itself fails after the registry write, NemoClaw rolls back the bridge providers,
the `messagingChannels` entry, and the persisted credentials, then exits without
prompting for rebuild" so it reads "...then exits without prompting for the
rebuild."

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d82988eb-fc01-4d45-af25-0438299b37ab

📥 Commits

Reviewing files that changed from the base of the PR and between 2f7f685 and 0c0a4a9.

📒 Files selected for processing (4)

docs/manage-sandboxes/messaging-channels.mdx
docs/reference/commands.mdx
src/lib/actions/sandbox/policy-channel.ts
test/channels-add-preset.test.ts

✅ Files skipped from review due to trivial changes (1)

docs/reference/commands.mdx

🚧 Files skipped from review as they are similar to previous changes (2)

src/lib/actions/sandbox/policy-channel.ts
test/channels-add-preset.test.ts

github-actions · 2026-05-31T13:02:30Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26713016536
Target ref: 0c0a4a93cf875f8822c150d50faa9f9aa4494ae6
Workflow ref: main
Requested jobs: channels-add-remove-e2e,messaging-providers-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
channels-add-remove-e2e	✅ success
messaging-providers-e2e	✅ success

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

github-actions · 2026-05-31T13:26:38Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26713559497
Target ref: 3416e2a750e9849b20362b5e5db4cd43d38994d3
Workflow ref: main
Requested jobs: channels-add-remove-e2e,messaging-providers-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
channels-add-remove-e2e	✅ success
messaging-providers-e2e	✅ success

…d re-add Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/lib/actions/sandbox/policy-channel.ts (1)

810-828: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make --dry-run honor the new preset preflight.

Right now --dry-run returns before loadPreset / parsePresetPolicyKeys run, so it exits 0 even when the real channels add would now fail on a missing or malformed built-in preset. That makes the dry-run contract lie about whether the command is actually runnable.

Suggested change

-  if (dryRun) {
-    console.log(`  --dry-run: would enable channel '${canonical}' for '${sandboxName}'.`);
-    return;
-  }
-
   const presetContent = policies.loadPreset(canonical);
   const presetPolicyKeys =
     presetContent === null ? [] : policies.parsePresetPolicyKeys(presetContent);
   if (presetContent === null || presetPolicyKeys.length === 0) {
     if (presetContent !== null && presetPolicyKeys.length === 0) {
       console.error(
         `  Preset YAML for channel '${canonical}' has no parseable entries under 'network_policies:'.`,
       );
     }
     console.error(
       `    Restore the preset YAML and re-run: ${CLI_NAME} ${sandboxName} channels add ${canonical}`,
     );
     process.exit(1);
   }
+
+  if (dryRun) {
+    console.log(`  --dry-run: would enable channel '${canonical}' for '${sandboxName}'.`);
+    return;
+  }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/actions/sandbox/policy-channel.ts` around lines 810 - 828, The
dry-run branch returns early and skips the new preset preflight, so dryRun in
the channel add flow can report success even when policies.loadPreset(canonical)
or policies.parsePresetPolicyKeys(...) would fail; modify the control flow in
the function handling the --dry-run (check the dryRun variable) so that it still
calls policies.loadPreset(canonical) and
policies.parsePresetPolicyKeys(presetContent) and performs the same
validation/error handling (including the console.error messages and
process.exit(1) path using sandboxName, canonical, and CLI_NAME) but only skips
the actual side-effect of enabling the channel—i.e., run the preset preflight
before returning for dryRun.
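
The ordering requested above can be sketched in isolation. This is an illustration under stated assumptions, not the code in `policy-channel.ts`: `addChannelSketch`, `PresetDeps`, and `AddOutcome` are hypothetical names, the loader and parser are injected stubs, and the function returns an outcome instead of calling `process.exit`.

```typescript
// Hypothetical stand-ins for policies.loadPreset and policies.parsePresetPolicyKeys.
interface PresetDeps {
  loadPreset: (channel: string) => string | null;
  parsePresetPolicyKeys: (content: string) => string[];
}

interface AddOutcome {
  exitCode: number;
  enabled: boolean;
  messages: string[];
}

function addChannelSketch(channel: string, dryRun: boolean, deps: PresetDeps): AddOutcome {
  const messages: string[] = [];

  // Preflight first: a missing or unparseable preset fails in both modes.
  const content = deps.loadPreset(channel);
  const keys = content === null ? [] : deps.parsePresetPolicyKeys(content);
  if (content === null || keys.length === 0) {
    if (content !== null) {
      messages.push(
        `Preset YAML for channel '${channel}' has no parseable entries under 'network_policies:'.`,
      );
    }
    messages.push(`Restore the preset YAML and re-run channels add ${channel}`);
    return { exitCode: 1, enabled: false, messages };
  }

  // Only now may --dry-run short-circuit, before any side effect.
  if (dryRun) {
    messages.push(`--dry-run: would enable channel '${channel}'.`);
    return { exitCode: 0, enabled: false, messages };
  }

  // Placeholder for the real side effects (gateway, registry, credentials).
  messages.push(`enabled channel '${channel}'`);
  return { exitCode: 0, enabled: true, messages };
}

const missingPreset: PresetDeps = {
  loadPreset: () => null,
  parsePresetPolicyKeys: () => [],
};
const validPreset: PresetDeps = {
  loadPreset: () => "network_policies:\n  whatsapp: {}\n",
  parsePresetPolicyKeys: () => ["whatsapp"],
};

const dryRunMissing = addChannelSketch("whatsapp", true, missingPreset);
const dryRunValid = addChannelSketch("whatsapp", true, validPreset);
```

With this ordering a dry run against a missing preset reports failure, while a dry run against a valid preset succeeds without enabling anything.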

🧹 Nitpick comments (1)

test/channels-add-preset.test.ts (1)

1374-1388: ⚡ Quick win

Validate preset contents, not just file presence.

This guard still passes if <channel>.yaml exists but is malformed or has no parseable network_policies, which are the exact failure modes addSandboxChannel() now treats as fatal. Parsing each on-disk preset here would make the source-of-truth test actually cover the new contract.

♻️ Tighten the source-of-truth check

   it("every channel registered in KNOWN_CHANNELS ships a matching preset YAML on disk", () => {
     const { knownChannelNames } = require(path.join(repoRoot, "dist", "lib", "sandbox", "channels.js")) as {
       knownChannelNames: () => string[];
     };
+    const { parsePresetPolicyKeys } = require(path.join(repoRoot, "dist", "lib", "policy", "index.js")) as {
+      parsePresetPolicyKeys: (content: string) => string[];
+    };
     const presetDir = path.join(repoRoot, "nemoclaw-blueprint", "policies", "presets");
     const missing: string[] = [];
     for (const name of knownChannelNames()) {
       const file = path.join(presetDir, `${name}.yaml`);
-      if (!fs.existsSync(file)) missing.push(file);
+      if (!fs.existsSync(file)) {
+        missing.push(file);
+        continue;
+      }
+      const content = fs.readFileSync(file, "utf8");
+      if (parsePresetPolicyKeys(content).length === 0) {
+        missing.push(`${file} (no parseable network_policies)`);
+      }
     }
     assert.deepEqual(
       missing,
       [],
       `every channel in KNOWN_CHANNELS must have a matching preset YAML; missing: ${missing.join(", ")}`,

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/channels-add-preset.test.ts` around lines 1374 - 1388, The test
currently only checks that a preset file exists for each channel; instead, load
and parse each YAML preset from presetDir for every name from
knownChannelNames(), and assert that parsing succeeds and the resulting object
contains a non-empty network_policies (or the exact key/shape
addSandboxChannel() expects). Update the loop that builds missing to also read
fs.readFileSync(file, "utf8"), parse with the same YAML loader used in
production (or require the preset parser used by addSandboxChannel()), and push
file onto missing (or a new failures list) if parsing throws or network_policies
is absent/empty; finally assert failures is empty with a helpful message
referencing addSandboxChannel()'s contract.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/actions/sandbox/policy-channel.ts`:
- Around line 924-945: The rollback currently calls
onboardProviders.upsertMessagingProviders which can call process.exit on failure
and therefore prevents the registry restore; change this to a non-exiting,
best-effort provider-restore path (e.g. call a non-exiting variant like
onboardProviders.upsertMessagingProvidersBestEffort(...) or pass a { bestEffort:
true } flag to upsertMessagingProviders) inside a try/catch/finally so that any
gateway restore errors are caught and result in
residual.push("gateway-providers"), and ensure registry.updateSandbox(...) is
always executed (move it to finally or after the catch) so the prior registry
state is restored even when the gateway restore flakes.
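
The shape this comment asks for might look like the sketch below. It assumes the gateway restore step throws on failure rather than exiting the process; `rollbackReAdd` and both callbacks are hypothetical names introduced for illustration only.

```typescript
interface RollbackResult {
  ok: boolean;
  residual: string[];
}

// Hypothetical sketch: the gateway step is best-effort, and the registry
// restore runs in `finally` so it executes whether or not the gateway fails.
function rollbackReAdd(
  restoreGatewayProviders: () => void,
  restoreRegistry: () => void,
): RollbackResult {
  const residual: string[] = [];
  try {
    restoreGatewayProviders();
  } catch (err) {
    console.error(
      `gateway restore failed: ${err instanceof Error ? err.message : String(err)}`,
    );
    residual.push("gateway-providers");
  } finally {
    restoreRegistry();
  }
  return { ok: residual.length === 0, residual };
}

// Demo: the gateway step fails, yet the registry step still runs.
const calls: string[] = [];
const result = rollbackReAdd(
  () => {
    calls.push("gateway");
    throw new Error("gateway unreachable");
  },
  () => {
    calls.push("registry");
  },
);
```

The caller can then surface `residual` in a manual-cleanup warning.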

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8e731a0f-5c07-40f2-adb2-85f42e707d2a

📥 Commits

Reviewing files that changed from the base of the PR and between 3416e2a and d7b6402.

📒 Files selected for processing (5)

docs/manage-sandboxes/messaging-channels.mdx
docs/reference/commands.mdx
src/lib/actions/sandbox/policy-channel.ts
src/lib/policy/index.ts
test/channels-add-preset.test.ts

✅ Files skipped from review due to trivial changes (1)

docs/manage-sandboxes/messaging-channels.mdx

🚧 Files skipped from review as they are similar to previous changes (1)

docs/reference/commands.mdx

github-actions · 2026-05-31T14:11:33Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26714552896
Target ref: d7b64028e6b30ea7684fb7240b540a04995920c1
Workflow ref: main
Requested jobs: channels-add-remove-e2e,messaging-providers-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
channels-add-remove-e2e	✅ success
messaging-providers-e2e	✅ success

…ilure safety Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

github-actions · 2026-05-31T14:44:04Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26715309625
Target ref: c12353992f00cd968c95972a9c149a6065ad2ba1
Workflow ref: main
Requested jobs: channels-add-remove-e2e,messaging-providers-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
channels-add-remove-e2e	✅ success
messaging-providers-e2e	✅ success

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

github-actions · 2026-05-31T15:10:22Z

Selective E2E Results — ✅ All requested jobs passed

Run: 26715901484
Target ref: 3b3c520b07653471121e4a19838f32ea5ea62cfe
Workflow ref: main
Requested jobs: channels-add-remove-e2e,messaging-providers-e2e
Summary: 2 passed, 0 failed, 0 skipped

Job	Result
channels-add-remove-e2e	✅ success
messaging-providers-e2e	✅ success

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/lib/actions/sandbox/policy-channel.ts (1)

950-972: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Only report gateway-providers residuals when the restore actually fails.

This branch prepopulates residual and prints the cleanup warning before the best-effort restore runs, so a successful upsertMessagingProviders(..., { bestEffort: true }) still returns { ok: false, residual: ["gateway-providers"] } and tells the user to do manual cleanup unnecessarily.

Suggested fix

-    const residual: string[] = ["gateway-providers"];
-    console.error(
-      `  ${YW}⚠${R} Rollback could not fully clean ${residual.join(", ")}; run '${CLI_NAME} ${sandboxName} channels remove ${canonical}' once the gateway is reachable.`,
-    );
+    const residual: string[] =
+      Object.keys(snapshot.priorCreds).length > 0 ? [] : ["gateway-providers"];
     if (Object.keys(snapshot.priorCreds).length > 0) {
       try {
         const priorTokenDefs = Object.entries(snapshot.priorCreds).map(([envKey, token]) => ({
           name: bridgeProviderName(sandboxName, canonical, envKey),
           envKey,
           token,
         }));
         onboardProviders.upsertMessagingProviders(priorTokenDefs, runOpenshell, {
           bestEffort: true,
         });
       } catch (err) {
+        if (!residual.includes("gateway-providers")) residual.push("gateway-providers");
         console.error(
           `  ${YW}⚠${R} Failed to restore gateway providers for '${canonical}': ${
             err instanceof Error ? err.message : String(err)
           }`,
         );
       }
     }
+    if (residual.length > 0) {
+      console.error(
+        `  ${YW}⚠${R} Rollback could not fully clean ${residual.join(", ")}; run '${CLI_NAME} ${sandboxName} channels remove ${canonical}' once the gateway is reachable.`,
+      );
+    }
     return { ok: false, residual };

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/actions/sandbox/policy-channel.ts` around lines 950 - 972, The code
prepopulates residual = ["gateway-providers"] and logs the cleanup warning
before attempting the best-effort restore, causing false-positive residual
reporting; change the flow so you only add "gateway-providers" to residual and
emit the console.error/return it when the restore actually fails. Concretely,
move initialization of residual to an empty array, run the
upsertMessagingProviders call inside the try block (using snapshot.priorCreds ->
priorTokenDefs built with bridgeProviderName(sandboxName, canonical, envKey)),
and only push("gateway-providers") and log the CLI_NAME warning if
upsertMessagingProviders throws (catch path) or if priorCreds is
empty/unrestorable; keep runOpenshell and bestEffort usage unchanged.
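
A standalone sketch of the reporting rule described above follows. It is an assumption-laden illustration rather than the PR's code: `restorePriorProviders`, `TokenDef`, and `RestoreReport` are hypothetical, `upsert` stands in for the best-effort gateway upsert and is assumed to throw on failure, and `clean` describes only the rollback, not the overall `channels add` outcome.

```typescript
interface TokenDef {
  envKey: string;
  token: string;
}

interface RestoreReport {
  clean: boolean;
  residual: string[];
  warnings: string[];
}

function restorePriorProviders(
  priorCreds: Record<string, string>,
  upsert: (defs: TokenDef[]) => void,
): RestoreReport {
  const residual: string[] = [];
  const warnings: string[] = [];
  const defs = Object.entries(priorCreds).map(([envKey, token]) => ({ envKey, token }));

  if (defs.length === 0) {
    // No prior credentials to restore from, so the gateway cannot be confirmed clean.
    residual.push("gateway-providers");
  } else {
    try {
      upsert(defs);
    } catch (err) {
      residual.push("gateway-providers");
      warnings.push(
        `Failed to restore gateway providers: ${err instanceof Error ? err.message : String(err)}`,
      );
    }
  }

  // The cleanup warning is emitted only when something is actually left over.
  if (residual.length > 0) {
    warnings.push(`Rollback could not fully clean ${residual.join(", ")}`);
  }
  return { clean: residual.length === 0, residual, warnings };
}

// Placeholder credentials for the demo; the key and value are made up.
const creds = { EXAMPLE_TOKEN: "placeholder" };
const restored = restorePriorProviders(creds, () => {});
const failed = restorePriorProviders(creds, () => {
  throw new Error("gateway unreachable");
});
const noCreds = restorePriorProviders({}, () => {});
```

A successful restore therefore produces no residual entry and no manual-cleanup warning.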

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a977310f-b44f-4224-96b8-2b5610cde13b

📥 Commits

Reviewing files that changed from the base of the PR and between 3b3c520 and 82fb046.

📒 Files selected for processing (7)

docs/manage-sandboxes/messaging-channels.mdx
docs/reference/commands.mdx
src/lib/actions/sandbox/policy-channel.ts
src/lib/onboard/providers.test.ts
src/lib/onboard/providers.ts
src/lib/policy/index.ts
test/channels-add-preset.test.ts

✅ Files skipped from review due to trivial changes (1)

docs/reference/commands.mdx

🚧 Files skipped from review as they are similar to previous changes (3)

docs/manage-sandboxes/messaging-channels.mdx
src/lib/policy/index.ts
test/channels-add-preset.test.ts

github-actions · 2026-06-01T22:24:34Z

Selective E2E Results — ❌ Some jobs failed

Run: 26785074041
Target ref: 82fb0467c0459df78b3e9115529058aac9bf432d
Workflow ref: main
Requested jobs: channels-add-remove-e2e,messaging-providers-e2e
Summary: 1 passed, 1 failed, 0 skipped

Job	Result
channels-add-remove-e2e	✅ success
messaging-providers-e2e	❌ failure

Failed jobs: messaging-providers-e2e. Check run artifacts for logs.

## Summary

- Add the missing `v0.0.57` release-notes section with links to the detailed docs pages for command, inference, onboarding, messaging, status, installer, and policy changes.
- Remove public references to docs-skip terms from source docs and regenerate the NemoClaw user skills from the current Fern MDX docs.
- Carry forward generated references for the per-agent documentation split, including Hermes-specific reference files.

## Source summary

- #4615 and #4653 -> `docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release notes now cover host-side `sessions` and `agents` commands plus `NEMOCLAW_EXTRA_AGENTS_JSON` secondary-agent baking.
- #4163, #4204, #4611, #4619, and #4676 -> `docs/about/release-notes.mdx`, `docs/inference/use-local-inference.mdx`: Release notes now cover managed vLLM progress/readiness, DGX Spark model default changes, local Ollama streaming usage, and inference route divergence warnings.
- #4267, #4601, #4609, #4642, #4645, and #4661 -> `docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release notes now cover UFW auto-remediation, local-inference reachability gates, gateway reuse/binding, cancel rollback, and policy selection persistence.
- #4577, #4582, #4607, and #4660 -> `docs/about/release-notes.mdx`, `docs/manage-sandboxes/messaging-channels.mdx`: Release notes now cover Slack validation, atomic `channels add`, WhatsApp QR diagnostics, and Slack placeholder normalization.
- #4388, #4600, #4646, and #4647 -> `docs/about/release-notes.mdx`, `docs/reference/commands.mdx`: Release notes now cover status failure layers, paused-container hints, Docker-driver doctor behavior, and non-destructive stale-registry recovery.
- #4569, #4579, and #4678 -> `docs/about/release-notes.mdx`, `docs/manage-sandboxes/lifecycle.mdx`, `docs/network-policy/integration-policy-examples.mdx`: Release notes now cover installer tag pinning, PyPI `uv` policy access, and observable Jira validation.
- #4632 -> `.agents/skills/`: Regenerated user skills from the current per-agent docs source, including newly generated Hermes reference files.

## Verification

- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx`
- `rg "permissive mode|shields down|shields up|shields status|config rotate-token|rotate-token" docs --glob "*.mdx"`
- `rg "permissive mode|shields down|shields up|shields status|config rotate-token|rotate-token" .agents/skills --glob "*.md"`
- `npm run docs`
- `npm run build:cli`
- Commit hooks: markdownlint, docs-to-skills verification, gitleaks, skills YAML, commitlint

## Summary by CodeRabbit

* **Documentation**
  * Restructured documentation to clearly distinguish OpenClaw and Hermes agent variants throughout user guides.
  * Enhanced security, credential storage, and deployment guidance with clearer setup flows.
  * Added Hermes plugin installation and ecosystem documentation.
  * Improved workspace, messaging, and policy management references with variant-specific command examples.
  * Refined troubleshooting and CLI reference sections for clarity.

fix(cli): make channels add atomic when policy preset is missing

63507a3

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

laitingsheng added the fix label May 31, 2026

fix(cli): probe preset YAML before tokens; doc behaviour update

1f9359a

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai Bot reviewed May 31, 2026

View reviewed changes

Comment thread docs/manage-sandboxes/messaging-channels.mdx Outdated

Comment thread docs/reference/commands.mdx Outdated

fix(cli): roll back gateway and registry on preset apply failure; doc…

2f7f685

… style Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai Bot reviewed May 31, 2026

View reviewed changes

Comment thread docs/manage-sandboxes/messaging-channels.mdx Outdated

fix(cli): validate parsed preset content and clear creds before gatew…

0c0a4a9

…ay rollback Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai Bot reviewed May 31, 2026

View reviewed changes

fix(cli): make channels add rollback best-effort across gateway failures

3416e2a

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

fix(cli): parse-validate preset YAML and restore prior state on faile…

d7b6402

…d re-add Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

coderabbitai Bot reviewed May 31, 2026

View reviewed changes

Comment thread src/lib/actions/sandbox/policy-channel.ts Outdated

fix(cli): order channels add preflight, snapshot, and rollback for fa…

c123539

…ilure safety Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

fix(cli): print rollback residual warning before fallible gateway upsert

3b3c520

Signed-off-by: Tinson Lai <tinsonl@nvidia.com>

laitingsheng added the v0.0.56 Release target label May 31, 2026

sandl99 added the enhancement: messaging label Jun 1, 2026

cv added v0.0.57 Release target and removed v0.0.56 Release target labels Jun 1, 2026

merge(main): update PR 4577

82fb046

coderabbitai Bot reviewed Jun 1, 2026

View reviewed changes

cv approved these changes Jun 1, 2026

View reviewed changes

cv merged commit 9865611 into main Jun 1, 2026
33 checks passed

cv deleted the fix-4548-channels-add-preset-atomicity branch June 1, 2026 22:16

This was referenced Jun 2, 2026

fix(cli): warn on cross-sandbox messaging credential conflicts in channels add #4652

Merged

feat(messaging): migrate enrollment to manifest hooks #4248

Merged

wscurran added the area: messaging, bug-fix, and feature labels and removed the fix label Jun 3, 2026

coderabbitai Bot mentioned this pull request Jun 3, 2026

fix(policy): refuse to apply a preset when the live policy could not be read #4589

Merged

3 tasks

miyoungc mentioned this pull request Jun 3, 2026

docs: refresh 0.0.57 release docs #4716

Merged

coderabbitai Bot mentioned this pull request Jun 4, 2026

feat(policy): add safe common egress defaults #4768

Merged

Conversation

laitingsheng commented May 31, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Related Issue

Changes

Type of Change

Verification

Summary by CodeRabbit

coderabbitai Bot commented May 31, 2026 • edited

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

github-actions Bot commented May 31, 2026 • edited

E2E Advisor Recommendation

E2E Recommendation Advisor

Required E2E

Optional E2E

New E2E recommendations

Dispatch hint

github-actions Bot commented May 31, 2026 • edited

E2E Scenario Advisor Recommendation

E2E Scenario Advisor

Required scenario E2E

Optional scenario E2E

Relevant changed files

github-actions Bot commented May 31, 2026 • edited

PR Review Advisor

🛠️ Needs attention

🔎 Worth checking

🌱 Nice ideas

github-actions Bot commented May 31, 2026

Selective E2E Results — ✅ All requested jobs passed

github-actions Bot commented May 31, 2026

coderabbitai Bot left a comment

github-actions Bot commented May 31, 2026

Selective E2E Results — ✅ All requested jobs passed

coderabbitai Bot left a comment

github-actions Bot commented May 31, 2026

Selective E2E Results — ✅ All requested jobs passed

coderabbitai Bot left a comment

github-actions Bot commented May 31, 2026

Selective E2E Results — ✅ All requested jobs passed

github-actions Bot commented May 31, 2026

Selective E2E Results — ✅ All requested jobs passed

coderabbitai Bot left a comment

github-actions Bot commented May 31, 2026

Selective E2E Results — ✅ All requested jobs passed
