fix(cli): warn on cross-sandbox messaging credential conflicts in channels add#4652
Conversation
…nnels add `nemoclaw <name> channels add <channel>` accepted the same messaging bot credential (e.g. a Telegram bot token) on a second sandbox with no warning, so two sandboxes silently competed for the single allowed getUpdates/gateway consumer and one looped on HTTP 409. The onboard path already guarded against this; the channels-add path did not. Mirror the onboard conflict check inside addSandboxChannel: after token acquisition but before the gateway/registry mutation, hash the acquired credentials and call findChannelConflicts against the other sandboxes in the registry (with a tri-state gateway probe to backfill legacy entries). On a matching or unverifiable token, interactive runs warn and prompt `Continue anyway? [y/N]` (default no); non-interactive runs abort with `exit(1)` and actionable guidance; a new `--force` flag is the documented override. The check runs before any gateway/registry mutation, so declining leaves nothing half-wired. QR-paired (tokenless) adds and `--dry-run` are unaffected, and only non-secret credential hashes — never raw tokens — are compared or surfaced. Fixes NVIDIA#4305 Signed-off-by: Tony Luo <xialuo@nvidia.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughAdds a ChangesCross-sandbox channel conflict detection with force bypass
Sequence DiagramsequenceDiagram
participant CLI
participant addSandboxChannel
participant GatewayProbe
participant Backfill
participant ChannelsRegistry
participant Prompt
CLI->>addSandboxChannel: invoke with flags (dry-run?, force?)
addSandboxChannel->>GatewayProbe: probe gateway/provider liveness
GatewayProbe-->>addSandboxChannel: tri-state (present/absent/error)
addSandboxChannel->>Backfill: attempt non-fatal backfill
Backfill-->>addSandboxChannel: backfill result
addSandboxChannel->>ChannelsRegistry: lookup other sandboxes' credential hashes
ChannelsRegistry-->>addSandboxChannel: conflicts found / none
addSandboxChannel->>Prompt: show "Continue anyway?" (interactive) / exit (non-interactive) unless force
Prompt-->>addSandboxChannel: user decision (continue/abort)
addSandboxChannel-->>ChannelsRegistry: register/update channel if not aborted or if forced
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
The `--force` override added for the cross-sandbox messaging conflict check (NVIDIA#4305) lived on the shared channelMutationFlags, so it surfaced as a no-op `--force` on `channels remove/start/stop` and failed the CLI/docs flag-parity check (the flag was undocumented on those commands). Move `--force` into an add-only flag set, document it under the `channels add` reference section, and update the command-layer test to assert the flag is exposed only on add. No behavior change to the conflict check itself. Signed-off-by: Tony Luo <xialuo@nvidia.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
verifyChannelBridgeAfterRebuild (run after a successful interactive `channels add`) probes the sandbox via executeSandboxExecCommand, which calls getOpenshellBinary() -> process.exit(1) when the openshell binary is absent. On the CI unit-test runner openshell is not installed, so that exit fired (caught by the exec wrapper, but still recorded by the process.exit spy), failing the "add proceeds" assertion expect(exitMock).not.toHaveBeenCalled() in policy-channel-conflict.test.ts. Locally openshell is installed, so it only reproduced in CI. Stub executeSandboxExecCommand/executeSandboxCommand at the process-recovery boundary (matching channel-status.test.ts) so the downstream bridge verification never shells out. Test-only; no source/behavior change. Signed-off-by: Tony Luo <xialuo@nvidia.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
docs/reference/commands.mdx (1)
1604-1614: ⚡ Quick winKeep one full sentence per source line in this section.
These paragraphs wrap single sentences across multiple lines; reflow so each sentence is contained on exactly one line to match docs diff/readability conventions.
As per coding guidelines: "One sentence per line in source (makes diffs readable). Flag paragraphs where multiple sentences appear on the same line."
Also applies to: 1631-1634
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/reference/commands.mdx` around lines 1604 - 1614, The paragraph in docs/reference/commands.mdx contains multiple sentences on the same source lines; reflow each sentence so each appears on its own line (one sentence per source line) to follow the docs diff/readability convention—specifically split the block starting with "NemoClaw registers a hidden `internal` command namespace." so every sentence is a separate line, and do the same for the other affected block referenced around lines 1631-1634; ensure you preserve punctuation and spacing but break lines at sentence boundaries without altering wording.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@docs/reference/commands.mdx`:
- Around line 1604-1614: The paragraph in docs/reference/commands.mdx contains
multiple sentences on the same source lines; reflow each sentence so each
appears on its own line (one sentence per source line) to follow the docs
diff/readability convention—specifically split the block starting with "NemoClaw
registers a hidden `internal` command namespace." so every sentence is a
separate line, and do the same for the other affected block referenced around
lines 1631-1634; ensure you preserve punctuation and spacing but break lines at
sentence boundaries without altering wording.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 27b44dc9-e516-4a2a-bdd1-21c382ebab0f
📒 Files selected for processing (1)
docs/reference/commands.mdx
The agent-variant parity check (preview job) requires commands-nemohermes.mdx to mirror commands.mdx. Regenerate via docs:sync-agent-variants to propagate the new --force flag row that was added to commands.mdx. Generated file; no manual content. Signed-off-by: Tony Luo <xialuo@nvidia.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…rocess test) The `checks` prek Test (CLI) hook runs vitest at the default 5s timeout; test/repro-2666-silent-list-status.test.ts spawns the real CLI per case and intermittently exceeds it under runner load. Unrelated to this PR's files; passed in prior runs. Empty commit to re-run CI (no rerun admin on the repo). Signed-off-by: Tony Luo <xialuo@nvidia.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/lib/onboard/machine/handlers/finalization.test.ts (1)
22-23: ⚡ Quick winHarden ordering tests with an async/deferred approval mock.
Current assertions validate call order, but not that verification waits for approval completion. Add one test with a deferred
autoPairScopeApprovalpromise and assertverifyis not called until it resolves.Also applies to: 43-43, 183-229
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/lib/onboard/machine/handlers/finalization.test.ts` around lines 22 - 23, Replace the synchronous autoPairScopeApproval mock with a deferred Promise in the finalization tests: create a deferred (resolve/reject) object, have the autoPairScopeApproval mock return deferred.promise, call the code that should wait for approval, assert that verify (the mocked verification function) has not been called yet, then resolve the deferred and assert verify is called after resolution; update any other affected tests (lines referencing autoPairScopeApproval/getChatUiUrl and the verify assertions) to follow this pattern so ordering is hardened and verification truly waits for approval completion.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/lib/onboard.ts`:
- Around line 7032-7036: The new autoPairScopeApproval hook wiring was added
directly inside src/lib/onboard.ts (the autoPairScopeApproval property calling
connect.runConnectAutoPairApprovalPass), which increases the file size and
violates growth guardrails; move this wiring into a new or existing helper under
src/lib/onboard/** (preferably
src/lib/onboard/machine/handlers/finalization.ts): create the same function or
handler there that requires ./actions/sandbox/connect and calls
connect.runConnectAutoPairApprovalPass(name), then import or reference that
handler from onboard.ts (or better, register the handler from the new module) so
that src/lib/onboard.ts remains net-neutral; ensure exported symbol names (e.g.,
autoPairScopeApproval handler) match the original usage so no callers change.
---
Nitpick comments:
In `@src/lib/onboard/machine/handlers/finalization.test.ts`:
- Around line 22-23: Replace the synchronous autoPairScopeApproval mock with a
deferred Promise in the finalization tests: create a deferred (resolve/reject)
object, have the autoPairScopeApproval mock return deferred.promise, call the
code that should wait for approval, assert that verify (the mocked verification
function) has not been called yet, then resolve the deferred and assert verify
is called after resolution; update any other affected tests (lines referencing
autoPairScopeApproval/getChatUiUrl and the verify assertions) to follow this
pattern so ordering is hardened and verification truly waits for approval
completion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 43d2b252-f6e9-413e-9ca7-a900447abc40
📒 Files selected for processing (5)
src/lib/actions/sandbox/connect.tssrc/lib/onboard.tssrc/lib/onboard/machine/handlers/finalization.test.tssrc/lib/onboard/machine/handlers/finalization.tstest/sandbox-connect-inference.test.ts
| autoPairScopeApproval: (name) => { | ||
| const connect: typeof import("./actions/sandbox/connect") = | ||
| require("./actions/sandbox/connect"); | ||
| connect.runConnectAutoPairApprovalPass(name); | ||
| }, |
There was a problem hiding this comment.
CI blocker: move this new hook wiring out of src/lib/onboard.ts to satisfy growth guardrails.
This added block is part of the net +5 growth currently failing codebase-growth-guardrails, which blocks merge. Please relocate this wiring into src/lib/onboard/machine/handlers/finalization.ts (or another src/lib/onboard/** helper) and keep src/lib/onboard.ts net-neutral.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/lib/onboard.ts` around lines 7032 - 7036, The new autoPairScopeApproval
hook wiring was added directly inside src/lib/onboard.ts (the
autoPairScopeApproval property calling connect.runConnectAutoPairApprovalPass),
which increases the file size and violates growth guardrails; move this wiring
into a new or existing helper under src/lib/onboard/** (preferably
src/lib/onboard/machine/handlers/finalization.ts): create the same function or
handler there that requires ./actions/sandbox/connect and calls
connect.runConnectAutoPairApprovalPass(name), then import or reference that
handler from onboard.ts (or better, register the handler from the new module) so
that src/lib/onboard.ts remains net-neutral; ensure exported symbol names (e.g.,
autoPairScopeApproval handler) match the original usage so no callers change.
Commit a0085a7 was intended as an empty CI re-trigger but accidentally swept in staged work from a parallel branch (sandbox connect auto-pair, NVIDIA#4462/NVIDIA#4504): connect.ts, onboard.ts, finalization handler + test. Revert restores the tree to 0f2533d so this PR contains only the NVIDIA#4305 cross-sandbox channels-add conflict changes. The connect-autopair work lives on its own branch (fix/4504-onboard-scope-upgrade-autopair). Signed-off-by: Tony Luo <xialuo@nvidia.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
prekshivyas
left a comment
There was a problem hiding this comment.
LGTM — approving. Verified the conflict-check logic:
- Security: only
hashCredential(token)hashes are compared and surfaced; raw tokens never appear in the probe, warning, or abort message. Good. - Fail-open:
makeChannelsConflictProbereturns"error"when the gateway isn't alive,backfillMessagingChannelsis try/caught, andfindChannelConflictsreturning on throw all degrade to "no conflict → proceed". A flaky gateway never wrongly blocks an add — correct direction for an advisory check. It reuses the same fail-safeproviderExistsInGatewayas the onboard path, so the two checks stay consistent. - QR/tokenless adds (empty
acquired) short-circuit to proceed, avoiding "unknown-token" noise;--forceand the non-interactive abort path are both wired with actionable guidance.
Test coverage is thorough (16 scenarios incl. matching/unknown token, interactive continue/abort, non-interactive abort, --force, dry-run, idempotent re-add, QR skip, Slack two-token, probe-failure-swallowed, no-token-in-output). CI is green.
Note: CodeRabbit's lone finding points at src/lib/onboard.ts:7036 / a guardrail failure, but onboard.ts isn't in this PR and codebase-growth-guardrails passes — it's stale, disregard.
Non-blocking suggestion: move the checkChannelAddConflict(...) call above persistChannelTokens(acquired). The check hashes acquired directly and doesn't depend on the persist, so as written an aborted conflicted add still writes the token to the host credential store via saveCredential. Reordering makes "declining leaves nothing wired" literally true at the host-store level too (it's already true for gateway/registry). Low-harm either way — fine to merge as-is.
## Summary
- Add the v0.0.59 release notes from the GitHub announcement discussion.
- Refresh local inference and credential-storage guidance for the
current release behavior.
- Regenerate the user skills from the updated Fern docs.
- Tighten release-prep and docs review guidance for generated skills, PR
labels, and shared `$$nemoclaw` command placeholders.
## Verification
- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user --doc-platform fern-mdx`
- `rg "permissive mode|shields down|shields up|shields status|config
rotate-token|rotate-token" --glob '*.{md,mdx}'`
- `git diff --check`
- `npm run docs` (rerun outside sandbox after sandbox-only `tsx` IPC
permission failure)
- `npm run typecheck:cli`
- Pre-commit hooks during commit passed, including markdownlint,
docs-to-skills verification, gitleaks, commitlint, and skills YAML
tests.
## Source Summary
- #3679, #4437, #4681, #4766, #4772, #4775, #4786 ->
`docs/about/release-notes.mdx`, `docs/reference/commands.mdx`,
`docs/reference/troubleshooting.mdx`: Summarize OpenClaw 2026.5.27
compatibility, runtime path pinning, plugin registry recovery, live
gateway reconciliation, and clearer host-alias/startup diagnostics.
- #4332, #4402, #4769, #4776, #4779 -> `docs/about/release-notes.mdx`,
`docs/inference/inference-options.mdx`,
`docs/inference/use-local-inference.mdx`,
`docs/inference/switch-inference-providers.mdx`: Document the release
inference changes covering Local NIM waits, Hermes Anthropic routing,
Nemotron 3 Ultra, the current Ollama starter fallback, and Spark
managed-vLLM context length.
- #4628, #4652, #4733, #4745 -> `docs/about/release-notes.mdx`,
`docs/security/credential-storage.mdx`,
`docs/manage-sandboxes/messaging-channels.mdx`,
`docs/reference/troubleshooting.mdx`: Capture permission healing,
gateway-stored credential reuse, cross-sandbox messaging credential
conflict checks, and CDI preflight diagnostics.
- #4728, #4737, #4743, #4744, #4782 -> `.agents/skills/nemoclaw-user-*`:
Regenerate the user skill references from the updated source docs.
- Follow-up maintenance ->
`.agents/skills/nemoclaw-contributor-update-docs/SKILL.md`,
`.coderabbit.yaml`: Add release-prep area labels for docs and skills
PRs, and teach docs review guidance that `$$nemoclaw` is the correct
shared command placeholder for examples that work across agent aliases.
Note: the `documentation` label was not present in the repository, so
this PR is labeled with `v0.0.59` only.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
* **Documentation**
* Updated default model for local Ollama inference setup to qwen3.5:9b
* Added Nemotron 3 Ultra 550B as an NVIDIA Endpoints model option
* Clarified credential storage and reuse behavior for post-deployment
(day-two) operations
* Added v0.0.59 release notes covering OpenClaw compatibility, inference
options, Hermes messaging sync, and troubleshooting
* Clarified CLI selection guidance and updated OpenClaw version example
in status output
* Revised release-prep instructions and docs review guidance for CLI
alias usage
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Summary
nemoclaw <name> channels add <channel>accepted the same messaging bot credential (e.g. a Telegram bot token) on a second sandbox with no warning, silently overlapping two sandboxes on a single-consumer channel — TelegramgetUpdatesallows one consumer, so the loser loops on HTTP 409. This adds the cross-sandbox conflict check the onboard path already performs, bringingchannels addto parity so it warns / prompts / aborts instead of failing silently.Related Issue
Fixes #4305
Changes
checkChannelAddConflict()plus a tri-state gateway probe (makeChannelsConflictProbe()) toaddSandboxChannel, reusing the existingmessaging-conflict.tsprimitives and mirroring the onboard check (src/lib/actions/sandbox/policy-channel.ts). The check runs after token acquisition but before any gateway/registry mutation, so declining leaves nothing half-wired.Continue anyway? [y/N](default No). Non-interactive runs abort withexit(1)and actionable guidance (channels removeon the other sandbox, or--force).--forceflag onchannels addas the documented override (src/lib/sandbox/channels-command-support.ts).--dry-runare unaffected; an idempotent same-sandbox re-add does not self-conflict; a different token does not false-positive.src/lib/actions/sandbox/policy-channel-conflict.test.ts(16 scenarios — matching/unknown token, interactive continue/abort, non-interactive abort,--force, dry-run, idempotent re-add, QR skip, Slack two-token, probe failure swallowed, no token in output) +src/commands/sandbox/channels/mutate.test.tsflag-mapping (7).Type of Change
Verification
npx prek run --all-filespasses — see notenpm testpasses — see noteSigned-off-by: Tony Luo xialuo@nvidia.com
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests