
fix(cli): probe live sandbox agent versions #4550

Open
ChunkyMonkey11 wants to merge 11 commits into NVIDIA:main from ChunkyMonkey11:fix/probe-live-sandbox-agent-version

Conversation

@ChunkyMonkey11
Contributor

@ChunkyMonkey11 ChunkyMonkey11 commented May 29, 2026

Summary

NemoClaw now verifies the OpenClaw version actually running inside live sandboxes before reporting status or deciding whether an upgrade is needed. This prevents reused sandboxes from being marked up to date only because cached host metadata matches the current expected version.

Related Issue

Fixes #4429

Changes

  • Force live agent version probing for running sandboxes in status and upgrade-sandboxes.
  • Preserve the previously recorded sandbox agent version when reusing an existing sandbox instead of overwriting it with the current expected version.
  • Treat unavailable SSH config output as an unavailable runtime probe instead of spawning ssh with an empty config.
  • Add regression coverage for status output, upgrade classification, runtime probing fallback, and reused sandbox metadata preservation.
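
The metadata-preservation change above can be illustrated with a minimal sketch. The `SandboxEntry` and `ReuseUpdate` shapes and the `mergeReusedEntry` helper are hypothetical stand-ins, not the repository's actual types or function; only the `existingEntry?.agentVersion ?? null` idea comes from this PR.

```typescript
// Hypothetical registry entry shape; the real entry has more fields.
interface SandboxEntry {
  name: string;
  model: string | null;
  provider: string | null;
  agentVersion: string | null;
}

// Hypothetical set of fields refreshed when a sandbox is reused.
interface ReuseUpdate {
  model: string;
  provider: string;
}

// Merge fresh model/provider settings into a reused entry while keeping the
// agent version recorded earlier. The currently expected version is not an
// input at all, so it cannot overwrite the recorded one.
function mergeReusedEntry(
  sandboxName: string,
  existing: SandboxEntry | undefined,
  update: ReuseUpdate,
): SandboxEntry {
  return {
    name: sandboxName,
    model: update.model,
    provider: update.provider,
    agentVersion: existing?.agentVersion ?? null,
  };
}

const existingEntry: SandboxEntry = {
  name: "alpha",
  model: "old-model",
  provider: "old-provider",
  agentVersion: "1.0.0",
};
const merged = mergeReusedEntry("alpha", existingEntry, {
  model: "new-model",
  provider: "new-provider",
});
console.log(merged.agentVersion); // "1.0.0"
```

Leaving the version untouched means a later live probe, rather than the onboarding step, decides whether the sandbox is stale.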

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run --all-files passes
  • npm test passes
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • make docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Additional checks run:

  • npm run build:cli
  • npm run typecheck:cli
  • npx vitest run src/lib/onboard/sandbox-registry-metadata.test.ts src/lib/sandbox/version.test.ts src/lib/domain/maintenance/upgrade.test.ts src/lib/actions/gateway-drift-preflight.test.ts src/lib/actions/sandbox/status.test.ts
  • git diff --check upstream/main...HEAD

Note: local full hook execution was not completed because the repo's CLI coverage hook recursively invoked itself through a temporary git commit in a test fixture. The branch was pushed with local hooks skipped after the focused checks above passed.


Signed-off-by: Revant Patel <revant.h.patel@gmail.com>

Summary by CodeRabbit

  • New Features

    • Status shows probed live agent versions and gives clearer guidance when verification is forced or unavailable.
    • Upgrade checks probe running sandboxes first before classifying staleness.
    • Registry updates preserve cached agentVersion while updating other agent fields.
  • Bug Fixes

    • Do not trust empty or unavailable probe responses; omit stale cached versions from status.
    • Runtime readiness now recognizes both "Ready" and "Running" phases.
  • Tests

    • Expanded tests for probing behavior, probe-first upgrade checks, registry metadata reuse, and CLI/status output.

Signed-off-by: Revant Patel <revant.h.patel@gmail.com>
@copy-pr-bot

copy-pr-bot Bot commented May 29, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai Bot commented May 29, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c91ba797-ef46-497a-8940-bbc823ead7ab

📥 Commits

Reviewing files that changed from the base of the PR and between 37eaa03 and b59404d.

📒 Files selected for processing (1)
  • test/cli.test.ts

📝 Walkthrough


Adds a VersionCheckOptions contract, treats empty SSH probe output as failure, conditionally forces live SSH probing for running sandboxes in status and upgrade flows, preserves cached registry agentVersion during metadata updates, broadens readiness parsing, and adds tests and CLI cases validating probe-first behavior.

Changes

Live sandbox version probing and drift detection

| Layer | File(s) | Summary |
| --- | --- | --- |
| Version check options contract and empty output handling | src/lib/sandbox/version.ts | Adds exported VersionCheckOptions { forceProbe?: boolean; skipProbe?: boolean }, updates checkAgentVersion signature, and treats empty SSH-config output as probe failure. |
| Version check probing test coverage | src/lib/sandbox/version.test.ts | New test ensures forceProbe does not fall back to cached agentVersion when SSH probing yields no data; detection becomes unavailable, sandboxVersion is null, and isStale is false. |
| Status command conditional probing | src/lib/actions/sandbox/status.ts | Adds shouldProbeSandboxRuntimeVersion, passes forceProbe/skipProbe into checkAgentVersion, prints live sandboxVersion only when verified, and emits "version not verified" / "unable to verify" lines when forced probing is inconclusive. |
| Upgrade command conditional probing | src/lib/actions/upgrade-sandboxes.ts | Wraps checkAgentVersion to pass { forceProbe: true } for sandboxes present in the live sandbox name set, enabling probe-first staleness classification for running sandboxes. |
| Registry metadata agent field handling | src/lib/onboard/sandbox-registry-metadata.ts | Refactors updateReusedSandboxMetadata to compute agent fields separately and explicitly set agentVersion to existingEntry?.agentVersion ?? null. |
| Registry metadata test infrastructure and reuse validation | src/lib/onboard/sandbox-registry-metadata.test.ts | Switches to async makeHelpers dynamic import, adds openclawAgent helper, adds test verifying model/provider updates while preserving existing agentVersion, and updates existing runtime-field tests to await helpers. |
| Runtime readiness parsing | src/lib/runtime-recovery.ts, src/lib/runtime-recovery.test.ts | parseReadySandboxNames now treats parsed phases Ready or Running as live and skips NotReady; tests updated to include Running and to ignore phase-like tokens outside the PHASE column. |
| CLI tests for live probing behavior | test/cli.test.ts | Adds optional agentVersion to SandboxEntry, adds status test asserting live SSH-reported agent version is displayed (not cached), and updates/adds upgrade-sandboxes --check tests to stub sandbox list/get, ssh-config, and ssh for probe-first checks. |
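
As a rough illustration of the probe-first contract summarized above, the sketch below shows how `forceProbe` and `skipProbe` could interact with a cached version. Only the `VersionCheckOptions` shape and the "unavailable, null, not stale" outcome are taken from this walkthrough; `checkAgentVersionSketch`, the injected `probe` callback, the `Detection` labels other than "unavailable", and the behavior when neither flag is set are assumptions, and the real `checkAgentVersion` signature may differ.

```typescript
// Shape as described in the walkthrough.
interface VersionCheckOptions {
  forceProbe?: boolean;
  skipProbe?: boolean;
}

// Hypothetical labels; only "unavailable" is named in the walkthrough.
type Detection = "probe" | "cache" | "unavailable";

interface VersionCheckResult {
  sandboxVersion: string | null;
  expectedVersion: string;
  isStale: boolean;
  detection: Detection;
}

// `probe` stands in for the SSH-based runtime probe. It returns null or an
// empty string when SSH config or output is unavailable.
function checkAgentVersionSketch(
  cachedVersion: string | null,
  expectedVersion: string,
  probe: () => string | null,
  options: VersionCheckOptions = {},
): VersionCheckResult {
  const unavailable: VersionCheckResult = {
    sandboxVersion: null,
    expectedVersion,
    isStale: false,
    detection: "unavailable",
  };

  if (!options.skipProbe) {
    // Empty output counts as a failed probe, never as a version.
    const probed = (probe() ?? "").trim();
    if (probed !== "") {
      // Plain inequality is a simplification; a real check may compare semver.
      return {
        sandboxVersion: probed,
        expectedVersion,
        isStale: probed !== expectedVersion,
        detection: "probe",
      };
    }
    // A forced probe must not fall back to cached metadata.
    if (options.forceProbe) return unavailable;
  }

  if (cachedVersion === null) return unavailable;
  return {
    sandboxVersion: cachedVersion,
    expectedVersion,
    isStale: cachedVersion !== expectedVersion,
    detection: "cache",
  };
}

const silentProbe = (): string | null => "";
const forced = checkAgentVersionSketch("2.0.0", "2.0.0", silentProbe, {
  forceProbe: true,
});
console.log(forced.detection, forced.sandboxVersion, forced.isStale);
// prints: unavailable null false
```

Reporting "unavailable" instead of the cached value is what keeps a reused sandbox from looking up to date when nothing was actually verified.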

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

fix, NemoClaw CLI, Sandbox, Integration: OpenClaw, v0.0.55

Suggested reviewers

  • ericksoa
  • jyaunches
  • cv

"🐰 I hop through sandboxes, nose to the shell,
I probe for versions and listen quite well.
When probes go silent and cached claims mislead,
I flag the drift and nudge a rebuild deed.
Tests steady my paws; CI helps me excel."

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 45.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'fix(cli): probe live sandbox agent versions' directly reflects the main objective: adding live probing of agent versions instead of relying solely on cached metadata. |
| Linked Issues check | ✅ Passed | The PR addresses all coding requirements from issue #4429: forces live agent version probing in status and upgrade-sandboxes commands, preserves cached agent versions for reused sandboxes, treats unavailable SSH config as failed probe, and includes regression tests for new behavior. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to live agent version probing: runtime-recovery.ts updates support proper sandbox state detection, version.ts/test.ts implement probing logic, CLI tests validate probe-first behavior, and sandbox-registry-metadata changes preserve existing agent versions when reusing sandboxes. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/lib/actions/sandbox/status.ts (1)

314-342: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Fix status agent version probing so it reports live versions when cached agentVersion is missing.

In src/lib/actions/sandbox/status.ts, shouldProbeSandboxRuntimeVersion(...) returns lookup.state === "present" && Boolean(sandbox.agentVersion), so when sandbox.agentVersion is null status sets skipProbe: true and checkAgentVersion returns sandboxVersion: null. The logging only prints an Agent: line when versionCheck.sandboxVersion exists, so running sandboxes with missing cached metadata can produce no agent version output.

This is inconsistent with src/lib/actions/upgrade-sandboxes.ts, which forces { forceProbe: true } for all live sandboxes (liveNames.has(sandboxName)), independent of cached agentVersion, so rebuild/upgrade can determine the actual live agent version.

Align status with upgrade by probing for running sandboxes even when sandbox.agentVersion is null (e.g., remove the Boolean(sandbox.agentVersion) gating from shouldProbeSandboxRuntimeVersion).
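
To make the difference concrete, here is a small sketch contrasting the gating described in this comment with the suggested alternative. The function names `shouldProbeGated`, `shouldProbeLive`, and `toProbeOptions`, the `CachedSandbox` type, and the "absent" lookup state are hypothetical; only the `lookup.state === "present" && Boolean(sandbox.agentVersion)` condition and the `forceProbe`/`skipProbe` pairing come from the review text.

```typescript
// "present" is the state named in the review; "absent" is a placeholder.
type LookupState = "present" | "absent";

interface CachedSandbox {
  agentVersion: string | null;
}

// Gating as described in the review: no cached version means no probe.
function shouldProbeGated(state: LookupState, sandbox: CachedSandbox): boolean {
  return state === "present" && Boolean(sandbox.agentVersion);
}

// Suggested alternative: probe every live sandbox, cached version or not.
function shouldProbeLive(state: LookupState): boolean {
  return state === "present";
}

// The probe decision maps onto the two option flags as complements.
function toProbeOptions(shouldProbe: boolean): {
  forceProbe: boolean;
  skipProbe: boolean;
} {
  return { forceProbe: shouldProbe, skipProbe: !shouldProbe };
}

const uncached: CachedSandbox = { agentVersion: null };
console.log(toProbeOptions(shouldProbeGated("present", uncached)));
// { forceProbe: false, skipProbe: true }
console.log(toProbeOptions(shouldProbeLive("present")));
// { forceProbe: true, skipProbe: false }
```

With the gated version, a running sandbox that has no cached metadata is never probed, which matches the silent status output described above.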

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/lib/actions/sandbox/status.ts` around lines 314 - 342, The status code
currently sets shouldProbeRuntimeVersion = lookup.state === "present" &&
Boolean(sb.agentVersion), which prevents probing live sandboxes when cached
sb.agentVersion is null; change the logic so shouldProbeRuntimeVersion is true
for any live sandbox (i.e., remove the Boolean(sb.agentVersion) gating), and
pass that into sandboxVersion.checkAgentVersion (forceProbe:
shouldProbeRuntimeVersion, skipProbe: !shouldProbeRuntimeVersion) so live
sandboxes are probed for their actual runtime agent version even when cached
metadata is missing; update references to shouldProbeRuntimeVersion,
lookup.state, sb.agentVersion, and the sandboxVersion.checkAgentVersion call
accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 05f54ea1-b125-4b1c-90d7-06bdf5cdfa82

📥 Commits

Reviewing files that changed from the base of the PR and between 182d3fb and a8b0dc9.

📒 Files selected for processing (2)
  • src/lib/actions/sandbox/status.ts
  • test/cli.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/cli.test.ts

@cv cv added the v0.0.56 Release target label May 29, 2026
@ChunkyMonkey11
Contributor Author

@cv I pushed a fix for the failing CI / Pull Request Vitest assertions. The issue was Running sandboxes not being treated as live for upgrade probing, plus a stale hard-coded OpenClaw expected version in the test. Targeted and full affected CLI tests pass locally.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/lib/runtime-recovery.test.ts`:
- Around line 62-77: The test for parseReadySandboxNames needs to assert that
sandboxes with PHASE "NotReady" are excluded: update the test case inside the it
block that calls parseReadySandboxNames to include an extra line with a sandbox
whose PHASE is "NotReady" (e.g., "zeta ... NotReady") and ensure the expected
result array passed to toEqual does not contain that name; reference
parseReadySandboxNames and the current test that checks for
["alpha","epsilon","delta"] when adding the NotReady row and asserting it is
absent.

In `@src/lib/runtime-recovery.ts`:
- Around line 54-55: The current check uses cols.includes(...) which can match
tokens outside the PHASE column; instead extract the single PHASE token from
cols (e.g., let phase = cols[PHASE_INDEX] or parse the token known to represent
phase) and then evaluate it exactly: const isReadyOrRunning = phase === "Ready"
|| phase === "Running"; if (!isReadyOrRunning || phase === "NotReady") continue;
Update the logic that computes isReadyOrRunning and the subsequent check to use
the single phase variable (phase) rather than cols.includes.
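
One way to read the exact-column check suggested above is sketched here. It assumes the sandbox listing is a whitespace-separated table whose header row contains NAME and PHASE columns and whose cells contain no spaces; that layout, the sample output, and the name `parseReadySandboxNamesSketch` are assumptions, not the actual CLI format or the repository's implementation.

```typescript
// Return names of sandboxes whose PHASE cell is exactly "Ready" or "Running".
// Exact equality on one column already excludes "NotReady" and ignores
// phase-like tokens that appear in other columns.
function parseReadySandboxNamesSketch(output: string): string[] {
  const lines = output
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length === 0) return [];

  // Locate the columns from the header instead of scanning every token.
  const header = lines[0].split(/\s+/);
  const nameIndex = header.indexOf("NAME");
  const phaseIndex = header.indexOf("PHASE");
  if (nameIndex === -1 || phaseIndex === -1) return [];

  const names: string[] = [];
  for (const line of lines.slice(1)) {
    const cols = line.split(/\s+/);
    const sandboxName = cols[nameIndex];
    const phase = cols[phaseIndex];
    if (sandboxName === undefined) continue;
    if (phase === "Ready" || phase === "Running") names.push(sandboxName);
  }
  return names;
}

// Hypothetical listing: "beta" and "gamma" carry phase-like tokens in the
// IMAGE column but are not live.
const sample = [
  "NAME   IMAGE        PHASE",
  "alpha  img:1        Ready",
  "beta   Ready-image  NotReady",
  "gamma  img:Running  Pending",
  "delta  img:2        Running",
].join("\n");

const liveNames = parseReadySandboxNamesSketch(sample);
console.log(liveNames); // [ 'alpha', 'delta' ]
```

Because the comparison is an exact match on a single cell, the separate `phase === "NotReady"` guard from the suggestion becomes redundant in this sketch.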

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 05d1011a-4a2a-476e-8e38-8f0ec1aac6d9

📥 Commits

Reviewing files that changed from the base of the PR and between 5e4d09f and 08fba22.

📒 Files selected for processing (3)
  • src/lib/runtime-recovery.test.ts
  • src/lib/runtime-recovery.ts
  • test/cli.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/cli.test.ts

Comment thread src/lib/runtime-recovery.test.ts
Comment thread src/lib/runtime-recovery.ts Outdated
Test User added 2 commits May 29, 2026 18:41
Signed-off-by: Test User <test@example.com>
Signed-off-by: Test User <test@example.com>

Labels

v0.0.56 Release target

Development

Successfully merging this pull request may close these issues.

[WSL2 x86_64][CLI&UX] After nemoclaw upgrade, sandbox openclaw stays old; status silent on drift

2 participants