test: prove OpenClaw 2026.5.27 resolves #4434 unreachable inference TUI error #4437

Draft
ericksoa wants to merge 9 commits into main from issue-4434-openclaw-2026-5-27-proof

@ericksoa
Contributor

@ericksoa ericksoa commented May 28, 2026

Release target

Refs #4434. This PR targets v0.0.55; #4434 should remain open until this OpenClaw upgrade is merged, tagged, and verified in the shipped .55 release.

Why this resolves #4434

NemoClaw #4434 reports that openclaw tui keeps an active spinner and connected status with no visible terminal error when the NVIDIA inference endpoint is unreachable. This branch moves the sandbox OpenClaw pin from 2026.5.22 to 2026.5.27 with npm integrity:

sha512-2N93zhdAo88KAbHt6T7KvYXf4s7XIkYXBgv1npYpn7e1Y9FvrtgtpsA38my9rtFW+70uXEojRPX5/OqnuDqJPw==
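
The integrity value above follows the npm convention of `sha512-` followed by the base64-encoded SHA-512 digest of the package tarball. A minimal sketch of recomputing such a string for a local file, assuming `openssl` is installed; `sri_sha512` and `check_integrity` are hypothetical helper names and are not part of this PR:

```shell
#!/usr/bin/env bash
# Sketch: recompute an npm-style sha512 integrity string for a local file.
# sri_sha512 and check_integrity are illustrative names, not helpers from this PR.

sri_sha512() {
  # $1 = path to the file (for example a tarball produced by `npm pack`)
  [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
  # Binary SHA-512 digest, base64-encoded on a single line, with the sha512- prefix.
  printf 'sha512-%s\n' "$(openssl dgst -sha512 -binary "$1" | openssl base64 -A)"
}

check_integrity() {
  # $1 = file, $2 = expected integrity string
  actual="$(sri_sha512 "$1")" || return 1
  if [ "$actual" = "$2" ]; then
    echo "integrity OK"
  else
    echo "integrity MISMATCH: got $actual" >&2
    return 1
  fi
}
```

One possible use, assuming `npm pack openclaw@2026.5.27` writes `openclaw-2026.5.27.tgz` into the current directory, is `check_integrity openclaw-2026.5.27.tgz '<pinned sha512 value>'`.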

Upstream proof:

Changes

  • Bumps Dockerfile, Dockerfile.base, agents/openclaw/manifest.yaml, and package metadata to OpenClaw 2026.5.27.
  • Updates OpenClaw pin/integrity tests, deployment/version tests, and the existing TUI chat-correlation E2E assertion.
  • Updates scripts/patch-openclaw-chat-send.js so NemoClaw's chat-send run-id preservation shim still recognizes the compiled OpenClaw 2026.5.27 followup-runner admission shape.
  • Adds a CI-safe Vitest contract harness for the #4434 TUI failure signature and expected visible-error behavior.
  • Adds the privileged live repro: test/e2e/test-issue-4434-tui-unreachable-inference.sh.
  • Wires that live repro into nightly-e2e.yaml as issue-4434-tui-unreachable-inference-e2e, including selective dispatch, public-install target-ref handling, failure artifacts, aggregate reporting coverage, and trusted workflow-script checkout for the secret/sudo firewall job.

Local validation

  • npm ci
  • npm ci --include=dev
  • npm run build:cli
  • npm run typecheck:cli
  • npm test -- test/fetch-guard-patch-regression.test.ts test/openclaw-chat-send-patch.test.ts test/openclaw-tui-chat-correlation.test.ts test/issue-4434-tui-unreachable-inference.test.ts
  • npm test -- src/lib/sandbox/version.test.ts src/lib/verify-deployment.test.ts
  • npm test -- test/validate-e2e-coverage.test.ts test/e2e-advisor-dispatch.test.ts test/e2e-script-workflow.test.ts test/issue-4434-tui-unreachable-inference.test.ts nemoclaw/src/package-metadata.test.ts
  • shellcheck test/e2e/test-issue-4434-tui-unreachable-inference.sh
  • bash -n test/e2e/test-issue-4434-tui-unreachable-inference.sh
  • bash -n test/e2e/test-openclaw-tui-chat-correlation.sh
  • NEMOCLAW_ISSUE_4434_LIVE=0 bash test/e2e/test-issue-4434-tui-unreachable-inference.sh
  • git diff --check
  • Fresh npm pack openclaw@2026.5.27 dist smoke with node scripts/patch-openclaw-chat-send.js "$tmp/package/dist"
  • Runtime Docker smoke: docker build -f Dockerfile --build-arg BASE_IMAGE=ghcr.io/nvidia/nemoclaw/sandbox-base:latest -t nemoclaw-issue4434-openclaw-runtime-smoke:2026-5-27 .
  • Runtime image version smoke: docker run --rm --entrypoint openclaw nemoclaw-issue4434-openclaw-runtime-smoke:2026-5-27 --version -> OpenClaw 2026.5.27 (27ae826)
  • Base-style OpenClaw install smoke in Docker for the 2026.5.27 npm integrity and install path.
  • Pre-commit suite on 98e0a763efe0925f26cf89129cd4ab63cb0b05f3: passed, including CLI/plugin coverage hooks.
  • Pre-push suite reran CLI/plugin coverage; one unrelated test/nemoclaw-start.test.ts case timed out during the full concurrent run, then passed directly with npx vitest run --project cli test/nemoclaw-start.test.ts -t "captures baseline snapshot when openclaw.json is valid and no baseline exists".

Nightly proof

Targeted nightly E2E passed on the final PR head:

The live job runs the requested end-to-end flow on Linux with the repository NVIDIA_API_KEY secret: public install from this PR ref, cloud onboard with NVIDIA Endpoints and nvidia/nemotron-3-super-120b-a12b, pre-block nemoclaw <sandbox> status, pre-block nemoclaw <sandbox> connect --probe-only, exact DOCKER-USER DROP rules for 75.2.113.119 and 99.83.136.103, in-sandbox endpoint-block verification, openclaw tui, hello, and final TUI assertion.
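
The firewall step in that flow can be pictured as one DROP rule per endpoint IP in the DOCKER-USER chain. The sketch below only prints candidate `iptables` commands so they can be reviewed before anything is run with sudo. `docker_user_rules` is a hypothetical helper, and the exact rule shape used by the PR's script is not shown on this page, so treat the output format as an assumption:

```shell
#!/usr/bin/env bash
# Sketch only: print (do not run) DOCKER-USER rules that would drop traffic to the given IPs.
# docker_user_rules is an illustrative name; the PR's script may build its rules differently.

docker_user_rules() {
  # $1 = -I to insert the rules, -D to delete them again; remaining args = destination IPs
  action="$1"
  shift
  for ip in "$@"; do
    printf 'iptables %s DOCKER-USER -d %s -j DROP\n' "$action" "$ip"
  done
}
```

Calling `docker_user_rules -I 75.2.113.119 99.83.136.103` prints the two insert commands, and the same call with `-D` prints the matching cleanup commands, which makes it easy to pair every block with a removal in a trap.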

The passing assertion was:

PASS: openclaw tui surfaced a visible unreachable-inference error and stopped the spinner

The dispatch command for reruns while this job only exists on the PR branch is:

gh workflow run nightly-e2e.yaml --repo NVIDIA/NemoClaw \
  --ref issue-4434-openclaw-2026-5-27-proof \
  -f target_ref=5f549f661fe81b485f75903146512af4225d4698 \
  -f pr_number=4437 \
  -f jobs=issue-4434-tui-unreachable-inference-e2e

Remaining release note

Summary by CodeRabbit

  • Tests

    • Added opt-in live E2E repro and new unit tests for TUI behavior when inference endpoints are unreachable, validating error visibility, spinner/shutdown behavior, and compatibility with updated runtime shapes and followup-runner variants.
  • Chores

    • Bumped OpenClaw/runtime to 2026.5.27 across builds, manifests, docs, and test expectations.
    • Added a selective/nightly E2E job to run the opt-in repro and include its results in aggregated reports.

@copy-pr-bot

copy-pr-bot Bot commented May 28, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: daf817c7-eeff-4e7a-96e9-9b059c58e635

📥 Commits

Reviewing files that changed from the base of the PR and between dcd1f25 and 5f549f6.

📒 Files selected for processing (3)
  • .coderabbit.yaml
  • .github/workflows/nightly-e2e.yaml
  • test/e2e/test-openclaw-tui-chat-correlation.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/nightly-e2e.yaml

📝 Walkthrough

Walkthrough

Bumps OpenClaw to 2026.5.27 across builds/manifests/tests, widens chat-send patching for admission-shaped runners, adds unit and E2E tests reproducing TUI behavior when inference endpoints are unreachable, and adds a selective/nightly E2E job wired into CI aggregations.

Changes

OpenClaw 2026.5.27 upgrade with issue #4434 TUI error handling

| Layer | File(s) | Summary |
| --- | --- | --- |
| OpenClaw version pin upgrade to 2026.5.27 | Dockerfile, Dockerfile.base, agents/openclaw/manifest.yaml, nemoclaw/package.json, nemoclaw/src/package-metadata.test.ts, docs/reference/commands.mdx | Build ARG defaults, integrity args, agent manifest expected_version, package metadata, and example docs updated to 2026.5.27 and corresponding in-file review markers adjusted. |
| Chat-send patch script updates for 2026.5.27 admission flow | scripts/patch-openclaw-chat-send.js | Widen followup-runner detection and apply a two-stage runId-preservation patch to match createReplyOperation and admission-shaped replyOperation patterns. |
| Followup-runner fixture & tests | test/openclaw-chat-send-patch.test.ts | Add fixture writer and VM context stubs for the 2026.5.27 admission-shaped followup runner and a test verifying runId preservation across admitted/queued/fallback flows. |
| Sandbox/version and deployment verification tests | src/lib/sandbox/version.test.ts, src/lib/verify-deployment.test.ts | Mocked outputs and assertions updated to expect OpenClaw 2026.5.27 across version-detection and verification tests. |
| Fetch-guard patch regression tests for 2026.5.27 | test/fetch-guard-patch-regression.test.ts | Reviewed patch-classifier versions, SSRF policy shape, integrity constants, loaders, and Dockerfile parsing updated to 2026.5.27 shapes and hashes. |
| E2E bash test for TUI firewall block scenario | test/e2e/test-issue-4434-tui-unreachable-inference.sh | New opt-in E2E script provisions sandbox, adds iptables DOCKER-USER drops for NVIDIA IPs, runs openclaw tui under expect, normalizes capture, and asserts visible error and spinner-stop behavior. |
| Unit tests for TUI error visibility on unreachable inference | test/issue-4434-tui-unreachable-inference.test.ts | Vitest suite adding ANSI-stripping, capture classification, TUI rendering, event simulation, and three cases validating error visibility and spinner state under unreachable endpoint scenarios. |
| Nightly/selective E2E job and CI wiring | .github/workflows/nightly-e2e.yaml, .coderabbit.yaml, test/validate-e2e-coverage.test.ts | New issue-4434-tui-unreachable-inference-e2e job added and allowlisted for selective/manual dispatch; job installs expect/iptables, validates trusted checkout/ref reachability, runs the new E2E script with artifact redaction on failure, is aggregated into notify-on-failure, report-to-pr, and scorecard, and coverage validation updated to support privileged trusted-script jobs. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

"I hopped where spinners used to spin,
I nudged the logs so truth gets in.
From 5.22 we leap to 5.27,
The TUI now will shout—be driven!
Logs redacted, tests pass—hip, hop, hooray!"

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: upgrading OpenClaw to 2026.5.27 to resolve the #4434 TUI spinner error issue, which is the core objective of the PR. |
| Linked Issues check | ✅ Passed | The PR comprehensively addresses #4434 by upgrading OpenClaw to 2026.5.27 (which includes upstream fixes for broadcastChatError and error broadcasting), adding live E2E repro tests with expected error visibility assertions, updating test suites, and wiring the repro into nightly CI. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to the #4434 resolution: OpenClaw version bumps across dependencies/tests, patch script updates for 2026.5.27 compatibility, new E2E live repro scripts, test suite additions for TUI error handling, and CI wiring for the new tests. |


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


@github-actions
Contributor

github-actions Bot commented May 28, 2026

E2E Advisor Recommendation

Required E2E: cloud-e2e, openclaw-tui-chat-correlation-e2e, issue-4434-tui-unreachable-inference-e2e, rebuild-openclaw-e2e, sandbox-survival-e2e
Optional E2E: openclaw-inference-switch-e2e, inference-routing-e2e, hermes-e2e

Dispatch hint: cloud-e2e,openclaw-tui-chat-correlation-e2e,issue-4434-tui-unreachable-inference-e2e,rebuild-openclaw-e2e,sandbox-survival-e2e

Auto-dispatched E2E: cloud-e2e, openclaw-tui-chat-correlation-e2e, rebuild-openclaw-e2e, sandbox-survival-e2e via nightly-e2e.yaml at 5f549f661fe81b485f75903146512af4225d4698 (nightly run)

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • cloud-e2e (high): Required because the pinned OpenClaw version and sandbox Docker image changed. This is the broadest install → onboard → live cloud inference proof that the rebuilt image and gateway still work.
  • openclaw-tui-chat-correlation-e2e (high): Required because scripts/patch-openclaw-chat-send.js changed for the 2026.5.27 OpenClaw dist shape. This E2E is the direct hermetic proof that TUI chat turns remain correlated and visible after the baked patch is applied.
  • issue-4434-tui-unreachable-inference-e2e (high): Required because this PR adds the live privileged E2E job and script for unreachable NVIDIA inference. It validates the new workflow wiring plus the real user-visible TUI error and spinner shutdown behavior under a firewall block.
  • rebuild-openclaw-e2e (high): Required because the OpenClaw runtime pin and Docker image layers changed. Rebuild coverage verifies an existing OpenClaw sandbox can pick up the new image/runtime while preserving workspace and state.
  • sandbox-survival-e2e (medium): Required because the sandbox image and gateway runtime version changed. This validates gateway restart recovery, persisted sandbox state, workspace availability, and post-recovery inference on the new runtime.

Optional E2E

  • openclaw-inference-switch-e2e (medium): Useful adjacent confidence for the OpenClaw version bump: verifies route/config patching and a live request after switching inference without a rebuild.
  • inference-routing-e2e (medium): Useful adjacent confidence for inference credential isolation and error classification, especially because this PR touches unreachable-inference behavior and deployment verification tests.
  • hermes-e2e (high): Optional multi-agent smoke for shared CI/image assumptions. The changed runtime pin is OpenClaw-specific, but Dockerfile.base changes can still affect shared sandbox build/onboard behavior.

New E2E recommendations

  • None.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: cloud-e2e,openclaw-tui-chat-correlation-e2e,issue-4434-tui-unreachable-inference-e2e,rebuild-openclaw-e2e,sandbox-survival-e2e

@github-actions
Contributor

github-actions Bot commented May 28, 2026

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • None. No scenario workflow, scenario metadata, scenario runtime, or validation-suite files changed.

Optional scenario E2E

  • None.

Relevant changed files

  • None.

@github-actions
Contributor

github-actions Bot commented May 28, 2026

PR Review Advisor

Findings: 2 needs attention, 7 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 7 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • Privileged #4434 job still lacks an explicit trusted workflow-ref gate (.github/workflows/nightly-e2e.yaml:345): The new live proof receives NVIDIA_API_KEY and GITHUB_TOKEN, installs host packages, uses passwordless sudo, and mutates DOCKER-USER iptables rules, but it checks out and runs the workflow harness from `${{ github.ref }}` without first proving that ref is a protected/default trusted ref. The target_ref SHA and ancestry checks constrain the product-under-test commit, but they do not prove the workflow YAML and shell script are trusted code.
    • Recommendation: Before any secrets, apt installs, Docker access, or sudo/iptables use, require the workflow ref to be a trusted ref such as `refs/heads/main`, or split the job so trusted main workflow code performs the privileged proof while any PR-head/product code runs without repository secrets and host privileges.
    • Evidence: The job checks out `ref: ${{ github.ref }}` with `fetch-depth: 0`, resolves `trusted_head` from that checkout, then runs `test/e2e/test-issue-4434-tui-unreachable-inference.sh` with `NVIDIA_API_KEY`, `GITHUB_TOKEN`, and sudo firewall access. No step rejects non-main or unprotected workflow refs.
  • #4434 live proof does not assert the structured error, reporting layer, or recovery hint clauses (test/e2e/test-issue-4434-tui-unreachable-inference.sh:24): Issue #4434 requires a structured error within 180s that includes an HTTP status or concrete cause, the reporting layer, and a one-line recovery hint. The live proof only requires a broad error-looking token and a final `| error` status, so it can pass on a generic message without proving the user receives the issue-required actionability.
    • Recommendation: Tighten the TUI capture assertions to require the literal issue contract in the visible TUI context: an HTTP status or network cause, a gateway/proxy/upstream/API layer indicator, and a one-line recovery hint such as checking egress policy, checking the API key, retrying, or restoring endpoint access.
    • Evidence: `VISIBLE_ERROR_RE` is `error|failed|timeout|timed out|unavailable|fetch failed|ETIMEDOUT|ECONN|upstream`; the expect block uses the same broad alternatives; the final status assertion only requires `| error`. There is no live assertion for HTTP status/cause, reporting layer, or recovery hint.
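
The tightening recommended in the second finding could look roughly like the sketch below, which checks each clause of the issue contract with its own pattern instead of one broad alternation. Every regular expression here is an assumption chosen for illustration, since the real OpenClaw TUI wording is not quoted on this page; `strip_ansi` and `assert_structured_error` are hypothetical helper names. Only the final `| error` status token comes from the review text.

```shell
#!/usr/bin/env bash
# Sketch: assert cause, reporting layer, recovery hint, and final status separately.
# All patterns are placeholders; adjust them to the actual TUI output before relying on them.
CAUSE_RE='HTTP [45][0-9]{2}|ETIMEDOUT|ECONNREFUSED|ECONNRESET|fetch failed|timed out'
LAYER_RE='gateway|proxy|upstream|API'
HINT_RE='check (the )?(egress|network) policy|check (the )?API key|retry|restore endpoint access'

strip_ansi() {
  # Remove CSI escape sequences (colors, cursor movement) so the patterns see plain text.
  esc="$(printf '\033')"
  sed "s/${esc}\[[0-9;?]*[A-Za-z]//g"
}

assert_structured_error() {
  # $1 = path to the raw TUI capture; returns non-zero if any clause is missing.
  text="$(strip_ansi < "$1")"
  failed=0
  printf '%s\n' "$text" | grep -Eq "$CAUSE_RE" \
    || { echo "FAIL: no HTTP status or network cause" >&2; failed=1; }
  printf '%s\n' "$text" | grep -Eiq "$LAYER_RE" \
    || { echo "FAIL: no reporting layer indicator" >&2; failed=1; }
  printf '%s\n' "$text" | grep -Eiq "$HINT_RE" \
    || { echo "FAIL: no recovery hint" >&2; failed=1; }
  printf '%s\n' "$text" | grep -Fq '| error' \
    || { echo "FAIL: final status is not error" >&2; failed=1; }
  return "$failed"
}
```

Reporting each missing clause by name keeps a failing nightly run diagnosable from the log alone, and a generic "something failed" message no longer satisfies the check.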

🔎 Worth checking

  • Source-of-truth review needed: test/e2e/test-issue-4434-tui-unreachable-inference.sh tolerant TUI capture parsing: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `VISIBLE_ERROR_RE` accepts broad tokens and the final assertion only requires `| error`.
  • Source-of-truth review needed: .github/workflows/nightly-e2e.yaml trusted workflow/product ref split: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: The resolver checks `git merge-base --is-ancestor` for `target_ref`, but no step rejects a non-main workflow ref before secrets and sudo are used.
  • Tolerant TUI capture parsing lacks a source-of-truth contract and removal criteria (test/e2e/test-issue-4434-tui-unreachable-inference.sh:24): The live harness adds localized tolerant parsing for noisy TUI output, but it does not document the invalid capture states being tolerated, the exact upstream/source boundary, or when the broad parser can be removed. Because that same broad parsing under-asserts the issue's structured error clauses, it risks turning the live proof into a weak smoke test instead of a regression guard.
    • Recommendation: Document the source boundary and removal condition for the tolerant parser, and prefer explicit structured assertions over broad catch-all terms. If the upstream TUI output is expected to vary, assert named required fields or patterns for cause, layer, hint, and final error status separately.
    • Evidence: The script accepts any match for `error|failed|timeout|timed out|unavailable|fetch failed|ETIMEDOUT|ECONN|upstream` and then checks only the final status line. The synthetic unit test models a richer error, but the live runtime proof does not require that richer shape.
  • #4434 failure artifact redaction skips capture directory contents (.github/workflows/nightly-e2e.yaml:414): The failure sanitizer only edits paths that are regular files directly matched by `/tmp/nemoclaw-issue-4434.*`, but the E2E creates that path as a directory and writes expect, status, blocked-probe, and TUI capture logs inside it. The upload step uses the same directory glob, so nested files from a secret-bearing run can be uploaded without recursive redaction.
    • Recommendation: Redact recursively before upload, for example by finding all regular files under `/tmp/nemoclaw-issue-4434.*` directories and applying the same token replacements, or copy only sanitized files into a separate artifact directory and upload that directory.
    • Evidence: `CAPTURE_DIR` is created with `mktemp -d .../nemoclaw-issue-4434.XXXXXX`; the workflow sanitizer runs `for file in ... /tmp/nemoclaw-issue-4434.*; do [ -f "$file" ] || continue`; the upload path includes `/tmp/nemoclaw-issue-4434.*`.
  • Record advisory review evidence for the high-trust OpenClaw bump (Dockerfile:31): OpenClaw is a high-trust sandbox/gateway dependency involved in inference, chat handling, SSRF-sensitive fetch paths, network policy behavior, and runtime patching. The PR pins `openclaw@2026.5.27` and verifies npm integrity, which is good, but the changed repository files do not record OSV/GHSA/CVE or equivalent advisory-review evidence for the new version. Global npm installs also continue to allow lifecycle scripts without an explicit exception rationale.
    • Recommendation: Record the vulnerability/advisory review result for `openclaw@2026.5.27` in a release or dependency-review artifact, and add an explicit lifecycle-script exception rationale or constraints where global installs require scripts.
    • Evidence: `Dockerfile` and `Dockerfile.base` set `OPENCLAW_VERSION=2026.5.27` with `sha512-2N93zhdAo88KAbHt6T7KvYXf4s7XIkYXBgv1npYpn7e1Y9FvrtgtpsA38my9rtFW+70uXEojRPX5/OqnuDqJPw==`; global installs use `npm install -g`; no changed file records advisory database review evidence.
  • Privileged secret-bearing job installs unpinned host packages (.github/workflows/nightly-e2e.yaml:390): The #4434 job installs `expect` and `iptables` from the runner apt repositories immediately before running with repository secrets, Docker access, passwordless sudo, and host firewall mutation. This is common in CI, but it is a sensitive trust boundary for this specific job because package resolution occurs inside the same privileged secret-bearing execution path.
    • Recommendation: Pin or otherwise constrain these host package versions where practical, or move the tools into a prebuilt trusted runner/image used specifically for this privileged proof job.
    • Evidence: The workflow runs `sudo apt-get update` and `sudo apt-get install -y expect iptables` before passing `NVIDIA_API_KEY` and `GITHUB_TOKEN` to the live E2E script.
  • Generated assistant reference still shows the old OpenClaw version (.agents/skills/nemoclaw-user-reference/references/commands.md:421): The public MDX command reference was updated to show `OpenClaw v2026.5.27`, but the repository's assistant reference copy still shows `OpenClaw v2026.5.22`. If that reference is consumed by assistant or user-facing answers, it will drift from the runtime pin changed by this PR.
    • Recommendation: Refresh the generated/reference skill copy or document why it is intentionally not updated in this PR.
    • Evidence: `docs/reference/commands.mdx` now shows `Agent: OpenClaw v2026.5.27`, while `.agents/skills/nemoclaw-user-reference/references/commands.md` still shows `Agent: OpenClaw v2026.5.22`.
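
For the artifact-redaction finding in the list above, one way to redact recursively is to walk every regular file under each matched path rather than only the top-level glob matches. This is a rough sketch under stated assumptions: `redact_file` and `redact_tree` are hypothetical names, and the two token patterns (NVIDIA keys beginning with `nvapi-`, GitHub tokens beginning with `ghp_`, `gho_`, `ghu_`, `ghs_`, or `ghr_`) stand in for whatever replacements the workflow's sanitizer actually applies.

```shell
#!/usr/bin/env bash
# Sketch: redact secret-looking tokens in every regular file below the given paths.
# redact_file and redact_tree are illustrative names; the token patterns are assumptions.

redact_file() {
  # Write the redacted copy to a temp file outside the tree, then overwrite in place
  # so the original file keeps its permissions.
  tmp="$(mktemp)" || return 1
  sed -E \
    -e 's/nvapi-[A-Za-z0-9_-]+/[REDACTED]/g' \
    -e 's/gh[pousr]_[A-Za-z0-9_]+/[REDACTED]/g' \
    "$1" > "$tmp" && cat "$tmp" > "$1"
  rm -f "$tmp"
}

redact_tree() {
  # Each argument may be a file or a directory; directories are walked recursively.
  # File names containing newlines are not handled by this simple loop.
  for root in "$@"; do
    [ -e "$root" ] || continue
    find "$root" -type f -print | while IFS= read -r file; do
      redact_file "$file"
    done
  done
}
```

A call such as `redact_tree /tmp/nemoclaw-issue-4434.*` before the upload step would cover the nested capture logs, provided the glob is left unquoted so that it expands to the capture directories.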

🌱 Nice ideas

  • None.
Since last review details

Current findings:

  • Source-of-truth review needed: test/e2e/test-issue-4434-tui-unreachable-inference.sh tolerant TUI capture parsing: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `VISIBLE_ERROR_RE` accepts broad tokens and the final assertion only requires `| error`.
  • Source-of-truth review needed: .github/workflows/nightly-e2e.yaml trusted workflow/product ref split: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: The resolver checks `git merge-base --is-ancestor` for `target_ref`, but no step rejects a non-main workflow ref before secrets and sudo are used.
  • Privileged [DGX Spark][CLI&UX] openclaw tui shows indefinite spinner with no error when inference endpoint is unreachable #4434 job still lacks an explicit trusted workflow-ref gate (.github/workflows/nightly-e2e.yaml:345): The new live proof receives NVIDIA_API_KEY and GITHUB_TOKEN, installs host packages, uses passwordless sudo, and mutates DOCKER-USER iptables rules, but it checks out and runs the workflow harness from `${{ github.ref }}` without first proving that ref is a protected/default trusted ref. The target_ref SHA and ancestry checks constrain the product-under-test commit, but they do not prove the workflow YAML and shell script are trusted code.
    • Recommendation: Before any secrets, apt installs, Docker access, or sudo/iptables use, require the workflow ref to be a trusted ref such as `refs/heads/main`, or split the job so trusted main workflow code performs the privileged proof while any PR-head/product code runs without repository secrets and host privileges.
    • Evidence: The job checks out `ref: ${{ github.ref }}` with `fetch-depth: 0`, resolves `trusted_head` from that checkout, then runs `test/e2e/test-issue-4434-tui-unreachable-inference.sh` with `NVIDIA_API_KEY`, `GITHUB_TOKEN`, and sudo firewall access. No step rejects non-main or unprotected workflow refs.
  • [DGX Spark][CLI&UX] openclaw tui shows indefinite spinner with no error when inference endpoint is unreachable #4434 live proof does not assert the structured error, reporting layer, or recovery hint clauses (test/e2e/test-issue-4434-tui-unreachable-inference.sh:24): Issue [DGX Spark][CLI&UX] openclaw tui shows indefinite spinner with no error when inference endpoint is unreachable #4434 requires a structured error within 180s that includes an HTTP status or concrete cause, the reporting layer, and a one-line recovery hint. The live proof only requires a broad error-looking token and a final `| error` status, so it can pass on a generic message without proving the user receives the issue-required actionability.
    • Recommendation: Tighten the TUI capture assertions to require the literal issue contract in the visible TUI context: an HTTP status or network cause, a gateway/proxy/upstream/API layer indicator, and a one-line recovery hint such as checking egress policy, checking the API key, retrying, or restoring endpoint access.
    • Evidence: `VISIBLE_ERROR_RE` is `error|failed|timeout|timed out|unavailable|fetch failed|ETIMEDOUT|ECONN|upstream`; the expect block uses the same broad alternatives; the final status assertion only requires `| error`. There is no live assertion for HTTP status/cause, reporting layer, or recovery hint.
  • Tolerant TUI capture parsing lacks a source-of-truth contract and removal criteria (test/e2e/test-issue-4434-tui-unreachable-inference.sh:24): The live harness adds localized tolerant parsing for noisy TUI output, but it does not document the invalid capture states being tolerated, the exact upstream/source boundary, or when the broad parser can be removed. Because that same broad parsing under-asserts the issue's structured error clauses, it risks turning the live proof into a weak smoke test instead of a regression guard.
    • Recommendation: Document the source boundary and removal condition for the tolerant parser, and prefer explicit structured assertions over broad catch-all terms. If the upstream TUI output is expected to vary, assert named required fields or patterns for cause, layer, hint, and final error status separately.
    • Evidence: The script accepts any match for `error|failed|timeout|timed out|unavailable|fetch failed|ETIMEDOUT|ECONN|upstream` and then checks only the final status line. The synthetic unit test models a richer error, but the live runtime proof does not require that richer shape.
  • [DGX Spark][CLI&UX] openclaw tui shows indefinite spinner with no error when inference endpoint is unreachable #4434 failure artifact redaction skips capture directory contents (.github/workflows/nightly-e2e.yaml:414): The failure sanitizer only edits paths that are regular files directly matched by `/tmp/nemoclaw-issue-4434.*`, but the E2E creates that path as a directory and writes expect, status, blocked-probe, and TUI capture logs inside it. The upload step uses the same directory glob, so nested files from a secret-bearing run can be uploaded without recursive redaction.
    • Recommendation: Redact recursively before upload, for example by finding all regular files under `/tmp/nemoclaw-issue-4434.*` directories and applying the same token replacements, or copy only sanitized files into a separate artifact directory and upload that directory.
    • Evidence: `CAPTURE_DIR` is created with `mktemp -d .../nemoclaw-issue-4434.XXXXXX`; the workflow sanitizer runs `for file in ... /tmp/nemoclaw-issue-4434.*; do [ -f "$file" ] || continue`; the upload path includes `/tmp/nemoclaw-issue-4434.*`.
  • Record advisory review evidence for the high-trust OpenClaw bump (Dockerfile:31): OpenClaw is a high-trust sandbox/gateway dependency involved in inference, chat handling, SSRF-sensitive fetch paths, network policy behavior, and runtime patching. The PR pins `openclaw@2026.5.27` and verifies npm integrity, which is good, but the changed repository files do not record OSV/GHSA/CVE or equivalent advisory-review evidence for the new version. Global npm installs also continue to allow lifecycle scripts without an explicit exception rationale.
    • Recommendation: Record the vulnerability/advisory review result for `openclaw@2026.5.27` in a release or dependency-review artifact, and add an explicit lifecycle-script exception rationale or constraints where global installs require scripts.
    • Evidence: `Dockerfile` and `Dockerfile.base` set `OPENCLAW_VERSION=2026.5.27` with `sha512-2N93zhdAo88KAbHt6T7KvYXf4s7XIkYXBgv1npYpn7e1Y9FvrtgtpsA38my9rtFW+70uXEojRPX5/OqnuDqJPw==`; global installs use `npm install -g`; no changed file records advisory database review evidence.
  • Privileged secret-bearing job installs unpinned host packages (.github/workflows/nightly-e2e.yaml:390): The [DGX Spark][CLI&UX] openclaw tui shows indefinite spinner with no error when inference endpoint is unreachable #4434 job installs `expect` and `iptables` from the runner apt repositories immediately before running with repository secrets, Docker access, passwordless sudo, and host firewall mutation. This is common in CI, but it is a sensitive trust boundary for this specific job because package resolution occurs inside the same privileged secret-bearing execution path.
    • Recommendation: Pin or otherwise constrain these host package versions where practical, or move the tools into a prebuilt trusted runner/image used specifically for this privileged proof job.
    • Evidence: The workflow runs `sudo apt-get update` and `sudo apt-get install -y expect iptables` before passing `NVIDIA_API_KEY` and `GITHUB_TOKEN` to the live E2E script.
  • Generated assistant reference still shows the old OpenClaw version (.agents/skills/nemoclaw-user-reference/references/commands.md:421): The public MDX command reference was updated to show `OpenClaw v2026.5.27`, but the repository's assistant reference copy still shows `OpenClaw v2026.5.22`. If that reference is consumed by assistant or user-facing answers, it will drift from the runtime pin changed by this PR.
    • Recommendation: Refresh the generated/reference skill copy or document why it is intentionally not updated in this PR.
    • Evidence: `docs/reference/commands.mdx` now shows `Agent: OpenClaw v2026.5.27`, while `.agents/skills/nemoclaw-user-reference/references/commands.md` still shows `Agent: OpenClaw v2026.5.22`.
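
The reference drift in the last finding could be caught mechanically. A minimal sketch, assuming both files carry a line of the form `Agent: OpenClaw vX.Y.Z` as quoted in the evidence; the helper names `agent_version` and `versions_match` are hypothetical and not part of this PR:

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the first "OpenClaw vX.Y.Z" version found in a file.
agent_version() {
  grep -Eo 'OpenClaw v[0-9]+\.[0-9]+\.[0-9]+' "$1" | head -n 1 | sed 's/^OpenClaw v//'
}

# Hypothetical helper: succeed only when both files advertise the same,
# non-empty OpenClaw version.
versions_match() {
  local a b
  # "|| true" keeps a missing version from aborting callers that use "set -e";
  # the emptiness check below reports the mismatch instead.
  a="$(agent_version "$1" || true)"
  b="$(agent_version "$2" || true)"
  [ -n "$a" ] && [ "$a" = "$b" ]
}
```

Under those assumptions, a call such as `versions_match docs/reference/commands.mdx .agents/skills/nemoclaw-user-reference/references/commands.md` would return non-zero while the two copies disagree.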

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@copy-pr-bot

copy-pr-bot Bot commented May 28, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@github-actions
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26579419772
Target ref: 8b747f31a9a91527f2146dcce1d5346212105f1b
Workflow ref: issue-4434-openclaw-2026-5-27-proof
Requested jobs: issue-4434-tui-unreachable-inference-e2e
Summary: 0 passed, 1 failed, 0 skipped

Job Result
issue-4434-tui-unreachable-inference-e2e ❌ failure

Failed jobs: issue-4434-tui-unreachable-inference-e2e. Check run artifacts for logs.

@github-actions
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26580021337
Target ref: 00a9a185863a68fb81b96a9b8cb00ef32ca9e5b5
Workflow ref: issue-4434-openclaw-2026-5-27-proof
Requested jobs: issue-4434-tui-unreachable-inference-e2e
Summary: 0 passed, 1 failed, 0 skipped

Job Result
issue-4434-tui-unreachable-inference-e2e ❌ failure

Failed jobs: issue-4434-tui-unreachable-inference-e2e. Check run artifacts for logs.

@ericksoa ericksoa changed the title from "draft: prove OpenClaw 2026.5.27 resolves #4434 unreachable inference TUI error" to "test: prove OpenClaw 2026.5.27 resolves #4434 unreachable inference TUI error" May 28, 2026
@github-actions
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26580885137
Target ref: 6e94679363fb6f33c8f5a72a7a3126c379ade2f7
Workflow ref: issue-4434-openclaw-2026-5-27-proof
Requested jobs: issue-4434-tui-unreachable-inference-e2e
Summary: 0 passed, 1 failed, 0 skipped

Job Result
issue-4434-tui-unreachable-inference-e2e ❌ failure

Failed jobs: issue-4434-tui-unreachable-inference-e2e. Check run artifacts for logs.

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26581483934
Target ref: 84fcd3d3b355e68d31faa640dbedfd160ed6c2fe
Workflow ref: issue-4434-openclaw-2026-5-27-proof
Requested jobs: issue-4434-tui-unreachable-inference-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
issue-4434-tui-unreachable-inference-e2e ✅ success

@ericksoa ericksoa added v0.0.55 Release target fix labels May 28, 2026
@ericksoa ericksoa marked this pull request as ready for review May 28, 2026 15:23
@ericksoa
Contributor Author

/nvskills-ci

@ericksoa
Contributor Author

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26584199493
Target ref: 98e0a763efe0925f26cf89129cd4ab63cb0b05f3
Workflow ref: issue-4434-openclaw-2026-5-27-proof
Requested jobs: issue-4434-tui-unreachable-inference-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
issue-4434-tui-unreachable-inference-e2e ✅ success

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 354-362: The checkout step currently leaves default git
credentials in the workspace; update the Checkout action configuration (the step
named "Checkout" that uses
actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd) to disable persisted
credentials by adding persist-credentials: false under its with: block so the
privileged job does not retain Git credentials in the workspace.
- Around line 345-350: The .coderabbit.yaml is missing a path_instructions entry
for the workflow job issue-4434-tui-unreachable-inference-e2e; add a mapping
under path_instructions that references the workflow/job key
(issue-4434-tui-unreachable-inference-e2e) and points to the relevant test
script paths (e.g., test-issue-4434-tui-unreachable-inference or the actual e2e
test directory/files), ensuring the key matches the job name used in the
workflow and the path glob patterns match where the test scripts live so
coderabbit can trigger correct instructions for that job.

In `@test/e2e/test-openclaw-tui-chat-correlation.sh`:
- Line 49: The version check uses regex-style grep which treats "." as any char;
change the command used in the conditional that references openclaw_version to
use fixed-string matching (e.g., replace grep -q "2026.5.27" with grep -Fq
"2026.5.27") so the literal string "2026.5.27" is matched exactly (update the if
condition around the grep invocation).
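
The difference between the two grep modes in the last finding can be shown in isolation. A minimal sketch; the function names are hypothetical and the input strings are made up for illustration, not taken from the E2E script:

```shell
#!/usr/bin/env bash

# Default (basic regex) matching: each "." matches any single character,
# so look-alike strings are accepted.
has_version_regex() {
  printf '%s\n' "$1" | grep -q "2026.5.27"
}

# Fixed-string matching: "." is matched literally.
has_version_fixed() {
  printf '%s\n' "$1" | grep -Fq "2026.5.27"
}
```

With these definitions, `has_version_regex "OpenClaw 2026x5y27"` succeeds while `has_version_fixed "OpenClaw 2026x5y27"` fails. Note that `-F` still performs a substring match, so a longer string such as `2026.5.271` would also be accepted.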

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: a2107b9a-8f01-4ffa-bfe2-206858d44df8

📥 Commits

Reviewing files that changed from the base of the PR and between 0c108ae and 98e0a76.

📒 Files selected for processing (15)
  • .github/workflows/nightly-e2e.yaml
  • Dockerfile
  • Dockerfile.base
  • agents/openclaw/manifest.yaml
  • nemoclaw/package.json
  • nemoclaw/src/package-metadata.test.ts
  • scripts/patch-openclaw-chat-send.js
  • src/lib/sandbox/version.test.ts
  • src/lib/verify-deployment.test.ts
  • test/e2e/test-issue-4434-tui-unreachable-inference.sh
  • test/e2e/test-openclaw-tui-chat-correlation.sh
  • test/fetch-guard-patch-regression.test.ts
  • test/issue-4434-tui-unreachable-inference.test.ts
  • test/openclaw-chat-send-patch.test.ts
  • test/validate-e2e-coverage.test.ts

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26584355281
Target ref: 98e0a763efe0925f26cf89129cd4ab63cb0b05f3
Workflow ref: main
Requested jobs: cloud-onboard-e2e,cloud-inference-e2e,inference-routing-e2e,network-policy-e2e,rebuild-openclaw-e2e,openclaw-tui-chat-correlation-e2e
Summary: 6 passed, 0 failed, 0 skipped

Job Result
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
inference-routing-e2e ✅ success
network-policy-e2e ✅ success
openclaw-tui-chat-correlation-e2e ✅ success
rebuild-openclaw-e2e ✅ success

@github-actions
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26585206489
Target ref: 260a808ce0d4d05c5c51db8bb54854c4d63f474c
Workflow ref: issue-4434-openclaw-2026-5-27-proof
Requested jobs: issue-4434-tui-unreachable-inference-e2e
Summary: 0 passed, 1 failed, 0 skipped

Job Result
issue-4434-tui-unreachable-inference-e2e ❌ failure

Failed jobs: issue-4434-tui-unreachable-inference-e2e. Check run artifacts for logs.

@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

@github-actions
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26585283728
Target ref: 260a808ceb37417308653c232f2011d550399d45
Workflow ref: issue-4434-openclaw-2026-5-27-proof
Requested jobs: issue-4434-tui-unreachable-inference-e2e
Summary: 0 passed, 1 failed, 0 skipped

Job Result
issue-4434-tui-unreachable-inference-e2e ❌ failure

Failed jobs: issue-4434-tui-unreachable-inference-e2e. Check run artifacts for logs.

@github-actions
Contributor

Selective E2E Results — ❌ Some jobs failed

Run: 26585367114
Target ref: 260a808ceb37417308653c232f2011d550399d45
Workflow ref: main
Requested jobs: rebuild-openclaw-e2e,cloud-onboard-e2e,cloud-inference-e2e,openclaw-tui-chat-correlation-e2e,inference-routing-e2e,network-policy-e2e
Summary: 5 passed, 1 failed, 0 skipped

Job Result
cloud-inference-e2e ✅ success
cloud-onboard-e2e ✅ success
inference-routing-e2e ✅ success
network-policy-e2e ✅ success
openclaw-tui-chat-correlation-e2e ❌ failure
rebuild-openclaw-e2e ✅ success

Failed jobs: openclaw-tui-chat-correlation-e2e. Check run artifacts for logs.

@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26586330696
Target ref: dcd1f25f5caf0cb57670620e52f71ef8b7f19d6f
Workflow ref: issue-4434-openclaw-2026-5-27-proof
Requested jobs: issue-4434-tui-unreachable-inference-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
issue-4434-tui-unreachable-inference-e2e ✅ success

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26586391765
Target ref: dcd1f25f5caf0cb57670620e52f71ef8b7f19d6f
Workflow ref: main
Requested jobs: rebuild-openclaw-e2e,cloud-onboard-e2e,openclaw-tui-chat-correlation-e2e,inference-routing-e2e,network-policy-e2e
Summary: 5 passed, 0 failed, 0 skipped

Job Result
cloud-onboard-e2e ✅ success
inference-routing-e2e ✅ success
network-policy-e2e ✅ success
openclaw-tui-chat-correlation-e2e ✅ success
rebuild-openclaw-e2e ✅ success

@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26586935610
Target ref: 5f549f661fe81b485f75903146512af4225d4698
Workflow ref: issue-4434-openclaw-2026-5-27-proof
Requested jobs: issue-4434-tui-unreachable-inference-e2e
Summary: 1 passed, 0 failed, 0 skipped

Job Result
issue-4434-tui-unreachable-inference-e2e ✅ success

@ericksoa ericksoa requested review from cv and jyaunches May 28, 2026 16:25
@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26587080976
Target ref: 5f549f661fe81b485f75903146512af4225d4698
Workflow ref: main
Requested jobs: cloud-e2e,openclaw-tui-chat-correlation-e2e,rebuild-openclaw-e2e,sandbox-survival-e2e
Summary: 4 passed, 0 failed, 0 skipped

Job Result
cloud-e2e ✅ success
openclaw-tui-chat-correlation-e2e ✅ success
rebuild-openclaw-e2e ✅ success
sandbox-survival-e2e ✅ success

@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

@wscurran wscurran added Integration: OpenClaw and NemoClaw CLI labels May 28, 2026
@github-actions
Contributor

Selective E2E Results — ✅ All requested jobs passed

Run: 26652959641
Target ref: 5f549f661fe81b485f75903146512af4225d4698
Workflow ref: issue-4434-openclaw-2026-5-27-proof
Requested jobs: openclaw-inference-switch-e2e,inference-routing-e2e,hermes-e2e
Summary: 3 passed, 0 failed, 0 skipped

Job Result
hermes-e2e ✅ success
inference-routing-e2e ✅ success
openclaw-inference-switch-e2e ✅ success

Contributor

@jyaunches jyaunches left a comment


Approving on regression-risk grounds.

Regression-risk review

Surface area is small: only one product-code change (scripts/patch-openclaw-chat-send.js, +16/−3) plus version/integrity bumps and test fixture alignment. The rest of the behavior change comes from upstream OpenClaw 2026.5.22 → 2026.5.27.

Past OpenClaw bumps have rolled back in production (e.g. #3820 reverted by #4051 in <5h), so I wanted breadth of E2E rather than just count of green checks before approving.

E2E coverage on head 5f549f66

All 5 advisor-required jobs green via auto-dispatch + the live privileged job:

  • cloud-e2e
  • openclaw-tui-chat-correlation-e2e (direct proof the chat-send shim still applies on the new admitReplyTurn shape)
  • rebuild-openclaw-e2e
  • sandbox-survival-e2e
  • issue-4434-tui-unreachable-inference-e2e (live privileged repro on Linux w/ NVIDIA_API_KEY)

The advisor also flagged three optional adjacent surfaces that weren't auto-dispatched. I dispatched them manually against this PR head (run 26652959641):

  • openclaw-inference-switch-e2e
  • inference-routing-e2e
  • hermes-e2e

8/8 advisor-recommended E2Es now green on 5f549f66.

Non-blocking follow-ups (PR Review Advisor)

The PR Review Advisor only ran once on the initial commit (924c1d52) and never re-evaluated after the "harden review gates" / "address review feedback" commits. Verifying each finding directly against HEAD:

  • 🛠 Privileged job lacks explicit trusted workflow-ref gate (if: github.ref == 'refs/heads/main') — partially mitigated by persist-credentials: false and SHA-pinned + ancestor-checked target_ref, but workflow YAML/script ref itself isn't gated.
  • 🛠 Live proof's VISIBLE_ERROR_RE and final-status assertion are still broad; the synthetic Vitest covers the structured-error shape, so regression-guard quality lives mostly there.
  • 🔎 Failure-artifact sanitizer is non-recursive into CAPTURE_DIR/; tokens are still env-redacted.
  • 🔎 Adjacent items: pin expect/iptables host packages, record OSV/GHSA advisory review evidence for openclaw@2026.5.27, refresh .agents/skills/nemoclaw-user-reference/references/commands.md (still shows v2026.5.22), and document tolerant-parser source-of-truth/removal contract.

CodeRabbit's three actionable items (path_instructions mapping, persist-credentials: false, fixed-string version match) are all addressed.

These are CI hygiene / assertion rigor, not product correctness, and reasonable to track as a follow-up issue.

Verdict

LGTM — approving. Recommend opening one tracking issue for the advisor follow-ups before merging.

@jyaunches jyaunches added R2 v0.0.56 Release target and removed v0.0.55 Release target R2 labels May 29, 2026
@cv cv marked this pull request as draft May 30, 2026 18:21
@cv
Collaborator

cv commented May 30, 2026

Converting to draft so we don't merge a new OpenClaw without discussing first.

@cv cv added v0.0.57 Release target and removed v0.0.56 Release target labels Jun 1, 2026

Labels

fix Integration: OpenClaw Support for OpenClaw NemoClaw CLI Use this label to identify issues with the NemoClaw command-line interface (CLI). v0.0.57 Release target

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[DGX Spark][CLI&UX] openclaw tui shows indefinite spinner with no error when inference endpoint is unreachable

4 participants