fix(install): self re-exec via sg(1) so non-interactive curl|bash finishes Docker group activation #4419

Merged
cv merged 6 commits into main from fix/4414-non-interactive-docker-group-reexec
May 30, 2026

Contributor

@jason-ma-nv jason-ma-nv commented May 28, 2026

Summary

On a clean Ubuntu 24.04 VM where the user is not in the docker group, the non-interactive curl|bash installer would install Docker, run usermod -aG docker, then exit 0 with instructions to run newgrp docker and re-curl — breaking the NEMOCLAW_NON_INTERACTIVE=1 contract. This PR self-reactivates docker group membership via sg(1) and re-execs the installer so a single curl … | bash completes the install.

Related Issue

Fixes #4414

Changes

  • scripts/install.sh (ensure_docker): when group refresh is needed AND installer_non_interactive is true AND sg is available, exec sg docker -c "exec bash <script> <args>" instead of exiting 0. Guarded by NEMOCLAW_DOCKER_GROUP_REACTIVATED=1 so we don't loop if docker is still unreachable after the re-exec.
  • scripts/install.sh (main): capture original argv into _NEMOCLAW_INSTALLER_ARGS at entry so ensure_docker can forward CLI flags (e.g. --non-interactive --yes-i-accept-third-party-software) across the re-exec.
  • test/install-docker-group-reexec.test.ts: new vitest covering both the non-interactive re-exec path (asserts sg docker -c … is invoked with the original args preserved) and the interactive fallback (asserts the legacy newgrp docker/re-curl message + exit 0 still apply).
  • Interactive runs are unchanged: if NEMOCLAW_NON_INTERACTIVE is unset, the existing newgrp docker/re-curl instructions still fire.
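
The re-exec decision described in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in scripts/install.sh: the helper names `build_reexec_cmd`, `should_reexec_under_sg`, and `reexec_under_docker_group` are hypothetical, and non-interactive mode is detected from the environment variable alone. The names `_NEMOCLAW_INSTALLER_ARGS`, `NEMOCLAW_NON_INTERACTIVE`, and `NEMOCLAW_DOCKER_GROUP_REACTIVATED`, and the `%q` quoting, are taken from this PR's discussion.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not the real install.sh functions.

# Captured at entry so later functions can forward the original CLI flags.
_NEMOCLAW_INSTALLER_ARGS=("$@")

# Build the shell string handed to `sg docker -c`, quoting every value with %q.
build_reexec_cmd() {
  local self="$1" cmd arg quoted
  printf -v cmd 'exec bash %q' "$self"
  for arg in "${_NEMOCLAW_INSTALLER_ARGS[@]}"; do
    printf -v quoted '%q' "$arg"
    cmd+=" $quoted"
  done
  printf '%s\n' "$cmd"
}

# Succeeds only when a re-exec is appropriate: non-interactive mode, no prior
# re-exec (loop guard), sg available, and the installer is a real file on disk.
should_reexec_under_sg() {
  local self="$1"
  [ "${NEMOCLAW_NON_INTERACTIVE:-}" = "1" ] || return 1
  [ "${NEMOCLAW_DOCKER_GROUP_REACTIVATED:-}" != "1" ] || return 1
  command -v sg >/dev/null 2>&1 || return 1
  [ -f "$self" ]
}

# Re-exec under the docker group, or return 1 so the caller can print the
# legacy "newgrp docker" guidance instead.
reexec_under_docker_group() {
  local self="$1"
  if should_reexec_under_sg "$self"; then
    export NEMOCLAW_DOCKER_GROUP_REACTIVATED=1
    exec sg docker -c "$(build_reexec_cmd "$self")"
  fi
  return 1
}
```

Setting the guard variable before `exec` is what keeps the second pass from re-entering `sg` if Docker is still unreachable; the second pass fails the guard check and takes the fallback branch.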

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Verification

  • npx prek run (shellcheck, shfmt, biome, SPDX, gitleaks, etc.) passes on scripts/install.sh and the new test file
  • npx prek run --all-files passes — 25 pre-existing CLI test failures on main (ssrf-parity, cli, fetch-guard-patch-regression, generate-openclaw-config, nemoclaw-start, e2e-lib-helpers), unrelated to this change; confirmed by re-running on main with this branch stashed
  • npm test passes — same pre-existing failures as above
  • Tests added or updated for new or changed behavior (test/install-docker-group-reexec.test.ts)
  • All test/install-*.test.ts and test/install-docker-group-reexec.test.ts pass on this branch
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes — behavior change is transparent to the user (single curl|bash now finishes); no user-facing doc edit needed
  • npm run docs builds without warnings (doc changes only)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Jason Ma jama@nvidia.com

Summary by CodeRabbit

  • New Features

    • Installer now auto-handles Docker group access in non-interactive installs by re-invoking itself so a single automated run can continue without manual steps; includes a guard to prevent repeated re-exec loops.
  • Tests

    • Added automated tests covering non-interactive re-exec behavior, interactive fallback, and the re-exec guard to ensure installer behaves correctly.

…ishes after adding user to docker group

On a clean Ubuntu VM where the user is not yet in the docker group, the
non-interactive curl|bash installer used to install Docker, run usermod
-aG docker, then exit 0 with instructions to run newgrp docker and
re-curl. That broke the contract of NEMOCLAW_NON_INTERACTIVE=1 — the
human still had to round-trip the shell themselves.

When non-interactive mode is set and sg(1) is available, re-execute the
installer under sg docker so the rest of the install (Node.js, CLI,
onboard) runs in a shell with active docker group membership. A
NEMOCLAW_DOCKER_GROUP_REACTIVATED guard prevents looping if docker is
still unreachable after the re-exec. Interactive runs keep the existing
"run newgrp docker and re-curl" message.

Fixes #4414

Signed-off-by: Jason Ma <jama@nvidia.com>
@jason-ma-nv jason-ma-nv self-assigned this May 28, 2026
@coderabbitai
Contributor

coderabbitai Bot commented May 28, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 26b03644-24c5-4f8c-b7ba-58250c19ad3a

📥 Commits

Reviewing files that changed from the base of the PR and between 51333f0 and 3d3b30f.

📒 Files selected for processing (1)
  • test/install-docker-group-reexec.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/install-docker-group-reexec.test.ts

📝 Walkthrough

Walkthrough

Installer captures original argv at startup; when ensure_docker finds docker group inactive during a non-interactive run it sets a guard and re-executes the installer under sg docker -c, forwarding the original installer arguments so the single curl|bash invocation continues. Tests validate non-interactive, interactive, and guard paths.

Changes

Non-interactive Docker group re-activation

Layer / File(s) | Summary
Argument forwarding and non-interactive re-execution (scripts/install.sh) | main() saves original "$@" into _NEMOCLAW_INSTALLER_ARGS; ensure_docker() detects missing docker group in non-interactive mode, sets NEMOCLAW_DOCKER_GROUP_REACTIVATED=1, and execs sg docker -c with a bash command that forwards the captured installer argv to continue the install without manual rerun.
Test helper and behavior validation (test/install-docker-group-reexec.test.ts) | Adds runEnsureDocker() helper to stub sg and run ensure_docker under controlled conditions. Tests assert sg docker -c invocation preserves scripts/install.sh and original flags in non-interactive mode, confirm no sg call in interactive mode, and verify the re-exec guard prevents loops.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#4011: touches scripts/install.sh ensure_docker messaging while this PR adds non-interactive sg docker re-exec logic and tests.

Suggested labels

v0.0.49

Suggested reviewers

  • jyaunches

Poem

A rabbit hops through Docker's gate,
Args bundled tight, it won't be late,
sg docker whispers, "carry on",
One curl|bash — the job is done. 🐇✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately summarizes the main change: enabling non-interactive curl|bash installations to self re-exec via sg(1) for Docker group activation instead of exiting early.
Linked Issues check | ✅ Passed | The changes fully implement the objectives from #4414: enabling non-interactive re-exec via sg docker with argv forwarding, guarding against loops, preserving interactive fallback behavior, and adding comprehensive test coverage.
Out of Scope Changes check | ✅ Passed | All changes directly address #4414: argv capture in main(), sg re-exec logic in ensure_docker(), and tests validating both re-exec and fallback paths with no extraneous modifications.


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment thread test/install-docker-group-reexec.test.ts Fixed
@github-actions
Contributor

github-actions Bot commented May 28, 2026

E2E Advisor Recommendation

Required E2E: cloud-onboard-e2e, cloud-e2e
Optional E2E: launchable-smoke-e2e, wsl-e2e

Dispatch hint: cloud-onboard-e2e,cloud-e2e

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • cloud-onboard-e2e (medium): Exercises the public curl|bash non-interactive installer path, then verifies onboarding, sandbox health, security checks, and live cloud inference. This is the closest existing E2E coverage for the changed installer flow.
  • cloud-e2e (medium): Runs the repo-local install.sh --non-interactive full user journey: install, onboard, sandbox verification, CLI operations, and live inference. Required because scripts/install.sh changed core installer control flow.

Optional E2E

  • launchable-smoke-e2e (medium): Useful adjacent confidence for Ubuntu host setup and community bootstrap behavior, but it does not specifically force the inactive docker-group re-exec path.
  • wsl-e2e (high): Optional platform regression check that Docker setup remains skipped under WSL after nearby ensure_docker changes.

New E2E recommendations

  • installer Docker group reactivation (high): Existing E2E jobs generally run on hosts where Docker is already reachable or the socket is pre-permissioned, so they will not exercise the #4414 path ("[Ubuntu 24.04][Install] non-interactive curl-bash install exits after adding user to docker group instead of self-reentering"): clean Linux user not in active docker group, non-interactive install, sudo usermod, sg docker re-exec, and completion without asking for newgrp/re-curl. Add a fresh Ubuntu VM scenario that starts without Docker or without active docker group membership and asserts a single non-interactive curl|bash completes onboarding and sandbox creation.
    • Suggested test: Add an E2E scenario/job such as ubuntu-clean-docker-group-reexec-e2e that runs the public installer on a clean Ubuntu VM with inactive docker group membership and validates sg docker re-exec, argv preservation, loop guard behavior, and successful onboard.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: cloud-onboard-e2e,cloud-e2e

@github-actions
Contributor

github-actions Bot commented May 28, 2026

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • None. No scenario workflow, scenario metadata, scenario runtime, or validation-suite files changed.

Optional scenario E2E

  • None.

Relevant changed files

  • None.

@github-actions
Contributor

github-actions Bot commented May 28, 2026

PR Review Advisor

Findings: 1 needs attention, 5 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 6 still apply, 0 new items found

Review findings

🛠️ Needs attention

  • curl|bash stdin installs still lack a file-backed self path for sg re-exec (scripts/install.sh:2189): The linked issue's required invocation is a piped stdin installer, but the new re-exec branch only runs when the installer has a real file path. In a true `curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash` run, `BASH_SOURCE[0]`/`$0` is not a reliable file-backed copy of the installer, so this can skip `sg docker -c` and fall back to the old exit-0 `newgrp`/re-curl guidance instead of completing through onboarding.
    • Recommendation: Make the re-entry mechanism work for stdin/piped installs, for example by staging the current installer payload to a verified temporary file before invoking `sg`, or by otherwise creating a real executable script path for the second pass. Add a regression test that invokes the installer through stdin and verifies the non-interactive sg path is taken instead of the legacy fallback.
    • Evidence: Issue #4414 ("[Ubuntu 24.04][Install] non-interactive curl-bash install exits after adding user to docker group instead of self-reentering") says to run `curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash` and expects "A single non-interactive invocation completes through `nemoclaw onboard`". Production gates the new path on `local self="${BASH_SOURCE[0]:-$0}"` and `[ -f "$self" ]`; the new test sources `scripts/install.sh` from disk, which gives it a real file path and masks the stdin case.
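
The difference between the disk-file and stdin cases can be observed with a small probe. This is a rough sketch, assuming bash is on PATH; `has_file_backed_self` is a hypothetical helper that mirrors the `[ -f "$self" ]` gate quoted above, and the probe only models what a top-level script sees, not the full installer.

```shell
#!/usr/bin/env bash
# Sketch only: `has_file_backed_self` is a hypothetical helper name.

# Succeeds when the candidate path names a regular file, which is the
# condition the finding says gates the sg re-exec.
has_file_backed_self() {
  [ -n "$1" ] && [ -f "$1" ]
}

# Probe script: print what a top-level installer would use as its own path.
probe='printf "%s\n" "${BASH_SOURCE[0]:-$0}"'

# Run the probe from a file on disk and from stdin, inside an empty directory
# so that no stray file can accidentally satisfy the -f check.
workdir="$(mktemp -d)"
printf '%s\n' "$probe" > "$workdir/probe.sh"
from_file="$(cd "$workdir" && bash ./probe.sh)"
from_stdin="$(cd "$workdir" && printf '%s\n' "$probe" | bash)"

if (cd "$workdir" && has_file_backed_self "$from_file"); then
  echo "disk file: self path '$from_file' is usable for a re-exec"
fi
if ! (cd "$workdir" && has_file_backed_self "$from_stdin"); then
  echo "stdin: self path '$from_stdin' is not a file, so the legacy fallback applies"
fi
```

A regression test along the lines recommended above would need to drive the installer through the second form (script on stdin) rather than sourcing it from disk.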

🔎 Worth checking

  • Source-of-truth review needed: scripts/install.sh ensure_docker non-interactive Docker group re-exec: The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `scripts/install.sh` adds `sg docker -c` recovery under `needs_group_refresh`; `test/install-docker-group-reexec.test.ts` stubs `sg` by logging args and exiting.
  • sg test records command construction but not the re-entered installer behavior (test/install-docker-group-reexec.test.ts:25): The `sg` stub writes its arguments to a log and exits 0, so the test proves only that the first pass asked for `sg docker -c ...`. It does not prove that the command string actually re-runs the installer, carries `NEMOCLAW_DOCKER_GROUP_REACTIVATED=1`, sees Docker group membership active on the second pass, returns from `ensure_docker`, and continues toward onboarding.
    • Recommendation: Make the `sg` stub extract and execute the `-c` command in a controlled harness, or add targeted runtime/integration validation that faithfully models the second pass. The second pass should change mocked `id -nG` and `docker info` behavior so the test proves `ensure_docker` no longer falls back and execution can proceed to later installer phases.
    • Evidence: The test stub is `printf '%s\n' "$@" > sg-args.txt` followed by `exit 0`; production uses `exec sg docker -c "$cmd"`.
  • sg -c shell-string boundary lacks negative coverage (scripts/install.sh:2194): `sg -c` is an unavoidable shell-string execution boundary in this change. Production uses Bash `%q` for the installer path and forwarded arguments, which is the right mitigation, but the tests neither execute that shell string nor cover spaces, quotes, backslashes, or shell metacharacters in the path or argv.
    • Recommendation: Add tests with installer arguments and/or paths containing spaces and shell metacharacters, and execute the `sg -c` command in the test harness so quoting regressions are caught. Keep using `%q` or equivalent structured shell escaping for every value interpolated into the command string.
    • Evidence: Production builds `cmd` via `printf -v cmd 'exec bash %q' "$self"` and appends each `_NEMOCLAW_INSTALLER_ARGS` entry with `%q`; the current `sg` stub only logs arguments and exits.
  • Automatic continuation after Docker group activation is a host authorization boundary (scripts/install.sh:2186): The change makes non-interactive installs continue automatically after adding the user to the Docker group by re-entering under `sg docker`. Docker group membership effectively grants root-level control over the Docker daemon, so this automatic continuation deserves explicit maintainer attention even though the existing warning text is preserved.
    • Recommendation: Confirm the existing non-interactive acceptance gates and warning text are sufficient for automatically continuing after `sudo usermod -aG docker`. If not, add an explicit opt-in or clearer non-interactive audit log before the `sg` re-exec.
    • Evidence: `scripts/install.sh` warns that Docker group members can control the daemon with root-level impact, then automatically runs `exec sg docker -c "$cmd"` when `installer_non_interactive` is true.
  • Docker group re-exec workaround source-of-truth story is incomplete (scripts/install.sh:2180): This is localized recovery behavior for a Linux session/group-membership boundary. The comments explain the invalid active-group state and loop guard, but the regression proof does not exercise the actual stdin/process re-entry boundary and no removal condition is documented.
    • Recommendation: Document the source boundary and removal condition near the workaround, and pair it with a regression test that covers the real re-entry boundary. A reasonable removal condition would be when the installer no longer needs same-run Docker access after `usermod`, or when a first-class Docker bootstrap/re-entry mechanism replaces the `sg`/`newgrp` workaround.
    • Evidence: `ensure_docker` adds localized `sg docker -c` recovery under `needs_group_refresh`; tests cover fallback text and loop guard but do not execute the sg command string or define when this compatibility path can be removed.
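
Two of the findings above (executing the `-c` string, and covering shell metacharacters) could be addressed together by a harness along these lines. This is a sketch under assumptions: the `sg` function is a stand-in that skips the group switch and simply runs the command string, the "installer" is a hypothetical script that echoes its arguments, and the command string is built with `%q` as the evidence describes. The actual tests in this PR are written in TypeScript with vitest; shell is used here only to keep the example self-contained.

```shell
#!/usr/bin/env bash
# Sketch only: a stand-in for sg(1) that runs the -c string instead of logging it.
# The real sg switches the effective group first; this stub does not.
sg() {
  local group="$1" flag="$2" cmd="$3"
  [ "$flag" = "-c" ] || return 2
  NEMOCLAW_DOCKER_GROUP_REACTIVATED=1 bash -c "$cmd"
}

# Hypothetical "installer" that reports the guard and each argument it receives.
workdir="$(mktemp -d)"
installer="$workdir/fake install.sh"   # path deliberately contains a space
cat > "$installer" <<'EOF'
printf 'guard=%s\n' "${NEMOCLAW_DOCKER_GROUP_REACTIVATED:-}"
printf 'arg=%s\n' "$@"
EOF

# Arguments with spaces, a quote, a backslash, and shell metacharacters.
args=(--non-interactive 'two words' 'quote"d' 'back\slash' '$(touch pwned); echo hi')

# Build the command string with %q applied to the path and to every argument.
printf -v cmd 'exec bash %q' "$installer"
for a in "${args[@]}"; do
  printf -v q '%q' "$a"
  cmd+=" $q"
done

# Second pass: the stub executes the string, so quoting regressions would show
# up as mangled arguments or as a side effect (a file named "pwned").
received="$(cd "$workdir" && sg docker -c "$cmd")"
printf '%s\n' "$received"
```

If any value were interpolated without `%q`, the command substitution in the last argument would run during the second pass and create `pwned` in the working directory, which makes the failure easy to assert on.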

🌱 Nice ideas

  • None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@jason-ma-nv jason-ma-nv added the v0.0.54 Release target label May 28, 2026
…review)

Two PR review fixes for #4414:

- Escape backslashes before quotes in the bash-snippet arg builder
  (CodeQL: incomplete string escaping). The previous regex only escaped
  `"` and would have let a literal `\` slip through if a future test
  passed one.
- Add a one-shot loop-guard test: with NEMOCLAW_DOCKER_GROUP_REACTIVATED=1
  already set, ensure_docker must NOT call sg again — it falls back to
  the legacy "newgrp docker / re-curl" path. Verified by mutation: if
  the guard is removed from scripts/install.sh, the new test fails
  (sg is called a second time and the exit status flips).

Signed-off-by: Jason Ma <jama@nvidia.com>
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/install-docker-group-reexec.test.ts`:
- Around line 126-144: The test for the one-shot guard (the it block using
runEnsureDocker with NEMOCLAW_DOCKER_GROUP_REACTIVATED set) currently only
checks sgArgs and exit status; add an assertion that the user-facing guidance
text is emitted by checking the command output (e.g. examine outcome.stdout or
outcome.output) for the expected fallback instructions such as the "newgrp
docker" hint and the re-curl / installation curl guidance (match substrings like
"newgrp docker" and "curl" to ensure the legacy guidance text is present).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c8b21984-d20a-4bcc-8e52-9f1af0d6d3b7

📥 Commits

Reviewing files that changed from the base of the PR and between 395812f and 51333f0.

📒 Files selected for processing (1)
  • test/install-docker-group-reexec.test.ts

Comment thread test/install-docker-group-reexec.test.ts
…review)

CodeRabbit feedback on the one-shot loop-guard test: the existing
assertions check that sg was not re-invoked and that exit was 0, but
not that the actionable user instructions actually got printed.

Un-silence the info() stub so its output reaches stdout, and assert
"Run: newgrp docker" and the re-curl instruction. Mutation-verified:
rewording the newgrp line in scripts/install.sh now fails the test.

Signed-off-by: Jason Ma <jama@nvidia.com>
@cv cv added v0.0.55 and removed v0.0.54 Release target labels May 28, 2026
@wscurran wscurran added Docker platform: ubuntu Affects Ubuntu Linux environments v0.0.54 Release target labels May 28, 2026
@wscurran
Contributor

@cv cv added v0.0.55 and removed v0.0.55 v0.0.54 Release target labels May 28, 2026
Contributor

@cjagwani cjagwani left a comment

approving the disk-file scope on its merits — fix logic + tests are sound, sg(1) re-exec works cleanly. filing follow-up PR for the literal curl|bash repro per my comment above; the staging-to-disk extension is small enough to land as its own focused change.

@jyaunches jyaunches added R1 v0.0.56 Release target and removed v0.0.55 labels May 29, 2026
cv added a commit that referenced this pull request May 29, 2026
…4414) (#4467)

## Summary
Stages the installer to a tmpfile when invoked via `curl ... | bash` so
the sg(1) re-exec from #4419 has a real script file to point at.
Together with #4419, closes #4414.

## Related Issue
Fixes #4414

## Background
#4419 added a non-interactive sg(1) re-exec to `ensure_docker` so the
installer can finish in a single invocation after `usermod -aG docker`.
That re-exec is gated on `[ -f "$self" ]` where `$self =
${BASH_SOURCE[0]:-$0}`. For the literal `curl ... | bash` repro from
#4414:

- `BASH_SOURCE[0]` is empty
- `$0` is `"bash"`
- `[ -f "bash" ]` is false
- the fix falls through to the legacy `newgrp docker` / re-curl message

So #4419 alone doesn't close #4414 for the most common `curl | bash`
invocation.

Empirically reproduced on a fresh Ubuntu 22.04 brev box: `bash
scripts/install.sh --non-interactive ...` hits the fix ✅, but `cat
scripts/install.sh | bash -s -- --non-interactive ...` falls through to
legacy.

## Changes
- `scripts/install.sh` (entry guard, lines ~2486-2505): when
`BASH_SOURCE[0]` is empty (pipe mode) and `NEMOCLAW_INSTALLER_STAGED !=
1`, mktemp a `/tmp/nemoclaw-installer-XXXXXX` file, curl the canonical
URL into it, then `exec bash` on the staged file. The re-entered
installer has a real `BASH_SOURCE[0]`, so #4419's sg(1) re-exec
succeeds.
- `scripts/install.sh` (cleanup setup, lines ~14-22): when re-launched
as a staged copy, queue the staged tmpfile for removal on EXIT via
`_cleanup_files`.
- `test/install-stage-from-stdin.test.ts`: 4 new tests covering the
pipe-mode happy path, curl failure fallthrough, one-shot loop guard, and
disk-file invocation (no staging).

## Guards
- only fires when `BASH_SOURCE[0]` is empty (preserves disk-file path
unchanged)
- `NEMOCLAW_INSTALLER_STAGED=1` one-shot loop guard
- `mktemp` / `curl` / empty-download failures fall through to legacy
direct-`main()`
- `NEMOCLAW_INSTALLER_URL` env override for testing / staging
environments
- staged tmpfile auto-cleaned on EXIT
- portable `mktemp` template (no `.sh` suffix — BSD mktemp on macOS
rejects it)
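
The staging flow and its guards might look roughly like the sketch below. It is an illustration only: the function names `stage_installer_from_url` and `maybe_stage_and_reexec` are hypothetical, the URL is passed in by the caller rather than hard-coded, and the real entry guard in `scripts/install.sh` may be structured differently. `NEMOCLAW_INSTALLER_STAGED`, `NEMOCLAW_INSTALLER_URL`, and the `/tmp/nemoclaw-installer-XXXXXX` template come from the description above.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not the real install.sh code.

# Download the installer to a temp file and print the staged path.
# Returns non-zero on any failure so the caller can fall through to main().
stage_installer_from_url() {
  local url="$1" staged
  [ "${NEMOCLAW_INSTALLER_STAGED:-}" != "1" ] || return 1   # one-shot loop guard
  staged="$(mktemp /tmp/nemoclaw-installer-XXXXXX)" || return 1
  # Treat a failed or empty download as a staging failure.
  if ! curl -fsSL "$url" -o "$staged" || [ ! -s "$staged" ]; then
    rm -f "$staged"
    return 1
  fi
  printf '%s\n' "$staged"
}

# Entry guard: stage only in pipe mode, where the script has no file path.
# First argument is the top-level "${BASH_SOURCE[0]:-}"; the rest is the argv.
maybe_stage_and_reexec() {
  local self="$1" url staged
  shift
  [ -z "$self" ] || return 0                 # disk-file invocation: unchanged
  url="${NEMOCLAW_INSTALLER_URL:-}"
  [ -n "$url" ] || return 0                  # no URL to fetch from: fall through
  staged="$(stage_installer_from_url "$url")" || return 0
  export NEMOCLAW_INSTALLER_STAGED=1
  exec bash "$staged" "$@"
}

# Intended call site at the bottom of the installer (not executed here):
#   maybe_stage_and_reexec "${BASH_SOURCE[0]:-}" "$@"
#   main "$@"
```

Because `exec` replaces the first process, cleanup of the staged file has to happen in the re-launched copy, which is why the description above queues the file for removal on EXIT from the staged run rather than from the original one.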

## Type of Change
- [x] Code change (feature, bug fix, or refactor)
- [ ] Code change with doc updates
- [ ] Doc only (prose changes, no code sample modifications)
- [ ] Doc only (includes code sample changes)

## Verification
- [x] `npm test -- test/install-stage-from-stdin.test.ts
test/install-preflight.test.ts` — 103 / 103 pass on macOS
- [ ] `npx prek run --all-files` — pre-commit hits unrelated
`test/cli.test.ts` timeout under macOS Spotlight CPU contention; commit
landed with `--no-verify` after individual test verification
- [x] Tests added or updated for new or changed behavior
- [x] No secrets, API keys, or credentials committed
- [ ] Docs updated for user-facing behavior changes

## Notes for reviewers
- Depends on #4419 for the full fix; this PR alone stages the file but
the sg(1) re-exec from #4419 is what consumes it.
- The fix adds a network round-trip mid-install (re-curl from
`NEMOCLAW_INSTALLER_URL`). `curl | bash` users already accepted network
dependency, but flagging for visibility.
- `cat install.sh | bash` (no URL source) still falls through to legacy
— staging only helps `curl URL | bash` since we need a URL to fetch
from.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Installer now safely supports curl|bash one-liners by staging a
temporary installer when run from stdin, validating the fetched payload,
executing the staged copy with original args, and ensuring temporary
files are cleaned up on exit; falls back to the normal path if staging
fails.

* **Tests**
* Added tests covering staging success, download failures,
repeated-staging prevention, disk-based invocations, and invalid payload
handling.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Signed-off-by: Charan Jagwani <cjagwani@nvidia.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
@cv cv merged commit b7b7aaa into main May 30, 2026
28 checks passed
@cv cv deleted the fix/4414-non-interactive-docker-group-reexec branch May 30, 2026 18:19
miyoungc added a commit that referenced this pull request Jun 1, 2026
## Summary

- Adds the v0.0.56 release notes section with links to the deeper docs
pages for installer, status, inference, messaging, policy, and lifecycle
changes.
- Updates source docs for the remaining release-prep gaps around `uv` in
the PyPI preset, compact WhatsApp pairing guidance, and `nemoclaw
inference set` command boundaries.
- Refreshes generated `nemoclaw-user-*` skills and removes skipped
experimental command terms from generated skill surfaces.

## Source summary

- #4613 -> `docs/manage-sandboxes/lifecycle.mdx`,
`docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents
that public installs and `nemoclaw update` follow the maintained `lkg`
tag by default.
- #4419 -> `docs/about/release-notes.mdx`: Notes that non-interactive
Linux installs can reactivate Docker group membership and continue in
one installer run when `sg docker` is available.
- #4550 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Captures live sandbox agent-version
probing for status, connect, and upgrade checks.
- #4609 -> `docs/inference/use-local-inference.mdx`,
`docs/about/release-notes.mdx`: Captures the GPU Docker-driver
host-network local-inference reachability gate.
- #4607 -> `docs/manage-sandboxes/messaging-channels.mdx`,
`docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents
compact WhatsApp QR pairing guidance and gateway/session diagnostics.
- #4582 -> `docs/manage-sandboxes/messaging-channels.mdx`,
`docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Reflects
Slack credential validation before enabling the channel.
- #4554 -> `docs/manage-sandboxes/messaging-channels.mdx`,
`docs/reference/troubleshooting.mdx`, `docs/about/release-notes.mdx`:
Keeps Telegram allowlist alias guidance in the generated user skills and
release notes.
- #4563 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Includes the new `nemoclaw <name> skill
remove <skill>` command in command docs and release notes.
- #4566 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Documents the `nemoclaw inference set`
redirect boundary when `--provider` or `--model` is missing.
- #4323 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Captures per-sandbox status JSON
support.
- #4506 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Captures debug command sandbox-name
validation and safer tarball writing.
- #4569 -> `docs/network-policy/integration-policy-examples.mdx`,
`docs/about/release-notes.mdx`: Documents that the `pypi` preset allows
`/usr/local/bin/uv`.
- #4579 -> `docs/network-policy/integration-policy-examples.mdx`,
`docs/about/release-notes.mdx`: Captures observable Jira preset
validation guidance.
- #4229 -> `docs/manage-sandboxes/lifecycle.mdx`,
`docs/reference/commands.mdx`, `docs/about/release-notes.mdx`: Documents
user-data preservation defaults for uninstall.
- #4399 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Captures CPU-only sandbox intent
preservation across rebuilds.
- #4058 -> `docs/reference/commands.mdx`,
`docs/about/release-notes.mdx`: Captures safer snapshot restore behavior
around existing destinations.
- #4155 and #4460 -> skipped by `docs/.docs-skip`: Removed skipped
experimental command terms from source docs and generated skill evals
instead of documenting those features.
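
For the #4419 entry above, the re-exec through `sg docker` might be sketched as follows. The environment variable names and the `exec sg docker -c "exec bash <script> <args>"` shape come from the PR description; the function names and the quoting helper are hypothetical, and the real `ensure_docker` may differ.

```bash
# Hypothetical sketch of the non-interactive Docker group re-exec.

# Wrap one argument in single quotes, escaping embedded single quotes, so
# it survives the extra shell that `sg -c` starts.
shell_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Build the command string for `sg docker -c`: the script path followed by
# the original installer arguments, each quoted.
build_reexec_command() {
  local script="$1" cmd arg
  shift
  cmd="exec bash $(shell_quote "$script")"
  for arg in "$@"; do
    cmd="$cmd $(shell_quote "$arg")"
  done
  printf '%s\n' "$cmd"
}

# Re-exec under the docker group when allowed. Returns 1 without side
# effects when the caller should fall back to the newgrp instructions.
maybe_reexec_with_docker_group() {
  # Loop guard: a previous re-exec already happened.
  if [ "${NEMOCLAW_DOCKER_GROUP_REACTIVATED:-}" = "1" ]; then return 1; fi
  # Interactive runs keep the legacy behaviour.
  if [ "${NEMOCLAW_NON_INTERACTIVE:-}" != "1" ]; then return 1; fi
  command -v sg >/dev/null 2>&1 || return 1
  export NEMOCLAW_DOCKER_GROUP_REACTIVATED=1
  exec sg docker -c "$(build_reexec_command "$@")"
}
```

Quoting each argument matters because `sg -c` takes a single command string, so flags such as `--yes-i-accept-third-party-software` have to be rebuilt into that string to be forwarded intact.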

## Verification

- `python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix
nemoclaw-user --doc-platform fern-mdx`
- `npm run docs` (passes; Fern reports the pre-existing light-mode
accent contrast warning)
- `rg "permissive mode|shields down|shields up|shields status|config
rotate-token|rotate-token" .agents/skills` (no matches)
- `npm run build:cli` (run to refresh local CLI artifacts for the
pre-push TypeScript hook)
- Commit hooks passed, including `NEMOCLAW_* env-var documentation
gate`, `Verify docs-to-skills output`, `markdownlint-cli2`, `gitleaks`,
and `Test (skills YAML)`.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **Documentation**
  * Expanded Model Router setup with YAML examples, flow diagrams, and
    credential handling; strengthened agent-config immutability and
    integrity guidance; messaging channels updated (Telegram aliases,
    WhatsApp pairing/diagnostics); CLI docs revised (GPU detection,
    inference set behavior, uninstall/rebuild preservation); overview
    rebranded to NemoClaw and added v0.0.56 release notes.

* **New Features**
  * Added `nemoclaw <name> channels status` (messaging diagnostics, JSON);
    added `nemoclaw <name> skill remove`; Hermes no longer marked
    experimental; DGX Spark quickstart sandbox-name note.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
@wscurran added the `area: packaging`, `bug-fix`, and `platform: container` labels and removed the `Docker` label on Jun 3, 2026

Labels

- area: packaging (Packages, images, registries, installers, or distribution)
- bug-fix (PR fixes a bug or regression)
- platform: container (Affects Docker, containerd, Podman, or images)
- platform: ubuntu (Affects Ubuntu Linux environments)
- v0.0.56 (Release target)

Development

Successfully merging this pull request may close these issues.

[Ubuntu 24.04][Install] non-interactive curl-bash install exits after adding user to docker group instead of self-reentering

6 participants