Skip to content

fix(sandbox): auto-unlock shields during rebuild#4129

Closed
chengjiew wants to merge 1 commit into
mainfrom
fix/3113_rebuild-shields-up-auto-unlock
Closed

fix(sandbox): auto-unlock shields during rebuild#4129
chengjiew wants to merge 1 commit into
mainfrom
fix/3113_rebuild-shields-up-auto-unlock

Conversation

@chengjiew
Copy link
Copy Markdown
Contributor

@chengjiew chengjiew commented May 23, 2026

Summary

Fixes #3113.

When nemoclaw rebuild runs while shields are UP, the sandbox state backup can fail before the rebuild starts because protected state/config paths are locked down. This PR temporarily lowers shields before the backup, skips the detached auto-restore timer during that internal rebuild unlock, and restores shields after the sandbox has been recreated and state/policies are restored.

Changes

  • Detect locked shields before rebuild backup and call shieldsDown() programmatically.
  • Add internal skipTimer and throwOnError options to shields helpers so rebuild can recover instead of exiting mid-flow.
  • Re-apply shields after successful rebuild, and provide manual recovery guidance if recreate fails after the old sandbox has been deleted.
  • Add a regression test for the shields-UP rebuild path and the shields-not-configured path.

Verification

  • npm run build:cli
  • npm test -- test/rebuild-shields-auto-unlock.test.ts
  • npm run typecheck:cli
  • git diff --cached --check

I also previously reproduced the original failure on macOS with the pre-fix code and validated the auto-unlock flow locally. After rebasing to latest main, a full real-sandbox rebuild sanity check is currently blocked before backup by a local COMPATIBLE_API_KEY preflight requirement, so the post-rebase evidence here is the targeted regression test plus CLI build/typecheck.

Note: the local pre-push full CLI hook currently fails in unrelated/environment-sensitive tests on this machine (temporary git fixtures inherit repo hooks, version fallback expectations read the current git version, and one TCP timing assertion is too fast locally). I pushed with --no-verify after running the targeted verification above.

Summary by CodeRabbit

  • Bug Fixes

    • Rebuild process now automatically unlocks security shields during backup operations and re-applies them afterward to prevent data loss.
  • Improvements

    • Shield management commands support enhanced error handling and configuration options.
  • Tests

    • Added integration tests validating shield behavior during rebuild workflows.

Review Change Stack

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 23, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 23, 2026

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough

Walkthrough

This PR fixes issue #3113 by implementing automatic shield state recovery during rebuild. When shields are locked (UP), the rebuild process now detects this, temporarily unlocks them for the backup phase, and re-applies the lockdown after rebuild succeeds. The fix extends shield lifecycle functions with error-handling options, orchestrates auto-unlock/relock in the rebuild workflow, and adds integration tests to verify the complete flow.

Changes

Shields Auto-Unlock During Rebuild

Layer / File(s) Summary
Shield Options & Error Handling
src/lib/shields/index.ts
shieldsDown and shieldsUp now accept ShieldsDownOpts/ShieldsUpOpts with throwOnError (switches from process.exit(1) to throwing) and skipTimer (suppresses auto-restore timer) flags. All error branches use a shared fail() helper to apply the selected error mode consistently.
Rebuild Shield Orchestration
src/lib/actions/sandbox/rebuild.ts
Rebuild imports shields and adds state tracking: detects if shields start UP, calls shieldsDown(opts.throwOnError) before backup, defines relockShieldsIfNeeded() helper, and reinvokes it on failure paths (backup failure, delete failure, post-destroy failure) and after rebuild success (Step 8). Ensures shields lockdown is restored even if rebuild fails.
Integration Tests
test/rebuild-shields-auto-unlock.test.ts
New test suite constructs a full temporary fixture with nemoclaw config, policy snapshots, and fake CLI binaries that simulate lock/unlock state via shared files. Verifies rebuild with shieldsLocked: true detects UP, auto-unlocks, captures policy, and backs up state; with shieldsLocked: false omits shield notices but still backs up.

Sequence Diagram

sequenceDiagram
  participant User as User/rebuild CLI
  participant Rebuild as rebuildSandbox()
  participant ShieldsDown as shieldsDown(throwOnError)
  participant Backup as backup tar
  participant ShieldsUp as shieldsUp(throwOnError)
  
  User->>Rebuild: rebuild --yes
  Rebuild->>Rebuild: Check shields status
  alt shields were UP
    Rebuild->>ShieldsDown: auto-unlock with error capture
    ShieldsDown-->>Rebuild: unlocked (or throw)
  end
  Rebuild->>Backup: back up sandbox state
  Backup-->>Rebuild: backup complete
  Rebuild->>Rebuild: destroy and recreate sandbox
  Rebuild->>Rebuild: restore state, apply presets
  alt shields were UP
    Rebuild->>ShieldsUp: re-lock with error capture
    ShieldsUp-->>Rebuild: locked (or throw)
  end
  Rebuild-->>User: rebuild success
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#3976: Both PRs modify the shield lifecycle implementation in src/lib/shields/index.ts—main PR extends shieldsUp/shieldsDown with new error/skip options for rebuild auto-unlock/relock, while retrieved PR changes shieldsDown internals for permissive policy application (unioning live filesystem policy) before proceeding.

Suggested labels

fix, Sandbox

Suggested reviewers

  • ericksoa

Poem

🛡️ The shields came UP and locked the door,
But now rebuild won't fail no more—
We slip them down, then back they go,
With auto-magic, off we flow! 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: automatically unlocking shields during rebuild to fix the backup failure issue.
Linked Issues check ✅ Passed The PR implementation fully addresses issue #3113: detects shields-UP state, temporarily unlocks via shieldsDown with new options, backs up state, recreates sandbox, restores state, and re-applies shields-UP.
Out of Scope Changes check ✅ Passed All changes are within scope: shield-related helper updates, rebuild flow modifications, and comprehensive regression tests directly address the linked issue requirements.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/3113_rebuild-shields-up-auto-unlock

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


Comment @coderabbitai help to get the list of available commands and usage tips.

@chengjiew
Copy link
Copy Markdown
Contributor Author

Closing in favor of a signed-off replacement branch because repository rules disallow force-pushing to update this branch. Superseded by the next PR.

@chengjiew chengjiew closed this May 23, 2026
@github-actions
Copy link
Copy Markdown
Contributor

E2E Advisor Recommendation

Required E2E: rebuild-openclaw-e2e, shields-config-e2e
Optional E2E: rebuild-hermes-e2e, state-backup-restore-e2e

Dispatch hint: rebuild-openclaw-e2e,shields-config-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • rebuild-openclaw-e2e (high): Most direct existing live coverage for nemoclaw <sandbox> rebuild --yes: validates backup, destroy/recreate, state restore, policy preset preservation, credential stripping, and post-rebuild sandbox health for OpenClaw, the primary affected rebuild path.
  • shields-config-e2e (medium): Direct existing live coverage for shields up/down, config mutability/immutability, audit trail, and auto-restore timer behavior. The PR changes shields down/up internals and error handling used by rebuild.

Optional E2E

  • rebuild-hermes-e2e (high): Useful confidence because rebuildSandbox is agent-generic and the new shields relock/recovery behavior applies to Hermes sandboxes too. Optional because the diff does not add Hermes-specific rebuild logic.
  • state-backup-restore-e2e (high): Adjacent confidence for workspace backup/restore semantics that the new auto-unlock is intended to protect, but it does not specifically exercise rebuild or shields state.

New E2E recommendations

  • rebuild + shields integration (high): Existing E2E coverage has rebuild and shields lifecycle separately, but no live test appears to run shields up before nemoclaw <sandbox> rebuild --yes and then verify backup succeeds, state is restored, no detached auto-restore timer races the recreated sandbox, and shields are UP after rebuild.
    • Suggested test: Extend rebuild-openclaw-e2e or add a selective nightly job for: onboard OpenClaw, run nemoclaw <sandbox> shields up, write workspace marker, run nemoclaw <sandbox> rebuild --yes, verify marker survived, shields status reports locked/UP, config permissions are root/read-only, and policy is restrictive after rebuild.
  • Hermes rebuild + shields integration (medium): The same auto-unlock/relock code path applies to Hermes, but existing Hermes rebuild coverage likely does not validate locked shields before rebuild.
    • Suggested test: Add a Hermes variant that enables shields UP before rebuild-hermes-e2e, then verifies Hermes config/state survives and lockdown is restored after recreate.

Dispatch hint

  • Workflow: nightly-e2e.yaml
  • jobs input: rebuild-openclaw-e2e,shields-config-e2e

@github-actions
Copy link
Copy Markdown
Contributor

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • None. No scenario workflow, scenario metadata, scenario runtime, or validation-suite files changed.

Optional scenario E2E

  • None.

Relevant changed files

  • None.

@github-actions
Copy link
Copy Markdown
Contributor

PR Review Advisor

Findings: 3 needs attention, 2 worth checking, 0 nice ideas
Top item: Fail closed and test shields re-lock after rebuild

Review findings

🛠️ Needs attention

  • Rebuild can finish or fail known paths without enforcing the previous locked posture (src/lib/actions/sandbox/rebuild.ts:471): The linked issue requires rebuilding to restore the previous shields-UP lockdown. This patch attempts relock through relockShieldsIfNeeded(), but if shields.shieldsUp() fails after the sandbox is recreated, the helper only logs a warning and rebuild can still print the normal success message. The internal shieldsDown() call also uses skipTimer: true, so an unexpected exception/SIGINT during the auto-unlock window has no detached auto-restore safety net. For a security hardening feature, leaving a previously locked sandbox mutable should not be a best-effort warning on the success path.
    • Recommendation: Make the auto-unlock window fail closed: wrap the backup/recreate/restore path in a try/finally or equivalent guard, restore shields before reporting success whenever the sandbox exists, and return a non-zero/incomplete rebuild result if relock fails. Add tests that assert the final lock state is restored and that a relock failure does not report a clean rebuild.
    • Evidence: rebuild.ts sets const shieldsWereLocked = !shields.isShieldsDown(sandboxName) and calls shields.shieldsDown(..., { skipTimer: true, throwOnError: true }) around lines 443-451. relockShieldsIfNeeded() catches shields.shieldsUp() errors and only logs manual guidance around lines 471-489. The final success path calls relockShieldsIfNeeded(true) before printing success, but does not check whether relock succeeded.
  • Rebuild monolith grew further in a high-risk sandbox lifecycle path (src/lib/actions/sandbox/rebuild.ts): The rebuild action is already a large lifecycle monolith, and this PR adds 73 lines to it. Sandbox rebuild code controls backup, destroy, recreate, credential hydration, state restoration, policy restoration, and now shields unlock/relock behavior; increasing it makes security-critical ordering and failure handling harder to audit.
    • Recommendation: Extract the shields auto-unlock/relock orchestration into a focused helper with a small contract, or offset the growth by moving existing rebuild subflows out of the monolith before merge.
    • Evidence: Trusted monolith delta reports src/lib/actions/sandbox/rebuild.ts grew from 870 to 943 lines (+73), exceeding the 20-line threshold for a current monolith.
  • Shields monolith grew while adding internal control-flow options (src/lib/shields/index.ts): src/lib/shields/index.ts is already a broad security module covering policy snapshots, timers, lockdown, unlock, status, audit, and recovery. This PR adds internal skipTimer and throwOnError options inside that monolith, expanding security-sensitive branching without extracting timer/recovery concerns.
    • Recommendation: Move the new internal-programmatic shields behavior into smaller helpers or a typed service boundary, and keep the CLI-facing shields functions thin enough that timer, rollback, and error-mode behavior can be independently tested.
    • Evidence: Trusted monolith delta reports src/lib/shields/index.ts grew from 1353 to 1397 lines (+44), exceeding the 20-line threshold for a current monolith.

🔎 Worth checking

  • Regression tests check the unlock notice but not the final shields restoration contract (test/rebuild-shields-auto-unlock.test.ts:299): The new tests verify that locked shields are detected, that an unlock notice is printed, and that backup starts. They do not assert that shieldsUp() was invoked, that config/policy returned to locked state, that no unlocked state remains, or that relock failure changes the rebuild result. This leaves the most important acceptance and security invariant unprotected.
    • Recommendation: Extend the fixture to record lock-state transitions and assert the happy path ends locked. Add negative tests for shieldsDown failure before backup, total backup failure after auto-unlock, delete failure, recreate failure after destroy, and shieldsUp/relock failure after recreate.
    • Evidence: The locked-shields test assertions at lines 299-313 only check output strings such as "Shields are UP", "temporarily unlocking for rebuild backup", "Capturing current policy snapshot", and "Backing up sandbox state".
  • Changed files overlap active rebuild and shields work: The PR patches live files that still exist, but trusted drift evidence shows active overlapping PRs touching the same high-risk files. That increases the chance of policy/rebuild behavior drift or merge-order contradictions, especially around rebuild state preservation and shields policy handling.

🌱 Nice ideas

  • None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

ericksoa added a commit that referenced this pull request May 24, 2026
## Summary

Fixes #3113.

When `nemoclaw rebuild` runs while shields are UP, the sandbox state
backup can fail before the rebuild starts because protected state/config
paths are locked down. This PR temporarily lowers shields before the
backup, skips the detached auto-restore timer during that internal
rebuild unlock, and restores shields after the sandbox has been
recreated and state/policies are restored.

Supersedes #4129, which used the same patch but had an unsigned commit
that could not be force-updated due repository rules.

## Changes

- Detect locked shields before rebuild backup and call `shieldsDown()`
programmatically.
- Add internal `skipTimer` and `throwOnError` options to shields helpers
so rebuild can recover instead of exiting mid-flow.
- Re-apply shields after successful rebuild, and provide manual recovery
guidance if recreate fails after the old sandbox has been deleted.
- Add a regression test for the shields-UP rebuild path and the
shields-not-configured path.

## Verification

- `npm run build:cli`
- `npm test -- test/rebuild-shields-auto-unlock.test.ts
test/rebuild-shields-window.test.ts`
- `npm run typecheck:cli`
- `git diff --cached --check`

I also previously reproduced the original failure on macOS with the
pre-fix code and validated the auto-unlock flow locally. After rebasing
to latest `main`, a full real-sandbox rebuild sanity check is currently
blocked before backup by a local `COMPATIBLE_API_KEY` preflight
requirement, so the post-rebase evidence here is the targeted regression
test plus CLI build/typecheck.

Note: the local pre-push full CLI hook currently fails in
unrelated/environment-sensitive tests on this machine (temporary git
fixtures inherit repo hooks, version fallback expectations read the
current git version, and one TCP timing assertion is too fast locally).
I pushed with `--no-verify` after running the targeted verification
above.


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

* **New Features**
* Rebuilds can temporarily relax and re-apply sandbox security shields;
option to skip the detached auto-restore timer and an option to throw
errors instead of exiting.

* **Bug Fixes**
* Shields are now re-applied on multiple abort/failure paths to avoid
leaving sandboxes unprotected.

* **Improvements**
* Clearer operator messaging and explicit recovery instructions when
shield operations fail; rebuild aborts if re-locking fails.

* **Tests**
* New integration and unit tests covering auto-unlock, relock, and
recovery behaviors.

<!-- review_stack_entry_start -->

[![Review Change
Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/NVIDIA/NemoClaw/pull/4130?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>

---------

Signed-off-by: Chengjie Wang <chengjiew@nvidia.com>
Signed-off-by: Aaron Erickson <aerickson@nvidia.com>
Co-authored-by: Carlos Villela <cvillela@nvidia.com>
Co-authored-by: Aaron Erickson <aerickson@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[macOS][Sandbox] nemoclaw rebuild fails to back up state when shields are UP — should auto-unlock before backup

2 participants