NVIDIA · deepujain · May 27, 2026
diff --git a/.agents/skills/nemoclaw-user-configure-inference/SKILL.md b/.agents/skills/nemoclaw-user-configure-inference/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: "nemoclaw-user-configure-inference"
-description: "Connects NemoClaw to a local inference server. Use when setting up Ollama, vLLM, TensorRT-LLM, NIM, or any OpenAI-compatible local model server with NemoClaw. Trigger keywords - nemoclaw local inference, ollama nemoclaw, vllm nemoclaw, local model server, openai compatible endpoint, switch nemoclaw inference model, change inference runtime, nemoclaw additional model, nemoclaw sub-agent model, openclaw sub-agent, agents.list, sessions_spawn, vlm-demo, nemoclaw inference options, nemoclaw onboarding providers, nemoclaw inference routing, nemoclaw tool calling, ollama tool calls, vllm tool-call-parser, raw json in tui."
+description: "Connects NemoClaw to a local inference server. Use when setting up Ollama, vLLM, TensorRT-LLM, NIM, or any OpenAI-compatible local model server with NemoClaw. Trigger keywords - nemoclaw local inference, ollama nemoclaw, vllm nemoclaw, local model server, openai compatible endpoint, switch nemoclaw inference model, change inference runtime, nemoclaw additional model, nemoclaw sub-agent model, openclaw sub-agent, agents.list, sessions_spawn, vlm-demo, nemoclaw dgx spark local inference, nemoclaw dgx station vllm, nemoclaw spark ollama, nemoclaw cdi gpu setup, nemoclaw inference options, nemoclaw onboarding providers, nemoclaw inference routing, nemoclaw tool calling, ollama tool calls, vllm tool-call-parser, raw json in tui."
 license: "Apache-2.0"
 ---
 
@@ -453,11 +453,13 @@ If the provider itself needs to change (for example, switching from vLLM to a cl
 
 - **Load [references/switch-inference-providers.md](references/switch-inference-providers.md)** when switching inference providers, changing the model runtime, or reconfiguring inference routing. Changes the active inference model without restarting the sandbox.
 - **Load [references/set-up-sub-agent.md](references/set-up-sub-agent.md)** when users ask how to add a second model, configure a sub-agent model, use Omni for vision tasks, configure agents.list, or use sessions_spawn in NemoClaw. Shows the NemoClaw-specific file paths and update flow for adding an auxiliary OpenClaw sub-agent model.
+- **Load [references/dgx-spark-station-local-inference.md](references/dgx-spark-station-local-inference.md)** when preparing DGX hardware, choosing Ollama or managed vLLM, checking GPU/CDI prerequisites, verifying the OpenShell gateway and local inference route, or troubleshooting CoreDNS, k3s image pull, CDI, or port 3000 conflicts. Guides DGX Spark and DGX Station users through end-to-end local inference setup with NemoClaw.
 - **Load [references/inference-options.md](references/inference-options.md)** when explaining which providers are available, what the onboard wizard presents, or how inference routing works. Lists all inference providers offered during NemoClaw onboarding.
 - **[references/tool-calling-reliability.md](references/tool-calling-reliability.md)** — Explains Ollama tool-call leak symptoms, when vLLM with a tool-call parser is recommended, and how to repoint NemoClaw to a parser-aware local endpoint.
 
 ## Related Skills
 
+- [Set Up DGX Spark or DGX Station Local Inference](references/dgx-spark-station-local-inference.md) for an end-to-end DGX hardware walkthrough.
 - [Inference Options](references/inference-options.md) for the full list of providers available during onboarding.
 - [Tool-Calling Reliability](references/tool-calling-reliability.md) for diagnosing raw JSON tool-call output with local models.
 - [Switch Inference Models](references/switch-inference-providers.md) for runtime model switching.

diff --git a/...moclaw-user-configure-inference/references/dgx-spark-station-local-inference.md b/...moclaw-user-configure-inference/references/dgx-spark-station-local-inference.md
@@ -0,0 +1,159 @@
+<!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
+<!-- SPDX-License-Identifier: Apache-2.0 -->
+
+# Set Up DGX Spark or DGX Station Local Inference
+
+Use this guide when you want NemoClaw to run with local inference on DGX Spark or DGX Station.
+It pulls together the host checks, provider choice, onboarding flow, and the common Spark-specific failure modes that are otherwise spread across the quickstart, local inference, and troubleshooting pages.
+
+## Prerequisites
+
+Before onboarding, verify the host basics:
+
+- Docker is installed and running.
+- Node.js 22.16 or later and npm 10 or later are available.
+- The NVIDIA driver and container toolkit are installed.
+- `nvidia-smi` works on the host.
+- Port `3000` is free, or you are ready to choose a different dashboard port.
+
+Run:
+
+```bash
+docker info
+nvidia-smi
+node --version
+npm --version
+```
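+
+The version minimums in the list above can also be checked by a script rather than by eye.
+The following is a minimal sketch, not part of NemoClaw: `version_ge` and `check_tool` are hypothetical helper names, and the comparison assumes GNU `sort -V` is available on the host.
+
+```bash
+#!/usr/bin/env bash
+# Preflight sketch: compare installed versions against the minimums listed above.
+
+# version_ge CURRENT MINIMUM: succeeds when CURRENT >= MINIMUM (dotted numeric versions).
+version_ge() {
+  local current="${1#v}" minimum="${2#v}"
+  # sort -V orders version strings; if MINIMUM sorts first, CURRENT is new enough.
+  [ "$(printf '%s\n%s\n' "$minimum" "$current" | sort -V | head -n 1)" = "$minimum" ]
+}
+
+# check_tool NAME MINIMUM: reports whether NAME is installed and new enough.
+check_tool() {
+  local name="$1" minimum="$2" found
+  if ! found="$("$name" --version 2>/dev/null)"; then
+    echo "MISSING: $name"
+    return 1
+  fi
+  if version_ge "$found" "$minimum"; then
+    echo "OK: $name $found"
+  else
+    echo "TOO OLD: $name $found (need $minimum or later)"
+    return 1
+  fi
+}
+
+check_tool node 22.16
+check_tool npm 10
+```
+
+The script only reports; it does not install or upgrade anything, so it is safe to rerun after each fix.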
+
+DGX Spark and recent Docker installations can require NVIDIA Container Device Interface (CDI) specs for GPU passthrough.
+NemoClaw checks and repairs the common missing-CDI case during install, but you can pre-generate the spec when needed:
+
+```bash
+sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
+```
+
+If this command is unavailable, install or repair the NVIDIA Container Toolkit before onboarding.
+
+## Choose a Local Inference Path
+
+DGX Spark and DGX Station have two common local-inference paths.
+
+| Path | Best for | Notes |
+|---|---|---|
+| Managed vLLM | Tool-heavy agents, stronger tool-call reliability, larger GPU-backed models | Offered by default on DGX Spark and DGX Station. Uses `Qwen/Qwen3.6-27B-FP8` unless you override the registry slug. |
+| Ollama | Simpler local chat, existing Ollama model libraries, quick experiments | Convenient, but some model/template combinations can emit tool calls as plain text. Use vLLM when tool-call reliability matters. |
+
+For managed vLLM, the first run pulls the container image and model weights into local caches.
+Plan for a long first run on fresh systems.
+
+For Ollama, make sure only one daemon owns port `11434`.
+If another runtime is already using that port, stop it or move one service before onboarding.
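+
+One way to see whether something is already listening on the Ollama port is to count matching sockets in `ss -ltn` output.
+This is a sketch under assumptions: `count_listeners` is a hypothetical helper, and it relies on the usual `ss` column layout in which the local address is the fourth column.
+
+```bash
+#!/usr/bin/env bash
+# count_listeners PORT: reads `ss -ltn` style output on stdin and prints how many
+# listening sockets are bound to PORT.
+count_listeners() {
+  awk -v port="$1" '
+    $1 == "LISTEN" {
+      n = split($4, parts, ":")   # the port follows the last colon of the local address
+      if (parts[n] == port) count++
+    }
+    END { print count + 0 }
+  '
+}
+
+# Only query the host when ss is installed.
+if command -v ss >/dev/null 2>&1; then
+  ss -ltn | count_listeners 11434
+fi
+```
+
+A count above zero means the port is taken, although a single daemon can hold both an IPv4 and an IPv6 socket.
+Running `sudo ss -ltnp` shows which process owns each socket.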
+
+## Run Onboarding
+
+Start the standard onboard wizard:
+
+```bash
+nemoclaw onboard
+```
+
+On DGX Spark and DGX Station, the interactive wizard prompts for the provider and policy choices after the third-party software notice.
+Choose the local-inference path and review the suggested policy defaults before NemoClaw creates the sandbox.
+
+If you prefer to choose manually:
+
+1. Select the local provider you want: **Local vLLM** or **Local Ollama**.
+2. For managed vLLM, accept the default model or set `NEMOCLAW_VLLM_MODEL` before running onboarding.
+3. For Ollama, choose an installed model or a starter model that fits available memory.
+4. Let NemoClaw validate the local endpoint before it creates the sandbox.
+
+For non-interactive managed vLLM setup on DGX Spark or DGX Station:
+
+```bash
+NEMOCLAW_PROVIDER=install-vllm nemoclaw onboard --non-interactive --yes --yes-i-accept-third-party-software
+```
+
+To choose a supported managed-vLLM model:
+
+```bash
+NEMOCLAW_PROVIDER=install-vllm \
+NEMOCLAW_VLLM_MODEL=qwen3.6-27b \
+nemoclaw onboard --non-interactive --yes --yes-i-accept-third-party-software
+```
+
+Supported managed-vLLM slugs are listed in [Use a Local Inference Server](../SKILL.md#override-the-managed-vllm-model).
+
+## Verify the Setup
+
+After onboarding completes, check the sandbox and local inference route:
+
+```bash
+nemoclaw <sandbox-name> status
+nemoclaw <sandbox-name> doctor
+```
+
+Healthy output should show:
+
+- The sandbox is running.
+- The dashboard is reachable.
+- The selected inference provider is healthy.
+- For Ollama, the authenticated proxy health check reports healthy when the proxy token is available.
+
+Open the TUI:
+
+```bash
+# Open a shell inside the sandbox.
+nemoclaw <sandbox-name> connect
+# From the sandbox shell, start the TUI.
+openclaw tui
+```
+
+Ask for a small tool-using action.
+If you see raw JSON tool calls printed as chat text, switch to vLLM with a parser-aware model path and review [Tool-Calling Reliability](tool-calling-reliability.md).
+
+## Common DGX Spark and Station Fixes
+
+### CoreDNS CrashLoop
+
+If CoreDNS in the embedded k3s cluster crashes shortly after setup, run the CoreDNS fix script referenced by the troubleshooting guide, then recreate the sandbox.
+The issue is usually a resolver path that points at `127.0.0.11`, which does not route inside the gateway container.
+
+### k3s Image Pull or Upload Takes Too Long
+
+Fresh systems may spend several minutes pulling images, uploading layers to the OpenShell gateway, or loading model weights.
+If readiness times out while the host is still pulling images, uploading layers, or loading weights, raise both the local inference and sandbox readiness budgets:
+
+```bash
+export NEMOCLAW_LOCAL_INFERENCE_TIMEOUT=300
+export NEMOCLAW_SANDBOX_READY_TIMEOUT=600
+nemoclaw onboard
+```
+
+### CDI GPU Errors
+
+If gateway startup reports `unresolvable CDI devices nvidia.com/gpu=all`, regenerate CDI specs and rerun onboarding:
+
+```bash
+sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
+nemoclaw onboard
+```
+
+If the error persists, repair the NVIDIA Container Toolkit installation and verify that `docker info` reports the expected CDI spec directories.
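+
+The check described above can be scripted.
+This sketch assumes that `docker info` prints each CDI spec directory on its own line and that `/etc/cdi` is the expected directory, matching the `--output` path used earlier; `has_cdi_dir` is a hypothetical helper name.
+
+```bash
+#!/usr/bin/env bash
+# has_cdi_dir DIR: reads `docker info` text on stdin and succeeds when DIR
+# appears on a line of its own.
+has_cdi_dir() {
+  grep -qE "^[[:space:]]*${1}[[:space:]]*\$"
+}
+
+# List the CDI devices the toolkit can see, when the toolkit is installed.
+if command -v nvidia-ctk >/dev/null 2>&1; then
+  nvidia-ctk cdi list
+fi
+
+# Report whether Docker knows about the directory that holds the generated spec.
+if command -v docker >/dev/null 2>&1; then
+  if docker info 2>/dev/null | has_cdi_dir /etc/cdi; then
+    echo "docker info lists /etc/cdi"
+  else
+    echo "docker info does not list /etc/cdi"
+  fi
+fi
+```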
+
+### Port 3000 Conflict
+
+Some Spark systems already run services on port `3000`.
+Set a different dashboard port before onboarding:
+
+```bash
+export NEMOCLAW_DASHBOARD_PORT=18789
+nemoclaw onboard
+```
+
+Use a free port that does not overlap the configured gateway, vLLM, Ollama, or Ollama proxy ports.
+
+## Next Steps
+
+- [Use a Local Inference Server](../SKILL.md) for full Ollama, vLLM, NIM, and compatible-endpoint details.
+- [Tool-Calling Reliability](tool-calling-reliability.md) for choosing between Ollama and parser-aware vLLM.
+- Troubleshooting (use the `nemoclaw-user-reference` skill) for deeper DGX Spark failure-mode guidance.
diff --git a/.agents/skills/nemoclaw-user-configure-inference/references/inference-options.md b/.agents/skills/nemoclaw-user-configure-inference/references/inference-options.md
@@ -74,7 +74,46 @@ When you select it, NemoClaw starts the router proxy on the host, waits for its
 The sandbox does not call the router port directly.
 
 The router model pool lives in `nemoclaw-blueprint/router/pool-config.yaml`.
+Edit that file to define which models the router can choose from.
 The default pool routes between NVIDIA-hosted Nemotron models and uses the `tolerance` value to choose the lowest-cost model whose predicted quality stays within the configured threshold.
+
+```yaml
+routing:
+  method: prefill
+  checkpoint: llm-router/checkpoints/prefill_router_qwen08b.pt
+  tolerance: 0.20
+  encoder: Qwen/Qwen3.5-0.8B
+
+models:
+  - name: nano
+    litellm_model: "openai/nvidia/nvidia/Nemotron-3-Nano-30B-A3B"
+    cost_per_m_input_tokens: 0.05
+    api_base: "https://inference-api.nvidia.com"
+
+  - name: super
+    litellm_model: "openai/nvidia/nvidia/nemotron-3-super-v3"
+    cost_per_m_input_tokens: 0.10
+    api_base: "https://inference-api.nvidia.com"
+```
+
+The `tolerance` parameter controls the accuracy-cost tradeoff.
+
+| Value | Behavior |
+|-------|----------|
+| `0.0` | Always pick the most accurate model. |
+| `0.20` | Allow a cheaper model whose predicted quality is up to 20 percentage points below the best (default). |
+| `1.0` | Always pick the cheapest model. |
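+
+The selection rule in the table can be illustrated with a small worked example.
+This is a sketch of the rule as described here, not the router's actual code: `pick_model` is a hypothetical helper, and the predicted-quality scores are made-up values on a 0 to 1 scale.
+Only the two costs come from the pool configuration above.
+
+```bash
+#!/usr/bin/env bash
+# pick_model TOLERANCE: reads "name cost predicted_quality" rows on stdin and prints
+# the cheapest model whose quality is within TOLERANCE of the best predicted quality.
+pick_model() {
+  awk -v tol="$1" '
+    {
+      name[NR] = $1; cost[NR] = $2 + 0; quality[NR] = $3 + 0
+      if (NR == 1 || quality[NR] > best) best = quality[NR]
+    }
+    END {
+      for (i = 1; i <= NR; i++) {
+        if (quality[i] + tol < best - 1e-9) continue   # too far below the best
+        if (pick == 0 || cost[i] < cost[pick]) pick = i
+      }
+      if (pick) print name[pick]
+    }
+  '
+}
+
+# Hypothetical scores for one prompt: nano is cheaper, super is predicted to be better.
+scores='nano 0.05 0.70
+super 0.10 0.85'
+
+printf '%s\n' "$scores" | pick_model 0.20   # nano: 0.70 is within 0.20 of 0.85
+printf '%s\n' "$scores" | pick_model 0.0    # super: only the best model qualifies
+printf '%s\n' "$scores" | pick_model 1.0    # nano: every model qualifies, so the cheapest wins
+```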
+
+The router runs on the host, not inside the sandbox.
+
+```text
+Sandbox (agent) ──> OpenShell Gateway (L7 proxy) ──> Model Router (:4000) ──> NVIDIA API
+                                                         └── PrefillRouter selects model
+```
+
+Credentials flow through the OpenShell provider system.
+The sandbox never sees raw API keys.
+
 To use the router in scripted setup, set:
 
 ```console

diff --git a/.agents/skills/nemoclaw-user-configure-security/references/best-practices.md b/.agents/skills/nemoclaw-user-configure-security/references/best-practices.md
@@ -184,6 +184,15 @@ For sensitive workloads, use a reviewed host-side immutability workflow after in
 
 - **DAC permissions (default).** The sandbox user owns `/sandbox/.openclaw` with mode `2770` (setgid `sandbox:sandbox`) and `openclaw.json` with mode `660`, so the agent and its group can read and write config directly. A reviewed host-side immutability workflow should compare the intended ownership and mode with the live sandbox filesystem before treating the config tree as locked.
 - **Config integrity hash.** The image includes a SHA256 hash of `openclaw.json`. In the default mutable state, `.config-hash` is sandbox-owned and is not a tamper-proof trust anchor, so startup does not fail closed on that hash. When the hash is root-owned and read-only, startup enforces it and refuses to start if the hash does not match.
+- **Content seal under shields up.**
+  When `nemoclaw <name> shields up` runs against a clean lock, it captures a SHA-256 seal of `openclaw.json` and any other locked files into the host-side shields state file.
+  On sealed sandboxes, every `shields status` call recomputes the hash inside the sandbox and reports drift on any mismatch, so a host-root tamper that rewrites the file and then resets permissions to `444 root:root` is still flagged.
+  Sandboxes locked before this seal landed have no recorded hash, and permission-only verification cannot prove that their bytes match the image original, so the seal is **not** a retroactive proof of integrity for legacy state.
+  The same limitation applies to partial seals, where the locked file set grew after the existing seal was captured (some entries sealed, some missing).
+  By default, `shields up` refuses to seal in either case and asks you to rebuild the sandbox first for a known-good baseline.
+  `shields status` on a legacy lockdown surfaces `UP (UNSEALED — content integrity unknown for legacy lockdown)` and exits with status 2 so scripts treat it as a failure until the operator seals an explicit baseline.
+  If you explicitly trust the current bytes, opt in via `NEMOCLAW_SHIELDS_ACCEPT_LEGACY_BASELINE=1`, which captures a seal over the current files and is acknowledged in the log line.
+  Once a sandbox is sealed, `shields up` refuses to re-seal a tampered baseline; restore the original file or rebuild the sandbox before re-running.
 - **Gateway token environment.** The gateway exports `OPENCLAW_GATEWAY_TOKEN` and writes it to `/tmp/nemoclaw-proxy-env.sh` for interactive sandbox sessions. Keep this in mind when deciding whether a workload should run with mutable config or an immutable config posture.
 
 | Aspect | Detail |

diff --git a/.agents/skills/nemoclaw-user-get-started/SKILL.md b/.agents/skills/nemoclaw-user-get-started/SKILL.md
@@ -74,6 +74,7 @@ $ curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
 On DGX Spark, DGX Station, and Windows WSL, an interactive installer offers express install after you accept the third-party software notice.
 Express install switches onboarding to non-interactive mode, allows `sudo` password prompts for required host changes, and selects the managed local inference path for that platform.
 Unless `NEMOCLAW_POLICY_TIER` is set, it applies sandbox policy in `suggested` mode with the `balanced` tier by default, using the base sandbox policy plus supported package, model, web-search, and local-inference presets.
+On DGX Spark, express install uses `my-spark-assistant` as the sandbox name unless `NEMOCLAW_SANDBOX_NAME` is already set.
 On WSL, express install selects the Windows-host Ollama setup path.
 Set `NEMOCLAW_NO_EXPRESS=1` to skip the express prompt, or set `NEMOCLAW_PROVIDER` before launching the installer when you want to choose a provider yourself.
 

diff --git a/.agents/skills/nemoclaw-user-get-started/references/prerequisites.md b/.agents/skills/nemoclaw-user-get-started/references/prerequisites.md
@@ -60,7 +60,7 @@ The table is generated from [`ci/platform-matrix.json`](https://github.com/NVIDI
 |----|-------------------|--------|-------|
 | Linux | Docker | Tested | Primary tested path. |
 | macOS (Apple Silicon) | Colima, Docker Desktop | Tested with limitations | Install Xcode Command Line Tools (`xcode-select --install`) and start the runtime before running the installer. |
-| DGX Spark | Docker | Tested | Use the standard installer and `nemoclaw onboard`. For an end-to-end walkthrough with local Ollama inference, see the [NVIDIA Spark playbook](https://build.nvidia.com/spark/nemoclaw). |
+| DGX Spark | Docker | Tested | Use the standard installer and `nemoclaw onboard`. For local inference, see Set Up DGX Spark or DGX Station Local Inference (use the `nemoclaw-user-configure-inference` skill). |
 | Windows WSL2 | Docker Desktop (WSL backend) | Tested with limitations | Requires WSL2 with Docker Desktop backend. |
 
 ## Next Steps

diff --git a/.agents/skills/nemoclaw-user-get-started/references/quickstart-hermes.md b/.agents/skills/nemoclaw-user-get-started/references/quickstart-hermes.md
@@ -5,11 +5,6 @@
 Use NemoHermes when you want NemoClaw to create an OpenShell sandbox that runs Hermes instead of the default OpenClaw agent.
 The `nemohermes` command is an alias for `nemoclaw` with the Hermes agent pre-selected.
 
-**Experimental Feature:**
-
-The Hermes agent option is experimental.
-Interfaces, defaults, and supported features may change without notice, and it is not recommended for production use.
-
 Review the [Prerequisites](prerequisites.md) before starting.
 Docker must be installed, running, and reachable from the current shell before Hermes onboarding can build the sandbox image.
 On Linux, the installer can install Docker, start the service, and add your user to the `docker` group.

diff --git a/.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md b/.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md
@@ -187,7 +187,7 @@ Re-run the installer.
 Before it onboards anything, the installer calls `nemoclaw backup-all` (use the `nemoclaw-user-reference` skill) automatically, storing a snapshot of each running sandbox in `~/.nemoclaw/rebuild-backups/` as a safety net.
 If your existing gateway is from OpenShell earlier than `0.0.37`, the installer prompts before it runs the new automatic gateway upgrade path.
 The automatic path is offered only when the existing `nemoclaw` CLI supports `backup-all`; older installs must preserve sandbox state manually before retiring the gateway.
-For unattended installs, set `NEMOCLAW_ACCEPT_EXPERIMENTAL_OPENSHELL_UPGRADE=1`, or manually run `nemoclaw backup-all` and `openshell gateway destroy -g nemoclaw || openshell gateway destroy` before rerunning the installer as `curl -fsSL https://www.nvidia.com/nemoclaw.sh | NEMOCLAW_OPENSHELL_UPGRADE_PREPARED=1 bash`.
+For unattended installs, set `NEMOCLAW_ACCEPT_EXPERIMENTAL_OPENSHELL_UPGRADE=1`, or prepare manually: run `nemoclaw backup-all`, run `openshell gateway remove nemoclaw || openshell gateway destroy -g nemoclaw || openshell gateway destroy` (both verbs are tried so the right one runs on either OpenShell release), run `sudo pkill -f openshell-gateway` if a privileged host gateway remains, and then rerun the installer as `curl -fsSL https://www.nvidia.com/nemoclaw.sh | NEMOCLAW_OPENSHELL_UPGRADE_PREPARED=1 bash`.
 
 ```console
 $ curl -fsSL https://www.nvidia.com/nemoclaw.sh | bash
@@ -255,6 +255,14 @@ nemoclaw uninstall
 | `--keep-openshell` | Leave OpenShell binaries installed.                  |
 | `--delete-models`  | Also remove NemoClaw-pulled Ollama models.           |
 
+**Note:**
+
+`nemoclaw uninstall` preserves `~/.nemoclaw/rebuild-backups/` (host-side snapshots that `nemoclaw <name> snapshot create` and `nemoclaw backup-all` write), `~/.nemoclaw/backups/` (workspace backups that `scripts/backup-workspace.sh` writes), and `~/.nemoclaw/sandboxes.json` (the sandbox registry) by default.
+Uninstall removes every other entry under `~/.nemoclaw/`.
+Interactive runs prompt before they remove the preserved entries; the default answer keeps them.
+For non-interactive runs (`--yes`, `NEMOCLAW_NON_INTERACTIVE=1`, or a non-TTY shell), set `NEMOCLAW_UNINSTALL_DESTROY_USER_DATA=1` to acknowledge data loss and remove the preserved entries as well.
+See `nemoclaw uninstall` (use the `nemoclaw-user-reference` skill) for the full preservation contract.
+
 `nemoclaw uninstall` runs the version-pinned `uninstall.sh` that shipped with your installed CLI, so it does not fetch anything over the network at uninstall time.
 
 If the `nemoclaw` CLI is missing or broken, fall back to the hosted script: