diff --git a/.agents/skills/nemoclaw-user-agent-skills/SKILL.md b/.agents/skills/nemoclaw-user-agent-skills/SKILL.md
index 15e5211e28..1d87704186 100644
--- a/.agents/skills/nemoclaw-user-agent-skills/SKILL.md
+++ b/.agents/skills/nemoclaw-user-agent-skills/SKILL.md
@@ -9,6 +9,84 @@ license: "Apache-2.0"
 
 # NemoClaw Agent Skills for Your AI Coding Assistant
 
-## References
+NemoClaw ships agent skills that are generated directly from this documentation.
+Each skill is a converted version of one or more doc pages, structured so AI coding assistants can consume it as context.
+This means you can interact with the full NemoClaw documentation as skills inside your agent chat session, instead of reading the docs separately.
 
-- **Load [references/agent-skills.md](references/agent-skills.md)** when users ask about AI agent support, coding assistant integration, or the .agents/skills/ directory. Describes the agent skills shipped with NemoClaw and how to access them by cloning the repository.
+Ask your assistant a question about NemoClaw and it responds with the same guidance found in these docs, adapted to your current situation.
+Skills cover installation, inference configuration, network policy management, monitoring, deployment, security, workspace management, and the CLI reference.
+
+**Note:**
+
+If you are a contributor and have cloned the full NemoClaw repository, the full set of skills, including contributor and maintainer skills, is already available at the project root.
+Open the `NemoClaw` directory in your coding assistant and the skills load automatically.
+This page is for users who installed NemoClaw with the installer and do not have a local clone.
+
+## Get the Skills
+
+Fetch only the skills from the NemoClaw repository without downloading the full source tree.
+
+```console
+$ git clone --filter=blob:none --no-checkout https://github.com/NVIDIA/NemoClaw.git
+$ cd NemoClaw
+$ git sparse-checkout set --no-cone '/.agents/skills/nemoclaw-user-*/**' '/.agents/skills/nemoclaw-skills-guide/**' '/.claude/**' '/AGENTS.md' '/CLAUDE.md'
+$ git checkout
+```
+
+Open the `NemoClaw` directory in your AI coding assistant.
+The assistant discovers the skills in `.agents/skills/` and uses them to answer NemoClaw questions with project-specific guidance.
+
+You can keep the skills inside the cloned directory or copy `.agents/skills/` to a global location (such as `~/.cursor/skills/` or `~/.claude/skills/`) so they are available across all your projects.
+The choice depends on whether you want NemoClaw skills scoped to one workspace or accessible everywhere.
+
+## Update the Skills
+
+The sparse checkout filter is saved, so `git pull` fetches only updated skills without downloading the full source tree.
+Run `git pull` after each NemoClaw release to pick up new and updated skills.
+
+## Available Skills
+
+The following user skills ship with NemoClaw.
+
+| Skill | Summary |
+|-------|---------|
+| `nemoclaw-user-overview` | What NemoClaw is, ecosystem placement (OpenClaw + OpenShell + NemoClaw), how it works internally, and release notes. |
+| `nemoclaw-user-get-started` | Install NemoClaw, launch a sandbox, and run the first agent prompt. |
+| `nemoclaw-user-configure-inference` | Choose inference providers during onboarding, switch models without restarting, and set up local inference servers (Ollama, vLLM, TensorRT-LLM, NIM). |
+| `nemoclaw-user-manage-policy` | Approve or deny blocked egress requests in the TUI and customize the sandbox network policy (add, remove, or modify allowed endpoints). |
+| `nemoclaw-user-monitor-sandbox` | Check sandbox health, read logs, and trace agent behavior to diagnose problems. |
+| `nemoclaw-user-deploy-remote` | Deploy NemoClaw to a remote GPU instance, set up the Telegram bridge, and review sandbox container hardening. |
+| `nemoclaw-user-configure-security` | Review the risk framework for every configurable security control, understand credential storage, and assess posture trade-offs. |
+| `nemoclaw-user-manage-sandboxes` | Manage day-two sandbox operations, including status, logs, diagnostics, rebuilds, upgrades, messaging channels, workspace files, backup, and restore. |
+| `nemoclaw-user-reference` | CLI command reference, plugin and blueprint architecture, baseline network policies, and troubleshooting guide. |
+
+## Example Questions and Triggered Skills
+
+After opening the cloned repository in your coding assistant, ask a NemoClaw question in natural language.
+The assistant matches your question to the relevant skill and follows the guidance it contains.
+
+Examples of questions your assistant can answer with these skills:
+
+| Question | Skill triggered |
+|----------|-----------------|
+| "How do I install NemoClaw?" | `nemoclaw-user-get-started` |
+| "Switch my inference provider to Ollama." | `nemoclaw-user-configure-inference` |
+| "A network request was blocked. How do I approve it?" | `nemoclaw-user-manage-policy` |
+| "Show me the sandbox logs." | `nemoclaw-user-monitor-sandbox` |
+| "How do I deploy NemoClaw to a remote GPU?" | `nemoclaw-user-deploy-remote` |
+| "What security controls can I configure?" | `nemoclaw-user-configure-security` |
+| "Back up my agent workspace files." | `nemoclaw-user-manage-sandboxes` |
+| "What CLI commands are available?" | `nemoclaw-user-reference` |
+
+You can also reference a skill directly by name if you know which one you need.
+
+## AI Coding Assistants that You Can Use with NemoClaw Skills
+
+The NemoClaw agent skills follow the [Agent Skills best practices](https://agentskills.io/skill-creation/best-practices) and the [Claude Skills best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices).
+The following table shows how each AI coding assistant can use the NemoClaw skills.
+
+| Assistant | Skill discovery |
+|-----------|----------------|
+| Cursor | Reads `AGENTS.md` at the project root, which references `.agents/skills/`. |
+| Claude Code | Follows the `.claude/skills/` symlink, which points to `.agents/skills/`. |
+| Other assistants | Point the assistant to `.agents/skills/` if it supports project-level skill loading. |
diff --git a/.agents/skills/nemoclaw-user-agent-skills/references/agent-skills.md b/.agents/skills/nemoclaw-user-agent-skills/references/agent-skills.md
deleted file mode 100644
index 04543e62d1..0000000000
--- a/.agents/skills/nemoclaw-user-agent-skills/references/agent-skills.md
+++ /dev/null
@@ -1,85 +0,0 @@
-<!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
-<!-- SPDX-License-Identifier: Apache-2.0 -->
-# NemoClaw Agent Skills for Your AI Coding Assistant
-
-NemoClaw ships agent skills that are generated directly from this documentation.
-Each skill is a converted version of one or more doc pages, structured so AI coding assistants can consume it as context.
-This means you can interact with the full NemoClaw documentation as skills inside your agent chat session, instead of reading the docs separately.
-
-Ask your assistant a question about NemoClaw and it responds with the same guidance found in these docs, adapted to your current situation.
-Skills cover installation, inference configuration, network policy management, monitoring, deployment, security, workspace management, and the CLI reference.
-
-**Note:**
-
-If you are a contributor and have cloned the full NemoClaw repository, the full set of skills including contributor and maintainer skills are already available at the project root.
-Open the `NemoClaw` directory in your coding assistant and the skills load automatically.
-This page is for users who installed NemoClaw with the installer and do not have a local clone.
-
-## Get the Skills
-
-Fetch only the skills from the NemoClaw repository without downloading the full source tree.
-
-```console
-$ git clone --filter=blob:none --no-checkout https://github.com/NVIDIA/NemoClaw.git
-$ cd NemoClaw
-$ git sparse-checkout set --no-cone '/.agents/skills/nemoclaw-user-*/**' '/.agents/skills/nemoclaw-skills-guide/**' '/.claude/**' '/AGENTS.md' '/CLAUDE.md'
-$ git checkout
-```
-
-Open the `NemoClaw` directory in your AI coding assistant.
-The assistant discovers the skills in `.agents/skills/` and uses them to answer NemoClaw questions with project-specific guidance.
-
-You can keep the skills inside the cloned directory or copy `.agents/skills/` to a global location (such as `~/.cursor/skills/` or `~/.claude/skills/`) so they are available across all your projects.
-The choice depends on whether you want NemoClaw skills scoped to one workspace or accessible everywhere.
-
-## Update the Skills
-
-The sparse checkout filter is saved, so `git pull` fetches only updated skills without downloading the full source tree.
-Run `git pull` after each NemoClaw release to pick up new and updated skills.
-
-## Available Skills
-
-The following user skills ship with NemoClaw.
-
-| Skill | Summary |
-|-------|---------|
-| `nemoclaw-user-overview` | What NemoClaw is, ecosystem placement (OpenClaw + OpenShell + NemoClaw), how it works internally, and release notes. |
-| `nemoclaw-user-get-started` | Install NemoClaw, launch a sandbox, and run the first agent prompt. |
-| `nemoclaw-user-configure-inference` | Choose inference providers during onboarding, switch models without restarting, and set up local inference servers (Ollama, vLLM, TensorRT-LLM, NIM). |
-| `nemoclaw-user-manage-policy` | Approve or deny blocked egress requests in the TUI and customize the sandbox network policy (add, remove, or modify allowed endpoints). |
-| `nemoclaw-user-monitor-sandbox` | Check sandbox health, read logs, and trace agent behavior to diagnose problems. |
-| `nemoclaw-user-deploy-remote` | Deploy NemoClaw to a remote GPU instance, set up the Telegram bridge, and review sandbox container hardening. |
-| `nemoclaw-user-configure-security` | Review the risk framework for every configurable security control, understand credential storage, and assess posture trade-offs. |
-| `nemoclaw-user-manage-sandboxes` | Manage day-two sandbox operations, including status, logs, diagnostics, rebuilds, upgrades, messaging channels, workspace files, backup, and restore. |
-| `nemoclaw-user-reference` | CLI command reference, plugin and blueprint architecture, baseline network policies, and troubleshooting guide. |
-
-## Example Questions and Triggered Skills
-
-After opening the cloned repository in your coding assistant, ask a NemoClaw question in natural language.
-The assistant matches your question to the relevant skill and follows the guidance it contains.
-
-Examples of questions your assistant can answer with these skills:
-
-| Question | Skill triggered |
-|----------|-----------------|
-| "How do I install NemoClaw?" | `nemoclaw-user-get-started` |
-| "Switch my inference provider to Ollama." | `nemoclaw-user-configure-inference` |
-| "A network request was blocked. How do I approve it?" | `nemoclaw-user-manage-policy` |
-| "Show me the sandbox logs." | `nemoclaw-user-monitor-sandbox` |
-| "How do I deploy NemoClaw to a remote GPU?" | `nemoclaw-user-deploy-remote` |
-| "What security controls can I configure?" | `nemoclaw-user-configure-security` |
-| "Back up my agent workspace files." | `nemoclaw-user-manage-sandboxes` |
-| "What CLI commands are available?" | `nemoclaw-user-reference` |
-
-You can also reference a skill directly by name if you know which one you need.
-
-## AI Coding Assistants that You Can Use with NemoClaw Skills
-
-The NemoClaw agent skills follow the [Agent Skills best practices](https://agentskills.io/skill-creation/best-practices) and the [Claude Skills best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices).
-The following table shows how each AI coding assistant can use the NemoClaw skills.
-
-| Assistant | Skill discovery |
-|-----------|----------------|
-| Cursor | Reads `AGENTS.md` at the project root, which references `.agents/skills/`. |
-| Claude Code | Follows the `.claude/skills/` symlink, which points to `.agents/skills/`. |
-| Other assistants | Point the assistant to `.agents/skills/` if it supports project-level skill loading. |
diff --git a/.agents/skills/nemoclaw-user-configure-inference/SKILL.md b/.agents/skills/nemoclaw-user-configure-inference/SKILL.md
index b3462e029a..af6255ceda 100644
--- a/.agents/skills/nemoclaw-user-configure-inference/SKILL.md
+++ b/.agents/skills/nemoclaw-user-configure-inference/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: "nemoclaw-user-configure-inference"
-description: "Connects NemoClaw to a local inference server. Use when setting up Ollama, vLLM, TensorRT-LLM, NIM, or any OpenAI-compatible local model server with NemoClaw. Trigger keywords - nemoclaw local inference, ollama nemoclaw, vllm nemoclaw, local model server, openai compatible endpoint, switch nemoclaw inference model, change inference runtime, nemoclaw additional model, nemoclaw sub-agent model, openclaw sub-agent, agents.list, sessions_spawn, vlm-demo, nemoclaw tool calling, ollama tool calls, vllm tool-call-parser, raw json in tui, nemoclaw inference options, nemoclaw onboarding providers, nemoclaw inference routing."
+description: "Connects NemoClaw to a local inference server. Use when setting up Ollama, vLLM, TensorRT-LLM, NIM, or any OpenAI-compatible local model server with NemoClaw. Trigger keywords - nemoclaw local inference, ollama nemoclaw, vllm nemoclaw, local model server, openai compatible endpoint, switch nemoclaw inference model, change inference runtime, nemoclaw additional model, nemoclaw sub-agent model, openclaw sub-agent, agents.list, sessions_spawn, vlm-demo, nemoclaw inference options, nemoclaw onboarding providers, nemoclaw inference routing, nemoclaw tool calling, ollama tool calls, vllm tool-call-parser, raw json in tui."
 license: "Apache-2.0"
 ---
 
@@ -109,6 +109,52 @@ JSON such as `{"name":"memory_search","arguments":{...}}` instead of running a
 tool, switch to vLLM with `--enable-auto-tool-choice` and the correct
 `--tool-call-parser`. See [Tool-Calling Reliability](references/tool-calling-reliability.md).
 
+### Authenticated Reverse Proxy
+
+On non-WSL hosts, NemoClaw keeps Ollama bound to `127.0.0.1:11434` and starts a token-gated reverse proxy on `0.0.0.0:11435`.
+The native install/start paths also reset NemoClaw-managed systemd launches to the loopback binding.
+Containers and other hosts on the local network reach Ollama only through the
+proxy, which validates a Bearer token before forwarding requests.
+On that native path, NemoClaw never exposes Ollama without authentication.
+
+WSL Ollama paths do not use this proxy.
+Windows-host Ollama uses the Windows daemon through `host.docker.internal`.
+
+For non-WSL Ollama setups, the onboard wizard manages the proxy automatically:
+
+- Generates a random 24-byte token on first run and stores it in
+  `~/.nemoclaw/ollama-proxy-token` with `0600` permissions.
+- Starts the proxy after Ollama and verifies it before continuing.
+- Cleans up stale proxy processes from previous runs.
+- Probes the sandbox Docker network path to the proxy before committing the inference route.
+- Stops matching proxy processes during uninstall before deleting NemoClaw state.
+- Reuses the persisted token after a host reboot so you do not need to re-run
+  onboard.
+
+On native Linux hosts, a firewall can allow the host proxy health check while still blocking sandbox containers on the OpenShell Docker bridge.
+When the sandbox-side proxy probe fails with a TCP error, onboarding exits before it saves the inference route and prints a command like:
+
+```console
+$ sudo ufw allow from <openshell-docker-subnet> to any port 11435 proto tcp
+$ nemoclaw onboard
+```
+
+If the probe cannot run, for example because Docker Desktop or WSL uses a different host routing model, onboarding continues and relies on the regular proxy health check.
+
+The sandbox provider is configured to use proxy port `11435` with the generated
+token as its `OPENAI_API_KEY` credential.
+OpenShell's L7 proxy injects the token at egress, so the agent inside the
+sandbox never sees the token directly.
+
+All proxy endpoints require the Bearer token, including `GET /api/tags`.
+Internal health and reachability checks that run through the proxy treat any HTTP
+response (including `401`) as proof the proxy is alive. They fail only
+when nothing answers at all.
+
+If Ollama is already running on a non-loopback address when you start onboard,
+the wizard restarts it on `127.0.0.1:11434` so the proxy is the only network
+path to the model server.
+
 ### GPU Memory Cleanup
 
 When you switch away from Ollama, stop host services, or destroy an Ollama-backed sandbox, NemoClaw asks Ollama to unload currently loaded models from GPU memory.
@@ -137,8 +183,6 @@ Run onboard without `--non-interactive` to get the interactive `[y/N]` prompt th
 | `NEMOCLAW_MODEL` | Ollama model tag to use. Optional. |
 | `NEMOCLAW_YES` | Set to `1` to auto-accept the model-download confirmation prompt. Optional. |
 
-Load [references/use-local-inference-details.md](references/use-local-inference-details.md) for detailed steps on Authenticated Reverse Proxy.
-
 ## OpenAI-Compatible Server
 
 This option works with any server that implements `/v1/chat/completions`, including vLLM, TensorRT-LLM, llama.cpp, LocalAI, and others.
@@ -171,11 +215,69 @@ If you set `NEMOCLAW_PREFERRED_API=openai-responses`, NemoClaw probes `/v1/respo
 If a reasoning model returns only reasoning content before producing a final answer, NemoClaw retries the smoke request with a larger response budget.
 Route, configuration, and authentication failures still fail immediately.
 
-Load [references/use-local-inference-details.md](references/use-local-inference-details.md) for detailed steps on Non-Interactive Setup, Selecting the API Path.
+### Non-Interactive Setup
+
+Set the following environment variables for scripted or CI/CD deployments.
+
+```console
+$ NEMOCLAW_PROVIDER=custom \
+  NEMOCLAW_ENDPOINT_URL=http://localhost:8000/v1 \
+  NEMOCLAW_MODEL=meta-llama/Llama-3.1-8B-Instruct \
+  COMPATIBLE_API_KEY=dummy \
+  nemoclaw onboard --non-interactive
+```
+
+| Variable | Purpose |
+|---|---|
+| `NEMOCLAW_PROVIDER` | Set to `custom` for an OpenAI-compatible endpoint. |
+| `NEMOCLAW_ENDPOINT_URL` | Base URL of the local server. |
+| `NEMOCLAW_MODEL` | Model ID as reported by the server. |
+| `COMPATIBLE_API_KEY` | API key for the endpoint. Use any non-empty value if authentication is not required. |
+
+### Selecting the API Path
+
+For the compatible-endpoint provider, `/v1/chat/completions` is the default.
+NemoClaw tests streaming events during onboarding and uses chat completions
+without probing the Responses API.
+
+To opt in to `/v1/responses`, set `NEMOCLAW_PREFERRED_API` before running onboard:
+
+```console
+$ NEMOCLAW_PREFERRED_API=openai-responses nemoclaw onboard
+```
+
+The wizard then probes `/v1/responses` and only selects it when streaming
+support is complete.
+If the probe fails, the wizard falls back to `/v1/chat/completions`
+automatically.
+You can use this variable in both interactive and non-interactive mode.
+
+| Variable | Values | Default |
+|---|---|---|
+| `NEMOCLAW_PREFERRED_API` | `openai-completions`, `openai-responses` | `openai-completions` for compatible endpoints |
+
+If you already onboarded and the sandbox is failing at runtime, re-run
+`nemoclaw onboard` to re-probe the endpoint and bake the correct API path
+into the image.
+Refer to [Switch Inference Models](references/switch-inference-providers.md) for details.
 
 ## Anthropic-Compatible Server
 
-Load [references/use-local-inference-details.md](references/use-local-inference-details.md) for detailed steps.
+If your local server implements the Anthropic Messages API (`/v1/messages`), choose **Other Anthropic-compatible endpoint** during onboarding instead.
+
+```console
+$ nemoclaw onboard
+```
+
+For non-interactive setup, use `NEMOCLAW_PROVIDER=anthropicCompatible` and set `COMPATIBLE_ANTHROPIC_API_KEY`.
+
+```console
+$ NEMOCLAW_PROVIDER=anthropicCompatible \
+  NEMOCLAW_ENDPOINT_URL=http://localhost:8080 \
+  NEMOCLAW_MODEL=my-model \
+  COMPATIBLE_ANTHROPIC_API_KEY=dummy \
+  nemoclaw onboard --non-interactive
+```
 
 ## vLLM
 
@@ -208,7 +310,53 @@ Managed vLLM uses these profiles:
 NemoClaw forces the `chat/completions` API path for vLLM.
 The vLLM `/v1/responses` endpoint does not run the `--tool-call-parser`, so tool calls arrive as raw text.
 
-Load [references/use-local-inference-details.md](references/use-local-inference-details.md) for detailed steps on Non-Interactive Setup, Override the Managed-vLLM Model.
+### Non-Interactive Setup
+
+Use an already-running vLLM server:
+
+```console
+$ NEMOCLAW_PROVIDER=vllm \
+  nemoclaw onboard --non-interactive
+```
+
+Install or start managed vLLM when a supported profile is detected.
+On DGX Spark and DGX Station, `NEMOCLAW_PROVIDER=install-vllm` is enough for non-interactive runs; add `NEMOCLAW_EXPERIMENTAL=1` on generic Linux NVIDIA GPU hosts.
+
+```console
+$ NEMOCLAW_PROVIDER=install-vllm \
+  nemoclaw onboard --non-interactive
+```
+
+NemoClaw records the model returned by vLLM's `/v1/models` endpoint.
+Start vLLM with the model you want before onboarding if you manage the server yourself.
+
+### Override the Managed-vLLM Model
+
+Managed vLLM serves the profile default unless you select a different registry entry.
+Export `NEMOCLAW_VLLM_MODEL=<slug>` before invoking the installer to choose a different model from the registry.
+NemoClaw uses the matching `vllm serve` flags, including the reasoning parser, tool-call parser, and `--max-model-len`.
+Recognised slugs:
+
+| Slug | Hugging Face model | Notes |
+|---|---|---|
+| `qwen3.6-27b` | `Qwen/Qwen3.6-27B-FP8` | Default on DGX Spark and DGX Station profiles |
+| `nemotron-3-nano-4b` | `nvidia/NVIDIA-Nemotron-3-Nano-4B-FP8` | Default on the generic Linux + NVIDIA GPU profile |
+| `deepseek-r1-distill-70b` | `deepseek-ai/DeepSeek-R1-Distill-Llama-70B` | Gated. Requires Hugging Face license acceptance |
+
+The slug is case-insensitive; the full Hugging Face id is also accepted.
+An unrecognised value fails fast with a list of valid slugs.
+
+Gated models require a Hugging Face token; export it before onboarding so NemoClaw can forward it into the managed vLLM container:
+
+```console
+$ export HF_TOKEN=<your-hf-token>
+$ NEMOCLAW_PROVIDER=install-vllm \
+  NEMOCLAW_VLLM_MODEL=deepseek-r1-distill-70b \
+  nemoclaw onboard --non-interactive
+```
+
+`HUGGING_FACE_HUB_TOKEN` is accepted as an alternative.
+The token check runs on the host before any `docker pull`, so a missing or empty token aborts onboarding before bandwidth is spent on a request that would fail with a 401.
 
 ## NVIDIA NIM (Experimental)
 
@@ -236,27 +384,77 @@ If the NIM container exits before the health endpoint becomes ready, onboarding
 NIM uses vLLM internally.
 The same `chat/completions` API path restriction applies.
 
-Load [references/use-local-inference-details.md](references/use-local-inference-details.md) for detailed steps on Non-Interactive Setup.
+### Non-Interactive Setup
+
+```console
+$ NEMOCLAW_EXPERIMENTAL=1 \
+  NEMOCLAW_PROVIDER=nim \
+  nemoclaw onboard --non-interactive
+```
+
+To select a specific model, set `NEMOCLAW_MODEL`.
 
 ## Timeout Configuration
 
-Load [references/use-local-inference-details.md](references/use-local-inference-details.md) for detailed steps.
+Local inference requests use a default timeout of 180 seconds.
+Large prompts on hardware such as DGX Spark can exceed shorter timeouts, so NemoClaw sets a higher default for Ollama, vLLM, NIM, and compatible-endpoint setup.
+
+To override the timeout, set the `NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` environment variable before onboarding:
+
+```console
+$ export NEMOCLAW_LOCAL_INFERENCE_TIMEOUT=300
+$ nemoclaw onboard
+```
+
+The value is in seconds.
+This setting is baked into the sandbox at build time.
+Changing it after onboarding requires re-running `nemoclaw onboard`.
+
+`NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` only governs the inference-server validation probe.
+During local Ollama setup, NemoClaw treats host-side curl process timeouts as retryable probe failures and retries with a larger timeout before it reports a validation failure.
+NemoClaw also retries Docker runtime detection with a longer `docker info` timeout before it chooses the local inference route.
+The post-create readiness wait (image build, gateway upload, in-sandbox boot) has its own budget, `NEMOCLAW_SANDBOX_READY_TIMEOUT`, also defaulting to 180 seconds.
+On hosts where the sandbox image takes minutes to build or upload (large quantised models, DGX Station first runs, or remote VMs over a slow link), raise both together:
+
+```console
+$ export NEMOCLAW_LOCAL_INFERENCE_TIMEOUT=300
+$ export NEMOCLAW_SANDBOX_READY_TIMEOUT=600
+$ nemoclaw onboard
+```
+
+If onboard ends with `Sandbox '<name>' was created but did not become ready within 180s`, refer to Troubleshooting (use the `nemoclaw-user-reference` skill).
 
 ## Verify the Configuration
 
-Load [references/use-local-inference-details.md](references/use-local-inference-details.md) for detailed steps.
+After onboarding completes, confirm the active provider and model.
+
+```console
+$ nemoclaw <name> status
+```
+
+The output shows the provider label (for example, "Local vLLM" or "Other OpenAI-compatible endpoint") and the active model.
+For Local Ollama, status also checks the authenticated proxy when a proxy token is available.
+If `Inference` is healthy but `Inference (auth proxy)` is not, rerun onboarding to repair the proxy path that sandbox requests use.
 
 ## Switch Models at Runtime
 
-Load [references/use-local-inference-details.md](references/use-local-inference-details.md) for detailed steps.
+You can change the model without re-running onboard.
+Refer to [Switch Inference Models](references/switch-inference-providers.md) for the full procedure.
+
+For compatible endpoints, the command is:
+
+```console
+$ nemoclaw inference set --provider compatible-endpoint --model <model-name>
+```
+
+If the provider itself needs to change (for example, switching from vLLM to a cloud API), pass the new provider to `nemoclaw inference set`.
 
 ## References
 
 - **Load [references/switch-inference-providers.md](references/switch-inference-providers.md)** when switching inference providers, changing the model runtime, or reconfiguring inference routing. Changes the active inference model without restarting the sandbox.
 - **Load [references/set-up-sub-agent.md](references/set-up-sub-agent.md)** when users ask how to add a second model, configure a sub-agent model, use Omni for vision tasks, configure agents.list, or use sessions_spawn in NemoClaw. Shows the NemoClaw-specific file paths and update flow for adding an auxiliary OpenClaw sub-agent model.
-- **[references/tool-calling-reliability.md](references/tool-calling-reliability.md)** — Explains Ollama tool-call leak symptoms, when vLLM with a tool-call parser is recommended, and how to repoint NemoClaw to a parser-aware local endpoint.
 - **Load [references/inference-options.md](references/inference-options.md)** when explaining which providers are available, what the onboard wizard presents, or how inference routing works. Lists all inference providers offered during NemoClaw onboarding.
-- **Load [references/use-local-inference-details.md](references/use-local-inference-details.md)** when you need detailed steps for Authenticated Reverse Proxy, Non-Interactive Setup, Selecting the API Path, and related details.
+- **[references/tool-calling-reliability.md](references/tool-calling-reliability.md)** — Explains Ollama tool-call leak symptoms, when vLLM with a tool-call parser is recommended, and how to repoint NemoClaw to a parser-aware local endpoint.
 
 ## Related Skills
 
diff --git a/.agents/skills/nemoclaw-user-configure-inference/references/use-local-inference-details.md b/.agents/skills/nemoclaw-user-configure-inference/references/use-local-inference-details.md
deleted file mode 100644
index 04d962fde2..0000000000
--- a/.agents/skills/nemoclaw-user-configure-inference/references/use-local-inference-details.md
+++ /dev/null
@@ -1,216 +0,0 @@
-<!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
-<!-- SPDX-License-Identifier: Apache-2.0 -->
-# Use a Local Inference Server: Details
-
-## Authenticated Reverse Proxy
-
-On non-WSL hosts, NemoClaw keeps Ollama bound to `127.0.0.1:11434` and starts a token-gated reverse proxy on `0.0.0.0:11435`.
-The native install/start paths also reset NemoClaw-managed systemd launches to the loopback binding.
-Containers and other hosts on the local network reach Ollama only through the
-proxy, which validates a Bearer token before forwarding requests.
-On that native path, NemoClaw never exposes Ollama without authentication.
-
-WSL Ollama paths do not use this proxy.
-Windows-host Ollama uses the Windows daemon through `host.docker.internal`.
-
-For non-WSL Ollama setups, the onboard wizard manages the proxy automatically:
-
-- Generates a random 24-byte token on first run and stores it in
-  `~/.nemoclaw/ollama-proxy-token` with `0600` permissions.
-- Starts the proxy after Ollama and verifies it before continuing.
-- Cleans up stale proxy processes from previous runs.
-- Probes the sandbox Docker network path to the proxy before committing the inference route.
-- Stops matching proxy processes during uninstall before deleting NemoClaw state.
-- Reuses the persisted token after a host reboot so you do not need to re-run
-  onboard.
-
-On native Linux hosts, a firewall can allow the host proxy health check while still blocking sandbox containers on the OpenShell Docker bridge.
-When the sandbox-side proxy probe fails with a TCP error, onboarding exits before it saves the inference route and prints a command like:
-
-```console
-$ sudo ufw allow from <openshell-docker-subnet> to any port 11435 proto tcp
-$ nemoclaw onboard
-```
-
-If the probe cannot run, for example because Docker Desktop or WSL uses a different host routing model, onboarding continues and relies on the regular proxy health check.
-
-The sandbox provider is configured to use proxy port `11435` with the generated
-token as its `OPENAI_API_KEY` credential.
-OpenShell's L7 proxy injects the token at egress, so the agent inside the
-sandbox never sees the token directly.
-
-All proxy endpoints require the Bearer token, including `GET /api/tags`.
-Internal health and reachability checks run via the proxy treat any HTTP
-response (including `401`) as proof the proxy is alive — they only fail
-when nothing answers at all.
-
-If Ollama is already running on a non-loopback address when you start onboard,
-the wizard restarts it on `127.0.0.1:11434` so the proxy is the only network
-path to the model server.
-
-### Non-Interactive Setup
-
-Set the following environment variables for scripted or CI/CD deployments.
-
-```console
-$ NEMOCLAW_PROVIDER=custom \
-  NEMOCLAW_ENDPOINT_URL=http://localhost:8000/v1 \
-  NEMOCLAW_MODEL=meta-llama/Llama-3.1-8B-Instruct \
-  COMPATIBLE_API_KEY=dummy \
-  nemoclaw onboard --non-interactive
-```
-
-| Variable | Purpose |
-|---|---|
-| `NEMOCLAW_PROVIDER` | Set to `custom` for an OpenAI-compatible endpoint. |
-| `NEMOCLAW_ENDPOINT_URL` | Base URL of the local server. |
-| `NEMOCLAW_MODEL` | Model ID as reported by the server. |
-| `COMPATIBLE_API_KEY` | API key for the endpoint. Use any non-empty value if authentication is not required. |
-
-### Selecting the API Path
-
-For the compatible-endpoint provider, `/v1/chat/completions` is the default.
-NemoClaw tests streaming events during onboarding and uses chat completions
-without probing the Responses API.
-
-To opt in to `/v1/responses`, set `NEMOCLAW_PREFERRED_API` before running onboard:
-
-```console
-$ NEMOCLAW_PREFERRED_API=openai-responses nemoclaw onboard
-```
-
-The wizard then probes `/v1/responses` and only selects it when streaming
-support is complete.
-If the probe fails, the wizard falls back to `/v1/chat/completions`
-automatically.
-You can use this variable in both interactive and non-interactive mode.
-
-| Variable | Values | Default |
-|---|---|---|
-| `NEMOCLAW_PREFERRED_API` | `openai-completions`, `openai-responses` | `openai-completions` for compatible endpoints |
-
-If you already onboarded and the sandbox is failing at runtime, re-run
-`nemoclaw onboard` to re-probe the endpoint and bake the correct API path
-into the image.
-Refer to [Switch Inference Models](switch-inference-providers.md) for details.
-
-## Anthropic-Compatible Server
-
-If your local server implements the Anthropic Messages API (`/v1/messages`), choose **Other Anthropic-compatible endpoint** during onboarding instead.
-
-```console
-$ nemoclaw onboard
-```
-
-For non-interactive setup, use `NEMOCLAW_PROVIDER=anthropicCompatible` and set `COMPATIBLE_ANTHROPIC_API_KEY`.
-
-```console
-$ NEMOCLAW_PROVIDER=anthropicCompatible \
-  NEMOCLAW_ENDPOINT_URL=http://localhost:8080 \
-  NEMOCLAW_MODEL=my-model \
-  COMPATIBLE_ANTHROPIC_API_KEY=dummy \
-  nemoclaw onboard --non-interactive
-```
-
-### Non-Interactive Setup
-
-Use an already-running vLLM server:
-
-```console
-$ NEMOCLAW_PROVIDER=vllm \
-  nemoclaw onboard --non-interactive
-```
-
-Install or start managed vLLM when a supported profile is detected.
-On DGX Spark and DGX Station, `NEMOCLAW_PROVIDER=install-vllm` is enough for non-interactive runs; add `NEMOCLAW_EXPERIMENTAL=1` on generic Linux NVIDIA GPU hosts.
-
-```console
-$ NEMOCLAW_PROVIDER=install-vllm \
-  nemoclaw onboard --non-interactive
-```
-
-NemoClaw records the model returned by vLLM's `/v1/models` endpoint.
-Start vLLM with the model you want before onboarding if you manage the server yourself.
-
-### Override the Managed-vLLM Model
-
-Managed vLLM serves the profile default unless you select a different registry entry.
-Export `NEMOCLAW_VLLM_MODEL=<slug>` before invoking the installer to choose a different model from the registry.
-NemoClaw uses the matching `vllm serve` flags, including the reasoning parser, tool-call parser, and `--max-model-len`.
-Recognised slugs:
-
-| Slug | Hugging Face model | Notes |
-|---|---|---|
-| `qwen3.6-27b` | `Qwen/Qwen3.6-27B-FP8` | Default on DGX Spark and DGX Station profiles |
-| `nemotron-3-nano-4b` | `nvidia/NVIDIA-Nemotron-3-Nano-4B-FP8` | Default on the generic Linux + NVIDIA GPU profile |
-| `deepseek-r1-distill-70b` | `deepseek-ai/DeepSeek-R1-Distill-Llama-70B` | Gated. Requires Hugging Face license acceptance |
-
-The slug is case-insensitive; the full Hugging Face id is also accepted.
-An unrecognised value fails fast with a list of valid slugs.
-
-Gated models require a Hugging Face token; export it before onboarding so NemoClaw can forward it into the managed vLLM container:
-
-```console
-$ export HF_TOKEN=<your-hf-token>
-$ NEMOCLAW_PROVIDER=install-vllm \
-  NEMOCLAW_VLLM_MODEL=deepseek-r1-distill-70b \
-  nemoclaw onboard --non-interactive
-```
-
-`HUGGING_FACE_HUB_TOKEN` is accepted as an alternative.
-The token check runs on the host before any docker pull, so a missing or empty token aborts onboarding before bandwidth is spent on a 401.
-
-## Timeout Configuration
-
-Local inference requests use a default timeout of 180 seconds.
-Large prompts on hardware such as DGX Spark can exceed shorter timeouts, so NemoClaw sets a higher default for Ollama, vLLM, NIM, and compatible-endpoint setup.
-
-To override the timeout, set the `NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` environment variable before onboarding:
-
-```console
-$ export NEMOCLAW_LOCAL_INFERENCE_TIMEOUT=300
-$ nemoclaw onboard
-```
-
-The value is in seconds.
-This setting is baked into the sandbox at build time.
-Changing it after onboarding requires re-running `nemoclaw onboard`.
-
-`NEMOCLAW_LOCAL_INFERENCE_TIMEOUT` only governs the inference-server validation probe.
-During local Ollama setup, NemoClaw treats host-side curl process timeouts as retryable probe failures and retries with a larger timeout before it reports a validation failure.
-NemoClaw also retries Docker runtime detection with a longer `docker info` timeout before it chooses the local inference route.
-The post-create readiness wait (image build, gateway upload, in-sandbox boot) has its own budget, `NEMOCLAW_SANDBOX_READY_TIMEOUT`, also defaulting to 180 seconds.
-On hosts where the sandbox image takes minutes to build or upload — large quantised models, DGX Station first runs, or remote VMs over a slow link — raise both together:
-
-```console
-$ export NEMOCLAW_LOCAL_INFERENCE_TIMEOUT=300
-$ export NEMOCLAW_SANDBOX_READY_TIMEOUT=600
-$ nemoclaw onboard
-```
-
-If onboard ends with `Sandbox '<name>' was created but did not become ready within 180s`, refer to Troubleshooting (use the `nemoclaw-user-reference` skill).
-
-## Verify the Configuration
-
-After onboarding completes, confirm the active provider and model.
-
-```console
-$ nemoclaw <name> status
-```
-
-The output shows the provider label (for example, "Local vLLM" or "Other OpenAI-compatible endpoint") and the active model.
-For Local Ollama, status also checks the authenticated proxy when a proxy token is available.
-If `Inference` is healthy but `Inference (auth proxy)` is not, rerun onboarding to repair the proxy path that sandbox requests use.
-
-## Switch Models at Runtime
-
-You can change the model without re-running onboard.
-Refer to [Switch Inference Models](switch-inference-providers.md) for the full procedure.
-
-For compatible endpoints, the command is:
-
-```console
-$ nemoclaw inference set --provider compatible-endpoint --model <model-name>
-```
-
-If the provider itself needs to change (for example, switching from vLLM to a cloud API), pass the new provider to `nemoclaw inference set`.
diff --git a/.agents/skills/nemoclaw-user-configure-security/SKILL.md b/.agents/skills/nemoclaw-user-configure-security/SKILL.md
index 36df08415f..d4e22bf269 100644
--- a/.agents/skills/nemoclaw-user-configure-security/SKILL.md
+++ b/.agents/skills/nemoclaw-user-configure-security/SKILL.md
@@ -7,10 +7,10 @@ license: "Apache-2.0"
 <!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
 <!-- SPDX-License-Identifier: Apache-2.0 -->
 
-# NemoClaw Security Best Practices: Controls, Risks, and Posture Profiles
+# NemoClaw User Configure Security
 
 ## References
 
 - **Load [references/best-practices.md](references/best-practices.md)** when evaluating security posture, reviewing sandbox security defaults, or assessing control trade-offs. Presents a risk framework for every configurable security control in NemoClaw.
-- **Load [references/openclaw-controls.md](references/openclaw-controls.md)** when reviewing the security boundary between NemoClaw and OpenClaw or assessing what NemoClaw does not cover. Lists OpenClaw security controls that operate independently of NemoClaw, including prompt injection detection, tool access control, rate limiting, environment variable policy, audit framework, supply chain scanning, messaging access policy, context visibility, and safe regex.
 - **Load [references/credential-storage.md](references/credential-storage.md)** when reviewing how credentials are handled, locating a stored credential, or assessing the storage threat model. Covers where NemoClaw stores provider credentials, why nothing is persisted to host disk, and how the OpenShell gateway acts as the single system of record.
+- **Load [references/openclaw-controls.md](references/openclaw-controls.md)** when reviewing the security boundary between NemoClaw and OpenClaw or assessing what NemoClaw does not cover. Lists OpenClaw security controls that operate independently of NemoClaw, including prompt injection detection, tool access control, rate limiting, environment variable policy, audit framework, supply chain scanning, messaging access policy, context visibility, and safe regex.
diff --git a/.agents/skills/nemoclaw-user-get-started/SKILL.md b/.agents/skills/nemoclaw-user-get-started/SKILL.md
index bc63352312..bab2b2d0e2 100644
--- a/.agents/skills/nemoclaw-user-get-started/SKILL.md
+++ b/.agents/skills/nemoclaw-user-get-started/SKILL.md
@@ -87,6 +87,172 @@ The onboard flow builds the sandbox image with `NEMOCLAW_DISABLE_DEVICE_AUTH=1`
 This is a build-time setting baked into the sandbox image, not a runtime knob.
 If you export `NEMOCLAW_DISABLE_DEVICE_AUTH` after onboarding finishes, it has no effect on an existing sandbox.
 
+### Respond to the Onboard Wizard
+
+After the installer launches `nemoclaw onboard`, the wizard runs preflight checks, starts or reuses the OpenShell gateway, asks for an inference provider and model, collects any required credential, then asks for the sandbox name.
+It prints a review summary before it registers the provider with OpenShell.
+After you confirm, NemoClaw registers inference, prompts for optional web search and messaging channels, builds and starts the sandbox, sets up OpenClaw, then applies the selected network policy tier and presets.
+At any prompt, press Enter to accept the default shown in `[brackets]`, type `back` to return to the previous prompt, or type `exit` to quit.
+If existing sandbox sessions are running, the installer warns before onboarding because the setup can rebuild or upgrade sandboxes after the new sandbox launches.
+
+The inference provider prompt presents a numbered list.
+
+```text
+  1) NVIDIA Endpoints
+  2) OpenAI
+  3) Other OpenAI-compatible endpoint
+  4) Anthropic
+  5) Other Anthropic-compatible endpoint
+  6) Google Gemini
+  7) Local Ollama (localhost:11434)
+  8) Model Router (experimental)
+  Choose [1]:
+```
+
+Pick the option that matches where you want inference traffic to go, then follow the matching option section below for the follow-up prompts and the API key environment variable to set.
+For the full list of providers and validation behavior, refer to Inference Options (use the `nemoclaw-user-configure-inference` skill).
+Local Ollama appears when NemoClaw detects a usable local Ollama path or can offer an install or start action for your platform.
+The Model Router option appears when the blueprint router profile is enabled.
+
+**Tip:**
+
+Export the API key before launching the installer so the wizard does not have to ask for it.
+For example, run `export NVIDIA_API_KEY=<your-key>` before `curl ... | bash`.
+If you entered a key incorrectly, refer to Reset a Stored Credential (use the `nemoclaw-user-manage-sandboxes` skill) to clear and re-enter it.
+
+**Option 1: NVIDIA Endpoints:**
+
+Routes inference to models hosted on [build.nvidia.com](https://build.nvidia.com).
+
+Use `NVIDIA_API_KEY` for the API key. Get one from the [NVIDIA build API keys page](https://build.nvidia.com/settings/api-keys).
+
+Respond to the wizard as follows.
+
+1. At the `Choose [1]:` prompt, press Enter (or type `1`) to select **NVIDIA Endpoints**.
+2. At the `NVIDIA_API_KEY:` prompt, paste your key if it is not already exported.
+3. At the `Choose model [1]:` prompt, pick a curated model from the list (for example, `Nemotron 3 Super 120B`, `GLM-5`, `MiniMax M2.7`, `GPT-OSS 120B`, or `DeepSeek V4 Pro`), or pick **Other...** to enter any model ID from the [NVIDIA Endpoints catalog](https://build.nvidia.com).
+
+NemoClaw validates the model against the catalog API before creating the sandbox.
+
+**Tip:**
+
+Use this option for Nemotron and other models hosted on `build.nvidia.com`. If you run NVIDIA Nemotron from a self-hosted NIM, an enterprise gateway, or any other endpoint, choose **Option 3** instead, since all Nemotron models expose OpenAI-compatible APIs.
+
+**Option 2: OpenAI:**
+
+Routes inference to the OpenAI API at `https://api.openai.com/v1`.
+
+Use `OPENAI_API_KEY` for the API key. Get one from the [OpenAI API keys page](https://platform.openai.com/api-keys).
+
+Respond to the wizard as follows.
+
+1. At the `Choose [1]:` prompt, type `2` to select **OpenAI**.
+2. At the `OPENAI_API_KEY:` prompt, paste your key if it is not already exported.
+3. At the `Choose model [1]:` prompt, pick a curated model (for example, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`, or `gpt-5.4-pro-2026-03-05`), or pick **Other...** to enter any OpenAI model ID.
+
+**Option 3: Other OpenAI-Compatible Endpoint:**
+
+Routes inference to any server that implements `/v1/chat/completions`, including OpenRouter, LocalAI, llama.cpp, vLLM behind a proxy, and any compatible gateway.
+
+Use `COMPATIBLE_API_KEY` for the API key. Set it to whatever credential your endpoint expects. If your endpoint does not require auth, use any non-empty placeholder.
+
+Respond to the wizard as follows.
+
+1. At the `Choose [1]:` prompt, type `3` to select **Other OpenAI-compatible endpoint**.
+2. At the `OpenAI-compatible base URL` prompt, enter the provider's base URL. Find the exact value in your provider's API documentation. NemoClaw appends `/v1` automatically, so leave that suffix off.
+3. At the `COMPATIBLE_API_KEY:` prompt, paste your key if it is not already exported.
+4. At the `Other OpenAI-compatible endpoint model []:` prompt, enter the model ID exactly as it appears in your provider's model catalog.
+
+For example, when you use NVIDIA's OpenAI-compatible inference endpoint, enter `https://inference-api.nvidia.com` as the base URL and the model ID your endpoint exposes, such as `openai/openai/gpt-5.5`.
+
+NemoClaw sends a real inference request to validate the endpoint and model.
+If the endpoint does not return the streaming events OpenClaw needs from the Responses API, NemoClaw falls back to the chat completions API and configures OpenClaw to use `openai-completions`.
+
+**Tip:**
+
+NVIDIA Nemotron models expose OpenAI-compatible APIs, so this option is the right choice for any Nemotron deployment that does not live on `build.nvidia.com`. Common examples include a self-hosted NIM container, an enterprise NVIDIA AI Enterprise gateway, or a vLLM/SGLang server running Nemotron weights. Point the base URL at your endpoint and enter the Nemotron model ID exactly as your server reports it.
+
+**Option 4: Anthropic:**
+
+Routes inference to the Anthropic Messages API at `https://api.anthropic.com`.
+
+Use `ANTHROPIC_API_KEY` for the API key. Get one from the [Anthropic console keys page](https://console.anthropic.com/settings/keys).
+
+Respond to the wizard as follows.
+
+1. At the `Choose [1]:` prompt, type `4` to select **Anthropic**.
+2. At the `ANTHROPIC_API_KEY:` prompt, paste your key if it is not already exported.
+3. At the `Choose model [1]:` prompt, pick a curated model (for example, `claude-sonnet-4-6`, `claude-haiku-4-5`, or `claude-opus-4-6`), or pick **Other...** to enter any Claude model ID.
+
+**Option 5: Other Anthropic-Compatible Endpoint:**
+
+Routes inference to any server that implements the Anthropic Messages API at `/v1/messages`, including Claude proxies, Bedrock-compatible gateways, and self-hosted Anthropic-compatible servers.
+
+Use `COMPATIBLE_ANTHROPIC_API_KEY` for the API key. Set it to whatever credential your endpoint expects.
+
+Respond to the wizard as follows.
+
+1. At the `Choose [1]:` prompt, type `5` to select **Other Anthropic-compatible endpoint**.
+2. At the `Anthropic-compatible base URL` prompt, enter the proxy or gateway's base URL from its documentation.
+3. At the `COMPATIBLE_ANTHROPIC_API_KEY:` prompt, paste your key if it is not already exported.
+4. At the `Other Anthropic-compatible endpoint model []:` prompt, enter the model ID exactly as it appears in your gateway's model catalog.
+
+**Option 6: Google Gemini:**
+
+Routes inference to Google's OpenAI-compatible Gemini endpoint at `https://generativelanguage.googleapis.com/v1beta/openai/`.
+
+Use `GEMINI_API_KEY` for the API key. Get one from [Google AI Studio API keys](https://aistudio.google.com/app/apikey).
+
+Respond to the wizard as follows.
+
+1. At the `Choose [1]:` prompt, type `6` to select **Google Gemini**.
+2. At the `GEMINI_API_KEY:` prompt, paste your key if it is not already exported.
+3. At the `Choose model [5]:` prompt, pick a curated model (for example, `gemini-3.1-pro-preview`, `gemini-3.1-flash-lite-preview`, `gemini-3-flash-preview`, `gemini-2.5-pro`, `gemini-2.5-flash`, or `gemini-2.5-flash-lite`), or pick **Other...** to enter any Gemini model ID.
+
+**Option 7: Local Ollama:**
+
+Routes inference to a local Ollama instance. Depending on your platform, the wizard can use an existing daemon, start an installed daemon, or offer an install action.
+
+No API key is required. On non-WSL hosts, NemoClaw generates a token and starts an authenticated proxy so containers can reach Ollama without exposing the daemon directly to your network.
+On WSL, NemoClaw can also use Ollama on the Windows host through `host.docker.internal`.
+
+Respond to the wizard as follows.
+
+1. At the `Choose [1]:` prompt, type `7` to select **Local Ollama**.
+2. At the `Choose model [1]:` prompt, pick from **Ollama models** if any are already installed. If none are installed, pick a **starter model** to pull and load now, or pick **Other...** to enter any Ollama model ID.
+
+For setup details, including GPU recommendations and starter model choices, refer to Use a Local Inference Server (use the `nemoclaw-user-configure-inference` skill).
+
+**Option 8: Model Router:**
+
+Starts a host-side model router and routes sandbox inference through OpenShell to that router.
+The router chooses from the model pool in `nemoclaw-blueprint/router/pool-config.yaml` for each request.
+
+Use `NVIDIA_API_KEY` for the model pool credentials.
+
+Respond to the wizard as follows.
+
+1. At the `Choose [1]:` prompt, type `8` to select **Model Router (experimental)**.
+2. At the `NVIDIA_API_KEY:` prompt, paste your key if it is not already exported.
+3. Review the configuration summary and continue with the sandbox build.
+
+For scripted setup, run:
+
+```console
+$ NEMOCLAW_PROVIDER=routed NVIDIA_API_KEY=<your-key> nemoclaw onboard --non-interactive
+```
+
+The router listens on the host at port `4000`.
+The sandbox still calls `https://inference.local/v1`, so do not point in-sandbox tools at the host router port directly.
+
+**Local NIM and Local vLLM:**
+
+- **Local NVIDIA NIM** appears when `NEMOCLAW_EXPERIMENTAL=1` is set and the host has a NIM-capable GPU. NemoClaw pulls and manages a NIM container.
+- **Local vLLM (already running)** appears whenever NemoClaw detects a vLLM server on `localhost:8000`. No flag is required for the menu entry. NemoClaw auto-detects the loaded model.
+- **Local vLLM (managed install/start)** appears by default on DGX Spark and DGX Station. Generic Linux NVIDIA GPU hosts require `NEMOCLAW_EXPERIMENTAL=1` or `NEMOCLAW_PROVIDER=install-vllm`. NemoClaw pulls and starts a vLLM container on supported hosts.
+
+For setup, refer to Use a Local Inference Server (use the `nemoclaw-user-configure-inference` skill).
+
 ### Review the Configuration Before the Sandbox Build
 
 After you enter the sandbox name, the wizard prints a review summary and asks for final confirmation before registering the provider, prompting for optional integrations, and building the sandbox image.
@@ -176,8 +342,6 @@ Manage later
 
 If you picked a different option, the `Model` line shows that provider's model and label instead. For example, you might see `gpt-5.4 (OpenAI)`, `claude-sonnet-4-6 (Anthropic)`, `gemini-2.5-flash (Google Gemini)`, `llama3.1:8b (Local Ollama)`, `nvidia-routed (Model Router)`, or `<your-model> (Other OpenAI-compatible endpoint)`.
 
-Load [references/quickstart-details.md](references/quickstart-details.md) for detailed steps on Respond to the Onboard Wizard.
-
 ## Run Your First Agent Prompt
 
 You can chat with the agent from the terminal or the browser.
@@ -214,7 +378,6 @@ openclaw tui
 - **Load [references/quickstart-hermes.md](references/quickstart-hermes.md)** when users ask for Hermes setup, NemoHermes onboarding, or running Hermes inside OpenShell. Installs NemoClaw, selects the Hermes agent, and launches a sandboxed Hermes API endpoint.
 - **Load [references/prerequisites.md](references/prerequisites.md)** when verifying prerequisites before installation. Lists the hardware, software, and container runtime requirements for running NemoClaw.
 - **Load [references/windows-preparation.md](references/windows-preparation.md)** when preparing a Windows machine for NemoClaw, enabling WSL 2, configuring Docker Desktop for Windows, or troubleshooting a Windows-specific install error. Covers Windows-only preparation steps required before the Quickstart.
-- **Load [references/quickstart-details.md](references/quickstart-details.md)** when you need detailed steps for Respond to the Onboard Wizard.
 
 ## Related Skills
 
diff --git a/.agents/skills/nemoclaw-user-get-started/references/quickstart-details.md b/.agents/skills/nemoclaw-user-get-started/references/quickstart-details.md
deleted file mode 100644
index a688104748..0000000000
--- a/.agents/skills/nemoclaw-user-get-started/references/quickstart-details.md
+++ /dev/null
@@ -1,169 +0,0 @@
-<!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
-<!-- SPDX-License-Identifier: Apache-2.0 -->
-# NemoClaw Quickstart with OpenClaw: Details
-
-## Respond to the Onboard Wizard
-
-After the installer launches `nemoclaw onboard`, the wizard runs preflight checks, starts or reuses the OpenShell gateway, asks for an inference provider and model, collects any required credential, then asks for the sandbox name.
-It prints a review summary before it registers the provider with OpenShell.
-After you confirm, NemoClaw registers inference, prompts for optional web search and messaging channels, builds and starts the sandbox, sets up OpenClaw, then applies the selected network policy tier and presets.
-At any prompt, press Enter to accept the default shown in `[brackets]`, type `back` to return to the previous prompt, or type `exit` to quit.
-If existing sandbox sessions are running, the installer warns before onboarding because the setup can rebuild or upgrade sandboxes after the new sandbox launches.
-
-The inference provider prompt presents a numbered list.
-
-```text
-  1) NVIDIA Endpoints
-  2) OpenAI
-  3) Other OpenAI-compatible endpoint
-  4) Anthropic
-  5) Other Anthropic-compatible endpoint
-  6) Google Gemini
-  7) Local Ollama (localhost:11434)
-  8) Model Router (experimental)
-  Choose [1]:
-```
-
-Pick the option that matches where you want inference traffic to go, then expand the matching helper below for the follow-up prompts and the API key environment variable to set.
-For the full list of providers and validation behavior, refer to Inference Options (use the `nemoclaw-user-configure-inference` skill).
-Local Ollama appears when NemoClaw detects a usable local Ollama path or can offer an install or start action for your platform.
-The Model Router option appears when the blueprint router profile is enabled.
-
-**Tip:**
-
-Export the API key before launching the installer so the wizard does not have to ask for it.
-For example, run `export NVIDIA_API_KEY=<your-key>` before `curl ... | bash`.
-If you entered a key incorrectly, refer to Reset a Stored Credential (use the `nemoclaw-user-manage-sandboxes` skill) to clear and re-enter it.
-
-**Option 1: NVIDIA Endpoints:**
-
-Routes inference to models hosted on [build.nvidia.com](https://build.nvidia.com).
-
-Use `NVIDIA_API_KEY` for the API key. Get one from the [NVIDIA build API keys page](https://build.nvidia.com/settings/api-keys).
-
-Respond to the wizard as follows.
-
-1. At the `Choose [1]:` prompt, press Enter (or type `1`) to select **NVIDIA Endpoints**.
-2. At the `NVIDIA_API_KEY:` prompt, paste your key if it is not already exported.
-3. At the `Choose model [1]:` prompt, pick a curated model from the list (for example, `Nemotron 3 Super 120B`, `GLM-5`, `MiniMax M2.7`, `GPT-OSS 120B`, or `DeepSeek V4 Pro`), or pick `Other...` to enter any model ID from the [NVIDIA Endpoints catalog](https://build.nvidia.com).
-
-NemoClaw validates the model against the catalog API before creating the sandbox.
-
-**Tip:**
-
-Use this option for Nemotron and other models hosted on `build.nvidia.com`. If you run NVIDIA Nemotron from a self-hosted NIM, an enterprise gateway, or any other endpoint, choose **Option 3** instead, since all Nemotron models expose OpenAI-compatible APIs.
-
-**Option 2: OpenAI:**
-
-Routes inference to the OpenAI API at `https://api.openai.com/v1`.
-
-Use `OPENAI_API_KEY` for the API key. Get one from the [OpenAI API keys page](https://platform.openai.com/api-keys).
-
-Respond to the wizard as follows.
-
-1. At the `Choose [1]:` prompt, type `2` to select **OpenAI**.
-2. At the `OPENAI_API_KEY:` prompt, paste your key if it is not already exported.
-3. At the `Choose model [1]:` prompt, pick a curated model (for example, `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano`, or `gpt-5.4-pro-2026-03-05`), or pick **Other...** to enter any OpenAI model ID.
-
-**Option 3: Other OpenAI-Compatible Endpoint:**
-
-Routes inference to any server that implements `/v1/chat/completions`, including OpenRouter, LocalAI, llama.cpp, vLLM behind a proxy, and any compatible gateway.
-
-Use `COMPATIBLE_API_KEY` for the API key. Set it to whatever credential your endpoint expects. If your endpoint does not require auth, use any non-empty placeholder.
-
-Respond to the wizard as follows.
-
-1. At the `Choose [1]:` prompt, type `3` to select **Other OpenAI-compatible endpoint**.
-2. At the `OpenAI-compatible base URL` prompt, enter the provider's base URL. Find the exact value in your provider's API documentation. NemoClaw appends `/v1` automatically, so leave that suffix off.
-3. At the `COMPATIBLE_API_KEY:` prompt, paste your key if it is not already exported.
-4. At the `Other OpenAI-compatible endpoint model []:` prompt, enter the model ID exactly as it appears in your provider's model catalog.
-
-For example, when you use NVIDIA's OpenAI-compatible inference endpoint, enter `https://inference-api.nvidia.com` as the base URL and the model ID your endpoint exposes, such as `openai/openai/gpt-5.5`.
-
-NemoClaw sends a real inference request to validate the endpoint and model.
-If the endpoint does not return the streaming events OpenClaw needs from the Responses API, NemoClaw falls back to the chat completions API and configures OpenClaw to use `openai-completions`.
-
-**Tip:**
-
-NVIDIA Nemotron models expose OpenAI-compatible APIs, so this option is the right choice for any Nemotron deployment that does not live on `build.nvidia.com`. Common examples include a self-hosted NIM container, an enterprise NVIDIA AI Enterprise gateway, or a vLLM/SGLang server running Nemotron weights. Point the base URL at your endpoint and enter the Nemotron model ID exactly as your server reports it.
-
-**Option 4: Anthropic:**
-
-Routes inference to the Anthropic Messages API at `https://api.anthropic.com`.
-
-Use `ANTHROPIC_API_KEY` for the API key. Get one from the [Anthropic console keys page](https://console.anthropic.com/settings/keys).
-
-Respond to the wizard as follows.
-
-1. At the `Choose [1]:` prompt, type `4` to select **Anthropic**.
-2. At the `ANTHROPIC_API_KEY:` prompt, paste your key if it is not already exported.
-3. At the `Choose model [1]:` prompt, pick a curated model (for example, `claude-sonnet-4-6`, `claude-haiku-4-5`, or `claude-opus-4-6`), or pick **Other...** to enter any Claude model ID.
-
-**Option 5: Other Anthropic-Compatible Endpoint:**
-
-Routes inference to any server that implements the Anthropic Messages API at `/v1/messages`, including Claude proxies, Bedrock-compatible gateways, and self-hosted Anthropic-compatible servers.
-
-Use `COMPATIBLE_ANTHROPIC_API_KEY` for the API key. Set it to whatever credential your endpoint expects.
-
-Respond to the wizard as follows.
-
-1. At the `Choose [1]:` prompt, type `5` to select **Other Anthropic-compatible endpoint**.
-2. At the `Anthropic-compatible base URL` prompt, enter the proxy or gateway's base URL from its documentation.
-3. At the `COMPATIBLE_ANTHROPIC_API_KEY:` prompt, paste your key if it is not already exported.
-4. At the `Other Anthropic-compatible endpoint model []:` prompt, enter the model ID exactly as it appears in your gateway's model catalog.
-
-**Option 6: Google Gemini:**
-
-Routes inference to Google's OpenAI-compatible Gemini endpoint at `https://generativelanguage.googleapis.com/v1beta/openai/`.
-
-Use `GEMINI_API_KEY` for the API key. Get one from [Google AI Studio API keys](https://aistudio.google.com/app/apikey).
-
-Respond to the wizard as follows.
-
-1. At the `Choose [1]:` prompt, type `6` to select **Google Gemini**.
-2. At the `GEMINI_API_KEY:` prompt, paste your key if it is not already exported.
-3. At the `Choose model [5]:` prompt, pick a curated model (for example, `gemini-3.1-pro-preview`, `gemini-3.1-flash-lite-preview`, `gemini-3-flash-preview`, `gemini-2.5-pro`, `gemini-2.5-flash`, or `gemini-2.5-flash-lite`), or pick **Other...** to enter any Gemini model ID.
-
-**Option 7: Local Ollama:**
-
-Routes inference to a local Ollama instance. Depending on your platform, the wizard can use an existing daemon, start an installed daemon, or offer an install action.
-
-No API key is required. On non-WSL hosts, NemoClaw generates a token and starts an authenticated proxy so containers can reach Ollama without exposing the daemon directly to your network.
-On WSL, NemoClaw can also use Ollama on the Windows host through `host.docker.internal`.
-
-Respond to the wizard as follows.
-
-1. At the `Choose [1]:` prompt, type `7` to select **Local Ollama**.
-2. At the `Choose model [1]:` prompt, pick from **Ollama models** if any are already installed. If none are installed, pick a **starter model** to pull and load now, or pick **Other...** to enter any Ollama model ID.
-
-For setup details, including GPU recommendations and starter model choices, refer to Use a Local Inference Server (use the `nemoclaw-user-configure-inference` skill).
-
-**Option 8: Model Router:**
-
-Starts a host-side model router and routes sandbox inference through OpenShell to that router.
-The router chooses from the model pool in `nemoclaw-blueprint/router/pool-config.yaml` for each request.
-
-Use `NVIDIA_API_KEY` for the model pool credentials.
-
-Respond to the wizard as follows.
-
-1. At the `Choose [1]:` prompt, type `8` to select **Model Router (experimental)**.
-2. At the `NVIDIA_API_KEY:` prompt, paste your key if it is not already exported.
-3. Review the configuration summary and continue with the sandbox build.
-
-For scripted setup, set:
-
-```console
-$ NEMOCLAW_PROVIDER=routed NVIDIA_API_KEY=<your-key> nemoclaw onboard --non-interactive
-```
-
-The router listens on the host at port `4000`.
-The sandbox still calls `https://inference.local/v1`, so do not point in-sandbox tools at the host router port directly.
-
-**Local NIM and Local vLLM:**
-
-- **Local NVIDIA NIM** appears when `NEMOCLAW_EXPERIMENTAL=1` is set and the host has a NIM-capable GPU. NemoClaw pulls and manages a NIM container.
-- **Local vLLM (already running)** appears whenever NemoClaw detects a vLLM server on `localhost:8000`. No flag is required for the menu entry. NemoClaw auto-detects the loaded model.
-- **Local vLLM (managed install/start)** appears by default on DGX Spark and DGX Station. Generic Linux NVIDIA GPU hosts require `NEMOCLAW_EXPERIMENTAL=1` or `NEMOCLAW_PROVIDER=install-vllm`. NemoClaw pulls and starts a vLLM container on supported hosts.
-
-For setup, refer to Use a Local Inference Server (use the `nemoclaw-user-configure-inference` skill).
diff --git a/.agents/skills/nemoclaw-user-manage-policy/SKILL.md b/.agents/skills/nemoclaw-user-manage-policy/SKILL.md
index 298c672588..5f62d8d49f 100644
--- a/.agents/skills/nemoclaw-user-manage-policy/SKILL.md
+++ b/.agents/skills/nemoclaw-user-manage-policy/SKILL.md
@@ -279,13 +279,20 @@ Fix the failing file and re-run the command to continue.
 Custom preset hosts bypass NemoClaw's review process and can widen sandbox egress to arbitrary destinations.
 Review every host in a custom preset before applying it, especially when the file originates outside your team.
 
-Load [references/customize-network-policy-details.md](references/customize-network-policy-details.md) for detailed steps on Remove a Custom Preset.
+### Remove a Custom Preset
+
+Custom presets applied with `--from-file` or `--from-dir` are recorded in the NemoClaw sandbox registry alongside their full YAML content, so you can remove them by name without keeping the original file on disk:
+
+```console
+$ nemoclaw my-assistant policy-remove my-internal-api --yes
+```
+
+`policy-remove` accepts both built-in and custom preset names. Run `nemoclaw <name> policy-list` to see every preset currently applied to the sandbox.
 
 ## References
 
 - **[references/integration-policy-examples.md](references/integration-policy-examples.md)** — Guides users through common post-install integration policy setup for maintained NemoClaw policy presets, including Outlook, messaging channels, GitHub, Jira, Brave Search, package managers, Hugging Face, local inference, and OpenShell approval workflows.
 - **Load [references/approve-network-requests.md](references/approve-network-requests.md)** when approving or denying sandbox egress requests, managing blocked network calls, or using the approval TUI. Reviews and approves blocked agent network requests in the TUI.
-- **Load [references/customize-network-policy-details.md](references/customize-network-policy-details.md)** when you need detailed steps for Remove a Custom Preset.
 
 ## Related Skills
 
diff --git a/.agents/skills/nemoclaw-user-manage-policy/references/customize-network-policy-details.md b/.agents/skills/nemoclaw-user-manage-policy/references/customize-network-policy-details.md
deleted file mode 100644
index 224829b964..0000000000
--- a/.agents/skills/nemoclaw-user-manage-policy/references/customize-network-policy-details.md
+++ /dev/null
@@ -1,13 +0,0 @@
-<!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
-<!-- SPDX-License-Identifier: Apache-2.0 -->
-# Customize the Sandbox Network Policy: Details
-
-## Remove a Custom Preset
-
-Custom presets applied with `--from-file` or `--from-dir` are recorded in the NemoClaw sandbox registry alongside their full YAML content, so they can be removed by name — the original file does not need to be kept on disk:
-
-```console
-$ nemoclaw my-assistant policy-remove my-internal-api --yes
-```
-
-`policy-remove` accepts both built-in and custom preset names. Run `nemoclaw <name> policy-list` to see every preset currently applied to the sandbox.
diff --git a/.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md b/.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md
index b5618052c5..766db03f54 100644
--- a/.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md
+++ b/.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md
@@ -218,7 +218,28 @@ This restores saved state directories only; it does not downgrade the sandbox im
 $ nemoclaw <sandbox-name> snapshot restore pre-upgrade
 ```
 
-Load [references/lifecycle-details.md](references/lifecycle-details.md) for detailed steps on What Changes During a Rebuild.
+### What Changes During a Rebuild
+
+Each rebuild destroys the existing container and creates a new one.
+NemoClaw protects your data through the same backup-and-restore flow as `nemoclaw <name> rebuild` (use the `nemoclaw-user-reference` skill):
+
+- NemoClaw preserves manifest-defined workspace state. Before deleting the old container, NemoClaw snapshots the state directories and durable state files defined in the agent manifest, typically `/sandbox/.openclaw/workspace/`; for Hermes this also includes `SOUL.md` and the SQLite database behind `.hermes/state.db`. Stored credentials (`~/.nemoclaw/credentials.json`) and registered policy presets live on the host and are re-applied to the new sandbox automatically.
+- NemoClaw does not preserve runtime changes outside the workspace state directories. This includes packages installed inside the running container with `apt` or `pip`, files in non-workspace paths, and in-memory or process state. If you have customized the running container at runtime, capture those changes in a `Dockerfile` for `nemoclaw onboard --from`, or run a manual `openshell sandbox download`, before the rebuild starts.
+
+Aborts before the destroy step are non-destructive.
+The flow refuses to proceed past preflight if a credential is missing or past backup if required manifest-defined state cannot be copied, so a failed run leaves the original sandbox intact and ready to retry.
+When a backup command reports partial archive output, NemoClaw keeps the usable entries and reports only the manifest-defined paths that could not be archived.
+
+See [Backup and Restore](references/backup-restore.md) for the full list of state-preservation guarantees, snapshot retention, and instructions for manual backups when the auto-flow is not enough.
+
+**If the rebuild aborts with `Missing credential: <KEY>`:**
+
+The rebuild preflight reads the provider credential recorded by your last `nemoclaw onboard` session.
+If you have switched providers since onboarding, for example from a remote API to a local Ollama setup, the preflight may still reference the old key and fail before any destroy step runs.
+
+To recover, re-run `nemoclaw onboard` and select your current provider.
+This refreshes the session metadata.
+Your existing container keeps serving traffic until the new image is ready.
 
 ## Uninstall
 
@@ -256,7 +277,6 @@ For a full comparison of the two forms, including what they fetch, what they tru
 - **Load [references/backup-restore.md](references/backup-restore.md)** when downloading workspace files from a sandbox, uploading restored files into a new sandbox, or preserving sandbox state across rebuilds. Backs up and restores OpenClaw workspace files before destructive operations such as sandbox rebuilds.
 - **Load [references/messaging-channels.md](references/messaging-channels.md)** when setting up messaging channels, chat interfaces, or integrations without relying on nemoclaw tunnel start for bridges. Explains how Telegram, Discord, Slack, WeChat, and WhatsApp reach sandboxed OpenClaw and Hermes agents through OpenShell-managed processes and NemoClaw channel commands.
 - **Load [references/workspace-files.md](references/workspace-files.md)** when users ask about `SOUL.md`, `USER.md`, `IDENTITY.md`, `AGENTS.md`, or other workspace files, or when preparing to back up or restore workspace state. Explains what workspace personality and configuration files are, where they live, and how they persist across sandbox restarts.
-- **Load [references/lifecycle-details.md](references/lifecycle-details.md)** when you need detailed steps for What Changes During a Rebuild.
 
 ## Related Skills
 
diff --git a/.agents/skills/nemoclaw-user-manage-sandboxes/references/lifecycle-details.md b/.agents/skills/nemoclaw-user-manage-sandboxes/references/lifecycle-details.md
deleted file mode 100644
index 38d055b82f..0000000000
--- a/.agents/skills/nemoclaw-user-manage-sandboxes/references/lifecycle-details.md
+++ /dev/null
@@ -1,26 +0,0 @@
-<!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
-<!-- SPDX-License-Identifier: Apache-2.0 -->
-# Manage Sandbox Lifecycle: Details
-
-## What Changes During a Rebuild
-
-Each rebuild destroys the existing container and creates a new one.
-NemoClaw protects your data through the same backup-and-restore flow as `nemoclaw <name> rebuild` (use the `nemoclaw-user-reference` skill):
-
-- NemoClaw preserves manifest-defined workspace state. Before deleting the old container, NemoClaw snapshots the state directories and durable state files defined in the agent manifest, typically `/sandbox/.openclaw/workspace/`; for Hermes this also includes `SOUL.md` and the SQLite database behind `.hermes/state.db`. Stored credentials (`~/.nemoclaw/credentials.json`) and registered policy presets live on the host and are re-applied to the new sandbox automatically.
-- NemoClaw does not preserve runtime changes outside the workspace state directories. This includes packages installed inside the running container with `apt` or `pip`, files in non-workspace paths, and in-memory or process state. If you have customized the running container at runtime, capture that as `Dockerfile` changes for `nemoclaw onboard --from` or a manual `openshell sandbox download` before the rebuild starts.
-
-Aborts before the destroy step are non-destructive.
-The flow refuses to proceed past preflight if a credential is missing or past backup if required manifest-defined state cannot be copied, so a failed run leaves the original sandbox intact and ready to retry.
-When a backup command reports partial archive output, NemoClaw keeps the usable entries and reports only the manifest-defined paths that could not be archived.
-
-See [Backup and Restore](backup-restore.md) for the full list of state-preservation guarantees, snapshot retention, and instructions for manual backups when the auto-flow is not enough.
-
-**If the rebuild aborts with `Missing credential: <KEY>`:**
-
-The rebuild preflight reads the provider credential recorded by your last `nemoclaw onboard` session.
-If you have switched providers since onboarding, for example from a remote API to a local Ollama setup, the preflight may still reference the old key and fail before any destroy step runs.
-
-To recover, re-run `nemoclaw onboard` and select your current provider.
-This refreshes the session metadata.
-Your existing container keeps serving traffic until the new image is ready.
diff --git a/.agents/skills/nemoclaw-user-overview/SKILL.md b/.agents/skills/nemoclaw-user-overview/SKILL.md
index 41250f65ec..01ed184ec2 100644
--- a/.agents/skills/nemoclaw-user-overview/SKILL.md
+++ b/.agents/skills/nemoclaw-user-overview/SKILL.md
@@ -1,17 +1,17 @@
 ---
 name: "nemoclaw-user-overview"
-description: "Explains how OpenClaw, OpenShell, and NemoClaw form the ecosystem, NemoClaw's position in the stack, what NemoClaw adds beyond the community sandbox, and when to prefer NemoClaw versus integrating OpenShell and OpenClaw directly. Use when users ask about the relationship between OpenClaw, OpenShell, and NemoClaw, or when to use NemoClaw versus OpenShell. Trigger keywords - nemoclaw ecosystem, openclaw openshell, nemoclaw vs openshell, sandboxed openclaw, how nemoclaw works, nemoclaw sandbox lifecycle blueprint, nemoclaw overview, openclaw always-on assistants, nvidia openshell, nvidia nemotron, nemoclaw release notes, nemoclaw changelog."
+description: "Explains what NemoClaw covers: onboarding, lifecycle management, and OpenClaw operations within OpenShell containers, plus capabilities and why it exists. Use when users ask what NemoClaw is or what the project provides. For ecosystem placement or OpenShell-only paths, use the Ecosystem page; for internal mechanics, use How It Works. Trigger keywords - nemoclaw overview, openclaw always-on assistants, nvidia openshell, nvidia nemotron, nemoclaw ecosystem, openclaw openshell, nemoclaw vs openshell, sandboxed openclaw, how nemoclaw works, nemoclaw sandbox lifecycle blueprint, nemoclaw release notes, nemoclaw changelog."
 license: "Apache-2.0"
 ---
 
 <!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
 <!-- SPDX-License-Identifier: Apache-2.0 -->
 
-# Ecosystem
+# NemoClaw User Overview
 
 ## References
 
+- **Load [references/overview.md](references/overview.md)** when users ask what NemoClaw is or what the project provides. For ecosystem placement or OpenShell-only paths, use the Ecosystem page; for internal mechanics, use How It Works. Explains what NemoClaw covers: onboarding, lifecycle management, and OpenClaw operations within OpenShell containers, plus capabilities and why it exists.
 - **Load [references/ecosystem.md](references/ecosystem.md)** when users ask about the relationship between OpenClaw, OpenShell, and NemoClaw, or when to use NemoClaw versus OpenShell. Explains how OpenClaw, OpenShell, and NemoClaw form the ecosystem, NemoClaw's position in the stack, what NemoClaw adds beyond the community sandbox, and when to prefer NemoClaw versus integrating OpenShell and OpenClaw directly.
 - **Load [references/how-it-works.md](references/how-it-works.md)** for sandbox lifecycle and architecture mechanics; not for product definition (Overview) or multi-project placement (Ecosystem). Describes how NemoClaw works internally: CLI, plugin, blueprint runner, OpenShell orchestration, inference routing, and protection layers.
-- **Load [references/overview.md](references/overview.md)** when users ask what NemoClaw is or what the project provides. For ecosystem placement or OpenShell-only paths, use the Ecosystem page; for internal mechanics, use How It Works. Explains what NemoClaw covers: onboarding, lifecycle management, and OpenClaw operations within OpenShell containers, plus capabilities and why it exists.
 - **Load [references/release-notes.md](references/release-notes.md)** when users ask about recent changes, the release cadence, or where to track versioned assets on GitHub. Includes the NemoClaw release notes.
diff --git a/.agents/skills/nemoclaw-user-reference/SKILL.md b/.agents/skills/nemoclaw-user-reference/SKILL.md
index 020f52f5df..c1791c9e14 100644
--- a/.agents/skills/nemoclaw-user-reference/SKILL.md
+++ b/.agents/skills/nemoclaw-user-reference/SKILL.md
@@ -7,7 +7,7 @@ license: "Apache-2.0"
 <!-- SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. -->
 <!-- SPDX-License-Identifier: Apache-2.0 -->
 
-# Architecture Details
+# NemoClaw User Reference
 
 ## References
 
diff --git a/docs/CONTRIBUTING.md b/docs/CONTRIBUTING.md
index 2c7836d7d4..f6a5673958 100644
--- a/docs/CONTRIBUTING.md
+++ b/docs/CONTRIBUTING.md
@@ -47,10 +47,10 @@ The current generated skills and their source pages are:
 |---|---|
 | `nemoclaw-user-overview` | `docs/about/overview.mdx`, `docs/about/ecosystem.mdx`, `docs/about/how-it-works.mdx`, `docs/about/release-notes.mdx` |
 | `nemoclaw-user-agent-skills` | `docs/resources/agent-skills.mdx` |
-| `nemoclaw-user-deploy-remote` | `docs/deployment/deploy-to-remote-gpu.mdx`, `docs/deployment/install-openclaw-plugins.mdx`, `docs/deployment/sandbox-hardening.mdx` |
+| `nemoclaw-user-deploy-remote` | `docs/deployment/deploy-to-remote-gpu.mdx`, `docs/deployment/brev-web-ui.mdx`, `docs/deployment/install-openclaw-plugins.mdx`, `docs/deployment/sandbox-hardening.mdx` |
 | `nemoclaw-user-get-started` | `docs/get-started/prerequisites.mdx`, `docs/get-started/quickstart.mdx`, `docs/get-started/quickstart-hermes.mdx`, `docs/get-started/windows-preparation.mdx` |
-| `nemoclaw-user-configure-inference` | `docs/inference/inference-options.mdx`, `docs/inference/use-local-inference.mdx`, `docs/inference/switch-inference-providers.mdx`, `docs/inference/set-up-sub-agent.mdx` |
-| `nemoclaw-user-manage-sandboxes` | `docs/manage-sandboxes/lifecycle.mdx`, `docs/manage-sandboxes/messaging-channels.mdx`, `docs/manage-sandboxes/workspace-files.mdx`, `docs/manage-sandboxes/backup-restore.mdx` |
+| `nemoclaw-user-configure-inference` | `docs/inference/inference-options.mdx`, `docs/inference/use-local-inference.mdx`, `docs/inference/switch-inference-providers.mdx`, `docs/inference/set-up-sub-agent.mdx`, `docs/inference/tool-calling-reliability.mdx` |
+| `nemoclaw-user-manage-sandboxes` | `docs/manage-sandboxes/lifecycle.mdx`, `docs/manage-sandboxes/runtime-controls.mdx`, `docs/manage-sandboxes/messaging-channels.mdx`, `docs/manage-sandboxes/workspace-files.mdx`, `docs/manage-sandboxes/backup-restore.mdx` |
 | `nemoclaw-user-monitor-sandbox` | `docs/monitoring/monitor-sandbox-activity.mdx` |
 | `nemoclaw-user-manage-policy` | `docs/network-policy/customize-network-policy.mdx`, `docs/network-policy/integration-policy-examples.mdx`, `docs/network-policy/approve-network-requests.mdx` |
 | `nemoclaw-user-reference` | `docs/reference/architecture.mdx`, `docs/reference/commands.mdx`, `docs/reference/cli-selection-guide.mdx`, `docs/reference/network-policies.mdx`, `docs/reference/troubleshooting.mdx` |
@@ -87,16 +87,19 @@ Other useful flags:
 
 | Flag | Purpose |
 |------|---------|
-| `--strategy <name>` | Grouping strategy: `smart` (default), `grouped`, or `individual`. |
+| `--strategy <name>` | Grouping strategy: `grouped` (default) or `individual`. |
 | `--doc-platform <name>` | Source format: `fern-mdx` for migrated Fern pages or `myst-md` for legacy Markdown. |
 | `--name-map CAT=NAME` | Override a generated skill name (e.g. `--name-map about=overview`). |
 | `--exclude <file>` | Skip specific files (e.g. `--exclude "release-notes.mdx"`). |
 
 ### How the Script Works
 
-The script reads YAML frontmatter from each doc page to determine its content type (`how_to`, `concept`, `reference`, `get_started`), then groups pages into skills using the `smart` strategy by default.
-Within each group, the procedure page (`how_to`, `get_started`, or `tutorial`) with the lowest `skill.priority` becomes the main body of the skill.
-Sibling procedure pages, concept pages, and reference pages go into a `references/` subdirectory for progressive disclosure, keeping `SKILL.md` concise while preserving access to the full docs.
+The script reads YAML frontmatter from each doc page to determine its content type (`how_to`, `concept`, `reference`, `get_started`), then groups pages into skills using the `grouped` strategy by default.
+Within each directory group, the procedure page (`how_to`, `get_started`, or `tutorial`) with the lowest `skill.priority` value (the highest priority) becomes the full body of `SKILL.md`.
+Sibling pages are written unchanged to `references/`.
+Groups with no procedure page keep every sibling in `references/` only.
+
+Use `--strategy individual` to emit one skill per `how_to`, `get_started`, or `tutorial` page, collect `concept` pages into `nemoclaw-user-concept`, and collect `reference` pages (and other non-procedure types) into `nemoclaw-user-reference`.
 
 Cross-references between doc pages are rewritten as skill-to-skill pointers so agents can navigate between skills.
 Fern MDX components and MyST/Sphinx directives are converted to standard markdown.
diff --git a/docs/about/overview.mdx b/docs/about/overview.mdx
index 1b7c9a8e0f..e387c3dee1 100644
--- a/docs/about/overview.mdx
+++ b/docs/about/overview.mdx
@@ -8,6 +8,8 @@ description-agent: "Explains what NemoClaw covers: onboarding, lifecycle managem
 keywords: ["nemoclaw overview", "openclaw always-on assistants", "nvidia openshell", "nvidia nemotron"]
 content:
   type: "concept"
+skill:
+  priority: 10
 ---
 NVIDIA NemoClaw is an open-source reference stack that simplifies running [OpenClaw](https://openclaw.ai) always-on assistants more safely.
 NemoClaw provides onboarding, lifecycle management, and OpenClaw operations within OpenShell containers.
diff --git a/docs/resources/agent-skills.mdx b/docs/resources/agent-skills.mdx
index 105a348d84..e701d6ff43 100644
--- a/docs/resources/agent-skills.mdx
+++ b/docs/resources/agent-skills.mdx
@@ -7,7 +7,7 @@ description: "NemoClaw ships agent skills that let AI coding assistants guide yo
 description-agent: "Describes the agent skills shipped with NemoClaw and how to access them by cloning the repository. Use when users ask about AI agent support, coding assistant integration, or the .agents/skills/ directory."
 keywords: ["nemoclaw agent skills", "ai coding assistant", "cursor", "claude code", "copilot"]
 content:
-  type: "concept"
+  type: "how_to"
 ---
 NemoClaw ships agent skills that are generated directly from this documentation.
 Each skill is a converted version of one or more doc pages, structured so AI coding assistants can consume it as context.
diff --git a/scripts/docs-to-skills.py b/scripts/docs-to-skills.py
index c8bb85fb7a..67527542fd 100755
--- a/scripts/docs-to-skills.py
+++ b/scripts/docs-to-skills.py
@@ -18,22 +18,19 @@
   1. Scans a docs directory for Markdown or Fern MDX files with YAML frontmatter.
   2. Classifies each page by content type (how_to, concept, reference,
      get_started) using the frontmatter `content.type` field.
-  3. Groups pages into skills using one of three strategies:
-       - smart (default): groups by directory; the procedure page with the
-         lowest frontmatter `skill.priority` becomes the main SKILL.md body,
-         while sibling procedure, concept, and reference pages ride along as
-         reference files.
-       - grouped: groups all pages in the same parent directory.
-       - individual: each doc page becomes its own skill.
+  3. Groups pages into skills using one of two strategies:
+       - grouped (default): groups by parent directory; the procedure page
+         (``how_to``, ``get_started``, ``tutorial``) with the lowest
+         ``skill.priority`` becomes the SKILL.md body; siblings go to
+         ``references/``. Groups with no procedure page are references-only.
+       - individual: each ``how_to``, ``get_started``, or ``tutorial`` page
+         becomes its own skill; ``concept`` pages collect into
+         ``nemoclaw-user-concept`` and ``reference`` pages (plus other
+         non-procedure types) collect into ``nemoclaw-user-reference``.
   4. Generates a skill directory per group containing:
-       - SKILL.md with frontmatter (name, description), prerequisites,
-         procedural steps for the primary procedure page, a References
-         section that links to sibling pages, and a Related Skills section.
-         Sibling procedure, concept, and reference bodies are not inlined,
-         so SKILL.md stays small and nothing is truncated mid-table or
-         mid-code-fence.
-       - references/ with the full sibling procedure, concept, and reference
-         content for progressive disclosure (loaded by the agent on demand).
+       - SKILL.md with frontmatter (name, description), the lead page body,
+         a References section linking sibling pages, and Related Skills links.
+       - references/ with full sibling page content for progressive disclosure.
   5. Resolves all relative doc paths to repo-root-relative paths, and
      converts cross-references between docs into skill-to-skill pointers
      so agents can navigate between skills.
@@ -154,6 +151,19 @@ def space_anchor_headings(text: str) -> str:
     return re.sub(r'(?m)^(<a\s+id="[^"]+"></a>)\n(#{1,6}\s)', r"\1\n\n\2", text)
 
 
+def collapse_consecutive_blank_lines(text: str) -> str:
+    """Collapse runs of blank lines to a single blank line (markdownlint MD012)."""
+    return re.sub(r"\n{3,}", "\n\n", text)
+
+
+def append_markdown_section(lines: list[str], heading: str) -> None:
+    """Append a section heading, avoiding duplicate blank lines before it."""
+    if lines and lines[-1] != "":
+        lines.append("")
+    lines.append(heading)
+    lines.append("")
+
+
 # ---------------------------------------------------------------------------
 # Frontmatter / doc parsing
 # ---------------------------------------------------------------------------
@@ -1129,6 +1139,7 @@ def _safe_truncation_point(lines: list[str], target: int) -> int:
 
 CATEGORY_NOUNS = {
     "about": "overview",
+    "concept": "concept",
     "reference": "reference",
     "get-started": "get-started",
     "root": "overview",
@@ -1434,8 +1445,10 @@ def _to_third_person(sentence: str) -> str:
     "concept": "context",
     "reference": "reference",
 }
+PROCEDURE_CONTENT_TYPES = frozenset({"how_to", "get_started", "tutorial"})
+SKIP_SKILL_SECTIONS = frozenset({"prerequisites", "before you begin", "troubleshooting"})
+RELATED_SKILL_SECTIONS = frozenset({"related topics", "next steps"})
 SKILL_FRONTMATTER_LICENSE = "Apache-2.0"
-MAX_SKILL_MD_CHARS = 11_500
 
 
 def markdown_spdx_header() -> str:
@@ -1449,37 +1462,6 @@ def markdown_spdx_header() -> str:
     )
 
 
-def split_markdown_h3_sections(content: str) -> tuple[str, list[tuple[str, str]]]:
-    """Split an H2 body into preamble plus H3 subsection blocks."""
-    preamble: list[str] = []
-    sections: list[tuple[str, str]] = []
-    current_heading: str | None = None
-    current_lines: list[str] = []
-
-    def _flush() -> None:
-        nonlocal current_heading, current_lines
-        if current_heading is None:
-            return
-        body = "\n".join(current_lines).strip()
-        sections.append((current_heading, body))
-        current_heading = None
-        current_lines = []
-
-    for line in content.split("\n"):
-        if line.startswith("### "):
-            _flush()
-            current_heading = line[4:].strip()
-            current_lines = [line]
-            continue
-        if current_heading is None:
-            preamble.append(line)
-        else:
-            current_lines.append(line)
-    _flush()
-
-    return "\n".join(preamble).strip(), sections
-
-
 _SECTION_HEADING_RE = re.compile(r"(?m)^(#{2,6})\s+(.+)$")
 
 
@@ -1522,37 +1504,67 @@ def canonicalize_leading_h1(body: str, title: str) -> str:
     return f"# {title}\n\n{body}".rstrip()
 
 
-def partition_skill_pages(
-    pages: list[DocPage],
-) -> tuple[list[DocPage], list[DocPage], list[DocPage], list[DocPage]]:
-    """Split a doc group into inline procedures and deferred references.
-
-    The converter preserves the existing one-skill-per-docs-area grouping, but
-    keeps SKILL.md focused by inlining only one primary procedure. The primary
-    procedure is the page with the lowest frontmatter ``skill.priority``;
-    additional how-to/tutorial pages still contribute triggers through the
-    skill description and are written to references/ for progressive disclosure.
+
+def partition_skill_pages(pages: list[DocPage]) -> tuple[DocPage, list[DocPage]]:
+    """Split a skill group into the lead page and reference siblings.
+
+    The lead page has the lowest ``skill.priority`` value (highest priority),
+    with ties broken by path. Used by the ``individual`` strategy.
     """
-    procedures = [
-        p for p in pages if CONTENT_TYPE_ROLE.get(p.content_type) == "procedure"
-    ]
-    # Pages without a recognized content_type default to procedure.
-    procedures.extend([p for p in pages if p.content_type not in CONTENT_TYPE_ROLE])
+    ordered = sorted(pages, key=lambda p: (p.skill_priority, str(p.path)))
+    return ordered[0], ordered[1:]
 
-    context_pages = [
-        p for p in pages if CONTENT_TYPE_ROLE.get(p.content_type) == "context"
-    ]
-    reference_pages = [
-        p for p in pages if CONTENT_TYPE_ROLE.get(p.content_type) == "reference"
-    ]
 
-    if not procedures:
-        return [], [], context_pages, reference_pages
+def partition_grouped_skill_pages(
+    pages: list[DocPage],
+) -> tuple[DocPage | None, list[DocPage]]:
+    """Split a grouped skill into an optional inline lead and reference siblings.
 
-    procedures = sorted(procedures, key=lambda p: (p.skill_priority, str(p.path)))
-    primary = [procedures[0]]
-    deferred = procedures[1:]
-    return primary, deferred, context_pages, reference_pages
+    When the group contains a procedure page (``how_to``, ``get_started``, or
+    ``tutorial``), the one with the lowest ``skill.priority`` becomes the
+    SKILL.md body and siblings go to ``references/``. Otherwise there is no
+    lead page and every page is written to ``references/`` only.
+    """
+    ordered = sorted(pages, key=lambda p: (p.skill_priority, str(p.path)))
+    candidates = [p for p in pages if p.content_type in PROCEDURE_CONTENT_TYPES]
+    if not candidates:
+        return None, ordered
+    lead = min(candidates, key=lambda p: (p.skill_priority, str(p.path)))
+    refs = [p for p in ordered if p is not lead]
+    return lead, refs
+
+
+def _append_page_sections_to_skill(
+    page: DocPage,
+    lines: list[str],
+    *,
+    clean_fn,
+    skill_md_images: list[tuple[Path, str]],
+    skill_md_local_links: dict[str, str],
+    collected_related: list[str],
+) -> None:
+    """Append a doc page body to SKILL.md lines."""
+    for heading, content in page.sections:
+        heading_lower = heading.lower()
+        if heading_lower in SKIP_SKILL_SECTIONS:
+            continue
+        if heading_lower in RELATED_SKILL_SECTIONS:
+            collected_related.append(
+                clean_fn(content, page, skill_md_images, skill_md_local_links)
+            )
+            continue
+        cleaned = clean_fn(content, page, skill_md_images, skill_md_local_links)
+        if not heading:
+            cleaned = re.sub(r"^#\s+.+(?:\n|$)", "", cleaned)
+            if cleaned.strip():
+                lines.append(cleaned.strip())
+                lines.append("")
+            continue
+        lines.append(f"## {heading}")
+        lines.append("")
+        if cleaned.strip():
+            lines.append(cleaned.strip())
+            lines.append("")
 
 
 def generate_skill(
@@ -1564,20 +1576,10 @@ def generate_skill(
     doc_to_skill: dict[str, str] | None = None,
     html_baseurl: str | None = None,
     doc_platform: str = "myst-md",
+    strategy: str = "grouped",
     dry_run: bool = False,
 ) -> dict:
-    """Generate a complete skill directory from a group of doc pages.
-
-    Writes identical output to each directory in *output_dirs*. Since
-    inter-doc links are rewritten to either skill cross-references or
-    absolute HTTPS URLs (see :func:`rewrite_doc_paths`), the emitted
-    content is independent of where it is written and can safely be
-    mirrored across multiple output roots. Image assets referenced by
-    the source pages are copied alongside the file that links them so
-    the rendered skill works without network access.
-
-    Returns a summary dict for reporting.
-    """
+    """Generate a complete skill directory from a group of doc pages."""
     skill_md_images: list[tuple[Path, str]] = []
     ref_images: dict[str, list[tuple[Path, str]]] = {}
 
@@ -1587,7 +1589,6 @@ def _clean(
         image_acc: list[tuple[Path, str]],
         local_doc_links: dict[str, str] | None = None,
     ) -> str:
-        """Apply directive cleanup and path rewriting for a source page."""
         if doc_platform == "fern-mdx":
             result = clean_fern_mdx(text)
         else:
@@ -1605,10 +1606,15 @@ def _clean(
             image_acc.extend(copies)
         return result
 
-    procedures, deferred_procedures, context_pages, reference_pages = (
-        partition_skill_pages(pages)
+    if strategy == "grouped":
+        primary_page, reference_pages = partition_grouped_skill_pages(pages)
+    else:
+        primary_page, reference_pages = partition_skill_pages(pages)
+
+    ordered_pages = sorted(pages, key=lambda p: (p.skill_priority, str(p.path)))
+    description_pages = (
+        [primary_page, *reference_pages] if primary_page is not None else ordered_pages
     )
-    ref_section_pages = deferred_procedures + context_pages + reference_pages
 
     def _page_rel(page: DocPage) -> str | None:
         if docs_dir is None:
@@ -1620,135 +1626,21 @@ def _page_rel(page: DocPage) -> str | None:
 
     skill_md_local_links: dict[str, str] = {}
     reference_local_links: dict[str, str] = {}
-    for page in ref_section_pages:
+    for page in reference_pages:
         rel = _page_rel(page)
         if rel is None:
             continue
         ref_name = page.path.stem + ".md"
         skill_md_local_links[rel] = f"references/{ref_name}"
         reference_local_links[rel] = ref_name
-    for page in procedures:
-        rel = _page_rel(page)
-        if rel is not None:
-            reference_local_links[rel] = "../SKILL.md"
+    if primary_page is not None:
+        primary_rel = _page_rel(primary_page)
+        if primary_rel is not None:
+            reference_local_links[primary_rel] = "../SKILL.md"
 
-    description_pages = (
-        procedures + deferred_procedures + context_pages + reference_pages
-        if procedures
-        else pages
-    )
     description = build_skill_description(name, description_pages)
-    generated_ref_sections: dict[str, list[str]] = {}
-    generated_ref_topics: dict[str, list[str]] = {}
-    generated_ref_images: dict[str, list[tuple[Path, str]]] = {}
-    generated_ref_names: dict[Path, str] = {}
-    reserved_ref_names = {page.path.stem + ".md" for page in ref_section_pages}
-
-    def _unique_generated_ref_name(page: DocPage) -> str:
-        cached = generated_ref_names.get(page.path)
-        if cached:
-            return cached
-        base = re.sub(r"[^a-z0-9-]", "-", page.path.stem.lower()).strip("-")
-        candidate = f"{base}-details.md"
-        suffix = 2
-        while candidate in reserved_ref_names:
-            candidate = f"{base}-details-{suffix}.md"
-            suffix += 1
-        reserved_ref_names.add(candidate)
-        generated_ref_names[page.path] = candidate
-        return candidate
-
-    def _defer_detail(
-        page: DocPage,
-        section_heading: str,
-        content: str,
-        topic: str | None = None,
-    ) -> str:
-        """Store overflow procedure detail in a generated reference file."""
-        ref_name = _unique_generated_ref_name(page)
-        if ref_name not in generated_ref_sections:
-            title = page.title or _brand_case(page.path.stem.replace("-", " ").title())
-            generated_ref_sections[ref_name] = [f"# {title}: Details"]
-            generated_ref_topics[ref_name] = []
-            generated_ref_images[ref_name] = []
-        block = content.strip()
-        if not block:
-            return ref_name
-        # Overflow content was cleaned for SKILL.md first, where same-skill
-        # reference targets live under references/. Once moved into a generated
-        # reference file, those targets are siblings.
-        block = re.sub(r"\]\(references/([^)]+)\)", r"](\1)", block)
-        if not block.startswith("#"):
-            block = f"## {section_heading}\n\n{block}"
-        generated_ref_sections[ref_name].append(block)
-        topic_text = topic or section_heading
-        if topic_text and topic_text not in generated_ref_topics[ref_name]:
-            generated_ref_topics[ref_name].append(topic_text)
-        return ref_name
-
-    def _current_skill_size() -> int:
-        return len("\n".join(lines))
-
-    def _append_section_or_defer(
-        page: DocPage,
-        heading: str,
-        cleaned_content: str,
-    ) -> None:
-        """Append a procedure section, moving overflow detail to references."""
-        section_lines = [f"## {heading}", "", cleaned_content, ""]
-        if (
-            _current_skill_size() + len("\n".join(section_lines))
-            <= MAX_SKILL_MD_CHARS
-        ):
-            lines.extend(section_lines)
-            return
-
-        preamble, subsections = split_markdown_h3_sections(cleaned_content)
-        if not subsections:
-            ref_name = _defer_detail(page, heading, cleaned_content)
-            lines.extend(
-                [
-                    f"## {heading}",
-                    "",
-                    f"Load [references/{ref_name}](references/{ref_name}) for detailed steps.",
-                    "",
-                ]
-            )
-            return
-
-        lines.append(f"## {heading}")
-        lines.append("")
-        if preamble:
-            lines.append(preamble)
-            lines.append("")
-
-        deferred_topics: list[str] = []
-        for subheading, block in subsections:
-            block_lines = [block, ""]
-            if (
-                _current_skill_size() + len("\n".join(block_lines))
-                <= MAX_SKILL_MD_CHARS
-            ):
-                lines.extend(block_lines)
-                continue
-            ref_name = _defer_detail(page, heading, block, topic=subheading)
-            if subheading not in deferred_topics:
-                deferred_topics.append(subheading)
-
-        if deferred_topics:
-            ref_name = _unique_generated_ref_name(page)
-            topic_text = ", ".join(deferred_topics[:3])
-            if len(deferred_topics) > 3:
-                topic_text += ", and related details"
-            lines.append(
-                f"Load [references/{ref_name}](references/{ref_name}) for detailed steps on {topic_text}."
-            )
-            lines.append("")
-
-    # Build SKILL.md content
     lines: list[str] = []
 
-    # Frontmatter
     lines.append("---")
     lines.append(f"name: {yaml_scalar(name)}")
     lines.append(f"description: {yaml_scalar(description)}")
@@ -1758,124 +1650,87 @@ def _append_section_or_defer(
     lines.append(markdown_spdx_header().rstrip("\n"))
     lines.append("")
 
-    # Title — prefer the lead page's frontmatter `title.page` (or H1)
-    # verbatim so the SKILL.md heading matches the source doc instead of
-    # echoing the auto-generated, prefix-laden skill name.
-    lead_page = procedures[0] if procedures else pages[0] if pages else None
-    if lead_page and lead_page.title:
-        skill_title = lead_page.title
-    else:
-        skill_title = _brand_case(name.replace("-", " ").title())
+    skill_title = (
+        primary_page.title
+        if primary_page is not None and primary_page.title
+        else _brand_case(name.replace("-", " ").title())
+    )
     lines.append(f"# {skill_title}")
-    lines.append("")
-
-    # Gotchas — surface :::{warning} admonitions from the source procedure
-    # pages at the top so the agent sees non-obvious corrections before it
-    # commits to a path through the steps. The warnings stay in place
-    # inline; this section is a directed summary, not a replacement.
-    gotchas = _extract_gotchas(procedures, doc_platform=doc_platform)
-    if gotchas:
-        lines.append("## Gotchas")
-        lines.append("")
-        for g in gotchas:
-            lines.append(g)
+    if primary_page is not None:
         lines.append("")
 
-    # Prerequisites (merged from all procedure pages, deduplicated)
-    prereq_items: list[str] = []
-    seen_prereqs: set[str] = set()
-    for pp in procedures:
-        for heading, content in pp.sections:
-            if heading.lower() in ("prerequisites", "before you begin"):
-                cleaned = _clean(
-                    content, pp, skill_md_images, skill_md_local_links
-                )
-                for item_line in cleaned.split("\n"):
-                    stripped = item_line.strip()
-                    if stripped.startswith("- "):
-                        if prereq_items and not prereq_items[-1].startswith("- "):
-                            prereq_items.append("")
-                        norm = stripped.lower().strip("- .")
-                        if norm not in seen_prereqs:
-                            seen_prereqs.add(norm)
-                            prereq_items.append(stripped)
-                    elif stripped and not prereq_items:
-                        prereq_items.append(stripped)
-
-    if prereq_items:
-        lines.append("## Prerequisites")
-        lines.append("")
-        for item in prereq_items:
-            lines.append(item)
-        lines.append("")
-
-    # Procedural sections from how_to and get_started pages
-    skip_sections = {"prerequisites", "before you begin", "troubleshooting"}
-    related_sections = {"related topics", "next steps"}
-    collected_related: list[str] = []  # raw content from related sections
-    for idx, pp in enumerate(procedures):
-        # When merging multiple docs, add a transition heading
-        if len(procedures) > 1 and idx > 0 and pp.title:
-            lines.append("---")
+    if primary_page is not None:
+        gotchas = _extract_gotchas([primary_page], doc_platform=doc_platform)
+        if gotchas:
+            lines.append("## Gotchas")
+            lines.append("")
+            for gotcha in gotchas:
+                lines.append(gotcha)
             lines.append("")
 
-        for heading, content in pp.sections:
-            if heading.lower() in skip_sections:
-                continue
-            if heading.lower() in related_sections:
-                collected_related.append(
-                    _clean(content, pp, skill_md_images, skill_md_local_links)
-                )
-                continue
-            if not heading:
-                cleaned = _clean(content, pp, skill_md_images, skill_md_local_links)
-                cleaned = re.sub(r"^#\s+.+\n+", "", cleaned)
-                if cleaned.strip():
-                    lines.append(cleaned)
-                    lines.append("")
+        prereq_items: list[str] = []
+        seen_prereqs: set[str] = set()
+        for heading, content in primary_page.sections:
+            if heading.lower() not in ("prerequisites", "before you begin"):
                 continue
+            cleaned = _clean(content, primary_page, skill_md_images, skill_md_local_links)
+            for item_line in cleaned.split("\n"):
+                stripped = item_line.strip()
+                if stripped.startswith("- "):
+                    if prereq_items and not prereq_items[-1].startswith("- "):
+                        prereq_items.append("")
+                    norm = stripped.lower().strip("- .")
+                    if norm not in seen_prereqs:
+                        seen_prereqs.add(norm)
+                        prereq_items.append(stripped)
+                elif stripped and not prereq_items:
+                    prereq_items.append(stripped)
 
-            cleaned_content = _clean(
-                content, pp, skill_md_images, skill_md_local_links
-            )
-            _append_section_or_defer(pp, heading, cleaned_content)
+        if prereq_items:
+            lines.append("## Prerequisites")
+            lines.append("")
+            for item in prereq_items:
+                lines.append(item)
+            lines.append("")
 
-    # Build Related Skills from collected sections + any remaining in body
-    raw_md = "\n".join(lines)
-    raw_md, body_related = extract_related_skills(raw_md)
-    lines = raw_md.rstrip("\n").split("\n")
+        collected_related: list[str] = []
+        _append_page_sections_to_skill(
+            primary_page,
+            lines,
+            clean_fn=_clean,
+            skill_md_images=skill_md_images,
+            skill_md_local_links=skill_md_local_links,
+            collected_related=collected_related,
+        )
 
-    # Also extract from the collected_related content
-    all_related_text = "\n".join(
-        f"## Related Topics\n\n{block}" for block in collected_related
-    )
-    _, section_related = extract_related_skills(all_related_text)
+        raw_md = "\n".join(lines)
+        raw_md, body_related = extract_related_skills(raw_md)
+        lines = raw_md.rstrip("\n").split("\n")
 
-    # Merge and deduplicate
-    seen_skills: set[str] = set()
-    merged_entries: list[str] = []
-    for entry in section_related + body_related:
-        skill_match = re.search(r"`([a-z0-9-]+)`", entry)
-        key = skill_match.group(1) if skill_match else entry
-        if key == name:
-            continue  # skip self-references
-        if key not in seen_skills:
-            seen_skills.add(key)
-            merged_entries.append(entry)
-
-    # References section — point at the full concept/reference files that
-    # ship alongside SKILL.md. Each bullet leads with the activation
-    # trigger from description.agent (the "Use when ..." clause) so the
-    # agent can decide on-sight whether to load the file, which is how
-    # progressive disclosure is supposed to work.
-    if ref_section_pages or generated_ref_topics:
-        lines.append("")
-        lines.append("## References")
-        lines.append("")
-        for rp in ref_section_pages:
-            ref_name = rp.path.stem + ".md"
+        all_related_text = "\n".join(
+            f"## Related Topics\n\n{block}" for block in collected_related
+        )
+        _, section_related = extract_related_skills(all_related_text)
+
+        seen_skills: set[str] = set()
+        merged_entries: list[str] = []
+        for entry in section_related + body_related:
+            skill_match = re.search(r"`([a-z0-9-]+)`", entry)
+            key = skill_match.group(1) if skill_match else entry
+            if key == name:
+                continue
+            if key not in seen_skills:
+                seen_skills.add(key)
+                merged_entries.append(entry)
+    else:
+        merged_entries = []
+
+    if reference_pages:
+        append_markdown_section(lines, "## References")
+        for ref_page in reference_pages:
+            ref_name = ref_page.path.stem + ".md"
             file_link = f"[references/{ref_name}](references/{ref_name})"
-            covers, trigger = _split_description_trigger(rp.description or "")
+            covers, trigger = _split_description_trigger(ref_page.description or "")
             if trigger:
                 bullet = f"- **Load {file_link}** {trigger}."
                 if covers:
@@ -1885,47 +1740,31 @@ def _append_section_or_defer(
             else:
                 bullet = f"- {file_link}"
             lines.append(bullet)
-        for ref_name, topics in generated_ref_topics.items():
-            file_link = f"[references/{ref_name}](references/{ref_name})"
-            topic_text = ", ".join(topics[:3])
-            if len(topics) > 3:
-                topic_text += ", and related details"
-            lines.append(
-                f"- **Load {file_link}** when you need detailed steps for {topic_text}."
-            )
 
     if merged_entries:
-        lines.append("")
-        lines.append("## Related Skills")
-        lines.append("")
+        append_markdown_section(lines, "## Related Skills")
         for entry in merged_entries:
             lines.append(entry)
         lines.append("")
 
-    skill_md = normalize_heading_levels("\n".join(lines))
+    skill_md = collapse_consecutive_blank_lines(
+        normalize_heading_levels("\n".join(lines))
+    )
 
-    # --- Build reference files ---
     ref_files: dict[str, str] = {}
-    for ref_name, sections in generated_ref_sections.items():
-        body = "\n\n".join(sections)
-        body = normalize_heading_levels(dedupe_repeated_heading_sections(body))
-        ref_files[ref_name] = body
-        ref_images[ref_name] = generated_ref_images.get(ref_name, [])
-
-    for rp in deferred_procedures + reference_pages + context_pages:
-        ref_name = rp.path.stem + ".md"
+    for ref_page in reference_pages:
+        ref_name = ref_page.path.stem + ".md"
         ref_image_acc: list[tuple[Path, str]] = []
-        body = _clean(rp.body, rp, ref_image_acc, reference_local_links)
-        if doc_platform == "myst-md" and rp.title:
-            body = canonicalize_leading_h1(body, rp.title)
-        elif doc_platform == "fern-mdx" and rp.title and not body.startswith("# "):
-            body = f"# {rp.title}\n\n{body}".rstrip()
+        body = _clean(ref_page.body, ref_page, ref_image_acc, reference_local_links)
+        if doc_platform == "myst-md" and ref_page.title:
+            body = canonicalize_leading_h1(body, ref_page.title)
+        elif doc_platform == "fern-mdx" and ref_page.title and not body.startswith("# "):
+            body = f"# {ref_page.title}\n\n{body}".rstrip()
         body = normalize_heading_levels(body)
         body = dedupe_repeated_heading_sections(body)
         ref_files[ref_name] = body
         ref_images[ref_name] = ref_image_acc
 
-    # --- Write output ---
     summary = {
         "name": name,
         "dirs": [str(d / name) for d in output_dirs],
@@ -1947,16 +1786,19 @@ def _append_section_or_defer(
         _copy_skill_images(skill_dir, skill_md_images)
 
         spdx_ref = markdown_spdx_header()
-
-
+        refs_dir = skill_dir / "references"
         if ref_files:
-            refs_dir = skill_dir / "references"
             refs_dir.mkdir(exist_ok=True)
+            for existing in refs_dir.glob("*.md"):
+                if existing.name not in ref_files:
+                    existing.unlink()
             for fname, content in ref_files.items():
                 (refs_dir / fname).write_text(
                     spdx_ref + content.rstrip("\n") + "\n", encoding="utf-8"
                 )
                 _copy_skill_images(refs_dir, ref_images.get(fname, []))
+        elif refs_dir.is_dir():
+            shutil.rmtree(refs_dir)
 
     return summary
 
@@ -2003,36 +1845,27 @@ def group_by_directory(pages: list[DocPage]) -> dict[str, list[DocPage]]:
 
 
 def group_individual(pages: list[DocPage]) -> dict[str, list[DocPage]]:
-    """Each page becomes its own skill."""
-    return {page.path.stem: [page] for page in pages}
-
-
-def group_by_content_type(pages: list[DocPage]) -> dict[str, list[DocPage]]:
-    """Group pages by directory when an area has procedural content."""
-    # First pass: group by directory
-    dir_groups = group_by_directory(pages)
-
-    # Second pass: keep each procedural docs area together. generate_skill()
-    # decides which page to inline and which sibling pages to defer.
-    result: dict[str, list[DocPage]] = {}
-    for cat, group_pages in dir_groups.items():
-        has_procedures = any(
-            CONTENT_TYPE_ROLE.get(p.content_type) == "procedure" for p in group_pages
-        )
-        if has_procedures or len(group_pages) > 1:
-            result[cat] = group_pages
+    """Give each procedure page its own skill; bucket concept and reference pages."""
+    groups: dict[str, list[DocPage]] = {}
+    concept_pages: list[DocPage] = []
+    reference_pages: list[DocPage] = []
+    for page in pages:
+        if page.content_type in PROCEDURE_CONTENT_TYPES:
+            groups[page.path.stem] = [page]
+        elif page.content_type == "concept":
+            concept_pages.append(page)
         else:
-            # Individual concept/reference pages become their own skill
-            for p in group_pages:
-                result[p.path.stem] = [p]
-
-    return result
+            reference_pages.append(page)
+    if concept_pages:
+        groups["concept"] = concept_pages
+    if reference_pages:
+        groups["reference"] = reference_pages
+    return groups
 
 
 STRATEGIES = {
     "grouped": group_by_directory,
     "individual": group_individual,
-    "smart": group_by_content_type,
 }
 
 
@@ -2105,10 +1938,10 @@ def main():
         formatter_class=argparse.RawDescriptionHelpFormatter,
         epilog=textwrap.dedent("""\
             Strategies:
-              grouped     Group docs by parent directory
-              individual  Each doc page becomes its own skill
-              smart       Group by directory, inline the lowest-priority procedure,
-                          defer siblings
+              grouped     Group docs by parent directory (default)
+              individual  One skill per how_to/get_started/tutorial page;
+                          concept pages -> nemoclaw-user-concept;
+                          reference pages -> nemoclaw-user-reference
 
             Examples:
               %(prog)s docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx
@@ -2129,8 +1962,8 @@ def main():
     parser.add_argument(
         "--strategy",
         choices=list(STRATEGIES.keys()),
-        default="smart",
-        help="Grouping strategy (default: smart)",
+        default="grouped",
+        help="Grouping strategy (default: grouped)",
     )
     parser.add_argument(
         "--doc-platform",
@@ -2270,6 +2103,7 @@ def main():
             doc_to_skill=doc_to_skill,
             html_baseurl=html_baseurl,
             doc_platform=args.doc_platform,
+            strategy=args.strategy,
             dry_run=args.dry_run,
         )
         summaries.append(summary)