docs: add DGX local inference walkthrough (Fixes #3231) #4337

Open
deepujain wants to merge 1 commit into NVIDIA:main from deepujain:docs/3231-dgx-local-inference

Conversation

deepujain (Contributor) commented May 27, 2026

Summary

Adds a single DGX Spark and DGX Station local-inference walkthrough so users do not have to stitch together host prep, provider selection, vLLM/Ollama setup, verification, and Spark-specific troubleshooting from several pages.

Fixes #3231.

Changes

  • Added docs/inference/dgx-spark-station-local-inference.mdx with GPU/CDI checks, provider choice guidance, managed vLLM commands, verification steps, and common DGX fixes.
  • Linked the walkthrough from prerequisites, local inference, troubleshooting, and docs navigation.
  • Regenerated the relevant generated user-skill references for get-started, inference, and troubleshooting.
  • Added a docs copy-paste test for the new page.

Testing

  • python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx
  • npm run build:cli
  • npm run typecheck:cli
  • npm run docs
  • npx vitest run test/dgx-local-inference-doc-copy.test.ts

Evidence it works

  • Fern docs validation completed with 0 errors.
  • The generated nemoclaw-user-configure-inference skill now points DGX Spark and DGX Station questions to the new walkthrough.
  • The docs copy-paste test confirms the walkthrough's shell examples are copyable bash blocks without prompt prefixes.

Summary by CodeRabbit

  • Documentation

    • Added a comprehensive DGX Spark/DGX Station local inference guide, updated platform/prerequisite entries and navigation links, and revised related troubleshooting and onboarding docs.
    • Expanded docs for model router configuration, shields/seal behavior, CLI commands (onboard/status/channels/rebuild/uninstall), and messaging/Telegram guidance.
    • Removed Hermes "Experimental" warning from quickstart.
  • Tests

    • Added a test ensuring the DGX local inference doc's fenced code blocks contain only bash and no interactive prompts.

Signed-off-by: Deepak Jain <deepujain@gmail.com>

copy-pr-bot Bot commented May 27, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai Bot (Contributor) commented May 27, 2026

Caution

Review failed

Failed to post review comments

📝 Walkthrough

Adds a new end-to-end DGX Spark/DGX Station local inference guide, integrates it into site navigation and related skill docs, expands several CLI/troubleshooting/security docs, and adds a Vitest check validating code-block formatting in the new guide.

Changes

DGX Spark and DGX Station Local Inference Documentation

| Layer / File(s) | Summary |
|---|---|
| **Core local inference setup documentation**<br>`docs/inference/dgx-spark-station-local-inference.mdx`, `.agents/skills/nemoclaw-user-configure-inference/references/dgx-spark-station-local-inference.md`, `.agents/skills/nemoclaw-user-configure-inference/SKILL.md` | New comprehensive guide covering host prerequisites (Docker, Node/npm, NVIDIA driver/toolkit, optional CDI spec generation), choice of managed vLLM vs Ollama, interactive and non-interactive onboarding (`nemoclaw onboard` / `NEMOCLAW_PROVIDER` + `NEMOCLAW_VLLM_MODEL`), verification (status, doctor, TUI), tool-calling notes, and DGX-specific troubleshooting (CoreDNS CrashLoop, k3s readiness timeouts, CDI GPU errors, port 3000 conflicts). |
| **Navigation and cross-reference integration**<br>`docs/index.yml`, `docs/get-started/prerequisites.mdx`, `docs/inference/use-local-inference.mdx`, `docs/reference/troubleshooting.mdx`, `.agents/skills/nemoclaw-user-get-started/references/prerequisites.md`, `.agents/skills/nemoclaw-user-configure-inference/SKILL.md` | Adds "DGX Local Inference" nav entry and updates prerequisites/troubleshooting/use-local-inference pages and skill references to point to the new walkthrough instead of the prior external Ollama playbook. |
| **CLI commands and lifecycle docs**<br>`.agents/skills/nemoclaw-user-reference/references/commands.md` | Expanded command reference: GPU detection and passthrough docs, `status --json` fields/exit semantics, `channels status` diagnostics, rebuild GPU-mode preservation, uninstall user-data preservation, and lifecycle env flags. |
| **Security: shields & uninstall**<br>`.agents/skills/nemoclaw-user-configure-security/references/best-practices.md`, `.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md` | Documents `shields up` SHA-256 sealing behavior, legacy-baseline opt-in (`NEMOCLAW_SHIELDS_ACCEPT_LEGACY_BASELINE`), and clarifies uninstall-preserved data and non-interactive destroy opt-in. |
| **Messaging and troubleshooting updates**<br>`.agents/skills/nemoclaw-user-manage-sandboxes/references/messaging-channels.md`, `.agents/skills/nemoclaw-user-reference/references/troubleshooting.md` | Clarifies `TELEGRAM_ALLOWED_IDS` and aliases, adds repeatable e2e DM testing steps, and updates DGX troubleshooting to point to the new DGX local-inference guide and replace the gateway-destroy flow with gateway-remove + pkill + `--resume`. |
| **Overview & skill wording tweaks**<br>`.agents/skills/nemoclaw-user-overview/*`, `.agents/skills/nemoclaw-user-get-started/*`, `.agents/skills/nemoclaw-user-manage-sandboxes/*` | Minor front-matter and reference wording changes (OpenClaw → NemoClaw, quickstart bullets, default sandbox name note for DGX express-install), and removal of a Hermes "Experimental" banner. |
| **Documentation code block validation test**<br>`test/dgx-local-inference-doc-copy.test.ts` | Vitest test ensures fenced code blocks in the new DGX local-inference doc contain no interactive shell prompts and are only labeled `bash`. |

Estimated code review effort:
🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#4460: Updates and tests for shields up SHA-256 sealing behavior and legacy-baseline acceptance that overlap with the shields/uninstall docs added here.
  • NVIDIA/NemoClaw#3151: Prior inference-provider and router documentation changes that touch inference-options.md and relate to router pool configuration described in this PR.

Suggested labels

Getting Started, enhancement: skill

Suggested reviewers

  • miyoungc
  • cv
  • ericksoa

🐰 A DGX guide at last, no longer adrift,
vLLM or Ollama — pick the right lift.
CDI, timeouts, ports, and CoreDNS fight,
One walkthrough to follow, from preflight to night.
Pasteable commands, and tests keep it tight.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'docs: add DGX local inference walkthrough (Fixes #3231)' directly and clearly describes the primary change: adding comprehensive DGX local inference documentation that addresses the linked issue. |
| Linked Issues check | ✅ Passed | The PR comprehensively addresses all coding/documentation objectives from #3231: adds end-to-end DGX walkthrough with pre-flight checks, provider selection, onboarding steps, verification commands, inline failure modes, and self-contained documentation instead of external redirects. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to the DGX local inference documentation objective. Modifications to prerequisite references, troubleshooting links, skill documentation, and supporting files are all aligned with consolidating and improving local inference setup guidance. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.agents/skills/nemoclaw-user-reference/references/troubleshooting.md:
- Around line 1138-1139: This change edits an autogenerated skill reference and
must be reverted; do not modify generated markdown directly. Revert the manual
edit in the autogenerated file and instead update the canonical docs/source that
generate the skill (the source used to produce the
nemoclaw-user-configure-inference skill reference), then re-run the
documentation/skill generation pipeline so the corrected text is emitted into
the generated nemoclaw-user-*/*.md outputs; ensure CI/linting for autogenerated
skills passes before merging.

In `@docs/inference/dgx-spark-station-local-inference.mdx`:
- Around line 1-12: The frontmatter in the new page is missing required fields
and the SPDX header is incorrectly placed inside the frontmatter; update the
frontmatter to include title, description, keywords, topics, tags, content.type,
difficulty, audience, and status (use the existing title/description/keywords
and add appropriate topics/tags/difficulty/audience/status values), move the
SPDX lines so they appear immediately after the frontmatter block (not inside
it), and ensure the document body contains an H1 that exactly matches the
frontmatter title (i.e., add or replace the top-level heading to match "Set Up
DGX Spark or DGX Station Local Inference").
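
The frontmatter finding above can be checked mechanically. The following is a minimal sketch, not the repository's actual docs linter: `REQUIRED_KEYS`, `frontmatterKeys`, `missingFrontmatterKeys`, and `h1MatchesTitle` are hypothetical names, and it assumes that `content.type` is written as a `type:` key nested under `content:`.

```typescript
// Keys named in the review comment. Treating "content.type" as a nested
// key is an assumption; the real schema may differ.
const REQUIRED_KEYS: string[] = [
  "title", "description", "keywords", "topics", "tags",
  "content.type", "difficulty", "audience", "status",
];

// Collect top-level keys and one level of nested keys ("parent.child")
// from the block between the first two "---" lines.
function frontmatterKeys(doc: string): Set<string> {
  const lines = doc.split("\n");
  const keys = new Set<string>();
  if ((lines[0] ?? "").trim() !== "---") return keys;
  const end = lines.findIndex((l, i) => i > 0 && l.trim() === "---");
  if (end === -1) return keys;
  let parent = "";
  for (const line of lines.slice(1, end)) {
    const top = /^([A-Za-z_][\w-]*):/.exec(line);
    const nested = /^\s+([A-Za-z_][\w-]*):/.exec(line);
    if (top) {
      parent = top[1];
      keys.add(parent);
    } else if (nested && parent) {
      keys.add(parent + "." + nested[1]);
    }
  }
  return keys;
}

// Required keys that the document's frontmatter does not define.
function missingFrontmatterKeys(doc: string): string[] {
  const keys = frontmatterKeys(doc);
  return REQUIRED_KEYS.filter((k) => !keys.has(k));
}

// True when some line of the document is an H1 equal to the given title.
function h1MatchesTitle(doc: string, title: string): boolean {
  return doc.split("\n").some((l) => l.trim() === "# " + title);
}
```

Comment lines such as an SPDX header do not match the key pattern, so a header left inside the frontmatter neither satisfies nor breaks this check; its placement would need a separate rule.
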
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1a4e20af-4761-45f1-b0b4-5fdee947e6d3

📥 Commits

Reviewing files that changed from the base of the PR and between e139dbc and c2a2ebf.

📒 Files selected for processing (10)
  • .agents/skills/nemoclaw-user-configure-inference/SKILL.md
  • .agents/skills/nemoclaw-user-configure-inference/references/dgx-spark-station-local-inference.md
  • .agents/skills/nemoclaw-user-get-started/references/prerequisites.md
  • .agents/skills/nemoclaw-user-reference/references/troubleshooting.md
  • docs/get-started/prerequisites.mdx
  • docs/index.yml
  • docs/inference/dgx-spark-station-local-inference.mdx
  • docs/inference/use-local-inference.mdx
  • docs/reference/troubleshooting.mdx
  • test/dgx-local-inference-doc-copy.test.ts

@wscurran added the documentation, fix, Local Models, Platform: DGX Spark, and priority: high labels May 27, 2026
@wscurran requested a review from miyoungc May 27, 2026 22:48
@wscurran added the v0.0.55 label May 27, 2026
@jyaunches added the R3 and v0.0.56 labels and removed the v0.0.55 and R3 labels May 29, 2026
prekshivyas (Contributor) left a comment

@deepujain Thanks for putting this together — it's a genuinely useful walkthrough and most of it checks out. A couple of housekeeping items first, then the content fixes.

Before review can pass

  • Merge conflicts: the PR is currently CONFLICTING/DIRTY. Please rebase onto main and resolve.
  • dco-check is failing: the commit has a Signed-off-by line, but the workflow also requires one in the PR description. Please add Signed-off-by: Deepak Jain <deepujain@gmail.com> to the PR body. (commit-lint is green — an earlier red entry was a superseded run.)
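
For a quick local check of the PR body before pushing, a sign-off trailer can be matched with a single regular expression. This is a minimal sketch that assumes the conventional `Signed-off-by: Name <email>` trailer format; `hasSignOff` is a hypothetical helper, and the actual rules applied by the dco-check workflow are not shown in this thread.

```typescript
// Hypothetical helper: true when some line of the text is a sign-off
// trailer of the form "Signed-off-by: Name <user@host>".
function hasSignOff(text: string): boolean {
  return /^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$/m.test(text);
}
```

Under this pattern, a trailer whose email is not wrapped in angle brackets does not count as a sign-off.
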

Required changes before merge

1. Non-interactive examples will exit immediately for a first-time user.
Both --non-interactive examples omit the third-party consent flag, so a new user (no prior acceptance) hits:

  • ensureUsageNoticeConsent returns false (src/lib/onboard/usage-notice.ts:153) → onboard.ts:6500 process.exit(1).
  • Plain --yes does not satisfy this — only --yes-i-accept-third-party-software / NEMOCLAW_ACCEPT_THIRD_PARTY_SOFTWARE=1 does (legacy-command.ts:240-241 vs :246).
  • The model/image download separately requires --yes (onboard.ts:3907-3915).

So both examples need both flags, e.g.:

NEMOCLAW_PROVIDER=install-vllm nemoclaw onboard --non-interactive --yes --yes-i-accept-third-party-software
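
The two gates can be modeled as a small decision function to make the failure mode concrete. This is an illustrative sketch only, not the code in onboard.ts: `OnboardInput` and `nonInteractiveBlockers` are hypothetical names, and treating the two gates as fully independent is an assumption drawn from the description above. Only the flag names and the environment variable come from this review.

```typescript
// Hypothetical model of the two gates described in this review.
// It is not the NemoClaw implementation.
interface OnboardInput {
  flags: string[];                          // CLI flags as passed by the user
  env: Record<string, string | undefined>;  // relevant environment variables
}

function nonInteractiveBlockers(input: OnboardInput): string[] {
  const blockers: string[] = [];
  const has = (flag: string): boolean => input.flags.includes(flag);

  // Gate 1: third-party software consent. Plain --yes does not count.
  const consent =
    has("--yes-i-accept-third-party-software") ||
    input.env["NEMOCLAW_ACCEPT_THIRD_PARTY_SOFTWARE"] === "1";
  if (!consent) blockers.push("third-party consent not given");

  // Gate 2: the model/image download confirmation needs --yes.
  if (!has("--yes")) blockers.push("download not confirmed");

  return blockers;
}
```

In this model, passing only `--non-interactive --yes` leaves the consent gate unsatisfied, which corresponds to the immediate exit a first-time user would see.
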

2. The "express setup" paragraph describes behavior that isn't in the code.
Lines 66–67 say the installer offers an "express setup" after the third-party notice that selects the local-inference path and policy defaults on DGX Spark/Station. I couldn't find any such flow — express in src/ only refers to the vLLM/Ollama install model-picker path, not a notice-driven onboarding mode. Please remove or rewrite this so it matches the actual wizard.

Non-blocking

  • Provider label: the doc shows **Local vLLM [experimental]**, but providers.ts:130-131 returns exactly "Local vLLM" (no suffix). Suggest dropping [experimental].
  • Test rigidity: test/dgx-local-inference-doc-copy.test.ts asserts every fenced block is bash (toEqual(new Set(["bash"]))). Fine today, but any future non-bash block (e.g. an output sample) will break it.
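
One way to loosen the test is to check fence languages against an allow-list and apply the prompt check only to bash blocks. The sketch below is an assumption about how that could look, not the contents of test/dgx-local-inference-doc-copy.test.ts: `inspectFences`, `unexpectedLanguages`, and the choice of `$ ` as the prompt prefix are all hypothetical.

```typescript
// Built at runtime so this sample nests cleanly inside Markdown.
const FENCE = "`".repeat(3);

interface FenceReport {
  languages: Set<string>;  // info strings seen on opening fences
  promptLines: string[];   // lines inside bash fences that start with "$ "
}

// Walk the document once, tracking which fence (if any) is open.
function inspectFences(markdown: string): FenceReport {
  const report: FenceReport = { languages: new Set<string>(), promptLines: [] };
  let current: string | null = null; // language of the open fence, or null
  for (const line of markdown.split("\n")) {
    const trimmed = line.trimStart();
    if (trimmed.startsWith(FENCE)) {
      if (current === null) {
        current = trimmed.slice(FENCE.length).trim() || "(none)";
        report.languages.add(current);
      } else {
        current = null;
      }
      continue;
    }
    if (current === "bash" && trimmed.startsWith("$ ")) {
      report.promptLines.push(line);
    }
  }
  return report;
}

// Languages present in the document that are not on the allow-list.
function unexpectedLanguages(report: FenceReport, allowed: string[]): string[] {
  return Array.from(report.languages).filter((lang) => !allowed.includes(lang));
}
```

A test built on this could assert that `unexpectedLanguages(report, ["bash", "text"])` is empty and that `promptLines` is empty, so a future output sample in a `text` fence would not break the copy-paste guarantee for bash blocks.
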

Verified correct

  • vLLM model slug + default (qwen3.6-27b, vllm-models.ts:41-43, vllm.ts:464), all env vars, install-vllm provider key, every cross-link/anchor, and the regenerated skill reference for the new page are all accurate.

Once the two required items and the conflicts/DCO are sorted, this is a clear approve.

Fixes NVIDIA#3231

Signed-off-by: Deepak Jain <deepujain@gmail.com>
@deepujain force-pushed the docs/3231-dgx-local-inference branch from c2a2ebf to f44331f May 30, 2026 02:01
@deepujain (Contributor, Author)

Rebased on current main, cleaned up the DGX walkthrough review items, regenerated the user skills, and added the PR-body sign-off. Focused docs copy test passes.

@cv added the v0.0.57 label and removed the v0.0.56 label Jun 1, 2026
Labels

documentation, fix, Local Models, Platform: DGX Spark, priority: high, v0.0.57

Development

Successfully merging this pull request may close these issues.

[DGX Spark][Docs] No end-to-end walkthrough for setting up NemoClaw with local inference on DGX Spark or DGX Station

5 participants