Commit 9bbcadb

Add DGX Spark Demo community example (#402)

- Add DGX Spark demo showcasing Nemotron Nano 9B v2 via NIM with OpenWebUI frontend
- Include setup instructions, architecture diagram, and interface screenshot
- Update community README to include new demo in inventory

1 parent 4f85441 commit 9bbcadb

File tree: 3 files changed, +150 -1 lines changed

community/README.md

Lines changed: 5 additions & 1 deletion
@@ -23,6 +23,10 @@ Community examples are sample code and deployments for RAG pipelines that are no

## Inventory

* [DGX Spark Demo](./dgx-spark-demo/)

  This demo showcases NVIDIA's Nemotron Nano 9B v2 model running locally via NIM (NVIDIA Inference Microservice) on DGX Spark, with OpenWebUI as the frontend interface.

* [Smart Health Agent](./smart-health-agent/)

  This example demonstrates a comprehensive multi-agent workflow for health data analysis and personalized recommendations. It integrates real-time health metrics, weather data, and Retrieval Augmented Generation (RAG) to process multimodal health documents. The system uses LangGraph for agent orchestration and Ollama with Gemma 3 models for intelligent health insights. Features include multimodal document processing, weather-aware exercise recommendations, and personalized health guidance through an interactive chat interface.
@@ -90,4 +94,4 @@ Community examples are sample code and deployments for RAG pipelines that are no

* [Chat with LLM Llama 3.1 Nemotron Nano 4B](./chat-llama-nemotron/)

  This is a React-based conversational UI designed for interacting with a powerful local LLM. It incorporates RAG to enhance contextual understanding and is backed by an NVIDIA Dynamo inference server running the NVIDIA Llama-3.1-Nemotron-Nano-4B-v1.1 model. The setup enables low-latency, cloud-free AI assistant capabilities, with live document search and reasoning, all deployable on local or edge infrastructure.

community/dgx-spark-demo/README.md

Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
<!--
SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# DGX Spark Demo

This demo showcases NVIDIA's Nemotron Nano 9B v2 model running locally via NIM (NVIDIA Inference Microservice) on DGX Spark, with OpenWebUI as the frontend interface.

## Overview

- **Model**: NVIDIA Nemotron Nano 9B v2
- **Runtime**: NVIDIA NIM
- **Frontend**: OpenWebUI
- **Hardware**: DGX Spark

![OpenWebUI Interface](./openwebui_interface.png)
## Prerequisites

- NVIDIA DGX Spark or compatible GPU hardware
- Docker installed with NVIDIA GPU support
- An NGC API key ([get one here](https://ngc.nvidia.com/))
- Python 3.x (for OpenWebUI)

## Setup Instructions
### 1. Running Local NIM

First, set up and run the NVIDIA Inference Microservice (NIM) container with the Nemotron model:

```bash
# Set your NGC API key
export NGC_API_KEY="<your-ngc-api-key>"

# Log in to the NGC container registry (the username is the literal string $oauthtoken)
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

# Set up a cache directory so model weights persist across container restarts
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"

# Run the NIM container
docker run -it --rm \
  --gpus all \
  --shm-size=16GB \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u "$(id -u)" \
  -p 8000:8000 \
  nvcr.io/nim/nvidia/nvidia-nemotron-nano-9b-v2-dgx-spark:latest
```
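On first start the container downloads and loads the model, which can take several minutes. A minimal, hedged sketch of a readiness poll in Python (the probe is injected as a callable so the loop itself is easy to test; in practice it would wrap an HTTP GET against `http://localhost:8000/v1/models` and return True on a 200 response):

```python
import time

def wait_until_ready(probe, attempts=30, delay=2.0):
    """Poll `probe` until it returns True; return the attempt number that succeeded.

    `probe` is any zero-argument callable, e.g. a wrapper around an HTTP GET
    to the NIM endpoint. Raises TimeoutError if the service never comes up.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        time.sleep(delay)
    raise TimeoutError("NIM service did not become ready in time")

# Simulated probe for illustration: the "service" starts answering on the third try.
answers = iter([False, False, True])
print(wait_until_ready(lambda: next(answers), delay=0.0))  # prints 3
```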
The NIM service will be available at `http://localhost:8000`.

### 2. Installing OpenWebUI

Follow the official OpenWebUI installation guide:

[OpenWebUI Installation Guide](https://github.com/open-webui/open-webui?tab=readme-ov-file#how-to-install-)
### 3. Running OpenWebUI

Configure and start OpenWebUI so that it connects to your local NIM instance:

```bash
# Point OpenWebUI at the local NIM endpoint
export OPENAI_API_BASE_URL=http://0.0.0.0:8000/v1
export OPENAI_API_KEY=""          # the local NIM endpoint does not require an API key
export ENABLE_MODEL_SELECTOR=false
export WEBUI_AUTH=False
export DEFAULT_MODEL="nvidia/nemotron-nano-9b-v2"

# Start OpenWebUI
open-webui serve --host 0.0.0.0 --port 8080
```

Access OpenWebUI at `http://localhost:8080` in your browser.
## Testing the Setup

To verify that the NIM service is running correctly, use the following curl command:

```bash
curl -X 'POST' \
  'http://localhost:8000/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "nvidia/nemotron-nano-9b-v2",
    "messages": [{"role": "user", "content": "Which number is larger, 9.11 or 9.8?"}],
    "max_tokens": 128,
    "stream": true
  }'
```

You should receive a streaming response from the model.
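With `"stream": true`, the endpoint emits OpenAI-style server-sent events: one `data: {...}` line per token chunk, terminated by `data: [DONE]`. A small sketch (assuming the standard OpenAI-compatible chunk shape) of how a client can reassemble the streamed text:

```python
import json

def collect_stream(lines):
    """Reassemble assistant text from OpenAI-style SSE lines (`data: {...}`)."""
    text = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive separators
        payload = line[len("data: "):]
        if payload == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        text.append(delta.get("content", ""))  # first chunk may carry only the role
    return "".join(text)

# Example lines in the shape the /v1/chat/completions stream emits:
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "9.8 is"}}]}',
    'data: {"choices": [{"delta": {"content": " larger."}}]}',
    "data: [DONE]",
]
print(collect_stream(sample))  # prints: 9.8 is larger.
```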
## Architecture

```
┌─────────────┐   HTTP   ┌──────────────┐    API    ┌─────────────┐
│    User     │ ───────> │  OpenWebUI   │ ────────> │     NIM     │
│   Browser   │          │ (Port 8080)  │           │ (Port 8000) │
└─────────────┘          └──────────────┘           └─────────────┘
                                                           │
                                                           v
                                                    ┌─────────────┐
                                                    │  Nemotron   │
                                                    │  Nano 9B v2 │
                                                    └─────────────┘
```
## Troubleshooting

### NIM Container Issues

- Ensure you have sufficient GPU memory (the 9B model requires significant VRAM)
- Verify that your NGC API key is valid
- Check that Docker has access to GPU resources: `docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi`

### OpenWebUI Connection Issues

- Verify that the NIM service is running: `curl http://localhost:8000/v1/models`
- Ensure that `OPENAI_API_BASE_URL` points to the correct endpoint
- Check firewall settings if you are accessing the service from a different machine
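The `/v1/models` check above returns an OpenAI-style model list. A small sketch (assuming the standard `{"object": "list", "data": [...]}` response shape) for confirming that the expected model id is being served:

```python
import json

def served_model_ids(models_json: str) -> list:
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [m["id"] for m in json.loads(models_json)["data"]]

# Example body in the shape `curl http://localhost:8000/v1/models` returns:
sample = '{"object": "list", "data": [{"id": "nvidia/nemotron-nano-9b-v2", "object": "model"}]}'
print(served_model_ids(sample))  # prints: ['nvidia/nemotron-nano-9b-v2']
```

If `nvidia/nemotron-nano-9b-v2` is not in the list, OpenWebUI's `DEFAULT_MODEL` setting will not match what NIM is serving.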
## Additional Resources

- [NIM Container for DGX Spark (NGC Catalog)](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nvidia-nemotron-nano-9b-v2-dgx-spark?version=1.0.0-variant)
- [NVIDIA NIM Documentation](https://docs.nvidia.com/nim/)
- [OpenWebUI Documentation](https://github.com/open-webui/open-webui)
- [Nemotron Model Information](https://build.nvidia.com/nvidia/nemotron-nano-9b-v2)
## License

Refer to NVIDIA's licensing terms for NIM and the Nemotron model.
community/dgx-spark-demo/openwebui_interface.png

Binary file added (233 KB)
