<!--
  SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
  SPDX-License-Identifier: Apache-2.0

  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
-->

# DGX Spark Demo

This demo showcases NVIDIA's Nemotron Nano 9B v2 model running locally via NIM (NVIDIA Inference Microservices) on DGX Spark, with OpenWebUI as the frontend.

## Overview

- **Model**: NVIDIA Nemotron Nano 9B v2
- **Runtime**: NVIDIA NIM
- **Frontend**: OpenWebUI
- **Hardware**: DGX Spark

## Prerequisites

- NVIDIA DGX Spark or compatible GPU hardware
- Docker installed with NVIDIA GPU support (NVIDIA Container Toolkit)
- NGC API key ([get one here](https://ngc.nvidia.com/))
- Python 3.x (for OpenWebUI)

## Setup Instructions

### 1. Running Local NIM

First, set up and run the NVIDIA NIM container with the Nemotron model:

```bash
# Set your NGC API key
export NGC_API_KEY="<your-ngc-api-key>"

# Log in to the NGC container registry so Docker can pull the image
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

# Set up a local cache directory so model weights persist across runs
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"

# Run the NIM container
docker run -it --rm \
  --gpus all \
  --shm-size=16GB \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u $(id -u) \
  -p 8000:8000 \
  nvcr.io/nim/nvidia/nvidia-nemotron-nano-9b-v2-dgx-spark:latest
```

The NIM service will be available at `http://localhost:8000`.
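
On the first run the container downloads the model weights, so startup can take a while. Many NIM containers expose a readiness endpoint at `/v1/health/ready` (an assumption here; check your container's documentation), which makes a simple bounded wait loop possible:

```bash
# Poll the (assumed) readiness endpoint for up to ~5 minutes
for i in $(seq 1 30); do
  if curl -sf http://localhost:8000/v1/health/ready > /dev/null; then
    echo "NIM is ready."
    break
  fi
  echo "Waiting for NIM to become ready... ($i/30)"
  sleep 10
done
```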

### 2. Installing OpenWebUI

Follow the official OpenWebUI installation guide:

[OpenWebUI Installation Guide](https://github.com/open-webui/open-webui?tab=readme-ov-file#how-to-install-)
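
If you prefer the pip route from that guide, a minimal sketch (note that OpenWebUI pins a specific Python version range; check the guide before installing):

```bash
# Create an isolated environment and install OpenWebUI from PyPI
python3 -m venv openwebui-env
source openwebui-env/bin/activate
pip install open-webui
```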

### 3. Running OpenWebUI

Configure and start OpenWebUI to connect to your local NIM instance:

```bash
# Point OpenWebUI at the local NIM endpoint
export OPENAI_API_BASE_URL=http://localhost:8000/v1
export OPENAI_API_KEY=""
export ENABLE_MODEL_SELECTOR=false
export WEBUI_AUTH=False
export DEFAULT_MODEL="nvidia/nemotron-nano-9b-v2"

# Start OpenWebUI
open-webui serve --host 0.0.0.0 --port 8080
```

Access OpenWebUI at `http://localhost:8080` in your browser.

## Testing the Setup

To verify that the NIM service is running correctly, use the following curl command:

```bash
curl -X 'POST' \
  'http://localhost:8000/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "nvidia/nemotron-nano-9b-v2",
    "messages": [{"role": "user", "content": "Which number is larger, 9.11 or 9.8?"}],
    "max_tokens": 128,
    "stream": true
  }'
```

You should receive a streaming response from the model.
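
The streamed body is a series of server-sent events, one JSON chunk per `data:` line. A small helper (the `parse_stream` name is ours; it assumes `python3` is on the PATH) can reassemble the `content` deltas into plain text:

```bash
# Reassemble "content" deltas from an OpenAI-style SSE stream on stdin
parse_stream() {
  grep '^data: {' | python3 -c '
import json, sys
for line in sys.stdin:
    chunk = json.loads(line[len("data: "):])
    delta = chunk["choices"][0].get("delta", {})
    sys.stdout.write(delta.get("content", ""))
print()
'
}

# Usage: pipe the curl command above into the helper
# curl -N ... | parse_stream
```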

## Architecture

```
┌─────────────┐     HTTP     ┌──────────────┐     API     ┌─────────────┐
│    User     │ ───────────> │  OpenWebUI   │ ──────────> │     NIM     │
│   Browser   │              │ (Port 8080)  │             │ (Port 8000) │
└─────────────┘              └──────────────┘             └─────────────┘
                                                                 │
                                                                 v
                                                          ┌─────────────┐
                                                          │  Nemotron   │
                                                          │ Nano 9B v2  │
                                                          └─────────────┘
```

## Troubleshooting

### NIM Container Issues
- Ensure you have sufficient GPU memory (a 9B-parameter model requires significant VRAM)
- Verify your NGC API key is valid and that you are logged in to `nvcr.io`
- Confirm Docker can see the GPU: `docker run --rm --gpus all ubuntu nvidia-smi`

### OpenWebUI Connection Issues
- Verify the NIM service is running: `curl http://localhost:8000/v1/models`
- Ensure `OPENAI_API_BASE_URL` points to the correct endpoint
- Check firewall settings if accessing from a different machine
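
If a chat request fails with a model-not-found error, list the exact model IDs the NIM instance serves and make sure `DEFAULT_MODEL` matches one of them (this sketch assumes `python3` is available):

```bash
# Print the ID of every model the endpoint advertises
curl -s http://localhost:8000/v1/models | python3 -c '
import json, sys
for m in json.load(sys.stdin).get("data", []):
    print(m["id"])
'
```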

## Additional Resources

- [NIM Container for DGX Spark (NGC Catalog)](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nvidia-nemotron-nano-9b-v2-dgx-spark?version=1.0.0-variant)
- [NVIDIA NIM Documentation](https://docs.nvidia.com/nim/)
- [OpenWebUI Documentation](https://github.com/open-webui/open-webui)
- [Nemotron Model Information](https://build.nvidia.com/nvidia/nemotron-nano-9b-v2)

## License

Refer to NVIDIA's licensing terms for NIM and the Nemotron model.