---
title:
  page: Route Local Inference Requests to LM Studio
  nav: Local Inference with LM Studio
description: Configure inference.local to route sandbox requests to a local LM Studio server running on the gateway host.
topics:
- Generative AI
- Cybersecurity
tags:
- Tutorial
- Inference Routing
- LM Studio
- Local Inference
- Sandbox
content:
  type: tutorial
  difficulty: technical_intermediate
  audience:
  - engineer
---

<!--
 SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 SPDX-License-Identifier: Apache-2.0
-->

# Route Local Inference Requests to LM Studio

This tutorial describes how to configure OpenShell to route inference requests to a local LM Studio server.

:::{note}
The LM Studio server is quick to set up and exposes both OpenAI-compatible and Anthropic-compatible endpoints.
:::

This tutorial covers how to:

- Expose a local inference server to OpenShell sandboxes.
- Verify end-to-end inference from inside a sandbox.

## Prerequisites

First, complete the OpenShell installation and follow the {doc}`/get-started/quickstart`.

[Install the LM Studio app](https://lmstudio.ai/download). Make sure LM Studio runs in the same environment as your gateway.

If you prefer not to keep the LM Studio app open, download llmster (headless LM Studio) with one of the following commands:

### Linux/macOS

```bash
curl -fsSL https://lmstudio.ai/install.sh | bash
```

### Windows

```powershell
irm https://lmstudio.ai/install.ps1 | iex
```

Then start llmster:

```bash
lms daemon up
```
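
To confirm the daemon is running, you can check its status. This assumes a recent `lms` CLI that includes the `status` command:

```console
$ lms status  # assumes a recent lms CLI
```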

## Step 1: Start the LM Studio local server

Start the LM Studio local server from the Developer tab and verify that the OpenAI-compatible endpoint is enabled.

LM Studio listens on `127.0.0.1:1234` by default. For use with OpenShell, configure LM Studio to listen on all interfaces (`0.0.0.0`).

If you're using the GUI, go to the Developer tab, select Server Settings, then enable Serve on Local Network.

If you're using llmster in headless mode, run `lms server start --bind 0.0.0.0`.
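
To confirm the server is up and serving the OpenAI-compatible API, you can query the model listing endpoint from the gateway host. This assumes the default port `1234`; substitute the machine's network address for `localhost` to check that the new bind address is reachable from other hosts:

```console
$ curl http://localhost:1234/v1/models
```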

## Step 2: Test with a small model

In the LM Studio app, go to the Model Search tab to download a small model such as Qwen3.5 2B.

In the terminal, use the following commands to download and load the model:

```bash
lms get qwen/qwen3.5-2b
lms load qwen/qwen3.5-2b
```
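
Before involving OpenShell, you can sanity-check the loaded model directly against LM Studio's OpenAI-compatible endpoint. This is a minimal request that assumes the default port and the model ID used above:

```console
$ curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen/qwen3.5-2b",
    "messages": [{"role": "user", "content": "hello"}],
    "max_tokens": 10
  }'
```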

## Step 3: Add LM Studio as a provider

Choose the provider type that matches the client protocol you want to route through `inference.local`.

:::::{tab-set}

::::{tab-item} OpenAI-compatible

Add LM Studio as an OpenAI-compatible provider through `host.openshell.internal`:

```console
$ openshell provider create \
  --name lmstudio \
  --type openai \
  --credential OPENAI_API_KEY=lmstudio \
  --config OPENAI_BASE_URL=http://host.openshell.internal:1234/v1
```

Use this provider for clients that send OpenAI-compatible requests such as `POST /v1/chat/completions` or `POST /v1/responses`.

::::

::::{tab-item} Anthropic-compatible

Add a provider that points to LM Studio's Anthropic-compatible `POST /v1/messages` endpoint:

```console
$ openshell provider create \
  --name lmstudio-anthropic \
  --type anthropic \
  --credential ANTHROPIC_API_KEY=lmstudio \
  --config ANTHROPIC_BASE_URL=http://host.openshell.internal:1234
```

Use this provider for Anthropic-compatible `POST /v1/messages` requests.

::::

:::::
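
Either way, you can confirm that the provider was registered before continuing. The same commands appear again under Troubleshooting:

```console
$ openshell provider get lmstudio
$ openshell provider get lmstudio-anthropic
```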

## Step 4: Configure LM Studio as the local inference provider

Set the managed inference route for the active gateway:

:::::{tab-set}

::::{tab-item} OpenAI-compatible

```console
$ openshell inference set --provider lmstudio --model qwen/qwen3.5-2b
```

If the command succeeds, OpenShell has verified that the upstream is reachable and accepts the expected OpenAI-compatible request shape.

::::

::::{tab-item} Anthropic-compatible

```console
$ openshell inference set --provider lmstudio-anthropic --model qwen/qwen3.5-2b
```

If the command succeeds, OpenShell has verified that the upstream is reachable and accepts the expected Anthropic-compatible request shape.

::::

:::::

The active `inference.local` route is gateway-scoped, so only one provider and model pair is active at a time. Re-run `openshell inference set` whenever you want to switch between OpenAI-compatible and Anthropic-compatible clients.
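
For example, to switch the active route from the OpenAI-compatible provider to the Anthropic-compatible one, re-run the command with the other provider name:

```console
$ openshell inference set --provider lmstudio-anthropic --model qwen/qwen3.5-2b
```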

Confirm the saved config:

```console
$ openshell inference get
```

You should see either `Provider: lmstudio` or `Provider: lmstudio-anthropic`, along with `Model: qwen/qwen3.5-2b`.

## Step 5: Verify from inside a sandbox

Run a simple request through `https://inference.local`:

:::::{tab-set}

::::{tab-item} OpenAI-compatible

```console
$ openshell sandbox create -- \
  curl https://inference.local/v1/chat/completions \
  --json '{"messages":[{"role":"user","content":"hello"}],"max_tokens":10}'

$ openshell sandbox create -- \
  curl https://inference.local/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "instructions": "You are a helpful assistant.",
    "input": "hello",
    "max_output_tokens": 10
  }'
```

::::

::::{tab-item} Anthropic-compatible

```console
$ openshell sandbox create -- \
  curl https://inference.local/v1/messages \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hello"}],"max_tokens":10}'
```

::::

:::::

## Troubleshooting

If setup fails, check these first:

- The LM Studio local server is running and reachable from the gateway host
- `OPENAI_BASE_URL` uses `http://host.openshell.internal:1234/v1` when you use an `openai` provider
- `ANTHROPIC_BASE_URL` uses `http://host.openshell.internal:1234` when you use an `anthropic` provider
- The gateway and LM Studio run on the same machine or have a reachable network path between them
- The configured model name matches the model exposed by LM Studio

Useful commands:

```console
$ openshell status
$ openshell inference get
$ openshell provider get lmstudio
$ openshell provider get lmstudio-anthropic
```
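
On the LM Studio side, you can list the models that are currently loaded and the identifiers they are served under, which helps when the configured model name does not match. This assumes a recent `lms` CLI:

```console
$ lms ps  # assumes a recent lms CLI; lists loaded models and their identifiers
```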

## Next Steps

- To learn more about using the LM Studio CLI, refer to the [LM Studio docs](https://lmstudio.ai/docs/cli).
- To learn more about managed inference, refer to {doc}`/inference/index`.
- To configure a different self-hosted backend, refer to {doc}`/inference/configure`.