
Commit 0463046

docs(inference): Add LM Studio guide (#386)

Signed-off-by: Will Burford <will@lmstudio.ai>

1 parent bb4545f commit 0463046

3 files changed: 235 additions & 0 deletions

docs/inference/configure.md

Lines changed: 1 addition & 0 deletions

@@ -166,5 +166,6 @@ Explore related topics:
 
 - To understand the inference routing flow and supported API patterns, refer to {doc}`index`.
 - To follow a complete Ollama-based local setup, refer to {doc}`/tutorials/local-inference-ollama`.
+- To follow a complete LM Studio-based local setup, refer to {doc}`/tutorials/local-inference-lmstudio`.
 - To control external endpoints, refer to [Policies](/sandboxes/policies.md).
 - To manage provider records, refer to {doc}`../sandboxes/manage-providers`.

docs/tutorials/index.md

Lines changed: 10 additions & 0 deletions

@@ -52,6 +52,15 @@ Route inference to a local Ollama server, verify it from a sandbox, and reuse th
 +++
 {bdg-secondary}`Tutorial`
 :::
+
+:::{grid-item-card} Local Inference with LM Studio
+:link: local-inference-lmstudio
+:link-type: doc
+
+Route inference to a local LM Studio server via the OpenAI or Anthropic compatible APIs.
++++
+{bdg-secondary}`Tutorial`
+:::
 ::::
 
 ```{toctree}
@@ -60,4 +69,5 @@ Route inference to a local Ollama server, verify it from a sandbox, and reuse th
 First Network Policy <first-network-policy>
 GitHub Push Access <github-sandbox>
 Local Inference with Ollama <local-inference-ollama>
+Local Inference with LM Studio <local-inference-lmstudio>
 ```
docs/tutorials/local-inference-lmstudio.md

Lines changed: 224 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,224 @@
---
title:
  page: Route Local Inference Requests to LM Studio
  nav: Local Inference with LM Studio
description: Configure inference.local to route sandbox requests to a local LM Studio server running on the gateway host.
topics:
  - Generative AI
  - Cybersecurity
tags:
  - Tutorial
  - Inference Routing
  - LM Studio
  - Local Inference
  - Sandbox
content:
  type: tutorial
  difficulty: technical_intermediate
audience:
  - engineer
---

<!--
SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
SPDX-License-Identifier: Apache-2.0
-->

# Route Local Inference Requests to LM Studio

This tutorial describes how to configure OpenShell to route inference requests to a local LM Studio server.

:::{note}
The LM Studio server is quick to set up and exposes both OpenAI-compatible and Anthropic-compatible endpoints.
:::

In this tutorial you will:

- Expose a local inference server to OpenShell sandboxes.
- Verify end-to-end inference from inside a sandbox.

## Prerequisites

First, complete the OpenShell installation and follow the {doc}`/get-started/quickstart`.

[Install the LM Studio app](https://lmstudio.ai/download). Make sure LM Studio runs in the same environment as your gateway.

If you prefer not to keep the LM Studio app open, install llmster (headless LM Studio) with the following command:

### Linux/Mac

```bash
curl -fsSL https://lmstudio.ai/install.sh | bash
```

### Windows

```powershell
irm https://lmstudio.ai/install.ps1 | iex
```

Then start llmster:

```bash
lms daemon up
```

## Step 1: Start LM Studio Local Server

Start the LM Studio local server from the Developer tab, and verify that the OpenAI-compatible endpoint is enabled.

LM Studio listens on `127.0.0.1:1234` by default. To use it with OpenShell, configure LM Studio to listen on all interfaces (`0.0.0.0`).

If you're using the GUI, go to the Developer tab, select Server Settings, then enable Serve on Local Network.

If you're using llmster in headless mode, run `lms server start --bind 0.0.0.0`.
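
To confirm the server is reachable before wiring it into OpenShell, you can probe the port from the gateway host. This is a generic TCP check (not an LM Studio or OpenShell tool), a minimal sketch that assumes LM Studio's default port 1234:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # LM Studio's default server port; adjust if you changed the bind address.
    print(port_open("127.0.0.1", 1234))
```

If this prints `False` from the gateway host, fix reachability before continuing.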

## Step 2: Test with a small model

In the LM Studio app, head to the Model Search tab to download a small model like Qwen3.5 2B.

In the terminal, use the following commands to download and load the model:

```bash
lms get qwen/qwen3.5-2b
lms load qwen/qwen3.5-2b
```
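
As a sanity check that the model is actually served, you can query the server's OpenAI-compatible model listing. The helper below is a sketch, not part of LM Studio or OpenShell; it assumes the default port and the standard `GET /v1/models` response shape (`{"data": [{"id": ...}, ...]}`):

```python
import json
from urllib.request import urlopen

def model_loaded(listing: dict, model_id: str) -> bool:
    """Check whether model_id appears in an OpenAI-style /v1/models listing."""
    return any(m.get("id") == model_id for m in listing.get("data", []))

if __name__ == "__main__":
    # Query LM Studio's OpenAI-compatible model listing on the default port.
    with urlopen("http://127.0.0.1:1234/v1/models") as resp:
        listing = json.load(resp)
    print(model_loaded(listing, "qwen/qwen3.5-2b"))
```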


## Step 3: Add LM Studio as a provider

Choose the provider type that matches the client protocol you want to route through `inference.local`.

:::::{tab-set}

::::{tab-item} OpenAI-compatible

Add LM Studio as an OpenAI-compatible provider through `host.openshell.internal`:

```console
$ openshell provider create \
  --name lmstudio \
  --type openai \
  --credential OPENAI_API_KEY=lmstudio \
  --config OPENAI_BASE_URL=http://host.openshell.internal:1234/v1
```

Use this provider for clients that send OpenAI-compatible requests such as `POST /v1/chat/completions` or `POST /v1/responses`.

::::

::::{tab-item} Anthropic-compatible

Add a provider that points to LM Studio's Anthropic-compatible `POST /v1/messages` endpoint:

```console
$ openshell provider create \
  --name lmstudio-anthropic \
  --type anthropic \
  --credential ANTHROPIC_API_KEY=lmstudio \
  --config ANTHROPIC_BASE_URL=http://host.openshell.internal:1234
```

Use this provider for Anthropic-compatible `POST /v1/messages` requests.

::::

:::::
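
The two provider types differ mainly in which path a sandbox-side client calls through `inference.local`. A minimal sketch of the request each protocol sends (a hypothetical helper, not an OpenShell API; paths taken from this tutorial's curl examples):

```python
import json

# Endpoint paths served through inference.local for each provider type,
# matching the curl examples in this tutorial.
ENDPOINTS = {
    "openai": "https://inference.local/v1/chat/completions",
    "anthropic": "https://inference.local/v1/messages",
}

def build_request(protocol: str, prompt: str, max_tokens: int = 10):
    """Return (url, json_body) for a minimal request in the given protocol."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return ENDPOINTS[protocol], json.dumps(payload)

if __name__ == "__main__":
    for proto in ("openai", "anthropic"):
        url, body = build_request(proto, "hello")
        print(proto, url, body)
```

Both protocols accept this minimal `messages` body; the distinction that matters for routing is the path.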


## Step 4: Configure LM Studio as the local inference provider

Set the managed inference route for the active gateway:

:::::{tab-set}

::::{tab-item} OpenAI-compatible

```console
$ openshell inference set --provider lmstudio --model qwen/qwen3.5-2b
```

If the command succeeds, OpenShell has verified that the upstream is reachable and accepts the expected OpenAI-compatible request shape.

::::

::::{tab-item} Anthropic-compatible

```console
$ openshell inference set --provider lmstudio-anthropic --model qwen/qwen3.5-2b
```

If the command succeeds, OpenShell has verified that the upstream is reachable and accepts the expected Anthropic-compatible request shape.

::::

:::::

The active `inference.local` route is gateway-scoped, so only one provider and model pair is active at a time. Re-run `openshell inference set` whenever you want to switch between OpenAI-compatible and Anthropic-compatible clients.

Confirm the saved config:

```console
$ openshell inference get
```

You should see either `Provider: lmstudio` or `Provider: lmstudio-anthropic`, along with `Model: qwen/qwen3.5-2b`.
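
If you want to check the active route from a script, output in the `Provider: ...` / `Model: ...` form shown above can be parsed with a small helper. This is a sketch that assumes simple `Key: value` lines, not an official OpenShell interface:

```python
def parse_inference_get(output: str) -> dict:
    """Parse `Key: value` lines, e.g. from `openshell inference get` output."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

if __name__ == "__main__":
    sample = "Provider: lmstudio\nModel: qwen/qwen3.5-2b"
    print(parse_inference_get(sample))
```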

## Step 5: Verify from Inside a Sandbox

Run a simple request through `https://inference.local`:

:::::{tab-set}

::::{tab-item} OpenAI-compatible

```console
$ openshell sandbox create -- \
  curl https://inference.local/v1/chat/completions \
  --json '{"messages":[{"role":"user","content":"hello"}],"max_tokens":10}'

$ openshell sandbox create -- \
  curl https://inference.local/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "instructions": "You are a helpful assistant.",
    "input": "hello",
    "max_output_tokens": 10
  }'
```

::::

::::{tab-item} Anthropic-compatible

```console
$ openshell sandbox create -- \
  curl https://inference.local/v1/messages \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"hello"}],"max_tokens":10}'
```

::::

:::::
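
The two protocols also return different response shapes: OpenAI-style replies carry text in `choices[0].message.content`, while Anthropic-style replies carry a list of content blocks. A small sketch for pulling out the assistant text (the sample shapes are illustrative, trimmed to the fields used here):

```python
def reply_text(response: dict) -> str:
    """Extract assistant text from an OpenAI- or Anthropic-style response."""
    if "choices" in response:  # OpenAI chat completions shape
        return response["choices"][0]["message"]["content"]
    if "content" in response:  # Anthropic messages shape: list of blocks
        return "".join(b.get("text", "") for b in response["content"])
    raise ValueError("unrecognized response shape")

if __name__ == "__main__":
    openai_style = {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}
    anthropic_style = {"content": [{"type": "text", "text": "Hello!"}]}
    print(reply_text(openai_style), reply_text(anthropic_style))
```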

## Troubleshooting

If setup fails, check these first:

- The LM Studio local server is running and reachable from the gateway host.
- `OPENAI_BASE_URL` uses `http://host.openshell.internal:1234/v1` when you use an `openai` provider.
- `ANTHROPIC_BASE_URL` uses `http://host.openshell.internal:1234` when you use an `anthropic` provider.
- The gateway and LM Studio run on the same machine or share a reachable network path.
- The configured model name matches the model exposed by LM Studio.

Useful commands:

```console
$ openshell status
$ openshell inference get
$ openshell provider get lmstudio
$ openshell provider get lmstudio-anthropic
```

## Next Steps

- To learn more about using the LM Studio CLI, refer to the [LM Studio docs](https://lmstudio.ai/docs/cli).
- To learn more about managed inference, refer to {doc}`/inference/index`.
- To configure a different self-hosted backend, refer to {doc}`/inference/configure`.
