**docs/inference/configure.md**

The configuration consists of two values:

| Value | Description |
|---|---|
| Provider record | The credential backend OpenShell uses to authenticate with the upstream model host. |
| Model ID | The model to use for generation requests. |
For a list of tested providers and their base URLs, refer to [Supported Inference Providers](../sandboxes/manage-providers.md#supported-inference-providers).
## Create a Provider
Create a provider that holds the backend credentials you want OpenShell to use.
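For example, a provider backed by the NVIDIA API Catalog might look roughly like this (a sketch: `--credential`, the `nvidia-prod` name, and the `nvidia` type come from the supported-providers table, while the `--name` and `--type` flag spellings are assumptions):

```console
# Hypothetical flag names (--name, --type); --credential reads the
# named variable from your environment.
$ export NVIDIA_API_KEY=nvapi-...
$ openshell provider create \
    --name nvidia-prod \
    --type nvidia \
    --credential NVIDIA_API_KEY
```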
This reads `NVIDIA_API_KEY` from your environment.
::::
::::{tab-item} OpenAI-compatible Provider
Any cloud provider that exposes an OpenAI-compatible API works with the `openai` provider type. You need three values from the provider: the base URL, an API key, and a model name.
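A sketch of creating such a provider, assuming hypothetical `--name` and `--type` flags alongside the `--config OPENAI_BASE_URL` and `--credential OPENAI_API_KEY` settings documented for the `openai` type (the base URL here is a placeholder):

```console
# Placeholder base URL; substitute your provider's actual values.
$ openshell provider create \
    --name my-cloud \
    --type openai \
    --config OPENAI_BASE_URL=https://api.example.com/v1 \
    --credential OPENAI_API_KEY
```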
Replace the base URL and API key with the values from your provider. For the providers supported out of the box, refer to [Supported Inference Providers](../sandboxes/manage-providers.md#supported-inference-providers). For any other provider, refer to its documentation for the correct base URL, available models, and API key setup.
::::
::::{tab-item} Local Endpoint
```console
$ openshell provider create \
    ...
```
This reads `ANTHROPIC_API_KEY` from your environment.
:::::
## Set Inference Routing
Point `inference.local` at that provider and choose the model to use:
```console
$ openshell inference set \
--model nvidia/nemotron-3-nano-30b-a3b
```
## Verify the Active Config
Confirm that the provider and model are set correctly:
```console
Gateway inference:
  ...
  Version: 1
```
## Update Part of the Config
Use `update` when you want to change only one field:
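For example, changing only the model might look like this (a sketch: the `--model` flag mirrors `openshell inference set`, but its availability on `update` is an assumption):

```console
# Hypothetical: assumes `update` accepts the same --model flag as `set`.
$ openshell inference update \
    --model nvidia/nemotron-3-nano-30b-a3b
```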
Or switch providers without repeating the current model:
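A sketch of that switch, assuming `inference update` accepts a `--provider` flag symmetric to `--model`:

```console
# Hypothetical --provider flag; nvidia-prod is the provider name from
# the supported-providers table.
$ openshell inference update \
    --provider nvidia-prod
```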
**docs/inference/index.md**

If code calls an external inference host directly, that traffic is evaluated only ...

| Aspect | Description |
|---|---|
| Credentials | No sandbox API keys needed. Credentials come from the configured provider record. |
| Configuration | One provider and one model define sandbox inference for the active gateway. Every sandbox on that gateway sees the same `inference.local` backend. |
| Provider support | NVIDIA, any OpenAI-compatible provider, and Anthropic all work through the same endpoint. |
| Hot-refresh | OpenShell picks up provider credential changes and inference updates without recreating sandboxes. Changes propagate within about 5 seconds by default. |
|`generic`| User-defined | Any service with custom credentials |
|`openai`|`OPENAI_API_KEY`| Any OpenAI-compatible endpoint. Set `--config OPENAI_BASE_URL` to point to the provider. Refer to {doc}`/inference/configure`. |
Use the `generic` type for any service not listed above. You define the environment variable names and values yourself with `--credential`.
:::
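As a sketch, a `generic` provider for an unlisted service might be created like this (only `--credential` is documented above; the `--name`/`--type` flags and the `NAME=value` form of `--credential` are assumptions):

```console
# Hypothetical flag spellings; you choose the variable name yourself.
$ openshell provider create \
    --name my-service \
    --type generic \
    --credential MY_SERVICE_API_KEY=sk-example
```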
## Supported Inference Providers
The following providers have been tested with `inference.local`. Any provider that exposes an OpenAI-compatible API works with the `openai` type. Set `--config OPENAI_BASE_URL` to the provider's base URL and `--credential OPENAI_API_KEY` to your API key.
| Provider | Name | Type | Base URL | API Key Variable |
|---|---|---|---|---|
| NVIDIA API Catalog |`nvidia-prod`|`nvidia`|`https://integrate.api.nvidia.com/v1`|`NVIDIA_API_KEY`|
| LM Studio (local) |`lmstudio`|`openai`|`http://host.openshell.internal:1234/v1`|`OPENAI_API_KEY`|
Refer to your provider's documentation for the correct base URL, available models, and API key setup. To configure inference routing, refer to {doc}`/inference/configure`.
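As a worked example, the LM Studio row above might translate into a provider like this (a sketch: the name, type, base URL, and key variable come from the table; the `--name`/`--type` flag spellings and the dummy key value are assumptions, since a local LM Studio server typically does not check the key):

```console
# Values from the supported-providers table; flag names are assumed.
$ openshell provider create \
    --name lmstudio \
    --type openai \
    --config OPENAI_BASE_URL=http://host.openshell.internal:1234/v1 \
    --credential OPENAI_API_KEY=lm-studio
```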