API and Client Integration

The AI Gateway exposes an HTTP API that mirrors the provider SDK shapes you already use. You keep your SDK and change only the base URL and the credential. For the full caller's reference, see the ingress API guide; this page is an orientation.

Base URL and authentication

Point your SDK's base URL at the gateway and use a Nexus virtual key as the credential. The key can travel in either of two headers:

Authorization: Bearer <virtual-key>
x-nexus-virtual-key: <virtual-key>

Virtual keys are prefixed nvk_. You never send the upstream provider's own API key: the gateway holds provider credentials and attaches them when it dispatches upstream.

curl -sS http://localhost:3050/v1/chat/completions \
  -H "Authorization: Bearer $VK" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}'
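The same call can be assembled from Python without any SDK. The following is a minimal sketch using only the standard library; the helper names `auth_headers` and `build_chat_request` and the `carrier` argument are illustrative and not part of the gateway, and the base URL is the local one from the curl example above.

```python
import json
import urllib.request


def auth_headers(virtual_key: str, carrier: str = "bearer") -> dict[str, str]:
    """Return the single header that carries the Nexus virtual key."""
    if not virtual_key.startswith("nvk_"):
        raise ValueError("expected a Nexus virtual key (nvk_ prefix)")
    if carrier == "bearer":
        return {"Authorization": f"Bearer {virtual_key}"}
    if carrier == "header":
        return {"x-nexus-virtual-key": virtual_key}
    raise ValueError(f"unknown carrier: {carrier!r}")


def build_chat_request(base_url: str, virtual_key: str, body: dict,
                       carrier: str = "bearer") -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions."""
    headers = {"Content-Type": "application/json",
               **auth_headers(virtual_key, carrier)}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_chat_request(
    "http://localhost:3050",
    "nvk_example",  # placeholder key, not a real credential
    {"model": "gpt-4o-mini",
     "messages": [{"role": "user", "content": "Hello"}]},
)
# urllib.request.urlopen(request) would send it to a running gateway.
```

Keeping the credential in one helper makes it easy to switch carriers if a proxy or SDK in your path already uses the Authorization header for something else.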

Endpoints

Endpoint	Shape	Streaming
`POST /v1/chat/completions`	OpenAI Chat Completions	yes (`stream: true`)
`POST /v1/messages`	Anthropic Messages	yes
`POST /v1/responses`	OpenAI Responses	yes
`POST /v1/embeddings`	OpenAI Embeddings	no
`POST /v1/estimate`	cost preview, no upstream call	no

Provider-native shims also exist for GLM, Azure OpenAI, and Gemini paths, so an SDK already pointed at one of those works against the gateway with only a base-URL and credential change. The canonical /v1/* routes are the primary surface.
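A client that talks to more than one canonical route may find it convenient to keep the table above in code. This is a sketch only: the short names used as dictionary keys and the `endpoint_url` helper are hypothetical, while the paths and streaming flags are copied from the table.

```python
# Canonical routes from the endpoint table: name -> (path, supports streaming).
# The short names on the left are illustrative labels, not gateway identifiers.
ENDPOINTS: dict[str, tuple[str, bool]] = {
    "chat_completions": ("/v1/chat/completions", True),
    "messages": ("/v1/messages", True),
    "responses": ("/v1/responses", True),
    "embeddings": ("/v1/embeddings", False),
    "estimate": ("/v1/estimate", False),
}


def endpoint_url(base_url: str, name: str, stream: bool = False) -> str:
    """Resolve a route name to a full URL, refusing stream on non-streaming routes."""
    path, can_stream = ENDPOINTS[name]
    if stream and not can_stream:
        raise ValueError(f"{path} does not support streaming")
    return base_url.rstrip("/") + path
```

Failing early on `stream=True` for the embeddings or estimate routes keeps the mistake in client code instead of surfacing as a gateway error.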

Cross-format translation

You do not have to match your API shape to the target provider. The gateway accepts your request in whichever supported shape your SDK speaks, translates it for the provider and model that the routing rule selects, and translates the response back into your shape. For example, a /v1/chat/completions request can be served by an Anthropic or Gemini model and still return an OpenAI Chat Completions response. Two guardrails bound this: the gateway routes only to targets that the ingress shape can be translated to, and a /v1/responses request whose target is not natively a Responses provider is rejected with a Responses-shaped 400.
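In practice this means response-handling code written for one shape does not need a branch per provider. The sketch below reads the assistant text from an OpenAI Chat Completions response body; the helper name `chat_completion_text` and the sample payload are illustrative, and the payload shows only the fields the helper touches.

```python
def chat_completion_text(response: dict) -> str:
    """Return the first choice's message text from a Chat Completions body."""
    choices = response.get("choices") or []
    if not choices:
        return ""
    return choices[0].get("message", {}).get("content") or ""


# Illustrative body in Chat Completions shape. Per the description above, the
# shape is the same whichever provider the routing rule selected.
sample = {
    "object": "chat.completion",
    "choices": [
        {"index": 0,
         "message": {"role": "assistant", "content": "Hi there"},
         "finish_reason": "stop"}
    ],
}

text = chat_completion_text(sample)
```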

Choosing the model

The request's model field drives routing. Send a concrete model and the gateway resolves it through the active routing rules; send the auto sentinel to hand model selection to the smart router. Provider-specific parameters that have no OpenAI equivalent travel in the nexus.ext.<provider>.<key> namespace on the request body.
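A request body that uses both features might be built as follows. Two things here are assumptions rather than documented behavior: that the auto sentinel is the literal string "auto", and that the nexus.ext.<provider>.<key> namespace is encoded as nested JSON objects rather than as a flat dotted key. Check the ingress API guide for the exact encoding. The helper name `with_provider_ext` is hypothetical.

```python
import copy


def with_provider_ext(body: dict, provider: str, params: dict) -> dict:
    """Return a copy of body with params placed under nexus.ext.<provider>.

    ASSUMPTION: the namespace is nested objects, i.e.
    {"nexus": {"ext": {"<provider>": {"<key>": value}}}}.
    """
    out = copy.deepcopy(body)
    ext = out.setdefault("nexus", {}).setdefault("ext", {})
    ext.setdefault(provider, {}).update(params)
    return out


base_body = {
    "model": "auto",  # ASSUMPTION: literal spelling of the auto sentinel
    "messages": [{"role": "user", "content": "Hello"}],
}

# top_k is an Anthropic sampling parameter with no OpenAI Chat Completions field.
body = with_provider_ext(base_body, "anthropic", {"top_k": 40})
```

Copying the body before adding the extension keeps the base request reusable for routes that should not carry provider-specific parameters.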

Streaming

Set stream: true (or use a provider-native streaming path) to receive a Server-Sent Events stream. The stream is emitted in the event grammar of the API shape you called, regardless of which provider served it.
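For a client that called /v1/chat/completions, the stream follows the OpenAI Chat Completions grammar: `data:` lines carrying JSON chunks, ended by `data: [DONE]`. The following is a minimal sketch of consuming such a stream from an iterable of text lines; `iter_chat_deltas` is an illustrative name, and a stream from /v1/messages or /v1/responses would need a parser for that shape's events instead.

```python
import json
from typing import Iterable, Iterator


def iter_chat_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from Chat Completions SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream marker in the Chat Completions grammar
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            fragment = choice.get("delta", {}).get("content")
            if fragment:
                yield fragment


# Illustrative stream; a real one would come from the HTTP response body.
sample_stream = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]

streamed_text = "".join(iter_chat_deltas(sample_stream))
```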

Response headers and errors

Every response carries headers reporting what happened: the routed model and provider, the number of upstream attempts, the cache outcome, and quota and compliance-hook annotations. Errors are returned in the envelope of the API shape you called, with the HTTP status preserved, so your SDK's native error handling keeps working.
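A client without an SDK can handle both pieces by hand. The sketch below parses an error body in the OpenAI envelope, which is what a /v1/chat/completions caller would receive, and collects gateway headers by prefix. The header names are not listed on this page, so the `x-nexus-` prefix is an assumption inferred from the x-nexus-virtual-key request header; both helper names are hypothetical.

```python
import json
from typing import Mapping


def parse_openai_error(status: int, body: bytes) -> dict:
    """Flatten an OpenAI-style {"error": {...}} body, keeping the HTTP status."""
    try:
        err = json.loads(body).get("error", {})
    except (ValueError, AttributeError):
        err = {}  # body was not JSON, or not a JSON object
    if not isinstance(err, dict):
        err = {"message": str(err)}
    return {
        "status": status,
        "message": err.get("message", ""),
        "type": err.get("type"),
        "code": err.get("code"),
    }


def gateway_headers(headers: Mapping[str, str],
                    prefix: str = "x-nexus-") -> dict[str, str]:
    """Collect response headers by prefix, with lowercased names.

    ASSUMPTION: gateway response headers share the x-nexus- prefix.
    """
    return {name.lower(): value for name, value in headers.items()
            if name.lower().startswith(prefix)}


# Illustrative error body in the OpenAI envelope.
parsed = parse_openai_error(
    400,
    b'{"error": {"message": "bad request", "type": "invalid_request_error"}}',
)
```

With `urllib`, the status and body come from the `urllib.error.HTTPError` raised by `urlopen`, through its `code` attribute and `read()` method.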
