API and Client Integration

nexus edited this page May 28, 2026 · 2 revisions

The AI Gateway exposes an HTTP API that mirrors the provider SDK shapes you already use. You keep your SDK and change only the base URL and the credential. The full caller's reference is the ingress API guide; this page is the orientation.

Base URL and authentication

Point your SDK's base URL at the gateway and send a Nexus virtual key as the credential in either of these headers:

  • Authorization: Bearer <virtual-key>
  • x-nexus-virtual-key: <virtual-key>

Virtual keys are prefixed nvk_. You never send the upstream provider's own API key — the gateway holds provider credentials and attaches them when it dispatches upstream.

curl -sS http://localhost:3050/v1/chat/completions \
  -H "Authorization: Bearer $VK" \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}'
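
The same request can be assembled from code. The following is a minimal Python sketch using only the standard library; the helper names auth_headers and build_chat_request are hypothetical, and the base URL is the local one from the curl example above. It builds the request object without sending it, so it runs without a gateway.

```python
import json
import urllib.request


def auth_headers(virtual_key: str, carrier: str = "authorization") -> dict[str, str]:
    """Return the credential header for one of the two carriers described above."""
    # Virtual keys are prefixed nvk_; anything else is probably a provider key by mistake.
    if not virtual_key.startswith("nvk_"):
        raise ValueError("expected a Nexus virtual key starting with 'nvk_'")
    if carrier == "authorization":
        return {"Authorization": f"Bearer {virtual_key}"}
    if carrier == "x-nexus-virtual-key":
        return {"x-nexus-virtual-key": virtual_key}
    raise ValueError(f"unknown carrier: {carrier!r}")


def build_chat_request(base_url: str, virtual_key: str,
                       carrier: str = "authorization") -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions."""
    body = json.dumps({
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": "Hello"}],
    }).encode("utf-8")
    headers = {"Content-Type": "application/json", **auth_headers(virtual_key, carrier)}
    url = base_url.rstrip("/") + "/v1/chat/completions"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen would dispatch it; an SDK achieves the same thing when its base URL and API key settings are pointed at the gateway and the virtual key.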

Endpoints

Endpoint                     Shape                            Streaming
POST /v1/chat/completions    OpenAI Chat Completions          yes (stream: true)
POST /v1/messages            Anthropic Messages               yes
POST /v1/responses           OpenAI Responses                 yes
POST /v1/embeddings          OpenAI Embeddings                no
POST /v1/estimate            cost preview, no upstream call   no

Provider-native shims also exist for GLM, Azure OpenAI, and Gemini paths, so an SDK already pointed at one of those works against the gateway with only a base-URL and credential change. The canonical /v1/* routes are the primary surface.

Cross-format translation

You do not have to match your API shape to the target provider. The gateway accepts your request in whichever supported shape your SDK speaks, translates it for the provider and model the routing rule selects, and translates the response back into your shape. For example, a /v1/chat/completions request can be served by an Anthropic or Gemini model and still return an OpenAI Chat Completions response. Two guardrails bound this: the gateway routes only to targets that the ingress shape can be translated to, and a /v1/responses request whose target is not natively a Responses provider is rejected with a Responses-shaped 400.

Choosing the model

The request's model field drives routing. Send a concrete model and the gateway resolves it through the active routing rules; send the auto sentinel to hand model selection to the smart router. Provider-specific parameters that have no OpenAI equivalent travel in the nexus.ext.<provider>.<key> namespace on the request body.
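
As a rough illustration, a request body that uses the sentinel and a provider-specific parameter might be built as below. This sketch rests on two assumptions that this page does not confirm: that the sentinel is the literal string "auto", and that the nexus.ext.<provider>.<key> namespace is expressed as nested JSON objects. The helper name build_body is hypothetical, and top_k is used only as an example of an Anthropic parameter with no OpenAI Chat Completions equivalent. Check the ingress API guide for the actual wire format.

```python
from typing import Any, Optional


def build_body(model: str,
               messages: list[dict[str, Any]],
               provider_ext: Optional[dict[str, dict[str, Any]]] = None) -> dict[str, Any]:
    """Build a Chat Completions-shaped body with optional provider extensions.

    provider_ext maps provider name -> {key: value}.
    ASSUMPTION: the dotted namespace nexus.ext.<provider>.<key> is nested JSON.
    """
    body: dict[str, Any] = {"model": model, "messages": messages}
    if provider_ext:
        body["nexus"] = {"ext": provider_ext}
    return body


messages = [{"role": "user", "content": "Hello"}]

# Concrete model: resolved through the active routing rules.
pinned = build_body("gpt-4o-mini", messages)

# ASSUMPTION: "auto" is the literal sentinel that hands selection to the smart router.
routed = build_body("auto", messages, {"anthropic": {"top_k": 40}})
```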

Streaming

Set stream: true (or use a provider-native streaming path) to receive a Server-Sent Events stream. The stream is emitted in the event grammar of the API shape you called, regardless of which provider served it.
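
Most SDKs parse the stream for you. For callers reading it by hand, the following is a minimal sketch of consuming a stream in the OpenAI Chat Completions grammar, where each event carries a JSON chunk in its data field and the stream ends with a data: [DONE] event. The function names are hypothetical, the parser handles only the data field of the SSE format, and other shapes such as Anthropic Messages use different event payloads that this sketch does not cover.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each Server-Sent Event from decoded text lines."""
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "data":
            data_parts.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.


def collect_chat_text(lines: Iterable[str]) -> str:
    """Concatenate the content deltas of a Chat Completions stream."""
    parts: list[str] = []
    for payload in iter_sse_data(lines):
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)
```

In practice the lines would come from the HTTP response body, decoded as UTF-8 and read line by line.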

Response headers and errors

Every response carries headers reporting what happened — the routed model and provider, the number of upstream attempts, the cache outcome, and quota and compliance-hook annotations. Errors are returned in the envelope of the API shape you called, with the HTTP status preserved, so your SDK's native error handling keeps working.
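
For callers not using an SDK, the sketch below shows one way to read an error body in either of two envelopes. It assumes the gateway reproduces the providers' public error formats: OpenAI-shaped errors as {"error": {"message", "type", ...}} and Anthropic-shaped errors as {"type": "error", "error": {"type", "message"}}. The GatewayError type and parse_error function are hypothetical names. The response header names are not listed on this page, so none are shown here.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class GatewayError:
    status: int                 # HTTP status, preserved by the gateway
    error_type: Optional[str]   # e.g. "invalid_request_error", if present
    message: str


def parse_error(status: int, body_text: str) -> GatewayError:
    """Extract type and message from an OpenAI- or Anthropic-shaped error body."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        payload = None
    # Both envelopes nest the details under a top-level "error" object.
    err = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(err, dict):
        return GatewayError(status, err.get("type"), err.get("message") or body_text)
    # Unknown or non-JSON body: fall back to the raw text.
    return GatewayError(status, None, body_text)
```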
