API and Client Integration
The AI Gateway exposes an HTTP API that mirrors the provider SDK shapes you already use. You keep your SDK and change only the base URL and the credential. The full caller's reference is the ingress API guide; this page is the orientation.
Point your SDK's base URL at the gateway and use a Nexus virtual key as the credential, in either carrier:
- Authorization: Bearer <virtual-key>
- x-nexus-virtual-key: <virtual-key>
Virtual keys are prefixed nvk_. You never send the upstream provider's own API key — the gateway holds provider credentials and attaches them when it dispatches upstream.
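A minimal Python sketch of the two credential carriers described above. The helper name `auth_headers` and its `carrier` argument are illustrative, not part of the gateway or any SDK; the only facts taken from this page are the two header names and the `nvk_` prefix.

```python
def auth_headers(virtual_key: str, carrier: str = "authorization") -> dict[str, str]:
    """Return the credential header for one of the two carriers (hypothetical helper)."""
    # Guard against accidentally sending an upstream provider key to the gateway.
    if not virtual_key.startswith("nvk_"):
        raise ValueError("expected a Nexus virtual key starting with 'nvk_'")
    if carrier == "authorization":
        return {"Authorization": f"Bearer {virtual_key}"}
    if carrier == "x-nexus-virtual-key":
        return {"x-nexus-virtual-key": virtual_key}
    raise ValueError(f"unknown carrier: {carrier!r}")
```

Most SDKs already send their API key as a Bearer token, so passing the virtual key where the SDK expects its API key typically produces the first form without extra code.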
curl -sS http://localhost:3050/v1/chat/completions \
-H "Authorization: Bearer $VK" \
-H "Content-Type: application/json" \
-d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello"}]}'

| Endpoint | Shape | Streaming |
|---|---|---|
| POST /v1/chat/completions | OpenAI Chat Completions | yes (stream: true) |
| POST /v1/messages | Anthropic Messages | yes |
| POST /v1/responses | OpenAI Responses | yes |
| POST /v1/embeddings | OpenAI Embeddings | no |
| POST /v1/estimate | cost preview, no upstream call | no |
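For callers not using an SDK, the curl request above could be expressed with the Python standard library roughly as follows. This is a sketch, not an official client: `build_chat_request` and `send` are hypothetical names, and the base URL is simply the `localhost:3050` address from the curl example.

```python
import json
import urllib.request

# Taken from the curl example; adjust for your deployment.
GATEWAY_BASE_URL = "http://localhost:3050"


def build_chat_request(virtual_key, model, user_text, base_url=GATEWAY_BASE_URL):
    """Build (but do not send) a POST /v1/chat/completions request."""
    body = {"model": model, "messages": [{"role": "user", "content": user_text}]}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {virtual_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `send(build_chat_request(vk, "gpt-4o-mini", "Hello"))` against a running gateway should be equivalent to the curl command. Building the request separately from sending it keeps the construction step easy to inspect and test offline.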
Provider-native shims also exist for GLM, Azure OpenAI, and Gemini paths, so an SDK already pointed at one of those works against the gateway with only a base-URL and credential change. The canonical /v1/* routes are the primary surface.
You do not have to match your API shape to the target provider. The gateway accepts your request in whichever supported shape your SDK speaks, translates it to whatever provider and model the routing rule selects, and translates the response back into your shape — a /v1/chat/completions request can be served by an Anthropic or Gemini model and still return an OpenAI Chat Completions response. Two guardrails bound this: the gateway only routes to targets the ingress shape can be translated to, and a /v1/responses request whose target is not natively a Responses provider is rejected with a Responses-shaped 400.
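Because the response comes back in the shape you called, response-handling code should not need to know which provider served the request. A small sketch for the `/v1/chat/completions` shape, assuming the standard OpenAI Chat Completions layout (`choices[0].message.content`); `first_message_text` is a hypothetical helper name.

```python
def first_message_text(chat_completion: dict) -> str:
    """Extract the assistant text from an OpenAI Chat Completions response dict."""
    choices = chat_completion.get("choices") or []
    if not choices:
        raise ValueError("response has no choices")
    # content can be null (for example on tool calls), so fall back to "".
    content = choices[0].get("message", {}).get("content")
    return content or ""
```

The same function would apply whether the routing rule selected an OpenAI, Anthropic, or Gemini model, since the gateway translates the upstream response before returning it.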
The request's model field drives routing. Send a concrete model and the gateway resolves it through the active routing rules; send the auto sentinel to hand model selection to the smart router. Provider-specific parameters that have no OpenAI equivalent travel in the nexus.ext.<provider>.<key> namespace on the request body.
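The sketch below shows one way a request body using the smart router and a provider-specific parameter might be assembled. It rests on several assumptions that this page does not confirm: that the sentinel is the literal string `"auto"`, that the `nexus.ext.<provider>.<key>` namespace is expressed as nested JSON objects rather than dotted key names, and that `anthropic` / `top_k` are a valid provider and key. Check the ingress API guide for the exact encoding before relying on it.

```python
import copy


def with_provider_ext(body: dict, provider: str, **params) -> dict:
    """Return a copy of body with params placed under nexus.ext.<provider>.

    Assumption: the namespace is nested objects, i.e.
    {"nexus": {"ext": {"<provider>": {"<key>": value}}}}.
    """
    out = copy.deepcopy(body)  # leave the caller's dict untouched
    ext = out.setdefault("nexus", {}).setdefault("ext", {})
    ext.setdefault(provider, {}).update(params)
    return out


base = {
    "model": "auto",  # assumed literal form of the auto sentinel
    "messages": [{"role": "user", "content": "Hello"}],
}
# "anthropic" and "top_k" are illustrative values, not confirmed by this page.
request_body = with_provider_ext(base, "anthropic", top_k=40)
```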
Set stream: true (or use a provider-native streaming path) to receive a Server-Sent Events stream. The stream is emitted in the event grammar of the API shape you called, regardless of which provider served it.
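As an illustration of consuming such a stream, here is a minimal parser for the `/v1/chat/completions` event grammar, assuming the standard OpenAI format of `data: {json}` lines terminated by `data: [DONE]`. The function name `iter_chat_deltas` is hypothetical, and the `/v1/messages` and `/v1/responses` shapes use different, named-event grammars that this sketch does not cover.

```python
import json
from typing import Iterable, Iterator


def iter_chat_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from a Chat Completions SSE stream, line by line."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end-of-stream marker in the Chat Completions grammar
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

In practice `lines` would be the decoded lines of the HTTP response body; an SDK pointed at the gateway normally does this parsing for you.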
Every response carries headers reporting what happened — the routed model and provider, the number of upstream attempts, the cache outcome, and quota and compliance-hook annotations. Errors are returned in the envelope of the API shape you called, with the HTTP status preserved, so your SDK's native error handling keeps working.
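A rough sketch of handling both pieces without an SDK. The error parser assumes the OpenAI-style envelope (`{"error": {"message", "type", "code"}}`) that a `/v1/chat/completions` caller would receive. This page does not list the response header names, so the `x-nexus-` prefix used as a default below is an assumption borrowed from the request header name; consult the ingress API guide for the real names. Both function names are hypothetical.

```python
import json


def parse_openai_error(status: int, body: bytes) -> dict:
    """Flatten an OpenAI-shaped error envelope, keeping the HTTP status."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except ValueError:  # covers invalid JSON and invalid UTF-8
        payload = None
    err = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(err, dict):
        err = {}
    return {
        "status": status,
        "type": err.get("type"),
        "code": err.get("code"),
        "message": err.get("message"),
    }


def gateway_headers(headers: dict, prefix: str = "x-nexus-") -> dict:
    """Select response headers by prefix (the default prefix is an assumption)."""
    return {k.lower(): v for k, v in headers.items() if k.lower().startswith(prefix)}
```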
- Ingress API guide — the complete caller's reference.
- Provider coverage — which providers and models are supported.
- Getting Started — your first request end to end.
Nexus Gateway · Enterprise AI traffic gateway for compliance, routing, caching, and analytics.