Transform AbstractCore into an OpenAI-compatible API server. One server, all models, any client.
If you want a dedicated single-model /v1 server (one provider/model per worker), see Endpoint.
Visit while the server is running:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- Lightweight endpoint index: http://localhost:8000/docs-lite
Swagger UI keeps its standard Authorize button when server auth is enabled.
When ABSTRACTCORE_AUTH_TOKEN is set, AbstractCore wraps that authorize flow
and validates the entered bearer token through /acore/auth/validate
before Swagger stores it for Try it out requests. Invalid tokens stay
unauthorized and render an auth error inside the modal. The docs and OpenAPI
schema are public by default so the UI can load before authentication, but API
operations remain protected. Set ABSTRACTCORE_SERVER_PROTECT_DOCS=1 if you
also want /docs, /docs-lite, /redoc, and /openapi.json behind server auth.
When server auth is disabled, the server bearer scheme is omitted from the docs,
so Swagger does not render a misleading server-token authorize flow.
The OpenAPI schema includes executable examples for every request body. JSON
examples intentionally show optional aliases as null when sending both fields
would be ambiguous; the server drops nulls before routing. For local/custom
OpenAI-compatible endpoints, set base_url only when you intentionally want to
route away from the provider's default API host.
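The null-dropping behaviour described above can be pictured with a minimal sketch. This is not the server's code: the helper name `drop_nulls` is hypothetical, and it assumes only top-level keys are pruned, which the text does not confirm.

```python
import json


def drop_nulls(body: dict) -> dict:
    """Return a copy of a request body without top-level keys set to None.

    Hypothetical helper: the server's real pruning logic may differ
    (for example, it may also prune nested objects).
    """
    return {key: value for key, value in body.items() if value is not None}


# A schema-style example where the alias "input" is left as null
# because "messages" is the field actually being sent.
example = json.loads(
    '{"model": "openai/gpt-4o-mini", "input": null, '
    '"messages": [{"role": "user", "content": "Hello!"}]}'
)
cleaned = drop_nulls(example)
print(sorted(cleaned))
```

Copying an example verbatim from the schema should therefore behave the same as sending only the non-null fields.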
# Install
pip install "abstractcore[server]"
# Configure server auth and provider keys
export ABSTRACTCORE_AUTH_TOKEN="acore-server-secret"
export OPENAI_API_KEY="sk-..."
# Start server
python -m abstractcore.server.app
# Or with uvicorn directly
uvicorn abstractcore.server.app:app --host 0.0.0.0 --port 8000
# Test
curl http://localhost:8000/health
# Response: {"status":"healthy"}
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $ABSTRACTCORE_AUTH_TOKEN" \
-d '{
"model": "openai/gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello!"}]
}'
Or with Python:
import os
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key=os.environ["ABSTRACTCORE_AUTH_TOKEN"])
response = client.chat.completions.create(
model="anthropic/claude-haiku-4-5",
messages=[{"role": "user", "content": "Explain quantum computing"}]
)
print(response.choices[0].message.content)
You can configure the server through environment variables or through AbstractCore's centralized config. Environment variables always take precedence over config-persisted values.
# Persisted local/server config
abstractcore --set-server-auth-token acore-server-secret
abstractcore --set-api-key openai sk-...
abstractcore --set-api-key anthropic sk-ant-...
abstractcore --set-api-key openrouter sk-or-...
abstractcore --set-api-key portkey pk_...
# Optional hardening/defaults
abstractcore --set-server-base-url-allowlist "https://example.com/v1"
abstractcore --set-server-url-fetch-allowlist "https://files.example.com"
abstractcore --set-server-media-root /srv/abstractcore-media
abstractcore --set-server-host 127.0.0.1
abstractcore --set-server-port 8000
# Provider API keys
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export OPENROUTER_API_KEY="sk-or-..."
export PORTKEY_API_KEY="pk_..." # optional (Portkey)
export PORTKEY_CONFIG="pcfg_..." # required for Portkey routing
# Server auth token. Authenticated clients can use all server-configured providers.
export ABSTRACTCORE_AUTH_TOKEN="acore-server-secret"
# Optional: also protect /docs, /docs-lite, /redoc, and /openapi.json.
export ABSTRACTCORE_SERVER_PROTECT_DOCS=1
# Local providers
export OLLAMA_BASE_URL="http://localhost:11434" # (or legacy: OLLAMA_HOST)
export LMSTUDIO_BASE_URL="http://localhost:1234/v1"
export VLLM_BASE_URL="http://localhost:8000/v1"
export OPENAI_BASE_URL="http://localhost:1234/v1"
export OPENAI_API_KEY="your-endpoint-key" # optional, if the endpoint requires auth
# Server bind (only used by `python -m abstractcore.server.app`)
export HOST="0.0.0.0"
export PORT="8000"
# Debug mode
export ABSTRACTCORE_DEBUG=true
# Dangerous (multi-tenant hazard): allow unload_after for providers that can unload shared server state (e.g. Ollama)
export ABSTRACTCORE_ALLOW_UNSAFE_UNLOAD_AFTER=1
# Server security controls (recommended)
#
# - Request-level base_url overrides are loopback-only by default.
# URL entries match scheme + exact host + default/explicit port + path-segment prefix.
# Bare entries match hostname globs, e.g. "*.example.com".
export ABSTRACTCORE_SERVER_BASE_URL_ALLOWLIST="https://api.openai.com,https://example.com/v1"
#
# - Remote URL fetches for attachments are blocked for private/loopback/link-local targets by default (SSRF protection).
# To allow specific hosts/prefixes, use the same structured allowlist syntax:
export ABSTRACTCORE_SERVER_URL_FETCH_ALLOWLIST="https://www.berkshirehathaway.com"
#
# - Local file paths in HTTP requests are disabled by default (including @/path/to/file in message strings).
# To allow local file paths safely, restrict them under a single directory:
export ABSTRACTCORE_SERVER_MEDIA_ROOT="/srv/abstractcore-media"
#
# - Unsafe escape hatch: allow arbitrary local file paths from HTTP requests (not recommended)
export ABSTRACTCORE_SERVER_ALLOW_LOCAL_FILES=1
# Using AbstractCore's built-in CLI
python -m abstractcore.server.app --help # View all options
python -m abstractcore.server.app --debug # Debug mode
python -m abstractcore.server.app --host 127.0.0.1 --port 8080 # Custom host/port
python -m abstractcore.server.app --debug --port 8001 # Debug on custom port
# Using uvicorn directly
uvicorn abstractcore.server.app:app --reload # Development with auto-reload
uvicorn abstractcore.server.app:app --workers 4 # Production with multiple workers
uvicorn abstractcore.server.app:app --port 3000 # Custom port
All API operations except GET /health use the same server auth policy:
send Authorization: Bearer $ABSTRACTCORE_AUTH_TOKEN when ABSTRACTCORE_AUTH_TOKEN is configured.
Provider-key overrides use X-AbstractCore-Provider-API-Key. Provider keys in request bodies remain
disabled; select discovery endpoints accept an api_key query parameter for tooling/Swagger UI convenience.
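As a rough illustration of that header policy, the sketch below assembles the two headers for a client request. The helper name `build_headers` is hypothetical and not part of AbstractCore; only the header names come from the text above.

```python
import os
from typing import Dict, Optional


def build_headers(server_token: Optional[str] = None,
                  provider_key: Optional[str] = None) -> Dict[str, str]:
    """Assemble request headers for an AbstractCore server (hypothetical helper)."""
    headers = {"Content-Type": "application/json"}
    if server_token:
        # Server auth: sent only when ABSTRACTCORE_AUTH_TOKEN is configured.
        headers["Authorization"] = f"Bearer {server_token}"
    if provider_key:
        # Per-request upstream override; does not replace the server token.
        headers["X-AbstractCore-Provider-API-Key"] = provider_key
    return headers


headers = build_headers(
    server_token=os.environ.get("ABSTRACTCORE_AUTH_TOKEN"),
    provider_key=os.environ.get("ANTHROPIC_API_KEY"),
)
```

Keeping the two credentials in separate headers means a client never has to place a provider key in the JSON body.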
| Group | Method | Endpoint | Purpose | Main parameters |
|---|---|---|---|---|
| Health | GET | /health | Liveness/version probe; never requires auth | none |
| Configuration | GET | /v1/config/capability-defaults | List explicit input/output/embedding/rerank route defaults | none |
| Configuration | PUT | /v1/config/capability-defaults/{kind}/{modality} | Set one capability route default | path kind, modality; body provider, model, base_url, options |
| Configuration | DELETE | /v1/config/capability-defaults/{kind}/{modality} | Clear one capability route default | path kind, modality |
| Discovery | GET | /v1/models | List models and filter by provider/capabilities | provider, input_type, output_type, base_url, api_key |
| Discovery | GET | /providers | Provider status/capabilities | include_models |
| Discovery | GET | /v1/vision/providers/ | AbstractVision provider catalog for image/video generation models | optional task, provider, include_models, base_url, api_key |
| Discovery | GET | /v1/audio/voices | AbstractVoice voice/profile catalog for TTS | optional provider, model, providers_only, base_url, api_key |
| Discovery | GET | /v1/audio/speech/models | AbstractVoice TTS model/provider catalog | optional provider, base_url, api_key |
| Discovery | GET | /v1/audio/speech/providers | AbstractVoice TTS provider catalog | optional base_url |
| Discovery | GET | /v1/audio/transcriptions/models | AbstractVoice STT model/provider catalog | optional provider, base_url, api_key |
| Discovery | GET | /v1/audio/transcriptions/providers | AbstractVoice STT provider catalog | optional base_url |
| Discovery | GET | /v1/voice/clone/providers | AbstractVoice voice clone provider catalog | optional base_url |
| Chat | POST | /v1/chat/completions | OpenAI-compatible chat, streaming, tools, media | model, messages, stream, tools, tool_choice, temperature, max_tokens, base_url, agent_format, thinking |
| Chat | POST | /{provider}/v1/chat/completions | Provider-scoped chat route where body model is unprefixed | path provider, body model, messages, chat parameters |
| Responses | POST | /v1/responses | OpenAI Responses API (object:"response") + legacy chat fallback | model, input or messages, stream, generation parameters, base_url, agent_format, thinking, prompt_cache_key, prompt_cache_binding |
| Embeddings | POST | /v1/embeddings | OpenAI-compatible embedding vectors | model, input, dimensions, encoding_format, user, base_url |
| Images | POST | /v1/images/generations | Text-to-image generation | prompt, optional model, provider, base_url, width, height, size, n, steps, guidance_scale, seed, quality, extra |
| Images | POST | /{provider}/v1/images/generations | Provider-scoped text-to-image route where body model is unprefixed | path provider, body model, optional base_url, image generation parameters |
| Images | POST | /v1/images/edits | Image edit/inpaint via multipart form | prompt, image, optional mask, model, provider, base_url, size, steps, guidance_scale, seed, extra_json |
| Images | POST | /{provider}/v1/images/edits | Provider-scoped image edit route where body model is unprefixed | path provider, optional base_url, image edit form fields |
| Videos | POST | /v1/videos/generations | Text-to-video generation | prompt, optional model, provider, base_url, width, height, fps, num_frames, steps, guidance_scale, extra |
| Videos | POST | /{provider}/v1/videos/generations | Provider-scoped text-to-video route where body model is unprefixed | path provider, body model, optional base_url, video generation parameters |
| Videos | POST | /v1/videos/edits | Image-to-video via multipart form | prompt, image, optional model, provider, base_url, width, height, fps, num_frames, extra_json |
| Videos | POST | /{provider}/v1/videos/edits | Provider-scoped image-to-video route where body model is unprefixed | path provider, optional base_url, image-to-video form fields |
| Vision Jobs | POST | /v1/vision/jobs/images/generations | Async image generation with polling | same body as /v1/images/generations |
| Vision Jobs | POST | /v1/vision/jobs/images/edits | Async image edit with polling | same form fields as /v1/images/edits |
| Vision Jobs | POST | /v1/vision/jobs/videos/generations | Async text-to-video with polling and progress events | same body as /v1/videos/generations |
| Vision Jobs | POST | /v1/vision/jobs/videos/edits | Async image-to-video with polling and progress events | same form fields as /v1/videos/edits |
| Vision Jobs | GET | /v1/vision/jobs/{job_id} | Poll/consume async job state | path job_id, query consume |
| Vision Models | GET | /v1/vision/models | Available AbstractVision model catalog | optional task, provider, base_url, api_key |
| Audio | POST | /v1/audio/transcriptions | Speech-to-text multipart endpoint | file, optional provider, model, language, prompt, response_format, temperature, format, base_url |
| Audio | POST | /{provider}/v1/audio/transcriptions | Provider-scoped speech-to-text route where body model is unprefixed | path provider, optional base_url, STT form fields |
| Audio | POST | /v1/audio/speech | Text-to-speech endpoint | input/text, optional provider, model, voice, response_format/format, speed, instructions, profile, quality_preset, quality, base_url |
| Audio | POST | /{provider}/v1/audio/speech | Provider-scoped text-to-speech route where body model is unprefixed | path provider, optional base_url, TTS body fields |
| Audio | POST | /v1/voice/clone | AbstractVoice-compatible voice-clone/custom-voice extension | file, optional provider, model, tts_model, cloning_engine, base_url, name, reference_text, validate |
| Audio | POST | /{provider}/v1/voice/clone | Provider-scoped voice-clone route where body model is unprefixed | path provider, optional base_url, voice-clone form fields |
| Audio | POST | /v1/audio/translations | Reserved OpenAI-compatible translation route | file, model; returns 501 in this version |
| Audio | POST | /v1/audio/music | Extension endpoint for text-to-music plugins | prompt/input/text, optional provider, model, lyrics, duration_s, seed, num_inference_steps, guidance_scale, format; requires a music capability plugin |
| Audio | POST | /{provider}/v1/audio/music | Backend-scoped text-to-music route | path provider, music body fields |
| Runtime | POST | /acore/models/load | Load and keep warm a task-specific model runtime | optional task (text_generation default, image_generation, video_generation, text_to_video, image_to_video, tts, stt), provider, model, options, pin, base_url, timeout_s |
| Runtime | GET | /acore/models/loaded | List task-aware loaded runtimes | optional task, provider, model |
| Runtime | POST | /acore/models/unload | Unload a task-specific runtime | runtime_id or provider + model, optional task, base_url, options |
| Prompt Cache | GET | /acore/prompt_cache/stats | Cache stats on a loaded gateway runtime or upstream AbstractEndpoint | provider + model or base_url; provider key header if required |
| Prompt Cache | GET | /acore/prompt_cache/capabilities | Cache capability discovery on a loaded gateway runtime or upstream AbstractEndpoint | provider + model or base_url; provider key header if required |
| Prompt Cache | POST | /acore/prompt_cache/set | Select/create a cache key locally or upstream | provider + model or base_url, key, make_default, ttl_s |
| Prompt Cache | POST | /acore/prompt_cache/update | Prepare prompt/messages/tools locally or upstream | provider + model or base_url, key, prompt or messages, system_prompt, tools, optional thinking, ttl_s |
| Prompt Cache | POST | /acore/prompt_cache/fork | Fork one cache key to another locally or upstream | provider + model or base_url, from_key, to_key, make_default, ttl_s |
| Prompt Cache | POST | /acore/prompt_cache/clear | Clear local or upstream cache state | provider + model or base_url, optional key |
| Prompt Cache | POST | /acore/prompt_cache/prepare_modules | Prepare reusable module/tool context locally or upstream | provider + model or base_url, namespace, modules, make_default, ttl_s, version |
| Memory Blocs | POST | /acore/blocs/upsert_text | Persist extracted text into the gateway-local bloc store or an upstream AbstractEndpoint bloc store | optional base_url, path, content, optional bloc metadata |
| Memory Blocs | GET | /acore/blocs | List gateway-local or upstream bloc records | optional base_url, sha256, bloc_id |
| Memory Blocs | GET | /acore/blocs/record | Inspect a gateway-local or upstream bloc record | optional base_url, sha256 or bloc_id |
| Memory Blocs | POST | /acore/blocs/delete | Delete one bloc with optional live KV safety checks | optional base_url, sha256 or bloc_id, delete_kv, clear_loaded, force, dry_run |
| Memory Blocs | GET | /acore/blocs/kv/manifest | Inspect a gateway-local or upstream bloc KV manifest | provider + model or base_url, sha256 or bloc_id, optional artifact_path |
| Memory Blocs | GET | /acore/blocs/kv/list | List manifest-backed bloc KV artifacts | optional base_url, provider, model, sha256, bloc_id |
| Memory Blocs | POST | /acore/blocs/kv/ensure | Compile or validate a local or upstream provider-backed bloc KV artifact | provider + model or base_url, sha256 or bloc_id, optional artifact_path, force_rebuild, debug |
| Memory Blocs | POST | /acore/blocs/kv/load | Load or fork a local or upstream provider-backed bloc KV artifact into a cache key | provider + model or base_url, sha256 or bloc_id, optional artifact_path, stable_cache_key, key, make_default, force_rebuild, debug |
| Memory Blocs | POST | /acore/blocs/kv/delete | Delete one bloc KV artifact with live-binding safety | provider + model or base_url when checking live state, sha256 or bloc_id, optional artifact_path, clear_loaded, force, dry_run, debug |
| Memory Blocs | POST | /acore/blocs/kv/prune | Delete matching bloc KV artifacts by filter | optional provider, model, base_url, sha256, bloc_id, clear_loaded, force, dry_run, debug |
| Capabilities | GET | /v1/capabilities | Inspect optional capability plugin availability and backend metadata | none |
| Capabilities | GET | /v1/capabilities/{capability}/providers | List normalized providers for one capability plugin | path capability, optional task |
| Capabilities | GET | /v1/capabilities/{capability}/models | List normalized models for one capability plugin | path capability, optional task, provider |
| Audio | GET | /v1/audio/music/providers | List music capability providers | optional task |
| Audio | GET | /v1/audio/music/models | List music capability models | optional task, provider |
/v1/config/capability-defaults exposes the execution host's explicit route
defaults for input, output, embedding, and rerank capabilities. Gateway
uses this route as its control-plane source when ABSTRACTCORE_SERVER_BASE_URL
points at a remote Core server.
These defaults are configuration only; they do not load a model. Use the
runtime residency routes under /acore/models/* to inspect or change
provider-loaded state.
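A minimal sketch of setting one capability default with the standard library follows. The `kind` and `modality` values (`output`, `image`), the unprefixed body `model`, and the helper name are assumptions for illustration; check the OpenAPI schema at /docs for the accepted values.

```python
import json
import urllib.request
from typing import Optional


def capability_default_request(base: str, kind: str, modality: str,
                               provider: str, model: str,
                               token: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a PUT for one capability route default.

    Hypothetical helper: the body fields follow the endpoint table
    (provider, model); base_url and options are omitted here.
    """
    url = f"{base.rstrip('/')}/v1/config/capability-defaults/{kind}/{modality}"
    body = json.dumps({"provider": provider, "model": model}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Assumed example values; nothing is sent until urlopen() is called.
req = capability_default_request(
    "http://localhost:8000", "output", "image", "openai", "gpt-image-1"
)
print(req.get_method(), req.full_url)
# To send it against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Because the route only records configuration, a successful call here should not be expected to load the model.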
- model usually uses provider/model format, for example openai/gpt-4o-mini, anthropic/claude-haiku-4-5, ollama/qwen3:4b, lmstudio/qwen/qwen3-vl-4b, or openai-compatible/my-model.
- base_url is an AbstractCore extension for routing a provider to a specific OpenAI-compatible endpoint. Loopback URLs are allowed by default; non-loopback URLs require ABSTRACTCORE_SERVER_BASE_URL_ALLOWLIST.
- Media routes also accept an optional provider routing hint. This is mainly useful when you omit model, use a provider-scoped route, or pair a custom base_url with the default local/plugin path.
- X-AbstractCore-Provider-API-Key overrides only the requested upstream provider for that request. It does not replace the AbstractCore server token.
- Provider keys in request bodies remain disabled; use X-AbstractCore-Provider-API-Key for per-request upstream overrides. Select discovery endpoints accept an api_key query parameter for tooling/Swagger UI convenience.
- Remote URL media fetches are SSRF-protected by default. Local file paths are disabled unless ABSTRACTCORE_SERVER_MEDIA_ROOT or ABSTRACTCORE_SERVER_ALLOW_LOCAL_FILES=1 is configured.
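To make the provider/model convention concrete, here is a small sketch that splits an id on its first slash, so model names that themselves contain slashes stay intact. The function name is hypothetical, and the server's own parsing (including its best-effort handling of bare names) may differ.

```python
from typing import Optional, Tuple


def split_model_id(model: str) -> Tuple[Optional[str], str]:
    """Split "provider/model" on the first "/" only (illustrative helper).

    Returns (None, model) for a bare model name with no provider prefix.
    """
    provider, sep, name = model.partition("/")
    if not sep:
        return None, model
    return provider, name


for model_id in ("openai/gpt-4o-mini", "ollama/qwen3:4b", "lmstudio/qwen/qwen3-vl-4b"):
    print(split_model_id(model_id))
```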
Endpoint: POST /v1/chat/completions
Standard OpenAI-compatible endpoint. Works with all providers.
Server auth:
- If ABSTRACTCORE_AUTH_TOKEN is configured, every non-health endpoint requires Authorization: Bearer $ABSTRACTCORE_AUTH_TOKEN. Authenticated clients can use all provider keys/endpoints configured on the server.
- If ABSTRACTCORE_AUTH_TOKEN is not configured, either set ABSTRACTCORE_SERVER_ALLOW_UNAUTHENTICATED=1 for intentional local/dev use, or provide an upstream provider key explicitly via X-AbstractCore-Provider-API-Key.
- Health checks (GET /health) are always unauthenticated.
Request:
{
"model": "provider/model-name",
"messages": [
{"role": "system", "content": "You are a helpful assistant"},
{"role": "user", "content": "Hello!"}
],
"temperature": 0.7,
"max_tokens": 1000,
"stream": false
}
Key Parameters:
- model (required): Prefer "provider/model-name" (e.g., "openai/gpt-4o-mini"). If you pass a bare model name (no /), the server will best-effort auto-detect a provider.
- messages (required): Array of message objects
- stream (optional): Enable streaming responses
- tools (optional): Tools for function calling
- agent_format (optional, AbstractCore extension): Tool-call syntax output format for agentic clients ("auto"|"openai"|"codex"|"qwen3"|"llama3"|"gemma"|"xml"|"passthrough"). When omitted, the server auto-detects from user-agent + model heuristics.
- api_key (deprecated/disabled, AbstractCore extension): Provider API keys are not accepted in request bodies. Configure provider keys on the server or use X-AbstractCore-Provider-API-Key for a per-request provider override. Select discovery endpoints accept an api_key query parameter for tooling/Swagger UI convenience.
- base_url (optional, AbstractCore extension): Override the provider endpoint (include /v1 for OpenAI-compatible servers like LM Studio / vLLM / OpenRouter)
- unload_after (optional, AbstractCore extension): If true, calls llm.unload_model(model) after the request completes. Disabled for ollama/* unless ABSTRACTCORE_ALLOW_UNSAFE_UNLOAD_AFTER=1.
- prompt_cache_key (optional, AbstractCore extension): Best-effort prompt caching key (semantics depend on provider/backend). See docs/prompt-caching.md.
- prompt_cache_binding (optional, AbstractCore extension): Exact durable bloc binding returned by /acore/blocs/kv/load. When supplied, the server verifies the cache key before generation or streaming; stale/missing bindings return 409.
- prompt_cache_retention (optional, AbstractCore extension): Prompt cache retention policy (OpenAI: "in_memory" or "24h"; ignored by other providers). See docs/prompt-caching.md.
- thinking (optional, AbstractCore extension): Unified thinking/reasoning control (null|"auto"|"on"|"off"|"none" or "low"|"medium"|"high"|"xhigh" when supported). Note: "none" is treated as an alias for "off".
- temperature, max_tokens, top_p: Standard LLM parameters
The server forwards thinking to the underlying provider using AbstractCore’s unified thinking mapping (see Generation Parameters).
Example (route to LM Studio + Qwen3.5, disable thinking):
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "lmstudio/qwen3.5-27b@q4_k_m",
"base_url": "http://localhost:1234/v1",
"messages": [{"role": "user", "content": "Compute 17*23 - 19*11. Reply with the integer only."}],
"thinking": "none",
"max_tokens": 64
}'
Notes:
- For Qwen3 / Qwen3.5 on LM Studio, thinking="none" maps to LM Studio’s template variables (enable_thinking/enableThinking) plus a Qwen template “hard switch” fallback (empty <think></think>) when needed. This avoids injecting “reasoning effort” instructions into the system prompt.
- Not every backend supports per-effort budgets for low|medium|high; when unavailable, levels degrade to “thinking enabled”.
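The accepted thinking values and the "none" alias can be summarised in a client-side sketch. This is an illustration of the documented semantics only; the server performs its own mapping, and the helper name is hypothetical.

```python
from typing import Optional

# Values listed for the unified thinking control.
THINKING_VALUES = {"auto", "on", "off", "none", "low", "medium", "high", "xhigh"}


def normalize_thinking(value: Optional[str]) -> Optional[str]:
    """Validate a thinking value and fold the "none" alias into "off"."""
    if value is None:
        return None  # null/omitted: leave the provider default untouched
    if value not in THINKING_VALUES:
        raise ValueError(f"unsupported thinking value: {value!r}")
    return "off" if value == "none" else value


print(normalize_thinking("none"))
print(normalize_thinking("medium"))
```

A client could run such a check before sending a request to catch typos early, rather than relying on a server-side error.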
Example with streaming:
import os
from openai import OpenAI
client = OpenAI(base_url="http://localhost:8000/v1", api_key=os.environ["ABSTRACTCORE_AUTH_TOKEN"])
stream = client.chat.completions.create(
model="ollama/qwen3-coder:30b",
messages=[{"role": "user", "content": "Write a story"}],
stream=True
)
for chunk in stream:
    # Some chunks carry no text delta (for example role or finish markers).
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
Route a provider to a specific endpoint (useful for remote OpenAI-compatible servers):
Security notes:
- Request-level base_url overrides are loopback-only by default. To allow additional origins or host globs, set ABSTRACTCORE_SERVER_BASE_URL_ALLOWLIST. URL entries are parsed and matched on scheme, exact host, effective port, and path-segment prefix.
- If the server has an environment provider key set (e.g. OPENAI_API_KEY) and you route to a non-loopback base_url, the request is refused unless the provider key was supplied explicitly with X-AbstractCore-Provider-API-Key, or with Authorization when server auth is disabled.
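The allowlist matching rules can be sketched as follows. This is an independent illustration of the stated rules (scheme, exact host, effective port, path-segment prefix, plus hostname globs for bare entries), not the server's implementation; the function names are hypothetical and the loopback default is deliberately left out.

```python
from fnmatch import fnmatch
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def _effective_port(parts):
    # An explicit port wins; otherwise fall back to the scheme default.
    return parts.port or DEFAULT_PORTS.get(parts.scheme)


def _segments(path):
    return [segment for segment in path.split("/") if segment]


def is_allowed(url: str, allowlist: str) -> bool:
    """Check a URL against a comma-separated allowlist string."""
    target = urlsplit(url)
    host = (target.hostname or "").lower()
    for raw in allowlist.split(","):
        entry = raw.strip()
        if not entry:
            continue
        if "://" not in entry:
            # Bare entry: treat it as a hostname glob such as "*.example.com".
            if fnmatch(host, entry.lower()):
                return True
            continue
        rule = urlsplit(entry)
        prefix = _segments(rule.path)
        if (target.scheme == rule.scheme
                and host == (rule.hostname or "").lower()
                and _effective_port(target) == _effective_port(rule)
                and _segments(target.path)[:len(prefix)] == prefix):
            return True
    return False


ALLOW = "https://api.openai.com,https://example.com/v1,*.example.org"
print(is_allowed("https://example.com/v1/chat/completions", ALLOW))
print(is_allowed("https://example.com/v10", ALLOW))
```

Matching on whole path segments rather than raw string prefixes is what keeps /v10 from slipping through an entry for /v1.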
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "lmstudio/qwen/qwen3-4b-2507",
"base_url": "http://localhost:1234/v1",
"messages": [{"role": "user", "content": "Hello from a remote LM Studio endpoint"}]
}'
Do not put provider keys in request bodies. Those fields are disabled because they leak through
logs, shell history, browser history, and reverse proxies. For discovery/model catalog endpoints,
an api_key query parameter exists for tooling/Swagger UI convenience, but headers remain preferred.
# Preferred: configure provider keys on the server and authenticate to AbstractCore.
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $ABSTRACTCORE_AUTH_TOKEN" \
-d '{
"model": "openai/gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello!"}]
}'
When ABSTRACTCORE_AUTH_TOKEN is not configured, either set
ABSTRACTCORE_SERVER_ALLOW_UNAUTHENTICATED=1 for intentional local/dev use, or provide an
upstream provider key explicitly via X-AbstractCore-Provider-API-Key. Once server auth is
enabled, Authorization is reserved for the AbstractCore server auth token and is never forwarded
upstream.
To override a single upstream provider while still using the server auth token, send the provider
key in X-AbstractCore-Provider-API-Key. The override applies only to the requested provider:
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $ABSTRACTCORE_AUTH_TOKEN" \
-H "X-AbstractCore-Provider-API-Key: $ANTHROPIC_API_KEY" \
-d '{
"model": "anthropic/claude-haiku-4-5",
"messages": [{"role": "user", "content": "Hello!"}]
}'
Endpoint: POST /{provider}/v1/chat/completions
This route is useful for clients that already route by base URL path and expect
the body model to be provider-local. It is equivalent to using
POST /v1/chat/completions with model="{provider}/{model}".
Parameters:
- Path provider (required): provider route prefix such as openai, anthropic, ollama, openrouter, portkey, lmstudio, vllm, or openai-compatible.
- Body model (required): provider-local model id, without the provider prefix.
- Body messages, stream, tools, tool_choice, agent_format, thinking, base_url, and other chat parameters behave like /v1/chat/completions.
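The stated equivalence between the two chat routes can be sketched as a small conversion. The helper name is hypothetical; it only rewrites the path and the body model and leaves every other field untouched.

```python
from typing import Dict, Tuple


def to_provider_scoped(body: Dict) -> Tuple[str, Dict]:
    """Convert a /v1/chat/completions body into its provider-scoped form."""
    provider, sep, local_model = body["model"].partition("/")
    if not sep:
        raise ValueError("expected a provider/model id")
    # Copy the body so the caller's dict is not mutated.
    scoped = dict(body, model=local_model)
    return f"/{provider}/v1/chat/completions", scoped


path, scoped_body = to_provider_scoped({
    "model": "openai/gpt-4o-mini",
    "messages": [{"role": "user", "content": "Hello!"}],
})
print(path, scoped_body["model"])
```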
Example:
curl -X POST http://localhost:8000/openai/v1/chat/completions \
-H "Authorization: Bearer $ABSTRACTCORE_AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o-mini",
"messages": [{"role": "user", "content": "Hello!"}]
}'
AbstractCore Server can optionally expose OpenAI-compatible image/video generation and audio endpoints.
Important notes:
- These are interoperability-first endpoints (return b64_json or raw bytes), not an artifact-first durability contract.
- If the required plugin/backend is not available, the server returns 501 with actionable messaging.
Thin clients can preflight the configured media surface without importing
abstractvision or abstractvoice directly:
| Endpoint | Purpose | Notes |
|---|---|---|
| GET /v1/vision/providers/ | Lists provider image/video catalog entries through the selected AbstractVision backend. | Optional task, provider, include_models, base_url, api_key. Set include_models=true to include full provider model catalogs (slower). |
| GET /v1/audio/voices | Lists TTS profiles/voices, active profile, active model, and bounded catalog data through AbstractVoice. | Optional provider, model, providers_only, base_url, api_key. |
| GET /v1/audio/speech/models | TTS model id projection with provider/model route strings. | Includes models_by_provider and provider_models for clients that route via provider/model. |
| GET /v1/audio/speech/providers | TTS provider projection. | Useful for clients that pick /{provider}/v1/audio/speech first and then choose a model. |
| GET /v1/audio/transcriptions/models | STT model id projection with provider/model route strings. | Includes models_by_provider and provider_models. |
| GET /v1/audio/transcriptions/providers | STT provider projection. | Mirrors speech provider discovery for /v1/audio/transcriptions. |
| GET /v1/voice/clone/providers | Voice cloning provider projection. | Uses AbstractVoice clone provider availability. |
These routes instantiate only the selected capability backend needed for deep
catalog discovery. Shallow plugin availability remains available through the
library llm.capabilities.status() call. Server-held provider keys remain behind
server auth; per-request upstream key overrides must use
X-AbstractCore-Provider-API-Key. For tooling/Swagger UI convenience, these catalog routes also accept an api_key query parameter (redacted from server logs).
Endpoints:
- POST /v1/images/generations
- POST /{provider}/v1/images/generations
- POST /v1/images/edits
- POST /{provider}/v1/images/edits
Remote OpenAI-compatible image proxying is included in abstractcore[server]
and is enabled by setting OPENAI_BASE_URL.
The synchronous image routes use the same internal generate(..., output="image")
dispatcher as the Python API, then serialize the result back to the
OpenAI-compatible b64_json response shape.
Install for remote image proxying:
pip install "abstractcore[server]"
Install local image backends only when you want the server to load Diffusers, MLX-Gen, or stable-diffusion.cpp models itself:
pip install "abstractcore[server,vision]"
Use provider/model-style image ids:
- Omit model only when this server has a configured AbstractVision/OpenAI-compatible image default, for example via OPENAI_BASE_URL plus an optional default model id.
- Provider-scoped routes such as /openai-compatible/v1/images/generations and /diffusers/v1/images/generations accept an unprefixed body model and internally route it as provider/model, matching /{provider}/v1/chat/completions.
- diffusers/default selects the configured local Diffusers default: ABSTRACTCORE_VISION_MODEL_ID / ABSTRACTVISION_DIFFUSERS_MODEL_ID / ABSTRACTVISION_MODEL_ID.
- diffusers/<huggingface-repo> selects an explicit local Diffusers model.
- mlx-gen/default selects the configured local MLX-Gen model; use AbstractVision's q4 AbstractFramework presets by default and q8 variants when quality is paramount.
- mlx-gen/<exact-huggingface-repo> selects an explicit cached MLX-Gen model such as mlx-gen/AbstractFramework/flux.2-klein-4b-4bit or mlx-gen/AbstractFramework/qwen-image-edit-2511-4bit. Official MLX-Gen runtime snapshots such as mlx-gen/briaai/FIBO and mlx-gen/Wan-AI/Wan2.2-TI2V-5B-Diffusers are selected the same way. Legacy mflux prefixes remain accepted as compatibility aliases, but the model id itself must be the exact published repo id.
- sdcpp/default selects the configured stable-diffusion.cpp model.
- openai-compatible/<model> routes to the configured OpenAI-compatible image endpoint.
- openai/gpt-image-1 or provider-scoped /openai/v1/images/generations routes to OpenAI's Images API and uses OPENAI_API_KEY when an explicit AbstractVision upstream base URL is not configured.
Local Diffusers generation is cache-only by default; set
ABSTRACTCORE_VISION_ALLOW_DOWNLOAD=1 or
ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1 only when runtime downloads are
intentional.
POST /v1/images/generations JSON parameters:
| Field | Required | Notes |
|---|---|---|
| prompt | yes | Text prompt to render. |
| model | no | Omit for the server's configured AbstractVision default. If present, use provider/model routing: diffusers/default, diffusers/<huggingface-repo>, mlx-gen/default, mlx-gen/<exact-huggingface-repo>, sdcpp/default, openai-compatible/<model>, or openai/gpt-image-1. Provider-scoped routes accept the same model without the prefix. |
| provider | no | Optional routing hint when you want the configured default model/backend for a specific provider, or when pairing a request with base_url. |
| width, height | no | Requested output dimensions in pixels. These are the natural fields for local engines and remain accepted for remote routes. |
| size | no | OpenAI-style size such as 1024x1024. The server normalizes size with width/height so OpenAI-style and local-engine clients can use the same route. |
| n | no | Number of images; clamped to 1..10. |
| response_format | no | Server response format. b64_json is the supported response shape. |
| negative_prompt | no | Local/backend-specific negative prompt. Strict OpenAI-compatible upstreams do not receive this top-level field; use extra only when your custom upstream supports it. |
| seed | no | Local deterministic seed. Strict OpenAI-compatible upstreams do not receive this top-level field; use extra.seed only when your custom upstream supports it. |
| steps | no | Local denoising/inference step count. Strict OpenAI-compatible upstreams do not receive this top-level field; use extra.steps only when your custom upstream supports it. |
| guidance_scale | no | Local classifier-free guidance scale. Strict OpenAI-compatible upstreams do not receive this top-level field; use extra.guidance_scale only when your custom upstream supports it. |
| quality, style, user, background, output_format, output_compression, moderation | no | Named OpenAI-compatible passthrough fields for upstream image endpoints. |
| base_url | no | OpenAI-compatible endpoint override. Prefer this with openai-compatible/...; if set with openai/..., the request is sent to that URL instead of api.openai.com. Loopback is allowed by default; non-loopback requires allowlist. |
| extra | no | JSON object for backend-specific passthrough fields. Prefer this over arbitrary top-level keys so the schema stays explicit. |
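The request and response handling described by these fields can be sketched in Python. This is a minimal sketch, not AbstractCore code: `build_image_request` and `decode_images` are hypothetical helper names, the `n` clamp only mirrors the documented server behavior on the client side, and the response shape `{"data":[{"b64_json": ...}]}` is taken from the shell examples in this section.

```python
import base64
from typing import Any, Dict, List, Optional


def build_image_request(prompt: str, model: Optional[str] = None,
                        size: Optional[str] = None, n: int = 1,
                        **extra: Any) -> Dict[str, Any]:
    """Build a JSON body for POST /v1/images/generations (hypothetical helper)."""
    body: Dict[str, Any] = {
        "prompt": prompt,
        "n": max(1, min(10, n)),        # the server clamps n to 1..10 as well
        "response_format": "b64_json",  # the supported response shape
    }
    if model is not None:               # omit to use the server default
        body["model"] = model
    if size is not None:
        body["size"] = size             # OpenAI-style, e.g. "1024x1024"
    if extra:
        body["extra"] = dict(extra)     # backend-specific passthrough fields
    return body


def decode_images(response: Dict[str, Any]) -> List[bytes]:
    """Decode every b64_json item of a generation response into raw bytes."""
    return [base64.b64decode(item["b64_json"]) for item in response["data"]]
```

Keeping backend-specific keys such as `seed` under `extra` follows the table's advice to avoid arbitrary top-level keys.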
POST /v1/images/edits multipart parameters:
| Field | Required | Notes |
|---|---|---|
| prompt | yes | Edit/inpaint instruction. |
| image | yes | Source image file. |
| mask | no | Optional mask image for inpainting/edit-capable backends. |
| model | no | Same provider/model routing as generation; omit for the server default. Provider-scoped routes accept the same model without the prefix. |
| provider | no | Optional routing hint when you want the configured default backend for a specific provider, or when pairing a request with base_url. |
| size | no | OpenAI-style edit output size such as 1024x1024; multipart edit compatibility keeps this field. |
| response_format | no | Server response shape; b64_json is supported. |
| negative_prompt, seed, steps, guidance_scale | no | Local/backend-specific fields. Strict OpenAI-compatible upstreams do not receive them as top-level fields; use extra_json only when your custom upstream supports them. |
| base_url | no | OpenAI-compatible endpoint override. Loopback is allowed by default; non-loopback requires allowlist. |
| extra_json | no | JSON object string with backend/upstream-specific parameters. |
Async image jobs are available when a request can take long enough that polling is preferable:
- POST /v1/vision/jobs/images/generations uses the same JSON body as /v1/images/generations and returns {"job_id": "..."}.
- POST /v1/vision/jobs/images/edits uses the same multipart fields as /v1/images/edits and returns {"job_id": "..."}.
- GET /v1/vision/jobs/{job_id} returns queued, running, succeeded, or failed. Add ?consume=true to remove a completed job from the in-memory job store after reading it.
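A polling loop for these job routes might look like the sketch below. It assumes the job state is reported in a `status` field, which this page does not confirm, and it takes the HTTP call as an injected `fetch` callable so the loop stays independent of any client library. `wait_for_job` is a hypothetical name.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(fetch: Callable[[str], Dict[str, Any]], job_id: str,
                 interval_s: float = 2.0, timeout_s: float = 600.0) -> Dict[str, Any]:
    """Poll until the job reaches a terminal state or the timeout passes.

    `fetch` is any callable that performs GET /v1/vision/jobs/{job_id}
    and returns the decoded JSON body.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch(job_id)
        state = job.get("status")  # assumed field name for the job state
        if state in ("succeeded", "failed"):
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {state!r} after {timeout_s}s")
        time.sleep(interval_s)
```

Pass `?consume=true` only on the final read if you want the completed job removed from the in-memory store.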
Endpoints:
- POST /v1/videos/generations
- POST /{provider}/v1/videos/generations
- POST /v1/videos/edits
- POST /{provider}/v1/videos/edits
- POST /v1/vision/jobs/videos/generations
- POST /v1/vision/jobs/videos/edits
The synchronous video routes use the same internal
generate(..., output={"modality": "video"}) dispatcher as the Python API and
return {"data":[{"b64_json":"..."}]} with MP4 bytes encoded in base64.
Async video jobs are the preferred path for longer local runs; polling
GET /v1/vision/jobs/{job_id} includes progress.last_event when the selected
backend reports richer progress events.
Use exact provider/model ids. For MLX-Gen, select the published model repo id, for example:
- mlx-gen/Wan-AI/Wan2.2-TI2V-5B-Diffusers for text-to-video or image-to-video.
- mlx-gen/AbstractFramework/qwen-image-2512-4bit for text-to-image.
Core does not expose a quantization override. Q4/Q8 choices are part of the model id that AbstractVision/MLX-Gen loads.
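Because a Hugging Face repo id contains its own slash, only the first slash in a provider/model id separates the provider from the model. The sketch below illustrates that reading; `split_model_id` is a hypothetical helper, and the server's own parsing may differ in edge cases.

```python
from typing import Tuple


def split_model_id(model_id: str) -> Tuple[str, str]:
    """Split 'provider/model' on the first slash only (hypothetical helper)."""
    provider, sep, model = model_id.partition("/")
    if not sep or not provider or not model:
        raise ValueError(f"expected provider/model, got {model_id!r}")
    return provider, model
```

The two halves correspond to the separate `provider` and `model` fields used in the video examples later in this section.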
POST /v1/videos/generations JSON parameters:
| Field | Required | Notes |
|---|---|---|
| prompt | yes | Text prompt to render as video. |
| model | no | Provider/model id such as mlx-gen/Wan-AI/Wan2.2-TI2V-5B-Diffusers or openai-compatible/<model>. Provider-scoped routes accept the same model without the prefix. |
| provider | no | Optional routing hint, e.g. mlx-gen or openai-compatible. |
| width, height, size | no | Requested output dimensions. size accepts WIDTHxHEIGHT. |
| fps, num_frames / frames | no | Video frame rate and frame count. |
| response_format | no | b64_json is the supported response shape. |
| negative_prompt, seed, steps, guidance_scale | no | Backend-specific generation controls. |
| extra.max_sequence_length | no | Useful for MLX-Gen Wan-style video runs. |
POST /v1/videos/edits multipart parameters mirror generation and add
required image=@first-frame.png. This route is the image-to-video path; the
alias /v1/videos/from-image is accepted for clients that prefer a literal
name.
Examples:
# Remote OpenAI-compatible image endpoint.
BASE=http://127.0.0.1:8000
TOKEN=replace-with-server-token
curl -sS -X POST "$BASE/v1/images/generations" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"model":"openai-compatible/gpt-image-1","prompt":"A clean product photo of a red ceramic mug on a white table.","n":1,"width":1024,"height":1024,"response_format":"b64_json","quality":"low"}' \
> /tmp/acore-image.json
python - <<'PY'
import base64
import json
from pathlib import Path
data = json.loads(Path("/tmp/acore-image.json").read_text())
Path("/tmp/acore-image.png").write_bytes(base64.b64decode(data["data"][0]["b64_json"]))
PY
# Image edit using the generated image.
curl -sS -X POST "$BASE/v1/images/edits" \
-H "Authorization: Bearer $TOKEN" \
-F "model=openai-compatible/gpt-image-1" \
-F "prompt=Make the mug blue while keeping the white table." \
-F "image=@/tmp/acore-image.png;type=image/png" \
-F "size=1024x1024" \
-F "response_format=b64_json" \
-F 'extra_json={"quality":"low"}' \
> /tmp/acore-edit.json
python - <<'PY'
import base64
import json
from pathlib import Path
data = json.loads(Path("/tmp/acore-edit.json").read_text())
Path("/tmp/acore-edit.png").write_bytes(base64.b64decode(data["data"][0]["b64_json"]))
PY
# Configured server image default
curl -sS -X POST "$BASE/v1/images/generations" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"prompt":"a red fox in snow","width":512,"height":512,"response_format":"b64_json"}'
# Text-to-video, asynchronous job with progress polling.
curl -sS -X POST "$BASE/v1/vision/jobs/videos/generations" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"provider":"mlx-gen","model":"Wan-AI/Wan2.2-TI2V-5B-Diffusers","prompt":"A slow camera move through a luminous data center.","width":1280,"height":704,"fps":24,"num_frames":121,"steps":50,"guidance_scale":5.0,"extra":{"max_sequence_length":256}}'
# Image-to-video, synchronous multipart route.
curl -sS -X POST "$BASE/v1/videos/edits" \
-H "Authorization: Bearer $TOKEN" \
-F "provider=mlx-gen" \
-F "model=Wan-AI/Wan2.2-TI2V-5B-Diffusers" \
-F "prompt=Slow camera push-in." \
-F "image=@./first-frame.png;type=image/png" \
-F "width=1280" \
-F "height=704" \
-F "fps=24" \
-F "num_frames=121" \
-F "steps=50" \
-F 'extra_json={"max_sequence_length":256}'
Local vision model helper endpoint:
| Endpoint | Purpose | Notes |
|---|---|---|
| GET /v1/vision/models | List available AbstractVision provider models. | Includes remote providers when their API key/base URL is configured and local models when they are present in known caches. |
Endpoints:
- POST /v1/audio/transcriptions (multipart; file=...)
- POST /{provider}/v1/audio/transcriptions (multipart; provider-scoped STT)
- POST /v1/audio/speech (json; input=..., optional voice, optional format)
- POST /{provider}/v1/audio/speech (json; provider-scoped TTS)
- POST /v1/voice/clone (multipart; extension route for AbstractVoice-compatible voice cloning)
- POST /{provider}/v1/voice/clone (multipart; provider-scoped voice cloning)
- POST /v1/audio/translations (multipart; reserved for compatibility, returns 501)
- POST /v1/audio/music (json; extension endpoint, requires a music capability plugin)
- POST /{provider}/v1/audio/music (json; provider/backend-scoped music route)
Local plugin fallback is enabled when model is omitted. OpenAI SDK-style
clients that require a non-empty model string can use abstractvoice/default.
Remote provider routing is enabled when model is supplied in provider/model format:
- openai/gpt-4o-mini-transcribe, openai/whisper-1
- openai/gpt-4o-mini-tts, openai/tts-1
- openrouter/... for OpenRouter STT/TTS models
- portkey/... for Portkey-routed OpenAI-compatible audio models
- openai-compatible/... for endpoints that implement OpenAI-compatible audio routes
Provider-scoped audio routes mirror chat routing. For example,
POST /openai/v1/audio/transcriptions with model=gpt-4o-mini-transcribe
is equivalent to POST /v1/audio/transcriptions with
model=openai/gpt-4o-mini-transcribe; the same applies to
/openai-compatible/v1/audio/speech and other supported provider prefixes.
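The equivalence between provider-scoped and unscoped routes can be expressed as a small rewrite. This is only an illustration of the rule stated above; `to_unscoped` is a hypothetical helper and not how the server is necessarily implemented.

```python
from typing import Tuple


def to_unscoped(path: str, model: str) -> Tuple[str, str]:
    """Rewrite '/{provider}/v1/...' plus a bare model into '/v1/...' plus 'provider/model'."""
    parts = path.lstrip("/").split("/", 1)
    if len(parts) != 2 or not parts[1].startswith("v1/"):
        raise ValueError(f"not a provider-scoped path: {path!r}")
    provider, rest = parts
    return "/" + rest, f"{provider}/{model}"
```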
For openai-compatible/..., request-level base_url can point to a local
AbstractVoice/OpenAI-compatible audio server. Loopback URLs are allowed by
default; non-loopback URLs require ABSTRACTCORE_SERVER_BASE_URL_ALLOWLIST.
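A client can check in advance whether a `base_url` counts as loopback before sending it. The sketch below is an assumption about what "loopback" means here (the name `localhost` or a loopback IP literal); the server's actual check and the format of the allowlist variable are not described on this page, so neither is modeled. `is_loopback_url` is a hypothetical helper.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_url(url: str) -> bool:
    """Return True when the URL host is 'localhost' or a loopback IP literal."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # a DNS name other than localhost
```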
If model is omitted, the endpoint delegates to local capability plugins
(typically abstractvoice) and returns 501 when no suitable plugin is installed.
Those local/plugin paths use the same internal generate(..., output=...)
dispatcher as the Python API; provider/model remote routes keep their
OpenAI-compatible HTTP wire behavior.
Install for remote audio:
pip install "abstractcore[server,remote]"
Install for plugin-backed routing:
pip install "abstractcore[server]"
pip install "abstractcore[voice]"
pip install "abstractcore[music]"
Notes:
- abstractvoice 0.10.17+ can install the base plugin path on Python 3.9 without OmniVoice, torch, or torchaudio. Python 3.10+ is recommended. Use explicit local aggregate profiles such as abstractcore[all-apple] or abstractcore[all-gpu] when you want local voice engines; AEC requires Python 3.11+.
- /v1/audio/transcriptions requires python-multipart for form parsing (included in the server extra).
- Uploaded audio is limited by ABSTRACTCORE_SERVER_AUDIO_MAX_BYTES (default: 25 MB).
POST /v1/audio/transcriptions multipart parameters:
| Field | Required | Notes |
|---|---|---|
| file | yes | Audio file to transcribe, commonly mp3, mp4, mpeg, mpga, m4a, wav, or webm. |
| model | no | Provider/model id for remote STT (openai/gpt-4o-mini-transcribe, openai/whisper-1, openrouter/..., portkey/..., openai-compatible/...). Omit for local abstractvoice plugin fallback; abstractvoice/default is accepted for clients that require a model string. |
| provider | no | Optional routing hint when omitting model, using a provider-scoped route, or pairing the request with base_url. |
| language | no | Input language code such as en or fr. |
| prompt | no | Provider transcription prompt/context. |
| response_format | no | Provider response format such as json, text, srt, or vtt. |
| temperature | no | Provider sampling temperature where supported. |
| format | no | Audio format override for providers that need it, notably OpenRouter base64 audio input. |
| base_url | no | Endpoint override for local/gateway routing. Prefer this with openai-compatible/...; if set with openai/..., the request is sent to that URL instead of api.openai.com. Loopback is allowed by default; non-loopback requires allowlist. |
POST /v1/audio/speech JSON parameters:
| Field | Required | Notes |
|---|---|---|
| input or text | yes | Text to synthesize. text is the AbstractCore-compatible alias. |
| model | no | Provider/model id for remote TTS (openai/gpt-4o-mini-tts, openai/tts-1, openrouter/..., portkey/..., openai-compatible/...). Omit for local plugin fallback; abstractvoice/default is accepted. |
| voice | no | Provider/backend voice name; remote OpenAI-compatible routing defaults to alloy. OpenAI TTS voices include alloy, ash, ballad, coral, echo, fable, nova, onyx, sage, shimmer, verse, marin, and cedar; the Swagger example uses coral. |
| response_format or format | no | Audio output format. Remote providers commonly support mp3, wav, opus, aac, flac, or pcm; local plugin fallback defaults to wav. |
| speed | no | Speech speed multiplier when supported. |
| instructions | no | Provider-specific style/instruction text for expressive TTS. |
| provider | no | Optional routing hint when omitting model, using a provider-scoped route, or pairing the request with base_url. |
| profile | no | AbstractVoice profile hint for compatible local/plugin backends. |
| quality_preset | no | AbstractVoice/local-backend quality preset when supported. |
| quality | no | OpenAI-compatible quality selector or backend-specific quality hint. |
| base_url | no | Endpoint override for local/gateway routing. Prefer this with openai-compatible/...; if set with openai/..., the request is sent to that URL instead of api.openai.com. Loopback is allowed by default; non-loopback requires allowlist. |
Swagger UI can execute /v1/audio/speech. AbstractCore serves a small custom
Swagger wrapper that converts authenticated binary audio POST responses into
browser blob: URLs before Swagger renders the player. The example uses
response_format="wav" because WAV has explicit duration metadata and is the
most reliable inline preview format. If a browser still cannot play the inline
preview, use the response download or a curl --output command; the endpoint
returns normal audio/* bytes and includes a filename in Content-Disposition.
POST /v1/voice/clone and POST /{provider}/v1/voice/clone multipart parameters:
| Field | Required | Notes |
|---|---|---|
| file | yes | Reference voice audio file. |
| model | no | Provider/model id for remote clone routing. Use openai-compatible/default for an AbstractVoice-compatible server, or openai/default where OpenAI custom voice creation is available. Omit for local AbstractVoice clone fallback. |
| provider | no | Optional routing hint when omitting model, using a provider-scoped route, or pairing the request with base_url. |
| tts_model | no | Optional TTS model to associate with the clone for compatible local/plugin backends. |
| cloning_engine | no | Optional clone backend/engine selector for compatible local/plugin backends. |
| name | no | Friendly cloned voice name. |
| reference_text | no | Transcript of the reference audio when available. |
| validate | no | Ask compatible clone servers to validate/smoke-test the clone before returning. |
| base_url | no | OpenAI-compatible endpoint override for openai-compatible/...; loopback is allowed by default, non-loopback requires allowlist. |
| clone_path | no | Provider-specific clone path. Defaults to /voice/clone for OpenAI-compatible servers and /audio/voices for OpenAI. |
| file_field | no | Provider-specific multipart file field. Defaults to file; OpenAI uses audio_sample. |
| consent | no | Provider-specific consent id when custom voice creation requires it. |
The returned voice_id / id can be used as the voice value in
/v1/audio/speech when the selected backend supports custom voices.
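Chaining a clone result into a speech request could be sketched as follows. `speech_body_for_clone` is a hypothetical helper; whether `model` or `base_url` must also be set depends on the backend that created the voice, so they are left out here.

```python
from typing import Any, Dict


def speech_body_for_clone(clone: Dict[str, Any], text: str) -> Dict[str, Any]:
    """Build a /v1/audio/speech JSON body that uses a cloned voice."""
    voice = clone.get("voice_id") or clone.get("id")  # either key may carry the id
    if not voice:
        raise ValueError("clone response has neither voice_id nor id")
    return {"input": text, "voice": voice, "response_format": "wav"}
```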
POST /v1/audio/music and POST /{provider}/v1/audio/music JSON parameters:
| Field | Required | Notes |
|---|---|---|
| prompt or input or text | yes | Music generation prompt. |
| provider | no | Music backend selector, for example acemusic, acestep, stable-audio, stable-audio-3, or diffusers. The provider-scoped path can also select a backend, e.g. /acemusic/v1/audio/music or /diffusers/v1/audio/music. |
| model | no | Music model id for the selected backend, for example acemusic/ace-step-api for remote ACE Music or a Hugging Face repo id for local AbstractMusic backends. |
| lyrics | no | Optional lyrics for vocal music backends. |
| duration_s | no | Requested output duration in seconds. |
| seed | no | Deterministic seed when supported. |
| num_inference_steps | no | Diffusion/sampling step count when supported. |
| guidance_scale | no | Guidance scale when supported. |
| instrumental | no | Request instrumental output when supported. |
| enhance_prompt / structure_prompt / auto_lyrics | no | Prompt/lyrics planning controls for compatible music backends. |
| text_planner_mode | no | Host/plugin text-planning mode such as auto, on, or off. |
| response_format or format | no | Server contract supports wav, mp3, and flac; backend support can be narrower. |
| extra top-level fields | no | Best-effort passthrough to the installed music capability plugin. |
With abstractmusic>=0.1.12, the base install includes the remote ACE Music
backend. Configure ACEMUSIC_API_KEY in the server environment, optionally set
ACEMUSIC_BASE_URL, and use provider="acemusic" or the /acemusic/v1/audio/music
path. Local ACE-Step/Diffusers routes remain opt-in AbstractMusic extras.
Examples:
BASE=http://127.0.0.1:8000
TOKEN=replace-with-server-token
# Local/plugin TTS through AbstractCore's unified output dispatcher.
curl -sS -X POST "$BASE/v1/audio/speech" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: audio/wav" \
-d '{"input":"Hello from the updated AbstractCore server.","voice":"coral","response_format":"wav"}' \
--output /tmp/acore-speech.wav
# Local/plugin STT through AbstractCore's unified output dispatcher.
curl -sS -X POST "$BASE/v1/audio/transcriptions" \
-H "Authorization: Bearer $TOKEN" \
-F "file=@/tmp/acore-speech.wav;type=audio/wav" \
-F "language=en"
# Remote speech-to-text (STT)
curl -sS -X POST "$BASE/v1/audio/transcriptions" \
-H "Authorization: Bearer $TOKEN" \
-F "file=@speech.wav" \
-F "model=openai/gpt-4o-mini-transcribe" \
-F "language=en"
# Remote text-to-speech (TTS)
curl -sS -X POST "$BASE/v1/audio/speech" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"model":"openai/gpt-4o-mini-tts","input":"Hello!","voice":"coral","response_format":"wav"}' \
--output hello.wav
# Local abstractvoice TTS through the OpenAI-compatible endpoint
curl -sS -X POST "$BASE/v1/audio/speech" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"model":"abstractvoice/default","input":"Hello!","voice":"alloy","format":"wav"}' \
--output hello.wav
# Remote ACE Music through AbstractMusic.
# Start the server with ACEMUSIC_API_KEY set in its environment.
curl -sS -X POST "$BASE/v1/audio/music" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"prompt":"A short calm piano loop.","provider":"acemusic","duration_s":8,"format":"mp3"}' \
--output music.mp3
# Remote/local OpenAI-compatible voice clone endpoint
curl -sS -X POST "$BASE/v1/voice/clone" \
-H "Authorization: Bearer $TOKEN" \
-F "file=@reference.wav" \
-F "model=openai-compatible/default" \
-F "base_url=http://127.0.0.1:5000/v1" \
-F "name=my_voice" \
-F "reference_text=Hello from my reference recording." \
-F "validate=true"
If you want to “ask a model about an audio file”, prefer one of:
- Run STT first (/v1/audio/transcriptions) then send the transcript to POST /v1/chat/completions, or
- Configure the server’s default audio strategy (config.audio.strategy) to enable STT fallback for audio attachments, then attach audio in chat requests.
The AbstractCore server supports file attachments through the OpenAI-compatible multimodal message format, plus AbstractCore's @filename syntax.
Security note (HTTP server): local file paths are disabled by default (including @/path/to/file and {"url": "/path/to/file"}).
Use http(s) URLs or data: base64, or enable local paths via ABSTRACTCORE_SERVER_MEDIA_ROOT (safe) / ABSTRACTCORE_SERVER_ALLOW_LOCAL_FILES=1 (unsafe).
Image analysis example using a local generated image:
BASE=http://127.0.0.1:8000
TOKEN=replace-with-server-token
python - <<'PY'
import base64
from pathlib import Path
Path("/tmp/acore-image.b64").write_text(base64.b64encode(Path("/tmp/acore-image.png").read_bytes()).decode("ascii"))
PY
jq -n --rawfile img /tmp/acore-image.b64 '{
model: "openai/gpt-4o-mini",
messages: [{
role: "user",
content: [
{type: "text", text: "Describe this image in one concise sentence."},
{type: "image_url", image_url: {url: ("data:image/png;base64," + $img)}}
]
}],
max_tokens: 80,
temperature: 0
}' > /tmp/acore-vision-chat.json
curl -sS -X POST "$BASE/v1/chat/completions" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
--data-binary @/tmp/acore-vision-chat.json \
| jq -r '.choices[0].message.content'
- Images: PNG, JPEG, GIF, WEBP, BMP, TIFF
- Documents: PDF, DOCX, XLSX, PPTX
- Data/Text: CSV, TSV, TXT, MD, JSON, XML
- Size Limits: 10MB per file, 32MB total per request
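A client-side check against these limits can catch oversized uploads before a request is sent. The sketch below assumes binary megabytes (1 MB = 1024 * 1024 bytes), which this page does not specify, and `check_attachment_sizes` is a hypothetical helper.

```python
from typing import Iterable, List

MB = 1024 * 1024           # assumption: binary megabytes
PER_FILE_LIMIT = 10 * MB   # documented per-file limit
TOTAL_LIMIT = 32 * MB      # documented per-request limit


def check_attachment_sizes(sizes: Iterable[int]) -> List[str]:
    """Return a list of problems; an empty list means the sizes fit the limits."""
    sizes = list(sizes)
    problems = [f"file {i} is {s} bytes (limit {PER_FILE_LIMIT})"
                for i, s in enumerate(sizes) if s > PER_FILE_LIMIT]
    total = sum(sizes)
    if total > TOTAL_LIMIT:
        problems.append(f"request total is {total} bytes (limit {TOTAL_LIMIT})")
    return problems
```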
Simple syntax that works with all providers (requires local paths enabled via ABSTRACTCORE_SERVER_MEDIA_ROOT or ABSTRACTCORE_SERVER_ALLOW_LOCAL_FILES=1):
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o",
"messages": [
{"role": "user", "content": "What is in this document? @/path/to/report.pdf"}
]
}'
Standard OpenAI format for images:
{
"model": "anthropic/claude-haiku-4-5",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "What is in this image?"},
{
"type": "image_url",
"image_url": {
"url": "https://example.com/image.jpg"
}
}
]
}
]
}
Base64 Images:
{
"type": "image_url",
"image_url": {
"url": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD..."
}
}
AbstractCore supports OpenAI's planned file format with simplified structure (consistent with image_url):
File URL Format (Recommended - Same Pattern as image_url):
{
"model": "ollama/qwen3:4b",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "Analyze this document"},
{
"type": "file",
"file_url": {
"url": "https://example.com/documents/report.pdf"
}
}
]
}
]
}
Local File Path:
{
"type": "file",
"file_url": {
"url": "/Users/username/documents/data.csv"
}
}
Note: local file paths require ABSTRACTCORE_SERVER_MEDIA_ROOT (safe) or ABSTRACTCORE_SERVER_ALLOW_LOCAL_FILES=1 (unsafe) on the server.
Base64 Data URL:
{
"type": "file",
"file_url": {
"url": "data:application/pdf;base64,JVBERi0xLjQKMSAwIG9iago8PAovVHlwZS..."
}
}
Filename Extraction:
- URLs/Paths: Extracted automatically (/path/file.pdf → file.pdf)
- Base64: Generated from MIME type (data:application/pdf;base64,... → document.pdf)
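The two filename rules above can be illustrated with a short sketch. It is an approximation rather than the server's code: `infer_filename` is a hypothetical helper, and the MIME map contains only the one mapping this page documents.

```python
import posixpath
from urllib.parse import urlparse

# Illustrative map with the single documented case; the server's real table is not shown here.
_EXT_BY_MIME = {"application/pdf": ".pdf"}


def infer_filename(url: str) -> str:
    """Derive a display filename from a URL, a path, or a data: URL."""
    if url.startswith("data:"):
        # "data:<mime>;base64,<payload>" -> take the MIME type only
        mime = url[5:].split(";", 1)[0].split(",", 1)[0]
        return "document" + _EXT_BY_MIME.get(mime, "")
    # URLs and plain paths: use the last path segment, ignoring any query string
    return posixpath.basename(urlparse(url).path)
```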
Combine text, images, and documents in a single request:
{
"model": "openai/gpt-4o",
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": "Compare this chart with the data in the spreadsheet"},
{
"type": "image_url",
"image_url": {"url": "data:image/png;base64,iVBORw0KGgoAAAANS..."}
},
{
"type": "file",
"file_url": {
"url": "https://example.com/data/sales_data.xlsx"
}
}
]
}
]
}
Using OpenAI Client:
import os
from openai import OpenAI
import base64
client = OpenAI(base_url="http://localhost:8000/v1", api_key=os.environ["ABSTRACTCORE_AUTH_TOKEN"])
# Method 1: @filename syntax
response = client.chat.completions.create(
model="anthropic/claude-haiku-4-5",
messages=[{"role": "user", "content": "Summarize @document.pdf"}]
)
# Method 2: File URL (HTTP/HTTPS)
response = client.chat.completions.create(
model="openai/gpt-4o",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "What are the key findings?"},
{
"type": "file",
"file_url": {
"url": "https://example.com/documents/report.pdf"
}
}
]
}]
)
# Method 3: Local file path
response = client.chat.completions.create(
model="anthropic/claude-haiku-4-5",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "Analyze this local document"},
{
"type": "file",
"file_url": {
"url": "/Users/username/documents/report.pdf"
}
}
]
}]
)
# Method 4: Base64 data URL
with open("report.pdf", "rb") as f:
    file_data = base64.b64encode(f.read()).decode()
response = client.chat.completions.create(
model="lmstudio/qwen/qwen3-next-80b",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "What are the key findings?"},
{
"type": "file",
"file_url": {
"url": f"data:application/pdf;base64,{file_data}"
}
}
]
}]
)
Universal Provider Support:
# Same syntax works across all providers
providers_models = [
"openai/gpt-4o",
"anthropic/claude-haiku-4-5",
"ollama/qwen2.5vl:7b",
"lmstudio/qwen/qwen2.5-vl-7b"
]
for model in providers_models:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Analyze @data.csv and @chart.png"}]
    )
    print(f"{model}: {response.choices[0].message.content[:100]}...")
Endpoint: POST /v1/responses
AbstractCore implements an OpenAI-compatible Responses-style API, including input_file support.
- OpenAI Compatible: Accepts OpenAI Responses API requests and returns an OpenAI Responses object: "response" payload
- Native File Support: input_file type designed specifically for document attachments
- Cleaner API: Explicit separation between text (input_text) and files (input_file)
- Backward Compatible: Existing messages format still works alongside the new input format
- Optional Streaming: "stream": true streams OpenAI Responses events (OpenAI format) or chat-completions chunks (legacy format)
OpenAI Responses API Format (Recommended):
{
"model": "gpt-4o",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "Analyze this document"},
{"type": "input_file", "file_url": "https://example.com/report.pdf"}
]
}
],
"tools": [
{"type": "web_search", "external_web_access": true}
],
"tool_choice": "auto",
"stream": false,
"max_output_tokens": 2000,
"temperature": 0.7
}
Key parameters:
| Field | Required | Notes |
|---|---|---|
| model | yes | Provider/model id. Bare model ids may be auto-detected, but provider/model is preferred. |
| input | yes, unless messages is used | OpenAI Responses input. Supports a string, or an array of input items such as {"type":"message","role":"user","content":"..."} and {"type":"function_call_output","call_id":"...","output":"..."}. Message content can be a string or an array of input_text / input_file / input_image items. |
| messages | yes, unless input is used | Backward-compatible chat-completions request shape. |
| instructions | no | System-level instructions prepended ahead of input (best-effort). |
| stream | no | When true, returns server-sent events. |
| tools | no | Responses-style tools. AbstractCore does not execute tools server-side; tools are only transported to the model prompt. web_search* tools are normalized into function tools for local-model prompting and host-side execution. Unsupported built-in tool types return a 400 error. |
| tool_choice | no | Tool selection control; normalized where needed (best-effort). |
| max_output_tokens / max_tokens, temperature, top_p, stop, seed, frequency_penalty, presence_penalty | no | Standard generation controls, forwarded where supported. |
| base_url, agent_format, thinking, prompt_cache_key, prompt_cache_retention, timeout_s, unload_after | no | AbstractCore text-inference extensions with the same behavior as /v1/chat/completions for shared fields. |
Legacy Format (Still Supported):
{
"model": "openai/gpt-4",
"messages": [
{"role": "user", "content": "Tell me a story"}
],
"stream": false
}
The server automatically detects which format you're using:
- OpenAI Format: Presence of input field → converts to internal format
- Legacy Format: Presence of messages field → processes directly
- Error: Missing both fields → returns 400 error with clear message
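The detection rule can be written as a three-way check. This sketch assumes `input` takes precedence when both fields are present, following the order of the list above; the page does not say what the server does in that case. `detect_request_format` is a hypothetical name.

```python
from typing import Any, Dict


def detect_request_format(body: Dict[str, Any]) -> str:
    """Classify a /v1/responses body as 'responses' (input) or 'legacy' (messages)."""
    if "input" in body:        # assumed to win when both fields are present
        return "responses"
    if "messages" in body:
        return "legacy"
    # The server answers this case with a 400 error.
    raise ValueError("request must include either 'input' or 'messages'")
```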
Simple Text Request:
curl -X POST http://localhost:8000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "lmstudio/qwen/qwen3-next-80b",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "What is Python?"}
]
}
]
}'
File Analysis:
curl -X POST http://localhost:8000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "Analyze the letter and summarize key points"},
{"type": "input_file", "file_url": "https://www.berkshirehathaway.com/letters/2024ltr.pdf"}
]
}
],
"thinking": "off",
"prompt_cache_key": "tenantA:doc-review"
}'
Multiple Files:
curl -X POST http://localhost:8000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "anthropic/claude-haiku-4-5",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "Compare these documents"},
{"type": "input_file", "file_url": "https://example.com/report1.pdf"},
{"type": "input_file", "file_url": "https://example.com/report2.pdf"},
{"type": "input_file", "file_url": "https://example.com/chart.png"}
]
}
],
"max_tokens": 2000
}'
Streaming Response:
curl -X POST http://localhost:8000/v1/responses \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4o",
"input": [
{
"role": "user",
"content": [
{"type": "input_text", "text": "Summarize this document"},
{"type": "input_file", "file_url": "https://example.com/document.pdf"}
]
}
],
"stream": true
}' --no-buffer
All file types supported via URL, local path, or base64:
- Documents: PDF, DOCX, XLSX, PPTX
- Data Files: CSV, TSV, JSON, XML
- Text Files: TXT, MD
- Images: PNG, JPEG, GIF, WEBP, BMP, TIFF
- Size Limits: 10MB per file, 32MB total per request
Source Options:
// HTTP/HTTPS URL
{"type": "input_file", "file_url": "https://example.com/report.pdf"}
// Local file path
{"type": "input_file", "file_url": "/path/to/document.xlsx"}
// Base64 data URL
{"type": "input_file", "file_url": "data:application/pdf;base64,JVBERi0x..."}
import os
import requests

# Direct request to the /v1/responses endpoint.
# The bearer token is only needed when ABSTRACTCORE_AUTH_TOKEN is set on the server.
token = os.environ.get("ABSTRACTCORE_AUTH_TOKEN", "")
response = requests.post(
    "http://localhost:8000/v1/responses",
    headers={"Authorization": f"Bearer {token}"} if token else {},
    json={
        "model": "gpt-4o",
        "input": [
            {
                "role": "user",
                "content": [
                    {"type": "input_text", "text": "Analyze this document"},
                    {"type": "input_file", "file_url": "https://example.com/report.pdf"}
                ]
            }
        ]
    }
)
response.raise_for_status()
result = response.json()
# An OpenAI Responses object carries text in output[].content[] items of type "output_text".
for item in result.get("output", []):
    if item.get("type") == "message":
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                print(part["text"])
Endpoint: POST /v1/embeddings
Generate embedding vectors for semantic search, RAG, and similarity analysis.
Request:
{
"input": "Text to embed",
"model": "huggingface/sentence-transformers/all-MiniLM-L6-v2"
}
Supported Providers:
- HuggingFace: Local models with ONNX acceleration
- Ollama: ollama/granite-embedding:278m, etc.
- LMStudio: Any loaded embedding model
- OpenAI: openai/text-embedding-3-small, openai/text-embedding-3-large
- OpenRouter: openrouter/openai/text-embedding-3-small, etc.
- Portkey: portkey/... with your Portkey routing configuration
- OpenAI-compatible: openai-compatible/... against configured/local /v1/embeddings endpoints
Anthropic does not expose a native embeddings API. Use OpenAI, OpenRouter, Portkey, an OpenAI-compatible endpoint, or a local embedding provider.
For endpoint-backed providers such as LM Studio, vLLM, and generic
OpenAI-compatible servers, the embedding route does not require the embedding
model to appear in a chat model catalogue before the request is sent. This
supports embedding-only endpoints whose /models response is incomplete or
chat-only.
OpenAI-compatible request fields are forwarded where supported:
- dimensions
- encoding_format
- user
- base_url (AbstractCore extension; loopback by default, allowlist required for non-loopback)
Parameters:
| Field | Required | Notes |
|---|---|---|
| input | yes | String or array of strings. Arrays return one vector per input item. |
| model | yes | Provider/model id such as openai/text-embedding-3-small, openrouter/openai/text-embedding-3-small, portkey/..., openai-compatible/..., ollama/..., lmstudio/..., or huggingface/.... |
| encoding_format | no | float by default; base64 is accepted where supported by the provider/backend. |
| dimensions | no | Requested output dimensions for providers that support native dimension reduction; local backends may truncate when appropriate. |
| user | no | End-user identifier forwarded to providers that support abuse monitoring. |
| base_url | no | OpenAI-compatible endpoint override with the same allowlist policy as chat. |
| api_key | no | Deprecated/disabled in the body. Use X-AbstractCore-Provider-API-Key for provider overrides. |
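Once vectors come back, a typical next step is a similarity comparison. The sketch below assumes the response follows the OpenAI embeddings shape (`data[].embedding` and `data[].index`) with `encoding_format` left at `float`; that shape is not spelled out on this page, and both helper names are hypothetical.

```python
import math
from typing import Any, Dict, List, Sequence


def vectors_from_response(response: Dict[str, Any]) -> List[List[float]]:
    """Return embeddings ordered by their 'index' field (assumed OpenAI shape)."""
    rows = sorted(response["data"], key=lambda row: row["index"])
    return [row["embedding"] for row in rows]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

Sorting by `index` keeps each vector aligned with its input string when an array was sent.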
Batch Embedding:
curl -X POST http://localhost:8000/v1/embeddings \
-H "Content-Type: application/json" \
-d '{
"input": ["text 1", "text 2", "text 3"],
"model": "ollama/granite-embedding:278m"
}'
Endpoint: GET /v1/models
List all available models from configured providers.
Query Parameters:
- provider: Filter by provider (e.g., ollama, openai, anthropic, lmstudio, openai-compatible).
- input_type: Filter by input capability: text, image, audio, or video.
- output_type: Filter by output capability: text or embeddings.
- base_url: Optional upstream base URL override for providers that support OpenAI-compatible discovery. Loopback is allowed by default; non-loopback requires ABSTRACTCORE_SERVER_BASE_URL_ALLOWLIST.
- api_key: Optional upstream provider API key override for discovery. Requires provider=... so the override target is unambiguous. Prefer X-AbstractCore-Provider-API-Key.
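When filters are combined from code, building the query string with the standard library avoids hand-written escaping. This is a small sketch; `models_url` is a hypothetical helper and the default base address is simply the one used in this page's examples.

```python
from typing import Optional
from urllib.parse import urlencode


def models_url(base: str = "http://localhost:8000", *,
               provider: Optional[str] = None,
               input_type: Optional[str] = None,
               output_type: Optional[str] = None) -> str:
    """Build a GET /v1/models URL containing only the filters that were given."""
    params = {k: v for k, v in (("provider", provider),
                                ("input_type", input_type),
                                ("output_type", output_type)) if v is not None}
    query = urlencode(params)
    return f"{base}/v1/models" + (f"?{query}" if query else "")
```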
Examples:
# All models
curl http://localhost:8000/v1/models
# Ollama models only
curl "http://localhost:8000/v1/models?provider=ollama"
# Embedding models only
curl "http://localhost:8000/v1/models?output_type=embeddings"
# Vision-capable input models
curl "http://localhost:8000/v1/models?input_type=image"
# Ollama embeddings (quote the URL so the shell does not treat & as a background operator)
curl "http://localhost:8000/v1/models?provider=ollama&output_type=embeddings"
Endpoint: GET /providers
List all available providers and their status.
Query Parameters:
- include_models (optional, default false): Include model lists for each provider. This is slower because it may query provider registries/endpoints.
Response:
{
  "providers": [
    {
      "name": "ollama",
      "type": "llm",
      "model_count": 15,
      "status": "available"
    }
  ]
}
Endpoint: GET /health
Server health check for monitoring.
Response: includes status, server version, and enabled feature flags.
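A monitoring probe usually only needs the status field. The sketch below assumes the body is JSON with "status" equal to "healthy", as in the quick-start curl output; the helper name `is_healthy` is hypothetical, and the version and feature-flag fields are deliberately not inspected because their names are not given here.

```python
import json


def is_healthy(raw_body):
    """Return True when a /health response body reports status "healthy"."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        # Not JSON at all (for example an HTML error page from a proxy).
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


print(is_healthy('{"status":"healthy"}'))
```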
If you want the gateway itself to keep a local model warm, use:
- POST /acore/models/load
- GET /acore/models/loaded
- POST /acore/models/unload
/acore/models/load creates or reuses a task-specific runtime. When task is
omitted, the existing text behavior applies: the runtime is keyed by provider,
model, optional base_url, and the explicit provider-key override when one is
supplied. Later /v1/chat/completions calls that target the same provider/model
automatically reuse that warm runtime instead of creating a fresh provider
instance per request.
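The create-or-reuse behavior can be pictured as a cache keyed by the fields named above. This is an illustrative toy, not the server's implementation: the class `RuntimeCache`, the `runtime_id` format, and the use of "text" as the stand-in for an omitted task are all assumptions.

```python
class RuntimeCache:
    """Toy cache that reuses a runtime when the identifying key matches."""

    def __init__(self):
        self._runtimes = {}

    def load(self, provider, model, base_url=None, provider_key=None, task="text"):
        # The key mirrors the fields listed in the text; "text" stands in for an omitted task.
        key = (task, provider, model, base_url, provider_key)
        created = key not in self._runtimes
        if created:
            # The runtime_id format here is hypothetical.
            runtime_id = f"rt-{len(self._runtimes) + 1}"
            self._runtimes[key] = {"runtime_id": runtime_id, "provider": provider, "model": model}
        return self._runtimes[key], created


cache = RuntimeCache()
first, created_first = cache.load("ollama", "qwen3-coder:30b")
again, created_again = cache.load("ollama", "qwen3-coder:30b")
print(created_first, created_again, first is again)
```

A different base_url or provider-key override produces a different key, which is why two warm runtimes can share the same provider and model.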
For text-generation runtimes, Core reports provider-owned loaded-model truth
separately from gateway client cache state. A configured default model, model
catalog row, reachable server, or cached Core client is not proof that the
provider has a model loaded. Providers that can verify residency expose it
through get_model_residency(...); providers that cannot verify it report
provider_residency_verified=false, provider_resident=null, and loaded=false.
When a provider exposes a native load/warm hook, /acore/models/load calls it
and then verifies the result through the same residency contract.
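A client can turn the three residency fields into a single label. The sketch assumes the fields arrive as keys of one JSON object, which is not confirmed here; `describe_residency` is a hypothetical helper.

```python
def describe_residency(report):
    """Classify a residency report as "unverified", "resident", or "not resident"."""
    if not report.get("provider_residency_verified", False):
        # The provider cannot prove residency, so nothing is claimed either way.
        return "unverified"
    return "resident" if report.get("provider_resident") else "not resident"


unverified = {"provider_residency_verified": False, "provider_resident": None, "loaded": False}
print(describe_residency(unverified))
```

Treating "unverified" as its own state, rather than folding it into "not resident", matches the point that a reachable server or cached client is not proof of a loaded model.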
For non-text tasks, the same route delegates to capability-owned load/list/unload:
image_generation, video_generation, text_to_video, and image_to_video
reuse the server's AbstractVision backend cache, while tts and stt delegate
through the shared AbstractVoice capability core when the selected plugin
exposes residency hooks. Remote OpenAI-compatible image/video/audio providers
are reported as configured rather than locally loaded unless the upstream
exposes a real loaded-state signal.
loaded_new is an event signal for the load call, not a synonym for loaded.
For capability-backed tasks it is true only when the backend reports or clearly
implies that this request transitioned the model from not loaded to loaded.
Already-loaded models should return loaded_new=false.
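The difference between the two flags reduces to a state transition. A minimal sketch of that rule, with hypothetical argument names for the state before and after the load call:

```python
def loaded_new(was_loaded, is_loaded):
    """True only when this call moved the model from not loaded to loaded."""
    return (not was_loaded) and is_loaded


# Cold load, repeat load of a warm model, and a load that failed to take effect.
print(loaded_new(False, True), loaded_new(True, True), loaded_new(False, False))
```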
Prompt-cache routes support two modes:
- direct gateway mode:
  - target a previously loaded runtime with provider + model
- proxy mode:
  - target an upstream AbstractEndpoint with base_url
In proxy mode, the gateway normalizes base_url, enforces the same base URL
allowlist rules as other request-level routing, and forwards provider auth only
from X-AbstractCore-Provider-API-Key or from Authorization when server auth
is disabled.
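The loopback-by-default rule can be sketched as follows. This is not the gateway's code: the helper names are hypothetical, and the allowlist format (comma-separated hostnames) is an assumption, since the format of ABSTRACTCORE_SERVER_BASE_URL_ALLOWLIST is not described here.

```python
import ipaddress
from urllib.parse import urlsplit


def strip_v1(base_url):
    """Drop a trailing /v1 so control-plane paths can be appended to the root."""
    trimmed = base_url.rstrip("/")
    return trimmed[:-3] if trimmed.endswith("/v1") else trimmed


def is_loopback(host):
    """True for localhost and for loopback IP literals such as 127.0.0.1 or ::1."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so it is an ordinary hostname.
        return False


def base_url_allowed(base_url, allowlist=""):
    """Allow loopback hosts always; other hosts only when listed (assumed CSV of hostnames)."""
    host = urlsplit(base_url).hostname or ""
    if is_loopback(host):
        return True
    allowed = {entry.strip().lower() for entry in allowlist.split(",") if entry.strip()}
    return host.lower() in allowed


print(strip_v1("http://127.0.0.1:8001/v1"), base_url_allowed("http://127.0.0.1:8001/v1"))
```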
Common fields:
| Field | Location | Required | Notes |
|---|---|---|---|
| runtime_id | query or JSON body | no | Stable selector returned by /acore/models/load. Use this when multiple warm runtimes share the same provider + model. |
| provider + model | query or JSON body | yes, unless base_url is provided | Select a loaded gateway-local runtime. |
| base_url | query or JSON body | yes, unless runtime_id or provider + model is provided | Upstream AbstractEndpoint URL. It may include /v1; the proxy strips that suffix for control-plane calls. In local mode it can also disambiguate a warm runtime that was loaded with a base URL. |
| X-AbstractCore-Provider-API-Key | header | no | Upstream endpoint token when required. |
| api_key | query/body | no | Deprecated/disabled; do not use. |
| ttl_s | JSON body | no | Optional upstream cache TTL in seconds, where supported. |
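The "required unless" rules in the table amount to choosing one selector per request. A sketch of that check follows; the helper `resolve_selector` is hypothetical, and the precedence order (runtime_id, then provider + model, then base_url) is an assumption rather than documented behavior.

```python
def resolve_selector(params):
    """Return which selector a prompt-cache request uses, or raise if none is usable."""
    if params.get("runtime_id"):
        return "runtime_id"
    if params.get("provider") and params.get("model"):
        # A base_url sent alongside provider + model only disambiguates a local runtime.
        return "provider+model"
    if params.get("base_url"):
        return "base_url"
    raise ValueError("provide runtime_id, provider + model, or base_url")


print(resolve_selector({"base_url": "http://127.0.0.1:8001/v1"}))
```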
Operations:
| Endpoint | Method | Parameters | Result |
|---|---|---|---|
| /acore/prompt_cache/capabilities | GET | provider + model or base_url | Cache features on the selected local or upstream runtime. |
| /acore/prompt_cache/stats | GET | provider + model or base_url | Cache stats on the selected local or upstream runtime. |
| /acore/prompt_cache/set | POST | provider + model or base_url, key, make_default, ttl_s | Select/create a cache key locally or upstream. |
| /acore/prompt_cache/update | POST | provider + model or base_url, key, prompt or messages, system_prompt, tools, optional thinking, add_generation_prompt, ttl_s | Prepare prompt/messages/tools into a local or upstream cache key. |
| /acore/prompt_cache/fork | POST | provider + model or base_url, from_key, to_key, make_default, ttl_s | Fork an existing local or upstream key. |
| /acore/prompt_cache/clear | POST | provider + model or base_url, optional key | Clear a local or upstream key, or default/all cache state depending on backend support. |
| /acore/prompt_cache/prepare_modules | POST | provider + model or base_url, namespace, modules, make_default, ttl_s, version | Prepare reusable module/tool context locally or upstream. |
Example:
curl -X POST http://localhost:8000/acore/prompt_cache/update \
-H "Authorization: Bearer $ABSTRACTCORE_AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"base_url": "http://127.0.0.1:8001/v1",
"key": "project-default",
"messages": [{"role": "system", "content": "You are concise."}],
"thinking": "off",
"ttl_s": 3600
}'
thinking on /acore/prompt_cache/update is applied before the provider appends the cached fragment. This keeps cache-prefilled prompt state aligned with later /v1/chat/completions or /v1/responses calls when reasoning control changes prompt serialization.
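The same update request can be prepared from Python. This sketch only builds the path, headers, and JSON body using field names from the operations table; the helper name `prompt_cache_update` is hypothetical, and sending the request is left to whichever HTTP client you use.

```python
import json


def prompt_cache_update(key, messages, *, base_url=None, thinking=None, ttl_s=None, token=None):
    """Return (path, headers, body_json) for POST /acore/prompt_cache/update."""
    body = {"key": key, "messages": messages}
    optional = {"base_url": base_url, "thinking": thinking, "ttl_s": ttl_s}
    # Leave out anything the caller did not set instead of sending nulls.
    body.update({name: value for name, value in optional.items() if value is not None})
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return "/acore/prompt_cache/update", headers, json.dumps(body)


path, headers, raw = prompt_cache_update(
    "project-default",
    [{"role": "system", "content": "You are concise."}],
    base_url="http://127.0.0.1:8001/v1",
    thinking="off",
    ttl_s=3600,
    token="replace-with-a-server-token",
)
print(path, raw)
```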
Memory-bloc routes also support two modes:
- direct gateway mode:
  - POST /acore/models/load
  - local POST /acore/blocs/upsert_text
  - local POST /acore/blocs/kv/ensure
  - local POST /acore/blocs/kv/load
  - then normal /v1/chat/completions
- proxy mode:
  - the same /acore/blocs/* routes with base_url targeting an upstream AbstractEndpoint
That distinction matters:
- gateway-local bloc records live in the gateway bloc store
- gateway-local loaded cache keys live on the selected loaded runtime
- proxy-mode loaded cache keys live on the upstream AbstractEndpoint
Operations:
| Endpoint | Method | Parameters | Result |
|---|---|---|---|
| /acore/blocs/upsert_text | POST | optional base_url, path, content, optional bloc metadata | Persist extracted text into the local bloc store or upstream bloc store. |
| /acore/blocs/record | GET | optional base_url, sha256 or bloc_id | Inspect a local or upstream bloc record. |
| /acore/blocs/kv/manifest | GET | runtime_id or provider + model or base_url, sha256 or bloc_id, optional artifact_path | Inspect the local or upstream KV manifest for the selected model. |
| /acore/blocs/kv/ensure | POST | runtime_id or provider + model or base_url, sha256 or bloc_id, optional artifact_path, force_rebuild, debug | Compile or validate the durable provider/model bloc KV artifact locally or upstream. |
| /acore/blocs/kv/load | POST | runtime_id or provider + model or base_url, sha256 or bloc_id, optional artifact_path, stable_cache_key, key, make_default, force_rebuild, debug | Load or fork the local or upstream artifact into a prompt-cache key and return prompt_cache_binding. |
Typical direct gateway flow:
- POST /acore/models/load
- POST /acore/blocs/upsert_text
- POST /acore/blocs/kv/ensure
- POST /acore/blocs/kv/load
- call /v1/chat/completions with returned artifact.prompt_cache_binding when exact binding is required
Typical remote flow:
- POST /acore/blocs/upsert_text
- POST /acore/blocs/kv/ensure
- POST /acore/blocs/kv/load
- call /v1/chat/completions with returned artifact.prompt_cache_binding when exact binding is required
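The two flows differ only in the initial model load and in whether the bloc routes carry base_url. A sketch that lists the calls in order, without sending them; `bloc_flow` is a hypothetical helper, and whether the final chat call should also carry base_url in the remote flow is not specified here, so the sketch leaves it out.

```python
BLOC_STEPS = ["/acore/blocs/upsert_text", "/acore/blocs/kv/ensure", "/acore/blocs/kv/load"]


def bloc_flow(mode, base_url=None):
    """Return the ordered (method, path, extra_fields) calls for a bloc flow."""
    if mode == "direct":
        paths, extra = ["/acore/models/load"] + BLOC_STEPS, {}
    elif mode == "proxy":
        if not base_url:
            raise ValueError("proxy mode needs base_url")
        paths, extra = list(BLOC_STEPS), {"base_url": base_url}
    else:
        raise ValueError(f"unknown mode: {mode}")
    steps = [("POST", path, dict(extra)) for path in paths]
    # The chat call comes last, once kv/load has returned a binding.
    steps.append(("POST", "/v1/chat/completions", {}))
    return steps


for method, path, extra in bloc_flow("proxy", base_url="http://127.0.0.1:8001/v1"):
    print(method, path, extra)
```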
Example:
curl -X POST http://localhost:8000/acore/blocs/kv/load \
-H "Authorization: Bearer $ABSTRACTCORE_AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"base_url": "http://127.0.0.1:8001/v1",
"sha256": "abababababababababababababababababababababababababababababababab",
"stable_cache_key": "stable:orbit",
"key": "work:orbit",
"make_default": false,
"debug": true
}'
The load response includes:
- artifact.key: the worker-local runtime cache key
- artifact.binding_id: the opaque exact-artifact identity
- artifact.prompt_cache_binding: object to pass to chat as prompt_cache_binding
- artifact.debug: verbose proof fields when debug=true
Supported local artifact backends share this route shape: MLX, HuggingFace transformers, and
HuggingFace GGUF exact-renderer paths. Remote providers and unsupported GGUF chat formats remain
best-effort prompt_cache_key paths.
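Passing the binding on to chat is a matter of copying one object from the load response into the chat body. A minimal sketch: the helper `chat_body_with_binding` is hypothetical, and the binding contents in the sample response are invented placeholders because the object is opaque.

```python
def chat_body_with_binding(load_response, model, messages):
    """Build a chat body, attaching artifact.prompt_cache_binding when present."""
    artifact = load_response.get("artifact") or {}
    binding = artifact.get("prompt_cache_binding")
    body = {"model": model, "messages": messages}
    if binding is not None:
        # Forward the object unchanged; the client should not interpret it.
        body["prompt_cache_binding"] = binding
    return body


# Placeholder response; real binding fields are not documented here.
sample = {"artifact": {"key": "work:orbit", "prompt_cache_binding": {"opaque": "placeholder"}}}
body = chat_body_with_binding(sample, "ollama/qwen3-coder:30b", [{"role": "user", "content": "Hi"}])
print(body)
```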
AbstractCore Server is OpenAI-compatible. Most OpenAI-compatible CLIs/SDKs can be pointed at it by setting:
- OPENAI_BASE_URL="http://localhost:8000/v1" (or an equivalent flag)
- OPENAI_API_KEY="unused" (many clients require a non-empty key even for local servers)
Tool calls:
- The server does not execute tools (it always returns tool calls; your host/runtime executes them).
- It can emit tool calls either as structured tool_calls (OpenAI/Codex style) or as tagged content for clients that parse tool calls from assistant text.
- Control the output format with agent_format (request body, AbstractCore extension), or rely on auto-detection (user-agent + model heuristics).
Supported agent_format values: auto, openai, codex, qwen3, llama3, gemma, xml, passthrough.
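Validating the value client-side catches typos before a request is sent. A small sketch using the values listed above; `with_agent_format` is a hypothetical helper that returns a copy of the chat body with the extension field set.

```python
AGENT_FORMATS = ("auto", "openai", "codex", "qwen3", "llama3", "gemma", "xml", "passthrough")


def with_agent_format(body, agent_format="auto"):
    """Return a copy of a chat body with a validated agent_format field."""
    if agent_format not in AGENT_FORMATS:
        raise ValueError(f"agent_format must be one of {', '.join(AGENT_FORMATS)}")
    # Copy so the caller's dict is not mutated.
    return {**body, "agent_format": agent_format}


request = with_agent_format(
    {"model": "ollama/qwen3:4b-instruct-2507-q4_K_M", "messages": [{"role": "user", "content": "Use the tool."}]},
    "llama3",
)
print(request["agent_format"])
```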
export OPENAI_BASE_URL="http://localhost:8000/v1"
export OPENAI_API_KEY="unused"
codex --model "ollama/qwen3-coder:30b" "Write a factorial function"

curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "ollama/qwen3:4b-instruct-2507-q4_K_M",
"messages": [{"role": "user", "content": "Use the tool."}],
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather by city",
"parameters": {
"type": "object",
"properties": {"city": {"type": "string"}},
"required": ["city"]
}
}
}
],
"agent_format": "llama3"
}'
Release images are published to GitHub Container Registry after the matching PyPI release succeeds:
ghcr.io/lpalbou/abstractcore-server:<version>
The image is built from PyPI, not from the repository checkout, and installs:
abstractcore[server,remote,media,tokens,compression]==<version>
It includes remote chat/responses, remote embeddings, remote STT/TTS routing,
remote OpenAI-compatible image proxying, server dependencies, media parsing,
token counting, and compression helpers. It intentionally does not include
AbstractCore local LLM runtimes (vllm, mlx, huggingface), local embedding
dependencies (sentence-transformers), or optional capability plugin entry
points. Remote image/audio OpenAI-compatible endpoint routes still work without
those plugins. Build a custom image with
abstractcore[server,remote,media,tokens,compression,voice,vision] when you
want plugin-backed media catalogs or plugin default routes; these capability
extras stay remote-light. Add explicit local aggregate profiles such as
abstractcore[all-apple] or abstractcore[all-gpu] only when you want local
native inference engines.
Run:
docker pull ghcr.io/lpalbou/abstractcore-server:2.13.12
For local development, keep secrets in an uncommitted .env file:
ABSTRACTCORE_AUTH_TOKEN=replace-with-a-server-token
OPENAI_API_KEY=sk-...
OPENROUTER_API_KEY=sk-or-...
ANTHROPIC_API_KEY=sk-ant-...
PORTKEY_API_KEY=pk_...
PORTKEY_CONFIG=pcfg_...
OPENAI_BASE_URL=http://host.docker.internal:1234/v1
Then run the image with that environment file:
docker run --rm --name abstractcore-server \
-p 127.0.0.1:8000:8000 \
--env-file .env \
ghcr.io/lpalbou/abstractcore-server:2.13.12
ABSTRACTCORE_AUTH_TOKEN is the AbstractCore server auth token. Clients send it as Authorization: Bearer <token>.
At /docs, use Swagger UI's normal Authorize button when server auth is
enabled; AbstractCore validates that bearer token before Swagger marks it
authorized. Provider keys such as
OPENAI_API_KEY, OPENROUTER_API_KEY, ANTHROPIC_API_KEY, and
PORTKEY_API_KEY stay inside the server container.
Set ABSTRACTCORE_SERVER_PROTECT_DOCS=1 if /docs, /docs-lite, /redoc, and
/openapi.json should require the same server token.
For local OpenAI-compatible endpoints such as LM Studio or Ollama's /v1
server, point the container at a URL reachable from Docker:
docker run --rm --name abstractcore-server \
-p 127.0.0.1:8000:8000 \
-e ABSTRACTCORE_AUTH_TOKEN="$ABSTRACTCORE_AUTH_TOKEN" \
-e OPENAI_BASE_URL="http://host.docker.internal:1234/v1" \
-e OPENAI_API_KEY="$OPENAI_API_KEY" \
ghcr.io/lpalbou/abstractcore-server:2.13.12

version: '3.8'
services:
  abstractcore:
    image: ghcr.io/lpalbou/abstractcore-server:2.13.12
    ports:
      - "8000:8000"
    environment:
      - ABSTRACTCORE_AUTH_TOKEN=${ABSTRACTCORE_AUTH_TOKEN}
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
      - PORTKEY_API_KEY=${PORTKEY_API_KEY}
      - PORTKEY_CONFIG=${PORTKEY_CONFIG}
      - OPENAI_BASE_URL=${OPENAI_BASE_URL}
    restart: unless-stopped

pip install gunicorn
gunicorn abstractcore.server.app:app \
--worker-class uvicorn.workers.UvicornWorker \
--workers 4 \
--bind 0.0.0.0:8000
Debug mode provides comprehensive logging and detailed error reporting for troubleshooting API issues.
# Method 1: Using command line flag (recommended)
python -m abstractcore.server.app --debug
# Method 2: Using environment variable
export ABSTRACTCORE_DEBUG=true
python -m abstractcore.server.app
# Method 3: With uvicorn directly
export ABSTRACTCORE_DEBUG=true
uvicorn abstractcore.server.app:app --host 0.0.0.0 --port 8000
Enhanced Error Reporting:
- Before: Uninformative "422 Unprocessable Entity" messages
- After: Detailed field validation errors with request body capture
Example Debug Output:
🔴 Request Validation Error (422) | method=POST | error_count=2 | errors=[
{"field": "body -> model", "message": "Field required", "type": "missing"},
{"field": "body -> messages", "message": "Field required", "type": "missing"}
] | client=127.0.0.1
📋 Request Body (Validation Error) | body={"invalid": "data"}
Request/Response Tracking:
- Full HTTP request details (method, URL, headers, client IP)
- Response status codes and processing times
- Structured JSON logging for machine processing
Log Files:
- logs/abstractcore_TIMESTAMP.log - Structured events
- logs/YYYYMMDD-payloads.jsonl - Full request bodies
- logs/verbatim_TIMESTAMP.jsonl - Complete I/O
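Because the verbatim log is JSON Lines, it can be processed without jq. The sketch below sums token counts and assumes each record carries metadata.tokens.input and metadata.tokens.output, the same fields this page's jq command reads; `total_tokens` is a hypothetical helper, and records missing those fields count as zero.

```python
import json


def total_tokens(lines):
    """Sum input and output tokens across JSON Lines records."""
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Skip blank lines between records.
        tokens = json.loads(line).get("metadata", {}).get("tokens", {})
        total += int(tokens.get("input", 0)) + int(tokens.get("output", 0))
    return total


sample_lines = [
    '{"metadata": {"tokens": {"input": 12, "output": 30}}}',
    "",
    '{"metadata": {"tokens": {"input": 5, "output": 8}}}',
]
print("Total:", total_tokens(sample_lines))
```

To run it against a real file, pass an open file object, for example `total_tokens(open(path, encoding="utf-8"))`, since iterating a file yields its lines.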
Useful Commands:
# Find errors
grep '"level": "error"' logs/abstractcore_*.log
# Track token usage
cat logs/verbatim_*.jsonl | jq '.metadata.tokens | .input + .output' | \
awk '{sum+=$1} END {print "Total:", sum}'
# Monitor specific model
grep '"model": "qwen3-coder:30b"' logs/verbatim_*.jsonl

import requests

# Models to try in order; the first successful response wins.
providers = [
    "ollama/qwen3-coder:30b",
    "openai/gpt-4o-mini",
    "anthropic/claude-haiku-4-5",
]

def generate_with_fallback(prompt):
    """Return the first successful chat completion, trying each model in order."""
    for model in providers:
        try:
            response = requests.post(
                "http://localhost:8000/v1/chat/completions",
                json={"model": model, "messages": [{"role": "user", "content": prompt}]},
                timeout=30,
            )
            if response.status_code == 200:
                return response.json()
        except requests.RequestException:
            # Network error or timeout: fall through to the next model.
            continue
    raise RuntimeError("All providers failed")

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen3-coder:30b
# Use via AbstractCore server
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "ollama/qwen3-coder:30b",
"messages": [{"role": "user", "content": "Write a Python function"}]
}'

# Check port availability
lsof -i :8000
# Use different port
uvicorn abstractcore.server.app:app --port 3000

# Check providers
curl http://localhost:8000/providers
# Check API keys
echo $OPENAI_API_KEY
# Start Ollama
ollama serve
ollama list

# Set API keys
export ABSTRACTCORE_AUTH_TOKEN="acore-server-secret"
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
# Restart server after setting keys

- Universal: One API for all providers
- OpenAI Compatible: Drop-in replacement
- Simple: Clean, focused endpoints
- Fast: Lightweight, high-performance
- Debuggable: Comprehensive logging
- CLI Ready: Codex, Gemini CLI, Crush support
- Production Ready: Docker, multi-worker, health checks
- Getting Started - Core library quick start
- Architecture - System architecture including server
- Python API Reference - Core library API
- Embeddings Guide - Embeddings deep dive
- Troubleshooting - Common issues and solutions
AbstractCore Server - One server, all models, any client.