The default install is intentionally lightweight. It includes the core API (create_llm, BasicSession, tool definitions, structured output plumbing) and uses only small dependencies (pydantic, httpx).
Anything heavy (provider SDKs, torch/transformers, PDF parsing, embeddings models, local voice/image/music engines, web scraping deps, the HTTP server) is behind install extras. See Getting Started and Prerequisites.
- Hosted SDK bundle:
pip install "abstractcore[remote]"installs OpenAI + Anthropic. - OpenAI:
pip install "abstractcore[openai]" - Anthropic:
pip install "abstractcore[anthropic]" - OpenRouter, Portkey, Ollama, LM Studio, and generic OpenAI-compatible
/v1endpoints: core install is enough (pip install abstractcore). - HuggingFace (transformers/torch; heavy):
pip install "abstractcore[huggingface]" - Apple Silicon local LLM stack:
pip install "abstractcore[apple]"(alias ofmlx; heavy) - GPU local LLM stack:
pip install "abstractcore[gpu]"(alias ofvllm; heavy) - Explicit provider extras remain available:
abstractcore[mlx],abstractcore[vllm]
These providers work with the core install (no provider extra): ollama, lmstudio, openrouter, portkey, openai-compatible.
# zsh: keep quotes
pip install "abstractcore[remote,media,tools]"For “turnkey” local-runtime installs, see README.md (all-apple for Apple Silicon, all-gpu for NVIDIA GPU). The apple and gpu extras install only the hardware-specific local LLM engine stack; the all-* extras are larger aggregate profiles that also include local capability plugin engines where supported.
You probably installed a heavy extra (most commonly abstractcore[huggingface], abstractcore[apple]/abstractcore[mlx], abstractcore[gpu]/abstractcore[vllm], or abstractcore[all-*]). The core install (pip install abstractcore) does not include torch/transformers.
- Provider: a backend adapter (
openai,anthropic,ollama,lmstudio, …) - Model: a provider-specific model name (for example
gpt-4o-miniorqwen3:4b-instruct-2507-q4_K_M)
from abstractcore import create_llm
llm = create_llm("openai", model="gpt-4o-mini")AbstractCore is one of the core packages in the AbstractFramework ecosystem:
- AbstractFramework (umbrella): https://github.com/lpalbou/AbstractFramework
- AbstractCore (this package): unified LLM interface + cross-provider infrastructure
- AbstractRuntime: durable tool/effect execution, workflows, and state persistence — https://github.com/lpalbou/abstractruntime
AbstractCore is usable standalone. In the ecosystem, the common pattern is:
- AbstractCore produces
resp.content+resp.tool_calls - a runtime (for example AbstractRuntime) decides whether/how to execute tools (policy, sandboxing, retries, persistence)
See Architecture and Tool Calling.
Use the matching provider and set base_url (or the provider’s base-url env var).
We recommend open-source/local providers first; cloud and gateway providers are optional.
Examples:
from abstractcore import create_llm
llm = create_llm("ollama", model="qwen3:4b-instruct-2507-q4_K_M", base_url="http://localhost:11434")
llm = create_llm("lmstudio", model="qwen/qwen3-4b-2507", base_url="http://localhost:1234/v1")
llm = create_llm("vllm", model="Qwen/Qwen3-Coder-30B-A3B-Instruct", base_url="http://localhost:8000/v1")For a generic OpenAI-compatible endpoint, use openai-compatible:
llm = create_llm("openai-compatible", model="my-model", base_url="http://localhost:1234/v1")See Prerequisites for setup details and env var names.
Gateways like Portkey and OpenRouter forward your payload to the routed backend model, and strict families (for example OpenAI reasoning models like gpt-5/o1) reject unsupported parameters.
In AbstractCore’s gateway providers:
- Portkey uses
PORTKEY_API_KEYandPORTKEY_CONFIG(config id) for routing. - Optional params (
temperature,top_p,max_output_tokens) are only sent when you explicitly set them. - Reasoning families (gpt-5/o1) drop
temperature/top_pand usemax_completion_tokensinstead ofmax_tokens.
If you still see errors, confirm:
- You aren’t mixing routing modes (config vs virtual key vs provider-direct).
- You’re not injecting parameters via Portkey config overrides that the backend rejects.
You can use environment variables, or persist settings via the config CLI:
abstractcore --config
abstractcore --set-api-key openai sk-...
abstractcore --set-api-key anthropic sk-ant-...
abstractcore --statusConfig is stored in ~/.abstractcore/config/abstractcore.json. See Centralized Config.
Yes. You do not have to give a client the AbstractCore server auth token. If ABSTRACTCORE_AUTH_TOKEN is not configured, a client can bring its own upstream provider key, for example an Anthropic, OpenRouter, or Portkey key, by sending it as X-AbstractCore-Provider-API-Key.
That key is forwarded only to the provider requested by the model route, such as anthropic/..., openrouter/..., or portkey/.... It does not unlock other server-configured provider keys, and it does not grant access to providers the client did not supply credentials for.
If ABSTRACTCORE_AUTH_TOKEN is configured, Authorization is reserved for the AbstractCore server auth token. In that mode, use X-AbstractCore-Provider-API-Key only when you want to override the upstream provider key for a single request.
Provider keys in request bodies remain disabled. Select discovery endpoints accept an api_key query parameter for tooling/Swagger UI convenience, but headers remain preferred.
By default, AbstractCore runs in pass-through mode (execute_tools=False): it returns tool calls in resp.tool_calls, and your host/runtime decides whether/how to execute them.
Automatic execution (execute_tools=True) exists but is deprecated for most use cases. See Tool Calling.
These built-in web tools live in abstractcore.tools.common_tools and require:
pip install "abstractcore[tools]"web_search: fuller DuckDuckGo result set (good when you want breadth or more options).skim_websearch: compact/filtered search results (good default for agents to keep prompts smaller). Defaults to 5 results and truncates long snippets.skim_url: fast URL triage (fetches only a prefix and extracts lightweight metadata + a short preview). Defaults:max_bytes=200_000,max_preview_chars=1200,max_headings=8.fetch_url: full fetch + parsing for text-first types (HTML→Markdown, JSON/XML/text). For PDFs/images/other binaries it returns metadata and optional previews; it does not do full PDF text extraction. It downloads up to 10MB by default; useinclude_full_content=Falsefor smaller outputs.
Recommended workflow: skim_websearch → skim_url → fetch_url (use include_full_content=False when you want a smaller fetch_url output).
Use tool-call syntax rewriting:
- Python: pass
tool_call_tags=...togenerate()/agenerate() - Server: set
agent_formatin requests
Pass a Pydantic model via response_model=...:
from pydantic import BaseModel
from abstractcore import create_llm
class Answer(BaseModel):
title: str
bullets: list[str]
llm = create_llm("openai", model="gpt-4o-mini")
result = llm.generate("Summarize HTTP/3 in 3 bullets.", response_model=Answer)See Structured Output.
Structured output is validated against your schema. If validation fails, AbstractCore retries with feedback (up to the configured retry limit). Common fixes:
- simplify schemas (fewer nested structures; fewer strict constraints)
- tighten prompts (be explicit about allowed values and ranges)
- increase timeouts for slow backends
See Structured Output and Troubleshooting.
Those require the media extra:
pip install "abstractcore[media]"Then pass media=[...] to generate() or use the media pipeline. See Media Handling.
Audio and video attachments are supported via media=[...], but they are policy-driven by design:
- Audio defaults to
audio_policy="native_only"(fails loudly unless the model supports native audio input). - Video defaults to
video_policy="auto"(native video when supported; otherwise sample frames and route through image/vision handling). Frame sampling requiresffmpeg/ffprobe.
Speech-to-text fallback for audio (audio_policy="speech_to_text" or "auto") typically requires installing abstractvoice (capability plugin).
You can set defaults via the config CLI:
abstractcore --set-audio-strategy auto
abstractcore --set-video-strategy auto
abstractcore --set-video-max-frames 6See:
- Media Handling (policies + fallbacks)
- Vision Capabilities (image/video input + fallback behavior)
Install the optional capability plugin package:
pip install "abstractcore[voice]"This installs the remote-light AbstractVoice capability path. Local voice
engines require an explicit local profile such as abstractcore[all-apple] or
abstractcore[all-gpu].
Then use the deterministic capability surfaces:
from abstractcore import create_llm
llm = create_llm("openai", model="gpt-4o-mini") # provider/model is only for LLM calls; STT/TTS are deterministic
print(llm.capabilities.status()) # shows which capability backends are available/selected
wav_bytes = llm.voice.tts("Hello", format="wav")
text = llm.audio.transcribe("speech.wav")If you run the optional HTTP server, you can also use OpenAI-compatible endpoints:
POST /v1/audio/transcriptionsPOST /v1/audio/speech
See: Server and Capabilities.
Generative vision is dependency-light by default. AbstractCore Server can proxy OpenAI-compatible image endpoints without local vision runtimes. For local Diffusers/sdcpp image generation, install the vision extra:
pip install "abstractcore[server,vision]"You can use generative vision through AbstractCore’s llm.vision.* capability plugin surface, or through AbstractCore Server’s optional endpoints:
POST /v1/images/generationsPOST /v1/images/edits
Omit model with the server endpoints only when this server has a configured
AbstractVision/OpenAI-compatible image default. Use model="diffusers/default"
or model="diffusers/<huggingface-repo>" for explicit Diffusers routing,
model="sdcpp/default" for configured stable-diffusion.cpp, or
model="openai-compatible/<model>" with a configured image base URL for remote
OpenAI-compatible endpoints. Local Diffusers is cache-only by default, so
pre-download the model or set ABSTRACTCORE_VISION_ALLOW_DOWNLOAD=1 /
ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1 when runtime downloads are intended.
See: Server, Capabilities, and abstractvision/docs/reference/abstractcore-integration.md (in the AbstractVision repo).
Glyph visual-text compression is an optional feature for long documents. Install:
pip install "abstractcore[compression]"(renderer)- plus
pip install "abstractcore[media]"if you want PDF extraction support
See Glyph Visual-Text Compression.
Embeddings are opt-in:
pip install "abstractcore[embeddings]"Then import from the embeddings module:
from abstractcore.embeddings import EmbeddingManagerSee Embeddings.
No. The server is optional and is mainly for:
- exposing one OpenAI-compatible
/v1endpoint that can route to multiple providers/models - integrating with OpenAI-compatible clients and agentic CLIs
Install and run:
pip install "abstractcore[server]"
python -m abstractcore.server.appSee Server.
- Logging (console/file) is configured via the config CLI and config file. See Structured Logging.
- Interaction tracing is opt-in (
enable_tracing=True). See Interaction Tracing.
- Per-provider: pass
timeout=...tocreate_llm(...)(timeout=Nonemeans unlimited). - Process-wide default: set
abstractcore --set-default-timeout 0(0 = unlimited), or set a larger value. - Some CLI apps have their own
--timeoutflags; run--helpfor the exact behavior.
See Troubleshooting and Centralized Config.
The HuggingFace provider respects AbstractCore’s offline-first settings. If you want HuggingFace to fetch from the Hub, update ~/.abstractcore/config/abstractcore.json:
- set
"offline_first": false - set
"force_local_files_only": false
Restart your Python process after changing this (the provider reads these settings at import time).
AbstractCore focuses on provider abstraction + infrastructure (tools, structured output, media handling, tracing). It does not ship a full RAG pipeline or multi-step agent orchestration. See Capabilities.