FAQ

What do I get with `pip install abstractcore`?

The default install is intentionally lightweight. It includes the core API (create_llm, BasicSession, tool definitions, structured output plumbing) and uses only small dependencies (pydantic, httpx).

Anything heavy (provider SDKs, torch/transformers, PDF parsing, embeddings models, local voice/image/music engines, web scraping deps, the HTTP server) is behind install extras. See Getting Started and Prerequisites.

Which extra do I need for my provider?

Hosted SDK bundle: pip install "abstractcore[remote]" installs OpenAI + Anthropic.
OpenAI: pip install "abstractcore[openai]"
Anthropic: pip install "abstractcore[anthropic]"
OpenRouter, Portkey, Ollama, LM Studio, and generic OpenAI-compatible /v1 endpoints: core install is enough (pip install abstractcore).
HuggingFace (transformers/torch; heavy): pip install "abstractcore[huggingface]"
Apple Silicon local LLM stack: pip install "abstractcore[apple]" (alias of mlx; heavy)
GPU local LLM stack: pip install "abstractcore[gpu]" (alias of vllm; heavy)
Explicit provider extras remain available: abstractcore[mlx], abstractcore[vllm]

These providers work with the core install (no provider extra): ollama, lmstudio, openrouter, portkey, openai-compatible.

How do I combine extras?

# zsh: keep quotes
pip install "abstractcore[remote,media,tools]"

For “turnkey” local-runtime installs, see README.md (all-apple for Apple Silicon, all-gpu for NVIDIA GPU). The apple and gpu extras install only the hardware-specific local LLM engine stack; the all-* extras are larger aggregate profiles that also include local capability plugin engines where supported.

Why did my install pull `torch` / take a long time?

You probably installed a heavy extra (most commonly abstractcore[huggingface], abstractcore[apple]/abstractcore[mlx], abstractcore[gpu]/abstractcore[vllm], or abstractcore[all-*]). The core install (pip install abstractcore) does not include torch/transformers.

What’s the difference between “provider” and “model”?

Provider: a backend adapter (openai, anthropic, ollama, lmstudio, …)
Model: a provider-specific model name (for example gpt-4o-mini or qwen3:4b-instruct-2507-q4_K_M)

from abstractcore import create_llm
llm = create_llm("openai", model="gpt-4o-mini")

How does AbstractCore relate to AbstractFramework / AbstractRuntime?

AbstractCore is one of the core packages in the AbstractFramework ecosystem:

AbstractFramework (umbrella): https://github.com/lpalbou/AbstractFramework
AbstractCore (this package): unified LLM interface + cross-provider infrastructure
AbstractRuntime: durable tool/effect execution, workflows, and state persistence — https://github.com/lpalbou/abstractruntime

AbstractCore is usable standalone. In the ecosystem, the common pattern is:

AbstractCore produces resp.content + resp.tool_calls
a runtime (for example AbstractRuntime) decides whether/how to execute tools (policy, sandboxing, retries, persistence)

See Architecture and Tool Calling.

How do I connect to a local server (Ollama / LMStudio / vLLM / llama.cpp / LocalAI)?

Use the matching provider and set base_url (or the provider’s base-url env var). We recommend open-source/local providers first; cloud and gateway providers are optional.

Examples:

from abstractcore import create_llm

llm = create_llm("ollama", model="qwen3:4b-instruct-2507-q4_K_M", base_url="http://localhost:11434")
llm = create_llm("lmstudio", model="qwen/qwen3-4b-2507", base_url="http://localhost:1234/v1")
llm = create_llm("vllm", model="Qwen/Qwen3-Coder-30B-A3B-Instruct", base_url="http://localhost:8000/v1")

For a generic OpenAI-compatible endpoint, use openai-compatible:

llm = create_llm("openai-compatible", model="my-model", base_url="http://localhost:1234/v1")

See Prerequisites for setup details and env var names.

Why do gateway providers return “unsupported parameter” errors (temperature/max_tokens)?

Gateways like Portkey and OpenRouter forward your payload to the routed backend model, and strict families (for example OpenAI reasoning models like gpt-5/o1) reject unsupported parameters.

In AbstractCore’s gateway providers:

Portkey uses PORTKEY_API_KEY and PORTKEY_CONFIG (config id) for routing.
Optional params (temperature, top_p, max_output_tokens) are only sent when you explicitly set them.
Reasoning families (gpt-5/o1) drop temperature/top_p and use max_completion_tokens instead of max_tokens.

If you still see errors, confirm:

You aren’t mixing routing modes (config vs virtual key vs provider-direct).
You’re not injecting parameters via Portkey config overrides that the backend rejects.

How do I set API keys and defaults?

You can use environment variables, or persist settings via the config CLI:

abstractcore --config
abstractcore --set-api-key openai sk-...
abstractcore --set-api-key anthropic sk-ant-...
abstractcore --status

Config is stored in ~/.abstractcore/config/abstractcore.json. See Centralized Config.

Can I use the HTTP server with only provider API keys?

Yes. You do not have to give a client the AbstractCore server auth token. If ABSTRACTCORE_AUTH_TOKEN is not configured, a client can bring its own upstream provider key, for example an Anthropic, OpenRouter, or Portkey key, by sending it as X-AbstractCore-Provider-API-Key.

That key is forwarded only to the provider requested by the model route, such as anthropic/..., openrouter/..., or portkey/.... It does not unlock other server-configured provider keys, and it does not grant access to providers the client did not supply credentials for.

If ABSTRACTCORE_AUTH_TOKEN is configured, Authorization is reserved for the AbstractCore server auth token. In that mode, use X-AbstractCore-Provider-API-Key only when you want to override the upstream provider key for a single request.

Provider keys in request bodies remain disabled. Select discovery endpoints accept an api_key query parameter for tooling/Swagger UI convenience, but headers remain preferred.

Why aren’t tools executed automatically?

By default, AbstractCore runs in pass-through mode (execute_tools=False): it returns tool calls in resp.tool_calls, and your host/runtime decides whether/how to execute them.

Automatic execution (execute_tools=True) exists but is deprecated for most use cases. See Tool Calling.

What’s the difference between `web_search`, `skim_websearch`, `skim_url`, and `fetch_url`?

These built-in web tools live in abstractcore.tools.common_tools and require:

pip install "abstractcore[tools]"

web_search: fuller DuckDuckGo result set (good when you want breadth or more options).
skim_websearch: compact/filtered search results (good default for agents to keep prompts smaller). Defaults to 5 results and truncates long snippets.
skim_url: fast URL triage (fetches only a prefix and extracts lightweight metadata + a short preview). Defaults: max_bytes=200_000, max_preview_chars=1200, max_headings=8.
fetch_url: full fetch + parsing for text-first types (HTML→Markdown, JSON/XML/text). For PDFs/images/other binaries it returns metadata and optional previews; it does not do full PDF text extraction. It downloads up to 10MB by default; use include_full_content=False for smaller outputs.

Recommended workflow: skim_websearch → skim_url → fetch_url (use include_full_content=False when you want a smaller fetch_url output).

How do I preserve tool-call markup in `response.content` for agentic CLIs?

Use tool-call syntax rewriting:

Python: pass tool_call_tags=... to generate() / agenerate()
Server: set agent_format in requests

See Tool Syntax Rewriting.

How do I get structured output (typed objects) instead of parsing JSON?

Pass a Pydantic model via response_model=...:

from pydantic import BaseModel
from abstractcore import create_llm

class Answer(BaseModel):
    title: str
    bullets: list[str]

llm = create_llm("openai", model="gpt-4o-mini")
result = llm.generate("Summarize HTTP/3 in 3 bullets.", response_model=Answer)

See Structured Output.

Why does structured output retry or fail validation?

Structured output is validated against your schema. If validation fails, AbstractCore retries with feedback (up to the configured retry limit). Common fixes:

simplify schemas (fewer nested structures; fewer strict constraints)
tighten prompts (be explicit about allowed values and ranges)
increase timeouts for slow backends

See Structured Output and Troubleshooting.

Why do PDFs / Office docs / images not work?

Those require the media extra:

pip install "abstractcore[media]"

Then pass media=[...] to generate() or use the media pipeline. See Media Handling.

How do I attach audio or video?

Audio and video attachments are supported via media=[...], but they are policy-driven by design:

Audio defaults to audio_policy="native_only" (fails loudly unless the model supports native audio input).
Video defaults to video_policy="auto" (native video when supported; otherwise sample frames and route through image/vision handling). Frame sampling requires ffmpeg/ffprobe.

Speech-to-text fallback for audio (audio_policy="speech_to_text" or "auto") typically requires installing abstractvoice (capability plugin).

You can set defaults via the config CLI:

abstractcore --set-audio-strategy auto
abstractcore --set-video-strategy auto
abstractcore --set-video-max-frames 6

See:

Media Handling (policies + fallbacks)
Vision Capabilities (image/video input + fallback behavior)

How do I do speech-to-text (STT) or text-to-speech (TTS)?

Install the optional capability plugin package:

pip install "abstractcore[voice]"

This installs the remote-light AbstractVoice capability path. Local voice engines require an explicit local profile such as abstractcore[all-apple] or abstractcore[all-gpu].

Then use the deterministic capability surfaces:

from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")  # provider/model is only for LLM calls; STT/TTS are deterministic
print(llm.capabilities.status())  # shows which capability backends are available/selected

wav_bytes = llm.voice.tts("Hello", format="wav")
text = llm.audio.transcribe("speech.wav")

If you run the optional HTTP server, you can also use OpenAI-compatible endpoints:

POST /v1/audio/transcriptions
POST /v1/audio/speech

See: Server and Capabilities.

How do I generate or edit images?

Generative vision is dependency-light by default. AbstractCore Server can proxy OpenAI-compatible image endpoints without local vision runtimes. For local Diffusers/sdcpp image generation, install the vision extra:

pip install "abstractcore[server,vision]"

You can use generative vision through AbstractCore’s llm.vision.* capability plugin surface, or through AbstractCore Server’s optional endpoints:

POST /v1/images/generations
POST /v1/images/edits

Omit model with the server endpoints only when this server has a configured AbstractVision/OpenAI-compatible image default. Use model="diffusers/default" or model="diffusers/<huggingface-repo>" for explicit Diffusers routing, model="sdcpp/default" for configured stable-diffusion.cpp, or model="openai-compatible/<model>" with a configured image base URL for remote OpenAI-compatible endpoints. Local Diffusers is cache-only by default, so pre-download the model or set ABSTRACTCORE_VISION_ALLOW_DOWNLOAD=1 / ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1 when runtime downloads are intended.

See: Server, Capabilities, and abstractvision/docs/reference/abstractcore-integration.md (in the AbstractVision repo).

What are “glyphs” and what do they require?

Glyph visual-text compression is an optional feature for long documents. Install:

pip install "abstractcore[compression]" (renderer)
plus pip install "abstractcore[media]" if you want PDF extraction support

See Glyph Visual-Text Compression.

How do I use embeddings?

Embeddings are opt-in:

pip install "abstractcore[embeddings]"

Then import from the embeddings module:

from abstractcore.embeddings import EmbeddingManager

See Embeddings.

Do I need the HTTP server?

No. The server is optional and is mainly for:

exposing one OpenAI-compatible /v1 endpoint that can route to multiple providers/models
integrating with OpenAI-compatible clients and agentic CLIs

Install and run:

pip install "abstractcore[server]"
python -m abstractcore.server.app

See Server.

Where are logs and traces?

Logging (console/file) is configured via the config CLI and config file. See Structured Logging.
Interaction tracing is opt-in (enable_tracing=True). See Interaction Tracing.

I’m getting HTTP timeouts. What should I change?

Per-provider: pass timeout=... to create_llm(...) (timeout=None means unlimited).
Process-wide default: set abstractcore --set-default-timeout 0 (0 = unlimited), or set a larger value.
Some CLI apps have their own --timeout flags; run --help for the exact behavior.

See Troubleshooting and Centralized Config.

HuggingFace won’t download models — why?

The HuggingFace provider respects AbstractCore’s offline-first settings. If you want HuggingFace to fetch from the Hub, update ~/.abstractcore/config/abstractcore.json:

set "offline_first": false
set "force_local_files_only": false

Restart your Python process after changing this (the provider reads these settings at import time).

Is AbstractCore a full agent/RAG framework?

AbstractCore focuses on provider abstraction + infrastructure (tools, structured output, media handling, tracing). It does not ship a full RAG pipeline or multi-step agent orchestration. See Capabilities.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FAQ

What do I get with `pip install abstractcore`?

Which extra do I need for my provider?

How do I combine extras?

Why did my install pull `torch` / take a long time?

What’s the difference between “provider” and “model”?

How does AbstractCore relate to AbstractFramework / AbstractRuntime?

How do I connect to a local server (Ollama / LMStudio / vLLM / llama.cpp / LocalAI)?

Why do gateway providers return “unsupported parameter” errors (temperature/max_tokens)?

How do I set API keys and defaults?

Can I use the HTTP server with only provider API keys?

Why aren’t tools executed automatically?

What’s the difference between `web_search`, `skim_websearch`, `skim_url`, and `fetch_url`?

How do I preserve tool-call markup in `response.content` for agentic CLIs?

How do I get structured output (typed objects) instead of parsing JSON?

Why does structured output retry or fail validation?

Why do PDFs / Office docs / images not work?

How do I attach audio or video?

How do I do speech-to-text (STT) or text-to-speech (TTS)?

How do I generate or edit images?

What are “glyphs” and what do they require?

How do I use embeddings?

Do I need the HTTP server?

Where are logs and traces?

I’m getting HTTP timeouts. What should I change?

HuggingFace won’t download models — why?

Is AbstractCore a full agent/RAG framework?

FilesExpand file tree

faq.md

Latest commit

History

faq.md

File metadata and controls

FAQ

What do I get with pip install abstractcore?

Which extra do I need for my provider?

How do I combine extras?

Why did my install pull torch / take a long time?

What’s the difference between “provider” and “model”?

How does AbstractCore relate to AbstractFramework / AbstractRuntime?

How do I connect to a local server (Ollama / LMStudio / vLLM / llama.cpp / LocalAI)?

Why do gateway providers return “unsupported parameter” errors (temperature/max_tokens)?

How do I set API keys and defaults?

Can I use the HTTP server with only provider API keys?

Why aren’t tools executed automatically?

What’s the difference between web_search, skim_websearch, skim_url, and fetch_url?

How do I preserve tool-call markup in response.content for agentic CLIs?

How do I get structured output (typed objects) instead of parsing JSON?

Why does structured output retry or fail validation?

Why do PDFs / Office docs / images not work?

How do I attach audio or video?

How do I do speech-to-text (STT) or text-to-speech (TTS)?

How do I generate or edit images?

What are “glyphs” and what do they require?

How do I use embeddings?

Do I need the HTTP server?

Where are logs and traces?

I’m getting HTTP timeouts. What should I change?

HuggingFace won’t download models — why?

Is AbstractCore a full agent/RAG framework?

What do I get with `pip install abstractcore`?

Why did my install pull `torch` / take a long time?

What’s the difference between `web_search`, `skim_websearch`, `skim_url`, and `fetch_url`?

How do I preserve tool-call markup in `response.content` for agentic CLIs?