This page is a user-facing map of the public Python API exposed from abstractcore (see abstractcore/__init__.py). For a complete listing of functions/classes (including events), see API Reference.
New to AbstractCore? Start with Getting Started.
Implementation pointers (source of truth):
create_llm:abstractcore/core/factory.py→abstractcore/providers/registry.pyBasicSession:abstractcore/core/session.pyCachedSession:abstractcore/core/cached_session.py(prompt caching; seedocs/prompt-caching.md)- Response/types:
abstractcore/core/types.py - Tool decorator:
abstractcore/tools/core.py
Create a provider instance:
from abstractcore import create_llm
llm = create_llm("openai", model="gpt-4o-mini") # requires: pip install "abstractcore[openai]"
resp = llm.generate("Hello!")
print(resp.content)Provider IDs (common): openai, anthropic, openrouter, portkey, ollama, lmstudio, vllm, openai-compatible, huggingface, mlx.
from abstractcore import create_llm
llm_openrouter = create_llm("openrouter", model="openai/gpt-4o-mini")
llm_portkey = create_llm("portkey", model="gpt-5-mini", api_key="PORTKEY_API_KEY", config_id="pcfg_...")Gateway notes:
- OpenRouter uses
OPENROUTER_API_KEY(model names likeopenai/...). - Portkey uses
PORTKEY_API_KEYplus a config id (PORTKEY_CONFIG). - Optional generation parameters (
temperature,top_p,max_output_tokens, etc.) are only forwarded when explicitly set.
Keep conversation state:
from abstractcore import BasicSession, create_llm
session = BasicSession(create_llm("anthropic", model="claude-haiku-4-5")) # requires: abstractcore[anthropic]
print(session.generate("Give me 3 name ideas.").content)
print(session.generate("Pick the best one.").content)For prompt-cache-aware long chats (reuse stable prefixes like system/tools/files), use CachedSession:
from abstractcore import CachedSession, create_llm
llm = create_llm("mlx", model="mlx-community/Qwen3-4B") # requires: abstractcore[mlx]
session = CachedSession(provider=llm, system_prompt="You are helpful.", prompt_cache_strategy="auto")
session.attach_files(["/path/to/large_context.md"])
print(session.generate("Summarize the attached file.").content)See Prompt Caching.
For exact local-memory reuse, persist text/file content as a bloc, compile one provider/model artifact, then load it into a runtime prompt-cache key:
from abstractcore import (
create_llm,
ensure_bloc_kv_artifact,
load_bloc_kv_artifact,
delete_bloc_kv_artifact,
)
from abstractcore.core.file_blocs import FileBlocStore
llm = create_llm("huggingface", model="/path/to/local/text-model")
store = FileBlocStore()
record = store.upsert(file_meta={...}, content="stable memory text")
ensure = ensure_bloc_kv_artifact(provider=llm, store=store, record=record)
loaded = load_bloc_kv_artifact(provider=llm, store=store, record=record, key="work:memory")
resp = llm.generate("Use the loaded memory.", prompt_cache_binding=loaded.prompt_cache_binding)The public helpers are exported from both abstractcore and abstractcore.core:
ensure_bloc_kv_artifact, load_bloc_kv_artifact, compile_bloc_kv_artifact,
read_bloc_kv_manifest, list_bloc_kv_artifacts, find_bloc_kv_live_bindings,
delete_bloc_kv_artifact, prune_bloc_kv_artifacts, and delete_bloc. Delete helpers are safe
by default: a loaded artifact is blocked until you clear the bound runtime key or explicitly force
the delete. The shared contract currently covers MLX, HuggingFace transformers, and supported
HuggingFace GGUF exact-renderer paths. See Memory Blocs.
Define tools in Python with a decorator, then pass them to generate() / agenerate():
from abstractcore import create_llm, tool
@tool
def get_weather(city: str) -> str:
return f"{city}: 22°C and sunny"
llm = create_llm("openai", model="gpt-4o-mini")
resp = llm.generate("Use the tool.", tools=[get_weather])
print(resp.tool_calls)Most calls return a GenerateResponse object (or an iterator of them for streaming). Common fields:
content: cleaned assistant texttool_calls: structured tool calls (pass-through by default)usage: token usage (provider-dependent)metadata: provider/model specific fields (for example extracted reasoning text when configured)
download_model(...) is an async generator that yields DownloadProgress updates while a model is being fetched.
Supported providers:
ollama: pulls via the Ollama HTTP API (/api/pull)huggingface/mlx: downloads from HuggingFace Hub (requirespip install "abstractcore[huggingface]"; passtoken=for gated models)
Example:
import asyncio
from abstractcore import download_model
async def main():
async for p in download_model("ollama", "qwen3:4b-instruct-2507-q4_K_M"):
print(p.status.value, p.message)
asyncio.run(main())Implementation: abstractcore/download.py. For provider setup and base URLs, see Prerequisites.
Tools are passed explicitly to generate() / agenerate():
from abstractcore import create_llm, tool
@tool
def get_weather(city: str) -> str:
return f"{city}: 22°C and sunny"
llm = create_llm("openai", model="gpt-4o-mini")
resp = llm.generate("Use the tool.", tools=[get_weather])
print(resp.tool_calls)See Tool Calling and Tool Syntax Rewriting.
If you want a ready-made toolset (web + filesystem helpers), install:
pip install "abstractcore[tools]"Then import from abstractcore.tools.common_tools (for example web_search, skim_websearch, skim_url, fetch_url). See Tool Calling for usage patterns and when to use skim_* vs fetch_*.
Pass a Pydantic model via response_model=... to receive a typed result:
from pydantic import BaseModel
from abstractcore import create_llm
class Answer(BaseModel):
title: str
bullets: list[str]
llm = create_llm("openai", model="gpt-4o-mini")
result = llm.generate("Summarize HTTP/3 in 3 bullets.", response_model=Answer)
print(result.bullets)See Structured Output.
Media handling is opt-in:
pip install "abstractcore[media]"Then pass media=[...] to generate() / agenerate() (or use the media pipeline). Media behavior is policy-driven:
- Images: use a vision-capable model, or configure vision fallback (caption → inject short observations).
- Video: controlled by
video_policy(native when supported; otherwise frame sampling viaffmpeg+ vision handling). - Audio: controlled by
audio_policy(native when supported; otherwise optional speech-to-text viaabstractvoice).
See Media Handling, Vision Capabilities, and Centralized Config.
Install the relevant optional plugin first:
pip install "abstractcore[vision]" # image/video generation/edit
pip install "abstractcore[voice]" # TTS/STT/voice clone when backend supports it
pip install "abstractcore[music]" # text-to-music via abstractmusicThen use output=... for simple media-generation tasks:
# Text-only generate remains unchanged.
text = llm.generate("Explain cache invalidation.")
# Image generation.
image = llm.generate("A red ceramic mug on a white table.", output="image")
# Image edit. One image media item plus output="image" infers image edit.
edited = llm.generate("Make the mug blue.", media="mug.png", output="image")
# Text-to-video. Progress callbacks are forwarded to AbstractVision.
video = llm.generate(
"A slow camera move through a luminous data center.",
on_progress=lambda event: print(event),
output={
"task": "text_to_video",
"provider": "mlx-gen",
"model": "Wan-AI/Wan2.2-TI2V-5B-Diffusers",
"num_frames": 121,
"fps": 24,
"extra": {"max_sequence_length": 256},
},
)
# Image-to-video. Mark the image as the source frame.
i2v = llm.generate(
"Slow camera push-in.",
media={"type": "image", "path": "first-frame.png", "role": "source"},
output={
"task": "image_to_video",
"provider": "mlx-gen",
"model": "Wan-AI/Wan2.2-TI2V-5B-Diffusers",
},
)
# TTS.
speech = llm.generate(text="Hello from AbstractCore.", output="voice")
# Music generation.
music = llm.generate(
text="A short calm piano loop.",
output={"modality": "music", "provider": "acemusic", "duration_s": 8, "format": "wav"},
)
# Voice clone/register. Audio media plus output="voice" returns a voice
# resource id when the selected AbstractVoice backend supports cloning.
clone = llm.generate(text="Optional transcript.", media="reference.wav", output="voice")
voice_id = clone.resources["voice"][0].resource_idgenerate(..., output=...) returns MultimodalGenerateResponse for non-text
outputs. Binary artifacts are grouped under outputs, while reusable resources
such as cloned voices are grouped under resources. Plain output="text" with
a prompt preserves the normal GenerateResponse path for compatibility.
If you want an OpenAI-compatible /v1 gateway, install and run the server:
pip install "abstractcore[server]"
python -m abstractcore.server.appSee Server.