AbstractCore Capabilities

This document clearly explains what AbstractCore can and cannot do, helping you understand when to use it and when to look elsewhere.

What AbstractCore IS

AbstractCore is production-ready LLM infrastructure. It provides a unified, reliable interface to language models with essential features built-in.

Core Philosophy

Infrastructure, not application logic
Reliability over features
Simplicity over complexity
Provider agnostic

Optional capability plugins (voice/audio/vision/music)

AbstractCore stays dependency-light by default. Deterministic modality APIs (STT/TTS, generative vision) live in optional packages and are exposed through the capability plugin layer:

Install abstractcore[voice] → llm.voice / llm.audio via abstractvoice (TTS/STT)
Install abstractcore[vision] → llm.vision via abstractvision (text→image, image→image, text→video, image→video)
Install abstractcore[music] → llm.music for text→music through abstractmusic

pip install "abstractcore[voice]"
pip install "abstractcore[vision]"
pip install "abstractcore[music]"

abstractvoice 0.10.17+ can install its base AbstractCore plugin path on Python 3.9 without OmniVoice, torch, or torchaudio. Python 3.10+ is recommended. Local voice engines and clone backends are installed through explicit local aggregate profiles such as abstractcore[all-apple] and abstractcore[all-gpu]; AEC requires Python 3.11+.

from abstractcore import create_llm

llm = create_llm("openai", model="gpt-4o-mini")  # example; pick a provider/model you have access to
print(llm.capabilities.status())  # availability + selected backend ids + install hints
print(llm.capabilities.list_backend_infos())  # registered backend metadata, no backend instantiation

# Voice/audio
wav_bytes = llm.voice.tts("Hello", format="wav")
text = llm.audio.transcribe("speech.wav")
voices = llm.voice.voice_catalog()
tts_models = llm.voice.list_tts_models()
# Optional: llm.voice.clone(...) is available only when the selected voice
# backend exposes cloning. The HTTP server also provides /v1/voice/clone.

# Vision via AbstractVision
# Configure AbstractVision's backend/default first, or pass backend-specific kwargs.
png_bytes = llm.vision.t2i("a red square", width=512, height=512, steps=20)
image_models = llm.vision.list_provider_models(task="text_to_image")
mp4_bytes = llm.vision.t2v(
    "A slow camera move through a luminous data center.",
    provider="mlx-gen",
    model="Wan-AI/Wan2.2-TI2V-5B-Diffusers",
    num_frames=121,
    fps=24,
    extra={"max_sequence_length": 256},
)

# Generic discovery for plugins that expose the shared contract.
music_providers = llm.capabilities.available_providers("music", task="text_to_music")
music_models = llm.capabilities.list_models("music", task="text_to_music")

# Music via AbstractMusic when installed.
wav_music = llm.music.generate(
    "A short calm piano loop.",
    provider="acemusic",
    duration_s=8,
    format="wav",
)

# Remote OpenAI-compatible path:
# export ABSTRACTVISION_BACKEND=openai
# export OPENAI_BASE_URL=http://localhost:8000/v1
# unset ABSTRACTVISION_MODEL_ID  # omit model and use the server's configured image default
# png_bytes = llm.vision.t2i("a red square")

Unified `generate(..., output=...)` convenience

For common cases, the normal generation API can route to these optional capabilities:

# Image generation.
image = llm.generate("A red square on a white background.", output="image")

# Image edit. One image media item plus output="image" infers image-to-image.
edited = llm.generate("Make it blue.", media="red-square.png", output="image")

# Text-to-video. Top-level progress callbacks are forwarded to AbstractVision.
video = llm.generate(
    "A slow camera move through a luminous data center.",
    on_progress=lambda event: print(event),
    output={
        "task": "text_to_video",
        "provider": "mlx-gen",
        "model": "Wan-AI/Wan2.2-TI2V-5B-Diffusers",
        "num_frames": 121,
        "fps": 24,
        "extra": {"max_sequence_length": 256},
    },
)

# Image-to-video. Mark the image as the source frame.
i2v = llm.generate(
    "Slow camera push-in.",
    media={"type": "image", "path": "first-frame.png", "role": "source"},
    output={
        "task": "image_to_video",
        "provider": "mlx-gen",
        "model": "Wan-AI/Wan2.2-TI2V-5B-Diffusers",
    },
)

# TTS. Text plus output="voice" returns generated audio.
speech = llm.generate(text="Hello from AbstractCore.", output="voice")

# Music. Text plus output="music" returns generated music/audio.
music = llm.generate(
    text="A short calm piano loop.",
    output={"modality": "music", "provider": "acemusic", "duration_s": 8, "format": "wav"},
)

# Voice clone/register. Audio media plus output="voice" returns a reusable voice id
# when the selected AbstractVoice backend supports local or remote cloning.
clone = llm.generate(text="Optional transcript.", media="reference.wav", output="voice")
voice_id = clone.resources["voice"][0].resource_id

Text-only generate(...) is unchanged. Ambiguous media cases require explicit roles, for example role="source" and role="mask" for image edits with masks. Use task="tts" when audio media is a temporary voice reference rather than a clone/register sample.

Direct llm.vision calls are provided by abstractvision. For local Diffusers, choose an explicit model/default in AbstractVision, pre-download model weights, or explicitly opt in to runtime downloads with ABSTRACTVISION_DIFFUSERS_ALLOW_DOWNLOAD=1. For local MLX-Gen, select exact model repo ids such as AbstractFramework/qwen-image-2512-4bit, briaai/FIBO, or Wan-AI/Wan2.2-TI2V-5B-Diffusers; quantized models are selected by repo id, not by a Core-side quant parameter. For server/OpenAI-compatible use, point OPENAI_BASE_URL at a media endpoint such as AbstractCore Server's /v1.

The server exposes the same deep catalogs through:

GET /v1/vision/providers/
GET /v1/vision/models
GET /v1/audio/voices
GET /v1/audio/speech/models
GET /v1/capabilities
GET /v1/capabilities/{capability}/providers
GET /v1/capabilities/{capability}/models
POST /v1/audio/music
POST /{provider}/v1/audio/music
GET /v1/audio/music/providers
GET /v1/audio/music/models
POST /v1/videos/generations
POST /v1/videos/edits
POST /v1/vision/jobs/videos/generations
POST /v1/vision/jobs/videos/edits

For abstractmusic>=0.1.12, the default lightweight music backend is the remote ACE Music API path (provider="acemusic" or /acemusic/v1/audio/music). Set ACEMUSIC_API_KEY in the server or Python environment. The server music route accepts wav, mp3, and flac; individual backends may support fewer formats.

Keep /v1/models for LLM/embedding provider discovery. Generated-media catalogs are intentionally separate so image and voice backends can expose their own provider-specific metadata without blurring the LLM model taxonomy.

Plugin host text service

Plugins that need text planning can use the host context supplied by AbstractCore instead of constructing their own LLM provider. The public seam is narrow:

owner.capability_host_context.text.generate_text(...)
owner.capability_host_context.text.generate_structured(..., response_model=...)

The service forces text-only generation (output=None, no media, no streaming) and prevents capability recursion. Plugins should treat it as an optional host service and keep their base package usable without importing AbstractCore.

Direct llm.voice / generate(..., output="voice") calls are provided by abstractvoice. For remote OpenAI TTS/STT, configure the provider before creating the LLM:

llm = create_llm(
    "openai",
    model="gpt-4o-mini",
    voice_tts_engine="openai",
    voice_stt_engine="openai",
)

speech = llm.generate(
    text="Hello from AbstractCore.",
    output={"modality": "voice", "voice": "coral", "format": "wav"},
)

For OpenAI-compatible audio servers, use voice_tts_engine="openai-compatible" and set voice_remote_base_url / voice_remote_api_key, or the equivalent OPENAI_BASE_URL / OPENAI_API_KEY environment variables. Voice clone/register also requires a backend that exposes a local or remote clone route.

What AbstractCore Does Well

1. Universal LLM Provider Interface

What it does: Provides identical APIs across all major LLM providers.

# Same code works with any provider
def ask_llm(provider_name, question):
    llm = create_llm(provider_name, model="default")
    return llm.generate(question)

# All of these use the same API surface
ask_llm("openai", "What is Python?")
ask_llm("anthropic", "What is Python?")
ask_llm("ollama", "What is Python?")

Why this helps: Provides consistent tool calling, streaming, and structured output across supported providers.

2. Production-Grade Reliability

What it does: Handles failures gracefully with retry logic, circuit breakers, and comprehensive error handling.

Automatic retries with exponential backoff for rate limits and network errors
Circuit breakers prevent cascade failures when providers go down
Smart error classification - retries recoverable errors, fails fast on auth errors
Event system for monitoring and alerting

Why this helps: Includes production reliability features like retry logic and error handling.

3. Universal Tool Calling

What it does: Tools work consistently across supported providers, even those without native tool support.

tools = [{"name": "get_weather", "description": "Get weather", ...}]

# Works with providers that have native tool support
openai_response = openai_llm.generate("Weather in Paris?", tools=tools)

# Also works with providers that don't (via intelligent prompting)
ollama_response = ollama_llm.generate("Weather in Paris?", tools=tools)

Why this helps: Tools work with any provider, including those without native tool support.

4. Tool Call Tag Rewriting for Agentic CLI Compatibility

What it does: Automatically rewrites tool call tags to match different agentic CLI requirements in real-time.

# Rewrite tool calls for different CLIs
# Use a prompted-tools provider (tool-call markup lives in assistant content)
llm = create_llm("ollama", model="qwen3:4b-instruct")

# For Codex CLI (Qwen3 format)
response = llm.generate("Weather in Paris?", tools=tools, tool_call_tags="qwen3")
# Output: <|tool_call|>{"name": "get_weather", "arguments": {"location": "Paris"}}</|tool_call|>

# For Crush CLI (LLaMA3 format)  
response = llm.generate("Weather in Paris?", tools=tools, tool_call_tags="llama3")
# Output: <function_call>{"name": "get_weather", "arguments": {"location": "Paris"}}</function_call>

# For Gemini CLI (XML format)
response = llm.generate("Weather in Paris?", tools=tools, tool_call_tags="xml")
# Output: <tool_call>{"name": "get_weather", "arguments": {"location": "Paris"}}</tool_call>

Why this helps: Works with different agentic CLIs without code changes.

5. Tool Execution Control

What it does: Control whether AbstractCore executes tools automatically or lets the agent handle execution.

# Default (recommended): passthrough mode (tools are *not* executed in AbstractCore)
llm = create_llm("openai", model="gpt-4o-mini")
response = llm.generate("Weather in Paris?", tools=tools)
# response.tool_calls contains structured tool call requests; host/runtime executes them

# Optional (deprecated): direct execution in AbstractCore for simple scripts only
# llm = create_llm("openai", model="gpt-4o-mini", execute_tools=True)

Why this helps: Allows flexible tool execution control for different deployment scenarios.

6. Structured Output with Automatic Retry

What it does: Gets typed Python objects from LLMs with automatic validation and retry on failures.

class Product(BaseModel):
    name: str
    price: float

# Automatically retries with error feedback if validation fails
product = llm.generate(
    "Extract: Gaming laptop for $1200",
    response_model=Product
)

Why this helps: Built-in validation retry reduces manual error handling.

See: Structured Output Guide for native vs prompted strategies, schema design, and production deployment

5. Streaming with Tool Support

What it does: Real-time response streaming that properly handles tool calls.

# Streams content in real-time, executes tools at the end
for chunk in llm.generate("Tell me about Paris weather", tools=tools, stream=True):
    print(chunk.content, end="", flush=True)

Why this helps: Streaming works correctly with tool calls.

6. Event-Driven Observability

What it does: Comprehensive events for monitoring, debugging, and control.

from abstractcore.events import EventType, on_global

def cost_monitor(event):
    if event.type != EventType.GENERATION_COMPLETED:
        return
    cost = event.data.get("cost_usd")
    if isinstance(cost, (int, float)) and cost > 0.10:
        # NOTE: `cost_usd` is a best-effort estimate based on token usage.
        alert(f"High estimated cost: ${cost:.2f}")

on_global(EventType.GENERATION_COMPLETED, cost_monitor)

Why this helps: Provides built-in observability for monitoring and debugging.

7. Built-in Production Applications

What it does: Provides ready-to-use command-line applications for common LLM tasks without any programming.

# Document summarization with multiple strategies
summarizer document.pdf --style executive --length brief
summarizer report.txt --focus "technical details" --output summary.txt

# Entity and relationship extraction
extractor research_paper.pdf --format json-ld --focus technology
extractor article.txt --entity-types person,organization,location

# Text evaluation and scoring
judge essay.txt --criteria clarity,accuracy,coherence --context "academic writing"
judge code.py --context "code review" --format plain

# Intent analysis and deception detection
intent conversation.txt --focus-participant user --depth comprehensive
intent email.txt --format plain --context document --verbose

Available Applications:

Summarizer: Document summarization with customizable styles and focus areas
Extractor: Entity and relationship extraction with multiple output formats
Judge: Text evaluation with custom criteria and scoring rubrics
Intent Analyzer: Psychological intent analysis with deception detection

Why this helps: Provides ready-to-use CLI tools that work with any LLM provider.

What AbstractCore Does NOT Do

Understanding limitations is crucial for choosing the right tool.

1. RAG Pipelines (Use Specialized Tools)

What AbstractCore provides: Vector embeddings via EmbeddingManager What it doesn't provide: Document chunking, vector databases, retrieval strategies

# AbstractCore gives you this
from abstractcore.embeddings import EmbeddingManager
embedder = EmbeddingManager()
similarity = embedder.compute_similarity("query", "document")

# You need to build this yourself
def rag_pipeline(query, documents):
    # 1. Chunk documents - YOU implement
    # 2. Store in vector DB - YOU implement
    # 3. Retrieve relevant chunks - YOU implement
    # 4. Construct prompt - YOU implement
    return llm.generate(prompt)

Better alternatives:

LlamaIndex - Full RAG framework
LangChain - RAG components and chains

2. Complex Agent Workflows (Use Agent Frameworks)

What AbstractCore provides: Single LLM calls with tool execution What it doesn't provide: Multi-step agent reasoning, planning, memory persistence

# AbstractCore is great for this
response = llm.generate("What's 2+2?", tools=[calculator_tool])

# AbstractCore is NOT for this
def complex_agent():
    # 1. Plan multi-step solution - NOT provided
    # 2. Execute steps with memory - NOT provided
    # 3. Reflect and re-plan - NOT provided
    # 4. Persist agent state - NOT provided
    pass

Better alternatives:

AbstractAgent - Built on AbstractCore
LangGraph - Agent orchestration
AutoGPT - Autonomous agents

3. Advanced Memory Systems (Use Memory Frameworks)

What AbstractCore provides: Basic conversation history via BasicSession What it doesn't provide: Semantic memory, long-term memory, knowledge graphs

# AbstractCore provides basic sessions
session = BasicSession(provider=llm)
session.generate("My name is Alice")
session.generate("What's my name?")  # Remembers within session

# For advanced memory, use specialized tools
temporal_graph = AbstractMemory()  # Persistent, semantic memory
temporal_graph.add_memory("Alice likes Python programming", context="conversation")

Better alternatives:

AbstractMemory - Temporal knowledge graphs
Mem0 - Personalized memory layer

4. Prompt Template Management (Use Template Libraries)

What AbstractCore provides: Direct prompt strings What it doesn't provide: Template engines, prompt optimization, A/B testing

# AbstractCore expects you to handle prompts
prompt = f"Translate '{text}' to {language}"
response = llm.generate(prompt)

# For advanced templating, use other tools
template = PromptTemplate("Translate '{text}' to {language}")  # Not provided

Better alternatives:

Jinja2 - Template engine
LangChain Prompts - Prompt management
Guidance - Prompt programming

5. Training and Fine-tuning (Use ML Frameworks)

What AbstractCore provides: Interface to existing models What it doesn't provide: Model training, fine-tuning, or optimization

Better alternatives:

Transformers - Model training
Axolotl - Fine-tuning framework
Unsloth - Fast fine-tuning

6. Multi-Agent Orchestration (Use Orchestration Frameworks)

What AbstractCore provides: Single agent with tools What it doesn't provide: Agent-to-agent communication, hierarchical agents

Better alternatives:

CrewAI - Multi-agent teams
AutoGen - Agent conversations
LangGraph - Agent networks

When to Choose AbstractCore

Choose AbstractCore When You Need:

Reliable LLM Infrastructure
- Production-ready error handling and retry logic
- Consistent interface across different providers
- Built-in monitoring and observability
Provider Flexibility
- Easy switching between OpenAI, Anthropic, Ollama, etc.
- Provider-agnostic code that runs anywhere
- Local and cloud provider support
Universal Tool Calling
- Tools that work across supported providers
- Consistent tool execution regardless of native support
- Event-driven tool control and monitoring
Structured Output Reliability
- Type-safe responses with automatic validation
- Built-in retry logic for validation failures
- Production-grade error handling
Streaming with Tools
- Real-time responses that handle tools correctly
- Proper streaming implementation across providers

Don't Choose AbstractCore When You Need:

Full RAG Frameworks → Use LlamaIndex or LangChain
Complex Agent Workflows → Use AbstractAgent or LangGraph
Advanced Memory Systems → Use AbstractMemory or Mem0
Prompt Template Management → Use Jinja2 or LangChain Prompts
Model Training/Fine-tuning → Use Transformers or Axolotl
Multi-Agent Systems → Use CrewAI or AutoGen

AbstractCore in the Ecosystem

AbstractCore is designed to be the foundation that other tools build on:

graph TD
    A[Your Application] --> B[AbstractAgent]
    A --> C[AbstractMemory]
    A --> D[Custom RAG Pipeline]

    B --> E[AbstractCore]
    C --> E
    D --> E

    E --> F[OpenAI]
    E --> G[Anthropic]
    E --> H[Ollama]
    E --> I[MLX]

    style E fill:#e1f5fe
    style A fill:#fff3e0

AbstractCore = The reliable foundation AbstractAgent = Agent workflows and planning AbstractMemory = Advanced memory and knowledge graphs Your Application = Business logic and user interface

Decision Tree

Need LLM functionality?
├── Simple LLM calls with reliability? → AbstractCore ✅
├── Complex agents with planning? → AbstractAgent (built on AbstractCore)
├── Advanced memory/knowledge graphs? → AbstractMemory (with AbstractCore)
├── Full RAG with document management? → LlamaIndex or LangChain
├── Multi-agent conversations? → CrewAI or AutoGen
└── Just API compatibility? → LiteLLM

Capabilities Summary

Capability	AbstractCore	When You Need More
LLM Provider Interface	✅ Universal	Covers most use cases
Production Reliability	✅ Built-in	Covers most use cases
Tool Calling	✅ Universal	Multi-step reasoning → AbstractAgent
Structured Output	✅ With retry	Complex validation → Custom logic
Streaming	✅ With tools	Covers most use cases
Basic Memory	✅ Sessions	Semantic memory → AbstractMemory
Vector Embeddings	✅ SOTA models	Full RAG → LlamaIndex
Events/Monitoring	✅ Comprehensive	Covers most use cases
Agent Workflows	❌ Single calls	Complex agents → AbstractAgent
Advanced Memory	❌ Session only	Knowledge graphs → AbstractMemory
RAG Pipelines	❌ Embeddings only	Document processing → LlamaIndex
Prompt Templates	❌ Raw strings	Template management → Jinja2

Next Steps

Based on your needs:

Start with AbstractCore: Getting Started Guide
Need agents: Check out AbstractAgent
Need advanced memory: Check out AbstractMemory
Compare frameworks: Read Framework Comparison
See real examples: Browse Examples

Remember: AbstractCore is infrastructure, not a full framework. It focuses on LLM provider abstraction and integrates with specialized tools for other needs.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AbstractCore Capabilities

What AbstractCore IS

Core Philosophy

Optional capability plugins (voice/audio/vision/music)

Unified `generate(..., output=...)` convenience

Plugin host text service

What AbstractCore Does Well

1. Universal LLM Provider Interface

2. Production-Grade Reliability

3. Universal Tool Calling

4. Tool Call Tag Rewriting for Agentic CLI Compatibility

5. Tool Execution Control

6. Structured Output with Automatic Retry

5. Streaming with Tool Support

6. Event-Driven Observability

7. Built-in Production Applications

What AbstractCore Does NOT Do

1. RAG Pipelines (Use Specialized Tools)

2. Complex Agent Workflows (Use Agent Frameworks)

3. Advanced Memory Systems (Use Memory Frameworks)

4. Prompt Template Management (Use Template Libraries)

5. Training and Fine-tuning (Use ML Frameworks)

6. Multi-Agent Orchestration (Use Orchestration Frameworks)

When to Choose AbstractCore

Choose AbstractCore When You Need:

Don't Choose AbstractCore When You Need:

AbstractCore in the Ecosystem

Decision Tree

Capabilities Summary

Next Steps

FilesExpand file tree

capabilities.md

Latest commit

History

capabilities.md

File metadata and controls

AbstractCore Capabilities

What AbstractCore IS

Core Philosophy

Optional capability plugins (voice/audio/vision/music)

Unified generate(..., output=...) convenience

Plugin host text service

What AbstractCore Does Well

1. Universal LLM Provider Interface

2. Production-Grade Reliability

3. Universal Tool Calling

4. Tool Call Tag Rewriting for Agentic CLI Compatibility

5. Tool Execution Control

6. Structured Output with Automatic Retry

5. Streaming with Tool Support

6. Event-Driven Observability

7. Built-in Production Applications

What AbstractCore Does NOT Do

1. RAG Pipelines (Use Specialized Tools)

2. Complex Agent Workflows (Use Agent Frameworks)

3. Advanced Memory Systems (Use Memory Frameworks)

4. Prompt Template Management (Use Template Libraries)

5. Training and Fine-tuning (Use ML Frameworks)

6. Multi-Agent Orchestration (Use Orchestration Frameworks)

When to Choose AbstractCore

Choose AbstractCore When You Need:

Don't Choose AbstractCore When You Need:

AbstractCore in the Ecosystem

Decision Tree

Capabilities Summary

Next Steps

Unified `generate(..., output=...)` convenience