Basic Summarizer

The Basic Summarizer demonstrates how to build sophisticated text processing capabilities on top of AbstractCore using clean, zero-shot structured prompting techniques.

Recommended Setup: For optimal performance, use the free local model gemma3:1b-it-qat with Ollama, which provides fast processing (29s), high quality (95% confidence), and zero API costs.

Overview

The BasicSummarizer showcases AbstractCore's core strengths:

  • Structured Output: Uses Pydantic models for type-safe, validated responses
  • Provider Agnostic: Works with any LLM provider through AbstractCore's unified interface
  • Built-in Reliability: Inherits AbstractCore's retry mechanisms and error handling
  • Chunking Support: Automatically handles long documents using map-reduce approach
  • Event Integration: All operations emit events for monitoring and debugging

Quick Start

Prerequisites: For local processing, install Ollama and download the recommended model:

# Install Ollama, then download the fast, high-quality model
ollama pull gemma3:1b-it-qat

from abstractcore import create_llm
from abstractcore.processing import BasicSummarizer, SummaryStyle, SummaryLength

# Recommended: Fast local model for cost-effective processing
llm = create_llm("ollama", model="gemma3:1b-it-qat")

# Alternative: Cloud provider for highest quality
# llm = create_llm("openai", model="gpt-4o-mini")

# Create summarizer
summarizer = BasicSummarizer(llm)

# Basic usage
result = summarizer.summarize("Your long text here...")
print(result.summary)
print(f"Confidence: {result.confidence:.2f}")

Command Line Interface

The summarizer CLI provides direct terminal access for document summarization without any Python programming.

Quick CLI Usage

# Simple usage (after installing AbstractCore; add `pip install "abstractcore[media]"` for PDFs)
summarizer document.pdf

# With specific style and length
summarizer report.txt --style executive --length brief

# Focus on specific aspects
summarizer data.md --focus "technical details" --output summary.txt

# Use different provider
summarizer large.txt --provider openai --model gpt-4o-mini --verbose

CLI Parameters

Parameter     Options                                                                  Default           Description
file_path     Any text file                                                            Required          Path to the file to summarize
--style       structured, narrative, objective, analytical, executive, conversational  structured        Summary presentation style
--length      brief, standard, detailed, comprehensive                                 standard          Summary length and depth
--focus       Any text                                                                 None              Specific focus area for summarization
--output      File path                                                                Console           Output file path (prints to console if not provided)
--chunk-size  1000-32000                                                               8000              Chunk size in characters for large documents
--provider    openai, anthropic, ollama, etc.                                          ollama            LLM provider (requires --model)
--model       Provider-specific                                                        gemma3:1b-it-qat  LLM model (requires --provider)
--verbose     Flag                                                                     False             Show detailed progress information
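
To make the defaults and allowed values in the table concrete, here is a minimal argparse sketch that mirrors it. This is an illustration only, not the CLI's actual source: the helper names `build_parser` and `chunk_size` are hypothetical, and flags shown elsewhere in this document (such as --max-tokens and --timeout) are left out.

```python
import argparse

STYLES = ["structured", "narrative", "objective", "analytical", "executive", "conversational"]
LENGTHS = ["brief", "standard", "detailed", "comprehensive"]


def chunk_size(value: str) -> int:
    """Accept only chunk sizes inside the documented 1000-32000 range."""
    size = int(value)
    if not 1000 <= size <= 32000:
        raise argparse.ArgumentTypeError("chunk size must be between 1000 and 32000")
    return size


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser reproducing the parameter table (not the real CLI)."""
    parser = argparse.ArgumentParser(prog="summarizer")
    parser.add_argument("file_path", help="Path to the file to summarize")
    parser.add_argument("--style", choices=STYLES, default="structured")
    parser.add_argument("--length", choices=LENGTHS, default="standard")
    parser.add_argument("--focus", default=None)
    parser.add_argument("--output", default=None, help="Prints to console if omitted")
    parser.add_argument("--chunk-size", type=chunk_size, default=8000)
    parser.add_argument("--provider", default="ollama")
    parser.add_argument("--model", default="gemma3:1b-it-qat")
    parser.add_argument("--verbose", action="store_true")
    return parser


args = build_parser().parse_args(["report.txt", "--style", "executive", "--length", "brief"])
print(args.style, args.length, args.chunk_size, args.provider)
```

Options that are not passed fall back to the defaults in the table, so the example above runs with an 8000-character chunk size against the ollama provider.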

CLI Examples

# Basic document summarization
summarizer document.pdf
summarizer report.txt --verbose

# Executive summary for business documents
summarizer quarterly_report.pdf --style executive --length brief --output exec_summary.txt

# Technical focus with detailed analysis
summarizer technical_spec.md --focus "implementation details" --style analytical --length detailed

# Large document processing with custom chunking
summarizer large_manual.txt --chunk-size 15000 --verbose

# Using cloud providers for highest quality
summarizer important_doc.pdf --provider openai --model gpt-4o-mini --style executive

# Batch processing with shell scripting
for file in *.pdf; do
    summarizer "$file" --style structured --output "${file%.pdf}_summary.txt"
done

Alternative Usage Methods

# Method 1: Direct command (recommended after installation)
summarizer document.txt --style executive

# Method 2: Via Python module (always works)
python -m abstractcore.apps.summarizer document.txt --style executive

Supported File Types

The CLI supports most text-based file formats, plus PDFs when the media extra (pip install "abstractcore[media]") is installed:

  • .txt, .md, .py, .js, .html, .json, .csv
  • Most other text-based files

Default Model Setup

The CLI uses gemma3:1b-it-qat by default for fast, cost-effective processing:

# Install Ollama: https://ollama.com/
# Download the default model
ollama pull gemma3:1b-it-qat

# Then use directly
summarizer document.txt

Memory and Token Management

Important: Just because a model can handle 100K tokens doesn't mean your deployment environment has the GPU memory for it.

Token Budget Control

Use max_tokens to control memory usage:

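from abstractcore import create_llm
from abstractcore.processing import BasicSummarizer

# Create the LLM shared by the summarizers below
llm = create_llm("ollama", model="gemma3:1b-it-qat")

# AUTO mode: Uses model's full capability (default)
summarizer = BasicSummarizer(llm, max_tokens=-1)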

# Hard limit: 16K tokens (e.g., 8GB GPU constraint)
summarizer = BasicSummarizer(llm, max_tokens=16000)

# Hard limit: 24K tokens (e.g., 16GB GPU)
summarizer = BasicSummarizer(llm, max_tokens=24000)

Token Budget Modes

AUTO Mode (max_tokens=-1):

  • Uses model's full context window capability
  • Optimal performance when resources are available
  • Default behavior

Hard Limit (max_tokens=N):

  • Enforces specific token limit for deployment constraints
  • Example: max_tokens=16000 limits to 16K even if model supports 128K
  • Use when:
    • Memory-constrained environments (limited GPU/RAM)
    • Docker containers with hard memory limits
    • Shared infrastructure with strict resource limits
    • Batch processing requiring predictable memory usage

# Example: 8GB GPU deployment
llm = create_llm("ollama", model="qwen3:4b-instruct")
summarizer = BasicSummarizer(llm, max_tokens=16000)  # Hard limit

# Example: 16GB GPU with plenty of memory
llm = create_llm("ollama", model="qwen3:4b-instruct")
summarizer = BasicSummarizer(llm, max_tokens=-1)  # Use model's full capability
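
The two modes can be summarized as a single rule. The sketch below assumes that a hard limit is capped at the model's context window; how the library resolves the budget internally is not shown in this document, and `resolve_token_budget` is a hypothetical helper.

```python
def resolve_token_budget(max_tokens: int, model_context_window: int) -> int:
    """Return the token budget to use for a request.

    max_tokens == -1 means AUTO: use the model's full context window.
    Any positive value is a hard limit, capped at what the model supports
    (the capping is an assumption made for this sketch).
    """
    if max_tokens == -1:
        return model_context_window
    if max_tokens <= 0:
        raise ValueError("max_tokens must be -1 (AUTO) or a positive integer")
    return min(max_tokens, model_context_window)


print(resolve_token_budget(-1, 128_000))      # 128000
print(resolve_token_budget(16_000, 128_000))  # 16000
```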

CLI Usage

# AUTO mode (default, uses model's full capability)
summarizer document.txt

# Explicit AUTO mode
summarizer document.txt --max-tokens auto

# Hard limit for 8GB GPU
summarizer document.txt --max-tokens 16000

# Hard limit for memory-constrained Docker
summarizer document.txt --max-tokens 12000

# Debug: See token decisions
summarizer document.txt --max-tokens 16000 --debug

Configuration Options

Summary Styles

Control how the summary is presented:

from abstractcore.processing import SummaryStyle

# Available styles
SummaryStyle.STRUCTURED     # Bullet points, clear sections
SummaryStyle.NARRATIVE      # Flowing, story-like prose
SummaryStyle.OBJECTIVE      # Neutral, factual tone
SummaryStyle.ANALYTICAL     # Critical analysis with insights
SummaryStyle.EXECUTIVE      # Business-focused, action-oriented
SummaryStyle.CONVERSATIONAL # Chat history preservation with context

Summary Lengths

Control the detail level:

from abstractcore.processing import SummaryLength

# Available lengths
SummaryLength.BRIEF         # 2-3 sentences, key point only
SummaryLength.STANDARD      # 1-2 paragraphs, main ideas
SummaryLength.DETAILED      # Multiple paragraphs, comprehensive
SummaryLength.COMPREHENSIVE # Full analysis with context

Advanced Usage

Focus Areas

Specify what aspect of the text to emphasize:

# Focus on specific aspects
result = summarizer.summarize(
    text,
    focus="business implications",
    style=SummaryStyle.EXECUTIVE,
    length=SummaryLength.DETAILED
)

print(f"Focus alignment: {result.focus_alignment:.2f}")

Different Providers

The same code works with any provider:

# OpenAI
llm_openai = create_llm("openai", model="gpt-4o-mini")
summarizer_openai = BasicSummarizer(llm_openai)

# Anthropic
llm_claude = create_llm("anthropic", model="claude-haiku-4-5")
summarizer_claude = BasicSummarizer(llm_claude)

# Local models
llm_ollama = create_llm("ollama", model="llama3.2")
summarizer_local = BasicSummarizer(llm_ollama)

# Same API surface (results vary by model/provider)
result = summarizer_openai.summarize(text)
result = summarizer_claude.summarize(text)
result = summarizer_local.summarize(text)

Long Document Processing

Automatically handles documents of any length:

# Works with short documents
short_result = summarizer.summarize(short_article)

# Automatically chunks long documents
long_result = summarizer.summarize(entire_book_text)

# Customize chunking and memory usage
summarizer = BasicSummarizer(
    llm,
    max_chunk_size=6000,      # Character-based chunks
    max_tokens=16000          # Token budget (hard limit for deployment)
)

# AUTO mode (uses model's full capability)
summarizer_auto = BasicSummarizer(
    llm,
    max_tokens=-1             # AUTO: use model's capability
)

# Memory-constrained environment
summarizer_constrained = BasicSummarizer(
    llm,
    max_tokens=8000           # Hard limit for 4GB GPU
)

Output Structure

The SummaryOutput provides rich, structured information:

result = summarizer.summarize(text)

# Main summary
print(result.summary)

# Key points (3-5 most important)
for point in result.key_points:
    print(f"• {point}")

# Quality metrics
print(f"Confidence: {result.confidence:.2f}")
print(f"Focus alignment: {result.focus_alignment:.2f}")

# Word counts
print(f"Original: {result.word_count_original} words")
print(f"Summary: {result.word_count_summary} words")
print(f"Compression: {result.word_count_original / result.word_count_summary:.1f}x")
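
The compression line above divides by the summary word count, which fails if a summary comes back empty. A small guard avoids that; `compression_ratio` is a hypothetical helper, not part of the library.

```python
from typing import Optional


def compression_ratio(original_words: int, summary_words: int) -> Optional[float]:
    """Return how many times shorter the summary is, or None if it is empty."""
    if summary_words <= 0:
        return None
    return original_words / summary_words


ratio = compression_ratio(1200, 150)
print(f"Compression: {ratio:.1f}x" if ratio is not None else "Compression: n/a")
```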

Real-World Examples

Executive Summary

result = summarizer.summarize(
    quarterly_report,
    focus="financial performance and strategic initiatives",
    style=SummaryStyle.EXECUTIVE,
    length=SummaryLength.STANDARD
)

print("Executive Summary:")
print(result.summary)
print("\nKey Action Items:")
for point in result.key_points:
    print(f"• {point}")

Research Paper Analysis

result = summarizer.summarize(
    research_paper,
    focus="methodology and findings",
    style=SummaryStyle.ANALYTICAL,
    length=SummaryLength.DETAILED
)

if result.confidence > 0.8:
    print("High-confidence analysis:")
    print(result.summary)
else:
    print("Consider manual review - confidence low")

Technical Documentation

result = summarizer.summarize(
    technical_docs,
    focus="implementation details and requirements",
    style=SummaryStyle.STRUCTURED,
    length=SummaryLength.COMPREHENSIVE
)

print("Technical Overview:")
print(result.summary)

Event Monitoring

Monitor summarization progress with AbstractCore's event system:

from abstractcore.events import EventType, on_global

def monitor_summarization(event):
    if event.type == EventType.GENERATION_STARTED:
        print("Starting summarization...")
    elif event.type == EventType.GENERATION_COMPLETED:
        duration_ms = event.data.get("duration_ms")
        if isinstance(duration_ms, (int, float)):
            print(f"Completed in {float(duration_ms):.0f}ms")

on_global(EventType.GENERATION_STARTED, monitor_summarization)
on_global(EventType.GENERATION_COMPLETED, monitor_summarization)

result = summarizer.summarize(text)

Error Handling

Built-in reliability through AbstractCore:

from abstractcore.core.retry import RetryConfig

# Configure retry behavior
config = RetryConfig(max_attempts=3, initial_delay=1.0)
llm = create_llm("ollama", model="gemma3:1b-it-qat", retry_config=config)

summarizer = BasicSummarizer(llm)

# Automatic retry on failures
try:
    result = summarizer.summarize(text)
except Exception as e:
    print(f"Summarization failed after retries: {e}")
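
To illustrate what max_attempts and initial_delay control, here is a generic retry loop. It is a sketch of the general technique, not AbstractCore's implementation: the doubling delay and the `call_with_retry` helper are assumptions made for this example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 3,
    initial_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on any exception with a delay that doubles each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= 2
    raise AssertionError("unreachable")


calls: List[int] = []


def flaky() -> str:
    """Fail twice, then succeed, to exercise the retry path."""
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("temporary failure")
    return "summary"


delays: List[float] = []
print(call_with_retry(flaky, sleep=delays.append))  # records delays instead of sleeping
print(delays)
```

The demo passes `delays.append` as the sleep function so it finishes instantly; it prints the result of the third attempt followed by the two recorded delays, 1.0 and 2.0 seconds.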

Performance Considerations

Document Length Guidelines

Token limits depend on your deployment environment's GPU/RAM:

  • < 8,000 tokens: Single-pass summarization (fastest)
  • 8,000-50,000 tokens: Automatic chunking with minimal overhead
  • > 50,000 tokens: Map-reduce approach, with no fixed upper limit on document size

Memory Guidelines by Hardware:

  • 4-8GB GPU/RAM: Use max_tokens=8000-16000
  • 16GB GPU/RAM: Use max_tokens=24000-32000
  • 24GB+ GPU: Use max_tokens=48000+ or -1 (AUTO)
  • CPU-only: Use max_tokens=8000-16000

# Example: AUTO mode (default, uses model's full capability)
llm = create_llm("ollama", model="qwen3:4b-instruct")
summarizer = BasicSummarizer(llm)  # max_tokens=-1 by default

# Example: 8GB GPU deployment (hard limit)
llm = create_llm("ollama", model="qwen3:4b-instruct")
summarizer = BasicSummarizer(llm, max_tokens=16000)

# Example: CPU-only or constrained (hard limit)
llm = create_llm("ollama", model="qwen3:4b-instruct")
summarizer = BasicSummarizer(llm, max_tokens=8000)
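
The hardware guidelines can also be encoded as a small lookup when deployments vary. The helper below is hypothetical: it picks the values used in this document's examples (8000 for 4GB, 16000 for 8GB, 24000 for 16GB) and falls back to the lower tier for memory sizes the guidelines do not cover.

```python
def suggest_max_tokens(memory_gb: float, has_gpu: bool = True) -> int:
    """Map available memory to a max_tokens value based on the guidelines above."""
    if not has_gpu:
        return 8000       # CPU-only: 8000-16000, conservative end
    if memory_gb >= 24:
        return -1         # 24GB+ GPU: AUTO
    if memory_gb >= 16:
        return 24000      # 16GB: 24000-32000, conservative end
    if memory_gb >= 8:
        return 16000      # 8GB: matches the 8GB GPU example
    return 8000           # below 8GB: matches the 4GB GPU example


for gb in (4, 8, 16, 24):
    print(gb, suggest_max_tokens(gb))
```

The returned value can be passed straight to BasicSummarizer(llm, max_tokens=...).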

Provider Selection

Recommended for Production:

  • Ollama gemma3:1b-it-qat: Fast (29s), high quality (95% confidence), cost-effective local processing
  • Ollama qwen3-coder:30b: Premium quality (98% confidence), slower (119s), suitable for critical tasks

Cloud Alternatives:

  • OpenAI GPT-4o-mini: Excellent quality with API costs, good for low-volume
  • Anthropic Claude: Great for analytical and narrative styles

Performance Comparison:

Model              Speed    Quality  Cost    Suitable For
gemma3:1b-it-qat  Fast     High     Free    Production, high-volume
qwen3-coder:30b   Slow     Premium  Free    Critical accuracy
GPT-4o-mini       Medium   High     Paid    Occasional use
Claude-3.5        Medium   High     Paid    Narrative summaries

Cost Optimization

# Free local processing with high quality
llm = create_llm("ollama", model="gemma3:1b-it-qat")  # Fast, free, high quality
summarizer = BasicSummarizer(llm)

# Brief summaries for even faster processing
result = summarizer.summarize(
    text,
    length=SummaryLength.BRIEF  # Fastest processing
)

# Cloud option for occasional use
# llm = create_llm("openai", model="gpt-4o-mini")  # lower-cost alternative to gpt-4o

Implementation Details

Chunking Strategy

For long documents:

  1. Smart splitting: Breaks at sentence boundaries when possible
  2. Overlap: 200-character overlap between chunks to maintain context
  3. Map-reduce: Summarizes chunks independently, then combines
  4. Coherence: Final combination step ensures unified narrative
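
The four steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: `split_into_chunks` and `map_reduce_summarize` are hypothetical names, sentence boundaries are detected with a basic regular expression, and `summarize` stands in for any function that calls an LLM.

```python
import re
from typing import Callable, List


def split_into_chunks(text: str, max_chunk_size: int = 8000, overlap: int = 200) -> List[str]:
    """Split text into chunks of at most max_chunk_size characters.

    A chunk ends at the last sentence boundary inside its window when one
    exists, and the next chunk starts `overlap` characters before that end.
    """
    if overlap >= max_chunk_size:
        raise ValueError("overlap must be smaller than max_chunk_size")
    chunks: List[str] = []
    start = 0
    while start < len(text):
        end = min(start + max_chunk_size, len(text))
        if end < len(text):
            window = text[start:end]
            # Only accept boundaries past the overlap region so each step advances.
            boundaries = [m.end() for m in re.finditer(r"[.!?]\s", window) if m.end() > overlap]
            if boundaries:
                end = start + boundaries[-1]
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap
    return chunks


def map_reduce_summarize(text: str, summarize: Callable[[str], str],
                         max_chunk_size: int = 8000) -> str:
    """Summarize each chunk independently (map), then summarize the results (reduce)."""
    chunks = split_into_chunks(text, max_chunk_size)
    if len(chunks) == 1:
        return summarize(chunks[0])
    partial_summaries = [summarize(chunk) for chunk in chunks]
    return summarize("\n\n".join(partial_summaries))


text = "Sentence number one. " * 50
chunks = split_into_chunks(text, max_chunk_size=300, overlap=40)
print(len(chunks), max(len(chunk) for chunk in chunks))
```

The sketch runs a single reduce step. If the joined partial summaries could themselves exceed the chunk size, the reduce step would need to be applied recursively.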

Prompt Engineering

The summarizer uses sophisticated prompts that:

  • Adapt to style: Different instructions for each presentation style
  • Scale with length: Appropriate guidance for brief vs comprehensive
  • Handle focus: Specific attention to user-specified focus areas
  • Validate quality: Self-assessment of confidence and focus alignment
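
A minimal sketch of how such a prompt might be assembled is shown below. The instruction wording is taken from the style and length descriptions earlier in this document; the function name `build_prompt`, the prompt layout, and the 0-to-1 rating scale are assumptions, and the library's real prompts are likely more elaborate.

```python
from typing import Optional

STYLE_INSTRUCTIONS = {
    "structured": "Use bullet points and clear sections.",
    "narrative": "Write flowing, story-like prose.",
    "objective": "Keep a neutral, factual tone.",
    "analytical": "Provide critical analysis with insights.",
    "executive": "Be business-focused and action-oriented.",
    "conversational": "Preserve the chat history and its context.",
}

LENGTH_INSTRUCTIONS = {
    "brief": "2-3 sentences covering the key point only.",
    "standard": "1-2 paragraphs covering the main ideas.",
    "detailed": "Multiple paragraphs with comprehensive coverage.",
    "comprehensive": "A full analysis with context.",
}


def build_prompt(text: str, style: str = "structured", length: str = "standard",
                 focus: Optional[str] = None) -> str:
    """Assemble a zero-shot summarization prompt from style, length, and focus."""
    parts = [
        "Summarize the text below.",
        f"Style: {STYLE_INSTRUCTIONS[style]}",
        f"Length: {LENGTH_INSTRUCTIONS[length]}",
    ]
    if focus:
        parts.append(f"Focus especially on: {focus}")
    # Ask the model to self-assess, as described in the last bullet above.
    parts.append("Rate your confidence and focus alignment from 0 to 1.")
    parts.append(f"Text:\n{text}")
    return "\n".join(parts)


print(build_prompt("Q3 revenue grew 12%.", style="executive", length="brief", focus="financials"))
```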

Quality Assurance

  • Pydantic validation: Ensures structured output conforms to schema
  • Confidence scoring: LLM self-assesses summary accuracy
  • Focus alignment: Measures how well summary addresses specified focus
  • Word counting: Tracks compression ratios
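
The checks above can be illustrated with a dependency-free stand-in. The library itself uses Pydantic; the dataclass below only mimics that validation so the example runs without extra packages. The field names follow the attributes printed in the Output Structure section, while the class name `SummaryRecord`, the 0-to-1 score range, and the `needs_review` helper are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SummaryRecord:
    """Hypothetical stand-in for the validated summary output."""
    summary: str
    key_points: List[str]
    confidence: float
    focus_alignment: float
    word_count_original: int
    word_count_summary: int

    def __post_init__(self) -> None:
        # Reject scores outside the assumed 0-1 range, as schema validation would.
        for name in ("confidence", "focus_alignment"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between 0 and 1, got {value}")
        if not self.summary.strip():
            raise ValueError("summary must not be empty")

    def needs_review(self, threshold: float = 0.8) -> bool:
        """Flag summaries whose self-assessed confidence is not above the threshold."""
        return self.confidence <= threshold


record = SummaryRecord(
    summary="Revenue grew while costs stayed flat.",
    key_points=["Revenue up", "Costs flat"],
    confidence=0.92,
    focus_alignment=0.85,
    word_count_original=1200,
    word_count_summary=150,
)
print(record.needs_review())
```

Because the confidence is a self-assessment by the model, a threshold check like this is best treated as a prompt for manual review rather than a guarantee of accuracy.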

Timeout Configuration

The summarizer supports flexible timeout configuration for different document processing scenarios:

Default Behavior (Unlimited Timeout)

# Runs as long as needed - recommended for large documents
python -m abstractcore.apps.summarizer document.txt

Custom Timeout

# Set specific timeout (useful for production environments)
python -m abstractcore.apps.summarizer document.txt --timeout 300   # 5 minutes
python -m abstractcore.apps.summarizer document.txt --timeout 600   # 10 minutes

# Explicit unlimited timeout
python -m abstractcore.apps.summarizer document.txt --timeout none

Programmatic Usage

from abstractcore.processing import BasicSummarizer

# Unlimited timeout (default)
summarizer = BasicSummarizer()

# Custom timeout
summarizer = BasicSummarizer(timeout=300)  # 5 minutes

# Explicit unlimited timeout
summarizer = BasicSummarizer(timeout=None)

When to Use Timeouts:

  • Production environments: Set reasonable timeouts (300-600 seconds) to prevent hanging
  • Large documents: Use unlimited timeout for documents >100KB
  • Batch processing: Consider timeouts to handle individual document failures gracefully
  • Development: Use unlimited timeout to avoid interruptions during testing
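
For the batch case, one way to keep a single slow or failing document from stopping the run is to collect failures separately. The sketch below is generic: `summarize_batch` is a hypothetical helper, `summarize` stands in for a call such as summarizer.summarize, and the exact exception a timeout raises is an assumption, so the loop catches any exception.

```python
from typing import Callable, Dict, Tuple


def summarize_batch(
    documents: Dict[str, str],
    summarize: Callable[[str], str],
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Summarize each document, continuing when one fails or times out."""
    summaries: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for name, text in documents.items():
        try:
            summaries[name] = summarize(text)
        except Exception as exc:  # includes timeout errors raised by the client
            failures[name] = f"{type(exc).__name__}: {exc}"
    return summaries, failures


def fake_summarize(text: str) -> str:
    """Stand-in for a real summarizer; simulates a timeout on 'slow' documents."""
    if "slow" in text:
        raise TimeoutError("timed out after 300s")
    return text[:20]


done, failed = summarize_batch(
    {"a.txt": "short document", "b.txt": "a slow document"},
    fake_summarize,
)
print(done)    # {'a.txt': 'short document'}
print(failed)  # {'b.txt': 'TimeoutError: timed out after 300s'}
```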

Integration Examples

With AbstractCore Session

from abstractcore import BasicSession

session = BasicSession(llm, system_prompt="You are an expert summarizer")
summarizer = BasicSummarizer(session)

# Maintains conversation context
result1 = summarizer.summarize(doc1, focus="technical aspects")
result2 = summarizer.summarize(doc2, focus="how this relates to the previous document")

Batch Processing

documents = [doc1, doc2, doc3, doc4]
summaries = []

for doc in documents:
    result = summarizer.summarize(
        doc.content,
        focus="key insights",
        style=SummaryStyle.STRUCTURED,
        length=SummaryLength.STANDARD
    )
    summaries.append({
        'title': doc.title,
        'summary': result.summary,
        'key_points': result.key_points,
        'confidence': result.confidence
    })

# Filter high-confidence summaries
high_quality = [s for s in summaries if s['confidence'] > 0.8]

Extending the Summarizer

The Basic Summarizer serves as a foundation for more advanced processing:

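from typing import List

# Assumption: SummaryOutput is importable from abstractcore.processing alongside BasicSummarizer
from abstractcore.processing import BasicSummarizer, SummaryOutput, SummaryStyle

class CustomSummarizer(BasicSummarizer):
    def summarize_with_keywords(self, text: str, keywords: List[str]) -> SummaryOutput:
        # Steer the summary toward the given keywords via the focus parameter
        focus = f"these specific keywords: {', '.join(keywords)}"
        return self.summarize(text, focus=focus, style=SummaryStyle.ANALYTICAL)

    def comparative_summarize(self, texts: List[str]) -> List[SummaryOutput]:
        # Summarize each text with the same comparison-oriented focus
        focus = "comparative analysis and differences"
        return [self.summarize(text, focus=focus) for text in texts]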

Best Practices

  1. Choose appropriate length: Match summary length to use case
  2. Use focus effectively: Specific focus areas improve relevance
  3. Monitor confidence: Low confidence may indicate need for manual review
  4. Provider selection: Match provider capabilities to content type
  5. Batch processing: Process similar documents together for consistency
  6. Error handling: Always handle potential failures gracefully

Conclusion

The Basic Summarizer demonstrates how AbstractCore's infrastructure enables building sophisticated text processing capabilities with minimal complexity. It showcases:

  • Clean API design with powerful customization options
  • Automatic reliability through built-in retry and error handling
  • Universal compatibility across all LLM providers
  • Scalable architecture handling documents of any size
  • Production readiness with comprehensive error handling and monitoring

This implementation serves both as a useful tool for real summarization needs and as a reference for building other text processing capabilities on top of AbstractCore.