CLI Reference

Overview

The docling-graph CLI provides command-line tools for document-to-graph conversion, configuration management, and graph visualization.

Available Commands:

  • init - Create configuration files
  • convert - Convert documents to graphs
  • inspect - Visualize graphs in browser

Quick Start

Installation

pip install docling-graph

# Verify installation
docling-graph --version

(If you installed from source with uv, use uv run docling-graph instead of docling-graph.)

Basic Usage

# 1. Initialize configuration
docling-graph init

# 2. Convert a document
docling-graph convert document.pdf \
    --template "templates.BillingDocument"

# 3. Visualize the graph
docling-graph inspect outputs/

Global Options

Available with all commands:

Option      Short    Description
--verbose   -v       Enable detailed logging
--version   (none)   Show version and exit
--help      -h       Show help message

Examples

# Show version
docling-graph --version

# Enable verbose logging
docling-graph --verbose convert document.pdf -t "templates.BillingDocument"

# Show help
docling-graph --help
docling-graph convert --help

Command Overview

init

Create a configuration file with interactive prompts.

docling-graph init

Features:

  • Interactive configuration builder (processing mode, extraction contract, backend, inference, provider/model, export, output)
  • Prompts for delta resolvers and quality gate tuning when you select delta as the extraction contract
  • Dependency validation
  • Provider/model identifiers use LiteLLM routing
  • API key guidance

Learn more: init Command →


convert

Convert documents to knowledge graphs.

docling-graph convert SOURCE --template TEMPLATE [OPTIONS]

Features:

  • Multiple backend support (LLM/VLM)
  • Flexible processing modes
  • Configurable chunking
  • Multiple export formats

Learn more: convert Command →


inspect

Visualize graphs in your browser.

docling-graph inspect PATH [OPTIONS]

Features:

  • Interactive HTML visualization
  • CSV and JSON import
  • Node/edge exploration
  • Self-contained output

Learn more: inspect Command →


Common Workflows

Workflow 1: First-Time Setup

# 1. Initialize configuration
docling-graph init

# 2. Install dependencies (if prompted)
uv sync

# 3. Set API key (if using remote inference)
export MISTRAL_API_KEY="your-key"

# 4. Convert first document
docling-graph convert document.pdf \
    --template "templates.BillingDocument"

Workflow 2: Batch Processing

# Process multiple documents
for pdf in documents/*.pdf; do
    docling-graph convert "$pdf" \
        --template "templates.BillingDocument" \
        --output-dir "outputs/$(basename "$pdf" .pdf)"
done

# Visualize results
for dir in outputs/*/; do
    docling-graph inspect "$dir" \
        --output "${dir}visualization.html" \
        --no-open
done
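
A plain loop like the one above continues past a failed conversion without recording it. The following is a minimal sketch of a batch helper that tracks failures, assuming docling-graph exits with a non-zero status when a conversion fails (not confirmed by this page). The names run_for_each_pdf and convert_one are hypothetical and are not part of the CLI.

```shell
# Hypothetical helper: runs the given command once per PDF in directory $1,
# passing the PDF path as the last argument. Reports failed files on stderr
# and returns non-zero if at least one run failed.
run_for_each_pdf() {
    src_dir="$1"
    shift
    failed=0
    for pdf in "$src_dir"/*.pdf; do
        [ -e "$pdf" ] || continue          # the glob matched nothing
        if ! "$@" "$pdf"; then
            echo "FAILED: $pdf" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}

# Example usage (commented out so the sketch has no side effects):
# convert_one() {
#     docling-graph convert "$1" \
#         --template "templates.BillingDocument" \
#         --output-dir "outputs/$(basename "$1" .pdf)"
# }
# run_for_each_pdf documents convert_one || echo "Some conversions failed"
```

Wrapping the per-file command in a function keeps the options in one place and lets the helper stay independent of any particular flags.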

Workflow 3: Development Iteration

# 1. Convert with verbose logging
docling-graph --verbose convert document.pdf \
    --template "templates.BillingDocument" \
    --output-dir "test_output"

# 2. Inspect results
docling-graph inspect test_output/

# 3. Iterate on template
# Edit templates/billing_document.py

# 4. Re-run conversion
docling-graph convert document.pdf \
    --template "templates.BillingDocument" \
    --output-dir "test_output"

Configuration Priority

The CLI uses the following priority order (highest to lowest):

  1. Command-line arguments (e.g., --backend llm)
  2. config.yaml (created by init)
  3. Built-in defaults (from PipelineConfig)

Example

# config.yaml
defaults:
  backend: llm
  inference: local

# This uses remote inference (the CLI flag overrides config.yaml)
docling-graph convert doc.pdf \
    --template "templates.BillingDocument" \
    --inference remote
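
The precedence rule can be illustrated with a small stand-alone sketch. This is not how docling-graph implements it internally; resolve_setting is a hypothetical helper, and the values passed in are placeholders rather than the real PipelineConfig defaults.

```shell
# Hypothetical helper: prints the first non-empty value among
# (1) the command-line argument, (2) the config.yaml value, (3) the built-in default.
resolve_setting() {
    cli_value="$1"
    config_value="$2"
    default_value="$3"
    if [ -n "$cli_value" ]; then
        printf '%s\n' "$cli_value"
    elif [ -n "$config_value" ]; then
        printf '%s\n' "$config_value"
    else
        printf '%s\n' "$default_value"
    fi
}

resolve_setting "remote" "local" "local"   # prints "remote": the CLI flag wins
resolve_setting ""       "local" "remote"  # prints "local": falls back to config.yaml
resolve_setting ""       ""      "local"   # prints "local": falls back to the default
```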

Environment Variables

API Keys

# Remote providers
export MISTRAL_API_KEY="your-key"
export OPENAI_API_KEY="your-key"
export GEMINI_API_KEY="your-key"
export WATSONX_API_KEY="your-key"
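
Before starting a long conversion, it can help to confirm that the key for the chosen provider is actually set. The sketch below is a hypothetical pre-flight check, not part of the CLI. The mistral mapping follows the error message documented on this page; the provider names openai, gemini, and watsonx are assumptions inferred from the variable names.

```shell
# Hypothetical mapping from provider name to its API key variable.
# Only "mistral" is confirmed by this page; the other names are assumed.
api_key_var_for() {
    case "$1" in
        mistral) echo "MISTRAL_API_KEY" ;;
        openai)  echo "OPENAI_API_KEY" ;;
        gemini)  echo "GEMINI_API_KEY" ;;
        watsonx) echo "WATSONX_API_KEY" ;;
        *)       return 1 ;;
    esac
}

# Returns 0 if the key is set, 1 if it is missing, 2 for an unknown provider.
require_api_key() {
    var_name="$(api_key_var_for "$1")" || {
        echo "Unknown provider: $1" >&2
        return 2
    }
    # Indirect lookup of the variable whose name is stored in $var_name.
    eval "value=\${$var_name:-}"
    if [ -z "$value" ]; then
        echo "Missing $var_name for provider '$1'" >&2
        return 1
    fi
}

# Example usage (commented out):
# require_api_key mistral && docling-graph convert document.pdf \
#     --template "templates.BillingDocument"
```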

Local Providers

# vLLM base URL (default: http://localhost:8000/v1)
export VLLM_BASE_URL="http://custom-host:8000/v1"

# Ollama base URL (default: http://localhost:11434)
export OLLAMA_BASE_URL="http://custom-host:11434"
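
To see which endpoint a shell session would currently resolve to, the documented defaults can be mirrored with parameter expansion. This is a sketch only: effective_base_url is a hypothetical helper, and it reproduces the defaults listed above rather than querying docling-graph itself.

```shell
# Hypothetical helper: prints the base URL for a local provider,
# using the environment variable if set and the documented default otherwise.
effective_base_url() {
    case "$1" in
        vllm)   printf '%s\n' "${VLLM_BASE_URL:-http://localhost:8000/v1}" ;;
        ollama) printf '%s\n' "${OLLAMA_BASE_URL:-http://localhost:11434}" ;;
        *)      echo "Unknown local provider: $1" >&2; return 1 ;;
    esac
}

# Example usage (commented out):
# echo "vLLM endpoint:   $(effective_base_url vllm)"
# echo "Ollama endpoint: $(effective_base_url ollama)"
```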

Output Structure

Default output directory structure:

outputs/
├── metadata.json          # Pipeline metadata
├── docling/               # Docling conversion output
│   ├── document.json      # Docling format
│   └── document.md        # Markdown export
└── docling_graph/         # Graph outputs
    ├── graph.json         # Complete graph
    ├── nodes.csv          # Node data
    ├── edges.csv          # Edge data
    ├── graph.html         # Interactive visualization
    └── report.md          # Summary report
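
After a run, the layout above can be checked with a short script. The sketch below is a hypothetical helper, not a CLI feature. It assumes every file in the default tree is produced; runs with different export settings may legitimately omit some of them.

```shell
# Hypothetical helper: prints each file from the documented default layout
# that is missing under the output directory given as $1.
missing_outputs() {
    out_dir="${1%/}"                       # tolerate a trailing slash
    for rel in \
        metadata.json \
        docling/document.json \
        docling/document.md \
        docling_graph/graph.json \
        docling_graph/nodes.csv \
        docling_graph/edges.csv \
        docling_graph/graph.html \
        docling_graph/report.md
    do
        [ -f "$out_dir/$rel" ] || printf '%s\n' "$rel"
    done
}

# Example usage (commented out):
# missing="$(missing_outputs outputs/)"
# [ -z "$missing" ] || printf 'Missing files:\n%s\n' "$missing"
```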

Error Handling

Common Errors

Configuration Error:

[red]Configuration Error:[/red] Invalid backend type: 'invalid'

Solution: Set the backend to llm or vlm (for example, --backend llm).

Extraction Error:

[red]Extraction Error:[/red] Template not found: 'templates.Missing'

Solution: Check the template path and make sure the module is importable.

Pipeline Error:

[red]Pipeline Error:[/red] API key not found for provider: mistral

Solution: Set the MISTRAL_API_KEY environment variable.

Verbose Mode

Enable verbose logging for debugging:

docling-graph --verbose convert document.pdf \
    --template "templates.BillingDocument"

Best Practices

👍 Use Configuration Files

# ✅ Good - Reusable configuration
docling-graph init
docling-graph convert document.pdf -t "templates.BillingDocument"

# ❌ Avoid - Repeating options
docling-graph convert document.pdf \
    --template "templates.BillingDocument" \
    --backend llm \
    --inference remote \
    --provider mistral \
    --model mistral-large-latest

👍 Organize Output

# ✅ Good - Organized by document
docling-graph convert invoice_001.pdf \
    --template "templates.BillingDocument" \
    --output-dir "outputs/invoice_001"

# ❌ Avoid - Overwriting outputs
docling-graph convert invoice_001.pdf \
    --template "templates.BillingDocument"

👍 Use Verbose for Development

# ✅ Good - Debug during development
docling-graph --verbose convert document.pdf \
    --template "templates.BillingDocument"

# ✅ Good - Default (non-verbose) logging in production
docling-graph convert document.pdf \
    --template "templates.BillingDocument"

Next Steps

Explore each command in detail:

  1. init Command → - Configuration setup
  2. convert Command → - Document conversion
  3. inspect Command → - Graph visualization
  4. CLI Recipes → - Common patterns
