Distill

License: MIT | Python 3.12+

Transform YouTube videos and podcast episodes into readable articles using LLM-powered summarization.

Features

  • YouTube processing — Extract captions or transcribe audio, then generate articles
  • Podcast support — Parse RSS feeds, browse episodes, download and transcribe audio
  • Podcast favorites — Save frequently used podcasts as favorites for quick interactive access
  • Language selection — Interactive language picker remembers recently used languages per podcast
  • Email selector — Interactive prompt to send articles via email after processing
  • Multiple article styles — Detailed, concise, summary, or bullet-point formats
  • Multiple output formats — Markdown, HTML, or EPUB
  • Subscription management — Subscribe to podcast feeds and sync for new episodes
  • Local caching — SQLite database stores transcripts and articles for fast regeneration
  • Email delivery — Send articles to your inbox via Resend
  • Configurable — TOML config file with environment variable overrides
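
The local caching feature can be pictured with a small sketch. This is only an illustration under assumptions: the table name, column names, and the `get_or_create_transcript` helper are hypothetical and are not taken from Distill's actual `db.py`.

```python
import sqlite3
from typing import Callable


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open a cache database (hypothetical schema, in-memory by default)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS transcripts ("
        "content_id TEXT PRIMARY KEY, text TEXT NOT NULL)"
    )
    return conn


def get_or_create_transcript(
    conn: sqlite3.Connection, content_id: str, transcribe: Callable[[], str]
) -> str:
    """Return the cached transcript, calling `transcribe` only on a cache miss."""
    row = conn.execute(
        "SELECT text FROM transcripts WHERE content_id = ?", (content_id,)
    ).fetchone()
    if row is not None:
        return row[0]
    text = transcribe()
    conn.execute(
        "INSERT INTO transcripts (content_id, text) VALUES (?, ?)",
        (content_id, text),
    )
    conn.commit()
    return text
```

With a cache like this, regenerating an article in another style would reuse the stored transcript instead of transcribing the audio again.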

Installation

Requires Python 3.12+ and uv.

git clone https://github.com/christopherklint97/distill.git
cd distill
uv sync

For local Whisper transcription (optional, large download):

uv sync --extra whisper

Environment Variables

export ANTHROPIC_API_KEY="your-key"    # Required — Claude API
export OPENAI_API_KEY="your-key"       # Required — Whisper API (default transcription backend)
export RESEND_API_KEY="re_xxx"         # Optional — Email delivery via Resend

Usage

YouTube

# Process a YouTube video (default: markdown, detailed style)
distill youtube "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

# Specify format and style
distill youtube "https://youtu.be/dQw4w9WgXcQ" --format html --style concise

# Save to a specific directory
distill youtube "https://youtube.com/watch?v=abc123" --output ./articles/

# Set transcription language
distill youtube "https://youtube.com/watch?v=abc123" --language sv

# Transcribe in Swedish but write the article in English
distill youtube "https://youtube.com/watch?v=abc123" --language sv --article-language en

# Send the article via email
distill youtube "https://youtube.com/watch?v=abc123" --send email

Podcasts

# Interactive mode — pick from favorites, recents, or add a new podcast
# After selecting a podcast and episode, you'll be prompted to choose a language
# and optionally send the article via email
distill podcast

# Browse and select an episode from a specific feed
distill podcast "https://example.com/feed.xml"

# Skip the language selector by passing --language directly
distill podcast "https://example.com/feed.xml" --language sv

# Process a direct audio URL
distill podcast-episode "https://example.com/episode.mp3" --title "Episode Name"

Favorites & Subscriptions

# Mark a podcast as a favorite (subscribes if needed)
distill favorite "https://example.com/feed.xml"

# Remove from favorites
distill unfavorite "https://example.com/feed.xml"

# Subscribe to a feed
distill subscribe "https://example.com/feed.xml" --auto-process

# List subscriptions (shows favorite status)
distill subscriptions

# Check for new episodes
distill sync

History & Regeneration

# View processing history
distill history

# Regenerate with a different style
distill regenerate <content-id> --style bullets --format epub

# Regenerate in a different language
distill regenerate <content-id> --article-language sv

Configuration

# Show current config
distill config show

# Change settings
distill config set whisper.backend local
distill config set claude.model claude-sonnet-4-6

Config file location: ~/.config/distill/config.toml

[general]
output_dir = "~/Documents/distill"
default_format = "markdown"
default_style = "detailed"

[whisper]
backend = "api"      # "api" or "local"
model = "base"       # tiny, base, small, medium, large
language = "en"

[claude]
model = "claude-sonnet-4-6"
max_tokens = 8192

[email]
to = "you@example.com"
from_addr = "Distill <distill@resend.dev>"

[subscriptions]
check_interval_hours = 24
auto_process = false

Environment variable overrides: DISTILL_EMAIL_TO and DISTILL_EMAIL_FROM take precedence over the to and from_addr values in the [email] section.

Article Styles

| Style | Description |
|---|---|
| detailed | Comprehensive article preserving most content |
| concise | Key points and highlights (~30% of original) |
| summary | Executive summary in 3-5 paragraphs |
| bullets | Structured bullet-point notes |
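
One way to think about the styles is as a lookup from style name to an instruction given to the model. The sketch below is purely illustrative: the instruction wording and the `build_prompt` helper are invented for this example and are not the templates in `article/prompts.py`.

```python
from enum import Enum


class ArticleStyle(str, Enum):
    DETAILED = "detailed"
    CONCISE = "concise"
    SUMMARY = "summary"
    BULLETS = "bullets"


# Hypothetical instruction text, paraphrasing the descriptions above.
STYLE_INSTRUCTIONS = {
    ArticleStyle.DETAILED: "Write a comprehensive article that preserves most of the content.",
    ArticleStyle.CONCISE: "Write the key points and highlights at roughly 30% of the original length.",
    ArticleStyle.SUMMARY: "Write an executive summary in 3-5 paragraphs.",
    ArticleStyle.BULLETS: "Write structured bullet-point notes.",
}


def build_prompt(transcript: str, style: str) -> str:
    """Combine a style instruction with the transcript; unknown styles raise ValueError."""
    chosen = ArticleStyle(style)
    return f"{STYLE_INSTRUCTIONS[chosen]}\n\nTranscript:\n{transcript}"
```

Validating the style through the enum means a typo such as `--style bullet` would fail early rather than silently falling back to a default.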

Development

# Run tests
uv run pytest

# Lint
uv run ruff check src/ tests/

# Type check
uv run mypy src/

# Format
uv run ruff format src/ tests/

Architecture

src/distill/
├── cli.py              # Typer CLI commands
├── config.py           # TOML config loading
├── db.py               # SQLite storage layer
├── models.py           # Pydantic data models
├── sources/
│   ├── youtube.py      # YouTube URL parsing & transcript fetching
│   └── podcast.py      # RSS feed parsing & audio download
├── transcription/
│   ├── base.py         # Abstract transcriber interface
│   ├── whisper_local.py # Local Whisper model
│   └── whisper_api.py  # OpenAI Whisper API
├── article/
│   ├── prompts.py      # LLM prompt templates
│   └── generator.py    # Article generation orchestration
└── output/
    ├── markdown.py     # Markdown renderer
    ├── html.py         # HTML renderer
    ├── epub.py         # EPUB renderer
    └── email.py        # Email delivery via Resend API
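
The `transcription/` package separates an abstract interface from its local and API backends. A minimal sketch of that pattern follows. The class names, method signature, and `get_transcriber` factory are assumptions, and the backends here are stand-ins that do not call Whisper.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class Transcriber(ABC):
    """Hypothetical interface; the real one in transcription/base.py may differ."""

    @abstractmethod
    def transcribe(self, audio_path: Path, language: str = "en") -> str:
        """Return the transcript text for one audio file."""


class FakeTranscriber(Transcriber):
    """Stand-in backend that labels its output instead of running a model."""

    def __init__(self, label: str) -> None:
        self.label = label

    def transcribe(self, audio_path: Path, language: str = "en") -> str:
        return f"[{self.label}/{language}] transcript of {audio_path.name}"


# Keys mirror the documented whisper.backend values: "api" or "local".
BACKENDS = {
    "api": lambda: FakeTranscriber("api"),
    "local": lambda: FakeTranscriber("local"),
}


def get_transcriber(backend: str) -> Transcriber:
    """Pick a backend by config value, rejecting unknown names."""
    try:
        return BACKENDS[backend]()
    except KeyError:
        raise ValueError(f"unknown whisper backend: {backend!r}") from None
```

Because callers depend only on `Transcriber`, switching `whisper.backend` between `api` and `local` would not require changes elsewhere in the pipeline.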
