Quick CLI for local text-to-speech with two backends: Qwen3-TTS (default) and Kokoro TTS.
Recommended (fast, reproducible):
uv tool install ltts
Run without installing:
uvx ltts "hello world" --say
With pip:
pip install ltts
For faster inference on NVIDIA GPUs:
pip install 'ltts[cuda]'
# Generate speech (saves to output.mp3 by default)
ltts "Hello, world!"
# Play through speakers
ltts "Hello, world!" --say
# Save to specific file
ltts "Hello, world!" -o speech.wav
# Read from stdin
echo "Hello from pipe" | ltts --say
cat article.txt | ltts -o article.mp3
Qwen3-TTS (default backend): higher quality, with voice cloning and emotional control. Supports 10 languages.
# Preset voices
ltts "Hello, world!" -v Ryan --say # English male (default)
ltts "Hello, world!" -v Aiden --say # English male
ltts "你好世界" -v Vivian --say # Chinese female
ltts "こんにちは" -v Ono_Anna --say # Japanese female
ltts "안녕하세요" -v Sohee --say # Korean female
# Voice cloning (3+ seconds of reference audio)
ltts "Hello in your voice" --ref-audio voice.wav --say
ltts "Hello" --ref-audio voice.wav --ref-text "transcript" --say
# Emotional control
ltts "I can't believe we won!" --instruct "speak with excitement" --say
# Smaller model for faster inference
ltts "Hello world" --model-size 0.6B --say
Preset voices: Ryan, Aiden (English), Vivian, Serena, Dylan, Eric, Uncle_Fu (Chinese), Ono_Anna (Japanese), Sohee (Korean)
Languages: en, zh, ja, ko, de, fr, es, pt, it, ru
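To compare the preset voices side by side, it can help to render the same sentence once per voice. The following is a minimal sketch, not part of ltts: `audition_voices` is a hypothetical helper name, and the `sample_<voice>.mp3` naming is an assumption. It only uses the `-v` and `-o` flags shown above.

```shell
# audition_voices TEXT VOICE...  (hypothetical helper, not shipped with ltts)
# Renders TEXT once per voice into sample_<voice>.mp3 in the current directory.
audition_voices() {
  av_text=$1
  shift
  for av_voice in "$@"; do
    # -v selects the voice, -o the output file; stop at the first failure
    ltts "$av_text" -v "$av_voice" -o "sample_${av_voice}.mp3" || return 1
  done
}

# Example:
#   audition_voices "Hello, world!" Ryan Aiden
```

Writing to files rather than using --say makes it easy to replay and compare the results afterwards.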
Kokoro TTS: lightweight, with 50+ voices. Supports streaming for faster time-to-first-audio.
# Use Kokoro backend
ltts "Hello world" -b kokoro -v af_heart --say
ltts "こんにちは" -b kokoro -v jf_alpha --say
# Stream chunks as generated (lower latency)
ltts "Hello world" -b kokoro --say --chunk
Voices: af_heart, af_alloy, af_bella, am_adam, am_michael (American), bf_alice, bf_emma, bm_daniel (British), jf_alpha, jm_kumo (Japanese), zf_xiaobei, zm_yunxi (Chinese), ef_dora, em_alex (Spanish), ff_siwis (French), and more.
Full voice list: https://huggingface.co/hexgrad/Kokoro-82M/blob/main/VOICES.md
# Device selection
ltts "Hello" -d cpu --say # CPU (default)
ltts "Hello" -d cuda --say # NVIDIA GPU
ltts "Hello" -d mps --say # Apple Silicon
# Output formats
ltts "test" -o out.mp3 # MP3 (default)
ltts "test" -o out.wav # WAV
ltts "test" -o out.ogg # OGG
ltts "test" -o out.flac # FLAC
# Language override
ltts "Bonjour" -l fr --say
- First run downloads models to ~/.cache/huggingface/ (~3 GB for Qwen 1.7B, ~330 MB for Kokoro)
- Audio playback (--say) runs at 24 kHz
- On Linux, ensure PulseAudio/PipeWire is running for audio playback
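For batch jobs, the device choice and stdin input described above can be combined in a small script. This is a rough sketch under stated assumptions: `pick_device` and `convert_dir` are hypothetical helpers, the presence of `nvidia-smi` is only a heuristic for a usable CUDA setup (it does not check that `ltts[cuda]` is installed), and combining `-d` with stdin input is assumed to work like the individual examples.

```shell
# pick_device: hypothetical helper that guesses a value for -d.
pick_device() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    echo cuda   # NVIDIA driver tools found on PATH (heuristic only)
  elif [ "$(uname -s)" = "Darwin" ] && [ "$(uname -m)" = "arm64" ]; then
    echo mps    # Apple Silicon
  else
    echo cpu
  fi
}

# convert_dir [DIR]: hypothetical helper that converts every .txt file in DIR
# (default: current directory) into an .mp3 with the same base name.
convert_dir() {
  cv_dir=${1:-.}
  cv_device=$(pick_device)
  for cv_file in "$cv_dir"/*.txt; do
    [ -e "$cv_file" ] || continue   # no .txt files: the glob stays literal
    # Text is read from stdin, as in the pipe examples above
    ltts -d "$cv_device" -o "${cv_file%.txt}.mp3" < "$cv_file" || return 1
  done
}

# Example:
#   convert_dir ./articles
```

If the guessed device turns out to be wrong for a given machine, passing -d explicitly as in the device selection examples remains the safer option.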
uv sync
uv run ltts "hello world" --say
uv run ltts "hello world" -b kokoro -v af_heart --say
./scripts/release.sh