Kitten TTS

New: Kitten TTS v0.8 is out -- 15M, 40M, and 80M parameter models now available.

Kitten TTS is an open-source, lightweight text-to-speech library built on ONNX. With models ranging from 15M to 80M parameters (25-80 MB on disk), it delivers high-quality voice synthesis on CPU without requiring a GPU.

Status: Developer preview -- APIs may change between releases.

Commercial support is available. For integration assistance, custom voices, or enterprise licensing, contact us.

Features

Ultra-lightweight -- Model sizes from 25 MB (int8) to 80 MB, suitable for edge deployment
CPU-optimized -- ONNX-based inference runs efficiently without a GPU
8 built-in voices -- Bella, Jasper, Luna, Bruno, Rosie, Hugo, Kiki, and Leo
Adjustable speech speed -- Control playback rate via the speed parameter
Text preprocessing -- Built-in pipeline handles numbers, currencies, units, and more
24 kHz output -- High-quality audio at a standard sample rate

Available Models

Model	Parameters	Size	Download
kitten-tts-mini	80M	80 MB	KittenML/kitten-tts-mini-0.8
kitten-tts-micro	40M	41 MB	KittenML/kitten-tts-micro-0.8
kitten-tts-nano	15M	56 MB	KittenML/kitten-tts-nano-0.8
kitten-tts-nano (int8)	15M	25 MB	KittenML/kitten-tts-nano-0.8-int8

Note: Some users have reported issues with the kitten-tts-nano-0.8-int8 model. If you encounter problems, please open an issue.
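
As a rough illustration of how the table above might guide model choice, the sketch below picks the highest-parameter model that fits a given disk budget. The sizes and repository IDs are copied from the table; the `ModelInfo` type, the `pick_model` helper, and its tie-breaking rule are hypothetical and not part of the kittentts package.

```python
from typing import NamedTuple, Optional


class ModelInfo(NamedTuple):
    repo_id: str
    params_millions: int
    size_mb: int


# Values copied from the "Available Models" table.
MODELS = [
    ModelInfo("KittenML/kitten-tts-mini-0.8", 80, 80),
    ModelInfo("KittenML/kitten-tts-micro-0.8", 40, 41),
    ModelInfo("KittenML/kitten-tts-nano-0.8", 15, 56),
    ModelInfo("KittenML/kitten-tts-nano-0.8-int8", 15, 25),
]


def pick_model(max_size_mb: int) -> Optional[ModelInfo]:
    """Return the largest-parameter model that fits, or None (hypothetical helper)."""
    candidates = [m for m in MODELS if m.size_mb <= max_size_mb]
    if not candidates:
        return None
    # Prefer more parameters; break ties by the smaller download.
    return max(candidates, key=lambda m: (m.params_millions, -m.size_mb))


choice = pick_model(60)
print(choice.repo_id if choice else "no model fits")
```

The returned `repo_id` is the string that would be passed to `KittenTTS(...)`.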

Demo

Demo video: final_vid.mp4

Try it online

Try Kitten TTS directly in your browser on Hugging Face Spaces.

Quick Start

Prerequisites

Python 3.8 or later
pip

Installation

pip install https://github.com/KittenML/KittenTTS/releases/download/0.8.1/kittentts-0.8.1-py3-none-any.whl

Basic Usage

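import soundfile as sf
from kittentts import KittenTTS

# Load the 80M-parameter model from Hugging Face Hub
model = KittenTTS("KittenML/kitten-tts-mini-0.8")

# Synthesize speech; the result is a NumPy array of samples at 24 kHz
audio = model.generate("This high-quality TTS model runs without a GPU.", voice="Jasper")

# Write the samples to a WAV file at the 24 kHz output rate
sf.write("output.wav", audio, 24000)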
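
If soundfile is not available, a WAV file can also be written with Python's standard-library wave module. The sketch below assumes the samples are mono floats in the range [-1.0, 1.0], which is common for TTS output but is not confirmed by this README. `write_wav_pcm16` is a hypothetical helper, and the sine tone only stands in for the array returned by `model.generate(...)`.

```python
import math
import os
import struct
import tempfile
import wave


def write_wav_pcm16(path, samples, sample_rate=24000):
    """Write mono float samples (assumed in [-1.0, 1.0]) as 16-bit PCM WAV."""
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, float(s)))  # guard against overshoot
        frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))


# Stand-in for model output: 0.5 s of a 440 Hz tone at 24 kHz.
tone = [0.3 * math.sin(2 * math.pi * 440 * n / 24000) for n in range(12000)]
wav_path = os.path.join(tempfile.gettempdir(), "kitten_tone.wav")
write_wav_pcm16(wav_path, tone)
```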

Advanced Usage

# Adjust speech speed (default: 1.0)
audio = model.generate("Hello, world.", voice="Luna", speed=1.2)

# Save directly to a file
model.generate_to_file("Hello, world.", "output.wav", voice="Bruno", speed=0.9)

# List available voices
print(model.available_voices)
# ['Bella', 'Jasper', 'Luna', 'Bruno', 'Rosie', 'Hugo', 'Kiki', 'Leo']
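
Voice names are passed as plain strings, so it may help to validate user input before calling `generate`. The following is a minimal sketch: `resolve_voice` is a hypothetical helper, not part of kittentts, and the hard-coded list mirrors the names in this README. With a loaded model, `model.available_voices` would be the better source.

```python
# Voice names as listed in this README.
AVAILABLE_VOICES = ["Bella", "Jasper", "Luna", "Bruno", "Rosie", "Hugo", "Kiki", "Leo"]


def resolve_voice(requested, available=AVAILABLE_VOICES):
    """Match a voice name case-insensitively; raise ValueError if unknown."""
    by_lower = {name.lower(): name for name in available}
    try:
        return by_lower[requested.strip().lower()]
    except KeyError:
        options = ", ".join(available)
        raise ValueError(f"Unknown voice {requested!r}; choose one of {options}") from None


print(resolve_voice(" luna "))
```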

Using with GPU

pip install -r requirements_gpu.txt

from kittentts import KittenTTS

model = KittenTTS("KittenML/kitten-tts-mini-0.8", backend="cuda")

See example_cuda.py for a complete example.
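
Because inference is ONNX-based, one way to decide between GPU and CPU at runtime is to ask ONNX Runtime which execution providers it can use. This is a sketch under assumptions: `pick_backend` is a hypothetical helper, and only `backend="cuda"` appears in this README, so the `"cpu"` value is an assumed name that should be checked against the library before use.

```python
def pick_backend():
    """Return "cuda" when ONNX Runtime reports a CUDA provider, else "cpu"."""
    try:
        import onnxruntime as ort
    except ImportError:
        return "cpu"  # ONNX Runtime not installed; "cpu" is an assumed backend name
    providers = ort.get_available_providers()
    return "cuda" if "CUDAExecutionProvider" in providers else "cpu"


backend = pick_backend()
print(backend)
# With kittentts installed, the result could be passed on as:
# model = KittenTTS("KittenML/kitten-tts-mini-0.8", backend=backend)
```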

API Reference

`KittenTTS(model_name, cache_dir=None)`

Load a model from Hugging Face Hub.

Parameter	Type	Default	Description
`model_name`	`str`	`"KittenML/kitten-tts-nano-0.8"`	Hugging Face repository ID
`cache_dir`	`str`	`None`	Local directory for caching downloaded model files

`model.generate(text, voice, speed, clean_text)`

Synthesize speech from text, returning a NumPy array of audio samples at 24 kHz.

Parameter	Type	Default	Description
`text`	`str`	--	Input text to synthesize
`voice`	`str`	`"expr-voice-5-m"`	Voice name (see available voices)
`speed`	`float`	`1.0`	Speech speed multiplier
`clean_text`	`bool`	`False`	Preprocess text (expand numbers, currencies, etc.)
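
Since the output rate is fixed at 24 kHz, clip length and sample counts convert with simple arithmetic. The helpers below are illustrative only and not part of kittentts. With a real result, `duration_seconds(len(audio))` would give the clip length, assuming `audio` is a one-dimensional array.

```python
SAMPLE_RATE = 24000  # Hz, as stated in this README


def duration_seconds(num_samples, sample_rate=SAMPLE_RATE):
    """Length of a clip in seconds, given its sample count."""
    return num_samples / sample_rate


def samples_for(seconds, sample_rate=SAMPLE_RATE):
    """Number of samples needed to cover a duration, e.g. for a silence gap."""
    return int(round(seconds * sample_rate))


print(duration_seconds(36000))  # 1.5
print(samples_for(0.25))        # 6000
```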

`model.generate_to_file(text, output_path, voice, speed, sample_rate, clean_text)`

Synthesize speech and write directly to an audio file.

Parameter	Type	Default	Description
`text`	`str`	--	Input text to synthesize
`output_path`	`str`	--	Path to save the audio file
`voice`	`str`	`"expr-voice-5-m"`	Voice name
`speed`	`float`	`1.0`	Speech speed multiplier
`sample_rate`	`int`	`24000`	Audio sample rate in Hz
`clean_text`	`bool`	`True`	Preprocess text (expand numbers, currencies, etc.)

`normalize_text(text, locale="en-US", return_spans=False)`

Normalize text for TTS without generating audio.

from kittentts import normalize_text

normalized = normalize_text("Dr. Rivera paid $12.50 at 3:05 p.m.")
# "Doctor Rivera paid twelve dollars and fifty cents at three oh five p m."

result = normalize_text("Fig. 2", return_spans=True)
print(result.text)
print(result.spans)

When return_spans=True, the result includes original-to-normalized character spans for changed segments such as abbreviations, dates, times, numbers, currency, URLs, and punctuation.
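
Span data of this kind is typically used to map a position in the normalized text back to the original, for example to highlight the word being spoken. The README does not document the layout of `result.spans`, so the `Span` class below is a hypothetical stand-in with half-open character ranges, and the normalized string is illustrative rather than verified kittentts output.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Span:
    # Hypothetical layout: half-open character ranges in each string.
    orig_start: int
    orig_end: int
    norm_start: int
    norm_end: int


def original_range(spans: List[Span], norm_index: int) -> Optional[Tuple[int, int]]:
    """Return the original-text range whose normalized range covers norm_index."""
    for span in spans:
        if span.norm_start <= norm_index < span.norm_end:
            return (span.orig_start, span.orig_end)
    return None  # index falls in text that normalization left unchanged


original = "Fig. 2"
normalized = "Figure two"  # illustrative, not verified against kittentts
spans = [Span(0, 4, 0, 6), Span(5, 6, 7, 10)]

start, end = original_range(spans, normalized.index("two"))
print(original[start:end])  # 2
```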

`model.available_voices`

Returns a list of available voice names: ['Bella', 'Jasper', 'Luna', 'Bruno', 'Rosie', 'Hugo', 'Kiki', 'Leo']

System Requirements

Operating system: Linux, macOS, or Windows
Python: 3.8 or later
Hardware: Runs on CPU; no GPU required
Disk space: 25-80 MB depending on model variant

A virtual environment (conda, venv, or similar) is recommended to avoid dependency conflicts.

Roadmap

Commercial Support

We offer commercial support for teams integrating Kitten TTS into their products. This includes integration assistance, custom voice development, and enterprise licensing.

Contact us or email info@stellonlabs.com to discuss your requirements.

Community and Support

Discord: Join the community
Website: kittenml.com
Custom support: Request form
Email: info@stellonlabs.com
Issues: GitHub Issues

License

This project is licensed under the Apache License 2.0.
