Voice Mode Installation Tools

Voice Mode now includes MCP tools to automatically install and configure whisper.cpp and kokoro-fastapi, making it easier to set up free, private, open-source voice services.

Overview

These tools handle:

  • System detection (macOS/Linux)
  • Dependency installation
  • GPU support configuration
  • Model downloads
  • Service configuration
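
The system-detection step can be as simple as inspecting platform.system(); a minimal sketch of the idea (the actual tool internals may differ):

```python
import platform

def detect_os() -> str:
    """Map platform.system() to the OS names these tools support."""
    system = platform.system()
    if system == "Darwin":
        return "macos"
    if system == "Linux":
        return "linux"
    return "unsupported"
```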

Available Tools

install_whisper_cpp

Installs whisper.cpp for speech-to-text (STT) functionality.

Features

  • Automatic OS detection (macOS/Linux)
  • GPU acceleration (Metal on macOS, CUDA on Linux)
  • Model download management
  • Build optimization
  • Service configuration (launchd on macOS, systemd on Linux)
  • Environment variable support for model selection

Usage

# Basic installation with defaults
result = await install_whisper_cpp()

# Custom installation
result = await install_whisper_cpp(
    install_dir="~/my-whisper",
    model="large-v3",
    use_gpu=True,
    force_reinstall=False
)

Parameters

  • install_dir (str, optional): Installation directory (default: ~/.voicemode/whisper.cpp)
  • model (str, optional): Whisper model to download (default: large-v2)
    • Available models: tiny, tiny.en, base, base.en, small, small.en, medium, medium.en, large-v2, large-v3
    • Note: large-v2 is the default for best accuracy (requires ~3GB RAM)
  • use_gpu (bool, optional): Enable GPU support (default: auto-detect)
  • force_reinstall (bool, optional): Force reinstallation (default: false)
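
A hypothetical validation helper for the model parameter, built from the model list above (this helper is illustrative, not part of the actual tool API):

```python
# Model names taken from the documented list above.
VALID_MODELS = {
    "tiny", "tiny.en", "base", "base.en", "small", "small.en",
    "medium", "medium.en", "large-v2", "large-v3",
}

def validate_model(model: str = "large-v2") -> str:
    """Return the model name if valid, otherwise raise with the allowed list."""
    if model not in VALID_MODELS:
        raise ValueError(f"Unknown model {model!r}; choose one of {sorted(VALID_MODELS)}")
    return model
```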

Return Value

{
    "success": True,
    "install_path": "/Users/user/.voicemode/whisper.cpp",
    "model_path": "/Users/user/.voicemode/whisper.cpp/models/ggml-large-v2.bin",
    "gpu_enabled": True,
    "gpu_type": "metal",  # or "cuda" or "cpu"
    "performance_info": {
        "system": "Darwin",
        "gpu_acceleration": "metal",
        "model": "large-v2",
        "binary_path": "/Users/user/.voicemode/whisper.cpp/main",
        "server_port": 2022,
        "server_url": "http://localhost:2022"
    },
    "launchagent": "/Users/user/Library/LaunchAgents/com.voicemode.whisper-server.plist",  # macOS
    "systemd_service": "/home/user/.config/systemd/user/whisper-server.service",  # Linux
    "start_script": "/Users/user/.voicemode/whisper.cpp/start-whisper-server.sh"
}

install_kokoro_fastapi

Installs kokoro-fastapi for text-to-speech (TTS) functionality.

Features

  • Python environment management with UV
  • Automatic model downloads
  • Service configuration (launchd on macOS, systemd on Linux)
  • Auto-start capability

Usage

# Basic installation with defaults
result = await install_kokoro_fastapi()

# Custom installation
result = await install_kokoro_fastapi(
    install_dir="~/my-kokoro",
    models_dir="~/my-models",
    port=8881,
    auto_start=True,
    install_models=True,
    force_reinstall=False
)

Parameters

  • install_dir (str, optional): Installation directory (default: ~/.voicemode/kokoro-fastapi)
  • models_dir (str, optional): Models directory (default: ~/.voicemode/kokoro-models)
  • port (int, optional): Service port (default: 8880)
  • auto_start (bool, optional): Start service after installation (default: true)
  • install_models (bool, optional): Download Kokoro models (default: true)
  • force_reinstall (bool, optional): Force reinstallation (default: false)

Return Value

{
    "success": True,
    "install_path": "/home/user/.voicemode/kokoro-fastapi",
    "service_url": "http://127.0.0.1:8880",
    "service_status": "managed_by_systemd",  # Linux
    "service_status": "managed_by_launchd",  # macOS
    "systemd_service": "/home/user/.config/systemd/user/kokoro-fastapi-8880.service",  # Linux
    "launchagent": "/Users/user/Library/LaunchAgents/com.voicemode.kokoro-8880.plist",  # macOS
    "start_script": "/home/user/.voicemode/kokoro-fastapi/start-cpu.sh",
    "message": "Kokoro-fastapi installed. Run: cd /home/user/.voicemode/kokoro-fastapi && ./start-cpu.sh"
}

System Requirements

whisper.cpp

macOS

  • Xcode Command Line Tools
  • Homebrew (for cmake)
  • Metal support (built-in)

Linux

  • Build essentials (gcc, g++, make)
  • CMake
  • CUDA toolkit (optional, for NVIDIA GPU support)

kokoro-fastapi

All Systems

  • Python 3.10+
  • Git
  • ~5GB disk space for models
  • UV package manager (installed automatically if missing)
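
The ~5GB model requirement can be checked up front with shutil.disk_usage; a sketch (the threshold comes from the requirement above):

```python
import shutil

def has_space_for_models(path: str = ".", required_gb: float = 5.0) -> bool:
    """True if the filesystem containing `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3
```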

Integration with Voice Mode

After installation, the services integrate automatically with Voice Mode:

  1. whisper.cpp:

    • Starts automatically via the installed service (port 2022)
    • OpenAI-compatible API endpoint
    • Model selection via VOICEMODE_WHISPER_MODEL environment variable
    • View installed models: claude resource read whisper://models
  2. kokoro-fastapi:

    • Automatically detected by Voice Mode's provider registry when running
    • 67 voices available
    • OpenAI-compatible API endpoint
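
Because both services expose OpenAI-compatible endpoints on localhost, client configuration reduces to a pair of base URLs plus the model environment variable; a sketch using the default ports documented above:

```python
import os

def voice_service_config() -> dict:
    """Build client settings for the locally installed voice services."""
    return {
        "stt_base_url": "http://localhost:2022",   # whisper.cpp server (default port)
        "tts_base_url": "http://127.0.0.1:8880",   # kokoro-fastapi (default port)
        "whisper_model": os.environ.get("VOICEMODE_WHISPER_MODEL", "large-v2"),
    }
```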

Examples

Complete Setup

# Install both services with defaults
whisper_result = await install_whisper_cpp()  # Uses large-v2 by default
kokoro_result = await install_kokoro_fastapi()

# Check installation status
if whisper_result["success"] and kokoro_result["success"]:
    print("Voice services installed successfully!")
    print(f"Whisper: {whisper_result['install_path']}")
    print(f"Whisper server: {whisper_result['performance_info']['server_url']}")
    print(f"Kokoro API: {kokoro_result['service_url']}")

Upgrade Existing Installation

# Force reinstall with larger model
result = await install_whisper_cpp(
    model="large-v3",
    force_reinstall=True
)

Custom Configuration

# Install kokoro-fastapi on different port
result = await install_kokoro_fastapi(
    port=9000,
    models_dir="/opt/models/kokoro"
)

Troubleshooting

Common Issues

  1. Missing Dependencies

    • The tools will report missing dependencies with installation instructions
    • Follow the provided commands to install required packages
  2. Port Conflicts

    • If port 8880 is in use, specify a different port for kokoro-fastapi
    • Check running services: lsof -i :8880
  3. GPU Not Detected

    • On Linux, ensure NVIDIA drivers and CUDA are installed
    • Use nvidia-smi to verify GPU availability
    • Force CPU mode with use_gpu=False if needed
  4. Model Download Failures

    • Check internet connection
    • Verify sufficient disk space
    • Try smaller models first (tiny, base)
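
As a cross-platform alternative to lsof for the port-conflict case, a port can be probed directly from Python; a minimal sketch:

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0
```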

Manual Service Management

macOS (launchd)

# Whisper
launchctl load ~/Library/LaunchAgents/com.voicemode.whisper-server.plist
launchctl unload ~/Library/LaunchAgents/com.voicemode.whisper-server.plist

# Kokoro
launchctl load ~/Library/LaunchAgents/com.voicemode.kokoro-8880.plist
launchctl unload ~/Library/LaunchAgents/com.voicemode.kokoro-8880.plist

Linux (systemd)

# Whisper
systemctl --user start whisper-server
systemctl --user stop whisper-server
systemctl --user status whisper-server

# Kokoro
systemctl --user start kokoro-fastapi-8880
systemctl --user stop kokoro-fastapi-8880
systemctl --user status kokoro-fastapi-8880

Change Whisper Model

# Set environment variable before restarting
export VOICEMODE_WHISPER_MODEL=base.en  # or tiny, small, medium, large-v2, large-v3

# Restart service to apply change
# macOS
launchctl unload ~/Library/LaunchAgents/com.voicemode.whisper-server.plist
launchctl load ~/Library/LaunchAgents/com.voicemode.whisper-server.plist

# Linux
systemctl --user restart whisper-server

Testing

Run the test suite to verify installation tools:

cd /path/to/voicemode
python -m pytest tests/test_installers.py -v

# Skip integration tests (no actual installation)
SKIP_INTEGRATION_TESTS=1 python -m pytest tests/test_installers.py -v

Contributing

When adding new installation tools:

  1. Create a new function in voice_mode/tools/installers.py
  2. Use the @mcp.tool() decorator
  3. Follow the existing pattern for error handling and return values
  4. Add comprehensive tests in tests/test_installers.py
  5. Update this documentation
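
A hypothetical skeleton following that pattern (the decorator and return-value shape are assumed from the existing tools; the real implementations live in voice_mode/tools/installers.py):

```python
# In voice_mode/tools/installers.py this function would be wrapped with
# @mcp.tool(); the decorator is omitted here so the sketch stays self-contained.
async def install_example_service(
    install_dir: str = "~/.voicemode/example-service",
    force_reinstall: bool = False,
) -> dict:
    """Install a new voice service, mirroring the existing return-value shape."""
    try:
        # ... perform OS detection, downloads, and service setup here ...
        return {"success": True, "install_path": install_dir}
    except Exception as exc:  # report failures in-band rather than raising
        return {"success": False, "error": str(exc)}
```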

Security Notes

  • All installations are performed in user space (no sudo required)
  • Models are downloaded from official sources
  • Services bind to localhost only by default
  • No external network access without explicit configuration