🔗 Official Repository: Gemini CLI on GitHub
📦 Download/Install: npm install -g @google/gemini-cli
🏷️ Version Requirements: Gemini CLI v0.1.0+
Gemini CLI is Google's command-line interface for its Gemini AI models, offering a Claude Code-like experience with a generous free tier (1,000 requests/day, 1M-token context). Voice Mode enhances Gemini CLI by adding natural voice conversation capabilities, letting you speak your coding requests and hear Gemini's responses.
- Gemini CLI installed (`npm install -g @google/gemini-cli`) - see the npm setup guide to avoid sudo
- Python 3.10 or higher
- uv package manager (`curl -LsSf https://astral.sh/uv/install.sh | sh`)
- OpenAI API key (or a compatible service for STT/TTS)
- System audio dependencies installed (see the main README)
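The list above can be sanity-checked with a short shell snippet (a sketch; it only looks for the commands named in the prerequisites and may need adjusting for your system):

```shell
# Record any missing prerequisite commands (names taken from the list above)
missing=""
for cmd in gemini-cli python3 uv npm; do
  command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
done
echo "missing tools:${missing:- none}"

# Confirm Python is 3.10 or newer
if python3 -c 'import sys; raise SystemExit(0 if sys.version_info >= (3, 10) else 1)'; then
  echo "python3 version ok"
else
  echo "python3 too old (need 3.10+)"
fi
```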
```bash
# Install Voice Mode
uvx voice-mode

# Set your OpenAI API key (for voice services)
export OPENAI_API_KEY="your-openai-key"

# Configure Gemini CLI with Voice Mode
gemini-cli config set mcp.voice-mode.command "uvx voice-mode"

# Start voice-enabled Gemini CLI
gemini-cli
```

macOS/Linux Users: To avoid using sudo with npm, see our npm setup guide.
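If `npm install -g` currently requires sudo, one common fix is a user-writable install prefix; a minimal sketch (the `~/.npm-global` directory name is an arbitrary choice, and the npm setup guide may recommend a different approach):

```shell
# Point npm's global installs at a user-writable directory (name is arbitrary)
mkdir -p "$HOME/.npm-global"
npm config set prefix "$HOME/.npm-global"

# Make the matching bin directory visible to this and future shells
export PATH="$HOME/.npm-global/bin:$PATH"
echo 'export PATH="$HOME/.npm-global/bin:$PATH"' >> "$HOME/.bashrc"
```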
```bash
# Install globally via npm
npm install -g @google/gemini-cli

# Configure with your Gemini API key
gemini-cli auth login
```

Find Gemini CLI's configuration file:
- macOS/Linux: `~/.gemini/settings.json`
- Windows: `%USERPROFILE%\.gemini\settings.json`
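Before editing, you can confirm the file exists and parses as JSON (a sketch using the macOS/Linux path above; Python's stdlib `json.tool` is just one convenient validator):

```shell
# Validate the settings file if it exists (path from the list above)
settings="$HOME/.gemini/settings.json"
if [ -f "$settings" ]; then
  python3 -m json.tool "$settings" >/dev/null && echo "valid JSON: $settings"
else
  echo "no settings file yet at $settings"
fi
```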
Add the Voice Mode MCP server to your existing configuration:
```json
{
  "mcpServers": {
    "voice-mode": {
      "command": "uvx",
      "args": ["voice-mode"]
    }
  }
}
```

For advanced configuration:
```bash
# Required for voice services
export OPENAI_API_KEY="your-key"

# Optional - Use local services
export VOICEMODE_TTS_BASE_URL="http://127.0.0.1:8880/v1"
export VOICEMODE_STT_BASE_URL="http://127.0.0.1:2022/v1"

# Optional - Preferred voice
export VOICEMODE_TTS_VOICE="nova"
```
- Check Installation:

  ```bash
  # Verify Gemini CLI
  gemini-cli --version

  # Test Voice Mode directly
  uvx voice-mode --help
  ```

- Test Voice Mode:

  ```bash
  # Start Gemini CLI
  gemini-cli

  # Try voice commands:
  # "Hello, can you hear me?"
  # "Let's have a voice conversation"
  ```

- Verify MCP Connection:
  - Check Gemini CLI logs for MCP server initialization
  - Look for "voice-mode" in the active servers list
```bash
# In Gemini CLI:
"Hey Gemini, let's talk about this code"
"Can you explain this error message?"
"What's the best approach for this feature?"
```
```bash
# Navigate to your project
cd my-project

# Start Gemini CLI
gemini-cli

# Voice commands:
"Review this pull request"
"Suggest improvements for this function"
"Help me write tests for this module"
```
- Ensure MCP support is enabled in Gemini CLI
- Check the configuration file syntax
- Try running `uvx voice-mode` directly to test
- Grant terminal microphone permissions
- Check system audio settings
- Run `VOICEMODE_DEBUG=true gemini-cli` for detailed logs
- Gemini CLI may be slower than Claude Code
- Voice processing adds ~2-3s latency
- Consider using local STT/TTS for better performance
- Grant terminal app microphone access in System Preferences
- Gemini CLI config may be in `~/Library/Application Support/gemini-cli/`
- Ensure PulseAudio/PipeWire is running
- Check audio permissions: `groups $USER | grep audio`
- Best experience with WSL2
- Native Windows support may have audio limitations
For privacy and offline usage:
- Start local services:

  ```bash
  # Kokoro TTS
  docker run -p 8880:8880 kokoro-tts

  # Whisper STT
  whisper-cpp-server -p 2022
  ```

- Configure endpoints:

  ```json
  {
    "env": {
      "VOICEMODE_TTS_BASE_URL": "http://127.0.0.1:8880/v1",
      "VOICEMODE_STT_BASE_URL": "http://127.0.0.1:2022/v1",
      "VOICEMODE_PREFER_LOCAL": "true"
    }
  }
  ```
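Once both local services are up, a quick reachability probe helps before pointing Voice Mode at them (a sketch; the `/v1/models` path is an assumption based on the OpenAI-compatible API convention, and either server may expose different routes):

```shell
# Probe an HTTP endpoint and report whether it answers (2-second timeout)
check_endpoint() {
  if curl -fsS --max-time 2 "$1" >/dev/null 2>&1; then
    echo "reachable: $1"
  else
    echo "unreachable: $1"
  fi
}

check_endpoint "http://127.0.0.1:8880/v1/models"   # Kokoro TTS (port from the config above)
check_endpoint "http://127.0.0.1:2022/v1/models"   # Whisper STT (port from the config above)
```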
Take advantage of Gemini's 1M token context:
```bash
# Enable verbose mode for large codebases
export VOICEMODE_DEBUG="trace"

# Increase audio duration for longer responses
export VOICEMODE_LISTEN_DURATION="30"
```

- 📚 Voice Mode Documentation
- 🔧 Configuration Reference
- 🎤 Demo: Voice Mode with Gemini CLI
- 🏠 LiveKit Integration
- 💬 Gemini CLI Documentation
- 🐛 Troubleshooting Guide
- 🌟 Gemini API Documentation
Need Help? Join our Discord community or post in r/GeminiCLI