| icon | lucide/apple |
|---|
Native macOS setup with full Metal GPU acceleration for optimal performance.
Tip
🍎 Recommended for macOS — ~10x better performance than Docker via Metal GPU acceleration.
- macOS 12 Monterey or later
- 8GB+ RAM (16GB+ recommended)
- 10GB free disk space
- Homebrew installed
-
Clone and run the setup script:
git clone https://github.com/basnijholt/agent-cli.git cd agent-cli ./scripts/setup-macos.sh -
Start all services:
scripts/start-all-services.sh
-
Install agent-cli:
uv tool install agent-cli -p 3.13 # or: pip install agent-cli -
Test the setup:
agent-cli autocorrect "this has an eror"
The setup-macos.sh script:
- ✅ Checks for Homebrew
- ✅ Installs
uvif needed - ✅ Installs/checks Ollama (native macOS app)
- ✅ Installs Zellij for session management
- ✅ Installs Whisper and TTS as launchd daemons via
agent-cli daemon install
| Service | Implementation | Port | GPU Support |
|---|---|---|---|
| Ollama | Native macOS app | 11434 | ✅ Metal GPU |
| Whisper | MLX Whisper (via daemon) | 10300 | ✅ Apple Silicon MLX |
| TTS (Kokoro) | Kokoro TTS (via daemon) | 10200 | ✅ Metal GPU (MPS) |
| OpenWakeWord | Wyoming OpenWakeWord | 10400 | N/A |
Note
Whisper and TTS run as launchd daemons via agent-cli daemon install, using MLX and Metal for GPU acceleration on Apple Silicon.
The setup uses Zellij for managing all services in one session:
scripts/start-all-services.shCtrl-O d- Detach (services keep running)zellij attach agent-cli- Reattach to sessionzellij list-sessions- List all sessionszellij kill-session agent-cli- Stop all servicesAlt + arrow keys- Navigate between panesCtrl-Q- Quit (stops all services)
If you prefer running services individually:
# Ollama (brew service recommended)
brew services start ollama
# Or run in foreground:
ollama serve
# Whisper and TTS are installed as launchd daemons
agent-cli daemon status # Check all daemon status
agent-cli daemon install whisper # Install whisper daemon
agent-cli daemon install tts-kokoro # Install TTS daemon
# Or run in foreground (without daemon):
agent-cli server whisper
agent-cli server tts --backend kokoro
# OpenWakeWord
scripts/run-openwakeword.shIntel Macs: prefer Docker or a Linux-style Wyoming Faster Whisper setup; MLX Whisper is Apple Silicon only.
- 10x faster than Docker - Full Metal GPU acceleration
- Better resource usage - Native integration with macOS
- Automatic model management - Services handle downloads
- Ensure Settings > Notifications > terminal-notifier > Allow Notifications is enabled.
- For a persistent “Listening…” badge, set the Alert style to Persistent (or choose Alerts on macOS versions that still offer Alert/Banner). This keeps the recording indicator visible while other notifications still auto-dismiss automatically.
# Check if Ollama is running
ollama list
# Pull a model manually
ollama pull gemma3:4b
# Check Ollama logs
tail -f ~/.ollama/logs/server.log# Check what's using a port
lsof -i :11434
lsof -i :10300
lsof -i :10200
lsof -i :10400# Reinstall uv
brew reinstall uv
# Check uv installation
uv --version# Kill stuck sessions
zellij kill-all-sessions
# Check session status
zellij list-sessions
# Start without Zellij (manual)
# Run each script in separate terminals- Close other apps to free RAM
- Check Activity Monitor for high CPU/Memory usage
- Services will automatically download required models
If you prefer Docker despite performance limitations:
- Docker Setup Guide
- Note: ~10x slower due to no GPU acceleration