A lightweight, local push-to-talk dictation app for macOS using OpenAI's Whisper model.
- 🎤 Push-to-talk: Hold Right Command key to record, release to transcribe
- 🔒 100% Local: All transcription happens on your Mac, no internet required
- 🚀 Multiple Models: Choose from tiny/base/small/medium/large models (speed vs accuracy)
- 💾 Model Persistence: Your model selection is remembered across app restarts
- 📝 Auto-type: Types transcribed text directly (preserves your clipboard)
- 📋 Long Transcript Log: Automatically saves transcriptions >30 seconds to ~/Library/Logs/Dictation_Transcripts.log (access via menu)
- 🎨 Menu Bar App: Runs quietly in the background with a clean menu bar interface
- 💭 Visual Feedback: Icon changes to show transcription status (💭 thinking, 🎤 ready)
- ⏱️ Timeout Protection: Automatic timeout prevents hangs on problematic audio (note: timed-out transcriptions continue in background - see CHANGELOG)
- 🔄 Auto-retry: Failed transcriptions automatically retry up to 3 times
- 🛡️ Single Instance: Prevents conflicts from multiple app instances running simultaneously
- ⚡ Auto-start: Can be configured to launch on login
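The timeout and auto-retry features listed above can be combined in a few lines. The following is a minimal sketch, not the app's actual code: `transcribe_with_retry`, its parameters, and the 30-second default timeout are all hypothetical, and it assumes "retry up to 3 times" means three attempts in total. It also shows why a timed-out transcription keeps running in the background: a Python thread cannot be cancelled once it has started.

```python
import concurrent.futures

MAX_ATTEMPTS = 3  # assumption: "retry up to 3 times" read as three attempts in total


def transcribe_with_retry(transcribe, audio, timeout_s=30.0, max_attempts=MAX_ATTEMPTS):
    """Call transcribe(audio), abandoning each attempt after timeout_s seconds.

    `transcribe` is any callable supplied by the caller; it stands in for
    the real Whisper call, which is not shown here.
    """
    last_error = None
    for _ in range(max_attempts):
        # Fresh single-worker pool per attempt, so a stuck call cannot block the next one.
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(transcribe, audio)
        try:
            return future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError as exc:
            # The worker thread is not killed; it keeps running in the background.
            last_error = exc
        except Exception as exc:
            last_error = exc
        finally:
            # Do not wait for a possibly stuck worker before moving on.
            pool.shutdown(wait=False)
    raise RuntimeError(f"transcription failed after {max_attempts} attempts") from last_error
```

A caller would wrap its own transcription function, for example `transcribe_with_retry(my_transcribe, audio_bytes, timeout_s=60)`, where `my_transcribe` is whatever function performs the actual model call.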
- macOS 13.0+ (tested on macOS 15+)
The Swift version is faster, more native, and fully self-contained. No ffmpeg required!
- Clone and build:
git clone https://github.com/sayhar/dictation-app.git
cd dictation-app
./build-swift.sh
- Install the app:
cp -R "dist/Swift Dictation.app" ~/Applications/
open ~/Applications/"Swift Dictation.app"
The original Python implementation.
Prerequisites:
- Python 3.12+
- uv package manager
- ffmpeg
- Install dependencies:
brew install uv ffmpeg
- Clone and build:
git clone https://github.com/sayhar/dictation-app.git
cd dictation-app
uv sync
uv run python setup.py py2app
- Install the app:
cp -R dist/Dictation.app ~/Applications/
open ~/Applications/Dictation.app
Grant permissions when prompted:
- Accessibility (required for keyboard monitoring)
- Microphone (required for audio recording)
If the app doesn't request permissions automatically:
- Go to System Settings → Privacy & Security → Accessibility
- Click the "+" button and add the app
- Do the same for Microphone
First run: The app will download the selected Whisper model (~500MB for "small") on first use. This happens in the background and is cached to ~/.cache/huggingface/.
- Click the 🎤 icon in the menu bar to access settings
- Choose your preferred model from the Model submenu
- Hold Right Command key and speak
- Release to transcribe and auto-type
| Model | Size | Speed | Accuracy |
|---|---|---|---|
| Tiny | ~40MB | Fastest | Lowest |
| Base | ~150MB | Fast | Good |
| Small | ~500MB | Balanced | Better |
| Medium | ~1.5GB | Slower | Very Good |
| Large | ~3GB | Slowest | Best |
Models are automatically downloaded to ~/.cache/whisper/ on first use.
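Remembering the selected model across restarts can be as simple as a small JSON file. The sketch below is an illustration under assumptions, not the app's implementation: the config location, the `load_model_choice` and `save_model_choice` helpers, and the choice of "small" as the fallback are all hypothetical. The sizes are copied from the table above.

```python
import json
from pathlib import Path

# Approximate download sizes, copied from the model table.
MODELS = {
    "tiny": "~40MB",
    "base": "~150MB",
    "small": "~500MB",
    "medium": "~1.5GB",
    "large": "~3GB",
}
DEFAULT_MODEL = "small"  # assumption: fallback when nothing has been saved yet
# Hypothetical location; the real app may store its settings elsewhere.
CONFIG_PATH = Path.home() / ".config" / "dictation" / "config.json"


def load_model_choice(config_path=CONFIG_PATH):
    """Return the saved model name, or DEFAULT_MODEL if missing or invalid."""
    try:
        data = json.loads(Path(config_path).read_text())
    except (OSError, ValueError):
        # File missing, unreadable, or not valid JSON.
        return DEFAULT_MODEL
    name = data.get("model") if isinstance(data, dict) else None
    return name if name in MODELS else DEFAULT_MODEL


def save_model_choice(name, config_path=CONFIG_PATH):
    """Persist the model name so the next launch picks it up."""
    if name not in MODELS:
        raise ValueError(f"unknown model: {name!r}")
    path = Path(config_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"model": name}))
```

Validating the name on load means a stale or hand-edited config file falls back to the default instead of crashing at startup.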
Built with:
- mlx-whisper: Metal-accelerated Whisper (30-40% faster on Apple Silicon)
- AVFoundation: Native audio recording (16kHz mono PCM)
- CoreGraphics: Event tap for keyboard monitoring and text injection
- AppKit: Menu bar interface
- Bundled Python: Self-contained Python 3.13 environment (no system dependencies)
Built with:
- openai-whisper: OpenAI's speech recognition model
- PyObjC: Native macOS APIs for keyboard monitoring
- rumps: Menu bar app framework
- sounddevice: Audio recording
- py2app: macOS app bundling
Standard Python keyboard libraries (like pynput) don't work properly in bundled macOS apps due to accessibility permission issues. Both versions use native CGEventTap APIs which macOS properly recognizes and trusts.
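Once an event tap delivers modifier-key events, the push-to-talk logic itself is a small state machine. The sketch below covers only that logic and leaves out creating the tap (done with `CGEventTapCreate` in the native API), so it runs anywhere. The `PushToTalk` class and its method name are hypothetical; the two constants are the standard macOS values for the Right Command virtual keycode and the Command modifier flag.

```python
RIGHT_COMMAND_KEYCODE = 54      # kVK_RightCommand
COMMAND_FLAG_MASK = 0x00100000  # kCGEventFlagMaskCommand


class PushToTalk:
    """Turns flags-changed events into "start" / "stop" recording signals."""

    def __init__(self):
        self.recording = False

    def handle_flags_changed(self, keycode, flags):
        """Return "start", "stop", or None for one flags-changed event.

        `keycode` identifies which modifier key changed; `flags` is the
        modifier bitmask after the change.
        """
        if keycode != RIGHT_COMMAND_KEYCODE:
            return None  # some other modifier key; ignore it
        if not self.recording and flags & COMMAND_FLAG_MASK:
            self.recording = True
            return "start"
        if self.recording:
            # A second Right Command event while recording is its release,
            # even if Left Command is still held and the flag stays set.
            self.recording = False
            return "stop"
        return None
```

In a real tap callback, the keycode and flags would come from the event (for example via `CGEventGetIntegerValueField` and `CGEventGetFlags`), and "start" and "stop" would trigger recording and transcription respectively.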
- dictation.py - Main application code
- setup.py - py2app build configuration
- create_icon.py - Icon generation script
- ~/Library/Logs/Dictation.log - Debug logs
- ~/Library/Logs/Dictation_Transcripts.log - Long transcriptions (>30s)
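The long-transcript log boils down to a threshold check followed by an append. The sketch below shows one way to do that; `log_if_long` and the line format (timestamp, duration, text) are assumptions for illustration, and only the 30-second threshold and the log path come from this README.

```python
from datetime import datetime
from pathlib import Path

LONG_RECORDING_SECONDS = 30  # threshold stated in this README
TRANSCRIPT_LOG = Path.home() / "Library" / "Logs" / "Dictation_Transcripts.log"


def log_if_long(text, duration_s, log_path=TRANSCRIPT_LOG):
    """Append text to the transcript log if the recording exceeded the threshold.

    Returns True if a line was written, False otherwise.
    The line format here is an assumption, not the app's real format.
    """
    if duration_s <= LONG_RECORDING_SECONDS:
        return False
    path = Path(log_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().isoformat(timespec="seconds")
    # Append mode keeps earlier transcriptions intact.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(f"[{stamp}] ({duration_s:.1f}s) {text}\n")
    return True
```

Passing `log_path` explicitly makes the function easy to try against a temporary file without touching the real log.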
System Settings → General → Login Items → Add Dictation.app
App not receiving keyboard events:
- Remove Dictation from Accessibility permissions
- Quit the app completely
- Re-launch and grant permissions fresh
Permissions show as "uv" or "Python":
- This is normal when running via uv run
- Build with py2app for proper app attribution
Event tap fails:
- Ensure you've granted Accessibility permissions
- Try removing and re-adding the app to permissions
- Check logs: tail -f ~/Library/Logs/Dictation.log
MIT