Dictation App

A lightweight, local push-to-talk dictation app for macOS using OpenAI's Whisper model.

Features

🎤 Push-to-talk: Hold Right Command key to record, release to transcribe
🔒 100% Local: All transcription happens on your Mac, no internet required
🚀 Multiple Models: Choose from tiny/base/small/medium/large models (speed vs accuracy)
💾 Model Persistence: Your model selection is remembered across app restarts
📝 Auto-type: Types transcribed text directly (preserves your clipboard)
📋 Long Transcript Log: Automatically saves transcriptions longer than 30 seconds to ~/Library/Logs/Dictation_Transcripts.log (accessible from the menu)
🎨 Menu Bar App: Runs quietly in the background with a clean menu bar interface
💭 Visual Feedback: Icon changes to show transcription status (💭 thinking, 🎤 ready)
⏱️ Timeout Protection: An automatic timeout prevents hangs on problematic audio (note: a timed-out transcription keeps running in the background; see CHANGELOG.md)
🔄 Auto-retry: Failed transcriptions automatically retry up to 3 times
🛡️ Single Instance: Prevents conflicts from multiple app instances running simultaneously
⚡ Auto-start: Can be configured to launch on login
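
The timeout and auto-retry features above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the function name `transcribe_with_retry`, the `transcribe` callable, and the 30-second default timeout are assumptions, and reading "up to 3 times" as three attempts in total is also an assumption. The sketch shares the caveat noted above: a worker that times out is not killed and keeps running in the background.

```python
import concurrent.futures
from typing import Callable, Optional


def transcribe_with_retry(
    transcribe: Callable[[bytes], str],  # hypothetical transcription function
    audio: bytes,
    timeout_s: float = 30.0,             # assumed value, not taken from the app
    max_attempts: int = 3,
) -> str:
    """Run `transcribe` with a per-attempt timeout, retrying on any failure."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(transcribe, audio)
        try:
            # Raises a timeout error if the worker has not finished in time.
            return future.result(timeout=timeout_s)
        except Exception as exc:  # covers timeouts and errors from `transcribe`
            last_error = exc
        finally:
            # Do not wait for the worker: a timed-out call keeps running.
            pool.shutdown(wait=False)
    raise RuntimeError(
        f"transcription failed after {max_attempts} attempts"
    ) from last_error
```

Because a thread cannot be forcibly stopped in Python, a design that needs to reclaim resources from a hung transcription would have to run it in a separate process instead.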

Installation

Prerequisites

macOS 13.0+ (tested on macOS 15+)

Swift Version (Current)

The Swift version is faster, more native, and fully self-contained. No ffmpeg required!

Clone and build:

git clone https://github.com/sayhar/dictation-app.git
cd dictation-app
./build-swift.sh

Install the app:

cp -R "dist/Swift Dictation.app" ~/Applications/
open ~/Applications/"Swift Dictation.app"

Python Version (Legacy)

The original Python implementation.

Prerequisites:

Python 3.12+
uv package manager
ffmpeg

Install dependencies:

brew install uv ffmpeg

Clone and build:

git clone https://github.com/sayhar/dictation-app.git
cd dictation-app
uv sync
uv run python setup.py py2app

Install the app:

cp -R dist/Dictation.app ~/Applications/
open ~/Applications/Dictation.app

Post-Installation

Grant permissions when prompted:

Accessibility (required for keyboard monitoring)
Microphone (required for audio recording)

If the app doesn't request permissions automatically:

Go to System Settings → Privacy & Security → Accessibility
Click the "+" button and add the app
Do the same for Microphone

First run: The app will download the selected Whisper model (~500MB for "small") on first use. This happens in the background and is cached to ~/.cache/huggingface/.

Usage

Click the 🎤 icon in the menu bar to access settings
Choose your preferred model from the Model submenu
Hold Right Command key and speak
Release to transcribe and auto-type
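
The hold-to-record, release-to-transcribe flow above can be modeled as a small state machine. The sketch below is an illustration under assumptions, not the app's code: the `PushToTalk` class, its method names, and the intermediate `RECORDING` state are hypothetical; only the 🎤 (ready) and 💭 (thinking) states come from the feature list.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class State(Enum):
    READY = "🎤"          # idle, waiting for the key
    RECORDING = "rec"     # hypothetical state while the key is held
    THINKING = "💭"       # transcription in progress


@dataclass
class PushToTalk:
    state: State = State.READY
    frames: List[bytes] = field(default_factory=list)

    def key_down(self) -> None:
        """Right Command pressed: start a fresh recording if idle."""
        if self.state is State.READY:
            self.frames.clear()
            self.state = State.RECORDING

    def audio_chunk(self, chunk: bytes) -> None:
        """Buffer audio only while the key is held."""
        if self.state is State.RECORDING:
            self.frames.append(chunk)

    def key_up(self) -> bytes:
        """Right Command released: hand the buffered audio to transcription."""
        if self.state is not State.RECORDING:
            return b""
        self.state = State.THINKING
        return b"".join(self.frames)

    def transcription_done(self) -> None:
        """Text has been typed (or the attempt failed): return to idle."""
        self.state = State.READY
```

Ignoring `key_down` while in the thinking state is one possible design choice; it keeps a new recording from overwriting audio that is still being transcribed.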

Model Comparison

Model	Size	Speed	Accuracy
Tiny	~40MB	Fastest	Lowest
Base	~150MB	Fast	Good
Small	~500MB	Balanced	Better
Medium	~1.5GB	Slower	Very Good
Large	~3GB	Slowest	Best

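Models are downloaded automatically on first use. By default, the Python version (openai-whisper) caches them in ~/.cache/whisper/, while the Swift version (mlx-whisper) uses the Hugging Face cache at ~/.cache/huggingface/.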
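
Remembering the chosen model across restarts could be done with a small settings file. The following is a minimal sketch under assumptions: the JSON format, the `"model"` key, the helper names, and the `"small"` fallback are all hypothetical, and the README does not say where or how the app actually stores the selection.

```python
import json
from pathlib import Path

MODELS = ("tiny", "base", "small", "medium", "large")
DEFAULT_MODEL = "small"  # assumed fallback, not confirmed by the app


def load_model_choice(path: Path) -> str:
    """Return the saved model name, or the default if missing or invalid."""
    try:
        choice = json.loads(path.read_text(encoding="utf-8")).get("model")
    except (OSError, ValueError, AttributeError):
        # Missing file, malformed JSON, or JSON that is not an object.
        return DEFAULT_MODEL
    return choice if choice in MODELS else DEFAULT_MODEL


def save_model_choice(path: Path, model: str) -> None:
    """Persist the selection so the next launch can restore it."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"model": model}), encoding="utf-8")
```

Validating the loaded value against the known model names means a hand-edited or corrupted file falls back to the default instead of crashing at startup.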

Technical Details

Swift Version

Built with:

mlx-whisper: Metal-accelerated Whisper (30-40% faster on Apple Silicon)
AVFoundation: Native audio recording (16kHz mono PCM)
CoreGraphics: Event tap for keyboard monitoring and text injection
AppKit: Menu bar interface
Bundled Python: Self-contained Python 3.13 environment (no system dependencies)
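
To make the "16kHz mono PCM" format concrete, here is a small sketch that writes and reads such audio with Python's standard `wave` module. It is illustrative only and unrelated to the app's AVFoundation code; the 16-bit sample width and the helper names (`sine_tone`, `write_wav`, `wav_duration`) are assumptions.

```python
import io
import math
import struct
import wave
from typing import BinaryIO, List

SAMPLE_RATE = 16_000  # 16 kHz, as listed above
SAMPLE_WIDTH = 2      # bytes per sample; 16-bit PCM is an assumption


def sine_tone(freq_hz: float, seconds: float, amplitude: float = 0.3) -> List[int]:
    """Generate signed 16-bit samples for a test tone."""
    count = int(SAMPLE_RATE * seconds)
    return [
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(count)
    ]


def write_wav(target: BinaryIO, samples: List[int]) -> None:
    """Write mono little-endian 16-bit PCM samples as a WAV stream."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


def wav_duration(source: BinaryIO) -> float:
    """Return the length of a WAV stream in seconds."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


buffer = io.BytesIO()
write_wav(buffer, sine_tone(440.0, 1.0))
buffer.seek(0)
print(f"{wav_duration(buffer):.2f} s of 16 kHz mono audio")
```

At this format, one second of audio is 16,000 samples, or 32,000 bytes of sample data.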

Python Version

Built with:

openai-whisper: OpenAI's speech recognition model
PyObjC: Native macOS APIs for keyboard monitoring
rumps: Menu bar app framework
sounddevice: Audio recording
py2app: macOS app bundling

Why Native APIs?

Standard Python keyboard libraries (like pynput) don't work properly in bundled macOS apps due to accessibility permission issues. Both versions use native CGEventTap APIs which macOS properly recognizes and trusts.
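
Modifier keys such as Right Command arrive at an event tap as "flags changed" events rather than ordinary key-down or key-up events. The sketch below shows only the decision logic as a pure function, so it runs without PyObjC; it is not the app's implementation. The numeric constants mirror Apple's headers (kCGEventFlagsChanged, kVK_RightCommand, kCGEventFlagMaskCommand), while the function name and return values are hypothetical.

```python
from typing import Optional

KCG_EVENT_FLAGS_CHANGED = 12            # kCGEventFlagsChanged
KVK_RIGHT_COMMAND = 0x36                # kVK_RightCommand virtual key code
KCG_EVENT_FLAG_MASK_COMMAND = 0x100000  # kCGEventFlagMaskCommand


def classify_modifier_event(event_type: int, keycode: int, flags: int) -> Optional[str]:
    """Map a raw event to "press", "release", or None (not our key).

    In a real tap callback these three values would come from the event
    type, the keyboard keycode field, and the event's flags.
    """
    if event_type != KCG_EVENT_FLAGS_CHANGED or keycode != KVK_RIGHT_COMMAND:
        return None
    # Limitation of this sketch: the Command mask is shared by both Command
    # keys, so releasing Right Command while Left Command is still held
    # would be reported as "press".
    return "press" if flags & KCG_EVENT_FLAG_MASK_COMMAND else "release"
```

Keeping this logic separate from the tap callback makes it easy to unit-test without Accessibility permissions.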

Files

dictation.py - Main application code
setup.py - py2app build configuration
create_icon.py - Icon generation script
~/Library/Logs/Dictation.log - Debug logs
~/Library/Logs/Dictation_Transcripts.log - Long transcriptions (>30s)
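
The long-transcript log can be pictured with the sketch below. It is an assumption-laden illustration, not the app's code: the helper name `log_if_long`, the line format, and the interpretation of ">30s" as recording duration are hypothetical; only the 30-second threshold and the log's purpose come from this README.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

LONG_THRESHOLD_S = 30.0  # from the README: only transcriptions over 30 s are saved


def log_if_long(
    text: str,
    duration_s: float,
    log_path: Path,
    now: Optional[datetime] = None,
) -> bool:
    """Append the transcript to the log if the recording was long enough.

    Returns True when an entry was written.
    """
    if duration_s <= LONG_THRESHOLD_S:
        return False
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as log:
        # Hypothetical entry format: timestamp, duration, then the text.
        log.write(f"[{stamp}] ({duration_s:.1f}s) {text}\n")
    return True
```

Opening the file in append mode keeps earlier entries intact, which matters for a log meant to recover long dictations after the fact.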

Auto-start on Login

Open System Settings → General → Login Items and add the installed app (Swift Dictation.app, or Dictation.app for the Python version).

Troubleshooting

App not receiving keyboard events:

Remove Dictation from Accessibility permissions
Quit the app completely
Re-launch and grant permissions fresh

Permissions show as "uv" or "Python":

This is normal when running via uv run
Build with py2app for proper app attribution

Event tap fails:

Ensure you've granted Accessibility permissions
Try removing and re-adding the app to permissions
Check logs: tail -f ~/Library/Logs/Dictation.log

License

MIT
