[NOTE] vibe coded local STT because Handy's microphone settings are really bad and I can't configure them to work better, so I made my own with codex agents - basically 100% ai-written
- hotkey client that records mic audio, cleans it, and posts it to a FastAPI ASR server using NVIDIA Parakeet; the transcript can be auto-typed into the active window.
- python 3.12+, cuda-capable gpu for the server model download/load
- microphone access; speakers for feedback beeps
- deps tracked in `pyproject.toml` + `uv.lock`
uv venv
source .venv/bin/activate
uv sync # installs from the lockfile
python asr_server.py
- downloads and loads `nvidia/parakeet-tdt-0.6b-v3` to cuda, exposes `POST /asr` at port 8000.
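To poke the `POST /asr` endpoint without the hotkey client, something like the stdlib-only sketch below could work. The upload format is an assumption: the multipart field name `file`, the `clip.wav` filename, and a JSON response are all hypothetical, so check `asr_server.py` for the real contract before relying on it.

```python
import json
import urllib.request
import uuid


def build_multipart(field: str, filename: str, data: bytes) -> tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def post_wav(url: str, wav_path: str, field: str = "file") -> dict:
    """Upload a WAV file and return the decoded JSON reply."""
    # ASSUMPTION: the server takes a multipart upload under `field`
    # and answers with JSON; neither is confirmed by this README.
    with open(wav_path, "rb") as f:
        body, content_type = build_multipart(field, "clip.wav", f.read())
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `post_wav("http://localhost:8000/asr", "clip.wav")` would send the file and return whatever JSON the server replies with.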
python asr_client.py --server http://localhost:8000/asr
- hold `ctrl + command` (left or right) to record; release to stop
- default flow: bandpass + rms normalize, save temp wav, send to server, print transcript, type it into the focused app
- useful flags: `--raw` (skip audio cleanup), `--no-type`, `--type-on keyword` (only type if keyword present), `--delay` seconds between keystrokes, `--delay-per-word`, `--debug-keys`
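The cleanup part of the default flow (bandpass, rms normalize, save temp wav) could look roughly like the sketch below. It is stdlib-only and not necessarily how `asr_client.py` does it: the first-order filters, the 80 Hz / 7000 Hz cutoffs, the 0.1 target RMS, and all function names are assumptions picked for illustration. Samples are floats in [-1, 1].

```python
import math
import struct
import tempfile
import wave


def bandpass(samples, sample_rate, low_hz=80.0, high_hz=7000.0):
    """Crude band-pass: first-order high-pass, then first-order low-pass."""
    dt = 1.0 / sample_rate
    rc_hp = 1.0 / (2 * math.pi * low_hz)
    rc_lp = 1.0 / (2 * math.pi * high_hz)
    a_hp = rc_hp / (rc_hp + dt)  # high-pass coefficient
    a_lp = dt / (rc_lp + dt)     # low-pass coefficient
    out = []
    prev_x = hp = lp = 0.0
    for x in samples:
        hp = a_hp * (hp + x - prev_x)  # removes DC and low rumble
        prev_x = x
        lp += a_lp * (hp - lp)         # attenuates content above high_hz
        out.append(lp)
    return out


def rms_normalize(samples, target_rms=0.1, peak=1.0):
    """Scale samples to target_rms, clipping anything beyond +/- peak."""
    if not samples:
        return []
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms < 1e-9:  # silence: nothing to scale
        return list(samples)
    gain = target_rms / rms
    return [max(-peak, min(peak, s * gain)) for s in samples]


def save_temp_wav(samples, sample_rate):
    """Write mono 16-bit PCM to a temp file and return its path."""
    pcm = struct.pack(f"<{len(samples)}h", *(int(s * 32767) for s in samples))
    tmp = tempfile.NamedTemporaryFile(suffix=".wav", delete=False)
    tmp.close()
    with wave.open(tmp.name, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(pcm)
    return tmp.name
```

Chained together that is `save_temp_wav(rms_normalize(bandpass(samples, 16000)), 16000)`; with `--raw` the first two steps would be skipped.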