Voice messages are transcribed locally using Whisper via the nodejs-whisper package. No audio is sent to external services.
Transcription is configured under `globals.transcription` and can be overridden per project in the `projects` map (e.g. `projects.my-bot.transcription`). Set these in your config file (e.g. `hal.config.yaml` or `hal.config.local.yaml`).
| Key | Description | Default |
|---|---|---|
| `model` | Whisper model name (see Whisper models below). | `"base.en"` |
| `mode` | Transcript UX mode: `confirm` (transcript + Use it/Cancel), `inline` (show transcript while processing), `silent` (no transcript shown). | `"confirm"` |
Legacy compatibility (deprecated): `sticky` and `showTranscription` are still accepted and mapped to a mode when `mode` is not set.
Example (global defaults):

```yaml
globals:
  transcription:
    model: base.en
    mode: confirm
```

Example (override for one project, e.g. use a larger model and hide the transcript):

```yaml
projects:
  backend:
    cwd: ./backend
    telegram:
      botToken: "${BACKEND_BOT_TOKEN}"
    transcription:
      model: small
      mode: silent
```

For where these keys sit in the full config (globals table, projects table), see Configuration.
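
The override behaviour described above can be pictured with a small TypeScript sketch. It is illustrative only: the type names (`HalConfig`, `TranscriptionConfig`) and the function `resolveTranscription` are hypothetical, and it assumes overrides are merged key by key (project first, then globals, then the documented defaults). The actual implementation may resolve settings differently.

```typescript
// Hypothetical types mirroring the documented keys; not the tool's real API.
type TranscriptionMode = "confirm" | "inline" | "silent";

interface TranscriptionConfig {
  model?: string;
  mode?: TranscriptionMode;
}

interface HalConfig {
  globals?: { transcription?: TranscriptionConfig };
  projects?: Record<string, { transcription?: TranscriptionConfig }>;
}

// Defaults taken from the key table above.
const DEFAULTS: Required<TranscriptionConfig> = {
  model: "base.en",
  mode: "confirm",
};

// Assumption: each key is resolved independently,
// in the order project override > globals > defaults.
function resolveTranscription(
  config: HalConfig,
  project: string,
): Required<TranscriptionConfig> {
  const globals = config.globals?.transcription ?? {};
  const override = config.projects?.[project]?.transcription ?? {};
  return {
    model: override.model ?? globals.model ?? DEFAULTS.model,
    mode: override.mode ?? globals.mode ?? DEFAULTS.mode,
  };
}

// Mirrors the two YAML examples above, plus a project with no override.
const exampleConfig: HalConfig = {
  globals: { transcription: { model: "base.en", mode: "confirm" } },
  projects: {
    backend: { transcription: { model: "small", mode: "silent" } },
    docs: {},
  },
};
```

Under these assumptions, `resolveTranscription(exampleConfig, "backend")` yields the `small` model in `silent` mode, while a project without its own `transcription` block falls back to the global values.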
- ffmpeg (for audio conversion):

  ```bash
  brew install ffmpeg      # macOS
  sudo apt install ffmpeg  # Ubuntu/Debian
  ```

- CMake (for building the Whisper executable):

  ```bash
  brew install cmake      # macOS
  sudo apt install cmake  # Ubuntu/Debian
  ```

- Download and build Whisper (run once after installation):

  ```bash
  npx nodejs-whisper download
  ```
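
To give a sense of what the ffmpeg audio conversion step involves, here is a rough TypeScript sketch that builds an ffmpeg argument list producing 16 kHz, mono, 16-bit PCM WAV, the input format whisper.cpp works with. The helper names `buildFfmpegArgs` and `wavPathFor` are hypothetical, and how this project actually invokes ffmpeg is not documented here, so treat this as an assumption rather than the real pipeline.

```typescript
// Build ffmpeg arguments that convert an audio file to
// 16 kHz, mono, 16-bit PCM WAV. Hypothetical helper for illustration.
function buildFfmpegArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",                 // overwrite the output file if it already exists
    "-i", inputPath,      // input file (e.g. an OGG/Opus voice message)
    "-ar", "16000",       // resample to 16 kHz
    "-ac", "1",           // downmix to a single (mono) channel
    "-c:a", "pcm_s16le",  // encode as 16-bit little-endian PCM
    outputPath,
  ];
}

// Swap the file extension for .wav, or append .wav if there is none.
// Hypothetical helper for illustration.
function wavPathFor(inputPath: string): string {
  return inputPath.replace(/\.[^./\\]+$/, "") + ".wav";
}
```

In a Node.js process the resulting array could be passed to `spawn("ffmpeg", args)` from `node:child_process`.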
| Model | Size | Speed | Quality |
|---|---|---|---|
| `tiny` | ~75 MB | Fastest | Basic |
| `tiny.en` | ~75 MB | Fastest | English-only |
| `base` | ~142 MB | Fast | Good |
| `base.en` | ~142 MB | Fast | English-only (default) |
| `small` | ~466 MB | Medium | Good multilingual |
| `medium` | ~1.5 GB | Slower | Very good multilingual |
| `large-v3-turbo` | ~1.5 GB | Fast | Near-large quality |
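
As a quick illustration of how the model list might be used when validating a config value, the following TypeScript sketch encodes the approximate sizes from the table and offers two hypothetical helpers, `isKnownModel` and `modelsWithin`. The names and the lookup table are assumptions made for this example; it covers only the models listed here and takes 1.5 GB as roughly 1500 MB.

```typescript
// Approximate download sizes in MB, copied from the table above.
const MODEL_SIZES_MB = {
  "tiny": 75,
  "tiny.en": 75,
  "base": 142,
  "base.en": 142,
  "small": 466,
  "medium": 1500,
  "large-v3-turbo": 1500,
} as const;

type ModelName = keyof typeof MODEL_SIZES_MB;

// True if the name matches one of the models listed in the table.
function isKnownModel(name: string): name is ModelName {
  return Object.prototype.hasOwnProperty.call(MODEL_SIZES_MB, name);
}

// Models whose approximate size fits within the given budget (in MB),
// returned in table order.
function modelsWithin(maxMb: number): ModelName[] {
  return (Object.keys(MODEL_SIZES_MB) as ModelName[]).filter(
    (name) => MODEL_SIZES_MB[name] <= maxMb,
  );
}
```

For example, `modelsWithin(200)` returns the four `tiny` and `base` variants, which can help when choosing a model for a machine with limited disk space.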