Murmur

Your voice, unheard by others.

Privacy-first voice-to-text for macOS and Windows, built with Rust.

What is Murmur?

Murmur is a voice dictation tool that transcribes your speech and inserts polished text at your cursor position -- in any app. It supports both local (on-device) and cloud transcription, with optional LLM post-processing to clean up filler words, fix punctuation, and convert Simplified Chinese to Traditional Chinese.

Features

  • Push-to-Talk -- Hold a modifier key to speak, release to insert text
  • Toggle Mode -- Press once to start recording, press again to stop (with 5-min auto-stop and debounce protection)
  • Custom Hotkey -- Single modifier key or combo (e.g. Option+Z, Control+Space) with two-phase recording
  • Dual Engine -- Local Whisper (Metal GPU on macOS, CUDA on Windows with an NVIDIA GPU) or Groq cloud API
  • Multi-Provider LLM -- Groq (cloud), Ollama (local), or any OpenAI-compatible endpoint for text enhancement
  • Fully Offline Mode -- Local Whisper + Ollama for complete privacy (no data leaves your machine)
  • LLM Post-Processing -- Clean up filler words, add punctuation, Simplified-to-Traditional Chinese conversion
  • Translate Hotkey -- Select text in any app, press a hotkey (default: Option+T) to translate and replace in-place via LLM
  • Smart Clipboard -- Auto-pastes when a text field is focused; copies to clipboard only when no text input is detected (e.g. on Desktop)
  • App-Aware Style -- Automatically adjusts output tone based on the active app (e.g. formal in Slack, technical in VS Code)
  • Personal Dictionary -- Add custom terms to improve transcription accuracy; inline dictionary chips appear in real-time while editing
  • Transcription Preview -- Floating preview window with copy button, editable text, character count, and detected app name
  • Live Preview -- See partial transcription while you speak (local engine only)
  • Mixed-Language Support -- English words in mixed CJK-English speech are preserved as-is (never translated)
  • 15 Languages -- Auto-detect or manually select from 15 supported languages
  • Cross-Platform -- macOS and Windows support with platform-native hotkey and app detection
  • System-wide -- Works in any text field across all apps
  • Lightweight -- Tauri-based, ~30-50MB vs. 200MB+ for typical Electron apps
  • Open Source -- Fully auditable, no telemetry, no tracking
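
The Toggle Mode bullet above mentions debounce protection and a 5-minute auto-stop. As a rough illustration of how those two rules interact, here is a minimal sketch of a toggle state machine. The 300 ms debounce window and the names `ToggleRecorder`, `on_hotkey`, and `tick` are assumptions made for this example; they are not taken from Murmur's source.

```rust
use std::time::{Duration, Instant};

/// Assumed debounce window (not stated in this README).
const DEBOUNCE: Duration = Duration::from_millis(300);
/// The README states a 5-minute auto-stop for Toggle Mode.
const AUTO_STOP: Duration = Duration::from_secs(5 * 60);

#[derive(Debug, PartialEq, Clone, Copy)]
enum State {
    Idle,
    Recording { started: Instant },
}

/// Hypothetical toggle-mode controller.
struct ToggleRecorder {
    state: State,
    last_press: Option<Instant>,
}

impl ToggleRecorder {
    fn new() -> Self {
        Self { state: State::Idle, last_press: None }
    }

    /// Handle a hotkey press at `now`. Returns true if the state changed.
    fn on_hotkey(&mut self, now: Instant) -> bool {
        // Ignore presses that arrive too soon after the last accepted press.
        if let Some(prev) = self.last_press {
            if now.duration_since(prev) < DEBOUNCE {
                return false;
            }
        }
        self.last_press = Some(now);
        self.state = match self.state {
            State::Idle => State::Recording { started: now },
            State::Recording { .. } => State::Idle,
        };
        true
    }

    /// Called periodically. Returns true if recording was auto-stopped.
    fn tick(&mut self, now: Instant) -> bool {
        if let State::Recording { started } = self.state {
            if now.duration_since(started) >= AUTO_STOP {
                self.state = State::Idle;
                return true;
            }
        }
        false
    }

    fn is_recording(&self) -> bool {
        matches!(self.state, State::Recording { .. })
    }
}
```

Passing the timestamp in explicitly, rather than calling `Instant::now()` inside the methods, keeps the logic easy to test without waiting five minutes.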

Download

Download the latest release from the Releases page.

| Platform | File | Notes |
| --- | --- | --- |
| macOS (Apple Silicon) | .dmg | Requires quarantine removal |
| Windows | .exe / .msi | CPU-only, works on all hardware |
| Windows (NVIDIA GPU) | -cuda.exe / -cuda.msi | GPU-accelerated via CUDA |

How It Works

Voice:     Hotkey -> Record (cpal) -> Transcribe (Whisper) -> LLM Clean-up (optional) -> Smart Clipboard
Translate: Option+T -> Copy selection -> LLM Translate -> Paste back (replace selection)

Each recording triggers at most 2 API calls (when using Groq): one for Whisper transcription, one for LLM post-processing. Translation triggers 1 API call.
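
The call count above depends on which engine and LLM provider are selected. A small sketch of that arithmetic follows; the enum and function names are hypothetical and exist only to make the rule concrete.

```rust
/// Which transcription engine is selected.
#[derive(Clone, Copy)]
enum Engine {
    LocalWhisper,
    Groq,
}

/// Which LLM provider, if any, handles post-processing.
#[derive(Clone, Copy)]
enum Llm {
    Off,
    Groq,
    Ollama,
    Custom,
}

/// Number of Groq API calls a single recording triggers:
/// one if Groq transcribes, plus one if Groq does the clean-up.
fn groq_calls_per_recording(engine: Engine, llm: Llm) -> u32 {
    let transcription = matches!(engine, Engine::Groq) as u32;
    let cleanup = matches!(llm, Llm::Groq) as u32;
    transcription + cleanup
}
```

With the local engine and Ollama the function returns 0, which corresponds to the fully offline setup described in the feature list.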

Setup Guide

1. Install & Run

git clone https://github.com/panda850819/murmur-voice.git
cd murmur-voice
pnpm install
pnpm tauri dev

2. First Launch

On first launch, Murmur will guide you through:

  1. Granting Microphone and Accessibility permissions
  2. Choosing a transcription engine (Local or Groq)
  3. Setting your Push-to-Talk key

If you choose the local engine, the Whisper model (~1.5GB) will download automatically on your first recording.

3. Transcription Engine

| Engine | Speed | Quality | Privacy | Setup |
| --- | --- | --- | --- | --- |
| Local (Whisper) | ~1-3s | Good | Audio stays on device | Model auto-downloads on first use (~1.5GB) |
| Groq API | <1s | Good | Audio sent to Groq servers | Free API key (get one below) |

To switch engines: Settings > Transcription > Engine

Getting a Groq API Key

  1. Go to console.groq.com and sign up (Google/GitHub login supported)
  2. Navigate to API Keys in the left sidebar
  3. Click Create API Key, give it a name (e.g. "murmur")
  4. Copy the key (starts with gsk_) and paste it into Murmur's settings

Groq's free tier includes generous rate limits for personal use. The same API key is used for both Whisper transcription and LLM post-processing.

4. LLM Post-Processing (Recommended)

Choose a provider for AI text enhancement:

| Provider | Speed | Privacy | Setup |
| --- | --- | --- | --- |
| Groq | Fast | Cloud | Free API key from console.groq.com |
| Ollama | Varies | Local | Install Ollama, pull a model |
| Custom | Varies | Varies | Any OpenAI-compatible endpoint |

What it does:

  • Removes filler words (um, uh, etc.)
  • Removes false starts and self-corrections
  • Adds proper punctuation (full-width for Chinese, half-width for English)
  • Converts Simplified Chinese to Traditional Chinese (Taiwan standard)
  • Adds spaces between Chinese and English text
  • Formats lists and paragraphs when appropriate

To enable: Settings > AI Processing > LLM Post-Processing
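
Murmur delegates these clean-up rules to the LLM. To make one of them concrete, the "spaces between Chinese and English text" rule can be approximated deterministically. The sketch below is only an illustration of the intended output, not Murmur's implementation, and it only recognises the main CJK Unified Ideographs block.

```rust
/// True for characters in the main CJK Unified Ideographs block (U+4E00..U+9FFF).
fn is_cjk(c: char) -> bool {
    ('\u{4E00}'..='\u{9FFF}').contains(&c)
}

/// True for ASCII letters and digits.
fn is_latin_or_digit(c: char) -> bool {
    c.is_ascii_alphanumeric()
}

/// Insert a single space wherever a CJK character directly touches
/// an ASCII letter or digit. Existing spaces are left untouched.
fn space_cjk_latin(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 8);
    let mut prev: Option<char> = None;
    for c in input.chars() {
        if let Some(p) = prev {
            let boundary = (is_cjk(p) && is_latin_or_digit(c))
                || (is_latin_or_digit(p) && is_cjk(c));
            if boundary {
                out.push(' ');
            }
        }
        out.push(c);
        prev = Some(c);
    }
    out
}
```

For example, `space_cjk_latin("我用Rust寫程式")` yields `我用 Rust 寫程式`, and text that is already spaced passes through unchanged.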

5. Translation

Select text in any app and press Option+T (default) to translate it in-place. The translated text replaces your selection and appears in the preview window.

  • Uses the same LLM provider configured in AI Processing (Groq, Ollama, or Custom)
  • Target language configurable in Settings > Translation > Target Language
  • Translate hotkey customizable in Settings > Translation > Translate Hotkey
  • Preview window stays visible until you close it or click Copy
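
The translate flow (copy selection, one LLM call, paste back) can be sketched with two small traits. Everything here is hypothetical: `SelectionIo`, `Translator`, and `translate_in_place` are names invented for this example, and the in-memory fakes stand in for the real clipboard and LLM provider.

```rust
/// Hypothetical abstraction over "copy the current selection" and "paste text".
trait SelectionIo {
    fn copy_selection(&mut self) -> Option<String>;
    fn paste(&mut self, text: &str);
}

/// Hypothetical abstraction over whichever LLM provider is configured.
trait Translator {
    fn translate(&self, text: &str, target_lang: &str) -> Result<String, String>;
}

/// Returns Ok(Some(translated)) when the selection was replaced,
/// Ok(None) when there was nothing to translate.
fn translate_in_place(
    io: &mut dyn SelectionIo,
    llm: &dyn Translator,
    target_lang: &str,
) -> Result<Option<String>, String> {
    let selected = match io.copy_selection() {
        Some(s) if !s.trim().is_empty() => s,
        _ => return Ok(None), // nothing selected: leave the document untouched
    };
    // The single LLM call made per translation.
    let translated = llm.translate(&selected, target_lang)?;
    io.paste(&translated);
    Ok(Some(translated))
}

// In-memory stand-ins so the flow can be exercised without a real app or LLM.
struct FakeIo {
    selection: Option<String>,
    pasted: Vec<String>,
}

impl SelectionIo for FakeIo {
    fn copy_selection(&mut self) -> Option<String> {
        self.selection.clone()
    }
    fn paste(&mut self, text: &str) {
        self.pasted.push(text.to_string());
    }
}

struct FakeLlm;

impl Translator for FakeLlm {
    fn translate(&self, text: &str, target_lang: &str) -> Result<String, String> {
        Ok(format!("[{}] {}", target_lang, text))
    }
}
```

Because a failed LLM call returns early through `?`, nothing is pasted on error and the original selection stays in place.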

6. Personal Dictionary

Add frequently used terms (names, jargon, acronyms) to improve transcription accuracy. These are injected into Whisper's initial prompt.

To configure: Settings > Transcription > Dictionary (type a term, press Enter to add)
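
One plausible way to turn dictionary terms into an initial prompt is to trim, de-duplicate, and join them. The README does not specify the prompt format Murmur uses, so the comma-separated layout and the function name `build_initial_prompt` below are assumptions.

```rust
/// Build a prompt string from user dictionary terms.
/// Blank entries are dropped and duplicates are removed
/// (ASCII case-insensitive), keeping the first spelling seen.
fn build_initial_prompt(terms: &[&str]) -> String {
    let mut seen: Vec<&str> = Vec::new();
    for t in terms {
        let t = t.trim();
        if !t.is_empty() && !seen.iter().any(|s: &&str| s.eq_ignore_ascii_case(t)) {
            seen.push(t);
        }
    }
    seen.join(", ")
}
```

Given `["Tauri", " Groq ", "", "tauri"]`, this returns `Tauri, Groq`. Whisper's prompt has limited room, so a real implementation would probably also cap the number of terms.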

7. App-Aware Style

When enabled, Murmur detects the foreground app and adjusts the LLM output tone:

| App | Style |
| --- | --- |
| Slack, Discord, LINE, Telegram | Casual |
| VS Code, Terminal, Cursor | Technical |
| Pages, Word, Google Docs | Formal |
| Others | Default (natural) |

To enable: Settings > AI Processing > App-Aware Style
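
The mapping in the table above amounts to a simple lookup. The sketch below matches on a display name, which is an assumption: the README does not say whether Murmur identifies apps by name, bundle ID, or executable path. `Style` and `style_for_app` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Style {
    Casual,
    Technical,
    Formal,
    Natural, // the "Default (natural)" row in the table
}

/// Pick an output style from the foreground app's display name.
/// Matching is case-insensitive; unknown apps fall back to Natural.
fn style_for_app(app_name: &str) -> Style {
    match app_name.to_lowercase().as_str() {
        "slack" | "discord" | "line" | "telegram" => Style::Casual,
        "vs code" | "terminal" | "cursor" => Style::Technical,
        "pages" | "word" | "google docs" => Style::Formal,
        _ => Style::Natural,
    }
}
```

The selected style would then be folded into the LLM prompt, for example as an extra instruction line; how Murmur words that instruction is not covered here.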

Recommended Settings

For the best experience with Chinese dictation:

| Setting | Value | Why |
| --- | --- | --- |
| Engine | Groq | Fastest transcription (<1s) |
| Language | Mandarin Chinese | More accurate than Auto for Chinese |
| LLM Post-Processing | On | Cleans up filler words + Traditional Chinese |
| LLM Model | Llama 3.3 70B | Best quality for Chinese text processing |
| App-Aware Style | On | Adapts tone to context |

Tech Stack

| Component | Technology | Purpose |
| --- | --- | --- |
| App Framework | Tauri 2 | Lightweight desktop app |
| Audio Capture | cpal | Microphone input -> 16kHz mono |
| Speech-to-Text | whisper-rs / Groq API | Local or cloud transcription |
| LLM Processing | Groq / Ollama / Custom | Text cleanup and formatting |
| Hotkey Detection | CGEventTap / SetWindowsHookEx | Global hotkey listener (modifier or modifier+key combo) |
| Text Insertion | arboard + rdev | Clipboard write + Cmd+V / Ctrl+V simulation |
| App Detection | NSWorkspace / Win32 API | Foreground app detection (per-platform) |
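
The "Microphone input -> 16kHz mono" step implies a conversion, since microphones usually deliver audio at the device's native rate and channel count. The README does not describe how Murmur performs it. The following is a minimal std-only sketch using channel averaging and linear interpolation; the function names are hypothetical, and production code would normally use a proper resampler with low-pass filtering to avoid aliasing.

```rust
/// Average interleaved channels into a single mono channel.
fn downmix_to_mono(interleaved: &[f32], channels: usize) -> Vec<f32> {
    assert!(channels > 0, "channel count must be non-zero");
    interleaved
        .chunks_exact(channels)
        .map(|frame| frame.iter().sum::<f32>() / channels as f32)
        .collect()
}

/// Resample mono audio with linear interpolation (no anti-alias filter).
fn resample_linear(input: &[f32], from_hz: u32, to_hz: u32) -> Vec<f32> {
    assert!(from_hz > 0 && to_hz > 0, "sample rates must be non-zero");
    if input.is_empty() || from_hz == to_hz {
        return input.to_vec();
    }
    let last = input.len() - 1;
    let out_len = (input.len() as u64 * to_hz as u64 / from_hz as u64) as usize;
    let step = from_hz as f64 / to_hz as f64;
    (0..out_len)
        .map(|i| {
            // Position of output sample i, measured in input samples.
            let pos = i as f64 * step;
            let idx = (pos.floor() as usize).min(last);
            let frac = (pos - idx as f64) as f32;
            let a = input[idx];
            let b = input[(idx + 1).min(last)];
            a + (b - a) * frac
        })
        .collect()
}
```

A 48 kHz stereo buffer would go through `downmix_to_mono(&buf, 2)` and then `resample_linear(&mono, 48_000, 16_000)` before being handed to Whisper.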

Requirements

macOS

  • macOS 12.0+ (Apple Silicon recommended for local Whisper)
  • Microphone permission
  • Accessibility permission (for global hotkey + text insertion)

Windows

  • Windows 10+
  • Microphone permission

Both Platforms

  • Groq API key (free, for cloud engine and Groq LLM) or Ollama (for local LLM)

FAQ

macOS: "Murmur Voice is damaged and can't be opened"

This happens because the app is not signed with an Apple Developer certificate. macOS tags downloaded files with a quarantine attribute, and Gatekeeper refuses to open a quarantined app that is unsigned. To fix:

  1. Move Murmur Voice to /Applications
  2. Open Terminal and run:
    xattr -d com.apple.quarantine /Applications/Murmur\ Voice.app
  3. Open the app normally

Windows: Which version should I download?

| Your GPU | Download | Why |
| --- | --- | --- |
| NVIDIA (with CUDA drivers) | -cuda version | GPU-accelerated transcription, much faster |
| AMD / Intel / integrated | Standard version | CPU transcription, works on all hardware |
| Not sure | Standard version | Always works, just slower for local engine |

Why is the app unsigned?

Murmur is a free, open-source project. Apple Developer Program costs $99/year. Code signing may be added in the future, but for now the workaround above is required on macOS.

Privacy

Murmur was born from a security audit of a commercial voice-to-text app that was found to:

  • Capture browser URLs and window titles
  • Monitor all keystrokes via CGEventTap
  • Send application context to remote servers
  • Include session recording analytics (Microsoft Clarity)

Murmur does none of this. When using the local engine, your audio never leaves your machine. When using Groq, audio is sent only to Groq's API for transcription, and the transcribed text goes to your configured LLM provider only if post-processing is enabled -- no other data is collected or transmitted.

Donate

If you find Murmur useful, consider supporting the project:

Buy Me A Coffee

Crypto:

| Network | Address |
| --- | --- |
| EVM (Ethereum, Base, etc.) | 0x9ae8954201b2fce97b124887e415df02e8e06a8d |
| Solana | Eod4VqvMmmMnY3EinN6Zo5xzt9Wq5S2dFZutob1VBvMf |

License

MIT
