Skip to content

feat: history tab, configurable hotkey, clipboard control#7

Open
nicremo wants to merge 12 commits into
giusmarci:mainfrom
nicremo:feat/history-clipboard-settings
Open

feat: history tab, configurable hotkey, clipboard control#7
nicremo wants to merge 12 commits into
giusmarci:mainfrom
nicremo:feat/history-clipboard-settings

Conversation

@nicremo
Copy link
Copy Markdown

@nicremo nicremo commented Apr 13, 2026

Summary

Adds a persistent dictation history, configurable hotkey with Fn key support, optional clipboard copying, and several critical bug fixes. All features are backwards-compatible.

Depends on: #6 (right Option key, handsfree, UI upgrades), #5 (dictionary), #4 (cloud transcription)

History Tab

  • New History tab in the sidebar (between Dictionary and Preferences)
  • Every dictation is saved automatically with: final text, raw transcription, transcription source, style mode, enhancement level, target app name, timestamp
  • Expandable entries: click to reveal raw transcription diff, style/level info, and action buttons
  • Copy / Copy raw / Remove buttons per entry, Clear all to wipe history
  • Live refresh: History tab updates instantly after each dictation via history:updated IPC broadcast to all windows
  • Thread-safe JSON persistence with async file lock, auto-capped at 500 entries
  • Stored in history.json in the Electron userData directory

Configurable Dictation Hotkey

  • Fn key as default (Preferences toggle: "Use Fn key")
  • Custom hotkey recorder: click the button, press any key or combination (e.g. Cmd+Option, Shift+Space, right Option)
  • Supports three modes: single modifier key, modifier combo, and regular key with modifiers
  • Swift helper dynamically accepts keyCode + modifiers as CLI arguments
  • Hotkey listener auto-restarts when the shortcut changes
  • src/shared/hotkeys.ts: macOS key code mappings, modifier flags, label builder

Optional Clipboard Copying

  • New "Copy to clipboard" toggle in Preferences (default: off)
  • When off and auto-paste is on: clipboard is used temporarily for Cmd+V, then cleared after 500ms
  • When off and auto-paste is off: clipboard is never touched (text only in history)
  • Status messages reference history instead of clipboard

Bug Fixes

  • Clipboard race condition: clipboard restore executed before target app processed Cmd+V. Root cause: CGEvent post(tap:) is fire-and-forget, the Swift process exits before the target app reads the clipboard. Fix: clear clipboard after 500ms delay instead of immediate restore.
  • Listening stuck active: short key taps (< 300ms) started recording even after key release. Root cause: double-click timer ignored the up event, then fired handleHotkeyDown with no subsequent up to stop it. Fix: track physical key state via keyHeldRef and cancel recording if key was released when timer fires.
  • Double-click regression: the listening fix initially broke double-click detection for handsfree mode. Root cause: timer was cancelled on up event, preventing the second down from recognizing a double-click. Fix: only set keyHeldRef = false on up, do not cancel the timer.

What changed

File Change
src/shared/hotkeys.ts New: key code mappings, modifier flags, Fn key constants, label builder
src/shared/types.ts HotkeyConfig, HistoryEntry, copyToClipboard in AppSettings
src/main/history.ts New: load/add/remove/clear with async lock, 500 entry cap
src/main/dictation.ts Clipboard logic rewrite (race fix), style/enhancement in result
src/main/ipc.ts History handlers, broadcast dependency, hotkey restart on settings change
src/main/native-helper.ts HotkeyConfig passed to Swift listener as CLI args
src/main/index.ts restartHotkeyListener, broadcast to IPC dependencies
src/main/defaults.ts Fn key as default hotkey, copyToClipboard: false
src/preload/index.ts History bridge, onHistoryUpdated listener
src/renderer/env.d.ts Window types for history and hotkey
src/renderer/App.tsx HistoryPage, HotkeyRecorder with Fn toggle, live history refresh, keyHeldRef fix
src/renderer/styles.css History entries, hotkey recorder styles
swift/OpenWhispHelper.swift Configurable keyCode/modifiers, Fn key (maskSecondaryFn), 3 hotkey modes

Test plan

  • History: dictate something, entry appears in History tab immediately (live refresh)
  • History: click entry to expand, shows raw transcription and style info
  • History: Copy / Copy raw / Remove buttons work
  • History: Clear all removes everything, shows empty state
  • History: quit and reopen, entries persist
  • Hotkey Fn: toggle "Use Fn key" on, hold Fn to dictate
  • Hotkey custom: toggle Fn off, click recorder, press right Option, verify it works
  • Hotkey combo: record Cmd+Option as hotkey, verify dictation starts with combo
  • Hotkey restart: change hotkey in Preferences, verify new key works immediately
  • Clipboard off + paste on: dictation pastes correctly, clipboard is empty after 500ms
  • Clipboard on + paste on: dictation pastes correctly, text stays in clipboard
  • Clipboard off + paste off: clipboard untouched, text only in history
  • Listening bug: short tap (< 300ms) must NOT start recording
  • Double-click: double-tap hotkey enters handsfree mode, single tap to exit
  • Hold: hold hotkey > 300ms, speak, release to process

nicremo added 12 commits April 13, 2026 00:29
- Cloud transcription via Groq/OpenAI-compatible APIs (whisper-large-v3)
- Auto/Cloud/Local transcription mode with automatic offline fallback
- API key encrypted via macOS Keychain (Electron safeStorage)
- Default text model changed from gemma4:e4b (9.6GB) to qwen3.5:2b (2.7GB)
- Configurable API base URL (Groq, OpenAI, Lemonfox, any compatible provider)
- Language selector (German default, 11 languages available)
- Stronger same-language prompt to prevent LLM translation
- Built-in microphone preferred over external devices (AirPods fix)
- New TranscriptionCard UI with source selector, API key management
- Setup wizard with cloud/local transcription choice
- Relaxed hotkey validation: Ollama not required when enhancement is off
Dictionary & Corrections:
- Custom vocabulary tab with words and misspelling corrections
- Words sent as Whisper prompt hints for better transcription
- Corrections injected into LLM system prompt for auto-replacement
- Two-column layout: Words (left) + Misspellings (right)
- Async file lock prevents race conditions on concurrent writes
- Whisper prompt truncated at ~800 chars (224 token limit)
- IPC handlers with runtime input validation

LLM Performance:
- Disable thinking mode (think: false) for qwen3.5 models
- Reduces rewrite time from ~14s to ~0.3s
- Strip <think> tags from output as safety fallback

Pipeline logging:
- Log transcription settings, raw text, and final text for debugging
App Rules:
- Auto-detect active app and switch style/enhancement level
- 31 pre-configured dev tools (Ghostty, VS Code, Cursor, Zed, JetBrains, etc.)
- All terminals, IDEs, git clients, API/DB tools default to Vibe Coding High
- Non-dev apps use the user's manual default setting
- Editable per-app style and level on the Style page
- Rules stored in app-rules.json, add/remove/update via UI

Credits:
- Sidebar footer shows "Original by @giusmarci" and "Enhanced by Fabian Bitz"
Hotkey:
- Switched from Fn to right Option key (keyCode 61)
- Hold to record (release to transcribe)
- Double-press for Handsfree mode (records until next single press)

Handsfree Mode:
- Debounced double-click detection (300ms window)
- Red pulsing dot indicator in overlay when active
- Red border around overlay bar during handsfree recording
- Label shows "Handsfree" to distinguish from hold mode
- Single press while handsfree stops recording and transcribes

Idle Overlay:
- Minimal 5-dot indicator always visible at bottom of screen
- Shows app is ready without being intrusive
- Same border-radius (14px) as active overlay
- Semi-transparent with backdrop blur

German Language Fix:
- Language-specific reinforcement prompts written in target language
- German prompt instructs LLM in German to output German
- Preserves English technical terms when present in input
- Dynamic prompt generation based on cloudLanguage setting

Build:
- Moved electron/electron-builder to devDependencies for packaging
App Icon:
- Generated .icns from OpenWhisp logo (339px PNG -> all icon sizes)
- Placed in build/icon.icns for electron-builder to pick up

Tray Icon:
- Added build/icons to extraResources so tray icons are bundled
- Fixes missing menubar icon in packaged app

Microphone Permission:
- Check current status before requesting
- If already denied, open System Settings directly (askForMediaAccess
  shows no dialog when previously denied)
- Extended wait timeout to 15 attempts for manual permission grant
Root cause: Hardened Runtime blocks microphone access without explicit
com.apple.security.device.audio-input entitlement. The app never appeared
in macOS Microphone privacy list because the OS silently denied the request.

- Added build/entitlements.mac.plist with audio-input entitlement
- Configured electron-builder to use entitlements for both main and child processes
- Improved mic permission flow: opens System Settings when previously denied
- Cloud rewrite as primary, Ollama as fallback (same pattern as transcription)
- Default model: openai/gpt-oss-20b (1000 tokens/sec, practically free)
- 5 cloud models available: GPT-OSS 20B/120B, Qwen3 32B, Llama 3.3 70B, Llama 3.1 8B
- Cloud/Local toggle on Models page under Text Enhancement
- Auto-fallback to local Ollama when cloud is unavailable
- Ollama no longer required when rewrite mode is Cloud
- Same API key and base URL as transcription (Groq)
History:
- New History tab with persistent storage of all dictations
- Each entry stores final text, raw transcription, style, app name, timestamp
- Expandable entries with copy/remove actions and raw transcription diff
- Thread-safe JSON persistence with async lock (max 500 entries)

Clipboard:
- New "Copy to clipboard" toggle in Preferences (default: off)
- Clipboard write only happens when explicitly enabled or when auto-paste needs it
- Status messages updated to reference history instead of clipboard
- Add HotkeyConfig type with keyCode, modifiers, and label
- Swift helper accepts keyCode + modifiers as CLI arguments
- Support three hotkey modes: single modifier (e.g. Fn), modifier combo
  (e.g. Cmd+Option), and regular key with modifiers (e.g. Cmd+Space)
- Fn key (keyCode 63) as default with maskSecondaryFn detection
- Hotkey recorder UI in Preferences with Fn toggle and custom key capture
- Listener auto-restarts when hotkey setting changes
- Escape to cancel recording, unknown keys ignored
Clipboard: previous fix restored clipboard content immediately after
triggerPaste, but CGEvent key posts are fire-and-forget. The target app
had not processed Cmd+V yet, so it read the old clipboard content.
Fix: clear clipboard after 500ms delay instead of restoring.

Listening: short key taps (< 300ms) started recording via the
double-click timer even after key release. Fix: track physical key
state and skip recording if key was already released when timer fires.

History: added live refresh via history:updated IPC broadcast so the
History tab updates immediately after each dictation.
- IPC handler broadcasts history:updated to all windows after dictation
- Preload exposes onHistoryUpdated listener for live UI refresh
- Settings update triggers hotkey listener restart when hotkey changes
- Add broadcast dependency to IPC handler interface
@nicremo nicremo force-pushed the feat/history-clipboard-settings branch from f7794e5 to 7c7dfcb Compare April 13, 2026 12:46
@nicremo nicremo changed the title feat: add history tab and optional clipboard copying feat: history tab, configurable hotkey, clipboard control Apr 13, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant