Ci/GitHub workflows by AlleyBo55 · Pull Request #30 · AlleyBo55/VoiceBridge

AlleyBo55 · 2026-04-23T15:05:17Z

No description provided.

…le output
- Real mic capture via ffmpeg (avfoundation/pulse/dshow)
- ElevenLabs Scribe v2 Realtime STT with auto-reconnect
- Streaming LLM translation (OpenRouter/OpenAI/Anthropic)
- ElevenLabs Flash v2.5 TTS with on-demand reconnect
- BlackHole virtual mic output via ffmpeg audiotoolbox
- Auto-calibrate VAD threshold from mic noise floor
- Stable-partial timer triggers translation after 1.2s pause
- Delayed TTS flush batches translations (300ms window)
- Device selection by name (survives plug/unplug)
- Mic selector in settings, BlackHole disabled as input
- VAD sensitivity selector (low/medium/high)
- Curated model dropdown per LLM provider
- Backpressure handling for ffmpeg output pipe


…onnect issues
- REST TTS: POST /v1/text-to-speech/{voiceId} with pcm_24000 output
- Each utterance is an independent HTTP request, with no shared WebSocket state
- No flush/isFinal/reconnect dance: just request → audio → play
- Resample 24kHz→48kHz and write to BlackHole in one shot
- Remove all TTS WebSocket code (connect, heartbeat, queue, flush timer)
- LLM still streams tokens but collects full translation before TTS
- Simpler, more reliable, works every time
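
For reference, the per-utterance REST request could be assembled roughly as below. This is a sketch under assumptions, not the code from this PR: `buildTtsRequest` and `TtsRequest` are hypothetical names, the host and `xi-api-key` header follow ElevenLabs' public REST API, and the `eleven_flash_v2_5` model ID is the one named elsewhere in these commit messages.

```typescript
// Hypothetical request shape; compatible with fetch(url, init).
interface TtsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build one independent HTTP request per utterance (no shared socket state).
function buildTtsRequest(voiceId: string, text: string, apiKey: string): TtsRequest {
  const base = "https://api.elevenlabs.io/v1/text-to-speech";
  return {
    // output_format=pcm_24000 asks for raw PCM at 24 kHz.
    url: `${base}/${encodeURIComponent(voiceId)}?output_format=pcm_24000`,
    method: "POST",
    headers: { "xi-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({ text, model_id: "eleven_flash_v2_5" }),
  };
}

// Usage (assumed): const r = buildTtsRequest(id, text, key);
//                  const res = await fetch(r.url, r);
//                  const pcm = new Uint8Array(await res.arrayBuffer());
```

Keeping the request builder separate from the network call makes the request easy to inspect in logs and tests.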

… OBS frame buffer
- Every utterance is queued and processed one at a time
- No concurrent translations/TTS, which removes the races between overlapping utterances
- Queue drains serially: translate → TTS → write → next
- Added delay is acceptable; no utterance is dropped
- Queue cleared on session stop
- Log shows queue depth for monitoring
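
The serial drain described above can be sketched as follows. The names (`UtteranceQueue`, `Utterance`, `handle`) are hypothetical and the pipeline's real types are not shown in this PR; the handler stands in for the translate → TTS → write step.

```typescript
// Hypothetical utterance shape for illustration only.
interface Utterance { id: number; text: string; }
type Handler = (u: Utterance) => Promise<void>;

class UtteranceQueue {
  private pending: Utterance[] = [];
  private draining = false;
  private readonly handle: Handler;

  constructor(handle: Handler) {
    this.handle = handle;
  }

  /** Number of utterances waiting (excludes the one in flight). */
  get depth(): number {
    return this.pending.length;
  }

  enqueue(u: Utterance): void {
    this.pending.push(u);
    // Only one drain loop runs at a time, so handlers never overlap.
    if (!this.draining) void this.drain();
  }

  /** Drop everything still waiting, e.g. on session stop. */
  clear(): void {
    this.pending.length = 0;
  }

  private async drain(): Promise<void> {
    this.draining = true;
    try {
      while (this.pending.length > 0) {
        const next = this.pending.shift() as Utterance;
        try {
          await this.handle(next); // translate → TTS → write
        } catch {
          // A failed utterance is skipped so the queue keeps draining.
        }
      }
    } finally {
      this.draining = false;
    }
  }
}
```

Because `enqueue` never starts a second drain loop, the second utterance cannot begin until the first handler's promise settles, which trades latency for ordering.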

…onboarding
- Add push-to-talk mode (default ON): hold SPACE or button to talk
- PTT uses manual commit strategy, prevents double-commit/throttle
- Fix STT: proper query params (audio_format, commit_strategy, language_code)
- Fix STT: add commit field to input_audio_chunk (required by API)
- Fix STT: auto-reconnect on close + 10s watchdog for silent connections
- Fix STT: handle commit_throttled gracefully without crash loop
- Remove dead TTS WebSocket code (now REST-only)
- Add speaker playback via afplay (temp WAV per chunk, macOS built-in)
- Add prerequisites onboarding step (ffmpeg, sox, BlackHole per platform)
- Add prerequisite check/install IPC handlers
- Fix auto-calibration: only use silence samples for noise floor
- Fix VAD default to high sensitivity
- Clean up unused imports and fields

- Auto-restart session on language change (stop + start with new lang)
- Reset STT connection after each PTT release (1.5s delay for transcript)
- Prevents accumulated buffer from concatenating across PTT presses
- Disable watchdog in PTT mode (silence between presses is expected)
- Default VAD sensitivity to high

- Replace persistent ffmpeg pipe with per-utterance short-lived processes
- Each TTS chunk spawns fresh ffmpeg → BlackHole (stdin + EOF = play + exit)
- Speaker playback via afplay temp WAV (unchanged)
- Eliminates audiotoolbox internal buffering that caused delayed/stacked playback
- Each PTT press-release is fully independent: fresh STT, fresh audio output

- Add screenshots (onboarding, voice clone, main view, settings)
- Add demo.mov video
- Rewrite README with demo/screenshots at top, push-to-talk docs
- Add Remotion reel project (reel/) for IG/TikTok promo video
- Add public/ folder for media assets
- Exclude reel/out/ from git


writeVirtualMic was playing TTS through both BlackHole (virtual mic) AND system speakers via afplay. The speaker output caused double sound since the meeting app already plays the virtual mic audio back to the user. Now only writes to BlackHole — user hears translation through the meeting app's audio output.

In PTT mode, the STT WebSocket was kept alive between presses, receiving no audio, causing ElevenLabs to close it every ~5s. The auto-reconnect handler would immediately reconnect, creating an infinite connect/close cycle that wasted API quota and caused race conditions when PTT was pressed during a reconnect. Now: STT connects on PTT press, disconnects after getting the committed transcript on release. No reconnect loop when idle.

… to BlackHole
The old approach piped raw PCM to ffmpeg stdin with an empty output filename, causing ffmpeg to misinterpret the stream and produce garbled, distorted audio. Now writes a proper WAV file with correct headers and lets ffmpeg read it, which also handles resampling from 24kHz (TTS output) to BlackHole's native rate internally. Removed the manual linear interpolation resampler; ffmpeg's built-in sinc resampler produces much cleaner output.
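
As an illustration of "a proper WAV file with correct headers", the sketch below wraps raw PCM in the standard 44-byte RIFF/WAVE header. It assumes the TTS output is 16-bit little-endian mono at 24 kHz; `pcm16ToWav` is a hypothetical helper, not a function from this PR.

```typescript
// Wrap raw 16-bit PCM in a canonical 44-byte WAV header.
// Assumption: samples are signed 16-bit little-endian.
function pcm16ToWav(pcm: Uint8Array, sampleRate = 24000, channels = 1): Uint8Array {
  const bytesPerSample = 2;
  const blockAlign = channels * bytesPerSample;
  const out = new Uint8Array(44 + pcm.length);
  const view = new DataView(out.buffer);
  const ascii = (offset: number, s: string): void => {
    for (let i = 0; i < s.length; i++) view.setUint8(offset + i, s.charCodeAt(i));
  };

  ascii(0, "RIFF");
  view.setUint32(4, 36 + pcm.length, true);          // file size minus 8 bytes
  ascii(8, "WAVE");
  ascii(12, "fmt ");
  view.setUint32(16, 16, true);                      // fmt chunk size for PCM
  view.setUint16(20, 1, true);                       // audio format 1 = PCM
  view.setUint16(22, channels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * blockAlign, true); // byte rate
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, 16, true);                      // bits per sample
  ascii(36, "data");
  view.setUint32(40, pcm.length, true);              // data chunk size
  out.set(pcm, 44);
  return out;
}
```

With the sample rate declared in the header, ffmpeg can pick the input format from the file itself instead of guessing from a headerless pipe.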

- Added dist:mac, dist:win, dist:linux scripts using electron-builder
- Added electron-builder config in package.json (dmg, nsis, deb)
- Workflows now trigger on ci/github-workflows branch for testing
- GitHub Releases only created on master push
- All platforms upload build artifacts for branch builds

Root cause: raw PCM piped through ffmpeg's audiotoolbox output was producing garbled, distorted audio due to format mismatches and pipe buffering issues.
Fix:
- Request mp3_44100_128 from ElevenLabs instead of pcm_24000
- Write MP3 to temp file (proper encoded format, no header issues)
- Use sox with coreaudio output to play directly to BlackHole
- sox handles decoding, resampling, and device output natively
- Falls back to afplay if sox is unavailable
- Linux: uses sox with pulseaudio output to voicebridge sink

…itoring
- Fixed ffmpeg audiotoolbox output (empty string arg works from Node spawn)
- Fixed device discovery command to use proper dummy input
- Added debug logging for ffmpeg exit codes
- After playing to BlackHole, also plays through speakers via afplay so user can hear the translation locally without a meeting app


…2kbps
- stability: 0.5 → 0.75 (less random variation, more natural for clones)
- similarity_boost: 0.75 → 0.85 (closer to original voice)
- style: 0.3 → 0.05 (minimal expressiveness, avoids pitch distortion)
- Added use_speaker_boost: true (normalizes volume)
- Bumped MP3 to 192kbps for cleaner audio

…ic prosody
- Model: eleven_flash_v2_5 → eleven_multilingual_v2 (better pitch/tone for Japanese, Korean, Hindi, etc.)
- stability: 0.75 → 0.5 (allow natural intonation variation)
- style: 0.05 → 0.35 (enable language-appropriate expressiveness)
- similarity_boost: 0.85 → 0.8 (balanced clone fidelity vs naturalness)

ElevenLabs output is too quiet by default. Added:
- ffmpeg -af volume=3.0 for BlackHole output (3x amplification)
- afplay --volume 2 for speaker output (2x)
- Same boost on Linux pulse output

Previous approach: volume filter on ffmpeg audiotoolbox output didn't actually boost the audio heard through speakers (afplay was separate). New approach: first convert MP3 → boosted WAV using loudnorm (broadcast standard -14 LUFS) + volume=2.0, then play the boosted WAV through both BlackHole and speakers. Both outputs get the same loud audio.
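
The MP3 → boosted WAV conversion step might look like the argument list below. This is only a sketch: `boostArgs` and the file paths are hypothetical, and the filter string simply chains ffmpeg's `loudnorm` filter (integrated loudness target `I=-14`) with `volume=2.0` as the commit message describes.

```typescript
// Build ffmpeg arguments that normalize loudness to -14 LUFS, then double the gain.
// The resulting file is what both outputs would play.
function boostArgs(inputMp3: string, outputWav: string): string[] {
  return [
    "-y",                              // overwrite the temp output if it exists
    "-i", inputMp3,
    "-af", "loudnorm=I=-14,volume=2.0",
    outputWav,
  ];
}

// Usage (assumed): spawn("ffmpeg", boostArgs("/tmp/tts.mp3", "/tmp/tts-boosted.wav"))
```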

Playing to both BlackHole (ffmpeg) and speakers (afplay) simultaneously created a reverb effect due to timing offset between the two processes. Now only plays to BlackHole. Dropped loudnorm filter (added latency), kept simple volume=3.0 boost.

Create a volume-boosted MP3 (6x) first, then play that same loud file to both BlackHole and speakers. Both outputs get the same boosted audio. Previous approach only boosted BlackHole (ffmpeg volume filter) but afplay got the quiet original.

…-addon
- rootDir: src/main → src (includes shared/ and native/ directories)
- Removed string fallback returns from #findBlackHoleIndex (was returning 'BlackHole 2ch' string where number | null was expected)

… BlackHole
Root cause of double playback: audio went to both BlackHole (ffmpeg) and speakers (afplay). With headphones, both outputs were audible. With MacBook speakers, only afplay was heard (nothing reading BlackHole).
Fix: removed BlackHole ffmpeg output entirely. Now only plays through the default audio output (afplay), which works the same on speakers and headphones. One play, no reverb, no double sound.
For meetings: user sets up a macOS Multi-Output Device (BlackHole + speakers) in Audio MIDI Setup, or the meeting app monitors BlackHole separately.

- settings-store: use Record<string, unknown> cache to avoid Partial type issues
- driver-installer: remove unused exec import and #nativeAddon field
- electron-ipc: remove unused VALID_RENDERER_CHANNELS import
- main.ts: fix unused vars, exactOptionalPropertyTypes, IPC type mismatches
- native-addon: suppress unused #bhIdx and #findBlackHoleIndex warnings

- Remove unused _p progress helper in driver-installer.ts #installLinux
- Replace loose _tray/_autoStart vars with retained object to satisfy noUnusedLocals
- Widen tsconfig.renderer.json rootDir from src/renderer to src (includes src/shared)

- electron-builder requires electron in devDependencies, not dependencies
- Add missing author field to satisfy electron-builder validation

CI was running tsc, which output to dist/main/main/main.js due to rootDir: src, but electron-builder expected dist/main/main.js. The preload script was also not built in CI at all.
- Add desktop/scripts/build.mjs (esbuild production bundler)
- Update build:main script to use esbuild instead of tsc
- Update all three CI workflows (macOS, Windows, Linux)

- Add homepage, repository, author email (required for Linux .deb maintainer)
- Set publish: null to prevent auto-update info generation errors on all platforms

- Output main process as .cjs to avoid 'require is not defined' in ES module scope (package.json has type: module, esbuild outputs CJS)
- Fix production renderer load path (was dist/renderer/src/renderer/index.html, now dist/renderer/index.html)
- Add custom VoiceBridge app icon (Nothing design: black bg, bridge motif, red accent dot, dot grid, monospace VB)
- Generate icon.icns (macOS), icon.ico (Windows), icon.png (Linux)
- Add generate-icons.mjs script for regenerating from SVG

Vite's default base: '/' produces absolute paths (/assets/...) which resolve to filesystem root under file:// in Electron. Set base: './' for relative paths.
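
A minimal config fragment showing the change, assuming a `vite.config.ts` at the renderer root (the project's actual config file and its other options are not shown in this PR):

```typescript
// vite.config.ts (sketch): only the `base` option is relevant here.
import { defineConfig } from "vite";

export default defineConfig({
  // Emit ./assets/... instead of /assets/... so the built index.html
  // resolves its scripts and styles relative to itself under file://.
  base: "./",
});
```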

- Production window: 420x680 (was 360x480), min 380x520
- Enable frame, resizable, and hiddenInset title bar for macOS
- Show in taskbar in both dev and production

Electron apps launched from Finder/Dock don't inherit the user's shell PATH. This caused createNativeAddon() to fall back to MockNativeAddon (silence) because 'which ffmpeg' failed without /opt/homebrew/bin or /usr/local/bin. Prepend common install locations to process.env.PATH at startup.
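
One way to express the PATH fix is a small pure helper, sketched below. `prependPaths` is a hypothetical name, and the directory list is just the two locations named in the commit message; the actual startup code is not shown in this PR.

```typescript
// Return a PATH string with `extra` directories in front,
// skipping any that are already present.
function prependPaths(current: string | undefined, extra: string[], sep = ":"): string {
  const existing = (current ?? "").split(sep).filter((p) => p.length > 0);
  const missing = extra.filter((p) => !existing.includes(p));
  return [...missing, ...existing].join(sep);
}

// Usage at app startup (assumed, before any `which ffmpeg` lookup):
//   process.env.PATH = prependPaths(process.env.PATH, ["/opt/homebrew/bin", "/usr/local/bin"]);
```

Doing this before the native addon is created means the ffmpeg lookup sees the same directories a terminal-launched process would.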

- Add NSMicrophoneUsageDescription to Info.plist via extendInfo
- Add entitlements.mac.plist with audio-input entitlement
- Update CI workflow to rename and upload both x64 and arm64 DMGs
- Release notes now show architecture table for downloads

AlleyBo55 added 30 commits April 22, 2026 20:58

fix(pipeline): TTS drops audio after reconnect, delay queue flush 500ms for init

12c427c

fix(pipeline): wait 500ms after TTS reconnect before sending text

ac32c63

fix: remove stale #translateAndSpeak reference — use #enqueueUtterance

853433c

chore: remove demo.mov from repo, use GitHub Issues upload instead

877b083

feat(ui): add listening indicator with spectrum animation to push-to-talk button

2894455

ci: add GitHub Actions workflows for macOS, Windows, and Linux builds

fca7644

Merge branch 'master' into ci/github-workflows

acba896

fix(ci): use project-specific tsconfigs for typecheck to resolve JSX errors

df0c3bc

fix(ui): restore PTT listening indicator with spectrum animation

9764cf2

fix(tts): reload voice profile before each TTS call to prevent stale voice ID

3567aae

Merge branch 'master' into ci/github-workflows

cb75e08

chore: update package-lock.json with electron-builder dependency

a26d489

fix(audio): remove afplay speaker playback to prevent double audio on MacBook speakers

2394072

fix(audio): play through speakers in parallel with BlackHole for local monitoring

54cf238

fix(audio): boost TTS output volume 3x via ffmpeg volume filter

1a72756


AlleyBo55 added 22 commits April 23, 2026 18:34

Merge branch 'master' into ci/github-workflows

e7e6cc0

fix(audio): restore #bhIdx field declaration and fix destroy cleanup

6cd04b1

fix(ci): fix rootDir for tsconfig.main.json and type errors in native-addon

61169a8

Merge branch 'master' into ci/github-workflows

e775c20

Merge branch 'master' into ci/github-workflows

d8b409c

Merge branch 'master' into ci/github-workflows

de03ea4

fix(ci): move electron to devDependencies and add author field

028ca3c


fix(build): add missing metadata for electron-builder packaging

0a735d6


Merge branch 'master' into ci/github-workflows

c87e086

fix(build): use relative asset paths for Electron file:// protocol

d82f1da


fix(ui): make production window resizable with proper sizing

269de67


docs(ci): add Gatekeeper xattr instructions to macOS release notes

3e98152

AlleyBo55 merged commit f877180 into master Apr 23, 2026
4 checks passed

AlleyBo55 merged 52 commits into master from ci/github-workflows
