Replace BlackHole with ScreenCaptureKit for audio capture #2
Open
morellid wants to merge 20 commits into
ScreenCaptureKit helper that captures audio from a specific app by bundle ID and writes raw float32 PCM (16 kHz mono) to stdout.
- sck-capture <bundle-id>: capture audio, pipe PCM to stdout
- sck-capture --check: exit 0 if Screen Recording permission granted
- Status/errors to stderr, READY signal when capture starts
- Handles SIGTERM/SIGINT for clean shutdown
- Requires macOS 13+, built with: cd swift/sck-capture && swift build
ctypes bindings to CGPreflightScreenCaptureAccess / CGRequestScreenCaptureAccess from CoreGraphics. No PyObjC dependency needed.
Manages the sck-capture Swift helper subprocess. Reads raw float32 PCM from stdout in 1024-sample chunks, collects frames under a lock — same format as sounddevice callbacks for easy mixing.
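The chunked read described above might look roughly like the following. This is a sketch, not the code in sck.py: the class name `PcmCollector` is hypothetical, it uses the standard-library `array` module instead of whatever buffer type the recorder uses, and it assumes the helper writes float32 samples in native byte order.

```python
import threading
from array import array

CHUNK_SAMPLES = 1024      # samples per read, as stated in the description
BYTES_PER_SAMPLE = 4      # float32


class PcmCollector:
    """Collects float32 PCM chunks read from a binary stream."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frames = []  # list of array('f') chunks

    def read_stream(self, stream) -> None:
        """Read until EOF; intended to run on a background thread."""
        chunk_bytes = CHUNK_SAMPLES * BYTES_PER_SAMPLE
        while True:
            data = stream.read(chunk_bytes)
            if not data:
                break  # EOF: the helper process closed stdout
            # Drop a trailing partial sample, which can only occur at EOF.
            usable = len(data) - len(data) % BYTES_PER_SAMPLE
            if usable:
                chunk = array("f")
                chunk.frombytes(data[:usable])
                with self._lock:
                    self._frames.append(chunk)

    def samples(self) -> array:
        """Return everything collected so far as one flat array."""
        out = array("f")
        with self._lock:
            for chunk in self._frames:
                out.extend(chunk)
        return out
```

In use, `stream` would be the `stdout` of a `subprocess.Popen` started with `stdout=subprocess.PIPE`, and `read_stream` would run on a worker thread while the main thread waits for stop.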
When app_bundle_id is provided, captures two streams in parallel:
- ScreenCaptureKit for meeting app audio (device-independent)
- sounddevice for mic (user's voice)
Streams are mixed on stop() with clipping. Falls back to mic-only when no bundle ID, no SCK binary, or no permission. Removes BlackHole detection (find_blackhole_device, list_input_devices).
If sounddevice reports a device error (e.g. Bluetooth earbuds disconnect mid-meeting), auto-restart the mic stream with the new default input device. Brief ~1s gap in mic audio, no crash. SCK capture is unaffected by device changes.
detect_meeting() now returns (name, bundle_id) tuple. Native apps map to static bundle IDs. Browser-based meetings return the bundle ID of whichever browser (Chrome, Safari) the meeting tab was found in. on_start callback signature updated to (meeting_name, bundle_id).
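To illustrate the new return shape, here is a hedged sketch of how a name could be resolved to a (name, bundle_id) tuple. The helper `resolve_meeting`, its parameters, the returned meeting names, and the contents of both tables are assumptions for illustration; the real detect_meeting() inspects running apps and browser tabs itself and may cover more apps. Returning None when nothing is found is also an assumption.

```python
from typing import Iterable, Optional, Tuple

# Illustrative tables only; the actual set of supported apps may differ.
NATIVE_APP_BUNDLE_IDS = {
    "Zoom": "us.zoom.xos",
}
BROWSER_BUNDLE_IDS = {
    "Chrome": "com.google.Chrome",
    "Safari": "com.apple.Safari",
}


def resolve_meeting(
    running_apps: Iterable[str],
    browser_with_meeting_tab: Optional[str] = None,
) -> Optional[Tuple[str, str]]:
    """Return (meeting_name, bundle_id), or None if no meeting is found."""
    # Native apps map to a static bundle ID.
    for app in running_apps:
        if app in NATIVE_APP_BUNDLE_IDS:
            return app, NATIVE_APP_BUNDLE_IDS[app]
    # Browser meetings use the bundle ID of the browser hosting the tab.
    if browser_with_meeting_tab in BROWSER_BUNDLE_IDS:
        name = f"Browser meeting ({browser_with_meeting_tab})"
        return name, BROWSER_BUNDLE_IDS[browser_with_meeting_tab]
    return None
```

An on_start callback with the updated signature would then receive the two tuple elements as `(meeting_name, bundle_id)`.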
- menu_bar: pass bundle_id from watcher, use audio_source_description
- mcp_server: use Recorder() without device arg
- cli watch: pass bundle_id from watcher on_start callback
Removes all BlackHole references from consumer sites.
- Replace BlackHole install step with Xcode CLI tools check and SCK binary build (swift build -c release, copies to ~/.local/share/trnscrb/)
- Add Screen Recording permission check to install wizard
- Add helper functions: _xcode_cli_installed, _sck_binary_built, _build_sck_helper
- Remove _blackhole_installed helper
- Fix devices command to use sounddevice directly (Recorder.list_input_devices removed)
- Fix mic-status command for detect_meeting() returning (name, bundle_id) tuple
- Replace pip install with uv add in package install step
- Replace BlackHole references with ScreenCaptureKit audio capture
- Document dual-source recording (SCK + mic), BLE device resilience
- Add Xcode CLI tools to requirements
- Update install guide to reflect new setup steps
- Add sck.py and screen_capture.py to CLAUDE.md architecture
Normalize mic RMS to match SCK RMS before mixing so Whisper hears both the user's voice and remote participants at similar levels. Gain is capped at 5x to avoid amplifying background noise.
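The normalization and mix described above can be sketched in plain Python as follows. The function names are hypothetical, and two details are assumptions rather than facts from the commit: the gain is allowed to drop below 1 when the mic is louder than the SCK stream, and the shorter stream is padded with silence.

```python
import math
from typing import List, Sequence

MAX_MIC_GAIN = 5.0  # cap stated in the commit message


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix(mic: Sequence[float], sck: Sequence[float]) -> List[float]:
    """Scale mic toward the SCK level, sum both, and clip to [-1, 1]."""
    mic_rms, sck_rms = rms(mic), rms(sck)
    gain = 1.0
    if mic_rms > 0.0 and sck_rms > 0.0:
        gain = min(sck_rms / mic_rms, MAX_MIC_GAIN)
    mixed = []
    for i in range(max(len(mic), len(sck))):
        m = mic[i] * gain if i < len(mic) else 0.0  # pad with silence
        s = sck[i] if i < len(sck) else 0.0
        mixed.append(max(-1.0, min(1.0, m + s)))
    return mixed
```

Without the cap, a near-silent mic would receive a very large gain and the mix would mostly amplify room noise, which is the case the 5x limit guards against.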
Only transition from warming to recording when a meeting app or browser meeting tab is actually detected. Prevents false triggers from YouTube, Spotify, or other non-meeting mic usage.
After leaving a Google Meet call, Chrome navigates to meet.google.com/landing — the browser tab script was still matching this as an active meeting, preventing auto-stop. Exclude /landing and root Meet URLs in both Chrome and Safari tab checks.
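The exclusion rule can be expressed as a small predicate. This is a Python sketch of the logic only; the function name `is_active_meet_url` is hypothetical, and the actual check lives in the browser tab scripts rather than in a function like this.

```python
from urllib.parse import urlparse


def is_active_meet_url(url: str) -> bool:
    """Return True only for Meet URLs that look like an active call."""
    parsed = urlparse(url)
    if parsed.hostname != "meet.google.com":
        return False
    path = parsed.path.rstrip("/")
    if path == "":
        return False  # root Meet URL, no call in progress
    if path == "/landing" or path.startswith("/landing/"):
        return False  # page shown after leaving a call
    return True
```

Query strings are ignored because only the parsed path is compared, so a landing URL with extra parameters is still excluded.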
- Add Firefox window-title based detection for Meet, Teams, Zoom
- Only match "Meet – <code>" pattern (active call), not bare "Google Meet" title (landing page after leaving)
- Firefox bundle ID (org.mozilla.firefox) passed to SCK for capture
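A hedged sketch of the title rule for Meet: match a title that starts with "Meet", a dash, and a code, and reject the bare "Google Meet" title. The regular expression and the function name are assumptions; the exact pattern used in watcher.py is not shown in this PR description.

```python
import re

# "Meet", an en dash or hyphen, then a non-empty meeting code.
_MEET_ACTIVE_TITLE = re.compile(r"Meet\s+[–-]\s+\S+")


def is_active_meet_title(window_title: str) -> bool:
    """Return True if a Firefox window title looks like an active Meet call."""
    # re.match anchors at the start, so "Google Meet" does not qualify.
    return _MEET_ACTIVE_TITLE.match(window_title) is not None
```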
- Copy sck-capture Swift source into trnscrb/sck-capture/ so it ships with pip/uv installs (setuptools package-data)
- Update _build_sck_helper() to find bundled source first, fall back to repo root swift/ for development
- Update sck.py find_binary() to check bundled build dir
- Fix package install step to use pip (uv add won't work for end users)
- Add .build/ to .gitignore for Swift build artifacts
Both mic and SCK audio are now written to temp files during recording instead of accumulating in Python lists. This keeps memory usage constant regardless of meeting length (~4 KB vs ~460 MB for 1h). On stop, temp files are read back for mixing and transcription. Files use /tmp with trnscrb_ prefix and are cleaned up after use. On crash, the OS eventually purges them.
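The ~460 MB figure follows from 16,000 samples/s × 4 bytes × 3,600 s ≈ 230 MB per stream, times two streams. The sketch below shows that arithmetic and one way to spool chunks to a temp file. The class name `SpooledTrack` is hypothetical, and the sketch uses the system default temp directory via `tempfile.mkstemp` rather than a hard-coded /tmp path.

```python
import os
import tempfile
from array import array

SAMPLE_RATE = 16_000      # Hz, mono
BYTES_PER_SAMPLE = 4      # float32


def bytes_per_hour(streams: int = 2) -> int:
    """Raw PCM size for one hour of audio across the given streams."""
    return SAMPLE_RATE * BYTES_PER_SAMPLE * 3600 * streams


class SpooledTrack:
    """Appends float32 chunks to a temp file instead of a Python list."""

    def __init__(self):
        fd, self.path = tempfile.mkstemp(prefix="trnscrb_", suffix=".f32")
        self._file = os.fdopen(fd, "wb")

    def write(self, chunk: array) -> None:
        """Append one chunk; memory use stays at roughly one chunk."""
        chunk.tofile(self._file)

    def finish(self) -> array:
        """Close the file, read all samples back, and delete the file."""
        self._file.close()
        samples = array("f")
        with open(self.path, "rb") as f:
            samples.frombytes(f.read())
        os.remove(self.path)
        return samples
```

Reading everything back in `finish` still loads the full recording at stop time, which matches the description above: memory stays constant during recording, not during mixing.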
Firefox window titles don't change reliably after leaving a call
("Meet – code" stays even after leaving), which prevents auto-stop.
Split browser scripts into broad (all browsers, for start) and
narrow (Chrome/Safari only, for stop). Firefox meetings now stop
via mic-idle detection instead.
Back-to-back meetings: if Meeting A is still transcribing when Meeting B starts, the recorder was blocked. Now only blocks if already recording (not transcribing), since transcription runs in a separate thread and doesn't need the recorder.
Summary
Key changes
- swift/sck-capture/: New Swift helper binary for ScreenCaptureKit audio capture
- trnscrb/sck.py: Python subprocess wrapper for the Swift helper
- trnscrb/recorder.py: Rewritten with dual-source SCK+mic capture
- trnscrb/screen_capture.py: Screen Recording permission check via CoreGraphics
- trnscrb/watcher.py: Enhanced meeting detection with app gating, Firefox support, grace period fixes
- trnscrb/menu_bar.py: Updated for new recorder API and auto-enrichment
- trnscrb/settings.py: Added auto_enrich setting
Why ScreenCaptureKit over BlackHole
BlackHole requires installing a kernel extension / virtual audio device, which is fragile across macOS updates and requires user configuration. ScreenCaptureKit is a first-party Apple API (macOS 13+) that captures app audio directly — no extra install, no audio routing changes, and it captures the specific meeting app rather than all system audio.