feat: IDE context awareness for dictation prompts #326
Draft
gabrielste1n wants to merge 8 commits into main
Capture the active app name and window title when dictation starts, then inject that context into the AI system prompt so responses are more relevant to what the developer is working on.
- Swift one-shot binary (macos-context-capture) using AXUIElement API
- Parses file names from VS Code, Xcode, JetBrains, Sublime, Vim titles
- Graceful degradation: exit code 2 when Accessibility not granted
- Settings toggle (default on, macOS only) in General section
- Context flows: main → preload → useAudioRecording → audioManager → ReasoningService
- Hash-based build caching with cross-arch compilation support

- Add Windows context capture via C binary (Win32 GetForegroundWindow/GetWindowTextW/QueryFullProcessImageName)
- Add Linux support with auto-detected strategy: hyprctl (Hyprland), swaymsg (Sway), xdotool+xprop (X11/XWayland), gdbus (GNOME)
- Move filename parsing from Swift binary to shared JS (DRY across all 3 platforms) with IDE detection for VS Code, Xcode, JetBrains, Sublime Text, and Vim/Neovim
- Simplify macOS Swift binary (~80 lines removed)
- Remove macOS-only gate on Context Awareness settings toggle
Extend Swift binary to walk the Accessibility tree for VS Code family editors (VS Code, Cursor, Windsurf) to extract open tab names and sidebar file tree items. Add project name parsing from window titles. Enrich system prompt with project context and @project/filename tagging.
Resolve project name to filesystem path via VS Code/Cursor/Windsurf storage.json, then scan the directory for code files. Falls back from AX tree walking when sidebar items are unavailable (VS Code on macOS, all editors on Windows/Linux). Cached with TTL for zero-latency repeats.
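The fallback scan described in this commit could look roughly like the sketch below: a depth-limited walk that stops at a file cap, wrapped in a small TTL cache. The extension list, the skipped directory names, the 60-second TTL, and the function names `scanCodeFiles` and `getProjectFiles` are assumptions for illustration, not the PR's actual code. Depth 1 is taken to mean the project root itself.

```javascript
const fs = require("fs");
const path = require("path");

// Assumed values: the real extension list, skip list, and TTL may differ.
const CODE_EXTENSIONS = new Set([".js", ".ts", ".tsx", ".py", ".swift", ".c", ".json"]);
const SKIP_DIRS = new Set(["node_modules", ".git", "dist", "build"]);
const CACHE_TTL_MS = 60_000;

// root path -> { files, expiresAt }
const scanCache = new Map();

// Walk `root` up to `maxDepth` directory levels (root counts as level 1)
// and collect at most `maxFiles` code files as paths relative to `root`.
function scanCodeFiles(root, { maxDepth = 4, maxFiles = 200 } = {}) {
  const files = [];
  const walk = (dir, depth) => {
    if (depth > maxDepth || files.length >= maxFiles) return;
    let entries;
    try {
      entries = fs.readdirSync(dir, { withFileTypes: true });
    } catch {
      return; // unreadable directory: skip it rather than fail the capture
    }
    for (const entry of entries) {
      if (files.length >= maxFiles) return;
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        if (!SKIP_DIRS.has(entry.name)) walk(full, depth + 1);
      } else if (CODE_EXTENSIONS.has(path.extname(entry.name))) {
        files.push(path.relative(root, full));
      }
    }
  };
  walk(root, 1);
  return files;
}

// Return the cached list while it is fresh; rescan once the TTL has passed.
function getProjectFiles(root, now = Date.now()) {
  const hit = scanCache.get(root);
  if (hit && hit.expiresAt > now) return hit.files;
  const files = scanCodeFiles(root);
  scanCache.set(root, { files, expiresAt: now + CACHE_TTL_MS });
  return files;
}
```

Passing `now` as a parameter keeps the cache behavior easy to test without waiting for real time to elapse.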
VS Code extension panels (e.g., Claude Code) use a 2-part title format without the app name suffix: "Panel — project". The parser now checks if the last segment is a known app name to determine which part holds the project name.
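A minimal sketch of the check described in this commit, assuming titles are split on an em-dash surrounded by spaces and that the known app names are the three listed below. The function name `parseVSCodeTitle` and the returned field names are hypothetical.

```javascript
// Assumed list of app-name suffixes; the real parser may know more.
const KNOWN_APP_NAMES = new Set(["Visual Studio Code", "Cursor", "Windsurf"]);
const SEPARATOR = " \u2014 "; // em-dash with surrounding spaces

// Split a window title and decide which segment holds the project name.
// If the last segment is a known app name it is dropped first, so the
// project is always the last remaining segment.
function parseVSCodeTitle(title) {
  const parts = title
    .split(SEPARATOR)
    .map((segment) => segment.trim())
    .filter(Boolean);

  let appName = null;
  if (parts.length > 0 && KNOWN_APP_NAMES.has(parts[parts.length - 1])) {
    appName = parts.pop();
  }

  const projectName = parts.length > 0 ? parts[parts.length - 1] : null;
  // Anything before the project is the file (or panel) label.
  const fileOrPanel = parts.length > 1 ? parts[0] : null;
  return { fileOrPanel, projectName, appName };
}
```

With this shape, a 3-part title and a 2-part panel title both resolve the project from the same position, which is the point of checking the suffix first.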
- Persist contextAwarenessEnabled to .env and cache in windowManager so captureContext() is skipped when disabled (avoids ~5-50ms of native binary spawn per hotkey press when feature is off).
- Extract sendToggleDictation() and captureAppContext() helpers in windowManager, replacing five duplicated call sites across main.js and windowManager.js.
- Rename _commandCache to _cache in ContextCaptureManager since it holds polymorphic value shapes keyed by prefix.
Integrate 769 commits from main while preserving IDE context-awareness changes. Key integrations:
- Wire contextAwarenessEnabled into the expanded settingsStore / useSettings alongside new startMinimized, gcalAccounts, meetingDetection, and panel-position settings.
- Merge into windowManager's new sendToggleDictation/sendStartDictation helpers (which gained meetingDetectionEngine integration in main), preserving captureAppContext() gating.
- Pass config.appContext through the new processWithEnterprise reasoning path so Azure/Vertex/Bedrock providers also receive IDE context.
- Keep CONTEXT_AWARENESS_ENABLED in environment.js alongside the new START_MINIMIZED and PANEL_START_POSITION env keys.
- Update electron-builder.json to bundle windows-context-capture* alongside the new mic/text/AEC binaries.
Flagged by CodeQL. Inputs are internal constants so not exploitable, but the primitive is now correct for any future caller.
What this does
When you press the OpenWhispr hotkey, the app captures what you're working on — the foreground app, window title, current file, project name, open tabs, and a scan of project files — and includes that as context in the system prompt sent to the reasoning model.
Say "refactor this file to use async/await" in your editor and the model now actually knows which file and which project you mean. Say "where is the database config" and the model can reference real file paths from your project, formatted as @project/path/to/file.
Defaults to on; opt out via Settings → General → Context Awareness.
How it works
Capture: on every hotkey press, the main process spawns a small native helper (spawnSync, 500ms timeout) that returns JSON and exits:
- macOS: a Swift binary (macos-context-capture) uses NSWorkspace + the Accessibility API. For known Electron editors (VS Code, Cursor, Windsurf) it walks the AX tree to extract open tab names (AXTabGroup → AXRadioButton) and sidebar file names (AXOutline → AXRow). It has a 400ms internal deadline so it can't block.
- Windows: a C binary (windows-context-capture.exe) using GetForegroundWindow + QueryFullProcessImageNameA. Returns window title + process exe name.
- Linux: auto-detected strategy using hyprctl activewindow -j (Hyprland), swaymsg get_tree (Sway), xdotool + xprop (X11), or gdbus → org.gnome.Shell.Eval (GNOME Wayland).
Parse: window-title parsers per editor family (VS Code em-dash format, JetBrains en-dash format, Xcode reversed, Sublime hyphen format, Vim/Neovim). Extracts file name + project name from the title alone.
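The capture step could be sketched as follows, assuming the helper prints a single JSON object on stdout. The wrapper name `captureContext` and the `reason` strings are hypothetical. Exit code 2 is the code the macOS helper is described as returning when Accessibility is not granted.

```javascript
const { spawnSync } = require("child_process");

// Exit code the macOS helper is described as using when Accessibility
// permission has not been granted.
const EXIT_PERMISSION_DENIED = 2;

// Run a one-shot helper binary and parse its JSON output.
// Never throws: every failure mode maps to { ok: false, reason }.
function captureContext(binaryPath, args = [], timeoutMs = 500) {
  const result = spawnSync(binaryPath, args, {
    encoding: "utf8",
    timeout: timeoutMs, // child is killed if it runs longer than this
    windowsHide: true,
  });

  if (result.error) {
    const reason = result.error.code === "ETIMEDOUT" ? "timeout" : "spawn-failed";
    return { ok: false, reason };
  }
  if (result.status === EXIT_PERMISSION_DENIED) {
    return { ok: false, reason: "permission-denied" };
  }
  if (result.status !== 0) {
    return { ok: false, reason: "helper-failed" };
  }

  try {
    return { ok: true, context: JSON.parse(result.stdout) };
  } catch {
    return { ok: false, reason: "invalid-json" };
  }
}
```

Returning a result object instead of throwing lets the caller continue dictation without context when the helper is missing, slow, or lacks permission.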
Fallback file list: for VS Code/Cursor/Windsurf, if the AX tree returned no sidebar (e.g. accessibility permission not granted, or Windows/Linux with no equivalent API), the JS side reads the editor's storage.json from the app-support directory, finds the project's absolute path, then does a depth-4 scan for code files (capped at 200 files). Results are cached for 1-2 min.
Prompt injection: getSystemPrompt appends a "Context (the user is currently working in):" block with app / project / file / open tabs / project file list, plus an instruction to format file references as @project/filename. Threads through OpenAI, Anthropic (IPC), Gemini, local, and enterprise (Azure/Vertex/Bedrock) providers.
Performance + privacy
- When the setting is off, windowManager._contextAwarenessEnabled short-circuits and the native binary is never spawned: zero overhead.
- The setting is persisted to .env as CONTEXT_AWARENESS_ENABLED so it survives restart and main knows before the renderer loads.
Packaging
- extraResources in electron-builder.json: resources/bin/macos-context-capture (top-level) + windows-context-capture* (Windows filter). No ASAR unpack needed; binaries live alongside every other native listener in resources/bin/.
- compile:context-capture + compile:wincontext wired into compile:native (runs on prestart, predev, every prebuild:*).
Why this matters
OpenWhispr sits on top of every reasoning model — Claude, GPT, Gemini, local — and the single biggest gap vs. dictating directly into Cursor/Claude Code is that the ambient model has no idea what the user is looking at. This closes that gap with ~1400 lines of code and no new dependencies.
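For reviewers who want a concrete picture of the prompt-injection step described under How it works, the context block might be assembled along these lines. Only the header line and the @project/filename convention come from the description; the function name `buildContextBlock`, the field names on `ctx`, and the bullet labels are assumptions.

```javascript
// Build the text block appended to the system prompt.
// Returns "" when there is no usable context, so callers can append blindly.
function buildContextBlock(ctx) {
  if (!ctx || !ctx.appName) return "";

  const lines = [
    "Context (the user is currently working in):",
    `- App: ${ctx.appName}`,
  ];
  if (ctx.projectName) lines.push(`- Project: ${ctx.projectName}`);
  if (ctx.fileName) lines.push(`- File: ${ctx.fileName}`);
  if (Array.isArray(ctx.openTabs) && ctx.openTabs.length > 0) {
    lines.push(`- Open tabs: ${ctx.openTabs.join(", ")}`);
  }
  if (Array.isArray(ctx.projectFiles) && ctx.projectFiles.length > 0) {
    lines.push(`- Project files: ${ctx.projectFiles.join(", ")}`);
  }
  if (ctx.projectName) {
    lines.push(`When referring to files, format them as @${ctx.projectName}/filename.`);
  }
  return lines.join("\n");
}
```

Omitting each line when its field is missing keeps the prompt short on platforms where only the app name and window title are available.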