Agent Bridge v0 + build-speed/TTS sprint #17

Merged: master5d merged 19 commits into main from feat/agent-bridge on Jun 14, 2026

@master5d (Owner)

Two stacked epics in one PR (agent-bridge is branched off build-speed).

Epic 1 — Build speed + TTS v0 (6 commits)

  • [profile.fast] (thin LTO, 16 codegen units) for daily release-grade builds via cargo build --profile fast; dev profile tuned (line-tables-only debuginfo, deps at opt-level 2); sccache groundwork plus scripts/build-fast.ps1 and scripts/dev-env.ps1.
  • Cargo feature diarization (default on): --no-default-features drops the heavy Intel MKL dep from dev builds; --diarize then fails fast with a rebuild hint.
  • TTS v0: TtsEngine trait + TtsManager (rodio playback, cancel-on-new-speak) + WinRT SpeechSynthesizer engine; tts_list_voices / tts_speak / tts_stop commands. Smoke-tested: 5 system voices incl. RU.
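
For readers unfamiliar with the TtsEngine/TtsManager split, the cancel-on-new-speak behaviour might look roughly like the sketch below. It is not the code from this PR: the method names (`list_voices`, `synthesize`), the `Utterance` handle, and `FakeEngine` are all hypothetical, and real playback through rodio is replaced by a generation counter that a playback loop would poll.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical engine interface: the PR names the trait, not its methods.
pub trait TtsEngine: Send + Sync {
    fn list_voices(&self) -> Vec<String>;
    /// Returns encoded audio (for example WAV bytes) for `text`.
    fn synthesize(&self, text: &str, voice: Option<&str>) -> Result<Vec<u8>, String>;
}

/// Handle for one utterance; a playback loop would stop once `is_current` is false.
pub struct Utterance {
    id: u64,
    latest: Arc<AtomicU64>,
    pub audio: Vec<u8>,
}

impl Utterance {
    pub fn is_current(&self) -> bool {
        self.latest.load(Ordering::SeqCst) == self.id
    }
}

pub struct TtsManager<E: TtsEngine> {
    engine: E,
    /// Id of the most recent speak/stop request.
    latest: Arc<AtomicU64>,
}

impl<E: TtsEngine> TtsManager<E> {
    pub fn new(engine: E) -> Self {
        Self { engine, latest: Arc::new(AtomicU64::new(0)) }
    }

    pub fn voices(&self) -> Vec<String> {
        self.engine.list_voices()
    }

    /// Synthesizes `text`; bumping the counter invalidates every earlier utterance.
    pub fn speak(&self, text: &str) -> Result<Utterance, String> {
        let audio = self.engine.synthesize(text, None)?;
        let id = self.latest.fetch_add(1, Ordering::SeqCst) + 1;
        Ok(Utterance { id, latest: Arc::clone(&self.latest), audio })
    }

    /// Invalidates the current utterance without starting a new one.
    pub fn stop(&self) {
        self.latest.fetch_add(1, Ordering::SeqCst);
    }
}

/// Stand-in engine for illustration only; it echoes the text as bytes.
pub struct FakeEngine;

impl TtsEngine for FakeEngine {
    fn list_voices(&self) -> Vec<String> {
        vec!["fake".to_string()]
    }
    fn synthesize(&self, text: &str, _voice: Option<&str>) -> Result<Vec<u8>, String> {
        Ok(text.as_bytes().to_vec())
    }
}
```

Keeping cancellation in the manager rather than in each engine means a later engine (the commits mention Piper/Kokoro) would only need to implement synthesis.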

Epic 2 — Agent Bridge v0 (11 commits)

Cenno-style agent↔user Q&A inside Echo, voice-first, Windows.

  • localhost HTTP API 127.0.0.1:4123 (bearer token, portable-aware path): POST /v1/ask (blocks until answered/dismissed/timeout), /v1/notify, GET /v1/answers.
  • Separate agent_bridge.db journal (not history.db). One-question-at-a-time queue with poison-safe locks + leak-safe Drop.
  • Floating panel (src/agent-panel/, focusable, always-on-top, cold-window race handled) with text/choice/confirm + countdown; speak:true reads the question via TTS, answers can be dictated.
  • CLI echo --ask "..." [--ask-options a,b] [--ask-timeout] [--ask-speak] for cron loops; skill skills/echo-ask/.
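
As a rough illustration of the HTTP API above, a std-only client for POST /v1/ask could be sketched as follows. The endpoint, port, and bearer-token scheme come from this description; the JSON field name `question` and the helper names are assumptions (only `speak` is mentioned in the PR), so check the echo-ask skill for the real payload shape.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Minimal JSON string escaping so the sketch needs no external crates.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds the raw HTTP/1.1 request. The `question` field name is hypothetical.
pub fn build_ask_request(token: &str, question: &str, speak: bool) -> String {
    let body = format!(
        "{{\"question\":\"{}\",\"speak\":{}}}",
        json_escape(question),
        speak
    );
    format!(
        "POST /v1/ask HTTP/1.1\r\nHost: 127.0.0.1:4123\r\nAuthorization: Bearer {}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        token,
        body.len(),
        body
    )
}

/// Sends the question and returns the raw response text.
/// The call blocks until the question is answered, dismissed, or times out.
pub fn ask(token: &str, question: &str, speak: bool) -> std::io::Result<String> {
    let mut stream = TcpStream::connect("127.0.0.1:4123")?;
    stream.write_all(build_ask_request(token, question, speak).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

Because /v1/ask blocks server-side, a real caller would also set a read timeout somewhat longer than the question timeout.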

Test Plan

  • cargo test --lib — 155 pass, 2 ignored (opt-in WinRT/TTS smoke)
  • cargo fmt --check, npm run build, npm run lint, check-translations.mjs (i18n parity ×21) green
  • Backend live-smoke vs running app: token/401/notify/timeout/journal all pass
  • Human-click answer path + TTS audio (needs a person at the keyboard)
  • CI green on push

Built via multi-agent delegation (Gemini CLI waves + agy adversarial review); see commits for attribution.

🤖 Generated with Claude Code

master5d and others added 19 commits June 12, 2026 01:54
- dev: line-tables-only debuginfo, deps at opt-level 2 (compiled once, cached)
- new `fast` profile: thin LTO + 16 codegen units for daily release-grade
  builds; shipping keeps fat-LTO `release` untouched
- sccache 0.15 installed, RUSTC_WRAPPER=sccache set at user level (verified
  passthrough-safe with incremental dev builds)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Documents the fast cargo profile, sccache, and --no-default-features dev
builds; adds a PowerShell wrapper that imports the VS/LLVM env and runs the
chosen profile. Authored by Gemini CLI under spec, reviewed; fixed missing
--manifest-path on the bare cargo path.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
speakrs -> ndarray-linalg -> Intel MKL static is the heaviest piece of every
build but only serves --diarize. Now optional behind a default-on feature:
shipping builds unchanged, daily dev skips it via --no-default-features.
Without the feature, --diarize fails fast with a clear rebuild hint.

Verified: cargo check both feature sets, 144 lib tests pass without feature.
Implemented by Codex CLI under spec (it burned its monthly quota mid-task);
warning polish by Claude.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
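
The fail-fast behaviour described in this commit can be sketched with a pair of cfg-gated functions. This is an illustration under assumptions, not the project's code: the function name `diarize`, its signature, and the error wording are hypothetical, and it presumes Cargo.toml declares `diarization` as a default feature.

```rust
/// Compiled only when the `diarization` feature is enabled (the shipping default).
#[cfg(feature = "diarization")]
pub fn diarize(audio_path: &str) -> Result<Vec<String>, String> {
    // Placeholder: the real build would call into the MKL-backed diarizer here.
    Ok(vec![format!("segments for {}", audio_path)])
}

/// Compiled for `--no-default-features` builds: no heavy dependency, clear hint.
#[cfg(not(feature = "diarization"))]
pub fn diarize(_audio_path: &str) -> Result<Vec<String>, String> {
    Err(
        "--diarize is unavailable in this build; rebuild without \
         --no-default-features to enable the `diarization` feature"
            .to_string(),
    )
}
```

Callers compile against the same signature in both configurations, so the CLI flag can stay in place and only the error path changes.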
- env-import regex [^=]+ (cmd hidden =C:= vars crashed under -Stop)
- Vulkan SDK picked by [version] sort, not alphabetical
- direct invocation instead of Invoke-Expression
- -Bundle always uses release profile: tauri-cli appends --release itself
  and the bundler expects target/release artifacts (docs updated to match)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
TtsEngine trait + TtsManager (rodio playback, cancel-on-new-speak) with the
zero-dependency Windows engine: voice listing/selection and text->WAV via
Windows.Media.SpeechSynthesis. Commands tts_list_voices / tts_speak / tts_stop
registered through tauri-specta. Non-Windows builds get a clear runtime error.

Smoke-tested against real WinRT: 5 voices (incl. RU Irina/Pavel), 148 KB WAV.
Run with: cargo test --lib tts -- --ignored --nocapture

Scaffold by Gemini CLI under spec; Windows engine implementation, Media_Core
feature-gate discovery and smoke test by Claude. First brick of assistant/tutor
phases: next engines (Piper/Kokoro via ort) implement the same trait.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
tiny_http/getrandom (+ureq dev-dep) and the agent_bridge module stubs per
plan Task 1; scripts/dev-env.ps1 extracts the BUILD.md toolchain env so any
shell (incl. delegate agents) can dot-source it before cargo.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
- wait/resolve race: late answer is collected via try_recv instead of lost
- Drop impl on PendingQuestion prevents sender leaks on unwind
- poisoned-lock recovery (lock_ok) so the bridge never wedges
- set_status errors on unknown id (was silent Ok)
- pending_id returns oldest (min), not arbitrary HashMap key
- token file written owner-only (0600) on Unix

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
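
Two of the fixes above, the late-answer collection via try_recv and the poisoned-lock recovery, can be illustrated with a std-only sketch. `lock_ok` is the helper name from the commit; the `Bridge` type and its `open`/`resolve`/`wait` methods are hypothetical stand-ins for the real queue.

```rust
use std::sync::mpsc::{self, Receiver, RecvTimeoutError, Sender};
use std::sync::{Mutex, MutexGuard};
use std::time::Duration;

/// Returns the guard even if another thread panicked while holding the lock.
pub fn lock_ok<T>(m: &Mutex<T>) -> MutexGuard<'_, T> {
    m.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Hypothetical one-question-at-a-time slot.
pub struct Bridge {
    /// Sender half for the single in-flight question, if any.
    pending: Mutex<Option<Sender<String>>>,
}

impl Bridge {
    pub fn new() -> Self {
        Self { pending: Mutex::new(None) }
    }

    /// Registers a question and returns the receiver the asker waits on.
    pub fn open(&self) -> Receiver<String> {
        let (tx, rx) = mpsc::channel();
        *lock_ok(&self.pending) = Some(tx);
        rx
    }

    /// Delivers an answer; returns false when no question is pending.
    /// The guard lives for the whole match, so the send happens under the lock.
    pub fn resolve(&self, answer: &str) -> bool {
        match lock_ok(&self.pending).take() {
            Some(tx) => tx.send(answer.to_string()).is_ok(),
            None => false,
        }
    }

    /// Waits for the answer, collecting one that raced with the timeout.
    pub fn wait(&self, rx: &Receiver<String>, timeout: Duration) -> Option<String> {
        match rx.recv_timeout(timeout) {
            Ok(answer) => Some(answer),
            Err(RecvTimeoutError::Disconnected) => None,
            Err(RecvTimeoutError::Timeout) => {
                // Close the slot first so no new answer can be accepted...
                lock_ok(&self.pending).take();
                // ...then pick up an answer that was sent just before the close.
                rx.try_recv().ok()
            }
        }
    }
}
```

Since `resolve` sends while holding the lock, any answer accepted before the slot is closed is already in the channel when `try_recv` runs, and any later one is rejected.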
…d-window fix

- settings: agent_bridge_enabled (default on) + agent_bridge_port (4123)
- lib.rs init: token + agent_bridge.db + server start; sink emits
  agent-question, shows the panel and speaks the question via TtsManager
  when speak:true
- commands agent_bridge_answer/dismiss/answers/current (specta, bindings
  regenerated)
- cold-window race: QuestionEvent moved to state with current-question
  tracking; a freshly created panel pulls the active question on mount;
  panel always hides on answer/dismiss even after server-side timeout
- ask_serial recovers from poisoned lock; open_in_memory test-gated

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
agent_panel webview (always-on-top, undecorated, focusable) + AgentPanel.tsx:
text/choice/confirm kinds, countdown with self-dismiss, cold-window pull via
agentBridgeCurrent, dictation-friendly text input. i18n keys in en+ru, parity
across all 21 locales. Implemented by Gemini CLI under spec; always-hide
ordering in answer/dismiss commands restored in review.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…y review)

- server now resolves app data via portable::app_data_dir so the token lives
  where the CLI reads it (portable mode previously split %APPDATA% vs Data/,
  401ing every --ask)
- CLI surfaces non-2xx server responses (401/429/400) with the error body
  instead of exiting 3 silently
- CLI request timeout uses saturating_add (no overflow on huge --ask-timeout)

agy also flagged two "compile errors" (tiny_http to_ip/as_str) — false
positives, 155 tests + release build prove the API is correct; and a
localhost timing-attack on the bearer token — out of threat model (token
file is per-user ACL-protected; a local attacker reads the file directly).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
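
The two CLI fixes in this commit can be illustrated with a short sketch. The helper names and the grace period are hypothetical; the points being shown are `saturating_add` on the timeout and turning a non-2xx status into a visible error rather than a silent exit.

```rust
use std::time::Duration;

/// Hypothetical slack added on top of the server-side wait.
const GRACE_SECS: u64 = 5;

/// Client-side request timeout; saturates instead of overflowing on huge inputs.
pub fn request_timeout(ask_timeout_secs: u64) -> Duration {
    Duration::from_secs(ask_timeout_secs.saturating_add(GRACE_SECS))
}

/// Returns an error message for any non-2xx status, including the response body.
pub fn describe_failure(status: u16, body: &str) -> Option<String> {
    if (200..300).contains(&status) {
        None
    } else {
        Some(format!("server returned {}: {}", status, body))
    }
}
```

With plain `+`, a value such as `u64::MAX` would panic in debug builds and wrap in release builds; the saturating form clamps to the maximum instead.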
BUILD.md, echo-ask SKILL.md, AgentPanel.tsx were not prettier-clean (eslint
passes them but format:check is a separate gate). Other locally-flagged files
are CRLF working-tree false positives (LF blobs are clean on CI).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ent)

Multi-line inline-code inside a numbered list item is a prettier edge case
that --write did not stabilize; moved the JSON/CLI examples into proper
fenced blocks so format:check passes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
master5d merged commit 5f093c6 into main on Jun 14, 2026 (4 checks passed)
master5d deleted the feat/agent-bridge branch on June 14, 2026 06:50