End-to-end transcription wiring: whisper enabled + model download + commands + MCP get_transcript by appergb · Pull Request #183 · appergb/OpenTake

appergb · 2026-07-02T10:17:53Z

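Root-cause fix for an engine that was built but unreachable in the shipped app: whisper-backend is enabled in src-tauri (still optional at the crate level, tested in both states with 304/305 tests; whisper.cpp is a one-time ~42s compile, and the CI runners have every dependency it needs); ggml-base multilingual model download (SHA-1 verified, reusing the existing download machinery); 4 Tauri commands (status / download + progress / transcribe + cache / read); MCP get_transcript ported 1:1 from upstream (word-midpoint assignment, span clamping, speed floor, 10000-word paging, linked video yielding to its audio partner), with 20 edge-case tests on the pure mapping + 10 dispatcher tests. Gates: workspace 1447 + web 330, all green. Note: starting with this PR, CI compiles whisper.cpp, so a slower first run is expected.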
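The download, verify, then atomic-rename flow mentioned above can be sketched as follows. This is a minimal illustration and not the code in transcribe/model.rs: the function name install_verified, the ".part" suffix, and the injected verify callback are all assumptions. The standard library has no SHA-1, so the checksum step is passed in as a closure, and the network fetch is replaced by any Read source.

```rust
use std::fs::{self, File};
use std::io::{self, Read};
use std::path::{Path, PathBuf};

/// Hypothetical helper: stream `source` into `<dest>.part`, verify it,
/// then rename it into place so `dest` is never observed half-written.
fn install_verified<R: Read>(
    mut source: R,
    dest: &Path,
    // Stand-in for the SHA-1 check; returns Ok(true) when the file is good.
    verify: impl Fn(&Path) -> io::Result<bool>,
) -> io::Result<()> {
    // Keep the temp file next to `dest` so the rename stays on one filesystem.
    let mut tmp_name = dest.as_os_str().to_owned();
    tmp_name.push(".part");
    let tmp = PathBuf::from(tmp_name);

    // Step 1: download (here: copy) into the temp file and flush it to disk.
    let written = File::create(&tmp).and_then(|mut file| {
        io::copy(&mut source, &mut file)?;
        file.sync_all()
    });
    if let Err(e) = written {
        let _ = fs::remove_file(&tmp);
        return Err(e);
    }

    // Step 2: verify before the file becomes visible under its final name.
    if !verify(&tmp)? {
        let _ = fs::remove_file(&tmp);
        return Err(io::Error::new(io::ErrorKind::InvalidData, "checksum mismatch"));
    }

    // Step 3: publish with a single rename.
    fs::rename(&tmp, dest)
}
```

Verifying the temp file rather than the final path means a failed or interrupted download can never be mistaken for an installed model, which is presumably why the existing download machinery is being reused here.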

… MCP get_transcript

The transcription engine (whisper.cpp backend, cache, locale, search) was fully built and tested but compile-gated off with no consumer: the shipped app could never transcribe. Wire it:

- Features: whisper-backend stays optional at opentake-media (crate tests pass with and without: 304/305) and is ON for src-tauri. Build cost measured: ~42s one-time whisper.cpp C++ compile, cmake+libclang only (preinstalled on GitHub ubuntu runners), CPU build, no CUDA.
- Model management: transcribe/model.rs mirrors search/model_download.rs's download->verify->atomic-rename machinery for a single ggml file. Default ggml-base MULTILINGUAL (~142MB, HuggingFace ggerganov/whisper.cpp, SHA-1 verified). Upstream uses Apple's multilingual SpeechTranscriber with OS auto-install (Transcription.swift:119-147), so multilingual base is the faithful equivalent.
- Commands (src-tauri/src/transcribe.rs, camelCase DTOs + serde tests): transcribe_model_status / download_transcribe_model (async, transcribe:// progress events like export) / transcribe_media (blocking on worker thread, TranscriptCache-backed so re-runs are instant) / transcript_get (cache-only).
- MCP get_transcript (upstream ToolExecutor+Timeline.swift:548-651, 1:1): post-edit timeline transcript in PROJECT frames. Per-clip word midpoint assignment within [visStart,visEnd), span clamping, round() to_timeline with speed floor 0.0001, clips sorted by startFrame, 10000-word cap with nextStartFrame paging, skipped[] report, linked-video-drops-for-audio-partner eligibility (EditorViewModel+Captions.swift:52-90). Pure port in transcribe/timeline.rs with 20 edge-case tests (trim/speed/straddlers/seam midpoint/paging); bridge method transcribe_sources on MediaBridge; 10 dispatcher tests via FakeBridge.

Deferred (next item): Captions generation tab + add_captions; inspect_media / search_media stubs; frontend bindings for the 4 new commands.
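To make the get_transcript mapping rules concrete, here is a small sketch of midpoint assignment, span clamping, the speed floor, and capped paging. It is not the port in transcribe/timeline.rs: every type and function name below (Word, Clip, TimelineWord, Page, to_timeline, map_timeline, page) is hypothetical, all clips are assumed to share one source, and the exact upstream rounding and paging semantics are assumptions beyond what the description states.

```rust
/// One transcribed word, in SOURCE frames (hypothetical shape).
#[derive(Debug, Clone)]
struct Word {
    text: String,
    start: f64,
    end: f64,
}

/// One timeline clip (hypothetical fields).
#[derive(Debug, Clone)]
struct Clip {
    start_frame: i64, // clip position on the timeline, in project frames
    vis_start: f64,   // first visible source frame (inclusive)
    vis_end: f64,     // end of the visible source range (exclusive)
    speed: f64,
}

#[derive(Debug, Clone, PartialEq)]
struct TimelineWord {
    text: String,
    start_frame: i64,
    end_frame: i64,
}

const SPEED_FLOOR: f64 = 0.0001;

/// Source frame -> project frame, rounding after dividing by the floored speed.
fn to_timeline(clip: &Clip, src: f64) -> i64 {
    let speed = clip.speed.max(SPEED_FLOOR);
    clip.start_frame + ((src - clip.vis_start) / speed).round() as i64
}

/// Clips are visited in start_frame order; a word belongs to a clip only if
/// its midpoint lies in [vis_start, vis_end), and its span is clamped to it.
fn map_timeline(clips: &[Clip], words: &[Word]) -> Vec<TimelineWord> {
    let mut sorted: Vec<&Clip> = clips.iter().collect();
    sorted.sort_by_key(|c| c.start_frame);
    let mut out = Vec::new();
    for clip in sorted {
        for w in words {
            let mid = (w.start + w.end) / 2.0;
            if mid < clip.vis_start || mid >= clip.vis_end {
                continue; // the word belongs to another clip or was trimmed away
            }
            out.push(TimelineWord {
                text: w.text.clone(),
                start_frame: to_timeline(clip, w.start.max(clip.vis_start)),
                end_frame: to_timeline(clip, w.end.min(clip.vis_end)),
            });
        }
    }
    out
}

struct Page {
    words: Vec<TimelineWord>,
    next_start_frame: Option<i64>, // where the caller should resume, if anywhere
}

/// Return at most `cap` words starting at or after `from_frame`.
/// Caveat of this sketch: words sharing a start frame across the page
/// boundary would repeat on the next page.
fn page(words: &[TimelineWord], from_frame: i64, cap: usize) -> Page {
    let mut rest = words.iter().filter(|w| w.start_frame >= from_frame);
    let taken: Vec<TimelineWord> = rest.by_ref().take(cap).cloned().collect();
    let next_start_frame = rest.next().map(|w| w.start_frame);
    Page { words: taken, next_start_frame }
}
```

Under these assumptions, a word that straddles a cut is reported once, by the clip that contains its midpoint. A caller would page by passing the returned next_start_frame back as from_frame until it comes back as None, with cap set to 10000 to match the limit described above.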
Gates: fmt/clippy -D warnings clean; cargo test --workspace 1447; pnpm build clean; pnpm test 330.

appergb merged commit 748698f into main Jul 2, 2026
2 checks passed

appergb deleted the feat/whisper-wiring branch July 2, 2026 10:28

appergb merged 1 commit into main from feat/whisper-wiring
