Add product requirements document for ScreenTranslate app:

- US-001: Base architecture (menu bar + screenshot) - already implemented
- US-002: Local OCR using Vision framework
- US-003: Local translation using Apple Translation API
- US-004/005: Overlay rendering (in-place and below modes)
- US-006/007: Settings panel (engine selection + language config)
- US-008/009: Optional PaddleOCR and MTranServer integration
- US-010: Translation history
- US-011: First-launch onboarding

Architecture prioritizes local processing (Vision + Apple Translation) with optional external engines for advanced users.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
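Implement a local OCR engine using the native macOS Vision framework:

## New features

### Data model (OCRResult.swift)
- OCRResult: container for OCR results, holding the collection of text observations
- OCRText: a single text observation with its text content, bounding box, and confidence
- Supports confidence filtering, region filtering, and coordinate conversion

### OCR engine (OCREngine.swift)
- Uses VNRecognizeTextRequest for text recognition
- Supports mixed Chinese/English recognition and automatic language detection
- Supports multiple languages (English, Simplified and Traditional Chinese, Japanese, Korean, and others)
- Runs asynchronously without blocking the main thread
- Protected by actor isolation for thread safety

### Error handling
- Add OCR-related error cases to ScreenCaptureError
  - ocrOperationInProgress: an operation is already in progress
  - ocrInvalidImage: invalid image
  - ocrRecognitionFailed: recognition failed
  - ocrNoTextFound: no text found

### Code quality
- Add SwiftLint configuration file
- Create test files (OCRResultTests, OCREngineTests)
- Add test verification script (run_tests.sh)

## Acceptance criteria
- ✅ OCR implemented with the Vision framework's VNRecognizeTextRequest
- ✅ Mixed Chinese/English recognition supported
- ✅ Automatic language detection supported
- ✅ OCR results include text content, confidence, and bounding box coordinates
- ✅ OCR runs asynchronously without blocking the main thread
- ✅ A friendly error message is shown when OCR fails
- ✅ swift build passes
- ✅ swiftlint passes (OCR-related files)
- ✅ Test files created

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>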
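For readers following the OCR data model, here is a minimal sketch of how confidence filtering and coordinate conversion might look. The `Rect` type, the `pixelRect` and `filtered` method names, and the property names are assumptions made for illustration, not the code in this PR. The one fact it relies on is that Vision reports bounding boxes normalized to 0...1 with the origin at the bottom-left.

```swift
import Foundation

/// Plain rectangle standing in for CGRect so the sketch has no platform dependencies.
struct Rect: Equatable {
    var x: Double
    var y: Double
    var width: Double
    var height: Double
}

/// One recognized text observation (hypothetical shape of `OCRText`).
struct OCRText {
    let text: String
    /// Normalized box in 0...1 with a bottom-left origin (Vision's convention).
    let boundingBox: Rect
    let confidence: Double

    /// Converts the normalized, bottom-left-origin box into
    /// top-left-origin pixel coordinates for an image of the given size.
    func pixelRect(imageWidth: Double, imageHeight: Double) -> Rect {
        Rect(
            x: boundingBox.x * imageWidth,
            y: (1.0 - boundingBox.y - boundingBox.height) * imageHeight,
            width: boundingBox.width * imageWidth,
            height: boundingBox.height * imageHeight
        )
    }
}

/// Container for all observations from one OCR pass (hypothetical shape of `OCRResult`).
struct OCRResult {
    let observations: [OCRText]

    /// Keeps only observations at or above the given confidence.
    func filtered(minimumConfidence: Double) -> OCRResult {
        OCRResult(observations: observations.filter { $0.confidence >= minimumConfidence })
    }

    /// All recognized lines joined with newlines.
    var fullText: String {
        observations.map(\.text).joined(separator: "\n")
    }
}
```

The vertical flip matters because overlay views are usually laid out from the top-left, while Vision measures from the bottom-left.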
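Implement local translation using the macOS Translation framework:

- Add a TranslationEngine actor that uses the Translation framework
- Support automatic detection and translation for 26+ languages
- Configurable target language (follow the system or set manually)
- Asynchronous translation requests with a 10-second timeout
- Thorough error handling with friendly error messages
- Add a TranslationResult model to store translation results
- Add translationTargetLanguage and translationAutoDetect settings to AppSettings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>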
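One common way to put a timeout around an async request in Swift is to race the work against a sleeping task in a task group. The sketch below shows that pattern in general form. `withTimeout` and `TimeoutError` are hypothetical names, and this is not necessarily how TranslationEngine enforces its 10-second limit.

```swift
import Foundation

/// Error thrown when the deadline passes first (hypothetical name).
struct TimeoutError: Error {}

/// Runs `operation` and throws `TimeoutError` if it has not finished within `seconds`.
/// The operation must cooperate with cancellation, because the task group
/// waits for all child tasks before returning.
func withTimeout<T: Sendable>(
    seconds: Double,
    operation: @escaping @Sendable () async throws -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        // Child 1: the real work.
        group.addTask { try await operation() }
        // Child 2: the deadline.
        group.addTask {
            try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
            throw TimeoutError()
        }
        // Whichever child finishes first decides the outcome; cancel the other.
        defer { group.cancelAll() }
        guard let first = try await group.next() else {
            throw TimeoutError()
        }
        return first
    }
}
```

A call site would then look like `try await withTimeout(seconds: 10) { try await translate(text) }`, where `translate` stands for whatever async call performs the request.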
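Implement a transparent overlay window that displays the translation at the exact position of the original text.

Features:
- TranslationOverlayWindow: transparent full-screen overlay window
- TranslationOverlayView: positions and draws the translation based on OCR bounding boxes
- Dynamic font size calculation based on the height of the original text region
- Dismiss by clicking outside the overlay or pressing Esc
- Semi-transparent black background box, white text, rounded border

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>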
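The dynamic font sizing could work along the following lines. This is only a sketch: the function name, the 0.75 scale factor, and the 9 to 48 point clamp are assumed values chosen for illustration, not taken from TranslationOverlayView.

```swift
/// Derives a font size from the height of the original text's bounding box.
/// `scale`, `minimum`, and `maximum` are illustrative defaults, not the app's real values.
func overlayFontSize(
    forBoxHeight height: Double,
    scale: Double = 0.75,
    minimum: Double = 9,
    maximum: Double = 48
) -> Double {
    // Scale the box height down to leave padding, then clamp to a readable range.
    min(max(height * scale, minimum), maximum)
}
```

Clamping keeps very small boxes legible and stops a large heading from producing oversized overlay text.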
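Implement the translation popover, which shows the full translation below the selected region:

- Add TranslationPopoverWindow and TranslationPopoverView
- Polished popover styling with shadow and rounded corners
- Show original and translated text side by side (original in gray, translation in black)
- Support copying the translation to the clipboard
- Dismiss by clicking outside or pressing Esc
- Integrated into the main flow: shown automatically after a screenshot region is selected

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>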
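Implement engine selection and basic configuration in the settings panel:

- Add OCREngineType enum (Vision, PaddleOCR)
- Add TranslationEngineType enum (Apple Translation, MTranServer)
- Add TranslationMode enum (in-place replacement, below the original)
- Update AppSettings with the new persisted settings properties
- Update SettingsViewModel with the new settings bindings
- Update SettingsView with an Engines section

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>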
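Implement source and target language configuration:

- Add a source language picker (supports auto-detect)
- Add a target language picker (supports following the system)
- Add a Languages section to the settings panel
- Show the language list dynamically based on the selected translation engine
- Settings are persisted via UserDefaults and take effect immediately

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>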
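Implement PaddleOCR as an optional OCR engine that users can switch to in settings.

Main changes:
- Add PaddleOCREngine.swift: PaddleOCR engine adapter
  * Supports mixed Chinese/English recognition
  * Runs OCR asynchronously without blocking the main thread
  * OCR result format matches the Vision engine
  * Shows a friendly message when PaddleOCR is not installed
- Add OCREngineProtocol.swift: unified OCR service protocol
  * OCRService routes to the engine selected by the user
  * Supports automatic language conversion
- Update OCREngineType.swift:
  * Add PaddleOCR availability detection
  * PaddleOCRChecker checks whether the command is available
- Update SettingsView.swift:
  * Show a warning icon when PaddleOCR is unavailable
  * Fall back to Vision automatically when an unavailable engine is selected
- Update AppDelegate.swift: use OCRService instead of OCREngine.shared
- Update OCREngine.swift: add the engineNotAvailable error case

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>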
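The routing and fallback behavior described above can be sketched as follows. The protocol here is a simplified, synchronous stand-in (the commit describes async OCR), and `StubEngine`, `resolvedType`, and `recognizeText(in:preferred:)` are hypothetical names used only to illustrate the idea.

```swift
/// Engine choices named in the settings commit.
enum OCREngineType: String {
    case vision
    case paddleOCR
}

/// Simplified, synchronous stand-in for the unified OCR protocol.
protocol OCREngineProtocol {
    var isAvailable: Bool { get }
    func recognizeText(in imageName: String) -> [String]
}

/// Hypothetical stub used only to exercise the routing logic.
struct StubEngine: OCREngineProtocol {
    let label: String
    let isAvailable: Bool

    func recognizeText(in imageName: String) -> [String] {
        ["\(label) read \(imageName)"]
    }
}

/// Routes OCR calls to the user's preferred engine, falling back to Vision.
struct OCRService {
    let engines: [OCREngineType: any OCREngineProtocol]

    /// Returns the preferred type when it is registered and available, otherwise `.vision`.
    func resolvedType(preferred: OCREngineType) -> OCREngineType {
        if let engine = engines[preferred], engine.isAvailable {
            return preferred
        }
        return .vision
    }

    /// Performs recognition with whichever engine `resolvedType` selects.
    func recognizeText(in imageName: String, preferred: OCREngineType) -> [String] {
        let engineType = resolvedType(preferred: preferred)
        return engines[engineType]?.recognizeText(in: imageName) ?? []
    }
}
```

Keeping the fallback decision in one place means callers such as AppDelegate never need to know which engine actually ran.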
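Implement the MTranServer translation engine adapter, with support for self-hosted translation services.

- Add the MTranServerEngine service, which translates via an HTTP API
- Default server address is localhost:8989, configurable
- Support automatic source language detection (auto mode)
- Implement timeout handling (10 seconds by default)
- Add MTranServerChecker availability detection (/health endpoint)
- Update TranslationEngineType to make MTranServer available
- Friendly error messages and recovery suggestions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>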
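Implement translation history, with support for saving, viewing, and managing translation records. Each translation automatically saves the time, original text, translated text, and a screenshot thumbnail.

New files:
- TranslationHistory: data model for translation history records
- HistoryStore: history management service with search, pagination, and persistence
- HistoryView: history window UI with search, scroll-to-load, and a context menu
- HistoryWindowController: history window controller

Features:
- Keeps at most 50 history records
- Search by original or translated text
- Delete a single record or clear all
- Automatically generates 128px JPEG thumbnails
- Context menu to copy the original, the translation, or both
- Menu bar shortcut Cmd+Shift+H opens the history window

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>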
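A capped, searchable history could be modeled roughly like this. It is an in-memory sketch only: persistence, pagination, and thumbnails are left out, the field and method names are assumptions, and only the 50-record cap and the search over original or translated text come from the commit message.

```swift
import Foundation

/// One saved translation (hypothetical fields; the thumbnail is omitted here).
struct TranslationHistoryEntry: Equatable {
    let date: Date
    let sourceText: String
    let translatedText: String
}

/// In-memory history with a fixed capacity, newest entry first.
struct HistoryStore {
    static let maxEntries = 50
    private(set) var entries: [TranslationHistoryEntry] = []

    /// Inserts at the front and drops the oldest entries beyond the cap.
    mutating func add(_ entry: TranslationHistoryEntry) {
        entries.insert(entry, at: 0)
        if entries.count > Self.maxEntries {
            entries.removeLast(entries.count - Self.maxEntries)
        }
    }

    /// Case-insensitive match against either the original or the translated text.
    func search(_ query: String) -> [TranslationHistoryEntry] {
        let needle = query.lowercased()
        guard !needle.isEmpty else { return entries }
        return entries.filter {
            $0.sourceText.lowercased().contains(needle)
                || $0.translatedText.lowercased().contains(needle)
        }
    }

    /// Deletes a single record by position.
    mutating func remove(at index: Int) {
        entries.remove(at: index)
    }

    /// Clears all records.
    mutating func clear() {
        entries.removeAll()
    }
}
```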
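- Detect first launch and show a welcome window
- Explain that local OCR and translation are enabled automatically
- Optional setup: guide the user through configuring PaddleOCR and the MTranServer address
- Request screen recording permission (macOS privacy permission)
- Request accessibility permission (for global hotkeys)
- Provide a test-translation button to verify the configuration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>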
- Rename directory: ScreenCapture/ → ScreenTranslate/
- Rename Xcode project: ScreenCapture.xcodeproj → ScreenTranslate.xcodeproj
- Update bundle identifier: com.screencapture.app → com.screentranslate.app
- Update marketing version: 1.0 → 0.1.0
- Rename error type: ScreenCaptureError → ScreenTranslateError
- Update all string references and accessibility labels
- Update README.md with new project name and description
feat: rename project from ScreenCapture to ScreenTranslate
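- Improve the onboarding flow and experience
- Improve translation engine configuration
- Enhance the preview feature and its view model
- Refine app settings and hotkey management
- Update localized strings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
feat: screen translation optimizations and enhancements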
… Simplified Chinese

- Add .lproj directory structure (en.lproj, zh-Hans.lproj)
- Create comprehensive Localizable.strings for both languages (~410 entries)
- Add LanguageManager for runtime language switching
- Add AppLanguagePicker in Settings for user language selection
- Replace hardcoded strings with L() localization helper
- Fix PaddleOCRChecker to use async availability check (prevents SIGABRT)
- Fix AttributeGraph cycle in SettingsView by deferring permission checks
- Menu bar rebuilds automatically when language changes
- Settings view refreshes in real-time on language change
feat(i18n): Complete internationalization support for English and Simplified Chinese
…language support

- Add PaddleOCR installation detection in onboarding and settings
- Support pyenv/pip installed paddleocr via shell execution
- Fix coordinate system for ScreenCaptureKit (use points not pixels)
- Parse PaddleOCR output from stderr with proper JSON conversion
- Add Chinese/English localization for PaddleOCR UI
- Route OCR calls through OCRService based on user settings
feat: PaddleOCR integration with installation detection
- Redesigned SettingsView with sidebar navigation and tabbed sections - Updated SettingsWindowController with full-size content view and glass effects - Added DesignSystem.swift for Liquid Glass and macOS 26 visual effects - Enhanced settings layout with MeshGradient background and modern styling
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Hubert <hubo@HubertdeMacBook-Pro.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…pt, ru) (#64)

* feat(i18n): add 8 new language translations (de, es, fr, it, ja, ko, pt, ru)

  Add complete localization support for 8 additional languages:
  - German (de): 542 strings
  - Spanish (es): 542 strings
  - French (fr): 542 strings
  - Italian (it): 542 strings
  - Japanese (ja): 542 strings
  - Korean (ko): 542 strings
  - Portuguese (pt): 542 strings
  - Russian (ru): 542 strings

  All translations follow professional UI terminology standards and maintain consistency with the existing Chinese translation.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(i18n): correct translation errors from PR review
  - de: Fix typo "Üetzungstext" → "Übersetzungstext"
  - es: Remove erroneous " %@" suffix from key name
  - it: Fix typo "tradova" → "traduci"
  - ja: Replace Chinese "确定" → "確認", "开始" → "開始"
  - ja: Translate "All rights reserved" → "全著作権所有", "by %@" → "作者: %@"
  - ko: Fix mixed script "自由형" → "자유형"
  - pt: Translate "Translation test successful" → "Teste de tradução bem-sucedido"

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(i18n): correct localization key errors and improve Japanese phrasing
  - de/it/ja: Remove erroneous " %@" suffix from textTranslation.error.translationFailed key
  - ja: Improve "onboarding.complete.start" phrasing to natural Japanese

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Hubert <hubo@HubertdeMacBook-Pro.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…ecording permission issues (#65)

* fix: resolve Google Translate connection test and onboarding screen recording permission issues
  - Fix TranslationService.testConnection() to create provider lazily for third-party engines (Google, DeepL, Baidu, LLM) that require credentials before the provider can be instantiated
  - Fix OnboardingViewModel.requestScreenRecordingPermission() to trigger ScreenCaptureKit API before opening System Settings, ensuring the app appears in the Screen Recording permission list

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address PR review nitpicks
  - Remove redundant MainActor.run wrapper in OnboardingViewModel (class is already @MainActor)
  - Add logging for missing built-in engine providers and return true instead of false to avoid false connection failure in UI

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Hubert <hubo@HubertdeMacBook-Pro.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* Fix translation language pipeline regressions
* Fix auto-detect language availability handling
* Remove trailing commas in regression tests
* Remove unreachable catch in connection check

---------

Co-authored-by: Hubert <hubo@HubertdeMacBook-Pro.local>
* Fix translation language pipeline regressions
* Fix auto-detect language availability handling
* Remove trailing commas in regression tests
* Remove unreachable catch in connection check
* Fallback to OCR for leaked VLM prompt output
* feat: add cloud and local modes for GLM OCR
* fix: address review feedback for GLM OCR modes
* fix: tighten GLM OCR review follow-ups
* fix: clear stale GLM OCR test state

---------

Co-authored-by: Hubert <hubo@HubertdeMacBook-Pro.local>
- Remove custom permission explanation dialogs, use system prompts directly
- Fix double dialog issue when requesting accessibility permission
- Add fuzzy matching for language codes (zh → zh-Hans, zh-TW → zh-Hant)
- Fix localeLanguage to preserve script subtags for Apple Translation
- Remove unreliable LanguageAvailability pre-check, rely on TranslationSession errors
- Use full BCP 47 identifiers in error messages instead of minimalIdentifier
- Improve translation error display with dedicated showTranslationError method
- Fix prefix match direction in fromTranslationCode (en-US → .english now works)
- Fix dirty data fallback to return [.apple] consistently
- Fix filterEnabledEngines to truly always include Apple engine
- Fix showTranslationError fallback key to avoid %@ placeholder issue
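To make the language-code matching concrete, the sketch below resolves a code in three steps: exact match, alias, then prefix match in the direction the fix describes (the incoming code starts with a known code, so `en-US` resolves to English). The `TranslationLanguage` enum, its cases, and the alias table are assumptions for illustration. Only the `fromTranslationCode` name and the example mappings come from the commit messages.

```swift
import Foundation

/// Illustrative subset of supported languages (hypothetical enum).
enum TranslationLanguage: String, CaseIterable {
    case english = "en"
    case chineseSimplified = "zh-Hans"
    case chineseTraditional = "zh-Hant"
    case japanese = "ja"

    /// Resolves a possibly region-qualified or ambiguous code to a known language.
    static func fromTranslationCode(_ code: String) -> TranslationLanguage? {
        let normalized = code.replacingOccurrences(of: "_", with: "-").lowercased()

        // 1. Exact match on the canonical identifier.
        if let exact = allCases.first(where: { $0.rawValue.lowercased() == normalized }) {
            return exact
        }

        // 2. Aliases for codes that omit the script subtag.
        let aliases: [String: TranslationLanguage] = [
            "zh": .chineseSimplified,
            "zh-tw": .chineseTraditional
        ]
        if let alias = aliases[normalized] {
            return alias
        }

        // 3. Prefix match: the input must start with a known code plus "-",
        //    never the other way round ("e" must not match "en").
        return allCases.first { normalized.hasPrefix($0.rawValue.lowercased() + "-") }
    }
}
```

Requiring the trailing hyphen in step 3 prevents a short or truncated input from matching an unrelated language.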
System prompt is shown first (no custom dialog). If user denies, show "Open System Settings" guidance to help them grant permission later.
Re-check hasAccessibilityPermission after requestAccessibilityPermission(). If user granted in the system prompt, return true and continue the flow. Only show System Settings guidance when still denied after refresh.
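The request-then-recheck flow can be sketched with the system calls injected as closures, which also makes it testable without real permission prompts. `PermissionOutcome` and `ensureAccessibilityPermission` are hypothetical names. In the app, the closures would presumably wrap `hasAccessibilityPermission` and `requestAccessibilityPermission()`.

```swift
/// Result of the permission flow (hypothetical type).
enum PermissionOutcome: Equatable {
    case granted
    case needsSystemSettingsGuidance
}

/// Shows the system prompt first, then re-checks the permission state.
/// Guidance toward System Settings is only needed when access is still denied.
func ensureAccessibilityPermission(
    hasPermission: () -> Bool,
    requestPermission: () -> Void
) -> PermissionOutcome {
    // Already granted: nothing to ask.
    if hasPermission() {
        return .granted
    }
    // Trigger the system prompt (no custom dialog beforehand).
    requestPermission()
    // Re-check: the user may have granted access in the prompt.
    return hasPermission() ? .granted : .needsSystemSettingsGuidance
}
```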