Kar1Seed/Voice-Assistant

Voice Assistant

A local-first voice intake project for personal workflows.

It takes natural-language input from iPhone, Siri, or the web and routes it to Google Calendar or Apple Reminders, or treats it as a command that operates on existing calendar and reminder items.

What This Project Is

This is not just a calendar bot. It is closer to a personal voice inbox plus an execution layer:

  • Items with an explicit time go to Google Calendar
  • To-dos, ideas, and reminders without an explicit time go to Apple Reminders
  • Operations on existing items, such as delete, update, and merge, go through a command branch
  • Every voice record keeps its processing result for later review

Current Features

  • iPhone Siri / Shortcuts can send dictated text directly to the local service
  • The local web page supports manual input and browser dictation
  • Semantic parsing uses an LLM, with local rules as a fallback
  • Items with a clear date and time are written to Google Calendar automatically
  • Birthdays, anniversaries, and other items with a clear date but no specific time are written as all-day events
  • Items not suited to direct calendar scheduling go to Apple Reminders
  • Voice commands can operate on existing Calendar / Reminders entries
  • Commands currently cover create, delete, update, and merge
  • Failed Google Calendar and Reminders writes are retried through retry queues
  • A Markdown log on the desktop records the raw voice input, the system's interpretation, and the execution result
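The difference between a timed item and an all-day item can be illustrated with the event body that the Google Calendar API accepts: timed events use `start.dateTime` / `end.dateTime`, while all-day events use `start.date` / `end.date`, where the end date is exclusive. The sketch below is only an illustration under assumptions: the function names, the input shape, and the 60-minute default duration are hypothetical and not taken from this repository.

```javascript
// Add one calendar day to a YYYY-MM-DD string (UTC arithmetic avoids local DST shifts).
function nextDay(isoDate) {
  const d = new Date(`${isoDate}T00:00:00Z`);
  d.setUTCDate(d.getUTCDate() + 1);
  return d.toISOString().slice(0, 10);
}

// Add minutes to a wall-clock "YYYY-MM-DDTHH:mm:ss" string, treating it as UTC
// purely for arithmetic. This ignores DST transitions inside the event.
function addMinutes(localDateTime, minutes) {
  const d = new Date(`${localDateTime}Z`);
  d.setUTCMinutes(d.getUTCMinutes() + minutes);
  return d.toISOString().slice(0, 19);
}

// Hypothetical helper: build an event body from a parsed item.
// `time` is "HH:mm" or undefined; undefined means an all-day event.
function toCalendarEvent({ title, date, time, durationMinutes = 60, timeZone }) {
  if (!time) {
    // All-day events use `date`; the end date is exclusive, so it is the next day.
    return { summary: title, start: { date }, end: { date: nextDay(date) } };
  }
  const startLocal = `${date}T${time}:00`;
  return {
    summary: title,
    start: { dateTime: startLocal, timeZone },
    end: { dateTime: addMinutes(startLocal, durationMinutes), timeZone },
  };
}
```

Passing `timeZone` (for example the value of APP_TIMEZONE) alongside an offset-less `dateTime` lets the calendar interpret the wall-clock time in that zone.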

Core Flow

  1. Capture: iPhone, Shortcuts, and web input enter through the same intake
  2. Interpret: the LLM makes the primary semantic judgment, with local rules as a fallback
  3. Route: send to Calendar, Reminders, or Command
  4. Act: create, delete, update, merge
  5. Review: logs, retry queues, and the desktop Markdown file keep everything traceable
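As a rough illustration of the routing step, here is a minimal sketch of what a rule-based fallback might look like. It is not the repository's actual rule set: the function name `routeByRules`, the keyword lists, and the patterns are all assumptions chosen to cover phrases like those under Typical Use Cases.

```javascript
// Hypothetical fallback router. The patterns below are illustrative assumptions.
const COMMAND_VERBS = /^(delete|remove|update|change|merge)\b|删除|修改|合并/i;
const CLOCK_TIME = /\b\d{1,2}(:\d{2})?\s?(am|pm)\b|\d{1,2}\s?点/i;
const DATE_ONLY = /birthday|anniversary|生日|纪念日/i;

function routeByRules(text) {
  const input = text.trim();
  // Operations on existing items take priority over creating new ones.
  if (COMMAND_VERBS.test(input)) return { route: "command" };
  // An explicit clock time means a timed calendar event.
  if (CLOCK_TIME.test(input)) return { route: "calendar", allDay: false };
  // A date-bound occasion without a clock time becomes an all-day event.
  if (DATE_ONLY.test(input)) return { route: "calendar", allDay: true };
  // Everything else is treated as a to-do or idea.
  return { route: "reminders" };
}
```

Keyword rules like these are brittle, which is presumably why the project prefers the LLM for semantic judgment and keeps rules only as a fallback.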

Typical Use Cases

  • "明天上午 9 点去修车" ("Tomorrow at 9am, go fix the car") -> Google Calendar
  • "今天是我妈的生日" ("Today is my mom's birthday") -> Google Calendar all-day event
  • "下周看一下航班信息" ("Check flight info next week") -> Apple Reminders
  • "删除那个北京航班的提醒" ("Delete that Beijing flight reminder") -> command branch deletes the reminder
  • "把车辆保养和车辆维护事项合并成一个提醒" ("Merge the vehicle servicing and vehicle maintenance items into one reminder") -> command branch merges the reminders

Local Setup

1) Install dependencies

npm install

2) Create a private env config

Recommended: keep secrets outside the repo:

mkdir -p ~/.config/voice-assistant
cp .env.example ~/.config/voice-assistant/.env

Then edit:

~/.config/voice-assistant/.env

At startup, the server reads env files in this order:

  1. The file pointed to by VOICE_ASSISTANT_ENV_FILE
  2. ~/.config/voice-assistant/.env
  3. The project-local .env (legacy fallback)
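The lookup order above can be sketched as a small pure function. This is a hedged illustration, not the server's code: the function names are hypothetical, and the README does not say whether the server stops at the first file it finds or merges several, so the second helper assumes "first existing file wins".

```javascript
// Build the candidate list in the documented priority order.
// `env`, `homeDir`, and `projectDir` are passed in so the function stays pure.
function envFileCandidates(env, homeDir, projectDir) {
  const candidates = [];
  if (env.VOICE_ASSISTANT_ENV_FILE) candidates.push(env.VOICE_ASSISTANT_ENV_FILE);
  candidates.push(`${homeDir}/.config/voice-assistant/.env`);
  candidates.push(`${projectDir}/.env`); // legacy fallback
  return candidates;
}

// Assumption: the first file that exists is the one used.
// `exists` is injected; in Node it could be fs.existsSync.
function firstExistingEnvFile(candidates, exists) {
  return candidates.find((file) => exists(file)) ?? null;
}
```

In a real Node process the arguments would typically be `process.env`, `os.homedir()`, and `process.cwd()`.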

3) Start the server

npm start

Default URL:

http://localhost:8787

Desktop widget page:

http://localhost:8787/widget.html

To use another port:

APP_PORT=8790 npm start

Environment Variables

See the sample config:

.env.example

Common settings include:

APP_PORT=8787
APP_TIMEZONE=Asia/Manila
DESKTOP_MEMO_PATH=/Users/your-name/Desktop/语音助理待确认.md

GOOGLE_CLIENT_ID=
GOOGLE_CLIENT_SECRET=
GOOGLE_REDIRECT_URI=http://localhost:8787/auth/google/callback
GOOGLE_CALENDAR_ID=primary

HTTP_PROXY=http://127.0.0.1:7890
HTTPS_PROXY=http://127.0.0.1:7890

LLM_API_KEY=
LLM_API_BASE=https://openrouter.ai/api/v1
LLM_MODEL=deepseek/deepseek-v3.2

REMINDERS_LIST_NAME=语音助理待处理
REMINDER_RETRY_INTERVAL_MS=60000
REMINDER_RETRY_MAX_ATTEMPTS=8
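To show how the two REMINDER_RETRY_* settings might interact, here is a minimal sketch of one retry-queue step. The item shape, the `status` values, and the fixed-interval policy (no backoff) are assumptions made for illustration; the repository may implement retries differently.

```javascript
// Hypothetical retry step: called after an attempt for `item` has failed.
// Defaults mirror the sample values REMINDER_RETRY_INTERVAL_MS=60000 and
// REMINDER_RETRY_MAX_ATTEMPTS=8.
function nextRetryState(item, now, { intervalMs = 60000, maxAttempts = 8 } = {}) {
  const attempts = item.attempts + 1;
  if (attempts >= maxAttempts) {
    // Give up once the attempt budget is used.
    return { ...item, attempts, status: "failed" };
  }
  // Otherwise schedule the next attempt one fixed interval from now.
  return { ...item, attempts, status: "pending", nextAttemptAt: now + intervalMs };
}
```

With these defaults an item would be retried roughly once a minute and abandoned after the eighth failed attempt.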

If LLM_API_KEY is not set, the service falls back to local rule-based parsing.
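A minimal sketch of this fallback decision follows. The names `interpret`, `parseWithLlm`, and `parseWithRules` are hypothetical and supplied by the caller. Falling back to rules when the LLM call throws is an additional assumption; the README only states the missing-key case.

```javascript
// Choose between LLM parsing and local rules.
// parseWithLlm and parseWithRules are hypothetical caller-supplied functions.
async function interpret(text, { apiKey, parseWithLlm, parseWithRules }) {
  if (!apiKey) {
    // Documented behavior: no LLM_API_KEY means local rules.
    return { source: "rules", result: parseWithRules(text) };
  }
  try {
    return { source: "llm", result: await parseWithLlm(text, apiKey) };
  } catch (error) {
    // Assumption: an LLM failure also degrades to local rules.
    return { source: "rules", result: parseWithRules(text), llmError: String(error) };
  }
}
```

Returning the `source` alongside the result makes it easy to record in the log which path produced an interpretation.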

Google Calendar Auth

After starting the server, open:

http://localhost:8787/auth/google/start

After you complete authorization, the server writes the Google OAuth token to a local file.
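For orientation, a start endpoint like this typically redirects the browser to Google's OAuth 2.0 authorization URL. The sketch below shows how such a URL could be built from GOOGLE_CLIENT_ID and GOOGLE_REDIRECT_URI. The function name is hypothetical, and the scope shown (`calendar.events`) is an assumption; the repository may request a different scope.

```javascript
// Hypothetical helper: build the Google OAuth 2.0 authorization URL.
function buildGoogleAuthUrl({ clientId, redirectUri, state }) {
  const params = new URLSearchParams({
    client_id: clientId,
    redirect_uri: redirectUri,
    response_type: "code", // authorization-code flow
    scope: "https://www.googleapis.com/auth/calendar.events", // assumed scope
    access_type: "offline", // ask for a refresh token
    state, // opaque value checked again in the callback
  });
  return `https://accounts.google.com/o/oauth2/v2/auth?${params}`;
}
```

The `redirectUri` must match a redirect URI registered for the OAuth client, for example the GOOGLE_REDIRECT_URI value in the sample config.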

Files Generated After First Run

These are local runtime state and are not committed to Git:

  • data/
  • data/server.log
  • data/voice-intake-inbox.jsonl
  • data/calendar-actions.jsonl
  • data/reminder-actions.jsonl
  • data/voice-command-actions.jsonl
  • data/calendar-retry-queue.json
  • data/reminder-retry-queue.json
  • data/google-token.json
  • data/google-oauth-state.json
  • ~/Desktop/语音助理待确认.md

The following two .command files on the desktop are convenience launchers on the author's machine and are not required by the repository:

  • ~/Desktop/启动语音助理后台.command
  • ~/Desktop/停止语音助理后台.command

Project Structure

src/server.mjs                  # main server
public/                         # local web UI
docs/                           # docs
scripts/launch-server.zsh       # local launch script
data/                           # runtime data (not committed)

Open Source Notes

If you want to fork or collaborate, these conventions help:

  • Do not commit .env, tokens, logs, or data/
  • Keep your private config outside the repo
  • Rely on the LLM first for semantic routing and keep local rules as a fallback only
  • Before committing, run regression tests with real Chinese voice phrases where possible

Roadmap

  • Improve the reliability of update / merge commands
  • Add more test samples drawn from real-world speech
  • Add a visual log and review UI
  • Provide importable iPhone Shortcuts templates
