Merged
8 changes: 2 additions & 6 deletions .env.example
@@ -31,9 +31,6 @@ VOCALIZE_HOST=127.0.0.1
VOCALIZE_PORT=8080
ORCHESTRATOR_LISTEN_PORT=8080

# X-Invite-Token shared secret. Required when VOCALIZE_HOST != 127.0.0.1.
VOCALIZE_INVITE_TOKEN=

# Public WebSocket base URL. Required when VOCALIZE_HOST != 127.0.0.1.
VOCALIZE_WS_BASE_URL=

@@ -49,6 +46,5 @@ LOG_DIR=logs
# -------------------------------------------------------------------------
# Frontend (Next.js — baked into the JS bundle at build time)
# -------------------------------------------------------------------------
NEXT_PUBLIC_VOCALIZE_API_BASE_URL=https://vocalize-api.dgpisces.com
NEXT_PUBLIC_VOCALIZE_WS_BASE_URL=wss://vocalize-api.dgpisces.com
NEXT_PUBLIC_VOCALIZE_INVITE_TOKEN=
NEXT_PUBLIC_VOCALIZE_API_BASE_URL=https://api.example.com
NEXT_PUBLIC_VOCALIZE_WS_BASE_URL=wss://api.example.com
8 changes: 5 additions & 3 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -5,9 +5,11 @@ body:
- type: markdown
attributes:
value: |
Thank you for reporting a bug! Before submitting, please search existing
issues to avoid duplicates. For security vulnerabilities, see
[SECURITY.md](../../SECURITY.md) instead of filing a public issue.
Thank you for reporting a bug! Before submitting, please search
existing issues to avoid duplicates. VocalizeAI is a self-deploy
project — every operator runs their own backend on their own
infrastructure — so all reports (including security-relevant ones)
go through GitHub Issues here.

- type: textarea
id: what-happened
20 changes: 13 additions & 7 deletions CONTRIBUTING.md
@@ -12,7 +12,10 @@ VocalizeAI follows an out-of-band contribution model (D-16):
3. This keeps the private repo as the single source of truth and preserves the
`.planning/` workflow.

If your contribution is security-related, see [SECURITY.md](SECURITY.md) first.
VocalizeAI is a self-deploy project, so there is no centrally hosted
instance to defend. Security-relevant fixes go through the same PR flow
as any other contribution; flag the concern in the PR description so
reviewers know to prioritise it.

## Setting up the dev environment

@@ -67,7 +70,7 @@ test(<area>): <verb> <noun>
refactor(<area>): <verb> <noun>
```

Examples: `feat(server): add X-Invite-Token gate`, `fix(frontend): handle 401 on session create`.
Examples: `feat(server): bound task length`, `fix(frontend): handle 401 on session create`.

## Branches

@@ -77,8 +80,11 @@ directly to `main`.
## Issue triage / vulnerability reporting

- Ordinary bugs and feature requests: file a GitHub issue.
- Security vulnerabilities: follow the process in [SECURITY.md](SECURITY.md).
Do NOT file public GitHub issues for security topics.
- Security-relevant findings: file a GitHub issue here. VocalizeAI is a
self-deploy project (no centrally hosted instance), so there is no
separate private disclosure channel. Each operator is responsible for
their own deployment; report findings publicly so every operator can
pick up the fix.

## CI behavior for external PRs

@@ -110,9 +116,9 @@ VocalizeAI's CI pipeline has two tiers depending on where your PR originates:

VocalizeAI does not adopt a formal Code of Conduct at this stage. Standard
professional conduct is expected: be respectful, assume good faith, focus on
the technical content. Disputes that cannot be resolved in-thread escalate to
the maintainer via email (see SECURITY.md for the contact channel; for
non-security disputes use the same address).
the technical content. Disputes that cannot be resolved in-thread escalate
by opening a GitHub issue with a clear summary; the maintainer will follow
up there.

## License

24 changes: 9 additions & 15 deletions README.md
@@ -18,16 +18,6 @@ inquiries, status checks, and more. An OSS mirror is available at
[github.com/DGPisces/VocalizeAI](https://github.com/DGPisces/VocalizeAI)
under Apache 2.0.

## Why VocalizeAI is invite-only right now

v1 uses a shared invite token (`X-Invite-Token` header on `POST /api/sessions`)
distributed out-of-band. This is a long-lived shared secret — no rotation flow
exists in v1; rotation requires a deploy-time env-var change plus frontend
rebuild. Per-user authentication is v1.x scope (requirement AUTH-01).

Anyone who holds the token can use the service. To request an invite token,
email [gaodingyun2@gmail.com](mailto:gaodingyun2@gmail.com).

## Quick Start

**Prerequisites:** Python 3.11+, Node 20+, git, curl. Optional: `uv` (auto-installed by the script).
@@ -121,9 +111,8 @@ VocalizeAI/

| Variable | Purpose |
|----------|---------|
| `VOCALIZE_INVITE_TOKEN` | Shared invite token for `POST /api/sessions`; required when `VOCALIZE_HOST != 127.0.0.1` |
| `VOCALIZE_WS_BASE_URL` | WebSocket base URL returned to clients (e.g., `wss://vocalize-api.example.com`); required in non-localhost mode to prevent Host-header spoofing |
| `VOCALIZE_CORS_ORIGINS` | Comma-separated allowed CORS origins; defaults to `https://vocalize.example.com` in non-localhost mode |
| `VOCALIZE_WS_BASE_URL` | WebSocket base URL returned to clients (e.g., `wss://api.example.com`); required in non-localhost mode to prevent Host-header spoofing |
| `VOCALIZE_CORS_ORIGINS` | Comma-separated allowed CORS origins; **required** in non-localhost mode (no default) |

See `.env.example` for the full env-var inventory including LLM, GPU service,
and frontend build-time variables.
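
The table above marks `VOCALIZE_WS_BASE_URL` and `VOCALIZE_CORS_ORIGINS` as required in non-localhost mode. A minimal Python sketch of what such a startup check could look like follows. The names `NetworkConfig` and `load_network_config` are hypothetical and are not taken from the VocalizeAI source; only the env-var names and the "required when `VOCALIZE_HOST != 127.0.0.1`" rule come from this diff.

```python
from dataclasses import dataclass
from typing import List, Mapping, Optional

LOCALHOST = "127.0.0.1"


@dataclass(frozen=True)
class NetworkConfig:
    """Hypothetical container for the network-related env vars."""
    host: str
    ws_base_url: Optional[str]
    cors_origins: List[str]


def load_network_config(env: Mapping[str, str]) -> NetworkConfig:
    """Parse the env mapping and fail fast in non-localhost mode.

    Assumption: an unset VOCALIZE_HOST is treated as 127.0.0.1.
    """
    host = env.get("VOCALIZE_HOST", LOCALHOST).strip() or LOCALHOST
    ws_base_url = env.get("VOCALIZE_WS_BASE_URL", "").strip() or None
    # VOCALIZE_CORS_ORIGINS is a comma-separated list; drop empty entries.
    cors_origins = [
        origin.strip()
        for origin in env.get("VOCALIZE_CORS_ORIGINS", "").split(",")
        if origin.strip()
    ]

    if host != LOCALHOST:
        missing = []
        if ws_base_url is None:
            missing.append("VOCALIZE_WS_BASE_URL")
        if not cors_origins:
            missing.append("VOCALIZE_CORS_ORIGINS")
        if missing:
            raise RuntimeError(
                "non-localhost mode requires: " + ", ".join(missing)
            )

    return NetworkConfig(host, ws_base_url, cors_origins)
```

In a real service this would typically be called once at startup with `os.environ`, so that a misconfigured deployment stops before it accepts traffic.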
@@ -205,8 +194,13 @@ follow code style, and submit contributions. Issue + PR templates live under `.g

## Security

See [SECURITY.md](SECURITY.md) for the vulnerability reporting channel,
threat model summary, and emergency rollback procedure.
VocalizeAI is self-deploy: every operator runs their own backend on
their own infrastructure, and there is no centrally hosted instance to
defend. Report any security-relevant finding via GitHub Issues — same
as any other bug — so every operator can pick up the fix. Self-deploy
operators are responsible for restricting reachability at the network
or proxy layer (Cloudflare Access, VPN, reverse-proxy auth, etc.).
Per-user authentication is v1.x scope (requirement `AUTH-01`).

## License

20 changes: 7 additions & 13 deletions README.zh-CN.md
@@ -15,15 +15,6 @@ clarification_collector / relay)可处理任何电话任务 —— 餐厅订
[github.com/DGPisces/VocalizeAI](https://github.com/DGPisces/VocalizeAI),
协议 Apache 2.0。

## 为什么 VocalizeAI 当前是邀请制

v1 采用共享邀请令牌机制:`POST /api/sessions` 须携带 `X-Invite-Token` 请求头,
令牌通过线下渠道发放。这是长效共享密钥 —— v1 没有轮换流程;轮换需要修改部署环境变量
并重新构建前端。每用户认证属于 v1.x 范畴(需求 AUTH-01)。

持有令牌的人均可使用本服务。如需申请邀请令牌,请发邮件至
[gaodingyun2@gmail.com](mailto:gaodingyun2@gmail.com)。

## 快速开始

**前提条件:** Python 3.11+、Node 20+、git、curl。可选:`uv`(安装脚本会自动安装)。
@@ -114,9 +105,8 @@ VocalizeAI/

| 变量 | 用途 |
|------|------|
| `VOCALIZE_INVITE_TOKEN` | `POST /api/sessions` 所需的共享邀请令牌;`VOCALIZE_HOST != 127.0.0.1` 时必填 |
| `VOCALIZE_WS_BASE_URL` | 返回给客户端的 WebSocket 基地址(如 `wss://vocalize-api.example.com`);非 localhost 模式必填,防止 Host 头欺骗 |
| `VOCALIZE_CORS_ORIGINS` | 允许的 CORS 来源(逗号分隔);非 localhost 模式默认为 `https://vocalize.example.com` |
| `VOCALIZE_WS_BASE_URL` | 返回给客户端的 WebSocket 基地址(如 `wss://api.example.com`);非 localhost 模式必填,防止 Host 头欺骗 |
| `VOCALIZE_CORS_ORIGINS` | 允许的 CORS 来源(逗号分隔);非 localhost 模式**必填**(无默认值) |

完整环境变量清单(含 LLM、GPU 服务、前端构建变量)见 `.env.example`。

@@ -195,7 +185,11 @@ Issue 模板和 PR 模板位于 `.github/` 目录。

## 安全

漏洞上报渠道、威胁模型摘要和紧急回滚流程,见 [SECURITY.md](SECURITY.md)。
VocalizeAI 是自部署项目 —— 每个运维者在自己的基础设施上跑自己的后端,
没有"统一托管实例"可被攻击。任何安全相关发现请通过 GitHub Issues 上报,
与普通 bug 同一通道,这样每个运维者都能拿到修复。自部署运维者负责在网络/
代理层(Cloudflare Access、VPN、反向代理认证等)限制可达性。每用户认证属于
v1.x 范畴(需求 `AUTH-01`)。

## 许可证

97 changes: 0 additions & 97 deletions SECURITY.md

This file was deleted.

34 changes: 17 additions & 17 deletions docs/architecture.md
@@ -28,7 +28,7 @@ the AI audio pipeline and the PSTN call.

### End-to-End Request Flow (Quick Reference)

1. **User creates a session** — `POST /api/sessions` with `X-Invite-Token` → receives
1. **User creates a session** — `POST /api/sessions` → receives
`session_id` + `ws_url`.
2. **User sets the task** — `POST /api/sessions/{id}/task` with `{"task": "..."}` → Phase
transitions `draft → task_planning → collecting`.
@@ -374,16 +374,17 @@ See: `src/vocalize/server/frames.py`

## REST Surface

The REST API is mounted at `/api/sessions`. Authentication is via a shared
invite token (`X-Invite-Token` header on session creation). In localhost-dev
mode (`VOCALIZE_HOST=127.0.0.1` and `VOCALIZE_INVITE_TOKEN` unset), the token
gate is disabled.
The REST API is mounted at `/api/sessions`. The backend ships no
request-level authentication in v1; self-deploy operators restrict
reachability at the network or proxy layer (per-user auth is v1.x
scope — requirement `AUTH-01`).

See: `src/vocalize/server/sessions.py`, `src/vocalize/server/health.py`

### `POST /api/sessions`

**Auth:** `X-Invite-Token` header (design constraint D-08; uses `secrets.compare_digest`)
**Auth:** None at the backend layer in v1 (network/proxy restriction is the
operator's responsibility).

**Request body** (`CreateSessionRequest`, all optional):
```json
@@ -405,7 +406,7 @@ See: `src/vocalize/server/sessions.py`, `src/vocalize/server/health.py`
}
```

See: `src/vocalize/server/sessions.py` — `_check_invite_token`, `CreateSessionRequest`
See: `src/vocalize/server/sessions.py` — `CreateSessionRequest`
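
As an illustration of the post-change request shape, here is a minimal Python sketch that builds the two calls from the request flow: `POST /api/sessions` without an `X-Invite-Token` header, followed by `POST /api/sessions/{id}/task` with `{"task": "..."}`. The helper names are hypothetical, and the empty JSON body for session creation is an assumption based on the statement that all `CreateSessionRequest` fields are optional. The requests are only constructed here, not sent.

```python
import json
from urllib.request import Request


def build_create_session_request(api_base: str) -> Request:
    """Build POST /api/sessions with no auth header (v1 ships none)."""
    return Request(
        url=f"{api_base.rstrip('/')}/api/sessions",
        # Assumption: an empty object is valid because every field is optional.
        data=json.dumps({}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_set_task_request(api_base: str, session_id: str, task: str) -> Request:
    """Build POST /api/sessions/{id}/task with the task payload."""
    return Request(
        url=f"{api_base.rstrip('/')}/api/sessions/{session_id}/task",
        data=json.dumps({"task": task}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing either object to `urllib.request.urlopen` would send it. Per the flow above, the create-session response carries `session_id` and `ws_url`, and the former feeds the second call.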

---

@@ -565,17 +566,17 @@ See: `src/vocalize/server/ws.py`, `frontend/lib/audio*`, `frontend/components/Br
## Security Posture

The security controls relevant to the architecture are documented here for
API consumers and security researchers. For the full threat model and disclosure
channel, see [SECURITY.md](../SECURITY.md).
API consumers and security researchers. VocalizeAI is a self-deploy
project (no centrally hosted instance); report security-relevant findings
via GitHub Issues — same channel as any other bug.

### Invite-Token Gate (D-08)
### Authentication (D-08, retired)

`POST /api/sessions` requires `X-Invite-Token: <token>` matching the server-side
`VOCALIZE_INVITE_TOKEN` env var. The comparison uses `secrets.compare_digest` to
prevent timing attacks. In localhost-dev mode (`VOCALIZE_HOST=127.0.0.1` and
`VOCALIZE_INVITE_TOKEN` unset), the gate is disabled for development convenience.

See: `src/vocalize/server/sessions.py` — `_check_invite_token`
The original D-08 shared-invite-token gate has been removed; v1 ships no
backend-level auth on `POST /api/sessions` or the WebSocket. Self-deploy
operators are expected to restrict reachability at the network or proxy
layer (Cloudflare Access, VPN, reverse-proxy auth, etc.). Per-user
authentication is v1.x scope (requirement `AUTH-01`).

### Task Length Bound (D-09)

@@ -612,4 +613,3 @@ See: `src/vocalize/server/ws.py`, `src/vocalize/server/sessions.py`
- **[docs/deploy/local.md](docs/deploy/local.md)** — Mac/Linux dev environment setup and env-var reference
- **[docs/deploy/pi.md](docs/deploy/pi.md)** — End-to-end Pi production deployment runbook
- **[CONTRIBUTING.md](../CONTRIBUTING.md)** — Contributor flow, code style, commit conventions
- **[SECURITY.md](../SECURITY.md)** — Vulnerability reporting, threat model, emergency rollback
16 changes: 7 additions & 9 deletions docs/deploy/local.md
@@ -78,19 +78,18 @@ $EDITOR .env
| `VOCALIZE_HOST` | default ok | uvicorn bind host; `127.0.0.1` for local dev, `0.0.0.0` for production |
| `VOCALIZE_PORT` | default ok | uvicorn bind port; default `8080` (note: dev `main.py` defaults to 8000) |
| `ORCHESTRATOR_LISTEN_PORT` | default ok | Pi service port; default `8080` (legacy; mirrors `VOCALIZE_PORT`) |
| `VOCALIZE_INVITE_TOKEN` | required when non-localhost | Shared invite secret for `POST /api/sessions`; **gate is disabled when `VOCALIZE_HOST=127.0.0.1` and this var is unset** — localhost-dev shortcut |
| `VOCALIZE_WS_BASE_URL` | required when non-localhost | Public WS base URL (e.g. `wss://vocalize-api.example.com`); startup raises if missing in non-localhost mode (D-11) |
| `VOCALIZE_WS_BASE_URL` | required when non-localhost | Public WS base URL (e.g. `wss://api.example.com`); startup raises if missing in non-localhost mode (D-11) |
| `VOCALIZE_CORS_ORIGINS` | default ok | Comma-separated allowed CORS origins; auto-picked from VOCALIZE_HOST in dev mode |
| `DEFAULT_LANGUAGE` | default ok | Session default language; `zh` or `en`; default `zh` |
| `LOG_DIR` | default ok | Log directory; default `logs` |
| `NEXT_PUBLIC_VOCALIZE_API_BASE_URL` | yes for frontend | Frontend API base URL baked into the Next.js JS bundle at build time |
| `NEXT_PUBLIC_VOCALIZE_WS_BASE_URL` | optional | Frontend WS base; derived from `NEXT_PUBLIC_VOCALIZE_API_BASE_URL` if absent |
| `NEXT_PUBLIC_VOCALIZE_INVITE_TOKEN` | yes for frontend in prod | Invite token baked into Next.js JS bundle; required for the frontend to authenticate session creation |

**Localhost-dev shortcut:** when `VOCALIZE_HOST` is `127.0.0.1` and
`VOCALIZE_INVITE_TOKEN` is unset, the `X-Invite-Token` gate on `POST /api/sessions`
is disabled. This means you can `curl http://127.0.0.1:8000/api/sessions` without
supplying a token — convenient for local development.
**Backend auth posture:** v1 ships no request-level auth on
`POST /api/sessions` or the WebSocket. For non-localhost deployments,
restrict reachability at the network or proxy layer (Cloudflare Access,
VPN, reverse-proxy auth, etc.). Per-user auth is v1.x scope
(requirement `AUTH-01`).
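
The env-var table above says `NEXT_PUBLIC_VOCALIZE_WS_BASE_URL` is derived from `NEXT_PUBLIC_VOCALIZE_API_BASE_URL` when absent. The diff does not show how the frontend performs that derivation, so the following Python sketch is only an assumption: it swaps `http` for `ws` and `https` for `wss` and keeps the host and path. The function name `derive_ws_base` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumed scheme mapping; the frontend's real rule is not shown in this diff.
_SCHEME_MAP = {"http": "ws", "https": "wss"}


def derive_ws_base(api_base_url: str, ws_base_url: Optional[str] = None) -> str:
    """Return the explicit WS base if set, else derive it from the API base."""
    if ws_base_url:
        return ws_base_url
    parts = urlsplit(api_base_url)
    ws_scheme = _SCHEME_MAP.get(parts.scheme)
    if ws_scheme is None:
        raise ValueError(f"unsupported API base scheme: {parts.scheme!r}")
    # Drop any trailing slash, query, and fragment so the result is a clean base.
    return urlunsplit((ws_scheme, parts.netloc, parts.path.rstrip("/"), "", ""))
```

With the placeholder values from `.env.example`, an API base of `https://api.example.com` would yield `wss://api.example.com` under this assumption, which matches the pair shown there.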

**Minimum for local dev (no GPU):**
```bash
@@ -156,8 +155,7 @@ Exit code 0 = the development environment is working. The smoke script exercises
6 round-trips: health check, create session, set task, WS upgrade + send/recv,
delete session. Total runtime is ~20 seconds.

The smoke script uses `VOCALIZE_API_BASE` (default `http://127.0.0.1:8000`) and
`VOCALIZE_INVITE_TOKEN` (default empty, gate disabled in localhost-dev mode).
The smoke script uses `VOCALIZE_API_BASE` (default `http://127.0.0.1:8000`).

---
