Merged
6 changes: 3 additions & 3 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -49,10 +49,10 @@ body:
label: "Environment"
description: "Your setup details."
placeholder: |
- OS: macOS 15 / Ubuntu 24.04 / Raspberry Pi OS
- OS: macOS 15 / Ubuntu 24.04 / Debian 12 / Raspberry Pi OS
- Python version: 3.11.x
- Node version: 20.x
- Deployment mode: local dev / Pi production / other
- Deployment mode: local dev / Linux-host production / other
- VocalizeAI version / commit hash:
validations:
required: true
@@ -61,7 +61,7 @@ body:
id: logs
attributes:
label: "Relevant logs"
description: "Paste any relevant log output. Use `journalctl -u vocalize` on Pi, or the uvicorn terminal output locally."
description: "Paste any relevant log output. Use `journalctl -u vocalize` on a systemd-installed host, or the uvicorn terminal output locally."
render: shell
validations:
required: false
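
The `Environment` placeholder above asks for the OS, Python version, Node version, and commit hash. A minimal sketch of a helper that prints those lines in the template's format, ready to paste into the form. The function names `version_of` and `vocalize_env_report` are hypothetical and not part of the repository:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with VocalizeAI): prints the fields the
# bug-report template asks for. Missing tools are reported as "not found".

# Print the first line of a command's version output, or "not found".
version_of() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@" 2>&1 | head -n 1
  else
    echo "not found"
  fi
}

vocalize_env_report() {
  local os
  if [ -r /etc/os-release ]; then
    # PRETTY_NAME is a standard os-release(5) field, read in a subshell so
    # the file's variables do not leak into the caller.
    os="$(. /etc/os-release && echo "${PRETTY_NAME:-unknown}")"
  else
    os="$(uname -sr)"
  fi
  echo "- OS: ${os}"
  echo "- Python version: $(version_of python3 --version)"
  echo "- Node version: $(version_of node --version)"
  echo "- VocalizeAI version / commit hash: $(git rev-parse --short HEAD 2>/dev/null || echo "not a git checkout")"
}

vocalize_env_report
```

Run it from the repository checkout so that `git rev-parse` resolves the commit; the deployment mode still has to be filled in by hand.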
2 changes: 1 addition & 1 deletion CONTRIBUTING.md
@@ -37,7 +37,7 @@ cd frontend && npm run test:integration
```

Note: `tests/integration/` release-audio cases require a physical audio setup
(microphone + speaker) and a live Pi orchestrator. These are gated behind
(microphone + speaker) and a live Linux-host orchestrator. These are gated behind
`--release-audio` and do not run in PR CI.

All checks must pass on your PR before merge. CI runs lint (ruff + mypy + tsc),
16 changes: 9 additions & 7 deletions README.md
@@ -10,8 +10,9 @@ across languages when needed.

## Current Status

**v1 ships** the universal phone-task engine, Web console, and Raspberry Pi
orchestrator deployment. The backend 5-layer prompt architecture
**v1 ships** the universal phone-task engine, Web console, and a Linux-host
orchestrator deployment (Raspberry Pi was the original reference target;
any modern Linux host with systemd works). The backend 5-layer prompt architecture
(task_planner / preflight / merchant_agent / clarification_collector / relay)
handles any phone task — restaurant bookings, service appointments, balance
inquiries, status checks, and more. An OSS mirror is available at
@@ -91,12 +92,12 @@ VocalizeAI/
│ ├── messages/ # next-intl zh/en bundles
│ └── tests/ # vitest unit tests
├── demos/ # runnable demos
├── infra/ # deployment scripts (GPU node, Pi orchestrator)
├── infra/ # deployment scripts (GPU node, Linux orchestrator)
├── tests/ # pytest suite
│ └── integration/ # Playwright laptop-loopback + AI-merchant harness
├── install/ # one-shot install scripts
│ ├── dev-install.sh # Mac/Linux local dev setup
│ └── pi-install.sh # Raspberry Pi production deploy
│ └── install.sh # Linux production deploy (Raspberry Pi is one example target)
├── docs/ # architecture, deploy guides, release evidence
├── scripts/ # smoke test and utility scripts
│ └── smoke.sh # post-install end-to-end verification
@@ -117,13 +118,14 @@ VocalizeAI/
See `.env.example` for the full env-var inventory including LLM, GPU service,
and frontend build-time variables.

For the full production Pi deployment runbook, see [docs/deploy/pi.md](docs/deploy/pi.md).
For the full Linux-host production deployment runbook (Raspberry Pi is one
example target), see [docs/deploy/linux.md](docs/deploy/linux.md).

### GPU node requirements

SenseVoice (STT) and CosyVoice (TTS) run as separate GPU services and connect
to the Pi orchestrator over Tailscale. GPU services are optional for local dev
(the LLM path works without them). See [docs/deploy/pi.md](docs/deploy/pi.md)
to the orchestrator host over Tailscale. GPU services are optional for local dev
(the LLM path works without them). See [docs/deploy/linux.md](docs/deploy/linux.md)
for the GPU node setup.

## Run the dev server
8 changes: 4 additions & 4 deletions README.zh-CN.md
@@ -85,12 +85,12 @@ VocalizeAI/
│ ├── messages/ # next-intl zh/en bundles
│ └── tests/ # vitest unit tests
├── demos/ # runnable demos
├── infra/ # deployment scripts (GPU node, Pi orchestrator)
├── infra/ # 部署脚本(GPU 节点、Linux 编排器)
├── tests/ # pytest suite
│ └── integration/ # Playwright laptop-loopback + AI-merchant harness
├── install/ # 一键安装脚本
│ ├── dev-install.sh # Mac/Linux 本地开发环境安装
│ └── pi-install.sh # 树莓派生产部署安装
│ └── install.sh # Linux 生产部署安装(树莓派是一种受支持的目标)
├── docs/ # 架构文档、部署指南、发布记录
├── scripts/ # smoke 测试和工具脚本
│ └── smoke.sh # 安装后端到端验证脚本
@@ -110,13 +110,13 @@ VocalizeAI/

完整环境变量清单(含 LLM、GPU 服务、前端构建变量)见 `.env.example`。

完整的树莓派生产部署手册,见 [docs/deploy/pi.md](docs/deploy/pi.md)。
完整的树莓派生产部署手册,见 [docs/deploy/linux.md](docs/deploy/linux.md)。

### GPU 节点要求

SenseVoice(STT)和 CosyVoice(TTS)作为独立 GPU 服务运行,通过 Tailscale
与树莓派编排器连接。本地开发不需要 GPU(只需 LLM 路径即可运行)。GPU 节点配置见
[docs/deploy/pi.md](docs/deploy/pi.md)。
[docs/deploy/linux.md](docs/deploy/linux.md)。

## 跑开发服务器

4 changes: 2 additions & 2 deletions docs/architecture.md
@@ -556,7 +556,7 @@ See: `src/vocalize/server/ws.py`, `frontend/lib/audio*`, `frontend/components/Br
| Env/config loading | `src/vocalize/config.py` |
| Asyncio main pipeline | `src/vocalize/pipeline.py` |
| Frontend (Next.js 14) | `frontend/` |
| Pi deployment assets | `infra/pi-orchestrator/` |
| Pi deployment assets | `infra/orchestrator/` |
| GPU services setup | `infra/gpu-services/` |
| Backend tests (pytest) | `tests/` |
| Integration tests (Playwright) | `tests/integration/` |
@@ -611,5 +611,5 @@ See: `src/vocalize/server/ws.py`, `src/vocalize/server/sessions.py`
## Further Reading

- **[docs/deploy/local.md](docs/deploy/local.md)** — Mac/Linux dev environment setup and env-var reference
- **[docs/deploy/pi.md](docs/deploy/pi.md)** — End-to-end Pi production deployment runbook
- **[docs/deploy/linux.md](docs/deploy/linux.md)** — End-to-end Pi production deployment runbook
- **[CONTRIBUTING.md](../CONTRIBUTING.md)** — Contributor flow, code style, commit conventions
127 changes: 84 additions & 43 deletions docs/deploy/pi.md → docs/deploy/linux.md
@@ -1,57 +1,62 @@
# Deploying VocalizeAI on a Raspberry Pi
# Deploying VocalizeAI on a Linux Host

This runbook covers end-to-end production deployment of VocalizeAI on a
Raspberry Pi: the orchestrator runs on the Pi; GPU services (SenseVoice STT +
CosyVoice TTS) run on a separate machine reachable over Tailscale; a Cloudflare
Tunnel fronts the Pi to the public internet.
This runbook covers end-to-end production deployment of VocalizeAI on any
modern Linux host with systemd: the orchestrator runs on this host, GPU
services (SenseVoice STT + CosyVoice TTS) run on a separate machine reachable
over Tailscale, and a Cloudflare Tunnel fronts the orchestrator host to the
public internet.

Tested on **Debian 12**, **Ubuntu 22.04 / 24.04**, and **Raspberry Pi OS
(Bookworm)**. A Raspberry Pi was the original reference target — see
["Hardware example: Raspberry Pi"](#hardware-example-raspberry-pi) below for
the BOM, OS imaging, and SSH bootstrap steps for that specific target.

---

## Hardware Bill of Materials
## Bill of Materials

**Raspberry Pi:**
- Raspberry Pi 4 or Pi 5, **8 GB RAM recommended** (4 GB works for the orchestrator
alone but is tight if other services run alongside)
- 32 GB+ microSD card or USB SSD (SSD strongly recommended for production)
- Reliable internet connection (Cloudflare Tunnel requires outbound HTTPS)
**Orchestrator host:**
- Any 64-bit Linux box with systemd, ≥ 2 GB RAM, ≥ 16 GB free disk.
- Python 3.11 (installed by step 1 of `install/install.sh`).
- Persistent internet connection (Cloudflare Tunnel requires outbound HTTPS).
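
The orchestrator-host requirements above (64-bit, systemd, ≥ 2 GB RAM, ≥ 16 GB free disk) can be checked before installing. The sketch below is a hypothetical preflight, not the checks `install/install.sh` actually performs. All function names and the 1800 MiB RAM tolerance are assumptions made for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical preflight sketch; install/install.sh may check different things.

# Succeeds when systemd is the running init system (the directory test
# described in sd_booted(3)).
has_systemd() { [ -d /run/systemd/system ]; }

# Succeeds on a 64-bit userland.
is_64bit() { [ "$(getconf LONG_BIT 2>/dev/null)" = "64" ]; }

# Total RAM in MiB, read from /proc/meminfo (Linux only).
total_ram_mib() { awk '/^MemTotal:/ { print int($2 / 1024) }' /proc/meminfo; }

# Free space in GiB on the filesystem that holds the given path.
free_disk_gib() { df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'; }

# Succeeds when integer $1 >= integer $2.
at_least() { [ "$1" -ge "$2" ]; }

preflight() {
  local path="${1:-/}"
  has_systemd && echo "systemd: yes" || echo "systemd: NO"
  is_64bit && echo "64-bit: yes" || echo "64-bit: NO"
  # MemTotal excludes kernel-reserved memory, so a 2 GB machine reports a
  # little under 2048 MiB; 1800 is an assumed tolerance, not a project figure.
  at_least "$(total_ram_mib)" 1800 && echo "RAM: ok" || echo "RAM: below 2 GB"
  at_least "$(free_disk_gib "$path")" 16 && echo "disk: ok" || echo "disk: under 16 GB free on $path"
}

preflight /
```

Pass the filesystem that will hold `/opt/vocalize` (for example `preflight /opt`) if it is mounted separately from `/`.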

**GPU node (separate machine):**
- NVIDIA RTX-class GPU (GTX 1080 or better; RTX 30/40 series recommended)
- Windows + WSL2 or Linux (PyTorch 2.7.1+cu128)
- Reachable from the Pi over Tailscale on the configured `GPU_HOST` IP/hostname
- NVIDIA RTX-class GPU (GTX 1080 or better; RTX 30/40 series recommended).
- Windows + WSL2 or Linux (PyTorch 2.7.1+cu128).
- Reachable from the orchestrator host over Tailscale on the configured
`GPU_HOST` IP/hostname.

**Network:**
- Tailscale account (free tier is sufficient) with both the Pi and GPU node enrolled
- Cloudflare account with a domain pointed at Cloudflare DNS (free tier is sufficient)
- Tailscale account (free tier is sufficient) with both the orchestrator host
and the GPU node enrolled.
- Cloudflare account with a domain pointed at Cloudflare DNS (free tier is
sufficient).

---

## OS Preparation

```bash
# Flash Raspberry Pi OS Lite (64-bit) to the SD card / SSD using Raspberry Pi Imager.
# In Imager, pre-configure:
# - hostname
# - SSH enabled
# - SSH public key (paste your ~/.ssh/id_ed25519.pub or generate one first)
# - Wi-Fi credentials (if not using Ethernet)

# After first boot, SSH in and update the system:
ssh pi@<pi-hostname>
# On the orchestrator host (any modern 64-bit Linux with systemd):
ssh <user>@<host>
sudo apt-get update && sudo apt-get upgrade -y

# Ensure git and curl are present:
sudo apt-get install -y git curl
```

For the Raspberry Pi-specific imaging / first-boot steps, see
["Hardware example: Raspberry Pi"](#hardware-example-raspberry-pi).

---

## Tailscale Setup

Tailscale provides the encrypted overlay network between the Pi and the GPU node.
Tailscale provides the encrypted overlay network between the orchestrator
host and the GPU node.

```bash
# Install Tailscale on the Pi:
# Install Tailscale on the orchestrator host:
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

@@ -66,28 +71,28 @@ tailscale status

Set `GPU_HOST` in `/opt/vocalize/.env` to the GPU node's Tailscale IP.

If the GPU services are not yet running, use `install/pi-install.sh --skip-gpu`
If the GPU services are not yet running, use `install/install.sh --skip-gpu`
to proceed with installation without the GPU-reachability check.

---

## Clone and Install

```bash
# Clone the repository to the Pi:
# Clone the repository to the orchestrator host:
git clone https://github.com/DGPisces/VocalizeAI.git /opt/vocalize
cd /opt/vocalize

# Dry-run first to preview all 7 steps:
bash install/pi-install.sh --dry-run
bash install/install.sh --dry-run

# Run the full installer:
bash install/pi-install.sh
bash install/install.sh

# Or run selectively:
bash install/pi-install.sh --steps "1,2,6" # apt + venv + systemd only
bash install/pi-install.sh --skip-gpu # skip GPU-reachability check in step 7
bash install/pi-install.sh --skip-tunnel # skip step 5 (Cloudflare Tunnel info)
bash install/install.sh --steps "1,2,6" # apt + venv + systemd only
bash install/install.sh --skip-gpu # skip GPU-reachability check in step 7
bash install/install.sh --skip-tunnel # skip step 5 (Cloudflare Tunnel info)
```

**Installer steps:**
@@ -96,7 +101,7 @@ bash install/pi-install.sh --skip-tunnel # skip step 5 (Cloudflare Tunnel i
|------|--------|
| 1 | `apt-get install` python3.11 python3.11-venv python3-pip build-essential rsync |
| 2 | Create `.venv` in `/opt/vocalize`, `pip install -e .` |
| 3 | GPU services note (GPU lives on a separate host; no on-Pi install) |
| 3 | GPU services note (GPU lives on a separate host; no on-orchestrator install) |
| 4 | Tailscale presence check (warns if absent) |
| 5 | Cloudflare Tunnel token-install instructions |
| 6 | Copy `vocalize.service` to `/etc/systemd/system/`, copy `.env.template` to `/opt/vocalize/.env` if absent, `systemctl enable vocalize` |
@@ -124,9 +129,9 @@ sudo nano /opt/vocalize/.env
| `GPU_HOST` | yes (if using GPU) | STT/TTS host — Tailscale IP of your GPU node |
| `SENSEVOICE_WS_PORT` | default ok | STT port; default `8000` |
| `COSYVOICE_WS_PORT` | default ok | TTS port; default `8001` |
| `VOCALIZE_HOST` | default ok | uvicorn bind host; set to `0.0.0.0` for Pi production |
| `VOCALIZE_HOST` | default ok | uvicorn bind host; set to `0.0.0.0` for production |
| `VOCALIZE_PORT` | default ok | uvicorn bind port; default `8080` |
| `ORCHESTRATOR_LISTEN_PORT` | default ok | Pi service port; default `8080` (legacy compatibility) |
| `ORCHESTRATOR_LISTEN_PORT` | default ok | Orchestrator service port; default `8080` (legacy compatibility) |
| `VOCALIZE_WS_BASE_URL` | **yes** | Public WS base URL; e.g. `wss://api.<your-domain>` — startup raises if missing in non-localhost mode |
| `VOCALIZE_CORS_ORIGINS` | default ok | Comma-separated allowed CORS origins; default auto-picked from VOCALIZE_HOST |
| `DEFAULT_LANGUAGE` | default ok | `zh` or `en`; default `zh` |
@@ -153,7 +158,7 @@ VOCALIZE_CORS_ORIGINS=https://<your-domain>

## Cloudflare Tunnel

The Cloudflare Tunnel connects the Pi to the public internet without exposing SSH
The Cloudflare Tunnel connects the orchestrator host to the public internet without exposing SSH
or opening firewall ports.

**Token-based install (recommended):**
@@ -163,15 +168,15 @@ or opening firewall ports.
# Zero Trust -> Networks -> Tunnels -> [your tunnel] -> Configure
# -> "Install and run a connector" -> Copy the displayed token

# Install the tunnel service on the Pi:
# Install the tunnel service on the orchestrator host:
sudo cloudflared service install <TUNNEL_TOKEN>

# Verify the service is running:
sudo systemctl status cloudflared
```

The reference ingress shape for this project is documented in
`infra/pi-orchestrator/cloudflared-config.yml` (maps
`infra/orchestrator/cloudflared-config.yml` (maps
`vocalize-api.<your-tunnel-name>` → `http://localhost:8080` and
`vocalize.<your-tunnel-name>` → `http://localhost:3000`). Configure
the actual public hostname routing in the Cloudflare dashboard under
@@ -186,7 +191,7 @@ tunnel name; tunnels are account-specific.

### vocalize.service

The `vocalize.service` unit file is at `infra/pi-orchestrator/vocalize.service`
The `vocalize.service` unit file is at `infra/orchestrator/vocalize.service`
and is copied to `/etc/systemd/system/vocalize.service` by step 6 of the installer.

```ini
@@ -240,7 +245,7 @@ sudo systemctl restart vocalize
After the installer completes (step 7 runs this automatically), verify the deployment:

```bash
# Smoke test against the Pi's local port:
# Smoke test against the orchestrator's local port:
VOCALIZE_API_BASE=http://127.0.0.1:8080 bash scripts/smoke.sh
# Exit 0 = working deployment

@@ -251,7 +256,7 @@ VOCALIZE_API_BASE=https://api.<your-domain> bash scripts/smoke.sh
The smoke script exercises 6 round-trips: `GET /health`, `POST /api/sessions`,
`POST /api/sessions/{id}/task`, WS upgrade + send/recv, `DELETE /api/sessions/{id}`.

Note: the local smoke on the Pi uses port 8080 (production port), not 8000 (dev
Note: the local smoke uses port 8080 (production port), not 8000 (dev
port). Make sure `VOCALIZE_API_BASE` is set accordingly.

---
@@ -311,3 +316,39 @@ Common causes:
**Port conflicts:**
- `VOCALIZE_PORT` defaults to 8080. If another service occupies that port, change
`VOCALIZE_PORT` in `.env` and update the Cloudflare Tunnel ingress rule accordingly.
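
To confirm a suspected port conflict before editing `.env`, something like the following can be run on the orchestrator host while the `vocalize` service is stopped. It is a rough sketch: `port_in_use` is a hypothetical helper, and it only detects listeners reachable on `127.0.0.1`:

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds when something accepts TCP connections on
# 127.0.0.1:<port>. Uses bash's built-in /dev/tcp redirection (a bash
# feature, not a real device file). The subshell closes the socket on exit.
port_in_use() {
  local port="$1"
  (exec 3<>"/dev/tcp/127.0.0.1/${port}") 2>/dev/null
}

port="${VOCALIZE_PORT:-8080}"
if port_in_use "$port"; then
  echo "port ${port} is already in use"
else
  echo "port ${port} looks free"
fi
```

`sudo ss -ltnp` gives a fuller picture, including which process owns each listening socket.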

---

## Hardware example: Raspberry Pi

The Raspberry Pi was the original reference target for this runbook. None of
the steps above are Pi-specific; this section just captures the bits that
differ when the orchestrator host happens to be a Pi.

### BOM

- Raspberry Pi 4 or Pi 5, **8 GB RAM recommended** (4 GB works for the
orchestrator alone but is tight if other services run alongside).
- 32 GB+ microSD card or USB SSD (SSD strongly recommended for production).
- Reliable internet connection (Cloudflare Tunnel requires outbound HTTPS).

### Imaging and first boot

```bash
# Flash Raspberry Pi OS Lite (64-bit) to the SD card / SSD using Raspberry
# Pi Imager. In Imager, pre-configure:
# - hostname
# - SSH enabled
# - SSH public key (paste your ~/.ssh/id_ed25519.pub or generate one first)
# - Wi-Fi credentials (if not using Ethernet)

# After first boot, SSH in and update the system:
ssh pi@<pi-hostname>
sudo apt-get update && sudo apt-get upgrade -y

# Ensure git and curl are present:
sudo apt-get install -y git curl
```

From here on, the rest of this runbook (Tailscale, install, Cloudflare
Tunnel, smoke) applies unchanged.
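
Whatever the hardware, the service depends on the values in `/opt/vocalize/.env`. The sketch below reports required keys that are missing or empty. It assumes `OPENAI_API_KEY` and `VOCALIZE_WS_BASE_URL` are the keys to enforce, as the environment-variable tables suggest. `missing_env_keys` is a hypothetical helper, not something the installer provides:

```bash
#!/usr/bin/env bash
# Hypothetical check (not part of install/install.sh): prints each required
# key that is absent or empty in a KEY=value env file and returns non-zero
# when at least one is missing.
missing_env_keys() {
  local file="$1"; shift
  local key status=0
  for key in "$@"; do
    # Require "KEY=" at the start of a line followed by at least one character.
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "$key"
      status=1
    fi
  done
  return "$status"
}

env_file="${1:-/opt/vocalize/.env}"
if [ -r "$env_file" ]; then
  missing_env_keys "$env_file" OPENAI_API_KEY VOCALIZE_WS_BASE_URL \
    || echo "set the keys listed above, then: sudo systemctl restart vocalize"
else
  echo "cannot read ${env_file}"
fi
```

Add `GPU_HOST` to the key list when the deployment uses the GPU services.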
6 changes: 3 additions & 3 deletions docs/deploy/local.md
@@ -72,12 +72,12 @@ $EDITOR .env
| `OPENAI_API_KEY` | **yes** | LLM authentication — any OpenAI-compatible provider (OpenAI, DeepSeek, Qwen, etc.) |
| `OPENAI_BASE_URL` | default ok | LLM endpoint; default `https://api.deepseek.com/v1` |
| `OPENAI_MODEL` | default ok | Model name; default `deepseek-chat` |
| `GPU_HOST` | only if using GPU | STT/TTS host; use `localhost` for single-machine dev, Tailscale IP for Pi deployment |
| `GPU_HOST` | only if using GPU | STT/TTS host; use `localhost` for single-machine dev, Tailscale IP for remote-GPU deployment (e.g. Raspberry Pi orchestrator → GPU node) |
| `SENSEVOICE_WS_PORT` | default ok | SenseVoice STT WebSocket port; default `8000` |
| `COSYVOICE_WS_PORT` | default ok | CosyVoice TTS WebSocket port; default `8001` |
| `VOCALIZE_HOST` | default ok | uvicorn bind host; `127.0.0.1` for local dev, `0.0.0.0` for production |
| `VOCALIZE_PORT` | default ok | uvicorn bind port; default `8080` (note: dev `main.py` defaults to 8000) |
| `ORCHESTRATOR_LISTEN_PORT` | default ok | Pi service port; default `8080` (legacy; mirrors `VOCALIZE_PORT`) |
| `ORCHESTRATOR_LISTEN_PORT` | default ok | Orchestrator service port; default `8080` (legacy; mirrors `VOCALIZE_PORT`) |
| `VOCALIZE_WS_BASE_URL` | required when non-localhost | Public WS base URL (e.g. `wss://api.example.com`); startup raises if missing in non-localhost mode (D-11) |
| `VOCALIZE_CORS_ORIGINS` | default ok | Comma-separated allowed CORS origins; auto-picked from VOCALIZE_HOST in dev mode |
| `DEFAULT_LANGUAGE` | default ok | Session default language; `zh` or `en`; default `zh` |
@@ -178,7 +178,7 @@ cd frontend && npm run test:integration
```

Note: `tests/integration/` release-audio cases require a physical audio setup
(microphone + speaker) and a live Pi orchestrator. These are gated behind
(microphone + speaker) and a live Linux-host orchestrator. These are gated behind
`--release-audio` and do not run in PR CI. The standard integration test suite
(`npm run test:integration`) runs the 8 text-bypass AI-merchant scenarios and
does not require physical hardware.
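
This change renames `install/pi-install.sh`, `docs/deploy/pi.md`, and `infra/pi-orchestrator/`, so leftover references to the old paths are worth searching for. A minimal sketch of such a search, using the hypothetical helper name `find_stale_pi_refs`:

```bash
#!/usr/bin/env bash
# Hypothetical follow-up check: list lines under a directory that still
# mention the pre-rename paths touched by this change.
find_stale_pi_refs() {
  local root="${1:-.}"
  # -r recurse, -n show line numbers, -F treat patterns as fixed strings.
  grep -rnF \
    -e 'pi-install.sh' \
    -e 'docs/deploy/pi.md' \
    -e 'infra/pi-orchestrator' \
    --exclude-dir=.git \
    "$root"
}

find_stale_pi_refs . || echo "no stale references found"
```

Each hit is printed as `file:line:text`; a changelog entry that mentions the old names on purpose will also show up and can be ignored.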