Deploying VocalizeAI on a Linux Host

This runbook covers end-to-end production deployment of VocalizeAI on any modern Linux host with systemd: the orchestrator runs on this host, GPU services (SenseVoice STT + CosyVoice TTS) run on a separate machine reachable over Tailscale, and a Cloudflare Tunnel fronts the orchestrator host to the public internet.

Tested on Debian 12, Ubuntu 22.04 / 24.04, and Raspberry Pi OS (Bookworm). A Raspberry Pi was the original reference target — see "Hardware example: Raspberry Pi" below for the BOM, OS imaging, and SSH bootstrap steps for that specific target.

Bill of Materials

Orchestrator host:

Any 64-bit Linux box with systemd, ≥ 2 GB RAM, ≥ 16 GB free disk.
Python 3.11 (installed by step 1 of install/install.sh).
Persistent internet connection (Cloudflare Tunnel requires outbound HTTPS).

GPU node (separate machine):

NVIDIA RTX-class GPU (GTX 1080 or better; RTX 30/40 series recommended).
Windows + WSL2 or Linux (PyTorch 2.7.1+cu128).
Reachable from the orchestrator host over Tailscale on the configured GPU_HOST IP/hostname.

Network:

Tailscale account (free tier is sufficient) with both the orchestrator host and the GPU node enrolled.
Cloudflare account with a domain pointed at Cloudflare DNS (free tier is sufficient).

OS Preparation

# On the orchestrator host (any modern 64-bit Linux with systemd):
ssh <user>@<host>
sudo apt-get update && sudo apt-get upgrade -y

# Ensure git and curl are present:
sudo apt-get install -y git curl

For the Raspberry Pi-specific imaging / first-boot steps, see "Hardware example: Raspberry Pi".

Tailscale Setup

Tailscale provides the encrypted overlay network between the orchestrator host and the GPU node.

# Install Tailscale on the orchestrator host:
curl -fsSL https://tailscale.com/install.sh | sh
sudo tailscale up

# Verify the GPU node is visible:
tailscale status
# You should see your GPU node listed with its Tailscale IP.

# Test reachability (replace <tailscale-ip> with your GPU node's Tailscale IP):
# nc -zv <tailscale-ip> 8000   # SenseVoice STT
# nc -zv <tailscale-ip> 8001   # CosyVoice TTS

Set GPU_HOST in /opt/vocalize/.env to the GPU node's Tailscale IP.

If the GPU services are not yet running, use install/install.sh --skip-gpu to proceed with installation without the GPU-reachability check.

Clone and Install

# Clone the repository to the orchestrator host:
git clone https://github.com/DGPisces/VocalizeAI.git /opt/vocalize
cd /opt/vocalize

# Dry-run first to preview all 7 steps:
bash install/install.sh --dry-run

# Run the full installer:
bash install/install.sh

# Or run selectively:
bash install/install.sh --steps "1,2,6"   # apt + venv + systemd only
bash install/install.sh --skip-gpu         # skip GPU-reachability check in step 7
bash install/install.sh --skip-tunnel      # skip step 5 (Cloudflare Tunnel info)

Installer steps:

Step	Action
1	`apt-get install` python3.11 python3.11-venv python3-pip build-essential rsync
2	Create `.venv` in `/opt/vocalize`, `pip install -e .`
3	GPU services note (GPU lives on a separate host; no on-orchestrator install)
4	Tailscale presence check (warns if absent)
5	Cloudflare Tunnel token-install instructions
6	Copy `vocalize.service` to `/etc/systemd/system/`, copy `.env.template` to `/opt/vocalize/.env` if absent, `systemctl enable vocalize`
7	`systemctl restart vocalize` + `bash scripts/smoke.sh`

The installer is idempotent — every step is safe to re-run on an existing deployment.

Environment Configuration

After step 6 copies .env.template to /opt/vocalize/.env, edit it:

sudo nano /opt/vocalize/.env

Full env-var reference (17 keys from .env.example):

Key	Required?	Purpose
`OPENAI_API_KEY`	yes	LLM authentication (any OpenAI-compatible provider)
`OPENAI_BASE_URL`	default ok	LLM endpoint; default `https://api.deepseek.com/v1`
`OPENAI_MODEL`	default ok	Model name; default `deepseek-chat`
`GPU_HOST`	yes (if using GPU)	STT/TTS host — Tailscale IP of your GPU node
`SENSEVOICE_WS_PORT`	default ok	STT port; default `8000`
`COSYVOICE_WS_PORT`	default ok	TTS port; default `8001`
`VOCALIZE_HOST`	default ok	uvicorn bind host; set to `0.0.0.0` for production
`VOCALIZE_PORT`	default ok	uvicorn bind port; default `8080`
`ORCHESTRATOR_LISTEN_PORT`	default ok	Orchestrator service port; default `8080` (legacy compatibility)
`VOCALIZE_WS_BASE_URL`	yes	Public WS base URL; e.g. `wss://api.<your-domain>` — startup raises if missing in non-localhost mode
`VOCALIZE_CORS_ORIGINS`	default ok	Comma-separated allowed CORS origins; default auto-picked from VOCALIZE_HOST
`DEFAULT_LANGUAGE`	default ok	`zh` or `en`; default `zh`
`LOG_DIR`	default ok	Log directory; default `logs`
`NEXT_PUBLIC_VOCALIZE_API_BASE_URL`	yes (for frontend)	Frontend API base URL; baked into JS bundle at build time
`NEXT_PUBLIC_VOCALIZE_WS_BASE_URL`	optional	Frontend WS base; derived from API base if absent

Example production .env (use your own values for all <...> placeholders):

OPENAI_API_KEY=<your-openai-compatible-api-key>
OPENAI_BASE_URL=https://api.deepseek.com/v1
OPENAI_MODEL=deepseek-chat
GPU_HOST=<tailscale-ip-of-gpu-node>
SENSEVOICE_WS_PORT=8000
COSYVOICE_WS_PORT=8001
VOCALIZE_HOST=0.0.0.0
VOCALIZE_PORT=8080
VOCALIZE_WS_BASE_URL=wss://api.<your-domain>
VOCALIZE_CORS_ORIGINS=https://<your-domain>

Cloudflare Tunnel

The Cloudflare Tunnel connects the orchestrator host to the public internet without exposing SSH or opening firewall ports.

Token-based install (recommended):

# Get your tunnel token from the Cloudflare dashboard:
# Zero Trust -> Networks -> Tunnels -> [your tunnel] -> Configure
# -> "Install and run a connector" -> Copy the displayed token

# Install the tunnel service on the orchestrator host:
sudo cloudflared service install <TUNNEL_TOKEN>

# Verify the service is running:
sudo systemctl status cloudflared

The reference ingress shape for this project is documented in infra/orchestrator/cloudflared-config.yml (maps vocalize-api.<your-tunnel-name> → http://localhost:8080 and vocalize.<your-tunnel-name> → http://localhost:3000). Configure the actual public hostname routing in the Cloudflare dashboard under your tunnel's Public Hostnames tab.

Use <your-tunnel-name> as the tunnel name — do not use another person's tunnel name; tunnels are account-specific.

systemd Units

vocalize.service

The vocalize.service unit file is at infra/orchestrator/vocalize.service and is copied to /etc/systemd/system/vocalize.service by step 6 of the installer.

[Unit]
Description=VocalizeAI Orchestrator
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=vocalize
WorkingDirectory=/opt/vocalize
ExecStart=/opt/vocalize/.venv/bin/python -m vocalize.main
Restart=always
RestartSec=5
EnvironmentFile=/opt/vocalize/.env

[Install]
WantedBy=multi-user.target

Note: The unit runs as the vocalize system user. Create it if it does not exist: sudo useradd --system --home /opt/vocalize --shell /bin/false vocalize then sudo chown -R vocalize:vocalize /opt/vocalize.

cloudflared.service

cloudflared service install <TOKEN> installs its own systemd unit automatically. No manual unit file is needed.

Useful commands:

# Check status of both services:
sudo systemctl status vocalize cloudflared

# Tail orchestrator logs:
sudo journalctl -u vocalize -f

# Tail tunnel logs:
sudo journalctl -u cloudflared -n 50

# Restart orchestrator after a config change:
sudo systemctl restart vocalize

Smoke Verification

After the installer completes (step 7 runs this automatically), verify the deployment:

# Smoke test against the orchestrator's local port:
VOCALIZE_API_BASE=http://127.0.0.1:8080 bash scripts/smoke.sh
# Exit 0 = working deployment

# Smoke test against the public Cloudflare Tunnel URL:
VOCALIZE_API_BASE=https://api.<your-domain> bash scripts/smoke.sh

The smoke script exercises 6 round-trips: GET /health, POST /api/sessions, POST /api/sessions/{id}/task, WS upgrade + send/recv, DELETE /api/sessions/{id}.

Note: the local smoke uses port 8080 (production port), not 8000 (dev port). Make sure VOCALIZE_API_BASE is set accordingly.

24-Hour Stability Check

After the deployment has been running for 24 hours, verify stability:

# Check orchestrator status and uptime:
sudo systemctl status vocalize
# Expected: active (running) with significant uptime

# Check restart count (should be 0 on a stable deployment):
systemctl show vocalize -p NRestarts

# Run a fresh smoke to confirm still operational:
VOCALIZE_API_BASE=http://127.0.0.1:8080 bash scripts/smoke.sh

The maintainer-side scripts/stability-24h-driver.py script is the rehearsal harness used to generate the v1.0 stability evidence run. External operators do not need to run it; the systemctl/smoke.sh checks above are the operator-facing stability gate.

Troubleshooting

Tunnel not connecting:

sudo journalctl -u cloudflared -n 50

Check the Cloudflare dashboard for connector status. Ensure the tunnel token matches the connector the dashboard expects.

GPU services unreachable (gpu_reachable=false in /health):

# Confirm Tailscale is up and the GPU node is reachable:
tailscale status

# Test TCP connectivity to each GPU service port:
nc -zv $GPU_HOST $SENSEVOICE_WS_PORT    # e.g. nc -zv 100.x.y.z 8000
nc -zv $GPU_HOST $COSYVOICE_WS_PORT     # e.g. nc -zv 100.x.y.z 8001

Check that SenseVoice and CosyVoice are running on the GPU host.

Orchestrator fails to start:

sudo journalctl -u vocalize -n 50

Common causes:

OPENAI_API_KEY missing or invalid — Layer 1 will fail on the first task
VOCALIZE_WS_BASE_URL unset in production mode — raises RuntimeError at startup (D-11 guard)
.env file not at /opt/vocalize/.env — check EnvironmentFile= in the unit

Port conflicts:

VOCALIZE_PORT defaults to 8080. If another service occupies that port, change VOCALIZE_PORT in .env and update the Cloudflare Tunnel ingress rule accordingly.

Hardware example: Raspberry Pi

The Raspberry Pi was the original reference target for this runbook. None of the steps above are Pi-specific; this section just captures the bits that differ when the orchestrator host happens to be a Pi.

BOM

Raspberry Pi 4 or Pi 5, 8 GB RAM recommended (4 GB works for the orchestrator alone but is tight if other services run alongside).
32 GB+ microSD card or USB SSD (SSD strongly recommended for production).
Reliable internet connection (Cloudflare Tunnel requires outbound HTTPS).

Imaging and first boot

# Flash Raspberry Pi OS Lite (64-bit) to the SD card / SSD using Raspberry
# Pi Imager. In Imager, pre-configure:
#   - hostname
#   - SSH enabled
#   - SSH public key (paste your ~/.ssh/id_ed25519.pub or generate one first)
#   - Wi-Fi credentials (if not using Ethernet)

# After first boot, SSH in and update the system:
ssh pi@<pi-hostname>
sudo apt-get update && sudo apt-get upgrade -y

# Ensure git and curl are present:
sudo apt-get install -y git curl

From here on, the rest of this runbook (Tailscale, install, Cloudflare Tunnel, smoke) applies unchanged.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deploying VocalizeAI on a Linux Host

Bill of Materials

OS Preparation

Tailscale Setup

Clone and Install

Environment Configuration

Cloudflare Tunnel

systemd Units

vocalize.service

cloudflared.service

Smoke Verification

24-Hour Stability Check

Troubleshooting

Hardware example: Raspberry Pi

BOM

Imaging and first boot

FilesExpand file tree

linux.md

Latest commit

History

linux.md

File metadata and controls

Deploying VocalizeAI on a Linux Host

Bill of Materials

OS Preparation

Tailscale Setup

Clone and Install

Environment Configuration

Cloudflare Tunnel

systemd Units

vocalize.service

cloudflared.service

Smoke Verification

24-Hour Stability Check

Troubleshooting

Hardware example: Raspberry Pi

BOM

Imaging and first boot