AI phone support demo using Vapi. Callers dial a real phone number, talk to a voice agent, and the agent uses tools to look up orders, check hours, search policies, and escalate to humans.
Caller dials phone number
        │
        ▼
Vapi (voice platform)
  ├── Deepgram STT (speech → text)
  ├── GPT-4.1 LLM (decides what to say / which tool to call)
  ├── Vapi TTS (text → speech)
  │
  └── Tool calls ──POST──▶ This server (FastAPI)
                             ├── lookup_order
                             ├── business_hours
                             ├── search_kb
                             └── escalate_to_human
Vapi handles the entire voice pipeline. This server is a stateless tool server — it only gets called when the LLM needs data.
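To make "stateless tool server" concrete, here is a minimal sketch of the dispatch step. It is framework-free so it runs on its own; in this repo the equivalent logic would sit behind the FastAPI route. The payload shape (`message.toolCallList` with `id`, `name`, `arguments`) and the `results` response shape are assumptions about Vapi's tool-call format, and the handler bodies are placeholders, not the code in `handlers.py`.

```python
from typing import Any, Callable


# Placeholder handlers: hypothetical stand-ins for tools/handlers.py.
def business_hours(**_: Any) -> str:
    return "Placeholder: hours would be looked up here."


def escalate_to_human(reason: str = "unspecified", **_: Any) -> str:
    return f"Placeholder: escalation requested ({reason})."


# Tool name -> handler. Only two of the four tools are stubbed here.
HANDLERS: dict[str, Callable[..., str]] = {
    "business_hours": business_hours,
    "escalate_to_human": escalate_to_human,
}


def handle_tool_calls(payload: dict[str, Any]) -> dict[str, Any]:
    """Run every tool call in one webhook payload and collect the results.

    Nothing is remembered between requests: all inputs come from the payload.
    """
    # Assumed payload shape: {"message": {"toolCallList": [...]}}
    calls = payload.get("message", {}).get("toolCallList", [])
    results = []
    for call in calls:
        handler = HANDLERS.get(call.get("name", ""))
        if handler is None:
            result = f"Unknown tool: {call.get('name')}"
        else:
            result = handler(**(call.get("arguments") or {}))
        # Assumed response shape: one entry per call, keyed by its id.
        results.append({"toolCallId": call.get("id"), "result": result})
    return {"results": results}
```

Because the function depends only on its argument, it can be exercised with a hand-written payload before a real phone number is involved.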
python -m venv venv && source venv/bin/activate
pip install -e ".[dev]"
python scripts/index_kb.py
# Start the server
uvicorn respondo.main:app --reload --port 8000
# Expose publicly for Vapi
cloudflared tunnel --url http://localhost:8000
# or: ngrok http 8000
# Fill in keys
cp .env.example .env
# Set VAPI_API_KEY, VAPI_PUBLIC_KEY, PUBLIC_WEBHOOK_URL
# Create the Vapi assistant
# set -a exports the variables so the Python process can read them
set -a && source .env && set +a && python scripts/provision_vapi.py
# Attach the assistant to a phone number in the Vapi dashboard
# Call your number!

| Order | Zip | Status |
|---|---|---|
| NW-1001 | 97201 | Shipped |
| NW-1002 | 10001 | Processing |
| NW-1003 | 94110 | Delivered |
| NW-1004 | 60614 | Delayed (weather) |
| NW-1005 | 30301 | Out for delivery |
| NW-1006 | 02139 | Lost |
| NW-1007 | 98101 | Refunded |
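As a rough sketch of how `lookup_order` might use these mock orders, the snippet below returns a status only when the order number and zip match. Using the zip as a verification step, the dict layout, and the function signature are all assumptions; the real implementation lives in `mocks/orders.py` and `tools/handlers.py` and may differ.

```python
from typing import Optional

# Mock data copied from the table above. The dict layout is an assumption.
# Zips are stored as strings so "02139" keeps its leading zero.
ORDERS = {
    "NW-1001": {"zip": "97201", "status": "Shipped"},
    "NW-1002": {"zip": "10001", "status": "Processing"},
    "NW-1003": {"zip": "94110", "status": "Delivered"},
    "NW-1004": {"zip": "60614", "status": "Delayed (weather)"},
    "NW-1005": {"zip": "30301", "status": "Out for delivery"},
    "NW-1006": {"zip": "02139", "status": "Lost"},
    "NW-1007": {"zip": "98101", "status": "Refunded"},
}


def lookup_order(order_id: str, zip_code: str) -> Optional[str]:
    """Return the order status, or None if the order/zip pair does not match."""
    # Normalise case and whitespace, since spoken input is transcribed loosely.
    order = ORDERS.get(order_id.strip().upper())
    if order is None or order["zip"] != zip_code.strip():
        return None
    return order["status"]
```

Returning `None` for both an unknown order and a wrong zip keeps the caller from learning which of the two was incorrect.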
respondo-vapi/
├── agent/
│ ├── config.yaml # Vapi assistant config
│ └── system_prompt.md # Voice agent prompt
├── kb/*.md # Knowledge base (5 files)
├── server/respondo/
│ ├── main.py # FastAPI webhook server
│ ├── db.py # SQLite call logging
│ ├── tools/
│ │ ├── schemas.py # Tool definitions (Pydantic + Vapi format)
│ │ └── handlers.py # Tool implementations
│ └── mocks/orders.py # Mock order data
├── scripts/
│ ├── provision_vapi.py # Create/update Vapi assistant
│ └── index_kb.py # Index KB → SQLite FTS5
└── docker-compose.yaml
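The "Index KB → SQLite FTS5" step can be sketched with the standard-library `sqlite3` module, assuming the local SQLite build includes FTS5. The table name `kb`, its columns, and both function signatures are hypothetical; `scripts/index_kb.py` may be organised differently.

```python
import sqlite3


def build_index(docs: dict[str, str]) -> sqlite3.Connection:
    """Create an in-memory FTS5 table and load one row per document."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: file path plus full text of the markdown file.
    conn.execute("CREATE VIRTUAL TABLE kb USING fts5(path, body)")
    conn.executemany("INSERT INTO kb (path, body) VALUES (?, ?)", docs.items())
    conn.commit()
    return conn


def search_kb(conn: sqlite3.Connection, query: str, limit: int = 3) -> list[str]:
    """Return paths of the best-matching documents, best match first."""
    # The query is passed to FTS5 as-is, so FTS5 query syntax applies;
    # a real handler would likely sanitise punctuation first.
    rows = conn.execute(
        "SELECT path FROM kb WHERE kb MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
    return [row[0] for row in rows]
```

A file-backed database would replace `":memory:"` with a path so the index survives between the indexing script and the server.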
See also: respondo-ollama — same tools running locally with Ollama (no API keys needed).