An AI-powered voice assistant that answers clinic phone calls, transcribes patient requests, and handles appointment requests automatically.
Small and mid-sized medical clinics can miss dozens of calls a day. Front-desk staff juggle in-person patients, paperwork, and billing, which leaves phones ringing and patients frustrated. Missed calls mean lost appointments and lost revenue.
ClinicGuard-AI plugs into your clinic's existing phone number via Twilio. When a patient calls, the system:
- Greets them with a natural voice prompt
- Records their message
- Transcribes the audio with OpenAI Whisper
- Generates a contextual reply with a local Llama 3 model
- Speaks the response back using ElevenLabs text-to-speech
- Persists the conversation to a database for staff review
No missed calls. No hold music. No extra headcount.
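The steps above can be sketched as one turn of the call loop. This is a minimal sketch under stated assumptions, not the project's actual code: `handle_turn`, `CallSession`, and `TurnResult` are hypothetical names, and the three callables stand in for Whisper, the Llama model, and the TTS client so the flow can be read and run without any models installed.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]  # (speaker, text) pairs


@dataclass
class TurnResult:
    transcript: str
    reply: str
    audio_path: str


@dataclass
class CallSession:
    """Hypothetical per-call state kept between turns."""
    history: History = field(default_factory=list)


def handle_turn(
    recording_path: str,
    session: CallSession,
    transcribe: Callable[[str], str],
    generate: Callable[[str, History], str],
    speak: Callable[[str], str],
) -> TurnResult:
    """Run one transcribe -> generate -> speak cycle and log it to the session."""
    transcript = transcribe(recording_path).strip()
    # The model sees the earlier turns plus the new transcript.
    reply = generate(transcript, session.history)
    audio_path = speak(reply)
    session.history.append(("patient", transcript))
    session.history.append(("assistant", reply))
    return TurnResult(transcript, reply, audio_path)


if __name__ == "__main__":
    # Stub stages in place of Whisper, Llama 3, and ElevenLabs.
    session = CallSession()
    result = handle_turn(
        "recording.wav",
        session,
        transcribe=lambda path: " I need an appointment ",
        generate=lambda text, history: f"You said: {text}",
        speak=lambda text: "reply.mp3",
    )
    print(result)
```

Passing the stages in as callables is one way to keep the flow testable with stubs; the real handlers may be wired differently.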
- Automated call handling — Twilio SIP integration with a complete inbound call flow
- Speech-to-text — OpenAI Whisper transcription (runs locally, so call audio is not sent to an external transcription service)
- AI responses — Llama 3 8B (quantized GGUF) generates context-aware replies
- Text-to-speech — ElevenLabs API for natural-sounding voice; falls back to local TTS
- Conversation memory — Ephemeral (per-call) or persistent (database-backed, cross-call) sessions
- Patient records — SQLite/PostgreSQL storage of calls, conversation logs, and summaries
- Auto-summarization — Conversation summaries saved per patient for future context (Llama or OpenAI)
- Twilio signature validation — Webhooks verified in production; skipped in dev
- Docker support — Single-container deployment
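To illustrate the Twilio signature validation feature listed above, here is a minimal standard-library sketch of the scheme Twilio documents for form-encoded webhooks: HMAC-SHA1 over the request URL followed by the sorted POST parameters, base64-encoded and compared with the `X-Twilio-Signature` header. The function names and the `environment` switch are assumptions made for illustration; they are not taken from this project's code.

```python
import base64
import hashlib
import hmac
from typing import Mapping


def compute_twilio_signature(auth_token: str, url: str, params: Mapping[str, str]) -> str:
    """Compute the value Twilio is expected to send in X-Twilio-Signature."""
    # Full URL, then each POST parameter name and value, sorted by name,
    # concatenated with no separators.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(
    auth_token: str,
    url: str,
    params: Mapping[str, str],
    header_signature: str,
    environment: str = "production",
) -> bool:
    """Return True when the request signature matches (or validation is skipped)."""
    # Assumption: mirrors the "skipped in dev" behaviour via an ENVIRONMENT value.
    if environment != "production":
        return True
    expected = compute_twilio_signature(auth_token, url, params)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), header_signature.encode("utf-8"))
```

The URL passed in must be the public URL Twilio actually requested (for local testing, the ngrok address), not `localhost`. The official `twilio` Python package also ships a `RequestValidator` class that performs the same check; whether this project uses it is not stated here.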
Incoming call
     │
     ▼
Twilio SIP ──► POST /twilio/voice/answer
                    │ (greet + start recording)
                    ▼
           POST /twilio/voice
                    │
       ┌────────────┼────────────┐
       ▼            ▼            ▼
    Whisper      LLaMA 3     ElevenLabs
 (transcribe)   (respond)     (speak)
       └────────────┼────────────┘
                    │
                    ▼
          TwiML <Play> response
                    │
         POST /twilio/voice/end
                    │
       ┌────────────┴────────────┐
       ▼                         ▼
   Summarize            SQLite / PostgreSQL
(Llama/OpenAI)        (patients, calls, logs)
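The "TwiML <Play> response" step in the diagram can be sketched with the standard library alone. This is an illustrative sketch, not the project's handler: `build_play_response` is a hypothetical helper, and following `<Play>` with a `<Record>` that posts back to `/twilio/voice` is an assumption about how the loop continues. `<Play>`, `<Record>`, and the `action`, `method`, `maxLength`, and `playBeep` attributes are standard TwiML.

```python
import xml.etree.ElementTree as ET


def build_play_response(audio_url: str, next_action_url: str, max_length: int = 30) -> str:
    """Return TwiML that plays the spoken reply, then records the caller again."""
    response = ET.Element("Response")
    ET.SubElement(response, "Play").text = audio_url
    # Assumption: after the reply, record the next message and have Twilio
    # POST the recording details to `next_action_url` once recording ends.
    ET.SubElement(
        response,
        "Record",
        {
            "action": next_action_url,
            "method": "POST",
            "maxLength": str(max_length),
            "playBeep": "false",
        },
    )
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


if __name__ == "__main__":
    print(
        build_play_response(
            "https://example.ngrok.app/audio/reply.mp3",  # hypothetical URL
            "https://example.ngrok.app/twilio/voice",
        )
    )
```

The audio URL has to be reachable by Twilio, which is why the setup steps ask for a public URL.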
| Layer | Technology |
|---|---|
| API framework | FastAPI + Uvicorn |
| Voice / telephony | Twilio Programmable Voice |
| Speech-to-text | OpenAI Whisper (base, local) |
| LLM | Llama 3 8B Q4 via llama-cpp-python |
| Text-to-speech | ElevenLabs API / macOS say / pyttsx3 |
| Database | SQLAlchemy ORM, SQLite (dev) / PostgreSQL (prod) |
| Containerization | Docker + Docker Compose |
| Testing | pytest + unittest.mock |
- Python 3.8+
- Twilio account with a phone number
- ElevenLabs API key (optional — local TTS fallback available)
- Llama 3 8B GGUF model file
- ngrok for local webhook testing
git clone https://github.com/Shriiii01/Clinic-guard-AI.git
cd Clinic-guard-AI
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r server/requirements.txt
cp env.example .env
# Edit .env — at minimum set TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN,
# TWILIO_PHONE_NUMBER, ELEVENLABS_API_KEY, ELEVENLABS_VOICE_ID, PUBLIC_URL
mkdir -p models
# Download Meta-Llama-3-8B-Q4_0.gguf from HuggingFace and place it here:
# https://huggingface.co/TheBloke/Meta-Llama-3-8B-GGUF
# models/llama-3-8b-q4_0.gguf (~4 GB)
# From the project root
uvicorn server.main:app --reload --host 0.0.0.0 --port 8000
ngrok http 8000
# Copy the https URL, set PUBLIC_URL in .env
In the Twilio Console, set your phone number's Voice webhook to:
POST https://<your-ngrok-url>/twilio/voice/answer
| Method | Path | Description |
|---|---|---|
| GET | / | Service status |
| GET | /health | Health check |
| POST | /twilio/voice/answer | Initial call greeting + recording prompt |
| POST | /twilio/voice | Main call handler (transcribe → generate → speak) |
| POST | /twilio/voice/end | Call teardown, summarize, cleanup |
| POST | /api/process_audio | Direct audio upload (no Twilio required) |
Interactive API docs: http://localhost:8000/docs
pytest tests/ -v
Tests use mocked Whisper and LLaMA instances, so no model downloads are needed to run the test suite.
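As an illustration of testing against mocked models, the sketch below uses `unittest.mock` stand-ins that mimic the return shapes of Whisper's `transcribe()` (a dict with a `"text"` key) and a `llama_cpp.Llama` call (a dict with `choices[0]["text"]`). The helper names and the prompt format are hypothetical; the project's real fixtures live in `tests/conftest.py` and may look different.

```python
from unittest.mock import MagicMock


def make_mock_whisper(text: str) -> MagicMock:
    """Stand-in for a loaded Whisper model."""
    model = MagicMock()
    model.transcribe.return_value = {"text": text}
    return model


def make_mock_llama(reply: str) -> MagicMock:
    """Stand-in for a llama_cpp.Llama instance; calling it returns a completion dict."""
    llm = MagicMock()
    llm.return_value = {"choices": [{"text": reply}]}
    return llm


def respond(audio_path: str, whisper_model, llm) -> str:
    """Hypothetical helper: transcribe a recording, then ask the LLM for a reply."""
    transcript = whisper_model.transcribe(audio_path)["text"].strip()
    completion = llm(f"Patient: {transcript}\nAssistant:", max_tokens=128)
    return completion["choices"][0]["text"].strip()


def test_respond_uses_mocked_models():
    whisper_model = make_mock_whisper(" I'd like to reschedule. ")
    llm = make_mock_llama(" Sure, which day works for you? ")

    assert respond("call.wav", whisper_model, llm) == "Sure, which day works for you?"

    # The mocks also record how they were called, so the prompt can be checked.
    whisper_model.transcribe.assert_called_once_with("call.wav")
    llm.assert_called_once_with(
        "Patient: I'd like to reschedule.\nAssistant:", max_tokens=128
    )
```

Because nothing here loads a model file, a test like this runs in milliseconds and needs neither the GGUF file nor the Whisper weights.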
docker-compose up --build
The container copies the entire project to /app and runs uvicorn server.main:app. Mount your models/ directory via the volume in docker-compose.yml.
See env.example for all available environment variables with descriptions.
Key options:
| Variable | Default | Description |
|---|---|---|
| CLINICGUARD_MEMORY_BACKEND | ephemeral | ephemeral = in-memory per call, persistent = DB-backed cross-call memory |
| CLINICGUARD_SUMMARIZER_BACKEND | llama | llama = local model, openai = GPT-3.5 |
| CLINICGUARD_DB_PATH | sqlite:///clinicguard.db | SQLAlchemy database URL |
| ENVIRONMENT | development | Set to production to enable Twilio signature validation |
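The options in the table above could be read at startup along these lines. This is a sketch under stated assumptions rather than the project's code: the `Settings` class and `load_settings` function are hypothetical, and rejecting unknown backend names is an added design choice. Only the variable names, defaults, and allowed values come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    memory_backend: str
    summarizer_backend: str
    db_url: str
    environment: str

    @property
    def validate_twilio_signature(self) -> bool:
        # Signature validation is enabled only in production.
        return self.environment == "production"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    memory = env.get("CLINICGUARD_MEMORY_BACKEND", "ephemeral")
    summarizer = env.get("CLINICGUARD_SUMMARIZER_BACKEND", "llama")
    # Assumption: fail fast on a typo instead of silently using a default.
    if memory not in {"ephemeral", "persistent"}:
        raise ValueError(f"Unknown CLINICGUARD_MEMORY_BACKEND: {memory!r}")
    if summarizer not in {"llama", "openai"}:
        raise ValueError(f"Unknown CLINICGUARD_SUMMARIZER_BACKEND: {summarizer!r}")
    return Settings(
        memory_backend=memory,
        summarizer_backend=summarizer,
        db_url=env.get("CLINICGUARD_DB_PATH", "sqlite:///clinicguard.db"),
        environment=env.get("ENVIRONMENT", "development"),
    )


if __name__ == "__main__":
    print(load_settings())
```

Accepting an optional mapping keeps the function easy to exercise in tests without touching the real process environment.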
- Appointment calendar integration — Google Calendar / Calendly API to actually book slots
- IVR menu — Press 1 for appointments, 2 for prescriptions, etc.
- Web dashboard — Staff UI to review call logs, summaries, and missed calls
- SMS follow-ups — Twilio SMS confirmation after booking
- Streaming TTS — Reduce first-response latency with ElevenLabs streaming
- HIPAA hardening — At-rest encryption, audit logging, BAA with cloud providers
Clinic-guard-AI/
├── server/
│   ├── main.py                 # FastAPI app and startup
│   ├── twilio_router.py        # Twilio webhook handlers
│   ├── agent_services.py       # Whisper, LLaMA, TTS, memory backends
│   ├── pipeline_controller.py  # /api/process_audio direct endpoint
│   ├── db.py                   # SQLAlchemy models + init_db
│   ├── tts_handler.py          # ElevenLabs TTS client
│   ├── utils.py                # File helpers, env validation
│   ├── Dockerfile
│   └── requirements.txt
├── tests/
│   ├── conftest.py             # pytest fixtures (model mocks)
│   ├── test_memory.py          # Session memory + transcription tests
│   └── test_conversation.py    # Multi-turn context tests
├── scripts/
│   ├── test_pipeline.py        # End-to-end smoke test
│   └── generate_test_audio.py
├── models/                     # Place GGUF model files here (gitignored)
├── audio_files/                # Generated audio (gitignored)
├── docker-compose.yml
└── env.example
MIT