ClinicGuard-AI

An AI-powered voice assistant that answers clinic phone calls, transcribes patient requests, and responds by voice automatically. Booking slots in a calendar is not implemented yet (see Future Improvements).


The Problem

Small and mid-sized medical clinics miss dozens of calls daily. Front-desk staff are juggling in-person patients, paperwork, and billing — leaving ringing phones unanswered and patients frustrated. Missed calls mean lost appointments and lost revenue.

The Solution

ClinicGuard-AI plugs into your clinic's existing phone number via Twilio. When a patient calls, the system:

  1. Greets them with a natural voice prompt
  2. Records their message
  3. Transcribes the audio with OpenAI Whisper
  4. Generates a contextual reply with a local Llama 3 model
  5. Speaks the response back using ElevenLabs text-to-speech
  6. Persists the conversation to a database for staff review

No missed calls. No hold music. No extra headcount.
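
The per-call flow above can be sketched as a small pipeline object. This is a minimal illustration and not the project's actual code: `CallPipeline`, `CallTurn`, and the three callables are hypothetical stand-ins for Whisper, Llama 3, and ElevenLabs, and the history is kept in memory instead of a database.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class CallTurn:
    """One caller utterance plus the assistant's reply (hypothetical type)."""
    transcript: str
    reply: str
    audio_url: str


@dataclass
class CallPipeline:
    # Hypothetical stand-ins for the Whisper, Llama 3 and ElevenLabs services.
    transcribe: Callable[[bytes], str]
    generate: Callable[[str, List[Tuple[str, str]]], str]
    speak: Callable[[str], str]
    history: List[Tuple[str, str]] = field(default_factory=list)

    def handle_recording(self, audio: bytes) -> CallTurn:
        transcript = self.transcribe(audio)               # step 3: speech-to-text
        reply = self.generate(transcript, self.history)   # step 4: LLM reply
        audio_url = self.speak(reply)                     # step 5: text-to-speech
        self.history.append((transcript, reply))          # step 6: in-memory only here
        return CallTurn(transcript, reply, audio_url)


# Wire the pipeline with trivial stubs so the sketch runs without any models.
pipeline = CallPipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    generate=lambda text, history: f"You said: {text}",
    speak=lambda reply: f"https://example.invalid/audio/{len(reply)}.mp3",
)
turn = pipeline.handle_recording(b"I need an appointment")
print(turn.reply)
```

Passing the services in as callables keeps each stage replaceable, which is also what makes it easy to swap in mocks for testing.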


Features

  • Automated call handling — Twilio SIP integration with a complete inbound call flow
  • Speech-to-text — OpenAI Whisper transcription (runs locally, no data leaves your server)
  • AI responses — Llama 3 8B (quantized GGUF) generates context-aware replies
  • Text-to-speech — ElevenLabs API for natural-sounding voice; falls back to local TTS
  • Conversation memory — Ephemeral (per-call) or persistent (database-backed, cross-call) sessions
  • Patient records — SQLite/PostgreSQL storage of calls, conversation logs, and summaries
  • Auto-summarization — Conversation summaries saved per patient for future context (Llama or OpenAI)
  • Twilio signature validation — Webhooks verified in production; skipped in dev
  • Docker support — Single-container deployment
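
For the Twilio signature validation feature, the sketch below shows how the `X-Twilio-Signature` header for a form-encoded webhook is computed: the full request URL, followed by each POST parameter name and value in sorted key order, is signed with HMAC-SHA1 using the account's auth token and then base64-encoded. The function names are illustrative, and the project may well use the `RequestValidator` class from the Twilio helper library instead of hand-rolled code.

```python
import base64
import hashlib
import hmac
from typing import Mapping


def compute_twilio_signature(auth_token: str, url: str, params: Mapping[str, str]) -> str:
    """Return the signature Twilio would send for this URL and form body."""
    payload = url
    for key in sorted(params):
        # Each parameter is appended as name immediately followed by value.
        payload += key + params[key]
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_valid_twilio_request(
    auth_token: str, url: str, params: Mapping[str, str], header_signature: str
) -> bool:
    """Compare the expected signature with the header in constant time."""
    expected = compute_twilio_signature(auth_token, url, params)
    return hmac.compare_digest(expected, header_signature)
```

The URL must match what Twilio called, including scheme and host, so behind ngrok or a reverse proxy it should be built from the public URL, not from the local address.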

Architecture

Incoming call
     │
     ▼
Twilio SIP ──► POST /twilio/voice/answer
                       │  (greet + start recording)
                       ▼
               POST /twilio/voice
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
       Whisper       LLaMA 3    ElevenLabs
    (transcribe)   (respond)     (speak)
          └────────────┼────────────┘
                       │
                       ▼
               TwiML <Play> response
                       │
               POST /twilio/voice/end
                       │
          ┌────────────┴────────────┐
          ▼                         ▼
     Summarize                 SQLite / PostgreSQL
    (Llama/OpenAI)            (patients, calls, logs)
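
The two TwiML documents implied by the diagram (greet and record on answer, then play the synthesized reply) could look roughly like the following. This sketch builds the XML with the standard library only; the function names, the 30-second recording limit, and the choice to record again after each reply are assumptions, and the real handlers may use the Twilio helper library instead.

```python
import xml.etree.ElementTree as ET


def answer_twiml(greeting: str, action_url: str, max_length: int = 30) -> str:
    """TwiML for the initial webhook: speak a greeting, then record the caller."""
    response = ET.Element("Response")
    ET.SubElement(response, "Say").text = greeting
    # Twilio POSTs the finished recording's details to action_url.
    ET.SubElement(
        response, "Record",
        action=action_url, method="POST",
        maxLength=str(max_length), playBeep="true",
    )
    return ET.tostring(response, encoding="unicode")


def play_twiml(audio_url: str, next_action_url: str) -> str:
    """TwiML for a reply turn: play the generated audio, then record again."""
    response = ET.Element("Response")
    ET.SubElement(response, "Play").text = audio_url
    ET.SubElement(response, "Record", action=next_action_url, method="POST", maxLength="30")
    return ET.tostring(response, encoding="unicode")


print(answer_twiml("Thank you for calling the clinic.", "/twilio/voice"))
```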

Tech Stack

Layer              Technology
API framework      FastAPI + Uvicorn
Voice / telephony  Twilio Programmable Voice
Speech-to-text     OpenAI Whisper (base, local)
LLM                Llama 3 8B Q4 via llama-cpp-python
Text-to-speech     ElevenLabs API / macOS say / pyttsx3
Database           SQLAlchemy ORM, SQLite (dev) / PostgreSQL (prod)
Containerization   Docker + Docker Compose
Testing            pytest + unittest.mock

Setup

Prerequisites

1. Clone the repo

git clone https://github.com/Shriiii01/Clinic-guard-AI.git
cd Clinic-guard-AI

2. Create a virtual environment and install dependencies

python -m venv venv
source venv/bin/activate        # Windows: venv\Scripts\activate
pip install -r server/requirements.txt

3. Configure environment variables

cp env.example .env
# Edit .env — at minimum set TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN,
# TWILIO_PHONE_NUMBER, ELEVENLABS_API_KEY, ELEVENLABS_VOICE_ID, PUBLIC_URL
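
A startup check for the minimum variables listed above might look like this. It is only a sketch: the project structure mentions env validation in server/utils.py, but its contents are not shown here, so `REQUIRED_VARS`, `missing_env_vars`, and `require_env` are hypothetical names.

```python
import os
from typing import List, Mapping, Optional

# The minimum set named in the setup step above.
REQUIRED_VARS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_PHONE_NUMBER",
    "ELEVENLABS_API_KEY",
    "ELEVENLABS_VOICE_ID",
    "PUBLIC_URL",
)


def missing_env_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def require_env(env: Optional[Mapping[str, str]] = None) -> None:
    """Fail fast with one message that lists everything that is missing."""
    missing = missing_env_vars(env)
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
```

Reporting all missing names at once avoids the fix-one-rerun-fix-another loop when setting up a fresh .env file.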

4. Download the Llama 3 model

mkdir -p models
# Download Meta-Llama-3-8B-Q4_0.gguf (~4 GB) from HuggingFace:
# https://huggingface.co/TheBloke/Meta-Llama-3-8B-GGUF
# and save it as models/llama-3-8b-q4_0.gguf
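
Because a multi-gigabyte download is easy to interrupt or to save under the wrong name, a quick sanity check before starting the server can help. The sketch below relies only on the fact that GGUF files begin with the four ASCII bytes `GGUF`; the helper name is hypothetical and the check does not prove the file is complete or loadable.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def looks_like_gguf(path: str) -> bool:
    """Return True if the path is a regular file that starts with the GGUF magic."""
    model_path = Path(path)
    if not model_path.is_file():
        return False
    with model_path.open("rb") as handle:
        return handle.read(4) == GGUF_MAGIC


if __name__ == "__main__":
    print(looks_like_gguf("models/llama-3-8b-q4_0.gguf"))
```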

5. Run the server

# From the project root
uvicorn server.main:app --reload --host 0.0.0.0 --port 8000

6. Expose to the internet (local dev)

ngrok http 8000
# Copy the https URL, set PUBLIC_URL in .env

7. Configure Twilio webhooks

In the Twilio Console, set your phone number's Voice webhook to:

POST https://<your-ngrok-url>/twilio/voice/answer

API Reference

Method  Path                  Description
GET     /                     Service status
GET     /health               Health check
POST    /twilio/voice/answer  Initial call greeting + recording prompt
POST    /twilio/voice         Main call handler (transcribe → generate → speak)
POST    /twilio/voice/end     Call teardown, summarize, cleanup
POST    /api/process_audio    Direct audio upload (no Twilio required)

Interactive API docs: http://localhost:8000/docs


Running Tests

pytest tests/ -v

Tests use mocked Whisper and Llama instances, so no model downloads are needed to run the test suite.
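
As an illustration of the mocking approach, the sketch below replaces both models with `unittest.mock.MagicMock` objects. The helper `transcribe_and_reply` is hypothetical and is not taken from the repository; the only library behaviour assumed is that a Whisper model's `transcribe()` returns a dict with a `"text"` key.

```python
from unittest.mock import MagicMock


def transcribe_and_reply(whisper_model, llm, audio_path: str) -> str:
    """Hypothetical helper: transcribe a file, then ask the LLM for a reply."""
    text = whisper_model.transcribe(audio_path)["text"].strip()
    return llm(text)


# Stand-ins for the real models, so no weights need to be loaded.
fake_whisper = MagicMock()
fake_whisper.transcribe.return_value = {"text": " I need to reschedule. "}
fake_llm = MagicMock(return_value="Sure, which day works for you?")

reply = transcribe_and_reply(fake_whisper, fake_llm, "call.wav")
print(reply)
```

In a pytest suite the two mocks would typically live in fixtures (for example in conftest.py) so every test receives them without repeating the setup.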


Docker

docker-compose up --build

The container copies the entire project to /app and runs uvicorn server.main:app. Mount your models/ directory via the volume in docker-compose.yml.


Configuration Reference

See env.example for all available environment variables with descriptions.

Key options:

Variable                        Default                   Description
CLINICGUARD_MEMORY_BACKEND      ephemeral                 ephemeral = in-memory per call, persistent = DB-backed cross-call memory
CLINICGUARD_SUMMARIZER_BACKEND  llama                     llama = local model, openai = GPT-3.5
CLINICGUARD_DB_PATH             sqlite:///clinicguard.db  SQLAlchemy database URL
ENVIRONMENT                     development               Set to production to enable Twilio signature validation
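
One way to read these options with their documented defaults is sketched below. The `Settings` dataclass and `load_settings` function are hypothetical; only the variable names, defaults, and allowed values come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    memory_backend: str              # "ephemeral" or "persistent"
    summarizer_backend: str          # "llama" or "openai"
    db_url: str                      # SQLAlchemy database URL
    validate_twilio_signature: bool  # True only when ENVIRONMENT=production


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read the key options, falling back to the documented defaults."""
    env = os.environ if env is None else env
    memory = env.get("CLINICGUARD_MEMORY_BACKEND", "ephemeral").lower()
    summarizer = env.get("CLINICGUARD_SUMMARIZER_BACKEND", "llama").lower()
    if memory not in {"ephemeral", "persistent"}:
        raise ValueError(f"Unknown memory backend: {memory!r}")
    if summarizer not in {"llama", "openai"}:
        raise ValueError(f"Unknown summarizer backend: {summarizer!r}")
    return Settings(
        memory_backend=memory,
        summarizer_backend=summarizer,
        db_url=env.get("CLINICGUARD_DB_PATH", "sqlite:///clinicguard.db"),
        validate_twilio_signature=env.get("ENVIRONMENT", "development").lower() == "production",
    )


print(load_settings({}))
```

Rejecting unknown backend names at startup surfaces a typo in .env immediately instead of silently falling back to a default.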

Future Improvements

  • Appointment calendar integration — Google Calendar / Calendly API to actually book slots
  • IVR menu — Press 1 for appointments, 2 for prescriptions, etc.
  • Web dashboard — Staff UI to review call logs, summaries, and missed calls
  • SMS follow-ups — Twilio SMS confirmation after booking
  • Streaming TTS — Reduce first-response latency with ElevenLabs streaming
  • HIPAA hardening — At-rest encryption, audit logging, BAA with cloud providers

Project Structure

Clinic-guard-AI/
├── server/
│   ├── main.py               # FastAPI app and startup
│   ├── twilio_router.py      # Twilio webhook handlers
│   ├── agent_services.py     # Whisper, LLaMA, TTS, memory backends
│   ├── pipeline_controller.py # /api/process_audio direct endpoint
│   ├── db.py                 # SQLAlchemy models + init_db
│   ├── tts_handler.py        # ElevenLabs TTS client
│   ├── utils.py              # File helpers, env validation
│   ├── Dockerfile
│   └── requirements.txt
├── tests/
│   ├── conftest.py           # pytest fixtures (model mocks)
│   ├── test_memory.py        # Session memory + transcription tests
│   └── test_conversation.py  # Multi-turn context tests
├── scripts/
│   ├── test_pipeline.py      # End-to-end smoke test
│   └── generate_test_audio.py
├── models/                   # Place GGUF model files here (gitignored)
├── audio_files/              # Generated audio (gitignored)
├── docker-compose.yml
└── env.example

License

MIT
