Pet Insurance Claims AI Automation (Rule + ML + Document Intelligence) – Production-minded POC

End-to-end applied AI system for automated claims decisioning:

• Ground-truth dataset generation (SQL + DuckDB)
• Coverage rule engine + ML approval model
• Invoice text → structured feature extraction
• FastAPI real-time scoring service
• Metrics endpoint for operational monitoring

This project mirrors a production Data Scientist workflow for modernizing the claims lifecycle: raw data → validated gold layer → trained model → explainable API decision.

What this demonstrates

• Ground truth & trusted reporting: SQL-first gold dataset with validation checks
• Structured + unstructured ML features: policy data + invoice text extraction
• Decision automation: rule system + probability-based ML recommendations
• Full model lifecycle: train → evaluate → versioned artifact
• Production mindset: low-latency API with operational metrics
• Explainability: human-readable decision reasoning

3-minute demo flow

1) Generate synthetic claims and invoices
2) Build analytics-ready gold dataset
3) Train approval model
4) Start FastAPI service
5) Submit a claim → receive decision, confidence, payout, and explanation

Quick start

1) Create environment

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

2) Generate data + build gold dataset

python -m src.data.generate_synthetic_data --out-dir data/raw --n 2000
python -m src.data.build_gold_dataset --raw-dir data/raw --db-path data/warehouse.duckdb
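
To illustrate what a "SQL-first gold dataset with validation checks" can look like, here is a minimal sketch. It is not the repo's actual build script: the table names (`raw_claims`, `raw_policies`, `gold_claims`), the column names, and the three checks are assumptions made for illustration. The standard-library `sqlite3` module stands in for DuckDB so the snippet runs without extra dependencies; the SQL is plain enough that it should carry over to DuckDB with little change.

```python
import sqlite3


def build_and_validate(conn):
    """Build a gold table by joining raw claims to policies, then count rule violations."""
    conn.executescript("""
        CREATE TABLE gold_claims AS
        SELECT c.claim_id, c.policy_id, p.plan_tier, c.claimed_amount
        FROM raw_claims c
        JOIN raw_policies p ON p.policy_id = c.policy_id;
    """)
    # Each check returns a single count; 0 means the check passed.
    checks = {
        "duplicate_claim_ids": """
            SELECT COUNT(*) FROM (
                SELECT claim_id FROM gold_claims
                GROUP BY claim_id HAVING COUNT(*) > 1
            )""",
        "non_positive_amounts":
            "SELECT COUNT(*) FROM gold_claims WHERE claimed_amount <= 0",
        "claims_dropped_by_join": """
            SELECT (SELECT COUNT(*) FROM raw_claims)
                 - (SELECT COUNT(*) FROM gold_claims)""",
    }
    return {name: conn.execute(sql).fetchone()[0] for name, sql in checks.items()}


# Tiny hypothetical raw layer: the third claim points at a policy that does not exist.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_policies (policy_id TEXT, plan_tier TEXT);
    CREATE TABLE raw_claims (claim_id TEXT, policy_id TEXT, claimed_amount REAL);
    INSERT INTO raw_policies VALUES ('POL-00010', 'PREMIUM'), ('POL-00011', 'BASIC');
    INSERT INTO raw_claims VALUES
        ('CLM-001', 'POL-00010', 350.0),
        ('CLM-002', 'POL-00011', 120.0),
        ('CLM-003', 'POL-99999', 80.0);
""")
violations = build_and_validate(conn)
print(violations)
```

A build step could fail loudly whenever any count is non-zero, so that reporting and model training only ever read a gold table that passed its checks.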

3) Train model

python -m src.models.train --db-path data/warehouse.duckdb --model-dir artifacts/model
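
The training step writes to `artifacts/model`. As one possible reading of "versioned artifact", the sketch below stores each model under a directory named after a content hash, next to a small metadata file. The function name `save_versioned_artifact`, the file names, and the hashing scheme are all hypothetical; the repo may version its artifacts differently.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def save_versioned_artifact(model_bytes, metrics, model_dir):
    """Write model bytes and metadata under <model_dir>/<version>/ and return that path.

    The version is the first 12 hex characters of the SHA-256 of the model bytes
    (an assumption for this sketch), so identical models map to the same version.
    """
    version = hashlib.sha256(model_bytes).hexdigest()[:12]
    target = Path(model_dir) / version
    target.mkdir(parents=True, exist_ok=True)
    (target / "model.bin").write_bytes(model_bytes)
    metadata = {"version": version, "metrics": metrics}
    (target / "metadata.json").write_text(json.dumps(metadata, indent=2, sort_keys=True))
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder bytes and a placeholder metric, not real training output.
    first = save_versioned_artifact(b"fake-model-bytes", {"placeholder_metric": None}, tmp)
    second = save_versioned_artifact(b"fake-model-bytes", {"placeholder_metric": None}, tmp)
    metadata = json.loads((first / "metadata.json").read_text())
    same_version = first == second
```

Keeping evaluation metrics beside the model file makes it possible to tell later which evaluation a deployed version came from.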

4) Run API

uvicorn src.api.app:app --reload --port 8000

Open:

• Swagger UI: http://127.0.0.1:8000/docs
• Metrics endpoint: http://127.0.0.1:8000/metrics

Example request

curl -X POST "http://127.0.0.1:8000/submit-claim" \
-H "Content-Type: application/json" \
-d '{
  "claim_id": "CLM-NEW-001",
  "policy_id": "POL-00010",
  "pet_id": "PET-00010",
  "invoice_text": "Date: 2026-02-02\nProcedure: XRAY\nDiagnosis: BACK_PAIN\nTotal: $350\n",
  "claimed_amount": 350
}'
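
The `invoice_text` in the request above uses simple `Key: value` lines. A minimal sketch of how such text could be turned into structured features follows. It assumes every invoice uses the four labels shown in the example (Date, Procedure, Diagnosis, Total); the function name `extract_invoice_fields` and the output keys are hypothetical, and the extractor in `src/nlp` may work differently.

```python
import re
from datetime import date

# Matches lines such as "Procedure: XRAY"; the labels are taken from the example request.
FIELD_PATTERN = re.compile(
    r"^[ \t]*(Date|Procedure|Diagnosis|Total)[ \t]*:[ \t]*(.+?)[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)


def extract_invoice_fields(invoice_text):
    """Parse 'Key: value' invoice lines into typed fields; missing or bad values stay None."""
    raw = {m.group(1).lower(): m.group(2) for m in FIELD_PATTERN.finditer(invoice_text)}
    fields = {"invoice_date": None, "procedure": None, "diagnosis": None, "total": None}

    if "date" in raw:
        try:
            fields["invoice_date"] = date.fromisoformat(raw["date"])
        except ValueError:
            pass  # leave as None when the date is not in YYYY-MM-DD form
    if "procedure" in raw:
        fields["procedure"] = raw["procedure"].upper()
    if "diagnosis" in raw:
        fields["diagnosis"] = raw["diagnosis"].upper()
    if "total" in raw:
        # Drop currency symbols and thousands separators before converting.
        cleaned = re.sub(r"[^\d.]", "", raw["total"])
        try:
            fields["total"] = float(cleaned)
        except ValueError:
            pass
    return fields


example = "Date: 2026-02-02\nProcedure: XRAY\nDiagnosis: BACK_PAIN\nTotal: $350\n"
parsed = extract_invoice_fields(example)
print(parsed)
```

Comparing the extracted `total` against `claimed_amount` is one inexpensive consistency feature a scoring service could derive from these fields.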

Demo – Automated Claims Decisioning

Same treatment → different policy → different outcome

BASIC policy → high-confidence deny

PREMIUM policy → approve with computed payout
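
The BASIC versus PREMIUM contrast can be sketched as a coverage lookup followed by a payout calculation. Everything specific here is an assumption for illustration: the covered procedure lists, the reimbursement rates, and the names `COVERAGE`, `RuleResult`, and `apply_coverage_rules`. The actual rules live in `src/decisioning` and may differ.

```python
from dataclasses import dataclass

# Hypothetical coverage table; real tiers, procedures, and rates may differ.
COVERAGE = {
    "BASIC": {"covered": {"EXAM", "VACCINE"}, "reimbursement_rate": 0.70},
    "PREMIUM": {"covered": {"EXAM", "VACCINE", "XRAY", "SURGERY"}, "reimbursement_rate": 0.90},
}


@dataclass(frozen=True)
class RuleResult:
    covered: bool
    payout: float
    reason: str


def apply_coverage_rules(plan_tier, procedure, claimed_amount):
    """Return whether the procedure is covered and, if so, the reimbursed amount."""
    plan = COVERAGE.get(plan_tier)
    if plan is None:
        return RuleResult(False, 0.0, f"Unknown plan tier {plan_tier!r}")
    if procedure not in plan["covered"]:
        return RuleResult(False, 0.0, f"{procedure} is not covered under the {plan_tier} plan")
    rate = plan["reimbursement_rate"]
    payout = round(claimed_amount * rate, 2)
    return RuleResult(True, payout, f"{procedure} is covered under the {plan_tier} plan at {rate:.0%}")


# Same treatment, different policy, different outcome.
basic = apply_coverage_rules("BASIC", "XRAY", 350)
premium = apply_coverage_rules("PREMIUM", "XRAY", 350)
print(basic)
print(premium)
```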

Model performance

The system combines deterministic coverage rules with an ML approval model to automate claim decisions. Covered procedures under higher-tier policies are recommended for approval with a calculated reimbursement, while non-covered scenarios produce high-confidence denials with clear reasoning.
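
One way to combine a deterministic rule outcome with a model probability is sketched below. It is an illustration under stated assumptions, not the repo's decision engine: the thresholds (0.80 and 0.20), the `MANUAL_REVIEW` middle band, the choice to report rule-based denials with confidence 1.0, and the function name `decide` are all hypothetical.

```python
def decide(rule_covered, rule_reason, approval_probability,
           approve_threshold=0.80, deny_threshold=0.20):
    """Combine a coverage-rule result with a model approval probability.

    Rules act as a hard gate: a non-covered claim is denied regardless of the model.
    Covered claims are approved, denied, or routed to review based on the probability.
    """
    if not rule_covered:
        return {
            "decision": "DENY",
            "confidence": 1.0,  # assumption: rule-based denials are treated as certain
            "explanation": f"Denied by coverage rule: {rule_reason}",
        }

    p = approval_probability
    if p >= approve_threshold:
        return {
            "decision": "APPROVE",
            "confidence": p,
            "explanation": f"Covered by policy; approval probability {p:.2f} "
                           f"is at or above {approve_threshold:.2f}",
        }
    if p <= deny_threshold:
        return {
            "decision": "DENY",
            "confidence": round(1 - p, 4),
            "explanation": f"Covered by policy, but approval probability {p:.2f} "
                           f"is at or below {deny_threshold:.2f}",
        }
    return {
        "decision": "MANUAL_REVIEW",
        "confidence": round(max(p, 1 - p), 4),
        "explanation": f"Approval probability {p:.2f} falls between the thresholds; "
                       "routed to a human reviewer",
    }


denied_by_rule = decide(False, "XRAY is not covered under the BASIC plan", 0.35)
approved = decide(True, "", 0.93)
needs_review = decide(True, "", 0.50)
print(denied_by_rule["explanation"])
```

Keeping the rule gate separate from the probability bands means each explanation string can name exactly which part of the system drove the decision.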

Business impact (simulated workflow)

• Consistent, explainable claim decisions
• Reduced manual review for high-confidence cases
• Trusted “ground truth” layer for reporting and model evaluation
• Real-time decision support for operations teams

Repo layout

src/
  api/            FastAPI service + metrics
  data/           synthetic generator + gold builder (DuckDB)
  decisioning/    rule + model decision engine
  models/         train + predict helpers
  nlp/            invoice extractor
  evaluation/     metrics helpers

Notes

• Designed to be small, runnable, and demo-friendly
• DuckDB simulates a cloud warehouse for local development
• All data is synthetic, generated to mirror real claims workflows
