This project started as a college assignment that I developed into a small, end-to-end ML product that trains multiple classifiers, serves them behind a Flask API, and exposes a lightweight browser UI. The goal is to predict a patient's diabetes risk/type from routinely collected clinical signals.
Given features such as age, glucose, insulin, and body mass index, predict whether an individual is non-diabetic, type-1-like, or type-2-like. The system must support experimentation with different model families, provide reproducible training, and expose a low-latency inference API that can be consumed by a web client.
- Source: Pima Indians Diabetes dataset (Kaggle/UCI) or a similarly structured CSV placed at `data/diabetes.csv`. When absent, the pipeline synthesizes a dataset to keep the system runnable.
- Size: ~768 rows in the canonical dataset; the synthetic generator defaults to 800 rows for parity.
- Features: Age (years), Glucose (mg/dL), Insulin (µU/mL), BMI (kg/m²); target column `type` encoded as {0: non-diabetic, 1: type-1-like, 2: type-2-like}.
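The synthetic fallback might look roughly like the sketch below. This is a hedged illustration only: the repo's actual generator may use different distributions, column names, and labeling logic; the heuristic label rule here is invented for the example.

```python
import random

def make_synthetic_rows(n=800, seed=42):
    """Generate n synthetic patient rows matching the dataset schema."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.randint(21, 80)
        glucose = rng.gauss(120, 30)            # mg/dL
        insulin = max(0.0, rng.gauss(80, 40))   # µU/mL
        bmi = rng.gauss(32, 6)                  # kg/m²
        # Crude, purely illustrative label rule: elevated glucose with low
        # insulin -> type-1-like; elevated glucose otherwise -> type-2-like.
        if glucose > 140 and insulin < 50:
            label = 1
        elif glucose > 140:
            label = 2
        else:
            label = 0
        rows.append({"Age": age, "Glucose": round(glucose, 1),
                     "Insulin": round(insulin, 1), "BMI": round(bmi, 1),
                     "type": label})
    return rows

rows = make_synthetic_rows()
```

Seeding the generator keeps synthetic runs reproducible, mirroring the reproducible-training goal stated above.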
| Model | Rationale |
|---|---|
| Gaussian Naive Bayes | Fast baseline, probabilistic outputs, good for imbalanced small datasets. |
| MLPClassifier (sklearn) | Modern non-linear baseline with built-in L2 regularization and adaptive gradient-based solvers. |
| Custom two-layer MLP | Educational implementation that exposes weight serialization and manual training loops. |
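As a sketch of what the custom two-layer MLP's forward pass and weight serialization could look like (shapes, names, and the pickle format here are illustrative, not the repo's actual implementation):

```python
import pickle
import numpy as np

rng = np.random.default_rng(0)

class TinyMLP:
    """Two-layer MLP: 4 inputs -> ReLU hidden layer -> 3-class softmax."""

    def __init__(self, n_in=4, n_hidden=8, n_out=3):
        self.w1 = rng.normal(0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0, 0.1, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        h = np.maximum(0, x @ self.w1 + self.b1)           # ReLU hidden layer
        logits = h @ self.w2 + self.b2
        exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
        return exp / exp.sum(axis=-1, keepdims=True)       # softmax probabilities

    def save(self, path):
        # Manual weight serialization -- the part sklearn hides from you
        with open(path, "wb") as f:
            pickle.dump({"w1": self.w1, "b1": self.b1,
                         "w2": self.w2, "b2": self.b2}, f)

model = TinyMLP()
probs = model.forward(np.array([[45.0, 120.0, 80.0, 28.5]]))  # Age, Glucose, Insulin, BMI
```

Exposing the weights directly is what makes the custom model useful for teaching: the training loop, gradients, and persistence are all visible rather than hidden behind an estimator API.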
- `training/train.py` is the CLI entry point that orchestrates data loading, preprocessing (standardization), model training, hold-out validation, and artifact persistence (`models/*.pkl`).
- `training/evaluate_and_report.py` loads trained artifacts, runs a hold-out evaluation, saves confusion matrices and ROC curves under `reports/`, and prints summary metrics. (Legacy REPORT.md generation can be re-enabled if needed.)
- Metrics tracked: accuracy, precision, recall, F1 (micro/macro). SHAP-style interpretability is planned (see limitations).
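Micro- and macro-averaged F1 can diverge sharply on imbalanced data, which is why both are tracked. A small pure-Python example (toy labels, not project data) shows the difference:

```python
from collections import Counter

def f1_scores(y_true, y_pred, labels=(0, 1, 2)):
    """Compute micro- and macro-averaged F1 from scratch."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    # Macro: per-class F1, then an unweighted mean over classes
    per_class = []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        per_class.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    macro = sum(per_class) / len(labels)
    # Micro: pool TP/FP/FN across classes first
    # (equals accuracy in single-label classification)
    tps, fps, fns = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro = 2 * tps / (2 * tps + fps + fns)
    return micro, macro

# Imbalanced toy split: class 0 dominates, the lone class-1 case is missed
y_true = [0] * 8 + [1] + [2]
y_pred = [0] * 8 + [0] + [2]
micro, macro = f1_scores(y_true, y_pred)   # micro = 0.9, macro ≈ 0.647
```

Micro-F1 barely notices the missed minority class; macro-F1 penalizes it heavily, so reporting both guards against a model that only learns the majority class.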
- Every CLI training run logs hyperparameters, metrics, and serialized artifacts to a local MLflow store (`mlruns/`) under the experiment name `diabetes-risk-local`. Launch the dashboard with `mlflow ui --backend-store-uri mlruns` (tracking URI `file:///.../mlruns`) to inspect experiments without reading the code.
| Model | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| GaussianNB | 0.7013 | 0.6928 | 0.7013 | 0.6951 |
| MLP (sklearn) | 0.7208 | 0.7140 | 0.7208 | 0.7158 |
| MLP (custom) | 0.7403 | 0.7349 | 0.7403 | 0.7364 |
Metrics synced with REPORT.md generated on 2025-12-23 via `python -m training.evaluate_and_report`.
Each run also drops confusion matrices and ROC curves to `reports/<model>_{confusion_matrix,roc_curve}.png`, so reviewers can inspect the visuals without re-training (regenerate anytime with `python -m training.evaluate_and_report`).
- Training pipeline: generates artifacts (`*.pkl`, scaler) inside `models/`.
- Model registry: Flask loads the serialized estimators and scaler on startup.
- REST API: `/predict` accepts JSON payloads, validates inputs, applies the scaler, and returns the predicted risk/type.
- Frontend UI: static HTML/JS client (`frontend/`) hits the Flask API to let users compare model outputs interactively.
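A minimal sketch of the kind of input validation `/predict` might perform. The field names match the dataset schema, but the plausibility bounds and error format here are assumptions, not the repo's exact rules:

```python
REQUIRED_FIELDS = {
    # field: (min, max) plausibility bounds -- illustrative, not clinical guidance
    "Age": (0, 120),
    "Glucose": (0, 500),
    "Insulin": (0, 1000),
    "BMI": (10, 80),
}

def validate_payload(payload):
    """Return (features, errors): a clean feature vector, or a list of problems."""
    errors, features = [], []
    for field, (lo, hi) in REQUIRED_FIELDS.items():
        value = payload.get(field)
        if value is None:
            errors.append(f"missing field: {field}")
            continue
        try:
            value = float(value)
        except (TypeError, ValueError):
            errors.append(f"non-numeric value for {field}")
            continue
        if not lo <= value <= hi:
            errors.append(f"{field} out of range [{lo}, {hi}]")
            continue
        features.append(value)
    return (features if not errors else None), errors
```

Rejecting bad input before it reaches the scaler keeps the API from returning confident predictions on garbage values.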
data → training scripts → model artifacts → Flask API → browser UI
```bash
pip install -r requirements.txt

# Train and persist models (also logs to mlruns/)
python -m training.train --config configs/base.yaml

# Generate evaluation plots (writes PNGs to reports/)
python -m training.evaluate_and_report

# Optional: inspect MLflow dashboard locally
mlflow ui --backend-store-uri mlruns

# Run the Flask API (serves frontend as static files)
python backend/app.py

# Visit the UI
open frontend/index.html  # or navigate to http://127.0.0.1:5000
```

The frontend JavaScript calls the API via same-origin relative paths (`/predict`), so it works unchanged whether served locally, inside Docker, or behind a reverse proxy.
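For reference, a hypothetical Python client for `/predict` (the field names are assumptions, and the API must actually be running for the final commented-out call to succeed):

```python
import json
import urllib.request

payload = {"Age": 54, "Glucose": 148, "Insulin": 94, "BMI": 33.6}

req = urllib.request.Request(
    "http://127.0.0.1:5000/predict",
    data=json.dumps(payload).encode("utf-8"),      # a body makes this a POST
    headers={"Content-Type": "application/json"},
)
# With the API up: urllib.request.urlopen(req).read() returns the prediction JSON
```

Stdlib `urllib` is used here only to keep the sketch dependency-free; `curl` or the browser UI exercise the same endpoint.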
The curated `.dockerignore` keeps bytecode, MLflow runs, pickled artifacts, and virtualenvs out of the build context, so Docker layers stay lean.
```bash
# Build the production image locally
docker build -t diabetes-risk-prod .

# Run it with Gunicorn listening on $PORT (defaults to 5000)
docker run --rm -p 5000:5000 --env PORT=5000 diabetes-risk-prod

# Or rely on the provided compose file for repeatable dev/prod parity
docker compose up --build
```

The compose stack exposes the API on http://127.0.0.1:5000 and serves the static frontend from the same container. Override `PORT` or `FLASK_DEBUG` in `docker-compose.yml` or with `--env` flags if needed.
Pytest covers the input validation helpers (`tests/test_input_validation.py`) and the Flask `/predict` endpoint (`tests/test_api.py`).
```bash
python -m pytest
```

- Install formatter/linter/test extras with `pip install -r requirements-dev.txt` (includes `pytest`, `black`, `isort`, `flake8`, and `pre-commit`).
- Enable Git hooks by running `pre-commit install` once; enforce them manually anytime with `pre-commit run --all-files`.
- Black + isort keep the code style consistent, while flake8 prevents lint regressions before CI even runs.
- Needs formal data validation (Great Expectations) and stronger provenance tracking when switching between real vs synthetic data.
- No centralized experiment tracking backend beyond the local MLflow file store; promotion-ready tracking (e.g., hosted MLflow/W&B) is still future work.
- Interpretability (SHAP, feature importances) and monitoring hooks are not yet implemented.
Planned improvements include structured logging, broader automated test coverage, and a richer React/Vite frontend with model comparison charts.