Machine learning model deployment from notebook to production using FastAPI, Docker, and CI/CD pipelines.
Production-grade REST API for serving a scikit-learn GradientBoosting model, built with FastAPI and Docker.
Part of the Production ML Engineering series at EmiTechLogic.
ml-service/
├── model/
│ ├── train.py # train & persist artifacts
│ ├── predict.py # ModelPredictor class
│ └── artifacts/ # generated by train.py (gitignored)
├── api/
│ ├── main.py # FastAPI app
│ └── schemas.py # Pydantic request/response models
├── tests/
│ ├── test_api.py # integration tests (TestClient)
│ └── test_predict.py # unit tests (ModelPredictor)
├── deploy/
│ └── prometheus.yml # Prometheus scrape config
├── .github/
│ └── workflows/
│ └── ml-deploy.yml # CI/CD pipeline
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
└── requirements-dev.txt
# 1. Install dependencies
pip install -r requirements.txt -r requirements-dev.txt
# 2. Train the model — generates model/artifacts/*.pkl
python -m model.train
# 3. Run the API
uvicorn api.main:app --reload --port 8000
# 4. Run tests
pytest tests/ -v

# Train artifacts first (needed before the image build)
pip install -r requirements.txt && python -m model.train
# Build and start all services
docker compose up --build
# Services:
# API → http://localhost:8000
# Docs → http://localhost:8000/docs
# Prometheus → http://localhost:9090
# Grafana → http://localhost:3000 (admin / admin)

| Method | Path | Description |
|---|---|---|
| GET | /health | Liveness probe |
| GET | /ready | Readiness probe |
| POST | /predict | Single-sample prediction |
| POST | /predict/batch | Batch prediction (up to 512 samples) |
| GET | /docs | Swagger UI |
| GET | /redoc | ReDoc |
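
Because /predict/batch accepts at most 512 samples per request, a client with a larger workload has to split it before sending. The following is a minimal sketch of one way to do that using only the Python standard library. The helper names (`chunk_samples`, `post_json`, `predict_in_batches`) and the `{"samples": [...]}` request body are assumptions made for illustration; the actual batch schema is defined in api/schemas.py and may differ.

```python
import json
import urllib.request
from typing import Iterator, List, Sequence

MAX_BATCH = 512  # cap stated for /predict/batch in the endpoint table


def chunk_samples(
    samples: Sequence[Sequence[float]], size: int = MAX_BATCH
) -> Iterator[List[Sequence[float]]]:
    """Yield consecutive slices containing at most `size` samples each."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(samples), size):
        yield list(samples[start:start + size])


def post_json(url: str, payload: dict, timeout: float = 10.0) -> dict:
    """POST a JSON body and decode the JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def predict_in_batches(base_url: str, samples: Sequence[Sequence[float]]) -> List[dict]:
    """Send `samples` to the batch endpoint in chunks and collect the raw responses."""
    results = []
    for chunk in chunk_samples(samples):
        # ASSUMPTION: the request body key is "samples"; check api/schemas.py.
        results.append(post_json(f"{base_url}/predict/batch", {"samples": chunk}))
    return results
```

With the API running locally, `predict_in_batches("http://localhost:8000", rows)` would issue one request per chunk. Keeping the chunking separate from the HTTP call makes the size limit easy to test without a live server.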
curl -X POST http://localhost:8000/predict \
-H "Content-Type: application/json" \
-d '{
"features": [
17.99, 10.38, 122.8, 1001.0, 0.1184,
0.2776, 0.3001, 0.1471, 0.2419, 0.07871,
1.095, 0.9053, 8.589, 153.4, 0.006399,
0.04904, 0.05373, 0.01587, 0.03003, 0.006193,
25.38, 17.33, 184.6, 2019.0, 0.1622,
0.6656, 0.7119, 0.2654, 0.4601, 0.1189
]
}'

Example response:

{
"label": 0,
"probabilities": {
"class_0": 0.962341,
"class_1": 0.037659
},
"model_version": "1.0.0",
"n_features_in": 30
}

| Variable | Default | Description |
|---|---|---|
| MODEL_VERSION | 1.0.0 | Injected at build time or at runtime |
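
As an illustration of how the MODEL_VERSION setting might be resolved at startup, here is a small sketch that reads the variable and falls back to the documented default of 1.0.0. The function name `get_model_version` is hypothetical, and treating an empty or whitespace-only value as unset is an assumption of this sketch, not something the project documents.

```python
import os
from typing import Mapping, Optional

DEFAULT_MODEL_VERSION = "1.0.0"  # default listed for MODEL_VERSION


def get_model_version(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return MODEL_VERSION from the environment, or the default if unset."""
    env = os.environ if environ is None else environ
    # ASSUMPTION: a blank value is treated the same as a missing one.
    value = env.get("MODEL_VERSION", "").strip()
    return value or DEFAULT_MODEL_VERSION
```

Accepting an optional mapping lets tests pass a plain dict instead of mutating the process environment.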
The GitHub Actions workflow (.github/workflows/ml-deploy.yml) handles:
- Test — lint, unit tests, integration tests
- Build — Docker image pushed to Docker Hub
- Deploy staging — SSH deploy + smoke test
- Deploy production — manual approval gate + SSH deploy
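
The workflow's smoke test is not described here, so the following is only a sketch of what such a check could look like: it requests the /health and /ready endpoints and reports whether each returned HTTP 200. The function names and the choice of these two endpoints are assumptions; the real pipeline may verify something different.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request, or 0 if the host is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # Non-2xx responses raise HTTPError; the status code is still available.
        return exc.code
    except (urllib.error.URLError, OSError):
        return 0


def smoke_test(base_url: str, fetch: Callable[[str], int] = http_status) -> Dict[str, bool]:
    """Check the liveness and readiness probes; True means the probe returned 200."""
    base = base_url.rstrip("/")
    # ASSUMPTION: a passing smoke test means both probes answer with 200.
    return {path: fetch(base + path) == 200 for path in ("/health", "/ready")}
```

A deploy step could call `smoke_test("http://localhost:8000")` and fail the job if any value in the result is `False`. The injectable `fetch` argument allows the logic to be exercised without a running service.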