A production-ready, reusable template for deploying any ML model (forecasting, classification, etc.) with FastAPI. Just drop your model into the /app/models folder and deploy!
- ✅ Modular & Scalable Architecture - Easy to extend and maintain
- ✅ Dynamic Model Loading - Add models without code changes
- ✅ Multiple Prediction Endpoints - Path-based or body-based routing
- ✅ Model Versioning Support - Deploy multiple versions simultaneously
- ✅ Production-Ready Logging - JSON logging for production environments
- ✅ Pydantic Validation - Type-safe request/response handling
- ✅ Docker & Docker Compose - Containerized deployment
- ✅ Comprehensive Testing - Unit tests included
- ✅ CORS Enabled - Ready for frontend integration
- ✅ Auto-Generated Docs - Swagger UI & ReDoc
```text
├── app/
│   ├── api/
│   │   ├── __init__.py
│   │   ├── routes.py            # API endpoints
│   │   └── schemas.py           # Pydantic models
│   ├── core/
│   │   ├── __init__.py
│   │   ├── config.py            # Settings & configuration
│   │   ├── logging.py           # Logging setup
│   │   └── startup.py           # Startup/shutdown events
│   ├── models/
│   │   ├── __init__.py
│   │   ├── base.py              # Base model interface
│   │   ├── forecast.py          # Example: Forecasting model
│   │   └── classifier.py        # Example: Classification model
│   ├── services/
│   │   ├── __init__.py
│   │   └── model_service.py     # Model loading & inference
│   ├── utils/
│   │   └── __init__.py          # Utility functions
│   └── main.py                  # FastAPI application
├── tests/
│   ├── __init__.py
│   ├── conftest.py
│   ├── test_api.py
│   └── test_models.py
├── model_storage/               # Store trained models here
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── env.example                  # Environment variables template
└── README.md
```
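For orientation, the base interface in `app/models/base.py` might look roughly like this - a sketch inferred from how the example models subclass it (`model_name`, `version`, `is_loaded`, async `load_model`/`predict`); the template's actual file may differ:

```python
import abc
from typing import Any, Dict


class BaseMLModel(abc.ABC):
    """Minimal async interface that every concrete model implements."""

    def __init__(self, model_name: str, version: str):
        self.model_name = model_name
        self.version = version
        self.is_loaded = False

    @abc.abstractmethod
    async def load_model(self) -> None:
        """Load weights/artifacts and set self.is_loaded = True."""

    @abc.abstractmethod
    async def predict(self, input_data: Dict[str, Any]) -> Dict[str, Any]:
        """Run inference and return a JSON-serializable dict."""
```

Keeping the interface async lets slow I/O (disk reads, downloads from cloud storage) happen without blocking the event loop at startup.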
```bash
# No need to clone - this is your template!
cd your-project-directory

# Create virtual environment
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (Linux/Mac)
source venv/bin/activate

# Install requirements
pip install -r requirements.txt

# Copy environment template
cp env.example .env
# Edit .env with your settings

# Development mode with auto-reload
python app/main.py

# Or using uvicorn directly
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
- Health Check: http://localhost:8000/api/v1/health
```bash
# Build the image
docker build -t ml-model-api .

# Run the container
docker run -p 8000:8000 --env-file .env ml-model-api
```

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down
```

Create a new file in `app/models/`, for example `app/models/sentiment.py`:
```python
"""Sentiment Analysis Model"""
import logging
from typing import Dict, Any

from app.models.base import BaseMLModel

logger = logging.getLogger(__name__)


class SentimentModel(BaseMLModel):
    """Sentiment analysis model"""

    def __init__(self):
        super().__init__(model_name="sentiment", version="1.0")
        self.model = None

    async def load_model(self) -> None:
        """Load your trained sentiment model"""
        logger.info(f"Loading {self.model_name} model...")

        # Option 1: Load from file
        # import joblib
        # self.model = joblib.load("model_storage/sentiment.pkl")

        # Option 2: Load from cloud storage
        # self.model = load_from_s3("bucket", "sentiment.pkl")

        # Option 3: Initialize new model (demo)
        # from transformers import pipeline
        # self.model = pipeline("sentiment-analysis")

        self.is_loaded = True
        logger.info(f"{self.model_name} model loaded")

    async def predict(self, input_data: Dict[str, Any]) -> Dict[str, Any]:
        """Make predictions"""
        if not self.is_loaded:
            raise RuntimeError(f"Model {self.model_name} is not loaded")

        text = input_data.get("text", "")

        # Your prediction logic here
        # result = self.model(text)

        return {
            "text": text,
            "sentiment": "positive",  # Replace with actual prediction
            "confidence": 0.95,
            "model": self.model_name,
            "version": self.version,
        }
```

Edit `env.example` or `.env`:
```env
MODEL_REGISTRY={"forecast": "forecast.py", "classifier": "classifier.py", "sentiment": "sentiment.py"}
```

```bash
# Restart the application
# The new model will be automatically loaded!

# Test it
curl -X POST "http://localhost:8000/api/v1/predict/sentiment" \
  -H "Content-Type: application/json" \
  -d '{"text": "This is amazing!"}'
```

`GET /api/v1/health`

Response:
```json
{
  "status": "ok",
  "version": "1.0.0",
  "models_loaded": 2
}
```

`GET /api/v1/models`

Response:
```json
{
  "models": [
    {
      "name": "forecast",
      "version": "1.0",
      "loaded": true,
      "available_versions": ["1"]
    },
    {
      "name": "classifier",
      "version": "1.0",
      "loaded": true,
      "available_versions": ["1"]
    }
  ],
  "total": 2
}
```

`GET /api/v1/models/{model_name}`

Example:

```bash
curl http://localhost:8000/api/v1/models/forecast
```

`POST /api/v1/predict/{model_name}`

Example - Forecast:
```bash
curl -X POST "http://localhost:8000/api/v1/predict/forecast" \
  -H "Content-Type: application/json" \
  -d '{
    "periods": 30,
    "freq": "D"
  }'
```

Response:
```json
{
  "success": true,
  "model": "forecast",
  "version": "1.0",
  "result": {
    "predictions": [10.5, 11.2, 12.1, ...],
    "dates": ["2024-01-01", "2024-01-02", ...],
    "lower_bound": [9.5, 10.2, ...],
    "upper_bound": [11.5, 12.2, ...],
    "periods": 30
  }
}
```

Example - Classifier:
```bash
curl -X POST "http://localhost:8000/api/v1/predict/classifier" \
  -H "Content-Type: application/json" \
  -d '{
    "features": [1.2, 3.4, 5.6, 7.8]
  }'
```

Response:
```json
{
  "success": true,
  "model": "classifier",
  "version": "1.0",
  "result": {
    "prediction": "class_1",
    "probabilities": {
      "class_0": 0.1,
      "class_1": 0.7,
      "class_2": 0.2
    },
    "confidence": 0.7
  }
}
```

`POST /api/v1/predict`

Example:
```bash
curl -X POST "http://localhost:8000/api/v1/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "forecast",
    "input_data": {
      "periods": 7,
      "freq": "D"
    },
    "version": "1"
  }'
```

`POST /api/v1/predict/{model_name}?v={version}`

Example:
```bash
curl -X POST "http://localhost:8000/api/v1/predict/forecast?v=2" \
  -H "Content-Type: application/json" \
  -d '{
    "periods": 30,
    "freq": "D"
  }'
```

`POST /api/v1/models/{model_name}/reload`

Example:
```bash
curl -X POST "http://localhost:8000/api/v1/models/forecast/reload"
```

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=app

# Run specific test file
pytest tests/test_api.py

# Run with verbose output
pytest -v
```

Store your trained models in the `model_storage/` directory:
```text
model_storage/
├── forecast_model.pkl
├── classifier_model.json
├── sentiment_model.h5
└── ...
```
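Artifacts typically land in this directory from your training script. With the standard library's `pickle` (joblib follows the same dump/load pattern), the round trip looks like this - a sketch where the dict stands in for a fitted estimator and the filename is illustrative:

```python
import pickle
from pathlib import Path

STORAGE = Path("model_storage")
STORAGE.mkdir(exist_ok=True)

# Stand-in for a fitted estimator produced by your training script
model = {"weights": [0.1, 0.2], "classes": ["neg", "pos"]}

# Persist the artifact into model_storage/
with open(STORAGE / "sentiment_model.pkl", "wb") as f:
    pickle.dump(model, f)

# Later, inside load_model(), read it back the same way
with open(STORAGE / "sentiment_model.pkl", "rb") as f:
    restored = pickle.load(f)
```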
Load them in your model's `load_model()` method:

```python
import joblib

from app.core.config import settings

model_path = f"{settings.model_storage_path}/your_model.pkl"
self.model = joblib.load(model_path)
```

All configuration is managed through environment variables (`.env` file):
```env
# Application
APP_NAME="ML Model API"
APP_VERSION="1.0.0"
DEBUG=false
LOG_LEVEL=INFO

# Server
HOST=0.0.0.0
PORT=8000
WORKERS=1

# CORS
CORS_ORIGINS=["http://localhost:3000","http://localhost:8000"]

# Models
MODELS_PATH=./app/models
MODEL_REGISTRY={"forecast": "forecast.py", "classifier": "classifier.py"}
MODEL_STORAGE_PATH=./model_storage
```

The template includes production-ready JSON logging:
```python
import logging

logger = logging.getLogger(__name__)

# Logs will include context
logger.info("Processing prediction", extra={
    "model": "forecast",
    "user_id": user_id,
    "request_id": request_id,
})
```

Deploy multiple versions of the same model:
```python
# In model_service.py
await model_service.load_model("forecast", "forecast.py", version="1")
await model_service.load_model("forecast", "forecast_v2.py", version="2")

# Use a specific version
result = await model_service.predict(
    model_name="forecast",
    input_data=data,
    version="2",
)
```

Create model-specific schemas in `app/api/schemas.py`:
```python
class SentimentRequest(BaseModel):
    text: str = Field(..., min_length=1, max_length=1000)
    language: str = Field(default="en", pattern="^(en|es|fr)$")
```

Models are loaded asynchronously on startup, so the API starts quickly. Models are properly unloaded during shutdown to free memory.
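This load-on-startup, unload-on-shutdown behavior can be pictured with an async lifespan context manager - a stdlib-only sketch (the template's actual wiring lives in `app/core/startup.py` and may differ):

```python
from contextlib import asynccontextmanager

# Stand-in for the in-memory model registry
models: dict = {}


async def _load_all() -> None:
    # Placeholders for loaded model objects
    models["forecast"] = object()
    models["classifier"] = object()


@asynccontextmanager
async def lifespan(app=None):
    # Startup: load every registered model before serving traffic
    await _load_all()
    try:
        yield
    finally:
        # Shutdown: drop references so memory can be reclaimed
        models.clear()
```

FastAPI accepts exactly this shape via `FastAPI(lifespan=lifespan)`, guaranteeing the `finally` block runs on shutdown.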
```env
DEBUG=false
LOG_LEVEL=WARNING
WORKERS=4
```

```bash
gunicorn app.main:app \
  -w 4 \
  -k uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000 \
  --timeout 120
```

Create `k8s/deployment.yaml`:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-model-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ml-model-api
  template:
    metadata:
      labels:
        app: ml-model-api
    spec:
      containers:
        - name: api
          image: your-registry/ml-model-api:latest
          ports:
            - containerPort: 8000
          env:
            - name: WORKERS
              value: "1"
          livenessProbe:
            httpGet:
              path: /api/v1/health
              port: 8000
            initialDelaySeconds: 30
            periodSeconds: 10
```

This is a template - customize it for your needs!
MIT License - Use freely!

For issues or questions:
- Check the Swagger UI at `/docs`
- Review the example models in `app/models/`
- Read the API schemas in `app/api/schemas.py`

Next steps:
- Replace example models with your actual trained models
- Configure the model registry in `.env`
- Add model-specific validation schemas
- Set up monitoring (Prometheus, Grafana)
- Add authentication if needed (JWT, API keys)
- Deploy to cloud (AWS, GCP, Azure)

Happy Deploying! 🚀