
FastAPI ML Model Deployment Template

A production-ready, reusable template for deploying any ML model (forecasting, classification, etc.) with FastAPI. Just drop your model into the app/models/ folder and deploy!

🚀 Features

  • ✅ Modular & Scalable Architecture - Easy to extend and maintain
  • ✅ Dynamic Model Loading - Add models without code changes
  • ✅ Multiple Prediction Endpoints - Path-based or body-based routing
  • ✅ Model Versioning Support - Deploy multiple versions simultaneously
  • ✅ Production-Ready Logging - JSON logging for production environments
  • ✅ Pydantic Validation - Type-safe request/response handling
  • ✅ Docker & Docker Compose - Containerized deployment
  • ✅ Comprehensive Testing - Unit tests included
  • ✅ CORS Enabled - Ready for frontend integration
  • ✅ Auto-Generated Docs - Swagger UI & ReDoc

πŸ“ Project Structure

β”œβ”€β”€ app/
β”‚   β”œβ”€β”€ api/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”œβ”€β”€ routes.py          # API endpoints
β”‚   β”‚   └── schemas.py         # Pydantic models
β”‚   β”œβ”€β”€ core/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”œβ”€β”€ config.py          # Settings & configuration
β”‚   β”‚   β”œβ”€β”€ logging.py         # Logging setup
β”‚   β”‚   └── startup.py         # Startup/shutdown events
β”‚   β”œβ”€β”€ models/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   β”œβ”€β”€ base.py            # Base model interface
β”‚   β”‚   β”œβ”€β”€ forecast.py        # Example: Forecasting model
β”‚   β”‚   └── classifier.py      # Example: Classification model
β”‚   β”œβ”€β”€ services/
β”‚   β”‚   β”œβ”€β”€ __init__.py
β”‚   β”‚   └── model_service.py   # Model loading & inference
β”‚   β”œβ”€β”€ utils/
β”‚   β”‚   └── __init__.py        # Utility functions
β”‚   └── main.py                # FastAPI application
β”œβ”€β”€ tests/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ conftest.py
β”‚   β”œβ”€β”€ test_api.py
β”‚   └── test_models.py
β”œβ”€β”€ model_storage/             # Store trained models here
β”œβ”€β”€ Dockerfile
β”œβ”€β”€ docker-compose.yml
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ env.example                # Environment variables template
└── README.md
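
The example models subclass BaseMLModel from app/models/base.py. That file isn't reproduced in this README, but judging from how the sentiment example below uses it, the interface looks roughly like this (a sketch, not the template's exact code):

"""Base model interface (sketch inferred from the examples in this README)"""
from abc import ABC, abstractmethod
from typing import Any, Dict


class BaseMLModel(ABC):
    """Every model in app/models/ subclasses this and implements both methods."""

    def __init__(self, model_name: str, version: str = "1.0"):
        self.model_name = model_name
        self.version = version
        self.is_loaded = False  # set to True once load_model() succeeds

    @abstractmethod
    async def load_model(self) -> None:
        """Load weights or artifacts into memory."""

    @abstractmethod
    async def predict(self, input_data: Dict[str, Any]) -> Dict[str, Any]:
        """Run inference on a validated input payload."""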

🏁 Quick Start

1. Clone or Download

# No need to clone - this is your template!
cd your-project-directory

2. Install Dependencies

# Create virtual environment
python -m venv venv

# Activate (Windows)
venv\Scripts\activate

# Activate (Linux/Mac)
source venv/bin/activate

# Install requirements
pip install -r requirements.txt

3. Configure Environment

# Copy environment template
cp env.example .env

# Edit .env with your settings

4. Run the Application

# Development mode with auto-reload
python app/main.py

# Or using uvicorn directly
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

5. Access the API

Once the server is running:

  • API root: http://localhost:8000
  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

🐳 Docker Deployment

Build and Run with Docker

# Build the image
docker build -t ml-model-api .

# Run the container
docker run -p 8000:8000 --env-file .env ml-model-api

Using Docker Compose (Recommended)

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

📝 How to Add a New Model

Step 1: Create Your Model File

Create a new file in app/models/, for example app/models/sentiment.py:

"""Sentiment Analysis Model"""
import logging
from typing import Dict, Any
from app.models.base import BaseMLModel

logger = logging.getLogger(__name__)


class SentimentModel(BaseMLModel):
    """Sentiment analysis model"""
    
    def __init__(self):
        super().__init__(model_name="sentiment", version="1.0")
        self.model = None
    
    async def load_model(self) -> None:
        """Load your trained sentiment model"""
        logger.info(f"Loading {self.model_name} model...")
        
        # Option 1: Load from file
        # import joblib
        # self.model = joblib.load("model_storage/sentiment.pkl")
        
        # Option 2: Load from cloud storage
        # self.model = load_from_s3("bucket", "sentiment.pkl")
        
        # Option 3: Initialize new model (demo)
        # from transformers import pipeline
        # self.model = pipeline("sentiment-analysis")
        
        self.is_loaded = True
        logger.info(f"{self.model_name} model loaded")
    
    async def predict(self, input_data: Dict[str, Any]) -> Dict[str, Any]:
        """Make predictions"""
        if not self.is_loaded:
            raise RuntimeError(f"Model {self.model_name} is not loaded")
        
        text = input_data.get("text", "")
        
        # Your prediction logic here
        # result = self.model(text)
        
        return {
            "text": text,
            "sentiment": "positive",  # Replace with actual prediction
            "confidence": 0.95,
            "model": self.model_name,
            "version": self.version
        }

Step 2: Register the Model

Edit env.example or .env:

MODEL_REGISTRY={"forecast": "forecast.py", "classifier": "classifier.py", "sentiment": "sentiment.py"}

Step 3: Restart and Test

# Restart the application
# The new model will be automatically loaded!

# Test it
curl -X POST "http://localhost:8000/api/v1/predict/sentiment" \
  -H "Content-Type: application/json" \
  -d '{"text": "This is amazing!"}'

🔌 API Endpoints

Health Check

GET /api/v1/health

Response:

{
  "status": "ok",
  "version": "1.0.0",
  "models_loaded": 2
}

List All Models

GET /api/v1/models

Response:

{
  "models": [
    {
      "name": "forecast",
      "version": "1.0",
      "loaded": true,
      "available_versions": ["1"]
    },
    {
      "name": "classifier",
      "version": "1.0",
      "loaded": true,
      "available_versions": ["1"]
    }
  ],
  "total": 2
}

Get Model Info

GET /api/v1/models/{model_name}

Example:

curl http://localhost:8000/api/v1/models/forecast

Predict (Path-based)

POST /api/v1/predict/{model_name}

Example - Forecast:

curl -X POST "http://localhost:8000/api/v1/predict/forecast" \
  -H "Content-Type: application/json" \
  -d '{
    "periods": 30,
    "freq": "D"
  }'

Response:

{
  "success": true,
  "model": "forecast",
  "version": "1.0",
  "result": {
    "predictions": [10.5, 11.2, 12.1, ...],
    "dates": ["2024-01-01", "2024-01-02", ...],
    "lower_bound": [9.5, 10.2, ...],
    "upper_bound": [11.5, 12.2, ...],
    "periods": 30
  }
}

Example - Classifier:

curl -X POST "http://localhost:8000/api/v1/predict/classifier" \
  -H "Content-Type: application/json" \
  -d '{
    "features": [1.2, 3.4, 5.6, 7.8]
  }'

Response:

{
  "success": true,
  "model": "classifier",
  "version": "1.0",
  "result": {
    "prediction": "class_1",
    "probabilities": {
      "class_0": 0.1,
      "class_1": 0.7,
      "class_2": 0.2
    },
    "confidence": 0.7
  }
}

Predict (Body-based)

POST /api/v1/predict

Example:

curl -X POST "http://localhost:8000/api/v1/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "model_name": "forecast",
    "input_data": {
      "periods": 7,
      "freq": "D"
    },
    "version": "1"
  }'

Predict with Version (Query Parameter)

POST /api/v1/predict/{model_name}?v={version}

Example:

curl -X POST "http://localhost:8000/api/v1/predict/forecast?v=2" \
  -H "Content-Type: application/json" \
  -d '{
    "periods": 30,
    "freq": "D"
  }'

Reload Model

POST /api/v1/models/{model_name}/reload

Example:

curl -X POST "http://localhost:8000/api/v1/models/forecast/reload"

πŸ§ͺ Testing

# Run all tests
pytest

# Run with coverage
pytest --cov=app

# Run specific test file
pytest tests/test_api.py

# Run with verbose output
pytest -v
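
tests/test_api.py ships with the template; for orientation, a minimal test against the health endpoint (response shape as documented above) might look like:

from fastapi.testclient import TestClient

from app.main import app


def test_health():
    # Using the client as a context manager runs startup/shutdown events,
    # so models are loaded before the request is made
    with TestClient(app) as client:
        response = client.get("/api/v1/health")
    assert response.status_code == 200
    assert response.json()["status"] == "ok"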

📊 Model Storage

Store your trained models in the model_storage/ directory:

model_storage/
β”œβ”€β”€ forecast_model.pkl
β”œβ”€β”€ classifier_model.json
β”œβ”€β”€ sentiment_model.h5
└── ...

Load them in your model's load_model() method:

import joblib

from app.core.config import settings

# settings.model_storage_path maps to MODEL_STORAGE_PATH in .env
model_path = f"{settings.model_storage_path}/your_model.pkl"
self.model = joblib.load(model_path)

⚙️ Configuration

All configuration is managed through environment variables (.env file):

# Application
APP_NAME="ML Model API"
APP_VERSION="1.0.0"
DEBUG=false
LOG_LEVEL=INFO

# Server
HOST=0.0.0.0
PORT=8000
WORKERS=1

# CORS
CORS_ORIGINS=["http://localhost:3000","http://localhost:8000"]

# Models
MODELS_PATH=./app/models
MODEL_REGISTRY={"forecast": "forecast.py", "classifier": "classifier.py"}
MODEL_STORAGE_PATH=./model_storage
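
These variables are read by app/core/config.py. A sketch of how that file might map them, assuming pydantic-settings (field names and defaults mirror the variables above):

from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # protected_namespaces=() permits field names starting with "model_"
    model_config = SettingsConfigDict(env_file=".env", protected_namespaces=())

    app_name: str = "ML Model API"
    app_version: str = "1.0.0"
    debug: bool = False
    log_level: str = "INFO"
    host: str = "0.0.0.0"
    port: int = 8000
    workers: int = 1
    cors_origins: list[str] = []  # JSON-encoded env values parse into lists
    model_storage_path: str = "./model_storage"


settings = Settings()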

📝 Logging

The template includes production-ready JSON logging:

import logging

logger = logging.getLogger(__name__)

# Logs will include context
logger.info("Processing prediction", extra={
    "model": "forecast",
    "user_id": user_id,
    "request_id": request_id
})
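
The actual formatter lives in app/core/logging.py and isn't shown here; as a rough stdlib-only illustration of how fields passed via extra= end up in JSON output:

import json
import logging


class JsonFormatter(logging.Formatter):
    """Minimal JSON formatter sketch; the template's version may differ."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Keys passed via `extra=` become attributes on the log record
        for key in ("model", "user_id", "request_id"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)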

🔧 Advanced Features

Model Versioning

Deploy multiple versions of the same model:

# In model_service.py
await model_service.load_model("forecast", "forecast.py", version="1")
await model_service.load_model("forecast", "forecast_v2.py", version="2")

# Use specific version
result = await model_service.predict(
    model_name="forecast",
    input_data=data,
    version="2"
)

Custom Validation Schemas

Create model-specific schemas in app/api/schemas.py:

from pydantic import BaseModel, Field


class SentimentRequest(BaseModel):
    text: str = Field(..., min_length=1, max_length=1000)
    language: str = Field(default="en", pattern="^(en|es|fr)$")
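
The schema can then back a dedicated route. A hypothetical addition to app/api/routes.py (the route path, and the router and model_service objects, are assumed from context):

@router.post("/predict/sentiment-typed")
async def predict_sentiment(request: SentimentRequest):
    # Pydantic has already validated text length and language by this point
    return await model_service.predict(
        model_name="sentiment",
        input_data=request.model_dump(),
    )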

Background Model Loading

Models are loaded asynchronously on startup, so the API starts quickly.

Graceful Shutdown

Models are properly unloaded during shutdown to free memory.
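
Both behaviors are wired in app/core/startup.py. The exact code isn't shown here, but with FastAPI's lifespan API it could look roughly like this (load_all/unload_all are assumed method names on the model service):

import asyncio
from contextlib import asynccontextmanager

from fastapi import FastAPI

from app.services.model_service import model_service  # assumed import path


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: load models in the background so the server accepts
    # requests immediately instead of blocking on heavy model files
    task = asyncio.create_task(model_service.load_all())
    yield
    # Shutdown: let loading settle, then release model memory
    await task
    await model_service.unload_all()


app = FastAPI(lifespan=lifespan)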

🚦 Production Deployment

Environment Variables for Production

DEBUG=false
LOG_LEVEL=WARNING
WORKERS=4

Using Gunicorn with Uvicorn Workers

gunicorn app.main:app \
  -w 4 \
  -k uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000 \
  --timeout 120

Kubernetes Deployment

Create k8s/deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: ml-model-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ml-model-api
  template:
    metadata:
      labels:
        app: ml-model-api
    spec:
      containers:
      - name: api
        image: your-registry/ml-model-api:latest
        ports:
        - containerPort: 8000
        env:
        - name: WORKERS
          value: "1"
        livenessProbe:
          httpGet:
            path: /api/v1/health
            port: 8000
          initialDelaySeconds: 30
          periodSeconds: 10

🀝 Contributing

This is a template - customize it for your needs!

📄 License

MIT License - Use freely!

🙋 Support

For issues or questions:

  1. Check the Swagger UI at /docs
  2. Review the example models in app/models/
  3. Read the API schemas in app/api/schemas.py

🎯 Next Steps

  1. Replace example models with your actual trained models
  2. Configure model registry in .env
  3. Add model-specific validation schemas
  4. Set up monitoring (Prometheus, Grafana)
  5. Add authentication if needed (JWT, API keys; see the sketch below)
  6. Deploy to cloud (AWS, GCP, Azure)
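
For step 5, a minimal API-key guard as a starting point (the key value and wiring are illustrative, not part of the template):

from fastapi import Depends, FastAPI, Header, HTTPException

API_KEY = "change-me"  # illustrative; read this from settings/.env in practice


async def require_api_key(x_api_key: str = Header(...)):
    """Reject any request whose X-API-Key header doesn't match."""
    if x_api_key != API_KEY:
        raise HTTPException(status_code=401, detail="Invalid API key")


# Apply to every route at once:
# app = FastAPI(dependencies=[Depends(require_api_key)])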

Happy Deploying! 🚀
