Rahilshah01/customer-churn-prediction-api

🚀 Customer Churn Prediction API

Python FastAPI Docker Scikit-learn

A production-ready ML microservice for telecom customer churn prediction.
Trained on 7,043 real customers. Deployed via FastAPI + Docker. Returns churn probability + risk tier.


⚡ Results at a Glance

| Metric | Detail |
|---|---|
| 📊 Training Data | 7,043 telecom customers (Telco Customer Churn, Kaggle) |
| 🤖 Model | Random Forest (n_estimators=200, class_weight="balanced") |
| 🎯 Features | 19 real churn signals (tenure, contract type, monthly charges, etc.) |
| 📤 Output | Churn prediction + probability score + risk tier (Low / Medium / High) |
| 🐳 Deployment | Dockerized, one command to build and run |
| 📄 API Docs | Auto-generated Swagger UI at /docs |

🧠 What This Project Demonstrates

Most ML projects end at a .ipynb notebook. This project completes the full ML lifecycle:

1. train_churn_model.py   →  Train & evaluate on real Telco dataset → save .pkl
2. main.py                →  FastAPI wraps model as a REST microservice
3. Dockerfile             →  Containerize for cloud deployment

The result is a service any application can call — no Python environment needed on the consumer side.


💻 Core Implementation

import pickle

import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Assumed setup: load the artifacts written by train_churn_model.py
with open("churn_model.pkl", "rb") as f:
    model = pickle.load(f)
with open("feature_columns.pkl", "rb") as f:
    feature_columns = pickle.load(f)

class CustomerData(BaseModel):
    tenure: int           # Months with the company (0–72)
    Contract: int         # 0=Month-to-month, 1=One year, 2=Two year
    MonthlyCharges: float # Monthly bill in USD
    TotalCharges: float   # Cumulative charges to date
    InternetService: int  # 0=DSL, 1=Fiber optic, 2=No
    # ... 14 additional features

@app.post("/predict")
def predict(customer: CustomerData):
    df = pd.DataFrame([customer.model_dump()])
    df = df[feature_columns]              # Enforce training column order
    prediction = int(model.predict(df)[0])
    probability = float(model.predict_proba(df)[0][1])  # P(class 1 = churn)

    return {
        "prediction": "Will Churn" if prediction == 1 else "Will Stay",
        "churn_probability": round(probability, 4),
        "risk_level": "High" if probability >= 0.7 else "Medium" if probability >= 0.4 else "Low"
    }
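
The risk tier is derived from the probability alone, using the 0.7 and 0.4 cut-offs in the handler above. A minimal sketch of that mapping as a standalone helper follows; the name `risk_tier` and the range check are illustrative assumptions, not code from the repository.

```python
def risk_tier(probability: float) -> str:
    """Map a churn probability in [0, 1] to a risk tier.

    Thresholds mirror the /predict handler: >= 0.7 is High, >= 0.4 is Medium.
    `risk_tier` is a hypothetical helper name, not part of the repository.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    if probability >= 0.7:
        return "High"
    if probability >= 0.4:
        return "Medium"
    return "Low"
```

Pulling the thresholds into one function keeps them testable in isolation and avoids repeating the nested conditional if another endpoint ever needs the same tiers.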

🛠️ Tech Stack

| Layer | Technology |
|---|---|
| API Framework | FastAPI |
| Input Validation | Pydantic v2 BaseModel with field descriptions |
| ML Model | Random Forest (class_weight="balanced" to offset the ~26% churn rate) |
| Feature Encoding | LabelEncoder per categorical column |
| Serialization | Pickle (model + feature column order) |
| Containerization | Docker (python:3.10-slim) |
| API Docs | Auto-generated OpenAPI / Swagger UI |
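
To make the training side of this stack concrete, here is a hedged sketch of how label encoding, the balanced Random Forest, and pickling could fit together. It is not the contents of train_churn_model.py: the function name `train_and_serialize`, the `random_state`, and the six-row toy frame are assumptions for illustration, and the real script works on the Kaggle CSV with all 19 features.

```python
import pickle

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder


def train_and_serialize(df: pd.DataFrame, target: str = "Churn"):
    """Label-encode non-numeric columns, fit the forest, pickle both artifacts.

    Returns (model_bytes, columns_bytes); a real script would write these to
    churn_model.pkl and feature_columns.pkl instead. Hypothetical helper.
    """
    df = df.copy()
    # One LabelEncoder per non-numeric column (the target included).
    for col in df.columns:
        if not pd.api.types.is_numeric_dtype(df[col]):
            df[col] = LabelEncoder().fit_transform(df[col].tolist())

    feature_columns = [c for c in df.columns if c != target]
    model = RandomForestClassifier(
        n_estimators=200, class_weight="balanced", random_state=42
    )
    model.fit(df[feature_columns], df[target])
    return pickle.dumps(model), pickle.dumps(feature_columns)


# Toy stand-in for the Telco dataset (made-up rows, illustration only).
toy = pd.DataFrame({
    "tenure": [1, 5, 40, 60, 2, 70],
    "Contract": ["Month-to-month", "Month-to-month", "One year",
                 "Two year", "Month-to-month", "Two year"],
    "MonthlyCharges": [70.0, 85.5, 55.2, 20.1, 95.0, 25.3],
    "Churn": ["Yes", "Yes", "No", "No", "Yes", "No"],
})
model_bytes, columns_bytes = train_and_serialize(toy)
```

LabelEncoder assigns integer codes in sorted order of the class labels, which is consistent with the 0=Month-to-month, 1=One year, 2=Two year mapping used by the API's `Contract` field.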

📡 API Usage

POST /predict

// Request
{
  "gender": 1, "SeniorCitizen": 0, "Partner": 1, "Dependents": 0,
  "tenure": 5, "Contract": 0, "PaperlessBilling": 1,
  "PaymentMethod": 2, "MonthlyCharges": 70.35, "TotalCharges": 351.75,
  ...
}

// Response
{
  "prediction": "Will Churn",
  "churn_probability": 0.7823,
  "risk_level": "High"
}
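
For callers that want to script the request above, the following is a minimal client sketch using only the Python standard library. The helper names `build_predict_request` and `predict_churn`, the 10-second timeout, and the default base URL (taken from the Quick Start port mapping) are assumptions; the payload must carry every field that `CustomerData` declares, otherwise FastAPI is expected to reject it with a validation error.

```python
import json
import urllib.request


def build_predict_request(payload: dict, base_url: str = "http://localhost:8000"):
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_churn(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """Send the request to a running service and decode the JSON response."""
    request = build_predict_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

With the container running, `predict_churn(full_payload)` should return a dict with the `prediction`, `churn_probability`, and `risk_level` keys shown in the response above.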

Additional endpoints:

  • GET / — API info
  • GET /health — Health check + model version

🚀 Quick Start

# 1. Clone
git clone https://github.com/Rahilshah01/customer-churn-prediction-api.git
cd customer-churn-prediction-api

# 2. Download dataset from Kaggle → place CSV in project root
# https://www.kaggle.com/datasets/blastchar/telco-customer-churn

# 3. Train and save the model
pip install scikit-learn pandas
python train_churn_model.py

# 4. Build and run with Docker
docker build -t churn-api .
docker run -p 8000:8000 churn-api

# 5. Open Swagger UI
# http://localhost:8000/docs

Without Docker:

pip install fastapi uvicorn scikit-learn pandas pydantic
uvicorn main:app --reload

📁 Repository Structure

customer-churn-prediction-api/
├── train_churn_model.py    # Data cleaning, training, evaluation, saves .pkl files
├── main.py                 # FastAPI app — prediction endpoint
├── churn_model.pkl         # Serialized RandomForestClassifier
├── feature_columns.pkl     # Saved feature order (prevents column mismatch)
├── Dockerfile
├── requirements.txt
└── README.md
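
The reason feature_columns.pkl exists is that a fitted model expects its inputs in the same column order it was trained on, while JSON keys can arrive in any order. A small standard-library sketch of that idea follows; `order_features` is a hypothetical helper, and the three-column list stands in for the 19 columns the real file holds.

```python
import pickle


def order_features(record: dict, feature_columns: list) -> list:
    """Return record values in the saved training order; fail on missing keys."""
    missing = [name for name in feature_columns if name not in record]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [record[name] for name in feature_columns]


# Round-trip the column order through pickle, as feature_columns.pkl would.
saved = pickle.dumps(["tenure", "Contract", "MonthlyCharges"])
feature_columns = pickle.loads(saved)

# Keys arrive in arbitrary order; the saved list restores the training order.
row = order_features(
    {"MonthlyCharges": 70.35, "tenure": 5, "Contract": 0}, feature_columns
)
```

In the API itself the same effect comes from indexing the DataFrame with `df[feature_columns]`, which also raises if a column is absent.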

🔗 Related Project

Model trained in Customer Churn Analysis — full EDA on churn drivers across 7,000+ telecom customers. This repo handles the deployment phase.


Built by Rahil Shah · MS Data Science @ Stevens Institute of Technology
