🎯 Reddit Sentiment Analysis - Complete MLOps Project

Python DVC MLflow Docker FastAPI ZenML Optuna CI/CD

Complete end-to-end MLOps workflow demonstrating best practices for production ML systems: data versioning, experiment tracking, containerization, pipeline orchestration, hyperparameter optimization, CI/CD, model serving, and monitoring.


🎬 Complete Project Demo

Complete Demo

Complete end-to-end demonstration showing all MLOps components working together:

  • ✅ Docker services orchestration
  • ✅ DVC data versioning (push/pull)
  • ✅ MLflow experiment tracking
  • ✅ ZenML pipeline execution
  • ✅ Optuna hyperparameter optimization
  • ✅ Model deployment (V1 → V2 upgrade)
  • ✅ API inference testing
  • ✅ Model rollback demonstration
  • ✅ Monitoring dashboards (Grafana)

📖 Project Description

This MLOps Mini-Project aims to build a complete end-to-end workflow for machine learning operations, demonstrating industry best practices for production ML systems. The project implements a sentiment analysis use case on Reddit comments, showcasing how to manage the entire ML lifecycle from data versioning to model deployment and monitoring.

Key Features

This project demonstrates a production-ready MLOps pipeline including:

  • Code Management (Git): Version control with branching strategy and release tags
  • Containerization (Docker/Docker Compose): Reproducible environments for training and serving
  • Data Versioning (DVC): Track datasets without storing large files in Git
  • Experiment Tracking (MLflow): Log parameters, metrics, and artifacts
  • Pipeline Orchestration (ZenML): Automated ML pipelines with experiment tracking
  • Hyperparameter Optimization (Optuna): Bayesian optimization for model tuning
  • CI/CD Pipeline (GitLab CI): Automated testing, building, and deployment
  • Model Serving (FastAPI): Production-ready inference API
  • Monitoring (Prometheus + Grafana): Real-time metrics and dashboards
  • Automated Retraining: Scheduled and conditional model retraining

🎯 Project Objectives

1. Introduction

This MLOps mini-project aims to build an end-to-end workflow including:

  • Code management (Git): Clean repository structure with branches and tags
  • Containerization (Docker / Docker Compose): Reproducible environments
  • Data versioning (DVC): Track datasets with remote storage
  • Experiment tracking (MLflow): Log experiments, parameters, and metrics
  • ML Pipeline (ZenML): Orchestrate ML workflows with CI/CD/CT
  • Hyperparameter optimization (Optuna): Automated model tuning
  • Model deployment (API): Production inference service
  • Monitoring & Retraining (Bonus): Real-time monitoring and automated retraining

2. Project Goals

  • ✅ Simple ML use case: CPU-friendly, quick to train (no GPU required)
  • ✅ Reproducibility: Reproducible training and inference (code + data + config)
  • ✅ Traceability: Full traceability with Git versions, data versions, MLflow runs, and pipeline executions
  • ✅ Deployment: Deploy inference API with version management (v1 → v2) and rollback capability

3. Project Content

3.1 Use Case, Data, and Model

  • ✅ Public dataset: Lightweight and quick to train on
  • ✅ Baseline model: A simple but measurable starting point
  • ✅ Primary metric: Defined primary metric (F1-Score for multi-class classification)

Our Implementation:

  • Dataset: Reddit Sentiment Analysis (~37,000 comments)
  • Baseline: RandomForest with Bag of Words (F1: 0.60)
  • Production: LightGBM with TF-IDF + Optuna HPT (F1: 0.8446)
  • Metric: F1-Score (Macro) as primary metric

3.2 Code Management (Git)

  • ✅ Clean GitLab repository: Well-structured README and clear repository layout
  • ✅ Branching strategy: main/dev branches with feature branches
  • ✅ Version tags: Tags for model versions (v1, v2)

Our Implementation:

  • Protected main and dev branches
  • Feature branches for development
  • Version tags: v1.0.0, v2.0.0, v2.1.0
  • Git Flow workflow

3.3 Containerization (Docker / Docker Compose)

  • ✅ Dockerfiles: Separate Dockerfiles for training and serving
  • ✅ Docker Compose: Complete stack orchestration for local execution

Our Implementation:

  • Dockerfile.train: Training container
  • Dockerfile.serve: API serving container
  • frontend/Dockerfile.frontend: Frontend container
  • docker-compose.yaml: Full stack (MinIO, MLflow, ZenML, API, Monitoring)

3.4 Data Versioning (DVC)

  • ✅ DVC tracking: Dataset tracked with DVC (no large files in Git)
  • ✅ Remote storage: Functional DVC remote (push/pull)
  • ✅ Reproducibility proof: Demonstrated end-to-end reproducibility

Our Implementation:

  • MinIO (S3-compatible) as DVC remote
  • dvc.yaml pipeline definition
  • Push/pull demonstrations with screenshots
  • Full reproducibility with dvc repro

3.5 Experiment Tracking (MLflow)

  • ✅ Baseline run: At least one baseline run
  • ✅ Comparable runs: Multiple comparable runs with simple variations
  • ✅ Artifact logging: Log parameters, metrics, and artifacts (models, figures, etc.)

Our Implementation:

  • Baseline: RandomForest (F1: 0.60)
  • Multiple model comparisons: LightGBM, XGBoost, RandomForest, LogisticRegression
  • Full artifact logging: models, vectorizers, confusion matrices, evaluation reports
  • MLflow UI with comparison views

3.6 ML Pipeline (ZenML) + CT

  • ✅ ZenML pipeline: Complete pipeline (data → train → eval → export)
  • ✅ Multiple executions: Baseline + variations pipeline runs
  • ✅ Continuous Training (CT): Scheduled smoke tests and full training

Our Implementation:

  • ZenML pipeline with 5 steps: data ingestion → preprocessing → training → evaluation → export
  • Multiple pipeline runs with different model types
  • GitLab CI scheduled jobs for CT (smoke test + full training)

3.7 Optimization (Optuna)

  • ✅ Optuna study: Short study (5-10 trials) on several hyperparameters
  • ✅ Comparison: Simple comparison between baseline / variations / best Optuna run

Our Implementation:

  • 10-trial Optuna study on LightGBM
  • Comparison: Baseline (F1: 0.8306) vs Grid Search (F1: 0.8350) vs Optuna (F1: 0.8426)
  • Hyperparameter importance analysis
  • MLflow integration for tracking

3.8 CI/CD (GitLab CI)

  • ✅ CI pipeline: Minimal pipeline with tests/lint + image builds + push to registry
  • ✅ Continuous Training (CT): Scheduled job running a "smoke" test (epochs=1 or a data subset)

Our Implementation:

  • 5-stage pipeline: test → build → push → deploy → smoke
  • Linting (flake8, black, isort)
  • Unit tests with coverage
  • Docker image building and pushing
  • Scheduled smoke test (1000 rows, n_estimators=10)
  • Weekly full CT training

3.9 Deployment (Serving)

  • ✅ Stable inference API: Independent API endpoint (e.g., /predict)
  • ✅ Docker Compose deployment: Deploy via Docker Compose
  • ✅ Version simulation: Simulate v1 → v2 update + rollback (proof with tests/captures)

Our Implementation:

  • FastAPI inference service with /predict endpoint
  • Docker Compose deployment
  • Version management: v1.0.0 (F1: 0.60) → v2.1.0 (F1: 0.84)
  • Rollback demonstration with screenshots
  • Health checks and metrics endpoints

4. Bonus (Optional)

  • ✅ Monitoring: Latency, request count, errors, custom metrics
  • ✅ Retraining: Simple triggering (scheduled or conditional)

Our Implementation:

  • Prometheus metrics collection
  • Grafana dashboards (request rate, latency, confidence distribution)
  • Retrain service with scheduled (daily) and conditional triggers
  • Accuracy threshold-based retraining

5. Deliverables

  • ✅ Repository link: GitHub/GitLab repository
  • ✅ Dockerfiles: Dockerfile.train, Dockerfile.serve, docker-compose.yaml
  • ✅ DVC configuration: .dvc files, dvc.yaml + push/pull proof
  • ✅ MLflow captures: Experiment list + comparison screenshots
  • ✅ ZenML captures: Pipeline runs + DAG visualizations
  • ✅ GitLab CI: .gitlab-ci.yml with full pipeline
  • ✅ Deployment demo: Inference testing + v1 → v2 update + rollback with screenshots
  • ✅ Documentation: Complete README with execution instructions, commands, and structure

🎯 1. Project Overview

Use Case

Multi-class sentiment classification of Reddit comments into three categories:

  • Negative (-1): Negative sentiment
  • Neutral (0): Neutral sentiment
  • Positive (1): Positive sentiment
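For illustration, the three label codes map to names with a small lookup. A minimal sketch; the `SENTIMENT_LABELS` dict and `label_name` helper are illustrative, not taken from the project code:

```python
# Map the dataset's sentiment codes to human-readable labels.
# (Illustrative helper; not part of the project's actual source.)
SENTIMENT_LABELS = {-1: "Negative", 0: "Neutral", 1: "Positive"}

def label_name(code: int) -> str:
    """Return the label for a sentiment code, raising on unknown codes."""
    try:
        return SENTIMENT_LABELS[code]
    except KeyError:
        raise ValueError(f"Unknown sentiment code: {code}") from None
```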

Dataset

  • Source: Reddit Sentiment Analysis Dataset
  • Size: ~37,000 comments
  • Features: Text comments with sentiment labels
  • Public & Lightweight: Suitable for quick training and experimentation

Models & Performance

| Version | Model | Features | F1 Score | Accuracy | Status |
|---------|-------|----------|----------|----------|--------|
| v1.0.0 | RandomForest | Bag of Words (5000 features) | 0.60 | 0.67 | Baseline |
| v2.0.0 | LightGBM | TF-IDF (1-3 ngrams) + SMOTE | 0.84 | 0.84 | Production |
| v2.1.0 | LightGBM | TF-IDF + Optuna HPT | 0.8446 | 0.8463 | Optimized |

Primary Metric

  • F1-Score (Macro): Primary metric for multi-class classification
  • Secondary Metrics: Accuracy, Precision, Recall per class
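Macro F1 averages the per-class F1 scores with equal weight, which is why it is a fairer primary metric than accuracy when classes are imbalanced. A minimal stdlib sketch of the computation; in practice the project would likely use `sklearn.metrics.f1_score(..., average="macro")`:

```python
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Compute macro-averaged F1 over the classes present in y_true."""
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but was wrong
            fn[t] += 1  # true class t, but was missed
    scores = []
    for c in set(y_true):
        precision = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / len(scores)
```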

🌿 2. Git Management & Version Control

Branching Strategy

This project follows a Git Flow branching model with protected branches:

main ────●────────────●────────────●──── (releases)
          \          /            /
dev ───────●────●────●────●──────●────── (integration)
            \  /      \  /
feature/* ───●─────────●──────────────── (development)

| Branch | Description | Protection |
|--------|-------------|------------|
| main | Production-ready code | ✅ Protected |
| dev | Development branch | ✅ Protected |

Version Tags

All model versions are tagged in Git for traceability:

| Tag | Description | Model | F1 Score | Commit |
|-----|-------------|-------|----------|--------|
| v1.0.0 | Baseline model | RandomForest + BoW | 0.60 | abc123... |
| v2.0.0 | Improved model | LightGBM + TF-IDF + SMOTE | 0.84 | def456... |
| v2.1.0 | Optimized model | LightGBM + Optuna HPT | 0.8446 | ghi789... |

Git Workflow Proof

Creating and tagging versions:

# Tag v1.0.0 (Baseline)
git tag -a v1.0.0 -m "Baseline: RandomForest + BoW (F1=0.60)"
git push origin v1.0.0

# Tag v2.0.0 (Production)
git tag -a v2.0.0 -m "Production: LightGBM + TF-IDF (F1=0.84)"
git push origin v2.0.0

# View all tags
git tag -l

Branch protection:

  • main and dev branches require pull requests
  • All commits must pass CI/CD pipeline
  • Code review required before merge

🐳 3. Docker Containerization

Dockerfiles

The project includes three Dockerfiles for different purposes:

3.1 Training Container (Dockerfile.train)

FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY src/ ./src/
COPY configs/ ./configs/
CMD ["python", "src/training/trainer.py"]

Usage:

docker build -f Dockerfile.train -t sentiment-train .
docker run -v ${PWD}/data:/app/data -v ${PWD}/models:/app/models sentiment-train

3.2 API Serving Container (Dockerfile.serve)

FROM python:3.10-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY api/ ./api/
COPY src/ ./src/
COPY models/ ./models/
EXPOSE 8000
CMD ["uvicorn", "api.main:app", "--host", "0.0.0.0", "--port", "8000"]

Usage:

docker build -f Dockerfile.serve -t sentiment-api .
docker run -p 8000:8000 -v ${PWD}/models:/app/models sentiment-api

3.3 Frontend Container (frontend/Dockerfile.frontend)

Nginx-based frontend for user interface.

Docker Compose Stack

The docker-compose.yaml orchestrates the entire MLOps infrastructure:

Services included:

  • MinIO: S3-compatible storage for DVC and MLflow artifacts
  • MLflow: Experiment tracking server
  • ZenML Server: Pipeline orchestration dashboard
  • Sentiment API: FastAPI inference service
  • Prometheus: Metrics collection
  • Grafana: Monitoring dashboards
  • Retrain Service: Automated retraining service
  • Frontend: Web UI for predictions

Start all services:

docker-compose up -d

View running services:

docker-compose ps

Docker Proof

Docker Services

Screenshot showing all Docker containers running with docker-compose

Service URLs:

| Service | URL | Credentials |
|---------|-----|-------------|
| MinIO Console | http://localhost:9003 | minio / minio12345 |
| MLflow UI | http://localhost:5001 | - |
| ZenML Dashboard | http://localhost:8080 | - |
| API | http://localhost:8000 | - |
| Grafana | http://localhost:3000 | admin / admin |
| Prometheus | http://localhost:9090 | - |

📦 4. DVC - Data Version Control

Overview

DVC (Data Version Control) tracks large data files and model artifacts without storing them in Git. This project uses MinIO (S3-compatible) as remote storage.

DVC Configuration

Initialize DVC:

dvc init

Configure remote storage (MinIO):

dvc remote add -d storage s3://mlops-dvc
dvc remote modify storage endpointurl http://localhost:9002
dvc remote modify storage use_ssl false
dvc remote modify storage region us-east-1

Set environment variables:

# Windows PowerShell
$env:AWS_ACCESS_KEY_ID="minio"
$env:AWS_SECRET_ACCESS_KEY="minio12345"

# Linux/macOS
export AWS_ACCESS_KEY_ID=minio
export AWS_SECRET_ACCESS_KEY=minio12345

Tracking Data with DVC

Add data to DVC:

# Download dataset
python scripts/download_data.py

# Track raw data
dvc add data/raw/reddit.csv

# Commit .dvc file to Git (not the actual data)
git add data/raw/reddit.csv.dvc data/raw/.gitignore
git commit -m "Add raw dataset with DVC"

DVC Pipeline (dvc.yaml):

stages:
  preprocess:
    cmd: python src/preprocessing.py
    deps:
      - data/raw/reddit.csv
    outs:
      - data/processed/reddit_processed.csv

  train:
    cmd: python src/train.py
    deps:
      - data/processed/reddit_processed.csv
    outs:
      - models/model.pkl
      - models/vectorizer.pkl

Run pipeline:

dvc repro  # Reproduce entire pipeline
dvc dag    # Visualize pipeline DAG

DVC Proof & Demonstrations

4.1 MinIO Storage Setup

MinIO Bucket Creation MinIO console showing bucket creation for DVC storage

MinIO DVC Bucket DVC bucket with versioned data files

MinIO File Storage Actual data files stored in MinIO (not in Git)

4.2 DVC Push Operation

DVC Push Demonstration of dvc push command pushing data to MinIO remote storage

Command output:

Pushing to 'storage' (s3://mlops-dvc)
[####################] 100% data/raw/reddit.csv

4.3 DVC Pull Operation

DVC Pull Demonstration of dvc pull command retrieving data from remote storage

Command output:

Pulling from 'storage' (s3://mlops-dvc)
[####################] 100% data/raw/reddit.csv

4.4 DVC Status

DVC Status DVC status showing tracked files and pipeline stages

Reproducibility proof:

# On a new machine
git clone <repo-url>
cd reddit-sentiment-mlops
dvc pull  # Retrieves data from MinIO
dvc repro # Reproduces entire pipeline with exact same results

📊 5. MLflow - Experiment Tracking

Overview

MLflow tracks experiments, logs parameters, metrics, and artifacts. This project uses MLflow with MinIO for artifact storage.

MLflow Setup

Start MLflow server:

# Set environment variables
$env:AWS_ACCESS_KEY_ID="minio"
$env:AWS_SECRET_ACCESS_KEY="minio12345"
$env:MLFLOW_S3_ENDPOINT_URL="http://localhost:9002"

# Start MLflow server
mlflow server --host 0.0.0.0 --port 5001 \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root s3://mlflow-artifacts/

Access MLflow UI: http://localhost:5001

Experiment Tracking

Run experiments with MLflow:

# Baseline: Random Forest
python src/training/train_mlflow.py \
  --exp-name sentiment_analysis \
  --run-name "random_forest_baseline" \
  --model-type random_forest

# LightGBM with TF-IDF
python src/training/train_mlflow.py \
  --exp-name sentiment_analysis \
  --run-name "lightgbm_tfidf" \
  --model-type lightgbm \
  --max-features 10000 \
  --ngram-min 1 \
  --ngram-max 3 \
  --use-smote

# XGBoost
python src/training/train_mlflow.py \
  --exp-name sentiment_analysis \
  --run-name "xgboost_baseline" \
  --model-type xgboost

MLflow Proof & Screenshots

5.1 MLflow Experiments List

MLflow Experiments MLflow UI showing all experiments with multiple runs

Experiments tracked:

  • sentiment_analysis: Main experiment
  • optuna_sentiment: Optuna optimization runs
  • zenml_pipeline: ZenML pipeline runs

5.2 MLflow Run Comparison

MLflow Comparison Comparing multiple runs: Random Forest vs LightGBM vs XGBoost

Metrics comparison:

| Run Name | Model | F1 Score | Accuracy | Training Time |
|----------|-------|----------|----------|---------------|
| random_forest_baseline | RandomForest | 0.60 | 0.67 | 45s |
| lightgbm_tfidf | LightGBM | 0.84 | 0.84 | 38s |
| xgboost_baseline | XGBoost | 0.82 | 0.82 | 52s |

5.3 MLflow Artifacts

MLflow Artifacts MLflow storing model artifacts, confusion matrices, and evaluation reports

Artifacts logged:

  • model.pkl: Trained model
  • vectorizer.pkl: Feature vectorizer
  • confusion_matrix.png: Classification visualization
  • evaluation_report.json: Detailed metrics
  • class_performance.png: Per-class metrics
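As an illustration of what `evaluation_report.json` might contain, here is a hedged sketch; the field names and per-class numbers below are placeholders rather than the project's actual report schema (only the top-level F1/accuracy values come from the tables above):

```python
import json

# Hypothetical structure of evaluation_report.json; field names and the
# per-class values are placeholders, not the project's real schema/results.
report = {
    "model_type": "lightgbm",
    "f1_macro": 0.8446,
    "accuracy": 0.8463,
    "per_class": {
        "Negative": {"precision": 0.8, "recall": 0.8},   # placeholder values
        "Neutral": {"precision": 0.8, "recall": 0.8},    # placeholder values
        "Positive": {"precision": 0.8, "recall": 0.8},   # placeholder values
    },
}

serialized = json.dumps(report, indent=2)  # what would be logged as an artifact
restored = json.loads(serialized)          # what a consumer would read back
```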

5.4 MLflow-ZenML Integration

MLflow-ZenML Integration ZenML pipeline runs automatically logged to MLflow


🔄 6. ZenML - Pipeline Orchestration

Overview

ZenML provides ML pipeline orchestration with automatic experiment tracking integration. Our pipeline follows: data → preprocess → train → evaluate → export.

ZenML Setup

Install and connect:

pip install zenml==0.91.2

# Connect to ZenML server (Docker)
zenml login http://localhost:8080 --no-verify-ssl

# Verify connection
zenml status

Configure stack:

# Register MLflow experiment tracker
zenml experiment-tracker register mlflow_tracker \
  --flavor=mlflow \
  --tracking_uri=http://localhost:5001

# Create and activate stack
zenml stack register mlflow_stack \
  -o default \
  -a default \
  -e mlflow_tracker \
  --set

Pipeline Definition

Pipeline structure:

@pipeline
def sentiment_analysis_pipeline():
    # Step 1: Ingest data
    raw_data = data_ingestion_step()

    # Step 2: Preprocess
    processed_data = preprocessing_step(raw_data)

    # Step 3: Train model
    model = training_step(processed_data)

    # Step 4: Evaluate
    metrics = evaluation_step(model, processed_data)

    # Step 5: Export
    export_step(model, metrics)

Run pipeline:

python run_pipeline.py \
  --model-type lightgbm \
  --max-features 10000 \
  --ngram-min 1 \
  --ngram-max 3 \
  --use-smote

ZenML Proof & Screenshots

6.1 ZenML Dashboard

ZenML Dashboard ZenML dashboard showing pipeline runs and status

6.2 Pipeline Visualization

Pipeline DAG Visual representation of the pipeline DAG (data → preprocess → train → eval → export)

6.3 Pipeline Runs

Pipeline Runs List of all pipeline runs with status, duration, and metadata

6.4 ZenML Analytics

ZenML Analytics Analytics dashboard showing pipeline performance metrics

6.5 Successful Pipeline Run

ZenML Success Successful pipeline execution with all steps completed

6.6 MLflow-ZenML Integration

MLflow-ZenML ZenML runs automatically logged to MLflow for unified tracking

6.7 Pipeline Execution Script

Pipeline Script Command-line execution of ZenML pipeline with parameters

6.8 ZenML CLI Commands

![ZenML CLI](img/Zenml/zenml%20cmd.png) ZenML CLI showing stack configuration and pipeline management

6.9 Other Pipeline Views

Pipeline View 1 Pipeline View 2 Additional pipeline visualization views


🔬 7. Optuna - Hyperparameter Optimization

Overview

Optuna performs Bayesian hyperparameter optimization using TPE (Tree-structured Parzen Estimator) sampler. This section demonstrates optimization results compared to baseline and grid search.

Optuna Study

Run Optuna optimization:

# Set environment variables
$env:AWS_ACCESS_KEY_ID="minio"
$env:AWS_SECRET_ACCESS_KEY="minio12345"
$env:MLFLOW_S3_ENDPOINT_URL="http://localhost:9002"
$env:MLFLOW_TRACKING_URI="http://localhost:5001"

# Run Optuna with 10 trials
python src/training/optuna_sentiment.py \
  --n-trials 10 \
  --model-type lightgbm \
  --max-features 10000 \
  --ngram-min 1 \
  --ngram-max 3 \
  --exp-name optuna_sentiment

Hyperparameters Searched

LightGBM hyperparameters:

  • n_estimators: 100-400
  • learning_rate: 0.01-0.2 (log scale)
  • max_depth: 4-10
  • num_leaves: 20-80
  • min_child_samples: 10-50
  • colsample_bytree: 0.6-1.0
  • subsample: 0.6-1.0
  • reg_alpha, reg_lambda: 0.001-0.5

Optimization Results

| Method | Model | Trials | Best F1 | Time | Improvement |
|--------|-------|--------|---------|------|-------------|
| Baseline | LightGBM | 1 | 0.8306 | ~38s | - |
| Grid Search | LightGBM | 16 | 0.8350 | ~8min | +0.5% |
| Optuna TPE | LightGBM | 10 | 0.8426 | ~4.5min | +1.2% |

Improvement: Optuna achieved +1.2% F1 score with fewer trials than grid search.

Optuna Proof & Screenshots

7.1 Optuna Best Trial

Optuna Best F1 Optuna dashboard showing best trial with F1 score of 0.8426

Best hyperparameters found:

{
    'n_estimators': 250,
    'learning_rate': 0.08,
    'max_depth': 7,
    'num_leaves': 65,
    'min_child_samples': 25,
    'colsample_bytree': 0.85,
    'subsample': 0.9,
    'reg_alpha': 0.05,
    'reg_lambda': 0.1
}

7.2 Optuna Study Visualization

![Optuna Study](img/optuna/Screenshot%202026-01-03%20031442.png) Optuna study showing optimization history and parameter importance

Key insights:

  • learning_rate and n_estimators are most important
  • max_depth has moderate impact
  • Optimization converged after ~8 trials

🔄 8. CI/CD Pipeline (GitLab CI)

Overview

The project includes a complete CI/CD pipeline with automated testing, building, deployment, and Continuous Training (CT).

Pipeline Stages

┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐    ┌─────────┐
│  TEST   │ →  │  BUILD  │ →  │  PUSH   │ →  │ DEPLOY  │ →  │  SMOKE  │
│lint+test│    │ images  │    │registry │    │(manual) │    │   CT    │
└─────────┘    └─────────┘    └─────────┘    └─────────┘    └─────────┘

Stage Details

| Stage | Jobs | Description |
|-------|------|-------------|
| test | lint, test, security_scan | Run flake8, pytest, safety, bandit |
| build | build_api, build_retrain, build_frontend | Build Docker images |
| push | push_api, push_retrain, push_frontend | Push to GitLab Container Registry |
| deploy | deploy_staging, deploy_production | Manual deployment (optional) |
| smoke | smoke_test_training, ct_full_training | Smoke test & CT (scheduled) |
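The five stages above map onto a `.gitlab-ci.yml` skeleton roughly like the following; this is a hedged sketch under assumed job names and rules, not the project's actual CI file:

```yaml
stages: [test, build, push, deploy, smoke]

lint:
  stage: test
  image: python:3.10-slim
  script:
    - pip install flake8
    - flake8 src/ api/

build_api:
  stage: build
  image: docker:24
  services: [docker:24-dind]
  script:
    - docker build -f Dockerfile.serve -t $CI_REGISTRY_IMAGE/api:$CI_COMMIT_SHORT_SHA .

deploy_production:
  stage: deploy
  when: manual          # manual deployment gate
  script:
    - echo "deploy step (project-specific)"

smoke_test_training:
  stage: smoke
  rules:
    - if: $CI_PIPELINE_SOURCE == "schedule"   # scheduled CT run
  script:
    - python src/training/train_mlflow.py --model-type lightgbm  # reduced settings
```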

CI/CD Proof & Screenshots

8.1 CI Linting Stage

![CI Linting](img/ci-cd/CI%20Linting%20.png) GitLab CI pipeline showing linting stage with flake8, black, and isort checks

8.2 CI Testing Stage

![CI Testing](img/ci-cd/CI%20Testing.png) Unit tests running with pytest and coverage reporting

8.3 Continuous Training (CT) - Smoke Test

CT Smoke Test 1 CT Smoke Test 2 CT Smoke Test 3 Scheduled CT jobs running minimal training to verify pipeline integrity

Smoke test configuration:

  • Uses 1000 rows of data
  • Trains with n_estimators=10
  • Validates minimum F1 threshold (0.3)
  • Runs daily or on manual trigger
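The threshold check at the end of the smoke test can be as simple as an assertion on the reported F1; a minimal sketch (the `check_smoke_result` name and NaN guard are illustrative, not the project's actual validation code):

```python
SMOKE_F1_THRESHOLD = 0.3  # minimum F1 the reduced training run must reach

def check_smoke_result(f1: float, threshold: float = SMOKE_F1_THRESHOLD) -> bool:
    """Return True if the smoke-test model clears the minimum F1 bar."""
    if f1 != f1:  # NaN guard: NaN is never equal to itself
        return False
    return f1 >= threshold
```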

8.4 GitHub/GitLab Integration

GitHub Integration GitLab CI/CD pipeline integrated with repository

Pipeline Configuration

Key features:

  • ✅ Automated testing on every commit
  • ✅ Docker image building and pushing
  • ✅ Scheduled CT jobs (daily smoke test, weekly full training)
  • ✅ Manual deployment gates
  • ✅ Artifact storage for test reports

🚀 9. Model Deployment & Serving

Overview

The project includes a production-ready FastAPI inference service with model versioning, health checks, and metrics endpoints.

API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | / | API information |
| GET | /health | Health check |
| GET | /model/info | Model version & metrics |
| POST | /predict | Single text prediction |
| POST | /predict/batch | Batch predictions |
| GET | /docs | Swagger documentation |
| GET | /metrics | Prometheus metrics |

Deployment Demo

🎬 Live demonstration of model versioning, inference testing, and rollback capabilities.

Demo Workflow

┌──────────────────────────────────────────────────────────────────────────┐
│                         DEPLOYMENT DEMO WORKFLOW                         │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   ① START                   ② DEPLOY V2               ③ ROLLBACK V1     │
│   ┌─────────┐               ┌─────────┐               ┌─────────┐        │
│   │  V1.0   │  ──────────►  │  V2.1   │  ──────────►  │  V1.0   │        │
│   │ F1=0.60 │    upgrade    │ F1=0.84 │    rollback   │ F1=0.60 │        │
│   └─────────┘               └─────────┘               └─────────┘        │
│        │                         │                         │             │
│        ▼                         ▼                         ▼             │
│   ┌─────────┐               ┌─────────┐               ┌─────────┐        │
│   │  Test   │               │  Test   │               │  Test   │        │
│   │  49.8%  │               │  92.1%  │               │  49.8%  │        │
│   └─────────┘               └─────────┘               └─────────┘        │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘

Deployment Proof & Screenshots

9.1 Deploy V1 (Baseline)

![Deploy V1](img/Serving%20&%20Rollback/dockerv1.png) Deploying version 1.0.0 (RandomForest + BoW) with F1=0.60

Deployment command:

.\scripts\deploy.ps1 -Version "v1.0.0"

Expected output:

=============================================
  Sentiment API Deployment Script
=============================================
Model folder: models/v1

Deploying version: v1.0.0
✅ DEPLOYMENT SUCCESSFUL!

Model Info:
  Version: v1
  Model: RandomForestClassifier
  Status: healthy

Test Prediction:
  Text: 'This is great!'
  Label: Positive
  Confidence: 49.82%        ← Low confidence (baseline model)
=============================================

9.2 Deploy V2 (Optimized)

![Deploy V2](img/Serving%20&%20Rollback/v2.png) Upgrading to version 2.1.0 (LightGBM + TF-IDF + Optuna) with F1=0.84

Deployment command:

.\scripts\deploy.ps1 -Version "v2.1.0"

Expected output:

=============================================
  Sentiment API Deployment Script
=============================================
Current version: v1
Model folder: models/v2

Deploying version: v2.1.0
✅ DEPLOYMENT SUCCESSFUL!

Model Info:
  Version: v2
  Model: lightgbm
  Status: healthy

Test Prediction:
  Text: 'This is great!'
  Label: Positive
  Confidence: 92.12%        ← High confidence (optimized model)
=============================================

9.3 API Health Check

![API Health](img/Serving%20&%20Rollback/healthapi.png) API health endpoint showing service status and model information

Health check response:

{
  "status": "healthy",
  "model_version": "v2",
  "model_type": "lightgbm",
  "f1_score": 0.8446,
  "accuracy": 0.8463
}

9.4 API Documentation

![API Docs](img/Serving%20&%20Rollback/docs.png) FastAPI Swagger documentation showing all available endpoints

Inference Testing

V1 Prediction (Baseline):

curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "I love this product, it'\''s amazing!"}'

Response:

{
  "text": "I love this product, it's amazing!",
  "prediction": 1,
  "label": "Positive",
  "confidence": 0.52,        ← ~52% confidence
  "probabilities": {
    "Negative": 0.18,
    "Neutral": 0.30,
    "Positive": 0.52
  }
}

V2 Prediction (Optimized):

{
  "text": "I love this product, it's amazing!",
  "prediction": 1,
  "label": "Positive",
  "confidence": 0.94,        ← ~94% confidence (much better!)
  "probabilities": {
    "Negative": 0.02,
    "Neutral": 0.04,
    "Positive": 0.94
  }
}

Results Comparison

| Test Text | V1 Confidence | V2 Confidence | Improvement |
|-----------|---------------|---------------|-------------|
| "This is great!" | 49.8% | 92.1% | +85% |
| "I love this product!" | 52.0% | 94.0% | +81% |
| "Terrible experience" | 45.2% | 89.3% | +98% |
| "It's okay I guess" | 38.1% | 67.4% | +77% |
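The Improvement column is the relative gain in confidence, e.g. (92.1 − 49.8) / 49.8 ≈ +85%. A small stdlib helper makes the arithmetic explicit (the function name is illustrative):

```python
def relative_improvement(old: float, new: float) -> float:
    """Relative change from old to new, expressed as a percentage."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old * 100
```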

Rollback Demonstration

Rollback to V1:

.\scripts\deploy.ps1 -Version "v1.0.0" -Rollback

Expected output:

🔴 ROLLBACK to version: v1.0.0

Stopping current API...
Starting API with model from v1 (version v1.0.0)...

✅ DEPLOYMENT SUCCESSFUL!

Model Info:
  Version: v1
  Model: RandomForestClassifier
  Confidence: 49.82%        ← Back to baseline

📈 10. Monitoring & Retraining (Bonus)

Monitoring Stack

The project includes Prometheus + Grafana for comprehensive monitoring:

  • Prometheus: Metrics collection
  • Grafana: Visualization dashboards
  • Custom metrics: Request latency, prediction confidence, error rates

Monitoring Proof & Screenshots

10.1 Grafana Dashboard

Grafana Dashboard Grafana dashboard showing API metrics: request rate, latency, and prediction confidence

Metrics tracked:

  • Request rate (requests/second)
  • Prediction latency (p50, p95, p99)
  • Prediction confidence distribution
  • Error rate
  • Model version in use

10.2 Retrain Service - Scheduled

Scheduled Retrain Retrain service configuration showing scheduled retraining triggers

Retrain triggers:

  • Scheduled: Daily at 2 AM
  • Conditional: When accuracy drops below threshold
  • Manual: Via API endpoint

10.3 Retrain Execution Time

Retrain Time Grafana showing retraining execution time and frequency

Retraining Service

Automatic retraining conditions:

  1. Scheduled: Daily/weekly retraining
  2. Accuracy threshold: Retrain if F1 < 0.80
  3. Feedback threshold: Retrain after 1000 new predictions
  4. Manual trigger: Via API endpoint
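The trigger logic can be sketched as a single decision function; the names and thresholds below mirror the four conditions above but are illustrative, not the retrain service's actual code:

```python
def should_retrain(f1: float, new_predictions: int, scheduled_due: bool,
                   manual: bool = False,
                   f1_threshold: float = 0.80,
                   feedback_threshold: int = 1000) -> bool:
    """Decide whether a retraining run should be triggered."""
    return (
        manual                                     # 4. manual API trigger
        or scheduled_due                           # 1. daily/weekly schedule fired
        or f1 < f1_threshold                       # 2. accuracy degradation
        or new_predictions >= feedback_threshold   # 3. enough new feedback
    )
```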

Retrain service logs:

[2025-01-03 02:00:00] Starting scheduled retraining...
[2025-01-03 02:00:05] Loading new data...
[2025-01-03 02:00:10] Training new model...
[2025-01-03 02:02:30] New model F1: 0.8450 (improvement: +0.0004)
[2025-01-03 02:02:31] Deploying new model...
[2025-01-03 02:02:35] Retraining completed successfully

🚀 Quick Start Guide

Prerequisites

  • Python 3.10+
  • Docker & Docker Compose
  • Git
  • (Optional) GitLab account for CI/CD

Step-by-Step Setup

1. Clone Repository

git clone <your-repo-url>
cd reddit-sentiment-mlops

2. Create Virtual Environment

python -m venv venv
venv\Scripts\activate  # Windows
# source venv/bin/activate  # Linux/Mac

3. Install Dependencies

pip install -r requirements.txt

4. Start Infrastructure

docker-compose up -d

Verify services:

docker-compose ps

5. Setup DVC

# Initialize DVC
dvc init

# Configure remote (MinIO)
dvc remote add -d storage s3://mlops-dvc
dvc remote modify storage endpointurl http://localhost:9002
dvc remote modify storage use_ssl false

# Set environment variables
$env:AWS_ACCESS_KEY_ID="minio"
$env:AWS_SECRET_ACCESS_KEY="minio12345"

# Download and track data
python scripts/download_data.py
dvc add data/raw/reddit.csv
git add data/raw/reddit.csv.dvc
git commit -m "Add data with DVC"
dvc push

6. Setup MLflow

# Set environment variables
$env:AWS_ACCESS_KEY_ID="minio"
$env:AWS_SECRET_ACCESS_KEY="minio12345"
$env:MLFLOW_S3_ENDPOINT_URL="http://localhost:9002"
$env:MLFLOW_TRACKING_URI="http://localhost:5001"

7. Setup ZenML

# Install ZenML
pip install zenml==0.91.2

# Connect to ZenML server
zenml login http://localhost:8080 --no-verify-ssl

# Register MLflow tracker
zenml experiment-tracker register mlflow_tracker \
  --flavor=mlflow \
  --tracking_uri=http://localhost:5001

# Create stack
zenml stack register mlflow_stack \
  -o default \
  -a default \
  -e mlflow_tracker \
  --set

8. Train Baseline Model

python src/training/train_mlflow.py \
  --exp-name sentiment_analysis \
  --run-name "baseline" \
  --model-type random_forest

9. Run Optuna Optimization

python src/training/optuna_sentiment.py \
  --n-trials 10 \
  --model-type lightgbm

10. Run ZenML Pipeline

python run_pipeline.py \
  --model-type lightgbm \
  --max-features 10000 \
  --ngram-min 1 \
  --ngram-max 3 \
  --use-smote

11. Deploy Model

# Deploy V1
.\scripts\deploy.ps1 -Version "v1.0.0"

# Deploy V2
.\scripts\deploy.ps1 -Version "v2.1.0"

12. Test API

curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "This is amazing!"}'

πŸ“ Project Structure

reddit-sentiment-mlops/
β”‚
β”œβ”€β”€ πŸ“‚ api/                     # FastAPI application
β”‚   β”œβ”€β”€ __init__.py
β”‚   └── main.py                 # API endpoints
β”‚
β”œβ”€β”€ πŸ“‚ configs/                 # Configuration files
β”‚   β”œβ”€β”€ config.yaml             # Main configuration
β”‚   └── optuna_config.yaml      # Hyperparameter tuning config
β”‚
β”œβ”€β”€ πŸ“‚ data/                    # Data directory (DVC tracked)
β”‚   β”œβ”€β”€ raw/                    # Raw data
β”‚   β”‚   └── reddit.csv.dvc      # DVC tracking file
β”‚   └── processed/               # Processed data
β”‚
β”œβ”€β”€ πŸ“‚ models/                   # Trained models
β”‚   β”œβ”€β”€ v1/                     # Version 1 models
β”‚   β”‚   β”œβ”€β”€ model.pkl
β”‚   β”‚   └── vectorizer.pkl
β”‚   └── v2/                     # Version 2 models
β”‚       β”œβ”€β”€ model.pkl
β”‚       └── vectorizer.pkl
β”‚
β”œβ”€β”€ πŸ“‚ notebooks/               # Jupyter notebooks (EDA & experiments)
β”‚   β”œβ”€β”€ Preprocessing_and_EDA.ipynb
β”‚   β”œβ”€β”€ experiment_1_baseline_model.ipynb
β”‚   └── ...
β”‚
β”œβ”€β”€ πŸ“‚ scripts/                 # Utility scripts
β”‚   β”œβ”€β”€ download_data.py
β”‚   β”œβ”€β”€ deploy.ps1              # Deployment script
β”‚   β”œβ”€β”€ run_optuna.ps1          # Optuna runner
β”‚   └── ...
β”‚
β”œβ”€β”€ πŸ“‚ src/                     # Source code (modular structure)
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ πŸ“‚ data/                # Data loading module
β”‚   β”‚   └── loader.py
β”‚   β”œβ”€β”€ πŸ“‚ features/            # Feature engineering
β”‚   β”‚   β”œβ”€β”€ preprocessing.py
β”‚   β”‚   └── extraction.py
β”‚   β”œβ”€β”€ πŸ“‚ training/            # Training module
β”‚   β”‚   β”œβ”€β”€ trainer.py
β”‚   β”‚   β”œβ”€β”€ train_mlflow.py
β”‚   β”‚   └── optuna_sentiment.py
β”‚   β”œβ”€β”€ πŸ“‚ evaluation/          # Evaluation module
β”‚   β”‚   └── evaluator.py
β”‚   β”œβ”€β”€ πŸ“‚ serving/             # Serving module
β”‚   β”‚   β”œβ”€β”€ predictor.py
β”‚   β”‚   └── retrain_service.py
β”‚   β”œβ”€β”€ πŸ“‚ pipelines/           # ZenML pipelines
β”‚   β”‚   β”œβ”€β”€ πŸ“‚ steps/
β”‚   β”‚   β”‚   β”œβ”€β”€ data_ingestion.py
β”‚   β”‚   β”‚   β”œβ”€β”€ preprocessing.py
β”‚   β”‚   β”‚   β”œβ”€β”€ training.py
β”‚   β”‚   β”‚   β”œβ”€β”€ evaluation.py
β”‚   β”‚   β”‚   └── export.py
β”‚   β”‚   └── πŸ“‚ zenml/
β”‚   β”‚       └── sentiment_pipeline.py
β”‚   └── πŸ“‚ utils/                # Utilities
β”‚       β”œβ”€β”€ config.py
β”‚       └── logging_utils.py
β”‚
β”œβ”€β”€ πŸ“‚ monitoring/              # Prometheus/Grafana configs
β”‚   β”œβ”€β”€ prometheus/
β”‚   β”‚   └── prometheus.yml
β”‚   └── grafana/
β”‚       └── provisioning/
β”‚
β”œβ”€β”€ πŸ“‚ tests/                   # Unit tests
β”‚   β”œβ”€β”€ test_data.py
β”‚   β”œβ”€β”€ test_features.py
β”‚   β”œβ”€β”€ test_training.py
β”‚   └── test_evaluation.py
β”‚
β”œβ”€β”€ πŸ“‚ img/                     # Documentation images
β”‚   β”œβ”€β”€ demo.gif                # Complete demo
β”‚   β”œβ”€β”€ Docker/
β”‚   β”œβ”€β”€ dvc/
β”‚   β”œβ”€β”€ mlflow/
β”‚   β”œβ”€β”€ Zenml/
β”‚   β”œβ”€β”€ optuna/
β”‚   β”œβ”€β”€ ci-cd/
β”‚   β”œβ”€β”€ Serving & Rollback/
β”‚   └── Grafana/
β”‚
β”œβ”€β”€ πŸ“„ .env                     # Environment variables
β”œβ”€β”€ πŸ“„ .gitignore
β”œβ”€β”€ πŸ“„ .gitlab-ci.yml           # GitLab CI/CD pipeline
β”œβ”€β”€ πŸ“„ docker-compose.yaml      # Full stack orchestration
β”œβ”€β”€ πŸ“„ Dockerfile.train         # Training container
β”œβ”€β”€ πŸ“„ Dockerfile.serve         # API serving container
β”œβ”€β”€ πŸ“„ dvc.yaml                 # DVC pipeline definition
β”œβ”€β”€ πŸ“„ dvc.lock                 # DVC lock file
β”œβ”€β”€ πŸ“„ params.yaml              # DVC parameters
β”œβ”€β”€ πŸ“„ requirements.txt         # Python dependencies
β”œβ”€β”€ πŸ“„ Makefile                 # Automation commands
β”œβ”€β”€ πŸ“„ run_pipeline.py          # ZenML pipeline runner
└── πŸ“„ README.md                # This file

πŸ“ Commands Reference

DVC Commands

dvc init                    # Initialize DVC
dvc add <file>              # Track file with DVC
dvc push                    # Push data to remote
dvc pull                    # Pull data from remote
dvc status                  # Check local status
dvc status --cloud          # Check cloud sync status
dvc repro                   # Reproduce pipeline
dvc dag                     # Show pipeline DAG
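`dvc repro` re-runs the stages declared in `dvc.yaml`. A minimal single-stage sketch; the exact deps, params, and outs in this repo's `dvc.yaml` may differ:

```yaml
stages:
  train:
    cmd: python src/training/train_mlflow.py --model-type random_forest
    deps:
      - data/raw/reddit.csv
      - src/training/train_mlflow.py
    params:
      - train
    outs:
      - models/v1/model.pkl
```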

Docker Commands

docker-compose up -d        # Start all services
docker-compose down         # Stop all services
docker-compose ps           # List running services
docker-compose logs         # View logs
docker build -f Dockerfile.serve -t sentiment-api .

MLflow Commands

mlflow server --host 0.0.0.0 --port 5001 \
  --backend-store-uri sqlite:///mlflow.db \
  --default-artifact-root s3://mlflow-artifacts/

ZenML Commands

zenml login http://localhost:8080 --no-verify-ssl
zenml status
zenml stack list
zenml stack describe
zenml pipeline runs list

Optuna Commands

python src/training/optuna_sentiment.py \
  --n-trials 10 \
  --model-type lightgbm \
  --max-features 10000

Make Commands

make help          # Show all commands
make lint          # Run linting
make test          # Run tests
make test-cov      # Run tests with coverage
make build         # Build Docker images
make smoke         # Run smoke test
make docker-up     # Start Docker services
make docker-down   # Stop Docker services

πŸŽ“ Key Learnings & Best Practices

MLOps Principles Demonstrated

  1. Reproducibility

    • βœ… Git versioning for code
    • βœ… DVC for data versioning
    • βœ… MLflow for experiment tracking
    • βœ… Docker for environment consistency
  2. Traceability

    • βœ… Git tags for model versions
    • βœ… MLflow run tracking
    • βœ… ZenML pipeline execution logs
    • βœ… DVC pipeline provenance
  3. Automation

    • βœ… CI/CD pipeline for testing and deployment
    • βœ… Scheduled retraining (continuous training, CT)
    • βœ… Automated model deployment
  4. Monitoring

    • βœ… Prometheus metrics collection
    • βœ… Grafana dashboards
    • βœ… API health checks
    • βœ… Model performance tracking
  5. Scalability

    • βœ… Containerized services
    • βœ… Microservices architecture
    • βœ… Stateless API design
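On the monitoring side, Prometheus discovers the API through a scrape config in `monitoring/prometheus/prometheus.yml`. A minimal sketch; the job name, target host, and metrics port are assumptions, not the repo's actual config:

```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: sentiment-api
    static_configs:
      - targets: ["api:8000"]   # FastAPI service exposing /metrics
```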

πŸ“š Additional Resources

πŸ™ Acknowledgments

  • Reddit Sentiment Analysis Dataset by Himanshu-1703
  • MLOps tools: DVC, MLflow, ZenML, Optuna, Docker, FastAPI
  • Open-source community

πŸŽ‰ This project demonstrates a complete, production-ready MLOps workflow from data versioning to model deployment and monitoring!

About

A complete end-to-end MLOps project for sentiment analysis on Reddit comments
