MLOps Workshop Wiki

Welcome to the MLOps Workshop! This wiki provides step-by-step walkthroughs for each module.

Workshop Overview

This 6-hour hands-on workshop teaches you to build production-ready ML systems from scratch. You'll progress through 8 modules covering the complete MLOps lifecycle: from model training to production deployment with monitoring and CI/CD.


Workshop Structure

Complete Learning Path

Module 0: Setup
    ↓
Module 1: Model Training & Experiment Tracking
    ↓
Module 2: Model Packaging & Serving
    ↓
Module 3: Kubernetes Deployment
    ↓
Module 4: API Gateway & Polyglot Architecture
    ↓
Module 5: ML Pipeline Automation
    ↓
Module 6: Monitoring & Observability
    ↓
Module 7: CI/CD Pipeline
    ↓
🎉 Complete MLOps Platform!

Modules

Module 0: Environment Setup

Set up your development environment with Python, Go, Docker, Kubernetes, and all workshop dependencies.

What you'll install:

  • Python 3.9+ with ML libraries (MLflow, BentoML, Transformers)
  • Go 1.21+ for infrastructure services
  • Docker for containerization
  • kubectl and kind for local Kubernetes
  • MLflow tracking server and BentoML
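
Once the setup guide is done, a quick sanity check like the sketch below (not part of the official workshop materials, just a convenience) confirms the tools and libraries are reachable before you move on:

```python
"""Environment sanity check -- a minimal sketch, not an official workshop script."""
import shutil
import sys

# Command-line tools the workshop expects on PATH
for tool in ("docker", "kubectl", "kind", "go"):
    print(f"{tool:10s} {'found' if shutil.which(tool) else 'MISSING'}")

# Python libraries used in later modules
for pkg in ("mlflow", "bentoml", "transformers"):
    try:
        mod = __import__(pkg)
        print(f"{pkg:14s} {getattr(mod, '__version__', 'installed')}")
    except ImportError:
        print(f"{pkg:14s} MISSING -- install before Module 1")

print(f"Python {sys.version.split()[0]} (3.9+ required)")
```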

Start Module 0: Setup Guide


Module 1: Model Training & Experiment Tracking

Train a sentiment analysis model with Hugging Face transformers and track experiments using MLflow.

What you'll learn:

  • ✅ Fine-tune DistilBERT for sentiment classification
  • ✅ Track experiments with MLflow (parameters, metrics, models)
  • ✅ Use MLflow Model Registry for version management
  • ✅ Compare training runs and select best models
  • ✅ Build production-ready training scripts
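
To give a feel for the tracking calls before you open the exercises, here is a minimal sketch. The module's real training script fine-tunes DistilBERT; the metrics below are made up so the MLflow API stands out, and it assumes the tracking server from Module 0 is running on its default port:

```python
"""Minimal MLflow tracking sketch (illustrative only)."""
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")   # assumes the Module 0 tracking server
mlflow.set_experiment("sentiment-analysis")

with mlflow.start_run(run_name="distilbert-baseline"):
    # Parameters: hyperparameters you want to compare across runs
    mlflow.log_params({"model_name": "distilbert-base-uncased", "epochs": 3, "lr": 2e-5})

    # Metrics: logged per epoch so runs can be compared in the MLflow UI
    for epoch, acc in enumerate([0.81, 0.88, 0.91], start=1):
        mlflow.log_metric("val_accuracy", acc, step=epoch)

    run_id = mlflow.active_run().info.run_id

# Registering a run's model creates the next version in the Model Registry.
# Assumes the training script logged a model under the "model" artifact path.
mlflow.register_model(f"runs:/{run_id}/model", "sentiment-classifier")
```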

Exercises:

  1. Exercise 1: Basic Training with MLflow
  2. Exercise 2: Model Registry Workflow

Start Module 1: MLflow & Experiment Tracking


Module 2: Model Packaging & Serving

Package your trained model as a production-ready REST API using BentoML 1.4+.

What you'll learn:

  • ✅ BentoML 1.4+ class-based service architecture
  • ✅ Pydantic v2 validation for type-safe APIs
  • ✅ Error handling and structured logging
  • ✅ Batch processing for higher throughput
  • ✅ Docker containerization
  • ✅ OpenAPI/Swagger documentation
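
The shape of a class-based service is worth seeing up front. The sketch below is a stripped-down illustration: it loads a stock Transformers pipeline rather than the Module 1 model, and the exercises layer Pydantic schemas, batching, and error handling on top of this shape:

```python
"""Sketch of a class-based BentoML service (illustrative, not the exercise code)."""
import bentoml
from transformers import pipeline


@bentoml.service(resources={"cpu": "2"}, traffic={"timeout": 30})
class SentimentService:
    def __init__(self) -> None:
        # The workshop loads the model promoted in Module 1; a stock
        # pipeline keeps this sketch self-contained.
        self.classifier = pipeline("sentiment-analysis")

    @bentoml.api
    def predict(self, text: str) -> dict:
        result = self.classifier(text)[0]
        return {"label": result["label"], "score": float(result["score"])}
```

Serving this file with `bentoml serve` exposes the endpoint along with auto-generated OpenAPI/Swagger docs.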

Exercises:

  1. Exercise 1: Basic BentoML Service
  2. Exercise 2: Production Features

Start Module 2: BentoML & Model Serving


Module 3: Kubernetes Deployment

Deploy your containerized ML service to Kubernetes with production-grade configuration.

What you'll learn:

  • ✅ Kubernetes fundamentals (Pods, Deployments, Services)
  • ✅ Resource management (requests, limits, QoS)
  • ✅ Health probes (startup, liveness, readiness)
  • ✅ Horizontal Pod Autoscaling (HPA)
  • ✅ ConfigMaps for configuration management
  • ✅ High availability and security patterns
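
The module itself works with YAML manifests and kubectl; as a companion, the official Kubernetes Python client is handy for inspecting what you deployed. The sketch below assumes a hypothetical sentiment-api Deployment in an mlops namespace:

```python
"""Companion sketch: inspect a Deployment with the official Kubernetes Python client."""
from kubernetes import client, config

config.load_kube_config()            # uses your local kubeconfig (e.g. the kind cluster)
apps = client.AppsV1Api()

dep = apps.read_namespaced_deployment(name="sentiment-api", namespace="mlops")
print("replicas ready:", dep.status.ready_replicas, "/", dep.spec.replicas)

# Resource requests/limits and probes live on the pod template's containers
for c in dep.spec.template.spec.containers:
    print(c.name, "requests:", c.resources.requests, "limits:", c.resources.limits)
    print("  liveness probe:", c.liveness_probe is not None,
          "| readiness probe:", c.readiness_probe is not None)
```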

Exercises:

  1. Exercise 1: Basic Deployment
  2. Exercise 2: Production Configuration
  3. Exercise 3: Auto-scaling & HA

Start Module 3: Kubernetes Deployment


Module 4: API Gateway & Polyglot Architecture

Build a high-performance API gateway in Go to front your ML services.

What you'll learn:

  • ✅ Why Go for infrastructure (67% resource reduction)
  • ✅ Reverse proxy patterns
  • ✅ Middleware (logging, CORS, rate limiting)
  • ✅ Health checks and circuit breakers
  • ✅ Prometheus metrics integration
  • ✅ Polyglot architecture benefits

Exercises:

  1. Exercise 1: Basic Reverse Proxy
  2. Exercise 2: Production Middleware

Start Module 4: Go API Gateway


Module 5: ML Pipeline Automation

Orchestrate end-to-end ML workflows with Kubeflow Pipelines.

What you'll learn:

  • ✅ Kubeflow Pipelines components and DAGs
  • ✅ Artifact tracking and versioning
  • ✅ Pipeline orchestration patterns
  • ✅ KServe for model serving
  • ✅ Multi-model deployment strategies
  • ✅ Automated retraining workflows
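
As a preview of the component/DAG model, here is a minimal KFP v2 sketch with two placeholder components; the workshop's real pipeline adds data preparation, evaluation, and KServe deployment steps:

```python
"""Minimal Kubeflow Pipelines v2 sketch: two lightweight components wired into a DAG."""
from kfp import dsl, compiler


@dsl.component(base_image="python:3.9")
def prepare_data(sample_size: int) -> str:
    # A real component would pull and clean the dataset; here we just describe it.
    return f"dataset with {sample_size} rows"


@dsl.component(base_image="python:3.9")
def train_model(dataset: str) -> str:
    return f"model trained on {dataset}"


@dsl.pipeline(name="sentiment-training-pipeline")
def training_pipeline(sample_size: int = 1000):
    data_task = prepare_data(sample_size=sample_size)
    train_model(dataset=data_task.output)       # DAG edge: training depends on data prep


if __name__ == "__main__":
    compiler.Compiler().compile(training_pipeline, "pipeline.yaml")
```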

Exercises:

  1. Exercise 1: Data Preparation Component
  2. Exercise 2: Training & Evaluation Components
  3. Exercise 3: Pipeline Orchestration

Start Module 5: Kubeflow Pipelines


Module 6: Monitoring & Observability

Set up production monitoring with Prometheus and Grafana.

What you'll learn:

  • ✅ Prometheus for metrics collection
  • ✅ PromQL queries and aggregation
  • ✅ Alerting rules and Alertmanager
  • ✅ Grafana dashboards for visualization
  • ✅ ML-specific metrics (prediction latency, model performance)
  • ✅ SLO/SLA monitoring
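
The gateway and BentoML already expose standard metrics; for ML-specific ones, the official prometheus_client library is the usual route. The sketch below (metric names and the fake predict function are illustrative) defines a latency histogram and a per-label counter:

```python
"""Sketch of custom ML metrics with prometheus_client (illustrative names)."""
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

PREDICTION_LATENCY = Histogram(
    "prediction_latency_seconds", "Time spent producing a prediction"
)
PREDICTIONS_TOTAL = Counter(
    "predictions_total", "Predictions served, by label", ["label"]
)


@PREDICTION_LATENCY.time()                      # observes wall-clock time per call
def predict(text: str) -> str:
    time.sleep(random.uniform(0.01, 0.05))      # stand-in for model inference
    return random.choice(["POSITIVE", "NEGATIVE"])


if __name__ == "__main__":
    start_http_server(8001)                     # metrics exposed at :8001/metrics
    while True:
        PREDICTIONS_TOTAL.labels(label=predict("hello")).inc()
```

A PromQL query such as `histogram_quantile(0.95, rate(prediction_latency_seconds_bucket[5m]))` then gives the p95 prediction latency for dashboards and alerting rules.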

Exercises:

  1. Exercise 2: Alerting Rules
  2. Exercise 3: Grafana Dashboard

Start Module 6: Prometheus & Grafana


Module 7: CI/CD Pipeline

Automate your ML deployment pipeline with GitHub Actions.

What you'll learn:

  • ✅ GitHub Actions workflow syntax
  • ✅ Multi-stage CI/CD (build, test, deploy)
  • ✅ Security scanning (Trivy, Snyk)
  • ✅ Multi-environment deployment (dev → staging → prod)
  • ✅ Approval gates and notifications
  • ✅ Rollback strategies
  • ✅ GitOps principles

Workflows:

  1. Step 1: Basic Build
  2. Step 2: Build & Test
  3. Step 3: Build, Test & Deploy
  4. Step 4: Production-Ready Pipeline

Start Module 7: GitHub Actions CI/CD


Learning Approach

Scaffolded Exercises

This workshop uses a hands-on scaffolded approach:

What you get:

  • Complete file structure and imports
  • 80-90% of code already written
  • TODOs with inline hints

What you implement:

  • Specific function calls (1-3 lines per TODO)
  • Key parameter values
  • Critical configuration
  • ~10-20% of each exercise
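
For a concrete picture of the scaffold style, a TODO typically looks like the hypothetical snippet below (not taken from an actual exercise):

```python
import mlflow


def train(config: dict):
    """Scaffolded training entry point (hypothetical illustration)."""
    mlflow.set_experiment(config["experiment_name"])

    # TODO: start an MLflow run and log the hyperparameters in `config`
    # HINT: mlflow.start_run() is a context manager; mlflow.log_params()
    #       takes a dict (1-2 lines)
    ...
```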

Workshop Goals

By the end of this workshop, you will be able to:

Technical Skills

  • ✅ Train ML models with experiment tracking (MLflow)
  • ✅ Package models as production-ready APIs (BentoML)
  • ✅ Deploy services to Kubernetes with auto-scaling
  • ✅ Build high-performance infrastructure (Go)
  • ✅ Orchestrate ML workflows (Kubeflow)
  • ✅ Monitor model performance in production (Prometheus/Grafana)
  • ✅ Automate deployments with CI/CD (GitHub Actions)

Additional Resources

Workshop Guides

  • Setup Guide - Detailed environment setup instructions

Getting Help

If you encounter issues:

  1. Check the module's Troubleshooting section - Each module has common issues and fixes
  2. Review the Troubleshooting Guide - Comprehensive troubleshooting resource
  3. Check solution files - Located in modules/module-X/solution/

Quick Start

Ready to begin? Follow these steps:

1. Setup Your Environment

Start with Module 0 to install all required tools:

Module 0: Setup Guide

2. Follow Modules in Order

Each module builds on the previous. Do not skip modules.

Module 0 → Module 1 → Module 2 → ... → Module 7

3. Complete All Exercises

Each module has hands-on exercises with TODOs. Fill in the blanks.

4. Check Solutions if Stuck

If an exercise has you stuck, compare your work against the solution files in modules/module-X/solution/.


What You'll Build

By the end of this workshop, you'll have a complete, production-ready MLOps platform:

Components:

  • Model Training: MLflow tracking + model registry
  • Model Serving: BentoML API + Docker containers
  • Orchestration: Kubernetes with auto-scaling
  • API Gateway: Go reverse proxy + middleware
  • ML Pipelines: Kubeflow for workflow automation
  • Monitoring: Prometheus metrics + Grafana dashboards
  • CI/CD: GitHub Actions for automated deployments

Let's Get Started!

Begin with Module 0: Setup Guide