# MLOps Workshop Wiki

Welcome to the **MLOps Workshop**! This wiki provides step-by-step walkthroughs for each module.

## Workshop Overview

This **6-hour hands-on workshop** teaches you to build production-ready ML systems from scratch. You'll progress through 8 modules covering the complete MLOps lifecycle: from model training to production deployment with monitoring and CI/CD.

### Platform Support

| Platform | How to run |
|---|---|
| macOS (Intel & Apple Silicon) | Local — **[Module 0 → Option A](Module-0#option-a-macos-local-setup)** |
| Linux (Ubuntu 20.04+) | Local — follow Option A, `apt` instead of `brew` |
| Windows 10/11 (native) | Local — **[Module 0 → Option D](Module-0#option-d-windows-native-powershell)** |
| Windows 10/11 (WSL 2) | Local — **[Module 0 → Option C](Module-0#option-c-windows-wsl-2)** |
| Any (browser, zero install) | **[Module 0 → Option B: Codespaces](Module-0#option-b-github-codespaces)** |

---

## Workshop Structure

### Complete Learning Path

```
Module 0: Setup
    ↓
Module 1: Model Training & Experiment Tracking
    ↓
Module 2: Model Packaging & Serving
    ↓
Module 3: Kubernetes Deployment
    ↓
Module 4: API Gateway & Polyglot Architecture
    ↓
Module 5: ML Pipeline Automation
    ↓
Module 6: Monitoring & Observability
    ↓
Module 7: CI/CD Pipeline
    ↓
🎉 Complete MLOps Platform!
```

---

## Modules {#modules}

### Module 0: Environment Setup

Set up your development environment with Python, Go, Docker, Kubernetes, and all workshop dependencies.

**What you'll install:**
- Python 3.9+ with ML libraries (MLflow, BentoML, Transformers)
- Go 1.21+ for infrastructure services
- Docker for containerization
- kubectl and kind for local Kubernetes
- MLflow tracking server and BentoML

→ **[Start Module 0: Setup Guide](Module-0)**

---

### Module 1: Model Training & Experiment Tracking

Train a sentiment analysis model with Hugging Face transformers and track experiments using MLflow.

**What you'll learn:**
- ✅ Fine-tune DistilBERT for sentiment classification
- ✅ Track experiments with MLflow (parameters, metrics, models)
- ✅ Use MLflow Model Registry for version management
- ✅ Compare training runs and select best models
- ✅ Build production-ready training scripts

**Exercises:**
1. **Exercise 1:** Basic Training with MLflow
2. **Exercise 2:** Model Registry Workflow

→ **[Start Module 1: MLflow & Experiment Tracking](Module-1)**

---

### Module 2: Model Packaging & Serving

Package your trained model as a production-ready REST API using BentoML 1.4+.

**What you'll learn:**
- ✅ BentoML 1.4+ class-based service architecture
- ✅ Pydantic v2 validation for type-safe APIs
- ✅ Error handling and structured logging
- ✅ Batch processing for higher throughput
- ✅ Docker containerization
- ✅ OpenAPI/Swagger documentation

**Exercises:**
1. **Exercise 1:** Basic BentoML Service
2. **Exercise 2:** Production Features

→ **[Start Module 2: BentoML & Model Serving](Module-2)**

---

### Module 3: Kubernetes Deployment

Deploy your containerized ML service to Kubernetes with production-grade configuration.

**What you'll learn:**
- ✅ Kubernetes fundamentals (Pods, Deployments, Services)
- ✅ Resource management (requests, limits, QoS)
- ✅ Health probes (startup, liveness, readiness)
- ✅ Horizontal Pod Autoscaling (HPA)
- ✅ ConfigMaps for configuration management
- ✅ High availability and security patterns

**Exercises:**
1. **Exercise 1:** Basic Deployment
2. **Exercise 2:** Production Configuration
3. **Exercise 3:** Auto-scaling & HA
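
The scaling rule behind Horizontal Pod Autoscaling can be sketched in a few lines of Python. This is an illustrative sketch, not workshop code: the function name, its default values, and the sample numbers are assumptions. The formula and the 10% default tolerance follow the Kubernetes HPA documentation; the real controller adds behavior that is omitted here, such as stabilization windows and pod readiness handling.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,    # assumed bounds, set per HPA object in practice
    max_replicas: int = 10,
    tolerance: float = 0.1,   # Kubernetes default tolerance
) -> int:
    """Return the replica count suggested by the basic HPA scaling rule."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        # desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)
        desired = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

For example, `desired_replicas(3, 90, 60)` returns `5`: three pods at 90% utilization against a 60% target give a ratio of 1.5, and 4.5 rounds up to 5.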

→ **[Start Module 3: Kubernetes Deployment](Module-3)**

---

### Module 4: API Gateway & Polyglot Architecture

Build a high-performance API gateway in Go to front your ML services.

**What you'll learn:**
- ✅ Why Go for infrastructure (67% resource reduction)
- ✅ Reverse proxy patterns
- ✅ Middleware (logging, CORS, rate limiting)
- ✅ Health checks and circuit breakers
- ✅ Prometheus metrics integration
- ✅ Polyglot architecture benefits

**Exercises:**
1. **Exercise 1:** Basic Reverse Proxy
2. **Exercise 2:** Production Middleware
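
The workshop builds its gateway in Go, but the idea behind rate-limiting middleware is language-agnostic. The Python sketch below shows one common algorithm, the token bucket. It is an assumption that this matches the module's approach; the class name, parameters, and injectable clock are hypothetical and exist only for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(
        self,
        rate: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(capacity)  # start with a full bucket
        self.last = clock()

    def allow(self) -> bool:
        """Return True if a request may proceed, consuming one token."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A middleware would typically call `allow()` once per incoming request and respond with HTTP 429 when it returns `False`.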

→ **[Start Module 4: Go API Gateway](Module-4)**

---

### Module 5: ML Pipeline Automation

Orchestrate end-to-end ML workflows with Kubeflow Pipelines.

**What you'll learn:**
- ✅ Kubeflow Pipelines components and DAGs
- ✅ Artifact tracking and versioning
- ✅ Pipeline orchestration patterns
- ✅ KServe for model serving
- ✅ Multi-model deployment strategies
- ✅ Automated retraining workflows

**Exercises:**
1. **Exercise 1:** Data Preparation Component
2. **Exercise 2:** Training & Evaluation Components
3. **Exercise 3:** Pipeline Orchestration
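
A pipeline is a directed acyclic graph (DAG) of steps, and an orchestrator runs each step only after its dependencies finish. The sketch below illustrates that ordering with Python's standard-library `graphlib` (Python 3.9+). It does not use the Kubeflow Pipelines SDK, and the step names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
PIPELINE = {
    "prepare_data": set(),
    "train": {"prepare_data"},
    "evaluate": {"train"},
    "deploy": {"evaluate"},
}


def execution_order(dag: dict) -> list:
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())


def is_valid_dag(dag: dict) -> bool:
    """Return False if the dependency graph contains a cycle."""
    try:
        TopologicalSorter(dag).prepare()
    except CycleError:
        return False
    return True


if __name__ == "__main__":
    for step in execution_order(PIPELINE):
        print(f"running {step}")
```

Because each step here depends on the one before it, only one execution order is valid. In a wider graph, steps with no dependency between them could run in parallel.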

→ **[Start Module 5: Kubeflow Pipelines](Module-5)**

---

### Module 6: Monitoring & Observability

Set up production monitoring with Prometheus and Grafana.

**What you'll learn:**
- ✅ Prometheus for metrics collection
- ✅ PromQL queries and aggregation
- ✅ Alerting rules and Alertmanager
- ✅ Grafana dashboards for visualization
- ✅ ML-specific metrics (prediction latency, model performance)
- ✅ SLO/SLA monitoring

**Exercises:**
1. **Exercise 2:** Alerting Rules
2. **Exercise 3:** Grafana Dashboard

→ **[Start Module 6: Prometheus & Grafana](Module-6)**

---

### Module 7: CI/CD Pipeline

Automate your ML deployment pipeline with GitHub Actions.

**What you'll learn:**
- ✅ GitHub Actions workflow syntax
- ✅ Multi-stage CI/CD (build, test, deploy)
- ✅ Security scanning (Trivy, Snyk)
- ✅ Multi-environment deployment (dev → staging → prod)
- ✅ Approval gates and notifications
- ✅ Rollback strategies
- ✅ GitOps principles

**Workflows:**
1. **Step 1:** Basic Build
2. **Step 2:** Build & Test
3. **Step 3:** Build, Test & Deploy
4. **Step 4:** Production-Ready Pipeline

→ **[Start Module 7: GitHub Actions CI/CD](Module-7)**

---

## Installing on Windows (Native PowerShell)

Run the workshop natively on Windows 10/11 using PowerShell — no WSL required.

### Prerequisites

| Tool | Install |
|---|---|
| Python 3.11+ | [python.org](https://www.python.org/downloads/) — check **"Add to PATH"** during install |
| Docker Desktop | [docker.com](https://www.docker.com/products/docker-desktop/) |
| kind | `winget install Kubernetes.kind` or download from [kind.sigs.k8s.io](https://kind.sigs.k8s.io/) |
| kubectl | `winget install Kubernetes.kubectl` |
| Go 1.21+ | `winget install GoLang.Go` |
| Git | [git-scm.com](https://git-scm.com/download/win) |

### Setup

**1. Clone the repo**

```powershell
git clone <repo-url>
cd ml-con-workshop
```

**2. Create a Python virtual environment**

```powershell
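python -m venv venv
# If activation is blocked by the execution policy, allow scripts for this session first:
#   Set-ExecutionPolicy -Scope Process -ExecutionPolicy RemoteSigned
venv\Scripts\Activate.ps1
# Upgrade pip through the interpreter; on Windows, pip.exe cannot replace itself while running
python -m pip install --upgrade pip
pip install -r requirements.txt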
```

**3. Verify Docker, kind, and kubectl**

```powershell
docker version
kind version
kubectl version --client
```
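
If you would rather check every prerequisite in one pass, a small Python script can look each tool up on `PATH`. This is an optional sketch, not part of the workshop repo: the function name and the tool list (taken from the Prerequisites table) are assumptions, and it only confirms that each executable is found, not that its version is recent enough.

```python
import shutil

# Executables from the Prerequisites table (assumed command names).
REQUIRED_TOOLS = ["python", "docker", "kind", "kubectl", "go", "git"]


def find_missing(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All workshop tools found on PATH.")
```

Run it from the activated virtual environment. Any tool it reports as missing needs to be installed, or its install directory added to `PATH`.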

**4. Create a local Kubernetes cluster**

```powershell
kind create cluster --name ml-workshop
kubectl cluster-info --context kind-ml-workshop
```

> **Note:** Throughout the wiki, bash snippets like `source venv/bin/activate` become `venv\Scripts\Activate.ps1` in PowerShell, and paths use `\` instead of `/`.

---

## Learning Approach

### Scaffolded Exercises

This workshop uses a **hands-on scaffolded approach**:

✅ **What you get:**
- Complete file structure and imports
- 80-90% of code already written
- TODOs with inline hints

✅ **What you implement:**
- Specific function calls (1-3 lines per TODO)
- Key parameter values
- Critical configuration
- ~10-20% of each exercise

---

## Workshop Goals

By the end of this workshop, you will be able to:

### Technical Skills
- ✅ Train ML models with experiment tracking (MLflow)
- ✅ Package models as production-ready APIs (BentoML)
- ✅ Deploy services to Kubernetes with auto-scaling
- ✅ Build high-performance infrastructure (Go)
- ✅ Orchestrate ML workflows (Kubeflow)
- ✅ Monitor model performance in production (Prometheus/Grafana)
- ✅ Automate deployments with CI/CD (GitHub Actions)

---

## Additional Resources

### Workshop Guides
- **[Setup Guide](Module-0)** - Detailed environment setup instructions

### External Documentation
- [MLflow Documentation](https://mlflow.org/docs/latest/index.html)
- [BentoML Documentation](https://docs.bentoml.com/)
- [Kubernetes Documentation](https://kubernetes.io/docs/)
- [Kubeflow Documentation](https://www.kubeflow.org/docs/)
- [Prometheus Documentation](https://prometheus.io/docs/)
- [GitHub Actions Documentation](https://docs.github.com/en/actions)

---

## Getting Help

If you encounter issues:

1. **Check the module's Troubleshooting section** - Each module has common issues and fixes
2. **Review the [Troubleshooting Guide](Troubleshooting.md)** - Comprehensive troubleshooting resource
3. **Check solution files** - Located in `modules/module-X/solution/`

---

## Quick Start

Ready to begin? Follow these steps:

### 1. Set Up Your Environment
Start with Module 0 to install all required tools:

→ **[Module 0: Setup Guide](Module-0)**

### 2. Follow Modules in Order

Each module builds on the previous. **Do not skip modules.**

```
Module 0 → Module 1 → Module 2 → ... → Module 7
```

### 3. Complete All Exercises

Each module has hands-on exercises with TODOs. Fill in the blanks.

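### 4. Check the Solution If Stuck

Each module's solution files are in `modules/module-X/solution/`. Try the TODOs first, then compare your work against the solution.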

---

## What You'll Build

By the end of this workshop, you'll have a **complete, production-ready MLOps platform**. The diagram below shows how the pieces you build in each module fit together:

```mermaid
flowchart TB
    subgraph DEV["Developer workflow"]
        GH[GitHub repo]
        CI[GitHub Actions<br/>Module 7]
        GH -->|push| CI
    end

    subgraph TRAIN["Training plane — Modules 1 & 5"]
        HF[Hugging Face Hub]
        KFP[Kubeflow Pipelines<br/>Module 5]
        MLF[(MLflow<br/>tracking + registry<br/>Module 1)]
        HF --> KFP
        KFP -->|log runs & register models| MLF
    end

    subgraph SERVE["Serving plane — Modules 2, 3, 4"]
        U((User / client))
        GW[Go API gateway<br/>Module 4]
        ML[BentoML sentiment API<br/>Module 2]
        K8S[(Kubernetes / kind<br/>Module 3)]
        U -->|HTTP| GW -->|reverse proxy| ML
        ML -.runs on.-> K8S
        GW -.runs on.-> K8S
    end

    subgraph OBS["Observability — Module 6"]
        PROM[Prometheus]
        GRAF[Grafana dashboards]
        PROM --> GRAF
    end

    MLF -->|load model| ML
    CI -->|build & push images| REG[(ghcr.io)]
    REG -->|pull| K8S
    ML -->|/metrics| PROM
    GW -->|/metrics| PROM
```

**Components:**
- **Model Training:** MLflow tracking + model registry (Module 1)
- **Model Serving:** BentoML API + Docker containers (Module 2)
- **Orchestration:** Kubernetes with auto-scaling (Module 3)
- **API Gateway:** Go reverse proxy + middleware (Module 4)
- **ML Pipelines:** Kubeflow for workflow automation (Module 5)
- **Monitoring:** Prometheus metrics + Grafana dashboards (Module 6)
- **CI/CD:** GitHub Actions for automated deployments (Module 7)

---

## Let's Get Started!

→ **[Begin with Module 0: Setup Guide](Module-0)**