Module 5: Kubeflow Pipelines & KServe
By the end of this module, you will have:
- ✅ Kubeflow Pipelines running on local Kubernetes
- ✅ Working ML pipeline (data prep → train → evaluate → deploy)
- ✅ Model deployed as REST API with KServe
- ✅ Understanding of production ML orchestration
You'll learn:
- What Kubeflow Pipelines and KServe are and when to use them
- How to build reusable pipeline components
- How to create end-to-end ML workflows
- How to deploy models as scalable APIs
- How to integrate deployed models into applications
Before starting, ensure you have:
- Docker Desktop installed and running

  ```bash
  docker --version
  docker ps  # Should connect without error
  ```

- kubectl (Kubernetes CLI)

  ```bash
  kubectl version --client
  ```

- kind (Kubernetes in Docker)

  ```bash
  kind version
  ```

- Python 3.9-3.13

  ```bash
  python3 --version
  ```

- 8GB+ RAM available for the Kubernetes cluster
  - Check Docker Desktop > Settings > Resources
```bash
cd modules/module-5

# Run automated installation
./scripts/install-kubeflow.sh
```

What this does:
- Creates kind cluster named `mlops-workshop` (if it doesn't already exist)
- Installs cert-manager (required for Kubeflow)
- Installs Kubeflow Pipelines v2.14.3
- Patches minio with a compatible image
- Waits for all components to start

Expected output:

```
✓ kind cluster created/verified
✓ Kubeflow Pipelines installed
✓ Waiting for pods to be ready...
✓ Installation complete!

Next steps:
1. kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
2. Open http://localhost:8080
```
Installation takes 5-10 minutes on first run. Monitor progress:
```bash
# Watch pod status (Ctrl+C to exit when all Running)
kubectl get pods -n kubeflow -w
```

All pods should show Running with 1/1 or 2/2 READY:

```
NAME                              READY   STATUS    RESTARTS   AGE
cache-server-xxx                  2/2     Running   0          3m
metadata-envoy-deployment-xxx     1/1     Running   0          3m
metadata-grpc-deployment-xxx      2/2     Running   0          3m
minio-xxx                         2/2     Running   0          3m
ml-pipeline-xxx                   2/2     Running   0          3m
ml-pipeline-ui-xxx                2/2     Running   0          3m
mysql-xxx                         2/2     Running   0          3m
workflow-controller-xxx           2/2     Running   0          3m
```

```bash
# In a separate terminal, keep this running
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
```

Open browser: http://localhost:8080
You should see the Kubeflow Pipelines dashboard.
```bash
cd modules/module-5

# Create virtual environment
python3 -m venv venv

# Activate it
source venv/bin/activate  # macOS/Linux

# Navigate to starter directory
cd starter

# Install all required packages
pip install -r requirements.txt
```

What gets installed:
- `kfp==2.14.3` - Kubeflow Pipelines SDK
- `pandas`, `numpy`, `scikit-learn` - ML libraries
- `setuptools`, `wheel` - Build tools (for Python 3.13+)
```bash
# Check kfp version
pip show kfp
# Should show: Version: 2.14.3

# Test imports
python3 -c "from kfp.dsl import component, pipeline; print('✓ KFP imports working')"
```

Run the verification script:

```bash
cd modules/module-5
./scripts/verify-installation.sh
```

Expected output:

```
✓ kubectl installed
✓ kind cluster running
✓ kubeflow namespace exists
✓ All pods running
✓ ml-pipeline-ui service available
✓ All checks passed!
```

If any checks fail, see the Troubleshooting section below.
Traditional ML workflow:
```bash
python prepare_data.py
python train_model.py --data=./data/train.csv
python evaluate.py --model=./models/model.pkl
kubectl apply -f deployment.yaml  # if model is good
```

Problems:
- ❌ Not reproducible (hard to rerun exactly)
- ❌ No tracking (which data? which code?)
- ❌ Manual (human runs each step)
- ❌ No dependency management (what if step fails?)
- ❌ Hard to share with team
Kubeflow Pipelines turns your ML workflow into an automated, reproducible graph:
```
┌─────────────┐
│  Data Prep  │ → train_data, test_data
└──────┬──────┘
       ↓
┌─────────────┐
│ Train Model │ → trained_model
└──────┬──────┘
       ↓
┌─────────────┐      ┌─────────────┐
│  Evaluate   │ ←─── │  test_data  │
└──────┬──────┘      └─────────────┘
       ↓
┌─────────────┐
│   Deploy    │ → REST API
│  (KServe)   │
└─────────────┘
```
Benefits:
- ✅ Reproducible - Same code + data + parameters = same results
- ✅ Automated - One click to run entire workflow
- ✅ Tracked - All inputs, outputs, and metrics logged
- ✅ Scalable - Runs on Kubernetes, auto-scales
- ✅ Shareable - Export as YAML, anyone can run
Components are self-contained code blocks that run in isolated containers:
```python
from kfp.dsl import component, Output, Dataset

@component(
    base_image="python:3.11-slim",
    packages_to_install=["pandas==2.0.3"]
)
def prepare_data(output_data: Output[Dataset]):
    """Download and prepare data"""
    import pandas as pd

    data = pd.read_csv("https://example.com/data.csv")
    data.to_csv(output_data.path, index=False)
```

Key features:
- Runs in its own container (isolated)
- Declares dependencies (`pandas`)
- Typed outputs (`Dataset`)
- Reusable across pipelines
Pipelines connect components into a workflow (DAG):
```python
from kfp.dsl import pipeline

@pipeline(name="my-ml-pipeline")
def my_pipeline():
    # Step 1: Prepare data
    prep_task = prepare_data()

    # Step 2: Train (uses step 1 output)
    train_task = train_model(
        data=prep_task.outputs["output_data"]
    )

    # Ensure order
    train_task.after(prep_task)
```

Automatic features:
- Runs steps in correct order
- Passes data between steps
- Tracks all inputs/outputs
- Handles failures
Artifacts are typed data passed between components:
| Type | Purpose | Example |
|---|---|---|
| `Dataset` | Training/test data | CSV files |
| `Model` | Trained models | Pickle, SavedModel |
| `Metrics` | Performance metrics | Accuracy, RMSE |
How it works:
```python
def train_model(
    train_data: Input[Dataset],  # Read from previous step
    model: Output[Model]         # Write for next step
):
    df = pd.read_csv(train_data.path)
    trained = fit(df)
    pickle.dump(trained, open(model.path, 'wb'))
```

- Kubeflow stores artifacts in MinIO (S3-compatible storage)
- Components read/write using `.path`
- Automatic lineage tracking
After training a model, you need to:
- Create HTTP server for predictions
- Handle scaling (0 to many replicas)
- Manage deployments (blue/green, canary)
- Monitor performance
Doing this manually is complex!
KServe is a Kubernetes-native platform that turns your model into a production REST API:
```
Your Model (model.pkl)
        ↓ Deploy
┌──────────────────────────────┐
│   KServe InferenceService    │
│                              │
│  HTTP: /v1/models/NAME:predict
│                              │
│  Auto-scaling: 0 → many pods │
│  Monitoring: metrics, logs   │
└──────────────────────────────┘
        ↓
   Your App calls API
```
What you get:
- ✅ Standard API - All models use same format
- ✅ Auto-scaling - Scale to zero (save $), scale up on traffic
- ✅ Canary deployments - Test new versions with % of traffic
- ✅ Monitoring - Request logs, latency, errors
All KServe models use this standard format:
Request:
```
POST /v1/models/MODEL_NAME:predict

{
  "instances": [
    {"user_id": 1, "n_recommendations": 5}
  ]
}
```

Response:

```json
{
  "predictions": [
    {
      "user_id": 1,
      "recommendations": [
        {"movie_id": 50, "movie_name": "Star Wars", "score": 0.89}
      ]
    }
  ]
}
```

How the pipeline and KServe fit together:

```
┌────────────────────────────────────────────┐
│             Kubeflow Pipeline              │
│                                            │
│   Data Prep → Train → Evaluate → Deploy    │
│                                   ↓        │
│                               model.pkl    │
└──────────────────────┬─────────────────────┘
                       │ Creates
                       ↓
┌────────────────────────────────────────────┐
│          KServe InferenceService           │
│                                            │
│  REST API: http://service:8080/predict    │
│                                            │
│  Your App → [Request] → [Model] → [Response]
└────────────────────────────────────────────┘
```
Complete workflow:
1. The Kubeflow Pipeline trains and evaluates the model
2. The deploy component creates a KServe InferenceService (see the sketch below)
3. KServe starts serving the model as an HTTP API
4. Your application calls the API for predictions
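To make step 2 concrete, a deploy component typically applies an InferenceService custom resource from inside the pipeline. Below is a minimal, hypothetical sketch using the official Kubernetes Python client; the workshop's actual deploy component lives in the solution code and may differ (the service name, namespace, and image here are assumptions):

```python
from kfp.dsl import component

@component(
    base_image="python:3.11-slim",
    packages_to_install=["kubernetes"]
)
def deploy_model(service_name: str = "movie-recommender"):
    """Create a KServe InferenceService that serves the recommender image."""
    from kubernetes import client, config

    # The component runs inside the cluster, so use the pod's service account
    config.load_incluster_config()
    api = client.CustomObjectsApi()

    inference_service = {
        "apiVersion": "serving.kserve.io/v1beta1",
        "kind": "InferenceService",
        "metadata": {"name": service_name, "namespace": "default"},
        "spec": {
            "predictor": {
                "containers": [
                    {
                        "name": "kserve-container",
                        # Local image loaded into the kind cluster (assumption)
                        "image": "kind.local/movie-recommender:latest",
                    }
                ]
            }
        },
    }

    # Create the custom resource; KServe picks it up and starts serving
    api.create_namespaced_custom_object(
        group="serving.kserve.io",
        version="v1beta1",
        namespace="default",
        plural="inferenceservices",
        body=inference_service,
    )
```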
Goal: Build a component to download and prepare the MovieLens dataset.
```bash
cd modules/module-5/starter

# Open in your editor
code components/data_prep.py   # VS Code
# OR
nano components/data_prep.py   # Terminal editor
# OR
open components/data_prep.py   # macOS default
```

You need to implement:
1. `@component` decorator - Configure base image and packages
2. Download dataset - Use `urllib.request.urlretrieve()`
3. Load data - Read with `pd.read_csv()`
4. Train/test split - Use `train_test_split()`
5. Save outputs - Write to `train_data.path` and `test_data.path`
6. Movie metadata - Extract genres from binary columns

Key concepts to use:
- `Output[Dataset]` for outputs
- `.path` attribute to get the file path
- `pd.read_csv()` and `df.to_csv()`
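If you want a reference point before opening the solution, here is a hedged sketch of the component's shape. The MovieLens URL, column names, and split ratio are assumptions, and the movie-metadata output is omitted; the starter and solution files define the actual details:

```python
from kfp.dsl import component, Output, Dataset

@component(
    base_image="python:3.11-slim",
    packages_to_install=["pandas==2.0.3", "scikit-learn==1.3.2"]
)
def prepare_data(
    train_data: Output[Dataset],
    test_data: Output[Dataset],
):
    """Download MovieLens 100K ratings and write train/test CSVs."""
    import urllib.request
    import zipfile
    import pandas as pd
    from sklearn.model_selection import train_test_split

    # Assumed dataset location -- check the starter code for the real one
    url = "https://files.grouplens.org/datasets/movielens/ml-100k.zip"
    urllib.request.urlretrieve(url, "ml-100k.zip")
    with zipfile.ZipFile("ml-100k.zip") as zf:
        zf.extractall(".")

    # u.data is tab-separated: user id, item id, rating, timestamp
    ratings = pd.read_csv(
        "ml-100k/u.data",
        sep="\t",
        names=["user_id", "movie_id", "rating", "timestamp"],
    )

    train_df, test_df = train_test_split(ratings, test_size=0.2, random_state=42)
    train_df.to_csv(train_data.path, index=False)
    test_df.to_csv(test_data.path, index=False)
```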
Stuck? Look at solution/components/data_prep_solution.py
Goal: Create components for training a collaborative filtering model and evaluating it.
```bash
cd modules/module-5/starter
open components/train.py
```

What you'll implement:
- Load training data using `Input[Dataset]`
- Create a user-movie matrix with numpy
- Train a `TruncatedSVD` model
- Calculate training RMSE
- Log metrics with `metrics.log_metric()`
- Save the model with pickle to `model.path`
Key concepts:
- `Input[Dataset]` for inputs, `Output[Model]` for the model
- `Output[Metrics]` for logging metrics to Kubeflow
- Matrix factorization with SVD
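A hedged sketch of one way the train component could look. The pivot-table approach, the pickled payload, and the default `n_components` are assumptions; it pairs with the data-prep sketch above rather than the workshop's actual solution:

```python
from kfp.dsl import component, Input, Output, Dataset, Model, Metrics

@component(
    base_image="python:3.11-slim",
    packages_to_install=["pandas==2.0.3", "numpy", "scikit-learn==1.3.2"]
)
def train_model(
    train_data: Input[Dataset],
    model: Output[Model],
    metrics: Output[Metrics],
    n_components: int = 20,
):
    """Fit a TruncatedSVD collaborative-filtering model on the ratings."""
    import pickle
    import numpy as np
    import pandas as pd
    from sklearn.decomposition import TruncatedSVD

    df = pd.read_csv(train_data.path)

    # Dense user x movie rating matrix; unrated cells are 0
    user_movie = df.pivot_table(
        index="user_id", columns="movie_id", values="rating", fill_value=0
    )

    svd = TruncatedSVD(n_components=n_components, random_state=42)
    user_factors = svd.fit_transform(user_movie.values)
    reconstructed = user_factors @ svd.components_

    # Training RMSE on observed ratings only
    mask = user_movie.values > 0
    rmse = float(np.sqrt(np.mean((user_movie.values[mask] - reconstructed[mask]) ** 2)))
    metrics.log_metric("train_rmse", rmse)

    # Persist the reconstructed rating matrix so evaluation can look up scores
    pred_matrix = pd.DataFrame(
        reconstructed, index=user_movie.index, columns=user_movie.columns
    )
    with open(model.path, "wb") as f:
        pickle.dump({"svd": svd, "pred_matrix": pred_matrix}, f)
```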
```bash
open components/evaluate.py
```

What you'll implement:
- Load test data and model
- Calculate predictions
- Compute RMSE and MAE
- Log evaluation metrics
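And a matching hedged sketch for evaluation, assuming the model artifact pickled by the train sketch above (the real solution may store the model differently):

```python
from kfp.dsl import component, Input, Output, Dataset, Model, Metrics

@component(
    base_image="python:3.11-slim",
    packages_to_install=["pandas==2.0.3", "numpy"]
)
def evaluate_model(
    test_data: Input[Dataset],
    model: Input[Model],
    metrics: Output[Metrics],
):
    """Compute RMSE/MAE of predicted ratings on the held-out test set."""
    import pickle
    import numpy as np
    import pandas as pd

    test_df = pd.read_csv(test_data.path)
    with open(model.path, "rb") as f:
        artifact = pickle.load(f)
    pred_matrix = artifact["pred_matrix"]

    preds, actuals = [], []
    for row in test_df.itertuples():
        # Skip users/movies unseen during training
        if row.user_id in pred_matrix.index and row.movie_id in pred_matrix.columns:
            preds.append(pred_matrix.loc[row.user_id, row.movie_id])
            actuals.append(row.rating)

    preds, actuals = np.array(preds), np.array(actuals)
    rmse = float(np.sqrt(np.mean((preds - actuals) ** 2)))
    mae = float(np.mean(np.abs(preds - actuals)))

    metrics.log_metric("test_rmse", rmse)
    metrics.log_metric("test_mae", mae)
```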
Stuck? Look at solution/components/train_solution.py and evaluate_solution.py
Goal: Connect all components into an end-to-end pipeline.
```bash
cd modules/module-5/starter
open recommendation_pipeline.py
```

You'll create a pipeline that:
1. Runs data preparation - Call `prepare_data()`
2. Trains the model - Call `train_model()` with outputs from step 1
3. Evaluates the model - Call `evaluate_model()` with test data and model
4. Connects components - Pass outputs as inputs
5. Sets dependencies - Use `.after()` to control order
Pipeline structure:
```python
@pipeline(name="movie-recommendation-pipeline")
def recommendation_pipeline():
    # Step 1
    data_task = prepare_data()

    # Step 2 (uses data from step 1)
    train_task = train_model(
        train_data=data_task.outputs["train_data"]
    )

    # Step 3 (uses test data and model)
    eval_task = evaluate_model(
        test_data=data_task.outputs["test_data"],
        model=train_task.outputs["model"]
    )
```
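At the bottom of the pipeline file you would typically compile the function to YAML with the KFP SDK so it can be uploaded to the UI. A minimal sketch (the output file name is illustrative):

```python
from kfp import compiler

if __name__ == "__main__":
    # Produce a YAML definition that can be uploaded via the Kubeflow Pipelines UI
    compiler.Compiler().compile(
        pipeline_func=recommendation_pipeline,
        package_path="recommendation_pipeline.yaml",
    )
```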
Before deploying with KServe, we need to create a Docker image that can serve predictions.

```bash
cd modules/module-5/model
ls
```

Files:
- `Dockerfile` - Container definition
- `serve.py` - Flask server implementing the KServe predict protocol
- `recommender.py` - Movie recommendation logic
- `requirements.txt` - Python dependencies
How it works:
```python
# serve.py (excerpt)
@app.route("/v1/models/recommender:predict", methods=["POST"])
def predict():
    request_data = request.get_json()
    user_id = request_data["instances"][0]["user_id"]

    # Load model from minio and predict
    recommendations = model.recommend_movies(user_id)
    return jsonify({"predictions": [recommendations]})
```

```bash
cd modules/module-5/model

# Build image
docker build -t movie-recommender:latest .
```

What this does:
- Uses Python 3.11 slim base image
- Installs pandas, scikit-learn, numpy, boto3
- Copies serve.py and recommender.py
- Exposes port 8080
- Starts Flask server
IMPORTANT: Knative Serving (used by KServe) tries to resolve image digests from Docker Hub. For local images, we need to use a special prefix:
```bash
# Tag with kind.local prefix (bypasses digest resolution)
docker tag movie-recommender:latest kind.local/movie-recommender:latest
```

Why kind.local?
- The Knative config has `registries-skipping-tag-resolving: "kind.local,ko.local,dev.local"`
- Images with these prefixes skip digest resolution
- Allows local images to work without pushing to a registry
```bash
# Load image into kind cluster
kind load docker-image kind.local/movie-recommender:latest --name mlops-workshop

# Verify image is loaded
docker exec mlops-workshop-control-plane crictl images | grep movie-recommender
```

Expected output:

```
kind.local/movie-recommender   latest   abc123def456   50MB
```
The pipeline's `pipeline-runner` service account needs permission to create InferenceServices:

```bash
cd modules/module-5

# Apply RBAC fix
kubectl apply -f kserve/rbac-fix.yaml
```

Verify permissions:

```bash
kubectl auth can-i create inferenceservices \
  --as=system:serviceaccount:kubeflow:pipeline-runner \
  -n default
```

Should return: `yes`
```bash
cd modules/module-5/solution

# Compile solution pipeline (includes deploy component)
python3 recommendation_pipeline_solution.py --output pipeline_with_deploy.yaml
```

1. Upload `pipeline_with_deploy.yaml` to the Kubeflow UI
2. Create a run with parameters:
   - `deploy_model_flag`: True (enable deployment)
   - Other parameters: use defaults
3. Start the run
Pipeline will:
- Prepare data (5 min)
- Train model (3 min)
- Evaluate model (2 min)
- Deploy to KServe (2 min)
```bash
# Check InferenceService
kubectl get inferenceservice -n default

# Expected output:
# NAME                URL                                    READY
# movie-recommender   http://movie-recommender.default...   True
```

Check predictor pods:

```bash
kubectl get pods -n default | grep movie-recommender
# Expected: 1-2 predictor pods running
```

View logs:

```bash
# Get pod name
kubectl get pods -n default -l serving.kserve.io/inferenceservice=movie-recommender

# View logs
kubectl logs -n default <pod-name>
```

```bash
# In a separate terminal
kubectl port-forward -n default pod/movie-recommender-predictor 8080:80
```

```bash
# Basic prediction
curl -X POST http://localhost:8080/v1/models/recommender:predict \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [
      {"user_id": 1, "n_recommendations": 5}
    ]
  }'
```

Expected response:
```json
{
  "predictions": [
    {
      "user_id": 1,
      "recommendations": [
        {
          "movie_id": 50,
          "movie_name": "Star Wars (1977)",
          "score": 0.89,
          "genres": ["Action", "Adventure", "Sci-Fi"]
        },
        {
          "movie_id": 181,
          "movie_name": "Return of the Jedi (1983)",
          "score": 0.87,
          "genres": ["Action", "Adventure", "Sci-Fi"]
        }
      ]
    }
  ]
}
```

If you only see `movie_id` (no names/genres), see Re-running Pipeline.
```bash
# Try different user IDs
curl -X POST http://localhost:8080/v1/models/recommender:predict \
  -H "Content-Type: application/json" \
  -d '{"instances": [{"user_id": 42, "n_recommendations": 3}]}'
```

You can also call the API from application code, for example with a JavaScript client:

```javascript
class MovieRecommenderClient {
  constructor(baseUrl = 'http://localhost:8080') {
    this.baseUrl = baseUrl;
    this.predictUrl = `${baseUrl}/v1/models/recommender:predict`;
  }

  async getRecommendations(userId, nRecommendations = 10, genre = null) {
    const payload = {
      instances: [
        {
          user_id: userId,
          n_recommendations: nRecommendations
        }
      ]
    };

    // Add genre filter if specified
    if (genre) {
      payload.instances[0].genre = genre;
    }

    const response = await fetch(this.predictUrl, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json'
      },
      body: JSON.stringify(payload)
    });

    if (!response.ok) {
      throw new Error(`HTTP error! status: ${response.status}`);
    }

    const data = await response.json();
    return data.predictions[0];
  }
}

// Usage
const client = new MovieRecommenderClient();
client.getRecommendations(1, 5)
  .then(result => {
    result.recommendations.forEach(rec => {
      console.log(`${rec.movie_name} (${rec.score.toFixed(2)})`);
      console.log(`  Genres: ${rec.genres.join(', ')}`);
    });
  })
  .catch(error => console.error('Error:', error));
```

The service supports filtering recommendations by genre:
- Action
- Adventure
- Animation
- Children
- Comedy
- Crime
- Documentary
- Drama
- Fantasy
- Film-Noir
- Horror
- Musical
- Mystery
- Romance
- Sci-Fi
- Thriller
- War
- Western
Get only Action movies:
```bash
curl -X POST http://localhost:8080/v1/models/recommender:predict \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [
      {
        "user_id": 1,
        "n_recommendations": 5,
        "genre": "Action"
      }
    ]
  }'
```

Get Comedy movies:

```bash
curl -X POST http://localhost:8080/v1/models/recommender:predict \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [
      {
        "user_id": 1,
        "n_recommendations": 5,
        "genre": "Comedy"
      }
    ]
  }'
```

How it works:
- Case-insensitive partial matching
- `"sci"` matches `"Sci-Fi"`
- `"action"` matches `"Action"`
- Only returns movies with a matching genre
Error:
minio-xxx 0/2 CrashLoopBackOff
Cause: Incompatible minio image (especially on ARM/Apple Silicon)
Fix:
```bash
# Patch with compatible image
kubectl set image deployment/minio -n kubeflow \
  minio=minio/minio:RELEASE.2025-09-07T16-13-09Z-cpuv1

# Wait for rollout
kubectl rollout status deployment/minio -n kubeflow

# Verify
kubectl get pods -n kubeflow | grep minio
# Should show: minio-xxx 2/2 Running
```

Error:
Cannot import 'setuptools.build_meta'
Cause: Python 3.13 doesn't include setuptools by default
Fix:
```bash
# Install setuptools first
pip install --upgrade pip setuptools wheel

# Then install requirements
pip install -r requirements.txt

# Verify
pip show setuptools  # Should show 65.0.0+
```

Error:
ApiException: (403) Forbidden
User 'system:serviceaccount:kubeflow:pipeline-runner' cannot create inferenceservices
Cause: Missing RBAC permissions
Fix:
```bash
# Apply RBAC
kubectl apply -f kserve/rbac-fix.yaml

# Verify
kubectl auth can-i create inferenceservices \
  --as=system:serviceaccount:kubeflow:pipeline-runner \
  -n default
# Should return: yes
```

Error:
Unable to fetch image "movie-recommender:latest"
failed to resolve image to digest: 401 Unauthorized
Cause: Knative tries to pull from Docker Hub, but image is local
Fix:
```bash
# 1. Tag with kind.local prefix
docker tag movie-recommender:latest kind.local/movie-recommender:latest

# 2. Load to kind
kind load docker-image kind.local/movie-recommender:latest --name mlops-workshop

# 3. Verify loaded
docker exec mlops-workshop-control-plane crictl images | grep movie-recommender

# 4. Update InferenceService
kubectl patch inferenceservice movie-recommender -n default --type='json' \
  -p='[{"op": "replace", "path": "/spec/predictor/containers/0/image", "value": "kind.local/movie-recommender:latest"}]'

# OR delete and recreate
kubectl delete inferenceservice movie-recommender -n default
# Then re-run pipeline with deploy_model_flag=True
```

Useful commands:

```bash
# Access UI
kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80
# View pods
kubectl get pods -n kubeflow
# View pipeline runs
kubectl get pipelineruns -n kubeflow
# View component logs
kubectl logs <pod-name> -n kubeflow
# Restart Kubeflow
kubectl rollout restart deployment -n kubeflow
```

Cleanup:

```bash
# Delete pipeline run
kubectl delete pipelinerun <run-name> -n kubeflow
# Delete InferenceService
kubectl delete inferenceservice movie-recommender -n default
# Uninstall Kubeflow
kubectl delete namespace kubeflow
kubectl delete namespace cert-manager
# Delete kind cluster
kind delete cluster --name mlops-workshop
```

Key takeaways:

- ✅ Automated ML Workflows - End-to-end pipelines in code
- ✅ Component Reusability - Build once, use everywhere
- ✅ Artifact Tracking - Automatic versioning and lineage
- ✅ Production Deployment - Models as REST APIs with KServe
- ✅ Enterprise Orchestration - Kubernetes-native ML platform
| Feature | Manual | With Kubeflow |
|---|---|---|
| Workflow | Manual steps | Automated pipeline |
| Tracking | None | All artifacts logged |
| Reproducibility | Hard | One-click rerun |
| Deployment | Manual kubectl | Automated with KServe |
| Collaboration | Hard to share | Share YAML file |
| Visibility | Logs only | Visual graph + metrics |
Extend your pipeline:
- Add hyperparameter tuning component (GridSearchCV)
- Implement A/B testing with canary deployments (10% traffic)
- Add data validation (Great Expectations)
- Create model monitoring dashboard (Prometheus + Grafana)
Production considerations:
- Set up persistent artifact storage (S3, GCS)
- Configure resource limits and auto-scaling
- Implement CI/CD for pipeline updates (GitHub Actions)
- Add authentication and authorization
Continue learning:
- Kubeflow Pipelines Documentation
- KFP SDK v2 Guide
- KServe Documentation
- Collaborative Filtering Tutorial
| Previous | Home | Next |
|---|---|---|
| ← Module 4: API Gateway & Polyglot Architecture | 🏠 Home | Module 6: Monitoring & Observability → |
MLOps Workshop | GitHub Repository
Congratulations! 🎉
You've mastered enterprise ML workflow orchestration with Kubeflow Pipelines and model serving with KServe!