
Commit e10bfac

Finished Chapter05

1 parent 388f2c2 commit e10bfac

6 files changed: 176 additions & 27 deletions

Lines changed: 37 additions & 0 deletions

## Proxy-agent on Minikube

When deploying Kubeflow Pipelines on a local cluster (e.g., Minikube), the `proxy-agent` may crash because it attempts to auto-register via the Google Compute Engine metadata service and expects cloud credentials or an external proxy/back-end configuration. If you see `proxy-agent` in `CrashLoopBackOff` or `Error`, you can either disable it for local use or configure it properly for cloud use.

Quick fixes for local Minikube:

- Scale the agent to zero replicas (simple and safe):

  ```sh
  kubectl -n kubeflow scale deployment proxy-agent --replicas=0
  ```

- Or set a minimal `Hostname` so the agent doesn't query the GCE metadata server:

  ```sh
  kubectl -n kubeflow patch configmap inverse-proxy-config --type=merge \
    -p '{"data":{"Hostname":"minikube"}}'
  kubectl -n kubeflow rollout restart deployment proxy-agent
  ```

For cloud (GCP) usage: populate `inverse-proxy-config` with `Hostname`, `ProxyUrl`, and `BackendId`, and ensure the cluster has appropriate Google Application Default Credentials / service-account access so the agent can register successfully.

Note: the `PIPELINE_VERSION` used by the deploy scripts selects the upstream manifests (e.g., `1.8.5`, `2.15.2`), which may reference specific container image tags; if you get `ImagePullBackOff` errors, check the image tags in the release manifests or try a different `PIPELINE_VERSION`.

Quick helper scripts:

- `disable_proxy_agent.ps1` — PowerShell helper to set the `inverse-proxy-config` `Hostname` and scale the `proxy-agent` deployment to `0`. Useful for Minikube/local runs.
- `disable_proxy_agent.sh` — same as above for users running Bash.
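As an aside, the merge-patch payload in the second fix is plain JSON; a minimal Python sketch (not part of the commit) that builds it with `json.dumps`, which avoids shell-quoting mistakes when composing the `-p` argument:

```python
import json

# Build the merge-patch payload passed to:
#   kubectl -n kubeflow patch configmap inverse-proxy-config --type=merge -p <payload>
patch = {"data": {"Hostname": "minikube"}}
payload = json.dumps(patch)
print(payload)  # {"data": {"Hostname": "minikube"}}
```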
Lines changed: 56 additions & 0 deletions

```powershell
<#
.SYNOPSIS
Deploy Kubeflow Pipelines to a Kubernetes cluster (PowerShell version of deploy_kubeflow_pipelines.zsh)

.DESCRIPTION
Sets the PIPELINE_VERSION environment variable and applies the Kubeflow Pipelines kustomize manifests
from the official GitHub repo. Intended to be run in an elevated PowerShell session with `kubectl`
configured to point at the target cluster.

.EXAMPLE
./deploy_kubeflow_pipelines.ps1
./deploy_kubeflow_pipelines.ps1 -PipelineVersion '1.8.5'
#>

param(
    [string]
    $PipelineVersion = '2.15.2'
)

Set-StrictMode -Version Latest

function Assert-CommandExists {
    param([string]$Cmd)
    if (-not (Get-Command $Cmd -ErrorAction SilentlyContinue)) {
        Write-Error "Required command '$Cmd' not found in PATH. Install it and try again."
        exit 2
    }
}

Assert-CommandExists -Cmd 'kubectl'

$env:PIPELINE_VERSION = $PipelineVersion
Write-Host "Using PIPELINE_VERSION=$env:PIPELINE_VERSION"

try {
    Write-Host 'Applying cluster-scoped resources...'
    kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$env:PIPELINE_VERSION"

    Write-Host 'Waiting for Applications CRD to be established...'
    kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io

    Write-Host 'Applying environment manifests (dev)...'
    kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$env:PIPELINE_VERSION"

    Write-Host 'Deployment applied. Recommended verification commands:'
    Write-Host '  kubectl -n kubeflow get pods'
    Write-Host '  kubectl -n kubeflow get pods --watch'
    Write-Host 'To access the UI run: kubectl port-forward -n kubeflow svc/ml-pipeline-ui 8080:80'
    Write-Host "Then open http://localhost:8080 in your browser (Start-Process 'http://localhost:8080')."
}
catch {
    Write-Error "Deployment failed: $_"
    exit 1
}

Write-Host 'Done.'
```
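The script builds two kustomize targets from `PIPELINE_VERSION`; as an illustration (the `kustomize_refs` helper is hypothetical, not part of the commit), the same URL construction in Python, handy for eyeballing a ref before applying it:

```python
# Base path of the upstream Kubeflow Pipelines kustomize manifests,
# as used by the deploy script's `kubectl apply -k` calls.
BASE = "github.com/kubeflow/pipelines/manifests/kustomize"

def kustomize_refs(pipeline_version: str) -> dict:
    """Return the cluster-scoped and env/dev kustomize targets for a release."""
    return {
        "cluster_scoped": f"{BASE}/cluster-scoped-resources?ref={pipeline_version}",
        "env_dev": f"{BASE}/env/dev?ref={pipeline_version}",
    }

refs = kustomize_refs("2.15.2")
print(refs["cluster_scoped"])
```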
Lines changed: 34 additions & 0 deletions

```powershell
<#
.SYNOPSIS
Disable the proxy-agent for local Minikube environments

.DESCRIPTION
Sets a minimal Hostname in the inverse-proxy-config ConfigMap and scales the proxy-agent
deployment to 0 replicas to prevent CrashLoopBackOff when running Kubeflow Pipelines locally.

.EXAMPLE
./disable_proxy_agent.ps1
#>

Set-StrictMode -Version Latest

if (-not (Get-Command kubectl -ErrorAction SilentlyContinue)) {
    Write-Error 'kubectl not found in PATH. Install/configure kubectl and try again.'
    exit 2
}

Write-Host 'Patching inverse-proxy-config ConfigMap (Hostname=minikube)...'

# PowerShell has no Bash-style heredoc (<<'YAML'); pipe a here-string to kubectl instead.
@'
apiVersion: v1
kind: ConfigMap
metadata:
  name: inverse-proxy-config
  namespace: kubeflow
data:
  Hostname: minikube
'@ | kubectl -n kubeflow apply -f -

Write-Host 'Scaling proxy-agent deployment to 0 replicas...'
kubectl -n kubeflow scale deployment proxy-agent --replicas=0

Write-Host 'Done.'
```
Lines changed: 18 additions & 0 deletions

```bash
#!/usr/bin/env bash
# Disable proxy-agent for local Minikube environments
set -euo pipefail

command -v kubectl >/dev/null 2>&1 || { echo "kubectl not found in PATH" >&2; exit 2; }

cat <<EOF | kubectl -n kubeflow apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: inverse-proxy-config
  namespace: kubeflow
data:
  Hostname: minikube
EOF

kubectl -n kubeflow scale deployment proxy-agent --replicas=0
echo "Done. proxy-agent scaled to 0 and Hostname set to minikube."
```
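Since `kubectl apply -f -` accepts JSON manifests as well as YAML, the ConfigMap in the heredoc above could equally be generated from Python and piped in; a sketch under that assumption (not part of the commit):

```python
import json

# Same ConfigMap as the heredoc above, expressed as a dict. kubectl accepts
# JSON manifests, so this output can be piped to `kubectl -n kubeflow apply -f -`.
configmap = {
    "apiVersion": "v1",
    "kind": "ConfigMap",
    "metadata": {"name": "inverse-proxy-config", "namespace": "kubeflow"},
    "data": {"Hostname": "minikube"},
}
manifest = json.dumps(configmap, indent=2)
print(manifest)
```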

Chapter05/going_with_the_kubeflow/going_with_the_kubeflow/pipeline_basic.py

Lines changed: 31 additions & 27 deletions

```diff
@@ -1,15 +1,11 @@
 from typing import List
 
-from kfp import Client
-import kfp.dsl
-from kfp.v2 import dsl
-from kfp.v2.dsl import Dataset
-from kfp.v2.dsl import Input
-from kfp.v2.dsl import Model
-from kfp.v2.dsl import Output
+from kfp import dsl, Client
+from kfp.dsl import Dataset, Input, Model, Output
 
 
-@dsl.component(packages_to_install=['pandas==1.3.5'])
+
+@dsl.component(packages_to_install=['pandas==2.2.3'], base_image='python:3.11')
 def create_dataset(iris_dataset: Output[Dataset]):
     import pandas as pd
 
@@ -24,7 +20,7 @@ def create_dataset(iris_dataset: Output[Dataset]):
     df.to_csv(f)
 
 
-@dsl.component(packages_to_install=['pandas==1.3.5', 'scikit-learn==1.0.2'])
+@dsl.component(packages_to_install=['pandas==2.2.3', 'scikit-learn==1.4.2'], base_image='python:3.11')
 def normalize_dataset(
     input_iris_dataset: Input[Dataset],
     normalized_iris_dataset: Output[Dataset],
@@ -54,7 +50,7 @@ def normalize_dataset(
     df.to_csv(f)
 
 
-@dsl.component(packages_to_install=['pandas==1.3.5', 'scikit-learn==1.0.2'])
+@dsl.component(packages_to_install=['pandas==2.2.3', 'scikit-learn==1.4.2'], base_image='python:3.11')
 def train_model(
     normalized_iris_dataset: Input[Dataset],
     model: Output[Model],
@@ -80,38 +76,46 @@ def train_model(
     pickle.dump(clf, f)
 
 
-@dsl.pipeline(name='iris-training-pipeline')
+@dsl.pipeline(
+    name="iris-training-pipeline",
+    description="Iris pipeline",
+    pipeline_root="",  # optional
+)
 def my_pipeline(
     standard_scaler: bool,
     min_max_scaler: bool,
     neighbors: List[int],
 ):
     create_dataset_task = create_dataset()
+    create_dataset_task.set_caching_options(False)
 
     normalize_dataset_task = normalize_dataset(
-        input_iris_dataset=create_dataset_task.outputs['iris_dataset'],
-        standard_scaler=True,
-        min_max_scaler=False)
+        input_iris_dataset=create_dataset_task.outputs["iris_dataset"],
+        standard_scaler=standard_scaler,
+        min_max_scaler=min_max_scaler,
+    )
+    normalize_dataset_task.set_caching_options(False)
 
     with dsl.ParallelFor(neighbors) as n_neighbors:
-        train_model(
-            normalized_iris_dataset=normalize_dataset_task
-            .outputs['normalized_iris_dataset'],
-            n_neighbors=n_neighbors)
+        t = train_model(
+            normalized_iris_dataset=normalize_dataset_task.outputs["normalized_iris_dataset"],
+            n_neighbors=n_neighbors,
+        )
+        t.set_caching_options(False)
 
 
-endpoint = 'http://localhost:8080'  # as a result of port-forwarding
-# got this from running kubectl cluster-info --context kind-mlewp (this is cluster name)
-#endpoint = 'https://127.0.0.1:50663'
+endpoint = "http://localhost:8080"
 kfp_client = Client(host=endpoint)
+
 run = kfp_client.create_run_from_pipeline_func(
     my_pipeline,
-    mode=kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE,
     arguments={
-        'min_max_scaler': True,
-        'standard_scaler': False,
-        'neighbors': [3, 6, 9]
+        "min_max_scaler": True,
+        "standard_scaler": False,
+        "neighbors": [3, 6, 9],
    },
+    enable_caching=False,
 )
-url = f'{endpoint}/#/runs/details/{run.run_id}'
-print(url)
+
+url = f"{endpoint}/#/runs/details/{run.run_id}"
+print(url)
```
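The arguments passed to `create_run_from_pipeline_func` must match the typed signature of `my_pipeline` (`standard_scaler: bool`, `min_max_scaler: bool`, `neighbors: List[int]`); a small pre-submission check, sketched here for illustration only (the `validate_arguments` helper is hypothetical, not part of the commit):

```python
# Expected parameter types, mirroring my_pipeline's signature in the diff above
# (List[int] is checked here simply as `list`).
EXPECTED = {"standard_scaler": bool, "min_max_scaler": bool, "neighbors": list}

def validate_arguments(arguments: dict) -> None:
    """Raise if a pipeline argument is missing or has the wrong type."""
    for name, expected_type in EXPECTED.items():
        if name not in arguments:
            raise ValueError(f"missing pipeline argument: {name}")
        if not isinstance(arguments[name], expected_type):
            raise TypeError(f"{name} should be {expected_type.__name__}")

arguments = {"min_max_scaler": True, "standard_scaler": False, "neighbors": [3, 6, 9]}
validate_arguments(arguments)  # passes: arguments match the signature
```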

inverse-proxy-config.yaml

242 Bytes
Binary file not shown.
