Commit 2dbfd1c: update readme
1 parent 6262fee

1 file changed: Readme.md
56 additions & 16 deletions

````diff
@@ -9,8 +9,9 @@ This repository implements a complete ML pipeline for the Iris dataset classific
 - Parallel model training (Decision Tree, Random Forest, XGBoost)
 - Automatic model evaluation and selection
 - Model registration and versioning in Vertex AI
-- Automated deployment to Vertex AI endpoints
+- Automated deployment to FastAPI services on Cloud Run
 - Batch inference capabilities
+- Real-time streaming inference with Dataflow
 - REST API serving with FastAPI
 
 ## Key Features
````

````diff
@@ -36,11 +37,14 @@ src/ml_pipelines_kfp/
 │   └── constants.py          # Configuration constants
 ├── workflows/                # Alternative workflow implementations
 └── notebooks/                # Example notebooks and experiments
+├── dataflow/                 # Dataflow streaming pipelines
+│   └── iris_streaming_pipeline.py
 schemas/                      # Input/output schemas for Vertex AI
 memory-bank/                  # Project context and documentation
 Dockerfile                    # Container definition
 pyproject.toml                # Project dependencies
 pipeline.yaml                 # Pipeline configuration
+deploy_dataflow_streaming.sh  # Dataflow streaming deployment script
 ```
 
 ## Prerequisites
````

````diff
@@ -98,8 +102,8 @@ This will:
 - Load data from BigQuery
 - Train Decision Tree and Random Forest models in parallel
 - Evaluate and select the best model
-- Register the model in Vertex AI Model Registry
-- Deploy the model to an endpoint
+- Register the model in Vertex AI Model Registry with "blessed" alias
+- Deploy the blessed model to FastAPI service on Cloud Run
 
 ### 3. Run Inference Pipeline
 
````

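The evaluate-and-select step this hunk describes can be sketched in plain Python. The function name, scores, and threshold below are illustrative assumptions, not the pipeline's actual code:

```python
# Sketch of "evaluate and select the best model": the winner becomes the
# candidate for the "blessed" alias in the Vertex AI Model Registry.
# Model names, scores, and the 0.9 threshold are hypothetical.
def select_blessed_model(metrics, threshold=0.9):
    """Return the best-scoring model name, or None if nothing clears the bar."""
    best_name = max(metrics, key=metrics.get)
    return best_name if metrics[best_name] >= threshold else None

metrics = {"decision_tree": 0.93, "random_forest": 0.97}  # hypothetical accuracies
print(select_blessed_model(metrics))
```

Gating on a threshold (rather than always blessing the winner) keeps an under-performing run from reaching the Cloud Run deployment step.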
````diff
@@ -108,7 +112,23 @@ This will:
 python src/ml_pipelines_kfp/iris_xgboost/pipelines/iris_pipeline_inference.py
 ```
 
-### 4. Local API Server
+### 4. Real-time Streaming Inference
+
+Deploy a Dataflow streaming job for real-time inference:
+
+```bash
+# Deploy streaming pipeline (update SERVICE_URL with actual Cloud Run URL)
+./deploy_dataflow_streaming.sh
+```
+
+Start generating test data:
+
+```bash
+# Run data producer to send samples to Pub/Sub
+python src/ml_pipelines_kfp/iris_xgboost/pubsub_producer.py --project-id deeplearning-sahil
+```
+
+### 5. Local API Server
 
 ```bash
 # Run the FastAPI server locally
````
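A message the producer publishes to Pub/Sub might look like the following. This is a hedged sketch: the exact field names used by `pubsub_producer.py` are an assumption, and Pub/Sub message bodies must be bytes:

```python
import json

# Hypothetical shape of one Iris inference request sent to Pub/Sub.
# The feature names below are the standard Iris columns, assumed here;
# check pubsub_producer.py for the actual payload schema.
def make_iris_message(sepal_length, sepal_width, petal_length, petal_width):
    payload = {
        "sepal_length": sepal_length,
        "sepal_width": sepal_width,
        "petal_length": petal_length,
        "petal_width": petal_width,
    }
    return json.dumps(payload).encode("utf-8")  # Pub/Sub payloads are bytes

msg = make_iris_message(5.1, 3.5, 1.4, 0.2)
print(json.loads(msg))
```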
````diff
@@ -144,9 +164,10 @@ The project follows a component-based architecture where each ML pipeline step i
 1. **Data Component**: Loads and splits data from BigQuery
 2. **Model Components**: Implements various ML algorithms
 3. **Evaluation Component**: Compares model performance
-4. **Registry Component**: Manages model versioning
-5. **Deployment Component**: Handles endpoint deployment
+4. **Registry Component**: Manages model versioning with "blessed" aliases
+5. **Deployment Component**: Deploys blessed models to Cloud Run FastAPI services
 6. **Inference Component**: Performs batch predictions
+7. **Streaming Component**: Real-time inference via Dataflow and Pub/Sub
 
 ## Configuration
 
````

````diff
@@ -159,25 +180,44 @@ Key configuration is managed in `src/ml_pipelines_kfp/iris_xgboost/constants.py`
 ## CI/CD
 
 The repository includes GitHub Actions workflow (`.github/workflows/cicd.yaml`) that:
-- Builds Docker images
+- Builds Docker images for KFP components
+- Builds generic FastAPI inference containers
 - Pushes to Google Artifact Registry
 - Triggers on pushes to main branch
 
 ## Technologies
 
 - **Orchestration**: Kubeflow Pipelines 2.8.0
-- **Cloud Platform**: Google Cloud (Vertex AI, BigQuery, GCS)
+- **Cloud Platform**: Google Cloud (Vertex AI, BigQuery, GCS, Cloud Run, Dataflow)
 - **ML Frameworks**: scikit-learn, XGBoost
 - **API Framework**: FastAPI
+- **Streaming**: Apache Beam, Dataflow, Pub/Sub
 - **Data Processing**: Pandas, Polars, Dask
 - **Package Management**: uv, Hatchling
 
-## Set up Pub Sub
+## Deployment Architecture
 
-bash
-```
-gcloud auth application-default print-access-token
-export GOOGLE_APPLICATION_CREDENTIALS="./***.json"
-gcloud auth application-default print-access-token
-python test_pubsub.py
-```
+### Model Deployment Strategy
+
+The project uses a **blessed model pattern** for production deployments:
+
+1. **Training Pipeline**: Trains multiple models and selects the best performer
+2. **Model Registry**: Stores the winning model in Vertex AI with "blessed" alias
+3. **Deployment Pipeline**: Automatically deploys only "blessed" models to production
+4. **Cost Optimization**: Uses FastAPI on Cloud Run
+
+### Streaming Architecture
+
+Real-time inference is handled through:
+
+1. **Data Ingestion**: Pub/Sub receives real-time inference requests
+2. **Stream Processing**: Dataflow processes messages and calls FastAPI services
+3. **Model Serving**: Cloud Run hosts FastAPI containers with blessed models
+4. **Results Storage**: Predictions are written to BigQuery for monitoring
+
+### Key Benefits
+
+- **Cost Effective**: Cloud Run FastAPI services cost ~90% less than Vertex AI endpoints
+- **Scalable**: Dataflow auto-scales based on Pub/Sub message volume
+- **Reliable**: Only production-ready "blessed" models are deployed
+- **Observable**: All predictions logged to BigQuery with metadata
````
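Step 4 of the streaming architecture ("predictions written to BigQuery with metadata") can be sketched as a row-shaping helper. The column names and metadata fields below are assumptions for illustration, not the repository's actual schema:

```python
import json
from datetime import datetime, timezone

# Hypothetical sketch of shaping one prediction into a BigQuery-ready row
# with monitoring metadata; the real table schema may differ.
def to_bigquery_row(features, prediction, model_version):
    return {
        "features": json.dumps(features),  # raw request, kept for replay/debugging
        "prediction": prediction,          # class label returned by the FastAPI service
        "model_version": model_version,    # which blessed model version served it
        "predicted_at": datetime.now(timezone.utc).isoformat(),
    }

row = to_bigquery_row({"petal_length": 1.4}, "setosa", "blessed-v3")
print(row["prediction"])
```

Logging the model version alongside each prediction is what makes the "Observable" benefit above actionable: per-version accuracy and drift can be queried directly in BigQuery.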
