@@ -9,8 +9,9 @@ This repository implements a complete ML pipeline for the Iris dataset classific
 - Parallel model training (Decision Tree, Random Forest, XGBoost)
 - Automatic model evaluation and selection
 - Model registration and versioning in Vertex AI
-- Automated deployment to Vertex AI endpoints
+- Automated deployment to FastAPI services on Cloud Run
 - Batch inference capabilities
+- Real-time streaming inference with Dataflow
 - REST API serving with FastAPI
 
 ## Key Features
@@ -36,11 +37,14 @@ src/ml_pipelines_kfp/
 │   └── constants.py            # Configuration constants
 ├── workflows/                  # Alternative workflow implementations
 └── notebooks/                  # Example notebooks and experiments
+    ├── dataflow/               # Dataflow streaming pipelines
+    │   └── iris_streaming_pipeline.py
 schemas/                        # Input/output schemas for Vertex AI
 memory-bank/                    # Project context and documentation
 Dockerfile                      # Container definition
 pyproject.toml                  # Project dependencies
 pipeline.yaml                   # Pipeline configuration
+deploy_dataflow_streaming.sh    # Dataflow streaming deployment script
 ```
 
 ## Prerequisites
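The `deploy_dataflow_streaming.sh` entry added to the tree above is only named, never shown. A minimal sketch of what such a launcher could contain — every flag, path, and default below is an assumption for illustration, not the repository's actual script:

``` bash
#!/usr/bin/env bash
# Hypothetical sketch of deploy_dataflow_streaming.sh.
# Flags, paths, and defaults are illustrative assumptions.
set -euo pipefail

PROJECT_ID="${PROJECT_ID:-deeplearning-sahil}"
REGION="${REGION:-us-central1}"
SERVICE_URL="${SERVICE_URL:-https://example-service.run.app}"  # replace with the real Cloud Run URL

# Assemble the Dataflow launch command; DRY_RUN=1 (the default here) prints it
# instead of submitting, so the sketch can run without GCP credentials.
CMD=(python src/ml_pipelines_kfp/iris_xgboost/notebooks/dataflow/iris_streaming_pipeline.py
  --runner DataflowRunner
  --project "$PROJECT_ID"
  --region "$REGION"
  --streaming
  --service_url "$SERVICE_URL")

if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "${CMD[@]}"
else
  "${CMD[@]}"
fi
```

Setting `DRY_RUN=0` would actually submit the job, which requires authenticated `gcloud` credentials and a Dataflow-enabled project.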
@@ -98,8 +102,8 @@ This will:
 - Load data from BigQuery
 - Train Decision Tree and Random Forest models in parallel
 - Evaluate and select the best model
-- Register the model in Vertex AI Model Registry
-- Deploy the model to an endpoint
+- Register the model in Vertex AI Model Registry with the "blessed" alias
+- Deploy the blessed model to a FastAPI service on Cloud Run
 
 ### 3. Run Inference Pipeline
 
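The "register with the blessed alias" gate in the step list above can be pictured in plain Python. The threshold value and function name below are illustrative assumptions; the real pipeline would attach the alias through the Vertex AI SDK rather than return a string:

```python
# Sketch of the selection gate behind the "blessed" alias.
# The 0.9 threshold and helper name are assumptions, not the pipeline's values.
def select_blessed_model(evaluations, threshold=0.9):
    """Return the best model's name if it clears the bar, else None."""
    best_name = max(evaluations, key=evaluations.get)
    # Only a model above the threshold earns the "blessed" alias and deploys.
    return best_name if evaluations[best_name] >= threshold else None

scores = {"decision_tree": 0.93, "random_forest": 0.96}
print(select_blessed_model(scores))  # random_forest
```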
@@ -108,7 +112,23 @@ This will:
 python src/ml_pipelines_kfp/iris_xgboost/pipelines/iris_pipeline_inference.py
 ```
 
-### 4. Local API Server
+### 4. Real-time Streaming Inference
+
+Deploy a Dataflow streaming job for real-time inference:
+
+``` bash
+# Deploy the streaming pipeline (update SERVICE_URL with the actual Cloud Run URL)
+./deploy_dataflow_streaming.sh
+```
+
+Start generating test data:
+
+``` bash
+# Run the data producer to send samples to Pub/Sub
+python src/ml_pipelines_kfp/iris_xgboost/pubsub_producer.py --project-id deeplearning-sahil
+```
+
+### 5. Local API Server
 
 ``` bash
 # Run the FastAPI server locally
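The producer invoked above publishes Iris samples to Pub/Sub. A sketch of the kind of message it might build — the feature names, topic, and helper are assumptions, not the repository's actual schema:

```python
import json

# Illustrative message builder; the JSON schema is an assumption.
def make_iris_message(sepal_length, sepal_width, petal_length, petal_width):
    """Encode one Iris sample as the UTF-8 JSON bytes Pub/Sub carries."""
    payload = {
        "sepal_length": sepal_length,
        "sepal_width": sepal_width,
        "petal_length": petal_length,
        "petal_width": petal_width,
    }
    return json.dumps(payload).encode("utf-8")

msg = make_iris_message(5.1, 3.5, 1.4, 0.2)

# Publishing would go through google-cloud-pubsub, roughly:
#   publisher = pubsub_v1.PublisherClient()
#   topic = publisher.topic_path("deeplearning-sahil", "iris-topic")  # topic name assumed
#   publisher.publish(topic, msg)
print(json.loads(msg)["sepal_length"])  # 5.1
```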
@@ -144,9 +164,10 @@ The project follows a component-based architecture where each ML pipeline step i
 1. **Data Component**: Loads and splits data from BigQuery
 2. **Model Components**: Implements various ML algorithms
 3. **Evaluation Component**: Compares model performance
-4. **Registry Component**: Manages model versioning
-5. **Deployment Component**: Handles endpoint deployment
+4. **Registry Component**: Manages model versioning with "blessed" aliases
+5. **Deployment Component**: Deploys blessed models to Cloud Run FastAPI services
 6. **Inference Component**: Performs batch predictions
+7. **Streaming Component**: Real-time inference via Dataflow and Pub/Sub
 
 ## Configuration
 
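The Deployment Component's Cloud Run FastAPI service can be pictured with a stubbed handler. In the real service this logic would sit behind a FastAPI `POST` route and call the blessed model; the feature names and the stand-in model below are assumptions:

```python
# Sketch of the serving-side predict logic, with the model call stubbed out.
# Feature names and the stub classifier are illustrative assumptions.
def predict_handler(instances, model_predict):
    """Validate a batch of feature dicts and return predictions."""
    FEATURES = ("sepal_length", "sepal_width", "petal_length", "petal_width")
    rows = []
    for inst in instances:
        missing = [f for f in FEATURES if f not in inst]
        if missing:
            raise ValueError(f"missing features: {missing}")
        rows.append([inst[f] for f in FEATURES])
    return {"predictions": model_predict(rows)}

# Stand-in for the blessed model: classify by petal length alone.
stub = lambda rows: ["setosa" if r[2] < 2.5 else "versicolor" for r in rows]
out = predict_handler([{"sepal_length": 5.1, "sepal_width": 3.5,
                        "petal_length": 1.4, "petal_width": 0.2}], stub)
print(out)  # {'predictions': ['setosa']}
```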
@@ -159,25 +180,44 @@ Key configuration is managed in `src/ml_pipelines_kfp/iris_xgboost/constants.py`
 ## CI/CD
 
 The repository includes GitHub Actions workflow (`.github/workflows/cicd.yaml`) that:
-- Builds Docker images
+- Builds Docker images for KFP components
+- Builds generic FastAPI inference containers
 - Pushes to Google Artifact Registry
 - Triggers on pushes to main branch
 
 ## Technologies
 
 - **Orchestration**: Kubeflow Pipelines 2.8.0
-- **Cloud Platform**: Google Cloud (Vertex AI, BigQuery, GCS)
+- **Cloud Platform**: Google Cloud (Vertex AI, BigQuery, GCS, Cloud Run, Dataflow)
 - **ML Frameworks**: scikit-learn, XGBoost
 - **API Framework**: FastAPI
+- **Streaming**: Apache Beam, Dataflow, Pub/Sub
 - **Data Processing**: Pandas, Polars, Dask
 - **Package Management**: uv, Hatchling
 
-## Set up Pub Sub
+## Deployment Architecture
 
-bash
-```
-gcloud auth application-default print-access-token
-export GOOGLE_APPLICATION_CREDENTIALS="./***.json"
-gcloud auth application-default print-access-token
-python test_pubsub.py
-```
+### Model Deployment Strategy
+
+The project uses a **blessed model pattern** for production deployments:
+
+1. **Training Pipeline**: Trains multiple models and selects the best performer
+2. **Model Registry**: Stores the winning model in Vertex AI with the "blessed" alias
+3. **Deployment Pipeline**: Automatically deploys only "blessed" models to production
+4. **Cost Optimization**: Serves models with FastAPI on Cloud Run rather than dedicated Vertex AI endpoints
+
+### Streaming Architecture
+
+Real-time inference is handled through:
+
+1. **Data Ingestion**: Pub/Sub receives real-time inference requests
+2. **Stream Processing**: Dataflow processes messages and calls FastAPI services
+3. **Model Serving**: Cloud Run hosts FastAPI containers with blessed models
+4. **Results Storage**: Predictions are written to BigQuery for monitoring
+
+### Key Benefits
+
+- **Cost Effective**: Cloud Run FastAPI services cost ~90% less than Vertex AI endpoints
+- **Scalable**: Dataflow auto-scales based on Pub/Sub message volume
+- **Reliable**: Only production-ready "blessed" models are deployed
+- **Observable**: All predictions logged to BigQuery with metadata
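The four-step streaming path added above (Pub/Sub → Dataflow → FastAPI → BigQuery) can be walked through for a single message in pure Python. The service call is stubbed and the BigQuery row schema is an assumption; in production this logic would live in a Beam `DoFn` and the call would be an HTTP POST to the Cloud Run service:

```python
import json
from datetime import datetime, timezone

# One message through the assumed streaming path; row schema and the
# call_service stub are illustrative, not the repository's actual code.
def process_message(message_bytes, call_service):
    """What the Dataflow step would do for one Pub/Sub message."""
    features = json.loads(message_bytes.decode("utf-8"))   # 1. ingestion payload
    prediction = call_service(features)                    # 2-3. call the FastAPI service
    return {                                               # 4. row destined for BigQuery
        "features": json.dumps(features),
        "prediction": prediction,
        "processed_at": datetime.now(timezone.utc).isoformat(),
    }

stub_service = lambda features: "setosa"  # stand-in for the HTTP call
row = process_message(b'{"petal_length": 1.4}', stub_service)
print(row["prediction"])  # setosa
```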