Commit 21522c4

README
Updated README to include new images and details about difficulties encountered during integration and deployment processes.
1 parent e6dbaef commit 21522c4

1 file changed

Lines changed: 15 additions & 8 deletions

README.md

@@ -197,9 +197,10 @@ With XGBoost confirmed as the right model class, I ran a proper hyperparameter s
 **Difficulties:**
 - Getting MLflow and Optuna to integrate cleanly took some work. Optuna runs trials in its own loop and MLflow needs a run context — I had to nest the MLflow run inside each Optuna trial callback carefully so experiments didn't bleed into each other.
 - 15 trials felt like a reasonable tradeoff between search quality and time, but with a larger search space some trials landed in clearly bad regions. Pruning would have helped here.
+<img width="2150" height="2122" alt="Screenshot 2026-03-08 at 4 23 03 PM" src="https://github.com/user-attachments/assets/9ecdf447-7158-4e74-9eed-e8c659aff8fe" />
+
+<img width="2114" height="2160" alt="Screenshot 2026-03-08 at 4 22 42 PM" src="https://github.com/user-attachments/assets/d25fd238-6e4f-42ca-9334-33cb977725a1" />

-![MLflow Experiment — xgboost_optuna_housing](assets/screenshots/mlflow_runs.png)
-![MLflow Runs — 32 total across sessions](assets/screenshots/mlflow_runs_2.png)

 ---

@@ -232,7 +233,8 @@ With the code modularized, I built a REST API to serve predictions and a dashboa
 | `POST` | `/run_batch` | Trigger monthly batch inference |
 | `GET` | `/latest_predictions` | Retrieve latest prediction file |

-![FastAPI health check — API live on AWS](assets/screenshots/fastapi_health.png)
+<img width="726" height="88" alt="Screenshot 2026-03-08 at 8 55 21 PM" src="https://github.com/user-attachments/assets/2fb2dd3c-7614-4521-a7bf-f4165b8d5fc4" />
+

 **Streamlit** (`app.py`) pulls holdout data from S3, calls the FastAPI `/predict` endpoint, and displays predictions vs actuals with MAE, RMSE, and % error metrics. Users can filter by year, month, and region.

@@ -282,7 +284,8 @@ I set up a fully automated deployment pipeline so every push to `main` builds, p

 AWS credentials (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`) are stored as GitHub secrets.

-![GitHub Actions — CI/CD workflow runs](assets/screenshots/github_actions.png)
+<img width="1708" height="734" alt="Screenshot 2026-03-08 at 8 55 12 PM" src="https://github.com/user-attachments/assets/350dd7e3-7bf1-4193-867a-eb4ce7610d27" />
+

 **Difficulties:**
 - The first few pipeline runs failed because of IAM permission issues. The GitHub Actions role didn't have the right policies to push to ECR or update ECS services. I had to create and attach the correct IAM policies, which required understanding the AWS permission model for cross-service access.
@@ -298,12 +301,15 @@ Two buckets store everything the deployed services need at runtime, so container
 ### ECR — Container Registry
 Both Docker images live in ECR and are tagged per-commit for rollback capability.

-![AWS ECR — housing-api and housing-streamlit repositories](assets/screenshots/aws_ecr.png)
+<img width="1460" height="606" alt="Screenshot 2026-03-08 at 8 55 58 PM" src="https://github.com/user-attachments/assets/90180e6d-5f4a-4dfa-8008-3fce743ed54a" />
+
+

 ### IAM Roles
 I created custom roles to give ECS tasks the minimum permissions needed to read from S3:

-![AWS IAM Roles — ecs_s3_access, ecsTaskExecutionRole, s3-access-role](assets/screenshots/aws_iam_roles.png)
+<img width="1576" height="860" alt="Screenshot 2026-03-08 at 8 55 37 PM" src="https://github.com/user-attachments/assets/716b4cc1-3a03-4c70-bc03-9cd938ef82bc" />
+

 ### ECS Cluster — `regression-model-cluster-for-project`
 Two services running in the same cluster, both Active:
@@ -313,12 +319,13 @@ Two services running in the same cluster, both Active:
 | `regression-model-cluster-for-project-service-07233mgp` | FastAPI prediction API |
 | `housing-streamlit-service-5cvxvvhd` | Streamlit dashboard |

-![AWS ECS Cluster — 2 active services](assets/screenshots/aws_ecs_cluster.png)
+<img width="1526" height="1756" alt="Screenshot 2026-03-08 at 8 56 18 PM" src="https://github.com/user-attachments/assets/3746af61-db7f-429e-a39e-c6a2dd0a7140" />

 ### Application Load Balancer
 An internet-facing ALB (`housing-price-prediction`) routes incoming traffic across two availability zones (us-east-2a, us-east-2b) to the ECS tasks.

-![AWS ALB — housing-price-prediction load balancer](assets/screenshots/aws_alb.png)
+
+<img width="1612" height="1530" alt="Screenshot 2026-03-08 at 8 56 44 PM" src="https://github.com/user-attachments/assets/07797732-a8b7-4dea-b41b-440e0edcefff" />

 **Difficulties with AWS setup:**
 - Setting up the ECS task definitions to inject AWS credentials as environment variables (so the containers can access S3) without hardcoding them took several iterations through IAM roles and task execution policies.
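
Note on the last difficulty in the diff above: one common way to give ECS containers S3 access without hardcoded credentials is to attach a task role to the task definition, so the AWS SDK inside the container obtains temporary credentials on its own. The sketch below builds such a task definition as a plain Python dict. It is an illustration under assumptions, not the README author's actual configuration: the account ID, bucket name, container port, CPU and memory sizes, the Fargate launch type, the use of the commit SHA as the image tag, and the choice of which role acts as task role versus execution role are all hypothetical. Only the role names, the `housing-api` repository name, and the `us-east-2` region appear in the source.

```python
import json

# Hypothetical values for illustration; not taken from the README.
ACCOUNT_ID = "123456789012"
REGION = "us-east-2"  # region implied by the ALB availability zones


def build_task_definition(image_tag: str) -> dict:
    """Return an ECS task definition that relies on a task role for S3 access."""
    return {
        "family": "housing-api",
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],  # assumed launch type
        "cpu": "512",
        "memory": "1024",
        # Used by ECS itself to pull the image from ECR and ship logs.
        "executionRoleArn": f"arn:aws:iam::{ACCOUNT_ID}:role/ecsTaskExecutionRole",
        # Assumed by the running container; this is where S3 read access lives.
        "taskRoleArn": f"arn:aws:iam::{ACCOUNT_ID}:role/ecs_s3_access",
        "containerDefinitions": [
            {
                "name": "housing-api",
                "image": (
                    f"{ACCOUNT_ID}.dkr.ecr.{REGION}.amazonaws.com/"
                    f"housing-api:{image_tag}"
                ),
                "essential": True,
                "portMappings": [{"containerPort": 8000, "protocol": "tcp"}],
                # Only non-secret configuration goes here; no access keys.
                "environment": [
                    {"name": "AWS_DEFAULT_REGION", "value": REGION},
                    {"name": "MODEL_BUCKET", "value": "example-model-bucket"},
                ],
            }
        ],
    }


def has_static_credentials(task_def: dict) -> bool:
    """Return True if any container sets long-lived AWS keys as env variables."""
    forbidden = {"AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"}
    return any(
        env["name"] in forbidden
        for container in task_def["containerDefinitions"]
        for env in container.get("environment", [])
    )


task_def = build_task_definition("21522c4")
print(json.dumps(task_def, indent=2))
```

The `has_static_credentials` helper is a small guard that could run in CI before a task definition is registered. With this layout the access keys stored as GitHub secrets are only needed by the deployment workflow, not by the containers themselves.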
