README.md (+15 −8)
@@ -197,9 +197,10 @@ With XGBoost confirmed as the right model class, I ran a proper hyperparameter s
**Difficulties:**
- Getting MLflow and Optuna to integrate cleanly took some work. Optuna runs trials in its own loop and MLflow needs a run context — I had to nest the MLflow run inside each Optuna trial callback carefully so experiments didn't bleed into each other.
- 15 trials felt like a reasonable tradeoff between search quality and time, but with a larger search space some trials landed in clearly bad regions. Pruning would have helped here.
<img width="2150" height="2122" alt="Screenshot 2026-03-08 at 4 23 03 PM" src="https://github.com/user-attachments/assets/9ecdf447-7158-4e74-9eed-e8c659aff8fe" />
<img width="2114" height="2160" alt="Screenshot 2026-03-08 at 4 22 42 PM" src="https://github.com/user-attachments/assets/d25fd238-6e4f-42ca-9334-33cb977725a1" />

<img width="726" height="88" alt="Screenshot 2026-03-08 at 8 55 21 PM" src="https://github.com/user-attachments/assets/2fb2dd3c-7614-4521-a7bf-f4165b8d5fc4" />
**Streamlit** (`app.py`) pulls holdout data from S3, calls the FastAPI `/predict` endpoint, and displays predictions vs actuals with MAE, RMSE, and % error metrics. Users can filter by year, month, and region.
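The client side of that flow can be sketched like this. The endpoint address and the payload/response field names (`instances`, `predictions`) are assumptions for illustration; the real app renders the results with Streamlit widgets rather than returning a dict.

```python
import numpy as np
import requests

# Assumed address; the real app targets the deployed FastAPI service
API_URL = "http://localhost:8000/predict"

def compute_metrics(preds, actuals):
    # MAE, RMSE, and mean absolute % error, as displayed in the dashboard
    errors = preds - actuals
    return {
        "mae": float(np.mean(np.abs(errors))),
        "rmse": float(np.sqrt(np.mean(errors ** 2))),
        "pct_error": float(np.mean(np.abs(errors) / np.abs(actuals)) * 100),
    }

def score_holdout(features, actuals):
    # Payload and response shapes here are illustrative assumptions
    resp = requests.post(API_URL, json={"instances": features})
    resp.raise_for_status()
    preds = np.asarray(resp.json()["predictions"], dtype=float)
    return compute_metrics(preds, np.asarray(actuals, dtype=float))
```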
@@ -282,7 +284,8 @@ I set up a fully automated deployment pipeline so every push to `main` builds, p
AWS credentials (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`) are stored as GitHub secrets.
<img width="1708" height="734" alt="Screenshot 2026-03-08 at 8 55 12 PM" src="https://github.com/user-attachments/assets/350dd7e3-7bf1-4193-867a-eb4ce7610d27" />
**Difficulties:**
- The first few pipeline runs failed because of IAM permission issues. The GitHub Actions role didn't have the right policies to push to ECR or update ECS services. I had to create and attach the correct IAM policies, which required understanding the AWS permission model for cross-service access.
@@ -298,12 +301,15 @@ Two buckets store everything the deployed services need at runtime, so container
### ECR — Container Registry
Both Docker images live in ECR and are tagged per-commit for rollback capability.
<img width="1460" height="606" alt="Screenshot 2026-03-08 at 8 55 58 PM" src="https://github.com/user-attachments/assets/90180e6d-5f4a-4dfa-8008-3fce743ed54a" />
### IAM Roles
I created custom roles to give ECS tasks the minimum permissions needed to read from S3:
<img width="1576" height="860" alt="Screenshot 2026-03-08 at 8 55 37 PM" src="https://github.com/user-attachments/assets/716b4cc1-3a03-4c70-bc03-9cd938ef82bc" />
<img width="1612" height="1530" alt="Screenshot 2026-03-08 at 8 56 44 PM" src="https://github.com/user-attachments/assets/07797732-a8b7-4dea-b41b-440e0edcefff" />
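A minimal read-only S3 policy attached to such a task role looks roughly like this (the bucket name here is a hypothetical placeholder, not the project's actual bucket):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::example-model-artifacts",
        "arn:aws:s3:::example-model-artifacts/*"
      ]
    }
  ]
}
```

Note that `ListBucket` applies to the bucket ARN while `GetObject` applies to the `/*` object ARN; mixing those up is a common reason least-privilege policies silently fail.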
**Difficulties with AWS setup:**
- Setting up the ECS task definitions to inject AWS credentials as environment variables (so the containers can access S3) without hardcoding them took several iterations through IAM roles and task execution policies.
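As a sketch of the non-hardcoded approach, a task definition can pull the values from SSM Parameter Store at container launch via the execution role, using the `secrets`/`valueFrom` fields. All names, account IDs, and ARNs below are hypothetical placeholders:

```json
{
  "family": "ml-api",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "taskRoleArn": "arn:aws:iam::123456789012:role/ecs-s3-read",
  "containerDefinitions": [
    {
      "name": "api",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/ml-api:abc1234",
      "secrets": [
        {
          "name": "AWS_ACCESS_KEY_ID",
          "valueFrom": "arn:aws:ssm:us-east-1:123456789012:parameter/ml/aws-access-key-id"
        },
        {
          "name": "AWS_SECRET_ACCESS_KEY",
          "valueFrom": "arn:aws:ssm:us-east-1:123456789012:parameter/ml/aws-secret-access-key"
        }
      ]
    }
  ]
}
```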