I'm a Data Scientist with a Master's in Data Science from Merrimack College (2026), specializing in end-to-end ML pipelines, fairness-aware modeling, and large-scale data analysis. My projects use real-world datasets: 16M+ rows of U.S. Census data, 4.25M HMDA mortgage applications, and 941K NYC 311 service requests. I build in Python, SQL, and R, deploy dashboards in Tableau and Power BI, and hold an AWS Cloud Practitioner certification. I also bring a background in legal operations and data governance. I'm currently seeking entry-level Data Scientist roles in Greater Boston.
- 🗽 NYC 311 + Weather Correlation Dashboard — Production-grade ETL pipeline ingesting 941K real NYC civic complaints + NOAA weather data via REST APIs; cleaned with Python & PostgreSQL, visualized in an interactive 5-tab Power BI dashboard with automated daily refresh. Live Dashboard
- 🏠 Rent Burden Prediction — Fairness & ML analysis on 16M+ ACS PUMS household records (Logistic Regression, Random Forest, Gradient Boosting); equity analysis across race, sex, and geography for HUD policy context
- 🏦 Home Loan Approval Prediction — ML pipeline on 4.25M real HMDA 2023 mortgage applications; XGBoost ROC-AUC 0.9932, 96.3% accuracy across 121 features
- 📊 Marketing Campaign Effectiveness — End-to-end ROI analysis for Nike Inc. using real Google Trends (pytrends API) + SEC EDGAR 10-K filings; ROAS modeling, lag correlation, and 6-panel dashboard in Python
- 🔄 Customer Churn & CLV Analysis — End-to-end SaaS churn prediction pipeline; synthetic data calibrated to HubSpot 2023 10-K & SaaS Capital benchmarks; Logistic Regression (AUC 0.92) identifying $2.3M in annualized MRR at risk across 5,000 customers; interactive Tableau dashboard with risk segmentation and intervention ROI modeling. Live Dashboard