{% include analytics.html %}
Snowflake SQL • R • Tableau • Healthcare Quality Metrics
This project analyzes U.S. hospital quality and readmission performance using publicly available CMS datasets. It demonstrates a complete analytics engineering workflow: Snowflake for data preparation and feature engineering, R for predictive modeling, and Tableau for visualization and executive storytelling.
Hospital readmissions are a key indicator of care quality and financial performance. CMS publishes hospital‑level quality metrics, but the data is fragmented, inconsistent, and not modeling‑ready.
This project solves that by building a unified, engineered, validated dataset that supports:
- Predictive modeling
- Quality benchmarking
- Executive dashboards
- Scenario simulation and business impact analysis
The final dataset is clean, standardized, and ready for downstream analytics.
Hospitals face financial penalties for excessive readmissions under the CMS Hospital Readmissions Reduction Program (HRRP).
Understanding which quality factors drive readmissions enables:
- Targeted quality improvement
- Better resource allocation
- Reduced penalties
- Improved patient outcomes
This project identifies those drivers using a structured analytics workflow.
Centers for Medicare & Medicaid Services (CMS)
- Hospital Readmissions Reduction Program (HRRP)
- Mortality Measures
- Patient Experience (HCAHPS)
- Timely & Effective Care (TEC)
- Hospital General Information
- Region Mapping
All datasets are publicly available and widely used in healthcare quality reporting.
- Ingest raw CMS files
- Clean and standardize fields
- Validate data integrity
- Engineer quality metrics
- Produce unified dataset:
CMS_CLEAN_HOSPITAL_QUALITY_FINAL
- Exploratory data analysis
- Predictive modeling
- Feature importance
- Executive dashboard
- Regional comparisons
- Quality performance insights
01_cms_snowflake_pipeline.sql
└─ Creates database, schema, stages, and file formats for CMS ingestion.
02_snowflake_data_preparation.sql
└─ Loads raw CMS datasets, standardizes fields, cleans identifiers, and prepares base tables.
03_snowflake_data_validation.sql
└─ Performs row count checks, duplicate checks, NULL analysis, and numeric range validation.
04_snowflake_data_modeling.sql
└─ Creates modeling‑ready base tables and prepares unified structures for feature engineering.
05_cms_snowflake_feature_engineering.sql
└─ Builds engineered quality metrics (readmission, mortality, patient experience, TEC).
This pipeline is final, validated, and reproducible.
The project generates four major engineered feature domains:
- Readmission Index
- Mortality Index
- Patient Experience Score
- Timely & Effective Care (TEC) Score
- Region Assignment
These features consolidate dozens of CMS measures into interpretable, modeling‑ready metrics.
The final dataset produced in Snowflake is: CMS_CLEAN_HOSPITAL_QUALITY_FINAL
It includes:
- Hospital profile information
- All engineered quality metrics
- Region mapping
- Modeling‑ready numeric fields
- One row per hospital
This dataset is exported for R modeling and Tableau visualization.
git clone
https://github.com/Oswaldketonou/CMS_Hospital_Readmission_Analytics_Case_Study
Execute scripts in order:
01_cms_snowflake_pipeline.sql02_snowflake_data_preparation.sql03_snowflake_data_validation.sql04_snowflake_data_modeling.sql05_cms_snowflake_feature_engineering.sql
Download CSV from Snowflake query results.
- R:
read.csv("cms_clean_hospital_quality_final.csv") - Tableau: Connect → Text File
cms-hospital-quality/
│
├── README.md ← Root README (this file)
│
├── data/
│ ├── raw/
│ └── processed/
│
├── sql/
│ ├── 01_cms_snowflake_pipeline.sql
│ ├── 02_snowflake_data_preparation.sql
│ ├── 03_snowflake_data_validation.sql
│ ├── 04_snowflake_data_modeling.sql
│ ├── 05_cms_snowflake_feature_engineering.sql
│ └── readme.md
│
├── r/
│ └── eda_modeling.R
│
├── tableau/
│ └── dashboard.twbx
│
├── docs/
│ ├── project_narrative.md
│ ├── data_preparation.md
│ ├── data_validation.md
│ ├── data_modeling.md
│ ├── snowflake_pipeline.md
│ ├── cms_snowflake_feature_engineering.md
│ ├── data_dictionary.md
│ ├── methodology.md
│ └── project_notes.md
│
└── visuals/
├── architecture.png
├── dashboard_preview.png
└── feature_importance.png
- R modeling (EDA, regression, random forest, feature importance)
- Tableau dashboard (KPIs, regional insights, quality scoring)
- Final case study narrative
- Scenario simulation and business recommendations
- Data dictionary completion
Waldo Ketonou
Business & Data Analyst | SQL • R • Tableau
Focused on transforming healthcare operations and quality metrics into actionable insights.
In Progress — SQL engineering phase completed; modeling and dashboard phases upcoming.