Predicting Green Gentrification in New York City

CUNY Graduate Center Master's Thesis
Kat Desai, advised by Professor Anita Raja
May 2025

Abstract

This research presents a machine learning framework for predicting gentrification in New York City, with a focus on the role of green infrastructure in driving urban change. We trained a Random Forest classifier on historical data covering socioeconomic characteristics, housing, and sustainability developments to identify Census Tracts likely to gentrify over a four-year horizon. Time-series and geospatial feature extraction techniques were used to capture neighborhood dynamics, and missing data were addressed through temporal and spatial interpolation. The final model achieved a balanced accuracy of 85.3% and an F1 score of 86.3%. The model was evaluated through an ablation study, case studies, error analysis, and Shapley value interpretation.
Results indicate that, in the absence of key drivers, green infrastructure features (particularly the number of trees planted and biking-related infrastructure) were among the strongest predictors of gentrification. This suggests that while sustainable urban investments can enhance environmental quality, they may also facilitate neighborhood change and potential displacement. Our findings contribute to ongoing debates about the intersection of sustainability and equity in urban development, and offer planners and policymakers insight into the spatial equity implications of green urban planning.
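
For readers unfamiliar with the two headline metrics, the sketch below shows how balanced accuracy and F1 are defined for a binary label such as "gentrified / not gentrified". It is a minimal illustration and not the evaluation code from the notebooks: the function names and the toy labels are hypothetical, and it assumes both classes appear in the true labels.

```python
def confusion_counts(y_true, y_pred):
    """Count (tp, tn, fp, fn) for binary labels, where 1 = gentrified."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def balanced_accuracy(y_true, y_pred):
    """Mean of recall on the positive class and recall on the negative class.

    Assumes both classes are present in y_true (otherwise a division by
    zero occurs).
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)  # share of gentrifying tracts caught
    specificity = tn / (tn + fp)  # share of stable tracts correctly left alone
    return (sensitivity + specificity) / 2


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp, _, fp, fn = confusion_counts(y_true, y_pred)
    return 2 * tp / (2 * tp + fp + fn)


# Toy labels for eight hypothetical tracts (not data from the thesis).
toy_true = [1, 1, 1, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(round(balanced_accuracy(toy_true, toy_pred), 3))
print(round(f1_score(toy_true, toy_pred), 3))
```

Balanced accuracy weights both classes equally, which matters when gentrifying tracts are a minority of all tracts; plain accuracy would reward a model that simply predicts "no change" everywhere.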

More Information

To request the full paper or ask additional questions, please contact khyatee.d@gmail.com.

Repository Structure

├── README.md
│
├── Wrangling .......................... Notebooks used for initial raw data collection and cleaning
│
├── Data
│   ├── Cleaned ........................ Cleaned data files generated through data wrangling
│   └── Outputs ........................ Predictions, labels, intermediate files
│
├── ML-Flow ............................ Notebooks for each step of the ML process
│   ├── 1-kriging.ipynb ................ Joins all cleaned data and interpolates missing values
│   ├── 2-labeling.ipynb ............... Generates target variables
│   ├── 3-feature_engineering.ipynb .... Creates features from cleaned data
│   └── 4-modeling.ipynb ............... Uses ML model to generate predictions
│
├── Experiments ........................ Notebooks for analysis of predictions
│   ├── case_studies.ipynb ............. Inspects model predictions in specific Census Tracts of interest
│   ├── clustering.ipynb ............... Experimental clustering of raw data
│   └── error_analysis.ipynb ........... Analysis of incorrect predictions
│
├── EDA
│   ├── mapping.ipynb .................. Creates choropleth maps of features and predictions
│   └── functions.py ................... Helper functions
│
├── Reporting .......................... PowerPoint slides and research poster
│
└── Images ............................. Images produced from EDA
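
The first ML-Flow step fills in missing values, and the abstract mentions both temporal and spatial interpolation. As a rough illustration of the temporal part only, the sketch below linearly interpolates gaps in one tract's yearly series. This is an assumption about what simple temporal interpolation could look like, not the method used in 1-kriging.ipynb; the function name and the example values are hypothetical.

```python
def interpolate_years(series):
    """Fill interior gaps in a yearly series by linear interpolation.

    `series` maps year -> value, with None marking a missing year.
    Gaps before the first or after the last observed year are left as
    None, since there is nothing on one side to interpolate from.
    """
    years = sorted(series)
    known = [y for y in years if series[y] is not None]
    filled = dict(series)
    # Walk over each pair of neighbouring observed years and fill between them.
    for left, right in zip(known, known[1:]):
        span = right - left
        for year in years:
            if left < year < right:
                frac = (year - left) / span
                filled[year] = series[left] + frac * (series[right] - series[left])
    return filled


# Hypothetical values for a single tract (not data from the thesis).
example = {2015: 100.0, 2016: None, 2017: None, 2018: 130.0, 2019: None}
print(interpolate_years(example))
```

Spatial interpolation such as kriging instead estimates a missing value from nearby tracts and is not covered by this sketch.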
