Impact Evaluation

Claude edited this page Mar 15, 2026 · 1 revision

Impact Evaluation Module

The impact_evaluation/ module provides Python implementations of three core causal inference methods used in development program evaluation: difference-in-differences, propensity score matching, and regression discontinuity design.


Difference-in-Differences (DiD)

File: impact_evaluation/difference_in_differences.py

Estimates treatment effects by comparing outcome changes over time between treatment and control groups.
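
In the simplest case of two groups and two periods, this comparison is a difference of four group means. The sketch below shows that arithmetic using only the standard library. The helper `did_from_means` and the toy records are hypothetical and are not part of `impact_evaluation`.

```python
from statistics import mean


def did_from_means(rows, outcome, treatment, post):
    """2x2 DiD: (treated post - treated pre) - (control post - control pre)."""

    def cell_mean(t, p):
        values = [r[outcome] for r in rows if r[treatment] == t and r[post] == p]
        if not values:
            raise ValueError(f"no observations for treatment={t}, post={p}")
        return mean(values)

    treated_change = cell_mean(1, 1) - cell_mean(1, 0)
    control_change = cell_mean(0, 1) - cell_mean(0, 0)
    return treated_change - control_change


# Toy data (made up for illustration): the control group rises by 3 points,
# the treated group rises by 8 points.
rows = [
    {"score": 10.0, "treated": 0, "post_period": 0},
    {"score": 12.0, "treated": 0, "post_period": 0},
    {"score": 13.0, "treated": 0, "post_period": 1},
    {"score": 15.0, "treated": 0, "post_period": 1},
    {"score": 11.0, "treated": 1, "post_period": 0},
    {"score": 13.0, "treated": 1, "post_period": 0},
    {"score": 19.0, "treated": 1, "post_period": 1},
    {"score": 21.0, "treated": 1, "post_period": 1},
]

effect = did_from_means(rows, "score", "treated", "post_period")
print(effect)  # 8 - 3 = 5.0
```

Without covariates, this value equals the coefficient on the treatment-by-post interaction term in an OLS regression of the outcome on treatment, post, and their interaction.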

Functions

| Function | Purpose |
| --- | --- |
| `estimate_did(df, outcome, treatment, post, covariates)` | OLS-based DiD estimation with optional controls |
| `parallel_trends_test(df, outcome, treatment, time_var, pre_periods)` | Tests pre-treatment trend equivalence |
| `event_study(df, outcome, treatment, time_var, ref_period)` | Period-specific treatment effects with 95% CIs |

Usage

```python
from impact_evaluation.difference_in_differences import estimate_did, parallel_trends_test

model = estimate_did(df, outcome="score", treatment="treated", post="post_period")
print(model.summary())

# Check parallel trends assumption
pt = parallel_trends_test(df, "score", "treated", "year", pre_periods=[2018, 2019, 2020])
```
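
To make the idea behind a pre-trend check concrete, the sketch below computes the treated-minus-control gap in mean outcomes for each pre-treatment period and the least-squares slope of that gap over time. It is an illustration only: `pre_period_gaps`, `gap_slope`, and the toy data are hypothetical, and `parallel_trends_test` may well use a different (regression-based) procedure.

```python
from statistics import mean


def pre_period_gaps(rows, outcome, treatment, time_var, pre_periods):
    """Treated-minus-control mean outcome in each pre-treatment period."""
    gaps = {}
    for period in pre_periods:
        treated = [r[outcome] for r in rows if r[time_var] == period and r[treatment] == 1]
        control = [r[outcome] for r in rows if r[time_var] == period and r[treatment] == 0]
        gaps[period] = mean(treated) - mean(control)
    return gaps


def gap_slope(gaps):
    """Least-squares slope of the gap over time (needs at least two periods)."""
    periods = sorted(gaps)
    t_bar = mean(periods)
    g_bar = mean(gaps[p] for p in periods)
    numerator = sum((p - t_bar) * (gaps[p] - g_bar) for p in periods)
    denominator = sum((p - t_bar) ** 2 for p in periods)
    return numerator / denominator


# Toy data (made up): both groups gain one point per year before treatment,
# so the gap between them stays constant at 2 points.
rows = []
for year, control_score in [(2018, 10.0), (2019, 11.0), (2020, 12.0)]:
    rows.append({"score": control_score, "treated": 0, "year": year})
    rows.append({"score": control_score + 2.0, "treated": 1, "year": year})

gaps = pre_period_gaps(rows, "score", "treated", "year", [2018, 2019, 2020])
slope = gap_slope(gaps)
print(gaps, slope)
```

A slope close to zero is consistent with parallel pre-trends. It does not prove that the groups would have continued to move in parallel after treatment began.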

Propensity Score Matching (PSM)

File: impact_evaluation/propensity_score_matching.py

Matches treated and control units by estimated treatment probability to reduce selection bias.
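
As a rough illustration of the matching step, the sketch below performs greedy 1:1 nearest-neighbour matching on propensity scores that are already estimated. The function `greedy_match` is hypothetical and is not the module's `nearest_neighbor_match`. It also assumes the caliper is an absolute distance on the propensity-score scale, which the module may define differently.

```python
def greedy_match(scores, treated_flags, caliper=0.05, replace=False):
    """Return {treated_index: control_index} for 1:1 nearest-neighbour matches."""
    treated = [i for i, t in enumerate(treated_flags) if t == 1]
    available = [i for i, t in enumerate(treated_flags) if t == 0]
    matches = {}
    for i in treated:
        if not available:
            break  # no controls left to match against
        # Closest remaining control by absolute propensity-score distance.
        j = min(available, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            matches[i] = j
            if not replace:
                available.remove(j)  # each control is used at most once
    return matches


# Toy propensity scores (made up); the first three units are treated.
scores = [0.30, 0.62, 0.90, 0.31, 0.60, 0.33]
treated_flags = [1, 1, 1, 0, 0, 0]

matches = greedy_match(scores, treated_flags, caliper=0.05)
print(matches)  # unit 2 (score 0.90) has no control within the caliper
```

Greedy matching depends on the order in which treated units are processed. Treated units left without a match are dropped, so the resulting estimate applies only to the matched subsample.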

Functions

| Function | Purpose |
| --- | --- |
| `estimate_propensity_scores(df, treatment, covariates)` | Logistic regression propensity scores |
| `nearest_neighbor_match(df, treatment, caliper, replace)` | 1:1 nearest-neighbor matching |
| `balance_table(df, treatment, covariates, matches)` | Standardised mean differences before/after matching |
| `estimate_att(df, outcome, matches)` | Average treatment effect on the treated (ATT) |

Usage

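```python
from impact_evaluation.propensity_score_matching import (
    estimate_propensity_scores,
    nearest_neighbor_match,
    balance_table,
    estimate_att,
)

# Estimate scores, match within a caliper, check balance, then estimate the ATT
df, model = estimate_propensity_scores(df, "treated", ["age", "income", "education"])
matches = nearest_neighbor_match(df, "treated", caliper=0.05)
balance = balance_table(df, "treated", ["age", "income", "education"], matches)
att = estimate_att(df, "outcome_score", matches)
```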

Absolute values of smd_after below 0.1 are conventionally read as good covariate balance after matching.
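
For reference, a standardised mean difference for one covariate is commonly computed as the difference in group means divided by a pooled standard deviation. The sketch below uses that common definition. Whether `balance_table` uses exactly this formula is an assumption, and `standardised_mean_difference` is a hypothetical helper.

```python
from math import sqrt
from statistics import mean, variance


def standardised_mean_difference(treated_values, control_values):
    """(mean_t - mean_c) / sqrt((var_t + var_c) / 2), using sample variances."""
    pooled_sd = sqrt((variance(treated_values) + variance(control_values)) / 2)
    if pooled_sd == 0:
        raise ValueError("covariate has zero variance in both groups")
    return (mean(treated_values) - mean(control_values)) / pooled_sd


# Toy ages (made up): means 34 vs 32, sample variance 16 in each group.
treated_age = [30, 34, 38]
control_age = [28, 32, 36]

smd = standardised_mean_difference(treated_age, control_age)
print(smd)  # (34 - 32) / 4 = 0.5, well above the 0.1 rule of thumb
```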


Regression Discontinuity Design (RDD)

File: impact_evaluation/regression_discontinuity.py

Estimates treatment effects at a sharp eligibility cutoff using local linear regression.
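
The sketch below illustrates the basic idea: fit a separate straight line on each side of the cutoff, using only observations within the bandwidth, and take the difference between the two fitted values at the cutoff. It assumes a uniform kernel (equal weights inside the bandwidth), which may differ from what `sharp_rdd` does. The helpers `fit_line` and `local_linear_jump` are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - slope * x_bar, slope


def local_linear_jump(points, cutoff, bandwidth):
    """Fitted outcome just right of the cutoff minus fitted outcome just left."""
    # Centre the running variable so each intercept is the fitted value at the cutoff.
    left = [(x - cutoff, y) for x, y in points if cutoff - bandwidth <= x < cutoff]
    right = [(x - cutoff, y) for x, y in points if cutoff <= x <= cutoff + bandwidth]
    left_intercept, _ = fit_line(*zip(*left))
    right_intercept, _ = fit_line(*zip(*right))
    return right_intercept - left_intercept


# Toy data (made up): same slope on both sides, outcome jumps by 3 at the cutoff.
points = [(x, 2 + 0.1 * (x - 50)) for x in range(40, 50)]
points += [(x, 5 + 0.1 * (x - 50)) for x in range(50, 60)]

jump = local_linear_jump(points, cutoff=50, bandwidth=5)
print(jump)
```

The sign convention here is right minus left. If units below the cutoff are the treated ones, as can happen with a poverty-score threshold, the treatment effect is the negative of this value.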

Functions

| Function | Purpose |
| --- | --- |
| `sharp_rdd(df, outcome, running_var, cutoff, bandwidth)` | Local linear regression at the cutoff |
| `optimal_bandwidth_ik(df, running_var, cutoff)` | Imbens-Kalyanaraman bandwidth selection |
| `density_test(df, running_var, cutoff)` | McCrary-style manipulation test |

Usage

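```python
from impact_evaluation.regression_discontinuity import (
    sharp_rdd,
    optimal_bandwidth_ik,
    density_test,
)

# Choose a bandwidth, estimate the effect at the cutoff, then test for manipulation
bw = optimal_bandwidth_ik(df, "poverty_score", cutoff=50)
model = sharp_rdd(df, "outcome", "poverty_score", cutoff=50, bandwidth=bw)
manipulation = density_test(df, "poverty_score", cutoff=50)
```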
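
The intuition behind a manipulation test is that, if units cannot sort themselves across the cutoff, about the same number of observations should fall just below it as just above it. The sketch below is a deliberately crude version of that idea: it compares counts in two equal-width windows using a normal approximation to a binomial test. It is not McCrary's local-linear density estimator, and `window_count_test` is a hypothetical helper rather than the module's `density_test`.

```python
from math import erfc, sqrt


def window_count_test(values, cutoff, width):
    """Compare counts in [cutoff - width, cutoff) and [cutoff, cutoff + width)."""
    n_left = sum(1 for v in values if cutoff - width <= v < cutoff)
    n_right = sum(1 for v in values if cutoff <= v < cutoff + width)
    n = n_left + n_right
    if n == 0:
        raise ValueError("no observations within the windows")
    # Under the null, n_right ~ Binomial(n, 0.5); use the normal approximation.
    z = (n_right - n_left) / sqrt(n)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value
    return n_left, n_right, z, p_value


# Toy running-variable values (made up): heavy bunching just below the cutoff.
bunched = [49.5] * 40 + [50.5] * 10
balanced = [49.5] * 25 + [50.5] * 25

bunched_result = window_count_test(bunched, cutoff=50, width=1)
balanced_result = window_count_test(balanced, cutoff=50, width=1)
print(bunched_result)
print(balanced_result)
```

A small p-value signals an imbalance worth investigating. Because the check ignores the slope of the density near the cutoff, it can raise false alarms when the running variable is naturally skewed.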

When to Use Which Method

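| Method | Best for | Key assumption |
| --- | --- | --- |
| DiD | Policy changes with before/after data | Parallel trends: absent treatment, both groups would have followed the same trend (pre-period trends are a check, not proof) |
| PSM | Observational data with rich covariates | Selection on observables only, with overlap in propensity scores between groups |
| RDD | Programs with eligibility cutoffs | Continuity of outcomes at the cutoff, with no manipulation of the running variable |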
