Impact Evaluation

Impact Evaluation Module

The impact_evaluation/ module provides Python implementations of three core causal inference methods used in development program evaluation.

Difference-in-Differences (DiD)

File: impact_evaluation/difference_in_differences.py

Estimates treatment effects by comparing outcome changes over time between treatment and control groups.
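As an illustration of the comparison described above, here is a minimal, dependency-free sketch of the two-group, two-period case. It is not the module's implementation: `did_2x2` and the toy data are hypothetical. With no covariates, this difference of mean changes equals the coefficient on the treatment-by-post interaction in an OLS regression.

```python
from statistics import mean


def did_2x2(rows):
    """Difference-in-differences from (outcome, treated, post) tuples.

    `treated` and `post` are 0/1 flags. Hypothetical helper for illustration.
    """
    cells = {(t, p): [] for t in (0, 1) for p in (0, 1)}
    for y, treated, post in rows:
        cells[(treated, post)].append(y)
    m = {key: mean(vals) for key, vals in cells.items()}
    treated_change = m[(1, 1)] - m[(1, 0)]   # change over time, treated group
    control_change = m[(0, 1)] - m[(0, 0)]   # change over time, control group
    return treated_change - control_change


# Toy data (made up for this example).
rows = [
    (10.0, 0, 0), (12.0, 0, 0),   # control, pre:   mean 11
    (13.0, 0, 1), (15.0, 0, 1),   # control, post:  mean 14
    (11.0, 1, 0), (13.0, 1, 0),   # treated, pre:   mean 12
    (19.0, 1, 1), (21.0, 1, 1),   # treated, post:  mean 20
]
effect = did_2x2(rows)  # (20 - 12) - (14 - 11) = 5.0
```

The control group's change (3 points) stands in for what would have happened to the treated group without the program, so only the remaining 5 points are attributed to treatment.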

Functions

| Function | Purpose |
| --- | --- |
| `estimate_did(df, outcome, treatment, post, covariates)` | OLS-based DiD estimation with optional controls |
| `parallel_trends_test(df, outcome, treatment, time_var, pre_periods)` | Tests pre-treatment trend equivalence |
| `event_study(df, outcome, treatment, time_var, ref_period)` | Period-specific treatment effects with 95% CIs |

Usage

from impact_evaluation.difference_in_differences import estimate_did, parallel_trends_test

model = estimate_did(df, outcome="score", treatment="treated", post="post_period")
print(model.summary())

# Check pre-treatment trends (supporting evidence for the parallel trends assumption)
pt = parallel_trends_test(df, "score", "treated", "year", pre_periods=[2018, 2019, 2020])
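To make the idea behind a pre-trend check concrete, the sketch below computes the treated-minus-control gap in mean outcomes for each pre-treatment period and the least-squares slope of that gap over time. A slope near zero is consistent with parallel pre-trends. This is an informal illustration under assumed names (`pre_period_gaps`, `gap_slope`) and made-up data; it reports no standard errors and may differ from what `parallel_trends_test` actually does.

```python
from collections import defaultdict
from statistics import mean


def pre_period_gaps(rows, pre_periods):
    """rows: (outcome, treated, period). Returns {period: treated mean - control mean}."""
    by_cell = defaultdict(list)
    for y, treated, period in rows:
        if period in pre_periods:
            by_cell[(period, treated)].append(y)
    return {p: mean(by_cell[(p, 1)]) - mean(by_cell[(p, 0)]) for p in pre_periods}


def gap_slope(gaps):
    """Least-squares slope of the group gap across periods."""
    xs = list(gaps)
    ys = [gaps[x] for x in xs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data: both groups rise by 2 points per year before treatment.
rows = [
    (50.0, 0, 2018), (52.0, 0, 2019), (54.0, 0, 2020),
    (53.0, 1, 2018), (55.0, 1, 2019), (57.0, 1, 2020),
]
gaps = pre_period_gaps(rows, [2018, 2019, 2020])
slope = gap_slope(gaps)  # the gap is constant, so the slope is 0
```

A flat gap in the pre-period does not prove that trends would have stayed parallel afterwards; it only fails to contradict the assumption.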

Propensity Score Matching (PSM)

File: impact_evaluation/propensity_score_matching.py

Matches treated and control units by estimated treatment probability to reduce selection bias.
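The matching step can be sketched without any dependencies as a greedy 1:1 nearest-neighbor search on already-estimated propensity scores, followed by an ATT computed as the mean outcome difference across matched pairs. The names `greedy_match` and `att_from_pairs`, the matching order, and the data are assumptions made for illustration, not the module's implementation.

```python
from statistics import mean


def greedy_match(scores, treated, caliper=0.05, replace=False):
    """Pair each treated unit with its closest control on the propensity score.

    scores: {unit_id: score}; treated: {unit_id: bool}.
    Units with no control within `caliper` stay unmatched.
    """
    controls = {u for u in scores if not treated[u]}
    pairs = []
    # Assumption: process treated units from highest to lowest score.
    order = sorted((u for u in scores if treated[u]),
                   key=lambda u: scores[u], reverse=True)
    for t in order:
        if not controls:
            break
        best = min(controls, key=lambda c: (abs(scores[c] - scores[t]), c))
        if abs(scores[best] - scores[t]) <= caliper:
            pairs.append((t, best))
            if not replace:
                controls.discard(best)  # each control is used at most once
    return pairs


def att_from_pairs(pairs, outcome):
    """Mean treated-minus-control outcome difference over matched pairs."""
    return mean(outcome[t] - outcome[c] for t, c in pairs)


# Toy data (made up for this example).
scores = {"t1": 0.80, "t2": 0.55, "t3": 0.30,
          "c1": 0.78, "c2": 0.52, "c3": 0.10, "c4": 0.57}
treated = {u: u.startswith("t") for u in scores}
outcome = {"t1": 70, "t2": 61, "t3": 50, "c1": 64, "c2": 57, "c3": 40, "c4": 58}

pairs = greedy_match(scores, treated, caliper=0.05)
att = att_from_pairs(pairs, outcome)
```

Here `t3` has no control within the caliper and is dropped, so the estimate applies only to the treated units that found a match. A tighter caliper improves match quality at the cost of discarding more units.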

Functions

| Function | Purpose |
| --- | --- |
| `estimate_propensity_scores(df, treatment, covariates)` | Logistic regression propensity scores |
| `nearest_neighbor_match(df, treatment, caliper, replace)` | 1:1 nearest-neighbor matching |
| `balance_table(df, treatment, covariates, matches)` | Standardised mean differences before/after matching |
| `estimate_att(df, outcome, matches)` | Average Treatment Effect on the Treated |

Usage

from impact_evaluation.propensity_score_matching import (
    estimate_propensity_scores,
    nearest_neighbor_match,
    balance_table,
    estimate_att,
)

df, model = estimate_propensity_scores(df, "treated", ["age", "income", "education"])
matches = nearest_neighbor_match(df, "treated", caliper=0.05)
balance = balance_table(df, "treated", ["age", "income", "education"], matches)
att = estimate_att(df, "outcome_score", matches)

Absolute values in the `smd_after` column below 0.1 indicate good balance on that covariate.
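For reference, a standardised mean difference can be computed as the difference in group means divided by a pooled standard deviation. The sketch below uses one common convention (the square root of the average of the two group variances); other conventions exist, and whether `balance_table` uses this exact formula is an assumption. The function name and data are illustrative.

```python
from math import sqrt
from statistics import mean, variance


def standardized_mean_difference(treated_vals, control_vals):
    """(treated mean - control mean) / pooled SD, using sample variances."""
    pooled_sd = sqrt((variance(treated_vals) + variance(control_vals)) / 2)
    return (mean(treated_vals) - mean(control_vals)) / pooled_sd


# Toy covariate (age), made up for this example.
treated_age = [30, 34, 38]
control_age_all = [20, 30, 40, 50]      # all controls, before matching
control_age_matched = [31, 33, 39]      # controls kept after matching

smd_before = standardized_mean_difference(treated_age, control_age_all)
smd_after = standardized_mean_difference(treated_age, control_age_matched)
balanced = abs(smd_after) < 0.1         # the 0.1 rule of thumb from the text
```

Because the sign only reflects which group has the larger mean, the threshold is applied to the absolute value.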

Regression Discontinuity Design (RDD)

File: impact_evaluation/regression_discontinuity.py

Estimates treatment effects at a sharp eligibility cutoff using local linear regression.
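A minimal sketch of the local linear idea: fit a separate straight line on each side of the cutoff using only observations within the bandwidth, then take the difference between the two fitted values at the cutoff. This version uses equal weights (a uniform kernel) and assumes units at or above the cutoff are treated; both choices, the function names, and the data are assumptions, not a description of `sharp_rdd`.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return y_bar - slope * x_bar, slope


def sharp_rdd_sketch(points, cutoff, bandwidth):
    """points: (running_value, outcome). Returns the estimated jump at the cutoff."""
    # Centre the running variable so each intercept is the fitted value at the cutoff.
    left = [(x - cutoff, y) for x, y in points if cutoff - bandwidth <= x < cutoff]
    right = [(x - cutoff, y) for x, y in points if cutoff <= x <= cutoff + bandwidth]
    intercept_left, _ = fit_line(*zip(*left))
    intercept_right, _ = fit_line(*zip(*right))
    return intercept_right - intercept_left


# Toy data: a linear trend plus a jump of 4 at a score of 50.
points = [(x, 20 + 0.5 * x + (4 if x >= 50 else 0)) for x in range(40, 60)]
effect = sharp_rdd_sketch(points, cutoff=50, bandwidth=5)
```

If eligibility applies below the cutoff instead, the same calculation holds with the sign of the estimate reversed. The result is a local effect for units near the cutoff, not an average over the whole population.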

Functions

| Function | Purpose |
| --- | --- |
| `sharp_rdd(df, outcome, running_var, cutoff, bandwidth)` | Local linear regression at the cutoff |
| `optimal_bandwidth_ik(df, running_var, cutoff)` | Imbens-Kalyanaraman bandwidth selection |
| `density_test(df, running_var, cutoff)` | McCrary-style manipulation test |

Usage

from impact_evaluation.regression_discontinuity import (
    sharp_rdd,
    optimal_bandwidth_ik,
    density_test,
)

bw = optimal_bandwidth_ik(df, "poverty_score", cutoff=50)
model = sharp_rdd(df, "outcome", "poverty_score", cutoff=50, bandwidth=bw)
manipulation = density_test(df, "poverty_score", cutoff=50)
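The intuition behind a manipulation check is that, if units cannot sort themselves around the threshold, roughly as many observations should fall just below the cutoff as just above it. The sketch below counts observations in a narrow window on each side and applies an exact two-sided binomial test against a 50/50 split. It is a much cruder stand-in than the McCrary density test, and the function names and data are hypothetical.

```python
from math import comb


def window_counts(values, cutoff, width):
    """Count observations in [cutoff - width, cutoff) and [cutoff, cutoff + width)."""
    below = sum(1 for v in values if cutoff - width <= v < cutoff)
    above = sum(1 for v in values if cutoff <= v < cutoff + width)
    return below, above


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k successes in n fair coin flips."""
    tail = min(k, n - k)
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


# Toy running-variable values (made up for this example).
balanced = [45, 46, 47, 48, 49] * 2 + [50, 51, 52, 53, 54] * 2
bunched = [46, 48] + [50] * 10 + [51] * 5 + [52, 53, 54]

b_lo, b_hi = window_counts(balanced, cutoff=50, width=5)
p_balanced = binomial_two_sided_p(b_lo, b_lo + b_hi)

m_lo, m_hi = window_counts(bunched, cutoff=50, width=5)
p_bunched = binomial_two_sided_p(m_lo, m_lo + m_hi)
```

A small p-value, as in the bunched sample, suggests heaping on one side of the cutoff and calls the RDD estimate into question. The equal-split benchmark is only reasonable when the window is narrow enough that the density is approximately flat across it.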

When to Use Which Method

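| Method | Best For | Key Assumption |
| --- | --- | --- |
| DiD | Policy changes with before/after data | Parallel trends: absent treatment, both groups' outcomes would have moved in parallel (similar pre-period trends support this but do not prove it) |
| PSM | Observational data with rich covariates | Selection on observables only, with overlap in propensity scores between groups |
| RDD | Programs with eligibility cutoffs | No manipulation of the running variable, and outcomes that would vary smoothly across the cutoff without treatment |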
