Propensity Score Matching (PSM) helps reduce selection bias in observational studies by matching treatment and control units with similar propensity scores.
pysmatch is an improved and extended version of pymatch, with modernized modeling, modularized matching utilities, and better support for reproducible workflows.
- Multiple score models: Logistic Regression, KNN, CatBoost
- Flexible balancing: oversampling and undersampling (balance_strategy)
- Standard and exhaustive matching workflows
- Balance diagnostics for categorical and continuous covariates
- Optional Optuna tuning for automated model search
Install from PyPI:
pip install pysmatch

Install optional extras:
pip install "pysmatch[tree]" # CatBoost support
pip install "pysmatch[tune]" # Optuna support
pip install "pysmatch[all]" # all optional dependencies

Install from source:
git clone https://github.com/miaohancheng/pysmatch.git
cd pysmatch
pip install -e ".[all]"

This minimal example runs the full core path with the built-in demo dataset (misc/loan.csv).
import warnings
warnings.filterwarnings("ignore")
import numpy as np
import pandas as pd
from pysmatch.Matcher import Matcher
np.random.seed(42)
data = pd.read_csv("misc/loan.csv")
test = data[data.loan_status == "Default"].copy()
control = data[data.loan_status == "Fully Paid"].copy()
matcher = Matcher(
test=test,
control=control,
yvar="is_default",
exclude=["loan_status"],
)
matcher.fit_scores(
balance=True,
balance_strategy="over",
nmodels=10,
model_type="linear",
n_jobs=2,
)
matcher.predict_scores()
matcher.match(method="min", nmatches=1, threshold=0.001, replacement=False)
print(matcher.matched_data.head())

If this works, continue to the full workflow below.
Use domain-relevant covariates and avoid leaking post-treatment variables into matching features.
import pandas as pd
fields = [
"loan_amnt",
"funded_amnt",
"funded_amnt_inv",
"term",
"int_rate",
"installment",
"grade",
"sub_grade",
"loan_status",
]
raw = pd.read_csv("misc/loan.csv", usecols=fields)
test = raw[raw.loan_status == "Default"].copy()
control = raw[raw.loan_status == "Fully Paid"].copy()

from pysmatch.Matcher import Matcher
matcher = Matcher(
test=test,
control=control,
yvar="is_default",
exclude=["loan_status"],
)
print("xvars:", matcher.xvars)
print("test/control:", matcher.testn, matcher.controln)

fit_scores supports three model types:
- linear (logistic regression)
- knn
- tree (CatBoost, requires pysmatch[tree])
matcher.fit_scores(
balance=True,
balance_strategy="over", # "over" or "under"
nmodels=10,
model_type="linear",
max_iter=200,
n_jobs=2,
)
print("models:", len(matcher.models))
print("avg validation accuracy:", sum(matcher.model_accuracy) / len(matcher.model_accuracy))

Optuna path (single tuned model):
# matcher.fit_scores(
# balance=True,
# model_type="tree",
# use_optuna=True,
# n_trials=20,
# )

matcher.predict_scores()
matcher.plot_scores()

matcher.data now contains a scores column.
import numpy as np
matcher.tune_threshold(
method="min",
nmatches=1,
rng=np.arange(0.0001, 0.0051, 0.0005),
)

Choose a threshold that balances quality and retained sample size.
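To see that trade-off outside of pysmatch, you can count how many treated units have at least one control within each candidate threshold. The sketch below uses synthetic scores; `retention_by_threshold` is an illustrative helper, not part of the pysmatch API, and it ignores replacement, so it gives an upper bound on retention.

```python
import numpy as np

def retention_by_threshold(test_scores, control_scores, thresholds):
    """Fraction of treated units with at least one control within each threshold."""
    test_scores = np.asarray(test_scores, dtype=float)
    control_scores = np.sort(np.asarray(control_scores, dtype=float))
    # Locate each treated score among the sorted controls (binary search).
    idx = np.searchsorted(control_scores, test_scores)
    last = len(control_scores) - 1
    left = control_scores[np.clip(idx - 1, 0, last)]
    right = control_scores[np.clip(idx, 0, last)]
    # Distance to the nearest control on either side.
    nearest = np.minimum(np.abs(test_scores - left), np.abs(test_scores - right))
    return {float(t): float(np.mean(nearest <= t)) for t in thresholds}

# Synthetic propensity scores (assumption: stand-ins for matcher.data scores).
rng = np.random.default_rng(42)
test_scores = rng.beta(4, 2, size=200)
control_scores = rng.beta(2, 4, size=1000)

retention = retention_by_threshold(
    test_scores, control_scores, np.arange(0.0001, 0.0051, 0.0005)
)
for t, frac in retention.items():
    print(f"threshold={t:.4f} retained={frac:.2%}")
```

Retention can only grow as the threshold loosens, so the useful question is where the curve flattens: past that point a wider threshold adds few units and mostly admits poorer matches.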
Standard matching:
matcher.match(
method="min",
nmatches=1,
threshold=0.001,
replacement=False,
exhaustive_matching=False,
)
matcher.plot_matched_scores()

Exhaustive matching:
matcher.match(
threshold=0.001,
nmatches=1,
exhaustive_matching=True,
)

print(matcher.matched_data.head())
print(matcher.record_frequency().head())
matcher.assign_weight_vector()
print(matcher.matched_data[["record_id", "match_id", "weight"]].head())

- Standard (exhaustive_matching=False): uses nearest-neighbor style control selection with configurable method/replacement behavior.
- Exhaustive (exhaustive_matching=True): prioritizes wider control utilization while still respecting threshold constraints.

- threshold: max allowed score distance
- nmatches: controls per treated unit
- replacement: whether a control can be reused
- method: "min" (closest) or "random" (random within threshold)
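The sketch below shows how these four parameters interact in a simple greedy matcher. It is a conceptual illustration only and not pysmatch's implementation: `greedy_match` and its `seed` argument are hypothetical, and the real library may order or select controls differently.

```python
import numpy as np

def greedy_match(test_scores, control_scores, threshold=0.001, nmatches=1,
                 replacement=False, method="min", seed=0):
    """Return {test_index: [control_index, ...]} for controls within `threshold`."""
    rng = np.random.default_rng(seed)
    control_scores = np.asarray(control_scores, dtype=float)
    available = np.ones(len(control_scores), dtype=bool)
    matches = {}
    for i, score in enumerate(test_scores):
        dist = np.abs(control_scores - score)
        candidates = np.flatnonzero((dist <= threshold) & available)
        if candidates.size == 0:
            continue  # no control inside the threshold: this treated unit is dropped
        if method == "min":
            # Closest controls first.
            order = candidates[np.argsort(dist[candidates], kind="stable")]
        else:
            # "random": any control inside the threshold, in random order.
            order = rng.permutation(candidates)
        chosen = order[:nmatches]
        matches[i] = chosen.tolist()
        if not replacement:
            available[chosen] = False  # each control can be used only once
    return matches

test_scores = [0.30, 0.31, 0.90]
control_scores = [0.305, 0.32, 0.50]
print(greedy_match(test_scores, control_scores, threshold=0.03))
print(greedy_match(test_scores, control_scores, threshold=0.03, replacement=True))
```

With replacement=False the second treated unit has to take its second-closest control, and the third treated unit is dropped in both runs because no control falls inside the threshold.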
- Start with nmatches=1, replacement=False, and a moderate threshold.
- If retention is too low, loosen threshold gradually.
- If balance is weak after matching, tighten threshold or change model/balance strategy.
- For severe class imbalance, test balance_strategy="under" as sensitivity analysis.
After matching, evaluate covariate balance before causal analysis.
cat_table = matcher.compare_categorical(return_table=True, plot_result=True)
print(cat_table)

Interpretation:
- check before/after p-value shifts
- look for reduced proportional differences after matching
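To make "proportional differences" concrete, the sketch below computes the largest gap in category shares between the two groups, before and after matching. The toy frames and the `proportion_gap` helper are assumptions for illustration; compare_categorical produces its own table, and the p-value part is not reproduced here.

```python
import pandas as pd

def proportion_gap(df, column, group_col="is_default"):
    """Largest absolute difference in category shares between groups 1 and 0."""
    # Each column of `shares` sums to 1: the category mix inside one group.
    shares = pd.crosstab(df[column], df[group_col], normalize="columns")
    return float((shares[1] - shares[0]).abs().max())

# Toy data: 100 controls (0) followed by 100 treated units (1).
before = pd.DataFrame({
    "grade": ["A"] * 60 + ["B"] * 40 + ["A"] * 20 + ["B"] * 80,
    "is_default": [0] * 100 + [1] * 100,
})
after = pd.DataFrame({
    "grade": ["A"] * 25 + ["B"] * 75 + ["A"] * 20 + ["B"] * 80,
    "is_default": [0] * 100 + [1] * 100,
})

print("gap before matching:", round(proportion_gap(before, "grade"), 3))
print("gap after matching:", round(proportion_gap(after, "grade"), 3))
```

A gap that shrinks toward zero after matching means the matched control group has a grade mix close to the treated group's.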
cont_table = matcher.compare_continuous(return_table=True, plot_result=True)
print(cont_table)

Interpretation:
- compare KS statistics and grouped permutation test p-values
- monitor standardized mean/median differences pre vs post matching
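For intuition about those two diagnostics, here is a minimal NumPy sketch of a standardized mean difference and a two-sample KS statistic. The function names and synthetic samples are illustrative assumptions, and the exact formulas in compare_continuous may differ (for example in how the standard deviation is pooled).

```python
import numpy as np

def standardized_mean_diff(x_test, x_control):
    """Mean difference divided by the pooled standard deviation."""
    x_test = np.asarray(x_test, dtype=float)
    x_control = np.asarray(x_control, dtype=float)
    pooled_sd = np.sqrt((x_test.var(ddof=1) + x_control.var(ddof=1)) / 2.0)
    return float((x_test.mean() - x_control.mean()) / pooled_sd)

def ks_statistic(x_test, x_control):
    """Largest vertical gap between the two empirical CDFs."""
    x_test, x_control = np.sort(x_test), np.sort(x_control)
    grid = np.concatenate([x_test, x_control])
    cdf_test = np.searchsorted(x_test, grid, side="right") / len(x_test)
    cdf_control = np.searchsorted(x_control, grid, side="right") / len(x_control)
    return float(np.max(np.abs(cdf_test - cdf_control)))

# Synthetic covariate: treated units, all controls, and a stand-in for matched controls.
rng = np.random.default_rng(0)
test = rng.normal(1.0, 1.0, 500)
control_all = rng.normal(0.0, 1.0, 5000)
control_matched = rng.normal(0.95, 1.0, 500)

for label, control in [("before", control_all), ("after", control_matched)]:
    smd = standardized_mean_diff(test, control)
    ks = ks_statistic(test, control)
    print(f"{label}: SMD={smd:.3f} KS={ks:.3f}")
```

Both numbers should move toward zero after matching; a covariate whose values stay large is a candidate for a tighter threshold or a different score model.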
print(matcher.prop_test("grade"))

This is usually a NumPy/Pandas binary compatibility issue.
pip install --upgrade --force-reinstall "numpy>=1.26.4" "pandas>=2.1.4"

Restart your Python kernel/session after reinstalling.
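Before or after reinstalling, it can help to confirm which versions the active environment actually resolves. The sketch below uses only the standard library; `check_versions` and `parse_version` are illustrative helpers, and the minimums simply mirror the pip command above. Meeting them does not by itself guarantee binary compatibility.

```python
import re
from importlib import metadata

# Minimum versions taken from the reinstall command above.
MINIMUMS = {"numpy": (1, 26, 4), "pandas": (2, 1, 4)}

def parse_version(text):
    """Leading numeric components of a version string, e.g. '2.1.4' -> (2, 1, 4)."""
    public = text.split("+")[0]  # drop any local build suffix
    return tuple(int(part) for part in re.findall(r"\d+", public)[:3])

def check_versions(minimums=MINIMUMS):
    """Map each package name to a short status string."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = parse_version(installed) >= minimum
        report[name] = f"{installed} ({'ok' if ok else 'below minimum'})"
    return report

for package, status in check_versions().items():
    print(f"{package}: {status}")
```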
Run predict_scores() before match().
matcher.fit_scores(...)
matcher.predict_scores()
matcher.match(...)

Use a repo-relative path:
pd.read_csv("misc/loan.csv")

Usually the threshold is too strict or the groups overlap weakly.
- increase threshold
- try a different model_type
- inspect score distributions with plot_scores()
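As a numeric complement to the score plot, the sketch below estimates the common-support range of the two score distributions and the share of treated units inside it. `common_support` is a hypothetical helper and the scores are synthetic; with real data you would pass the test and control scores from matcher.data.

```python
import numpy as np

def common_support(test_scores, control_scores):
    """Return (low, high, share of treated units inside [low, high])."""
    test_scores = np.asarray(test_scores, dtype=float)
    control_scores = np.asarray(control_scores, dtype=float)
    low = max(test_scores.min(), control_scores.min())
    high = min(test_scores.max(), control_scores.max())
    if low > high:
        # The two score ranges do not intersect at all.
        return float(low), float(high), 0.0
    inside = np.mean((test_scores >= low) & (test_scores <= high))
    return float(low), float(high), float(inside)

rng = np.random.default_rng(1)
test_scores = rng.beta(5, 2, size=300)      # treated units skew toward high scores
control_scores = rng.beta(2, 5, size=3000)  # controls skew toward low scores

low, high, share = common_support(test_scores, control_scores)
print(f"common support: [{low:.3f}, {high:.3f}], treated inside: {share:.1%}")
```

If only a small share of treated units falls inside the overlap, loosening the threshold will not help much; the score model or covariate set is the more likely lever.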
If your notebook kernel name is unavailable, switch to an existing kernel (python3) and rerun cells.
- Start with linear for strong baseline interpretability.
- Use tree for nonlinear relationships and mixed feature types.
- Use knn as a local-structure baseline and compare sensitivity.
Not necessarily. Very high separability may indicate weak overlap, which can reduce matchability. Balance diagnostics matter more than raw classifier accuracy.
- over: usually keeps more majority information; good default.
- under: faster/smaller training sets; useful for sensitivity checks.
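The difference between the two strategies can be sketched with plain pandas resampling. This illustrates the general idea only and is not how pysmatch balances its training sets internally; `rebalance` and the toy frame are assumptions.

```python
import pandas as pd

def rebalance(df, label_col, strategy="over", seed=42):
    """Return a frame in which both classes of `label_col` have equal counts."""
    counts = df[label_col].value_counts()
    if counts.nunique() == 1:
        return df.copy()  # already balanced
    small = df[df[label_col] == counts.idxmin()]
    large = df[df[label_col] == counts.idxmax()]
    if strategy == "over":
        # Resample the minority class with replacement up to the majority size.
        small = small.sample(n=len(large), replace=True, random_state=seed)
    elif strategy == "under":
        # Randomly drop majority rows down to the minority size.
        large = large.sample(n=len(small), replace=False, random_state=seed)
    else:
        raise ValueError("strategy must be 'over' or 'under'")
    return pd.concat([small, large], ignore_index=True)

df = pd.DataFrame({"is_default": [1] * 10 + [0] * 90, "loan_amnt": range(100)})
over = rebalance(df, "is_default", strategy="over")
under = rebalance(df, "is_default", strategy="under")
print("over:", over["is_default"].value_counts().to_dict())
print("under:", under["is_default"].value_counts().to_dict())
```

Oversampling keeps every majority row but repeats minority rows, while undersampling discards most of the majority class, which is why it trains faster and is better suited to sensitivity checks.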
- set np.random.seed(...)
- keep fixed package versions
- record model/matching parameters in experiment logs
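One lightweight way to follow these points is to write a small JSON record next to each run. The sketch below uses only the standard library; `build_run_record` and the record layout are illustrative assumptions, and the parameter values are copied from the examples in this README. Setting np.random.seed(seed) itself still happens in your analysis script.

```python
import json
import platform
from importlib import metadata

def installed_version(package):
    """Installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def build_run_record(seed, fit_params, match_params,
                     packages=("numpy", "pandas", "pysmatch")):
    """Collect everything needed to rerun the same matching experiment."""
    return {
        "seed": seed,
        "python": platform.python_version(),
        "packages": {name: installed_version(name) for name in packages},
        "fit_scores": fit_params,
        "match": match_params,
    }

record = build_run_record(
    seed=42,
    fit_params={"balance": True, "balance_strategy": "over",
                "nmodels": 10, "model_type": "linear"},
    match_params={"method": "min", "nmatches": 1,
                  "threshold": 0.001, "replacement": False},
)
print(json.dumps(record, indent=2, sort_keys=True))
```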
- Sekhon, J. S. (2011). Multivariate and propensity score matching software with automated balance optimization: The Matching package for R. Journal of Statistical Software, 42(7), 1-52.
- Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41-55.
Contributions are welcome. Please open an issue or pull request in this repository.
pysmatch is released under the MIT License.