Propensity Score Matching (PSM) helps estimate the effect of a treatment or policy from observational data by making treated and control groups comparable on observed covariates.
Core Idea
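PSM reduces selection bias by pairing each treated unit with one or more untreated units that have a similar propensity score P(Treatment|X), the probability of receiving treatment given the observed covariates X. Once the matched groups are balanced on X, the difference in their outcomes estimates the causal effect, provided the assumptions listed below hold.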
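As a rough illustration of the first step, the sketch below estimates P(Treatment|X) with a logistic regression fitted by plain gradient descent. It is a dependency-free toy, not a recommended implementation: the function names (`fit_propensity`, `predict_propensity`), the learning rate, the iteration count, and the synthetic data are all assumptions made for this example. In practice a library model would normally be used instead.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_propensity(X, t, lr=0.5, n_iter=500):
    """Fit P(T=1 | X) by gradient descent on the logistic log-loss.

    X is a list of covariate rows, t a list of 0/1 treatment flags.
    Hypothetical helper written for this example only.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, ti in zip(X, t):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - ti
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
    return w, b

def predict_propensity(X, w, b):
    # Estimated propensity score for every row of X.
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]

# Synthetic data (made up): one covariate that raises the chance of treatment.
rng = random.Random(0)
X = [[rng.gauss(0, 1)] for _ in range(300)]
t = [1 if rng.random() < sigmoid(1.5 * x[0]) else 0 for x in X]

w, b = fit_propensity(X, t)
scores = predict_propensity(X, w, b)
```

Only pre-treatment covariates should enter `X`; the resulting `scores` are what the matching step operates on.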
Assumptions
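- Unconfoundedness: Conditional on the observed covariates X, treatment assignment is independent of the potential outcomes (as good as random); there are no unmeasured confounders.
- Overlap: Every unit has a probability of treatment strictly between 0 and 1 given X, so both treated and control units can occur at each covariate value.
- Stable Unit Treatment Value Assumption (SUTVA): One unit's treatment does not affect another unit's outcome (no interference), and the treatment is well defined, with no hidden versions.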
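The three assumptions can be written compactly in potential-outcomes notation. The symbols here are introduced for this sketch and are not part of the original text: T is the treatment indicator, Y(1) and Y(0) are the potential outcomes under treatment and control, and e(X) is the propensity score.

```latex
\begin{align*}
\text{Propensity score:} \quad & e(X) = P(T = 1 \mid X) \\
\text{Unconfoundedness:} \quad & \bigl(Y(0),\, Y(1)\bigr) \perp\!\!\!\perp T \mid X \\
\text{Overlap:} \quad & 0 < e(X) < 1 \\
\text{SUTVA:} \quad & Y_i = T_i \, Y_i(1) + (1 - T_i) \, Y_i(0)
\end{align*}
```

The last line says that a unit's observed outcome depends only on its own treatment status, which is what rules out interference between units.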
Matching Strategies
- Nearest-neighbor (with/without replacement)
- Caliper matching (maximum allowed propensity score distance); ratio matching (1:k controls per treated unit)
- Common support enforcement and trimming
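A minimal sketch of one of these strategies, greedy 1:1 nearest-neighbor matching without replacement and with a caliper, might look like the following. The function name `match_nearest`, the caliper of 0.05 on the raw score scale, and the toy scores are assumptions chosen for illustration; greedy matching is also order-dependent, so this is not the only reasonable design.

```python
def match_nearest(scores, treated, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    Matches without replacement. A treated unit whose closest remaining
    control is farther away than `caliper` is left unmatched.
    Returns a list of (treated_index, control_index) pairs.
    """
    treated_idx = [i for i, ti in enumerate(treated) if ti == 1]
    available = {i for i, ti in enumerate(treated) if ti == 0}
    pairs = []
    for i in treated_idx:
        if not available:
            break
        # Closest control that has not been used yet.
        j = min(available, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

# Toy propensity scores (made up for this example).
scores  = [0.80, 0.78, 0.30, 0.33, 0.55, 0.95, 0.10]
treated = [1,    0,    1,    0,    0,    1,    0]

pairs = match_nearest(scores, treated, caliper=0.05)
print(pairs)  # [(0, 1), (2, 3)]
```

The treated unit with score 0.95 has no control within the caliper and is dropped, which changes the population the estimate describes.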
Diagnostics
- Standardized mean differences (target |SMD| ≤ 0.1)
- Propensity score distribution overlap
- Effective sample size and matched pair counts
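Two of these diagnostics are simple enough to sketch directly. The example below assumes the common pooled-standard-deviation form of the SMD and Kish's formula for effective sample size. The function names and the toy numbers are illustrative only.

```python
import math
from statistics import mean, variance

def smd(x_treated, x_control):
    """Standardized mean difference for one covariate.

    Uses the pooled standard deviation sqrt((var_t + var_c) / 2).
    """
    pooled_sd = math.sqrt((variance(x_treated) + variance(x_control)) / 2)
    return (mean(x_treated) - mean(x_control)) / pooled_sd

def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

# Toy covariate values (made up): age in the two matched groups.
age_treated = [30, 40, 50]
age_control = [28, 38, 48]
balance = smd(age_treated, age_control)

# With replacement, a control matched twice carries weight 2.
ess = effective_sample_size([1, 1, 2])
```

Here `balance` is about 0.2, above the 0.1 target, and `ess` is about 2.67 even though the weights sum to 4.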
Interpretation
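Report the estimand explicitly (ATT or ATE) with confidence intervals; matching treated units to controls typically targets the ATT. State the population the estimate applies to (matched sample vs. full population), and report subgroup estimates if effect heterogeneity is expected.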
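As a sketch of how a point estimate and interval could be produced from 1:1 matched pairs, the example below averages the within-pair outcome differences and attaches a percentile bootstrap interval by resampling pairs. This is a deliberately naive approach: it ignores the uncertainty from estimating the propensity score and from the matching step itself, so the interval should be read as rough. The function name, defaults, and data are assumptions for this example.

```python
import random
from statistics import mean

def att_with_bootstrap_ci(pair_diffs, n_boot=2000, alpha=0.05, seed=0):
    """ATT estimate from matched-pair differences with a percentile CI.

    pair_diffs holds (treated outcome - matched control outcome) per pair.
    Pairs are resampled with replacement; the matching is not redone.
    """
    rng = random.Random(seed)
    n = len(pair_diffs)
    estimate = mean(pair_diffs)
    boots = sorted(mean(rng.choices(pair_diffs, k=n)) for _ in range(n_boot))
    lower = boots[int((alpha / 2) * n_boot)]
    upper = boots[int((1 - alpha / 2) * n_boot) - 1]
    return estimate, (lower, upper)

# Toy within-pair outcome differences (made up).
diffs = [2.0, 1.0, 3.0, 2.5, 1.5, 2.0, 2.2, 1.8]
att, (ci_low, ci_high) = att_with_bootstrap_ci(diffs)
```

The estimate describes the matched treated units only; treated units dropped during matching are not covered by it.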
Common Pitfalls
- Including post-treatment covariates (biases estimates)
- Ignoring poor overlap or extreme propensities
- Stopping at “matched” without verifying balance
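The overlap pitfall can be checked before any matching is done. The sketch below applies one simple rule, assumed here for illustration: keep only units whose propensity score lies inside the range shared by both groups. The function name `common_support` and the toy scores are hypothetical, and other trimming rules (for example fixed cut-offs on extreme scores) are equally plausible.

```python
def common_support(scores, treated):
    """Min-max common support on the propensity score.

    Returns (low, high, keep), where keep lists the indices of units whose
    score lies in the interval covered by both treated and control groups.
    """
    t_scores = [s for s, ti in zip(scores, treated) if ti == 1]
    c_scores = [s for s, ti in zip(scores, treated) if ti == 0]
    low = max(min(t_scores), min(c_scores))
    high = min(max(t_scores), max(c_scores))
    keep = [i for i, s in enumerate(scores) if low <= s <= high]
    return low, high, keep

# Toy propensity scores (made up): controls sit low, treated sit high.
scores  = [0.05, 0.20, 0.40, 0.60, 0.35, 0.70, 0.90, 0.98]
treated = [0,    0,    0,    0,    1,    1,    1,    1]

low, high, keep = common_support(scores, treated)
print(low, high, keep)  # 0.35 0.6 [2, 3, 4]
```

Only three of eight units survive in this toy case, a sign that the groups barely overlap and that any matched estimate would rest on very little data.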
When to Consider Alternatives
Use Difference-in-Differences for panel data when the parallel-trends assumption is plausible, or Synthetic Control when there are only a few treated units and rich pre-treatment time series.