Transform observational data into causal insights by creating statistically balanced treatment and control groups
Creates comparable groups by matching on propensity scores, reducing bias from observed confounding variables. Shows standardized differences before and after matching.
Calculates Average Treatment Effect on Treated (ATT) with bootstrap confidence intervals, showing the causal impact of treatment on your outcome.
Propensity score distributions by group, love plots showing balance improvement, common support assessment, matching quality metrics, and sample attrition analysis.
Your data needs a treatment_column (binary 0/1 for control/treated), an outcome_column (numeric result to measure), and multiple covariates (pre-treatment characteristics for matching).
Data format: Each row is one unit (customer, patient, etc.). Include all variables that might influence who gets treated: age, income, prior behavior, risk factors. The algorithm uses logistic regression to estimate propensity scores, then matches similar units.
Minimum requirements: At least 50 treated and 50 control units, ideally 1000+ total observations. At least 2-3 covariates are needed for meaningful matching. Including more relevant covariates reduces confounding bias but requires larger samples to achieve balance.
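The requirements above can be checked before running an analysis. The following is a minimal sketch, assuming the data is a list of dictionaries; the function name check_psm_input and the column names "treated", "spend", "age", and "income" are hypothetical and not part of the tool.

```python
def check_psm_input(rows, treatment_column, outcome_column, covariates,
                    min_per_group=50):
    """Check rows (a list of dicts) against the stated data requirements."""
    problems = []
    n_treated = n_control = 0
    for i, row in enumerate(rows):
        # Treatment must be binary: 0 = control, 1 = treated
        t = row.get(treatment_column)
        if t == 1:
            n_treated += 1
        elif t == 0:
            n_control += 1
        else:
            problems.append(f"row {i}: {treatment_column} must be 0 or 1, got {t!r}")
        # Outcome must be numeric (booleans are rejected on purpose)
        y = row.get(outcome_column)
        if isinstance(y, bool) or not isinstance(y, (int, float)):
            problems.append(f"row {i}: {outcome_column} must be numeric, got {y!r}")
        # Every covariate must be present for matching
        missing = [c for c in covariates if row.get(c) is None]
        if missing:
            problems.append(f"row {i}: missing covariates {missing}")
    if n_treated < min_per_group:
        problems.append(f"only {n_treated} treated units, need {min_per_group}")
    if n_control < min_per_group:
        problems.append(f"only {n_control} control units, need {min_per_group}")
    if len(covariates) < 2:
        problems.append("at least 2 covariates are needed")
    return {"n_treated": n_treated, "n_control": n_control, "problems": problems}


# Hypothetical example: 120 customers, alternating control and treated
rows = [
    {"treated": i % 2, "spend": 100.0 + i, "age": 30 + i % 40, "income": 40000 + 500 * i}
    for i in range(120)
]
report = check_psm_input(rows, "treated", "spend", ["age", "income"])
print(report["n_treated"], report["n_control"], report["problems"])
```

An empty problems list means the data meets the minimum thresholds; it does not by itself mean the covariates capture every factor that influences treatment.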
What you get: Average Treatment Effect on Treated (ATT) with confidence intervals, balance diagnostics showing covariate improvement, matched dataset for further analysis, visual diagnostics including love plots.
A rigorous statistical pipeline that transforms raw observational data into reliable causal estimates
Uses logistic regression to calculate each unit's probability of receiving treatment based on its covariates, creating a single balancing score for matching.
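To make the propensity score step concrete, here is a minimal sketch of logistic regression fitted by plain gradient descent on synthetic data. The function names, learning rate, iteration count, and the simulated covariates are all assumptions for illustration; the tool's actual fitting routine is not documented here and may differ.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_propensity_model(X, t, lr=1.0, n_iter=300):
    """Fit logistic regression by batch gradient descent.

    X: list of covariate lists (assumed roughly standardized).
    t: list of 0/1 treatment indicators.
    Returns the weights, intercept first.
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, ti in zip(X, t):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - ti  # gradient of the log-loss with respect to the logit
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity_scores(X, w):
    """Estimated probability of treatment for each unit."""
    return [sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) for xi in X]


# Synthetic data: treatment is more likely when x1 is high and x2 is low
random.seed(0)
X, t = [], []
for _ in range(400):
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)
    X.append([x1, x2])
    t.append(1 if random.random() < sigmoid(1.0 * x1 - 1.0 * x2) else 0)

w = fit_propensity_model(X, t)
scores = propensity_scores(X, w)
```

Each unit ends up with one number between 0 and 1, which is what the matching step compares instead of the full set of covariates.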
Creates comparable groups using nearest-neighbor or caliper matching, then verifies covariate balance through standardized mean differences and overlap diagnostics.
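The matching and balance check can be sketched as follows. This example assumes greedy 1:1 nearest-neighbor matching without replacement and a caliper of 0.05 on the propensity score; the matching variant, the caliper value, the function names, and the toy scores and ages are illustrative assumptions rather than the tool's documented defaults.

```python
import math
import statistics


def match_nearest_neighbor(scores, treatment, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (treated_index, control_index) pairs whose propensity
    scores differ by at most `caliper`. Unmatched treated units are dropped.
    """
    treated = [i for i, t in enumerate(treatment) if t == 1]
    available = {i for i, t in enumerate(treatment) if t == 0}
    pairs = []
    for i in treated:
        if not available:
            break
        j = min(available, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            available.remove(j)  # each control is used at most once
    return pairs


def standardized_mean_difference(x_treated, x_control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt(
        (statistics.variance(x_treated) + statistics.variance(x_control)) / 2
    )
    return (statistics.mean(x_treated) - statistics.mean(x_control)) / pooled_sd


# Toy data: four treated units followed by four controls
scores = [0.95, 0.80, 0.60, 0.30, 0.78, 0.58, 0.33, 0.10]
treatment = [1, 1, 1, 1, 0, 0, 0, 0]
age = [62, 55, 48, 35, 54, 47, 36, 22]

pairs = match_nearest_neighbor(scores, treatment)

smd_before = standardized_mean_difference(
    [a for a, t in zip(age, treatment) if t == 1],
    [a for a, t in zip(age, treatment) if t == 0],
)
smd_after = standardized_mean_difference(
    [age[i] for i, _ in pairs],
    [age[j] for _, j in pairs],
)
print(pairs, round(smd_before, 2), round(smd_after, 2))
```

In this toy data the treated unit with score 0.95 has no control within the caliper and is dropped, which is the kind of sample attrition the diagnostics report. An absolute standardized mean difference below 0.1 is a common rule of thumb for acceptable balance.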
Computes Average Treatment Effect on Treated (ATT) using matched samples, with bootstrap confidence intervals (1000 iterations) for robust uncertainty estimates.
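As a rough illustration of the estimation step, the sketch below computes the ATT as the mean outcome difference within matched pairs and builds a percentile bootstrap interval by resampling pairs. Resampling already-matched pairs is a simplifying assumption: it ignores uncertainty from the propensity model and the matching step, and the tool may instead repeat those steps in each iteration. The function names and the outcome values are hypothetical.

```python
import random
import statistics


def att_from_pairs(pairs):
    """pairs: list of (treated_outcome, matched_control_outcome)."""
    return statistics.mean(yt - yc for yt, yc in pairs)


def bootstrap_att_ci(pairs, n_boot=1000, seed=0):
    """Percentile bootstrap 95% confidence interval for the ATT."""
    rng = random.Random(seed)
    n = len(pairs)
    estimates = [
        att_from_pairs([pairs[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    ]
    # With n=40 the first and last cut points are the 2.5th and 97.5th percentiles
    cuts = statistics.quantiles(estimates, n=40)
    return att_from_pairs(pairs), (cuts[0], cuts[-1])


# Hypothetical matched outcomes: (treated, control)
matched = [
    (12.0, 10.0), (15.0, 11.5), (9.0, 8.0), (14.0, 12.5),
    (11.0, 9.0), (13.5, 10.5), (10.0, 9.5), (16.0, 13.0),
]
att, (ci_low, ci_high) = bootstrap_att_ci(matched)
print(att, ci_low, ci_high)
```

An interval that excludes zero suggests an effect among the treated, provided the assumptions behind the matching hold.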
When randomized experiments aren't feasible, this method separates treatment effects from selection bias on observed covariates, supporting causal conclusions from observational data.
Critical for decisions like expanding policy changes, optimizing marketing campaigns, or evaluating medical interventions. By enforcing statistical balance before comparison, the method addresses the fundamental counterfactual question: "What would have happened without the treatment?" Explicit balance checks and visual diagnostics make the analysis transparent, which builds stakeholder confidence in data-driven decisions.
Key Assumptions: No unmeasured confounders (all relevant variables included), the stable unit treatment value assumption (SUTVA: one unit's treatment does not affect another unit's outcome), and sufficient propensity score overlap between groups.
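Of these assumptions, only overlap can be checked directly from the data. A minimal sketch of one common check follows, assuming the common support region is defined as the range where treated and control propensity scores both occur; the function name and the scores are hypothetical, and other definitions of common support exist.

```python
def common_support(scores, treatment):
    """Return the overlapping score range and a flag per unit for being inside it."""
    treated = [s for s, t in zip(scores, treatment) if t == 1]
    control = [s for s, t in zip(scores, treatment) if t == 0]
    low = max(min(treated), min(control))
    high = min(max(treated), max(control))
    inside = [low <= s <= high for s in scores]
    return (low, high), inside


# Toy data: four treated units followed by four controls
scores = [0.95, 0.80, 0.60, 0.30, 0.78, 0.58, 0.33, 0.10]
treatment = [1, 1, 1, 1, 0, 0, 0, 0]

(low, high), inside = common_support(scores, treatment)
n_outside = inside.count(False)
print(low, high, n_outside)
```

Units outside the range have no comparable counterpart in the other group. The other two assumptions cannot be verified this way and rest on subject-matter knowledge.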
Turn your observational data into actionable causal insights
Read the article: Propensity Score Matching