What is Survival Analysis?
Survival analysis is a branch of statistics for analyzing the time until an event occurs. Originally developed for medical research, it is now widely used in business for customer churn, equipment reliability, and risk assessment.
Key Concepts
Survival Function
The probability that an individual survives beyond time t: S(t) = P(T > t)
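As a minimal sketch of the definition, assuming every subject's event time is fully observed (no censoring), S(t) can be estimated as the fraction of subjects whose event time exceeds t. The function name and the sample durations are hypothetical.

```python
def empirical_survival(durations, t):
    """Estimate S(t) = P(T > t) as the share of durations strictly greater than t.

    Assumes no censoring: every duration is an observed event time.
    """
    return sum(1 for d in durations if d > t) / len(durations)


# Hypothetical data: months until churn for five customers.
durations = [2, 5, 5, 8, 12]
s_at_5 = empirical_survival(durations, 5)  # 2 of 5 durations exceed 5 -> 0.4
```

With censored data this simple fraction is biased, which is why estimators such as Kaplan-Meier are used instead.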
Hazard Function
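The instantaneous rate at which the event occurs at time t, given survival up to time t: h(t) = lim(Δt→0) P(t ≤ T < t + Δt | T ≥ t) / Δt. Unlike S(t), the hazard is a rate rather than a probability, so it can exceed 1.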
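The hazard and survival functions are linked by h(t) = -d/dt log S(t). The sketch below checks this numerically for an exponential survival curve S(t) = exp(-λt), whose hazard is the constant λ. The function names, the rate of 0.1, and the step size are illustrative assumptions.

```python
import math


def survival_exponential(t, rate):
    """S(t) for an exponential lifetime with constant hazard `rate`."""
    return math.exp(-rate * t)


def hazard_from_survival(surv, t, dt=1e-6):
    """Approximate h(t) = -d/dt log S(t) with a forward finite difference."""
    return -(math.log(surv(t + dt)) - math.log(surv(t))) / dt


rate = 0.1  # hypothetical event rate per month
h_at_3 = hazard_from_survival(lambda t: survival_exponential(t, rate), 3.0)
h_at_30 = hazard_from_survival(lambda t: survival_exponential(t, rate), 30.0)
# Both values are approximately equal to `rate`: the exponential hazard is flat.
```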
Censoring
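Censoring occurs when the event is not observed for some subjects during the study period, for example because the study ends or the subject drops out before the event happens. Cox regression handles right-censored data naturally.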
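Right-censored data is commonly stored as a pair per subject: the observed duration and a flag saying whether the event was actually seen. A minimal sketch, assuming times are plain numbers on a shared clock and that `to_survival_record` is a hypothetical helper:

```python
def to_survival_record(start, event_time, study_end):
    """Return (duration, event_observed) for one subject.

    event_time is None when the event has not happened at all. An event
    after study_end is treated as unobserved, i.e. right-censored.
    """
    if event_time is not None and event_time <= study_end:
        return event_time - start, True
    # Censored: we only know the subject survived at least this long.
    return study_end - start, False


observed = to_survival_record(start=0, event_time=7, study_end=12)     # (7, True)
censored = to_survival_record(start=3, event_time=None, study_end=12)  # (9, False)
```

Dropping the censored rows, or treating their durations as event times, would bias the estimated survival downward.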
Cox Proportional Hazards Model
The Cox model is semi-parametric, modeling the hazard function as:
h(t|x) = h₀(t) × exp(β₁x₁ + β₂x₂ + ... + βₚxₚ)
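To make the formula concrete, here is a small sketch that evaluates the hazard for two individuals and shows that the baseline h₀(t) cancels in their ratio. The coefficient values, covariates, and function names are made up for illustration and do not come from a fitted model.

```python
import math


def cox_hazard(baseline_hazard, betas, x):
    """h(t|x) = h0(t) * exp(beta . x), with h0(t) passed in as a number."""
    linear_predictor = sum(b * xi for b, xi in zip(betas, x))
    return baseline_hazard * math.exp(linear_predictor)


def hazard_ratio(betas, x_a, x_b):
    """Ratio h(t|x_a) / h(t|x_b); the baseline hazard cancels out."""
    return math.exp(sum(b * (a - c) for b, a, c in zip(betas, x_a, x_b)))


betas = [0.7, -0.2]   # hypothetical coefficients
x_a = [1, 3]          # e.g. on a monthly contract, 3 support tickets
x_b = [0, 3]          # same tickets, not on a monthly contract

ratio_early = cox_hazard(0.05, betas, x_a) / cox_hazard(0.05, betas, x_b)
ratio_late = cox_hazard(0.20, betas, x_a) / cox_hazard(0.20, betas, x_b)
# Both ratios match hazard_ratio(betas, x_a, x_b) = exp(0.7), whatever h0(t) is.
```

Because the ratio does not depend on h₀(t), the coefficients can be estimated without specifying the baseline, which is what makes the model semi-parametric.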
Key Assumptions
- Proportional Hazards: The hazard ratio between any two individuals is constant over time
- Log-linearity: The log hazard is a linear function of the covariates
- Independence: Observations are independent
Business Applications
Customer Churn Analysis
- Predict when customers are likely to churn
- Identify high-risk segments
- Optimize retention timing
Equipment Reliability
- Predict equipment failure times
- Schedule preventive maintenance
- Warranty analysis
Credit Risk
- Time to default modeling
- Loan prepayment analysis
- Portfolio risk assessment
Interpreting Results
Hazard Ratios
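A hazard ratio compares the hazard at one covariate value with the hazard at a reference value, holding the other covariates fixed. HR > 1 indicates increased risk, HR < 1 indicates decreased risk, and HR = 1 indicates no difference:
- HR = 2.0: Twice the hazard of the event at any given time
- HR = 0.5: Half the hazard of the event at any given time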
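In a Cox model the hazard ratio for a one-unit increase in a covariate is exp(β), and an approximate 95% confidence interval is exp(β ± 1.96 × SE). A minimal sketch, where the coefficient and standard error are invented numbers rather than output from a real fit:

```python
import math


def hazard_ratio_with_ci(beta, se, z=1.96):
    """Return (HR, lower, upper) from a log-hazard coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Hypothetical coefficient: beta = log(2), so the hazard ratio is 2.
hr, lower, upper = hazard_ratio_with_ci(beta=math.log(2), se=0.1)
percent_change = (hr - 1) * 100  # about a 100% increase in hazard
```

If the interval from `lower` to `upper` excludes 1, the effect is conventionally read as statistically significant at the 5% level.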
Survival Curves
Kaplan-Meier curves visualize survival probability over time for different groups.
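The Kaplan-Meier estimate multiplies, at each observed event time, the fraction of at-risk subjects that survive it. The sketch below is a plain-Python illustration on invented data; in practice a survival library would normally be used, and `kaplan_meier` here is a hypothetical helper, not a library function.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: observed time for each subject.
    events: True if the event was observed, False if right-censored.
    """
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events (not censorings) occurring exactly at time t.
        failed = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1 - failed / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical data: two of five subjects are censored.
durations = [2, 3, 3, 5, 8]
events = [True, True, False, True, False]
curve = kaplan_meier(durations, events)
# The curve steps down at t = 2, 3, 5 to roughly 0.8, 0.6, 0.3.
```

Censored subjects never trigger a drop in the curve, but they still count in the at-risk set until their censoring time. To compare groups, the same function can be run once per group.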
Model Diagnostics
- Schoenfeld Residuals: Test proportional hazards assumption
- Martingale Residuals: Check functional form of covariates
- Deviance Residuals: Identify outliers
- Concordance Index: Model discrimination ability (0.5 is no better than random ranking, 1.0 is perfect ranking)
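Of these diagnostics, the concordance index is the simplest to compute by hand: it is the share of comparable pairs in which the subject with the higher predicted risk fails first. The sketch below is a simplified version of Harrell's C that ignores tied event times; the data and the function name are illustrative assumptions.

```python
def concordance_index(durations, events, risk_scores):
    """Simplified Harrell's C: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i has an observed event
    strictly before subject j's observed time. Ties in risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        for j in range(n):
            if events[i] and durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical data: the third subject is censored at time 6.
durations = [2, 4, 6, 8]
events = [True, True, False, True]
c_perfect = concordance_index(durations, events, [0.9, 0.7, 0.5, 0.1])   # 1.0
c_reversed = concordance_index(durations, events, [0.1, 0.5, 0.7, 0.9])  # 0.0
```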
Advanced Topics
Time-Varying Covariates
Handle variables that change over time, like customer engagement metrics.
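Time-varying covariates are usually supplied in a "start-stop" (counting process) layout: one row per interval during which the covariate is constant, with the event flag set only on the final interval. A minimal sketch, assuming the change points are sorted, start at 0, and all precede the end time; the field names and the helper are hypothetical.

```python
def to_intervals(subject_id, change_points, values, end_time, event):
    """Split one subject's follow-up into (start, stop] rows.

    change_points[k] is the time at which the covariate takes values[k].
    The event flag is attached to the last interval only.
    """
    rows = []
    last = len(change_points) - 1
    for k, (start, value) in enumerate(zip(change_points, values)):
        stop = end_time if k == last else change_points[k + 1]
        rows.append({
            "id": subject_id,
            "start": start,
            "stop": stop,
            "engagement": value,
            "event": bool(event and k == last),
        })
    return rows


# Hypothetical customer whose engagement score drops at months 3 and 7,
# and who churns at month 9.
rows = to_intervals("c1", change_points=[0, 3, 7], values=[10, 4, 1],
                    end_time=9, event=True)
```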
Stratified Cox Models
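Allow the baseline hazard to vary across strata when the proportional hazards assumption is violated for a categorical variable: each stratum s gets its own baseline h₀ₛ(t), while the coefficients β are shared across strata.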
Competing Risks
Account for multiple event types where the occurrence of one precludes the others (e.g., a customer may churn or upgrade).
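With competing risks, the usual summary is the cumulative incidence function: the probability of having experienced a given event type by time t, allowing for the other types. The sketch below follows the standard nonparametric (Aalen-Johansen style) estimator in plain Python. The cause coding (0 = censored, 1 = churn, 2 = upgrade), the data, and the function name are assumptions for illustration.

```python
def cumulative_incidence(durations, causes, cause_of_interest):
    """Return [(time, CIF)] for one event type under competing risks.

    causes: 0 means censored; any other integer labels an event type.
    """
    times = sorted({t for t, c in zip(durations, causes) if c != 0})
    overall_survival = 1.0  # probability of no event of any type so far
    cif = 0.0
    curve = []
    for t in times:
        at_risk = sum(1 for d in durations if d >= t)
        d_any = sum(1 for d, c in zip(durations, causes) if d == t and c != 0)
        d_k = sum(1 for d, c in zip(durations, causes)
                  if d == t and c == cause_of_interest)
        # Probability of being event-free just before t, then failing from cause k.
        cif += overall_survival * d_k / at_risk
        overall_survival *= 1 - d_any / at_risk
        curve.append((t, cif))
    return curve


# Hypothetical data: 1 = churn, 2 = upgrade, 0 = censored.
durations = [1, 2, 3, 4, 5]
causes = [1, 2, 0, 1, 2]
churn_curve = cumulative_incidence(durations, causes, cause_of_interest=1)
upgrade_curve = cumulative_incidence(durations, causes, cause_of_interest=2)
```

Treating the competing event as censoring and using 1 minus the Kaplan-Meier estimate instead would generally overstate the probability of the event of interest.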