Ridge adds an L2 penalty to linear regression, shrinking coefficients toward zero to reduce variance—particularly helpful when predictors are correlated.
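The penalty can be made concrete with a minimal sketch (assuming scikit-learn and NumPy; the toy data is hypothetical): ridge without an intercept has the closed form w = (XᵀX + αI)⁻¹Xᵀy, which should match `sklearn.linear_model.Ridge`.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical toy data: feature 3 nearly duplicates feature 1 (correlated).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 0] + 0.01 * rng.normal(size=100)
y = X @ np.array([1.0, 2.0, 1.0]) + 0.1 * rng.normal(size=100)

alpha = 1.0
# Closed form: w = (X'X + alpha*I)^(-1) X'y  (intercept omitted for simplicity)
w_closed = np.linalg.solve(X.T @ X + alpha * np.eye(3), X.T @ y)

# sklearn minimizes ||y - Xw||^2 + alpha * ||w||^2, so coefficients agree
model = Ridge(alpha=alpha, fit_intercept=False).fit(X, y)
print(w_closed, model.coef_)
```

The `alpha * np.eye(3)` term is what distinguishes this from ordinary least squares: it inflates the diagonal, conditioning the matrix even when columns are nearly collinear.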
When to Use
- Many correlated predictors (multicollinearity)
- Need stability over sparsity: Ridge shrinks all coefficients but keeps every predictor, whereas Lasso zeroes some out entirely
- Baseline models for forecasting and scoring
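The multicollinearity point can be demonstrated directly (a sketch assuming scikit-learn; the two nearly collinear features are synthetic): OLS coefficients become unstable on correlated inputs, while ridge pulls them back toward the stable solution.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = x1 + 1e-3 * rng.normal(size=200)  # nearly collinear copy of x1
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=200)  # true coefficients: (1, 1)

ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

# OLS can split the shared signal arbitrarily between x1 and x2;
# ridge keeps both coefficients near the true value of 1.
print("OLS:  ", ols.coef_)
print("Ridge:", ridge.coef_)
```

Only the difference direction (x1 − x2) is ill-determined here; the penalty suppresses exactly that direction while leaving the well-identified sum almost untouched.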
Tuning
- Cross‑validate alpha (penalty strength)
- Inspect validation curves for under/over‑regularization
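A minimal sketch of cross-validating alpha, assuming scikit-learn's `RidgeCV` (the data and alpha grid are illustrative):

```python
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 5))
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 0.0]) + 0.5 * rng.normal(size=300)

# Search a log-spaced grid; too-small alpha under-regularizes,
# too-large alpha over-shrinks toward zero.
alphas = np.logspace(-3, 3, 13)
model = RidgeCV(alphas=alphas, cv=5).fit(X, y)
print("selected alpha:", model.alpha_)
```

For a validation curve, `sklearn.model_selection.validation_curve` over the same alpha grid plots train vs. validation error and makes the under/over-regularization regimes visible.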
Preprocessing
- Scale features (z‑score); encode categoricals
- Handle missing data; consider interaction terms if justified
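These steps compose naturally in a pipeline; a sketch assuming scikit-learn and pandas, with a hypothetical two-column frame (`sqft` numeric with a missing value, `city` categorical):

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({
    "sqft": [800, 1200, np.nan, 1500, 950, 1100],
    "city": ["a", "b", "a", "c", "b", "a"],
})
y = [100, 180, 140, 230, 130, 160]

# Impute then z-score numerics; one-hot encode categoricals.
pre = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), ["sqft"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])
model = Pipeline([("pre", pre), ("ridge", Ridge(alpha=1.0))]).fit(df, y)
print(model.predict(df))
```

Scaling matters more for ridge than for OLS: the penalty treats all coefficients alike, so unscaled features are penalized unevenly.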
Diagnostics
- Residual plots, RMSE/MAE/R²
- Coefficient shrinkage and stability across folds
Compare
- Lasso (L1) for sparsity; Elastic Net balances L1/L2 for correlated groups
- Tree‑based models for nonlinearities and interactions
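The sparsity contrast in the first bullet is easy to see side by side; a sketch assuming scikit-learn, with synthetic data where only one of ten features carries signal:

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge

rng = np.random.default_rng(4)
X = rng.normal(size=(200, 10))
y = X[:, 0] * 3.0 + 0.2 * rng.normal(size=200)  # only feature 0 matters

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)  # blends L1 and L2

# Ridge shrinks but never zeroes; Lasso zeroes the irrelevant features.
print("ridge exact zeros:", (np.abs(ridge.coef_) < 1e-8).sum())
print("lasso exact zeros:", (np.abs(lasso.coef_) < 1e-8).sum())
print("enet  exact zeros:", (np.abs(enet.coef_) < 1e-8).sum())
```

With groups of correlated predictors, Lasso tends to pick one member arbitrarily; Elastic Net's L2 component spreads weight across the group, which is why it sits between the two.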