XGBoost fits strong nonlinear gradient-boosted tree models on tabular data and supports feature-level explanations via SHAP.
Tuning Essentials
- Tree depth (max_depth), learning rate (learning_rate, a.k.a. eta), and the number of boosting rounds (n_estimators) control model capacity
- subsample and colsample_bytree inject row/column randomness per tree, which acts as regularization
- reg_lambda and reg_alpha apply L2/L1 penalties to leaf weights (illustrative starting values appear in the sketch after this list)
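A minimal sketch of these knobs using the scikit-learn style XGBClassifier; the dataset is synthetic and the values are illustrative starting points, not tuned settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

model = XGBClassifier(
    max_depth=4,           # tree depth: deeper trees capture more interactions
    learning_rate=0.1,     # shrinkage applied to each boosting round
    n_estimators=500,      # upper bound on rounds; pair with early stopping
    subsample=0.8,         # row sampling per tree (regularizes)
    colsample_bytree=0.8,  # column sampling per tree (regularizes)
    reg_lambda=1.0,        # L2 penalty on leaf weights
    reg_alpha=0.0,         # L1 penalty on leaf weights
    eval_metric="auc",
)
model.fit(X_train, y_train, eval_set=[(X_valid, y_valid)], verbose=False)
```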
Metrics
Choose metrics to match the objective: ROC-AUC or PR-AUC for classification (PR-AUC is usually more informative when classes are imbalanced), RMSE or MAE for regression. Use cross-validation to estimate generalization and early stopping to pick the number of boosting rounds.
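A sketch combining both, using xgboost's native xgb.cv on synthetic imbalanced data; the parameter values and round counts are illustrative.

```python
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9], random_state=0)  # ~90/10 imbalance
dtrain = xgb.DMatrix(X, label=y)

params = {"objective": "binary:logistic", "max_depth": 4, "eta": 0.1,
          "eval_metric": "aucpr"}  # PR-AUC, informative under imbalance
cv = xgb.cv(
    params, dtrain,
    num_boost_round=1000,
    nfold=5,
    early_stopping_rounds=30,  # stop when the CV metric stops improving
    seed=0,
)
print(f"best rounds: {len(cv)}, test aucpr: {cv['test-aucpr-mean'].iloc[-1]:.3f}")
```

With early stopping, the returned frame is truncated at the best iteration, so its length doubles as the round count to use when retraining on all data.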
Explainability
Global importance (gain-based or mean |SHAP|) ranks features across the whole dataset, while local SHAP plots decompose individual predictions into additive feature contributions; together they help validate driver logic and communicate results.
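A sketch of both views, assuming the shap package is installed and reusing the fitted model and X_valid from the tuning sketch above.

```python
import shap

explainer = shap.TreeExplainer(model)         # fast, exact SHAP for tree models
shap_values = explainer.shap_values(X_valid)  # one row of contributions per sample

# Global view: mean |SHAP| ranks features across the validation set.
shap.summary_plot(shap_values, X_valid, show=False)

# Local view: additive contributions pushing one prediction from the base value.
shap.force_plot(explainer.expected_value, shap_values[0], X_valid[0])
```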
Common Pitfalls
- Data leakage from target-informed features, or from preprocessing fitted on the full dataset before splitting
- Insufficient cross-validation, or metrics misaligned with the deployment objective
- Ignoring class imbalance in classification, which biases the model toward the majority class (one mitigation is sketched after this list)
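A sketch addressing the first and third pitfalls; the names and values are illustrative, not a prescribed recipe. Preprocessing lives inside a Pipeline so each CV fold fits its own scaler (avoiding leakage across splits), and scale_pos_weight offsets the class imbalance.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9], random_state=0)
neg, pos = np.bincount(y)  # class counts for the imbalance ratio

pipe = make_pipeline(
    StandardScaler(),  # refit per fold inside cross_val_score, never on all data
    XGBClassifier(scale_pos_weight=neg / pos, eval_metric="aucpr"),
)
scores = cross_val_score(pipe, X, y, cv=5, scoring="average_precision")
print(f"PR-AUC: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Keeping the scaler inside the pipeline matters because cross_val_score refits the entire pipeline on each training fold, so no statistics from a validation fold ever reach the model.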