Gradient boosting for regression with 5-fold cross-validation, automatic early stopping, and SHAP-based feature contributions.
The number of trees is selected automatically by early stopping (10 rounds without improvement). All other hyperparameters are fixed: max_depth=6, learning_rate=0.1, subsample=0.8, colsample_bytree=0.8.
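As a rough illustration, the fixed settings could be written down as plain Python dictionaries. This is a sketch, not the tool's actual configuration: the names FIXED_PARAMS and CV_SETTINGS are hypothetical, and the keys simply follow the parameter names used by the XGBoost Python package (where eta is an alias of learning_rate).

```python
# Hypothetical reconstruction of the fixed settings described in this document.
# Keys follow XGBoost's native parameter names; "eta" is an alias of learning_rate.
FIXED_PARAMS = {
    "objective": "reg:squarederror",  # regression with squared-error loss
    "eval_metric": "rmse",            # metric tracked during cross-validation
    "max_depth": 6,
    "eta": 0.1,
    "subsample": 0.8,
    "colsample_bytree": 0.8,
}

# Settings that govern the search for the number of trees (assumed names,
# matching the keyword arguments of xgboost.cv in the Python package).
CV_SETTINGS = {
    "num_boost_round": 100,       # upper bound on boosting rounds
    "nfold": 5,                   # 5-fold cross-validation
    "early_stopping_rounds": 10,  # stop after 10 rounds without improvement
}
```

In the XGBoost Python package, dictionaries like these would typically be passed to xgboost.cv together with a DMatrix; whether the tool does exactly that is an assumption.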
Four importance measures: Gain (improvement per split), Cover (data coverage), Frequency (usage count), and SHAP contributions (average absolute impact).
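To make the three split-based measures concrete, here is a minimal sketch that aggregates them from a list of split records. The record layout (feature, gain, cover) and the function name split_importance are hypothetical, and the normalization to shares that sum to 1 is an assumption modeled on how XGBoost's importance tables are commonly reported; the tool's exact scaling is not stated here.

```python
from collections import defaultdict


def split_importance(splits):
    """Aggregate (feature, gain, cover) split records into normalized shares.

    Gain: share of total split improvement attributed to the feature.
    Cover: share of total observation coverage across the feature's splits.
    Frequency: share of all splits that use the feature.
    """
    if not splits:
        return {}
    totals = defaultdict(lambda: {"gain": 0.0, "cover": 0.0, "count": 0})
    for feature, gain, cover in splits:
        totals[feature]["gain"] += gain
        totals[feature]["cover"] += cover
        totals[feature]["count"] += 1
    sum_gain = sum(t["gain"] for t in totals.values())
    sum_cover = sum(t["cover"] for t in totals.values())
    sum_count = sum(t["count"] for t in totals.values())
    return {
        feature: {
            "Gain": t["gain"] / sum_gain,
            "Cover": t["cover"] / sum_cover,
            "Frequency": t["count"] / sum_count,
        }
        for feature, t in totals.items()
    }


# Made-up split records for illustration only.
splits = [("age", 12.0, 100.0), ("income", 6.0, 60.0), ("age", 2.0, 40.0)]
importance = split_importance(splits)
```

A feature can rank high on Frequency yet low on Gain if it is used in many splits that each improve the fit only slightly, which is why the measures are reported side by side.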
Uses the reg:squarederror objective with RMSE as the evaluation metric. Categorical features are automatically converted to numeric codes (0-based encoding). Missing values are replaced: 0 for features, the mean for the target.
Provide a features array (column names) and a target (a numeric column for regression). Categorical features are automatically converted to 0-based numeric encoding.
The algorithm runs 5-fold cross-validation to find the optimal number of trees (up to 100 rounds with early stopping). It then calculates R², RMSE, and MAE, and generates feature importance using Gain, Cover, Frequency, and SHAP contributions.
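The three reported metrics follow their standard definitions, sketched below in plain Python. The function name regression_metrics and the sample values are illustrative only.

```python
import math


def regression_metrics(actual, predicted):
    """Return RMSE, MAE and R² for paired lists of actual and predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)   # total sum of squares
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,  # undefined if the target is constant
    }


# Made-up values for illustration.
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.0, 5.0, 8.0, 9.0])
```

RMSE penalizes large errors more heavily than MAE, so a wide gap between the two usually points to a few badly predicted rows.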
From data prep to explainable predictions
Convert categorical features to 0-based numeric encoding. Replace missing values (0 for features, mean for target). Create DMatrix for XGBoost processing.
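The preparation step can be sketched in plain Python as follows. The helper names are hypothetical, missing values are represented as None, and assigning codes in sorted order of the category labels is an assumption; the document only states that the encoding is 0-based.

```python
def encode_categories(values):
    """Map each distinct non-missing category to a 0-based integer code.

    Codes are assigned in sorted order of the labels (an assumption).
    Missing entries (None) are passed through unchanged.
    """
    labels = sorted({v for v in values if v is not None})
    codes = {label: i for i, label in enumerate(labels)}
    encoded = [None if v is None else codes[v] for v in values]
    return encoded, codes


def fill_features(column, fill_value=0):
    """Replace missing feature values with 0."""
    return [fill_value if v is None else v for v in column]


def fill_target(column):
    """Replace missing target values with the mean of the observed values."""
    present = [v for v in column if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in column]


# Made-up data for illustration.
colors = ["red", None, "blue", "red"]
encoded, codes = encode_categories(colors)
features = fill_features(encoded)
target = fill_target([10.0, None, 20.0, 30.0])
```

If encoding really does happen before filling, as the order of the steps suggests, a missing categorical value ends up with the same code as the first category (0), so the model cannot tell the two apart.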
Run 5-fold CV with fixed parameters (depth=6, eta=0.1, subsample=0.8). Early stopping after 10 rounds without improvement. Select best iteration automatically.
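The early stopping rule can be illustrated with a small function that scans a curve of mean cross-validated RMSE values, one per boosting round. This is a sketch of the general technique rather than the tool's code; best_iteration and the sample curve are hypothetical.

```python
def best_iteration(cv_rmse, patience=10):
    """Return (number_of_trees, best_rmse) under a simple early stopping rule.

    Scanning stops once `patience` consecutive rounds have passed without
    beating the best RMSE seen so far.
    """
    if not cv_rmse:
        raise ValueError("cv_rmse must contain at least one round")
    best_round = 0
    best_score = float("inf")
    for round_idx, score in enumerate(cv_rmse):
        if score < best_score:
            best_score = score
            best_round = round_idx
        elif round_idx - best_round >= patience:
            break  # no improvement for `patience` rounds
    return best_round + 1, best_score  # rounds are counted from 1


# Made-up curve: improves for three rounds, then plateaus at a worse value.
curve = [5.0, 4.0, 3.0] + [3.5] * 12
n_trees, rmse = best_iteration(curve)
```

With a cap of 100 rounds, a curve that keeps improving until the end simply yields 100 trees, which may be a sign that the cap is binding.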
Calculate 4 importance metrics: Gain, Cover, Frequency, SHAP contributions. Generate residual plots, actual vs predicted visualizations, and CV performance curves.
XGBoost regression in which the number of boosting rounds is selected automatically through 5-fold cross-validation with early stopping, which helps prevent overfitting.
Provides four complementary feature importance measures: Gain (split improvement), Cover (observation coverage), Frequency (usage count), and SHAP contributions (average absolute impact). Fixed hyperparameters ensure consistent, reproducible results.
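The SHAP-based measure, described as the average absolute impact, can be sketched as a column-wise mean of absolute contribution values. The function name and sample matrix below are hypothetical; in the XGBoost Python package, per-row contributions of this kind can be obtained from Booster.predict with pred_contribs=True, which appends a bias column as the last column.

```python
def mean_abs_contributions(contribs, feature_names):
    """Average the absolute per-row contribution of each feature.

    `contribs` holds one row per observation and one column per feature.
    Columns beyond len(feature_names), such as a trailing bias column,
    are ignored.
    """
    n_rows = len(contribs)
    return {
        name: sum(abs(row[j]) for row in contribs) / n_rows
        for j, name in enumerate(feature_names)
    }


# Made-up contributions: columns are age, income, bias.
contribs = [
    [0.5, -1.0, 4.0],
    [-1.5, 0.0, 4.0],
    [1.0, 0.5, 4.0],
]
shap_importance = mean_abs_contributions(contribs, ["age", "income"])
```

Taking absolute values before averaging keeps positive and negative contributions from cancelling, so the result reflects how much a feature moves predictions regardless of direction.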
Note: Currently supports regression only (reg:squarederror). Categorical features are automatically converted to numeric. Missing values are replaced with 0 (features) or the mean (target). At most 100 boosting rounds are run, with early stopping after 10 rounds without improvement.