DECISION TREE

Simplified Random Forest

Single decision tree regression using rpart with complexity parameter cp=0.01. Provides interpretable variable importance and standard regression metrics.

What Makes This Robust

Single Tree Model

Uses rpart (Recursive Partitioning and Regression Trees) with complexity parameter cp=0.01 for pruning control. No ensemble or bagging.
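A minimal sketch of that fit in R, with the built-in mtcars data and an illustrative formula standing in for user-supplied input:

    library(rpart)

    # Single regression tree; mtcars and the formula are stand-ins for
    # user-supplied data, not the pipeline's actual inputs.
    fit <- rpart(mpg ~ wt + hp + disp, data = mtcars,
                 method = "anova",   # "anova" = regression tree
                 cp = 0.01)          # fixed complexity parameter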

Variable Importance

Calculates variable importance from the improvement in node purity at each split. Falls back to equal importance (1.0) for all features when the tree reports none (for example, when it makes no splits).
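In rpart this corresponds to fit$variable.importance, which is NULL when the tree never splits. A sketch of the fallback, reusing fit from the sketch above:

    # Extract importance; fall back to equal weights when none is reported.
    features <- c("wt", "hp", "disp")
    imp <- fit$variable.importance            # NULL if the tree has no splits
    if (is.null(imp)) {
      imp <- setNames(rep(1.0, length(features)), features)
    }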

Regression Metrics

Provides R², RMSE, and MAE, calculates residuals for diagnostic plots, and returns fitted vs. actual values for visualization.
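A sketch of those calculations, again reusing fit from the earlier sketch:

    # Metrics on the training data (no held-out set in this pipeline).
    actual <- mtcars$mpg
    pred   <- predict(fit, mtcars)    # fitted values
    res    <- actual - pred           # residuals for diagnostic plots

    r2   <- 1 - sum(res^2) / sum((actual - mean(actual))^2)
    rmse <- sqrt(mean(res^2))
    mae  <- mean(abs(res))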

What You Need to Provide

Regression dataset required

Provide a features array and the target column name (the target must be numeric; regression only). Data is automatically converted to data frame format if needed.

The algorithm builds a single rpart tree with cp=0.01, computes predictions on the training data, derives R², RMSE, and MAE, and extracts variable importance from the tree structure.
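One possible end-to-end sketch of that pipeline in R. The function and argument names are illustrative assumptions, not the actual API:

    library(rpart)

    # Hypothetical wrapper mirroring the steps described above; `data` may
    # be a parsed-JSON list, which as.data.frame() coerces.
    run_single_tree <- function(data, features, target) {
      df  <- as.data.frame(data)
      fml <- as.formula(paste(target, "~", paste(features, collapse = " + ")))
      fit <- rpart(fml, data = df, method = "anova", cp = 0.01)

      pred <- predict(fit, df)                  # training predictions only
      res  <- df[[target]] - pred
      imp  <- fit$variable.importance
      if (is.null(imp)) imp <- setNames(rep(1, length(features)), features)

      list(r2         = 1 - sum(res^2) / sum((df[[target]] - mean(df[[target]]))^2),
           rmse       = sqrt(mean(res^2)),
           mae        = mean(abs(res)),
           residuals  = res,
           fitted     = pred,
           importance = imp)
    }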

Input: tabular schema (features + target)

Quick Specs

Algorithm: rpart single tree
CP parameter: 0.01 (fixed)
Task: regression only
Outputs: R², RMSE, MAE, importance

How We Train

Simple tree building process

Step 1: Data Preparation

Convert JSON/list data to data frame format and build the formula string from the features and target. No preprocessing or encoding is needed.
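A sketch of this step; payload, features, and target are illustrative names:

    # Coerce a parsed-JSON/list payload and assemble the model formula.
    payload  <- list(wt = mtcars$wt, hp = mtcars$hp, mpg = mtcars$mpg)
    features <- c("wt", "hp")
    target   <- "mpg"

    df  <- as.data.frame(payload)     # list -> data frame
    fml <- as.formula(paste(target, "~", paste(features, collapse = " + ")))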

Step 2: Tree Building

Build a single rpart tree with the cp=0.01 complexity parameter. No cross-validation or parameter tuning; all other rpart settings remain at their defaults.
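Continuing the step-1 sketch, the fit itself is one call; cp can equally be passed through rpart.control:

    library(rpart)

    # Only cp deviates from the defaults (minsplit, maxdepth, etc. untouched).
    fit <- rpart(fml, data = df, method = "anova",
                 control = rpart.control(cp = 0.01))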

Step 3: Metrics & Importance

Calculate predictions on the training data, compute R², RMSE, and MAE, and extract variable importance (or assign 1.0 to every feature when none is available).
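Tying the three steps together, the hypothetical run_single_tree() wrapper sketched earlier could be exercised like this:

    out <- run_single_tree(mtcars, features = c("wt", "hp", "disp"),
                           target = "mpg")
    round(c(R2 = out$r2, RMSE = out$rmse, MAE = out$mae), 3)
    sort(out$importance, decreasing = TRUE)   # importance ranking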

Why This Analysis Matters

A single decision tree implemented with rpart provides an interpretable model structure along with variable importance rankings.

This is a simplified implementation, not a true Random Forest ensemble: a single tree with cp=0.01 for basic regression tasks, evaluated on training predictions only (no train/test split). Variable importance is based on node purity improvements.

Note: this is not an actual Random Forest. There is no bagging, bootstrapping, or other ensembling; classification is not supported; and all metrics are calculated on training data, so expect them to be optimistic relative to held-out data.

Ready to Ensemble?

Train a robust baseline with importances

Read the article: Random Forest