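Principal component analysis (PCA) reduces dimensionality by rotating the data onto uncorrelated axes, the principal components, ordered by how much variance each one captures, and then keeping only the leading ones. This can expose dominant structure in the data and reduce the cost of downstream modeling.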

Preparation

  • Standardize features (z‑score) so scale doesn’t dominate
  • Handle missing values and remove or cap extreme outliers
  • Optionally whiten the output so every retained component has unit variance
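
The preparation steps above can be sketched with NumPy alone. This is a minimal illustration, not a recommended pipeline: the helper names (`impute_median`, `cap_outliers`, `standardize`), the median imputation, the 1st/99th percentile caps, and the synthetic data are all assumptions chosen for the example.

```python
import numpy as np

def impute_median(X):
    """Replace NaNs with the per-column median (one simple choice among many)."""
    X = np.asarray(X, dtype=float)
    medians = np.nanmedian(X, axis=0)
    return np.where(np.isnan(X), medians, X)

def cap_outliers(X, lower=1.0, upper=99.0):
    """Clip each column to its [lower, upper] percentile range."""
    lo, hi = np.percentile(X, [lower, upper], axis=0)
    return np.clip(X, lo, hi)

def standardize(X):
    """Z-score each column; constant columns are only centered."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std = np.where(std == 0, 1.0, std)  # avoid division by zero
    return (X - mean) / std

# Synthetic data with very different feature scales and one missing value.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([1.0, 50.0, 1000.0])
X[0, 1] = np.nan

Z = standardize(cap_outliers(impute_median(X)))

# PCA via SVD of the standardized matrix.
n = Z.shape[0]
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
scores = U * S                            # same as Z @ Vt.T
whitened = scores / (S / np.sqrt(n - 1))  # optional: unit variance per component
```

Whitening discards the relative variance of the components, so it is usually worth applying only when a downstream method expects equally scaled inputs.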

Choosing Components

  • Scree plot elbow and cumulative variance thresholds (e.g., 90–95%)
  • Domain constraints: interpretability vs. compression
  • Cross‑validate downstream model performance with k components
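
As one way to apply a cumulative variance threshold, the sketch below computes the explained variance ratio from singular values and returns the smallest number of components that reaches the threshold. The function name `n_components_for_variance` and the synthetic two-factor data are assumptions made for illustration; the sketch does not cover the scree plot or cross-validation criteria.

```python
import numpy as np

def n_components_for_variance(X, threshold=0.95):
    """Smallest k whose cumulative explained variance reaches `threshold`."""
    Xc = X - X.mean(axis=0)
    S = np.linalg.svd(Xc, compute_uv=False)
    explained = S**2 / np.sum(S**2)  # explained variance ratio per component
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(S)), explained

# Six observed features driven by two latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(300, 2))
mixing = np.array([[1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]])
X = latent @ mixing + 0.01 * rng.normal(size=(300, 6))

k, explained = n_components_for_variance(X, threshold=0.95)
print("components kept:", k)
print("cumulative variance:", np.round(np.cumsum(explained), 4))
```

Plotting `explained` against the component index gives the scree plot; the threshold rule and the elbow will not always agree, which is where domain constraints and downstream validation come in.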

Interpreting Results

  • Loadings reveal which features drive each component
  • Biplots combine scores and loadings to visualize structure
  • Component scores can replace raw features in models
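
A small sketch of how loadings and scores can be computed and read. It assumes the common convention that loadings are eigenvectors scaled by the square root of their eigenvalues, so that for standardized data each loading is the correlation between a feature and a component. The feature names and the synthetic data are hypothetical.

```python
import numpy as np

feature_names = ["height", "weight", "age", "income"]  # hypothetical labels

# Synthetic data: the first two features share a common driver.
rng = np.random.default_rng(1)
n = 500
base = rng.normal(size=n)
X = np.column_stack([
    base + 0.3 * rng.normal(size=n),
    base + 0.3 * rng.normal(size=n),
    rng.normal(size=n),
    rng.normal(size=n),
])

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
U, S, Vt = np.linalg.svd(Z, full_matrices=False)

eigenvalues = S**2 / (n - 1)
loadings = Vt.T * np.sqrt(eigenvalues)  # rows: features, columns: components
scores = Z @ Vt.T                       # component scores, usable as model inputs

# Features with the largest absolute loading on the first component.
order = np.argsort(-np.abs(loadings[:, 0]))
top_features = [feature_names[i] for i in order[:2]]

for name, value in zip(feature_names, loadings[:, 0]):
    print(f"{name:>8s}  PC1 loading = {value:+.2f}")
print("PC1 is driven mainly by:", top_features)
```

The sign of a component is arbitrary, so loadings are best read by magnitude and by which features share or oppose a sign within the same component.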

Caveats & Alternatives

  • PCA captures only linear structure, and its components depend on feature scaling
  • For nonlinear manifolds: t‑SNE/UMAP for visualization
  • For sparsity/interpretability: Sparse PCA or factor analysis
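
The sensitivity to scaling is easy to demonstrate. The sketch below, built on assumed synthetic data, compares the explained variance ratio of two independent features before and after standardization; the helper name `explained_variance_ratio` is introduced here for illustration only.

```python
import numpy as np

def explained_variance_ratio(X):
    """Share of total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)
    S = np.linalg.svd(Xc, compute_uv=False)
    return S**2 / np.sum(S**2)

# Two independent features; the second is measured on a 1000x larger scale.
rng = np.random.default_rng(2)
X = np.column_stack([rng.normal(size=500), 1000.0 * rng.normal(size=500)])

raw_ratio = explained_variance_ratio(X)

Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
scaled_ratio = explained_variance_ratio(Z)

print("raw:         ", np.round(raw_ratio, 4))
print("standardized:", np.round(scaled_ratio, 4))
```

On the raw data the first component is almost entirely the large-scale feature, even though neither feature carries more structure than the other; after standardization the variance is split roughly evenly.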