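K‑Means partitions observations into K clusters by minimizing within‑cluster variance, that is, the sum of squared distances from each point to its cluster centroid. Because the objective is distance‑based, feature scaling and the choice of K strongly affect the result.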

Preparation

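  • Standardize features so each contributes comparably to distances; optionally apply PCA for decorrelation or visualization
  • Remove, cap, or transform outliers, which can pull centroids away from the bulk of a cluster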
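
The preparation steps above could be sketched as follows. This is a minimal illustration that assumes scikit-learn and NumPy; the function name prepare_features and its parameters clip_z and n_components are hypothetical, and capping z-scores is only one of several ways to handle outliers.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def prepare_features(X, clip_z=3.0, n_components=None):
    """Standardize X, cap extreme z-scores, and optionally apply PCA.

    clip_z and n_components are illustrative parameters, not library options.
    """
    X = np.asarray(X, dtype=float)
    # Zero mean and unit variance per feature, so that no single feature
    # dominates the Euclidean distances K-Means relies on.
    Z = StandardScaler().fit_transform(X)
    # Cap values beyond +/- clip_z standard deviations so that extreme
    # points pull centroids less.
    Z = np.clip(Z, -clip_z, clip_z)
    if n_components is not None:
        # PCA returns uncorrelated components; 2 is convenient for plotting.
        Z = PCA(n_components=n_components).fit_transform(Z)
    return Z
```

Calling prepare_features(X, n_components=2) would give a two-column array suitable for a scatter plot, while leaving n_components unset keeps all standardized features for clustering.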

Choosing K

  • Compare the elbow in inertia and the silhouette score across candidate values of K
  • Check that the clustering is stable across random initializations and subsamples
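
One way to combine these checks is sketched below, assuming scikit-learn. The helper scan_k and its parameters n_runs and frac are hypothetical, and measuring stability with the adjusted Rand index between refits on random subsamples is an assumption of this sketch rather than a prescribed method.

```python
from itertools import combinations

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score, silhouette_score


def scan_k(X, k_values, n_runs=5, frac=0.8):
    """Score each candidate K (K >= 2) by inertia, silhouette, and stability.

    n_runs and frac are illustrative knobs, not library parameters.
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    rows = []
    for k in k_values:
        # Reference fit on all rows; its inertia feeds the elbow plot.
        full = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        # Refit on random subsamples with different seeds, then label all rows.
        label_sets = []
        for seed in range(n_runs):
            rng = np.random.default_rng(seed)
            idx = rng.choice(n, size=int(frac * n), replace=False)
            model = KMeans(n_clusters=k, n_init=1, random_state=seed).fit(X[idx])
            label_sets.append(model.predict(X))
        # The adjusted Rand index ignores label permutations;
        # 1.0 means two runs produced identical partitions.
        aris = [adjusted_rand_score(a, b) for a, b in combinations(label_sets, 2)]
        rows.append({
            "k": k,
            "inertia": float(full.inertia_),
            "silhouette": float(silhouette_score(X, full.labels_)),
            "stability": float(np.mean(aris)),
        })
    return rows
```

A reasonable reading of the result is to prefer a K where inertia stops dropping sharply, silhouette is high, and stability stays close to 1.0; when the three disagree, the choice remains a judgment call.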

Outputs

  • Cluster labels and centroids
  • Profile tables: means, counts, and top differentiating features
  • Silhouette plots and separation visuals
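
The tabular outputs listed above might be produced along these lines. This sketch assumes pandas and scikit-learn and a DataFrame of numeric features; profile_clusters and top_n are hypothetical names, and ranking features by the standardized gap between a cluster mean and the overall mean is one possible definition of "differentiating".

```python
import pandas as pd
from sklearn.cluster import KMeans


def profile_clusters(df, k, top_n=2, random_state=0):
    """Fit K-Means on a numeric DataFrame and summarize each cluster."""
    model = KMeans(n_clusters=k, n_init=10, random_state=random_state)
    model.fit(df.to_numpy())
    labels = pd.Series(model.labels_, index=df.index, name="cluster")
    centroids = pd.DataFrame(model.cluster_centers_, columns=df.columns)
    # Profile table: per-cluster feature means plus the cluster size.
    grouped = df.groupby(labels)
    means = grouped.mean()
    table = means.assign(count=grouped.size())
    # Gap between each cluster mean and the overall mean, in units of the
    # overall standard deviation; larger absolute gaps differentiate more.
    gaps = (means - df.mean()) / df.std(ddof=0)
    top = {
        cluster: gaps.loc[cluster].abs().nlargest(top_n).index.tolist()
        for cluster in gaps.index
    }
    return labels, centroids, table, top
```

The returned table has one row per cluster, and top maps each cluster label to the names of its most distinctive features.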

When Not to Use

If clusters are non‑spherical or differ markedly in density, consider DBSCAN/HDBSCAN or Gaussian Mixtures instead.
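
As a small illustration of the non‑spherical case, the sketch below compares K‑Means with DBSCAN on scikit-learn's synthetic two-moons data. The function name compare_on_moons is hypothetical, and the eps and min_samples values are hand-picked assumptions for this toy data set, not general recommendations.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score


def compare_on_moons(n_samples=300, noise=0.05, seed=0):
    """Return agreement with the true moon labels for K-Means and DBSCAN."""
    X, y = make_moons(n_samples=n_samples, noise=noise, random_state=seed)
    # K-Means splits the plane with a straight boundary, which cuts
    # through both crescents.
    km = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(X)
    # DBSCAN follows dense regions, so it can trace each crescent.
    # eps and min_samples are tuned by hand for this toy data.
    db = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)
    return {
        "kmeans": adjusted_rand_score(y, km),
        "dbscan": adjusted_rand_score(y, db),
    }
```

On this kind of data the density-based result is expected to match the true grouping much more closely than the K‑Means result, which is the situation where switching methods is worth considering.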
