EXPLORATORY

Correlation Analysis

Builds a correlation matrix with p-values using pairwise complete observations. Adds strength categories, confidence intervals for Pearson, and a network visualization of correlations with |r| > 0.5.

What Makes This Powerful

Strength Categories

Automatic categorization by absolute correlation |r|: negligible (<0.3), weak (0.3-0.5), moderate (0.5-0.7), strong (0.7-0.9), very strong (≥0.9). P-values are calculated for all pairs.
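The thresholds above can be sketched as a small lookup. This is a minimal Python illustration, not the tool's own code: the function name strength_category is hypothetical, and the handling of exact boundary values (0.3, 0.5, 0.7 go to the higher category) is an assumption, chosen to match the "≥0.9" rule stated for the top category.

```python
def strength_category(r: float) -> str:
    """Map a correlation coefficient to a strength label using |r|.

    Boundary values fall into the higher category (assumption).
    """
    a = abs(r)
    if a < 0.3:
        return "negligible"
    if a < 0.5:
        return "weak"
    if a < 0.7:
        return "moderate"
    if a < 0.9:
        return "strong"
    return "very strong"


# Sign is ignored: a strong negative correlation gets the same label
# as a strong positive one.
for r in (0.12, -0.45, 0.66, -0.81, 0.93):
    print(f"{r:+.2f} -> {strength_category(r)}")
```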

Visualization Suite

Heatmap of all correlations, scatter plots for the top 3 significant pairs, and a network graph for correlations with |r| > 0.5. Pairs are sorted by absolute correlation strength.

Statistical Testing

P-values come from cor.test, run once per variable pair. Confidence intervals are reported for Pearson only. The significance threshold is derived from the confidence_level parameter (default 0.95, i.e. α = 0.05).
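For intuition on the Pearson interval: R's cor.test derives it from the Fisher z-transform, which is also why no interval is reported for Spearman or Kendall. The sketch below reproduces that calculation in plain Python. The function name pearson_ci is hypothetical, and it assumes at least 4 complete observations and |r| < 1.

```python
import math
from statistics import NormalDist


def pearson_ci(r: float, n: int, confidence_level: float = 0.95):
    """Confidence interval for a Pearson r via the Fisher z-transform."""
    if n < 4:
        raise ValueError("need at least 4 complete observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    alpha = 1.0 - confidence_level            # e.g. 0.05 for a 95% interval
    z = math.atanh(r)                         # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)               # standard error on the z scale
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Build the interval on the z scale, then map back to the r scale.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


low, high = pearson_ci(0.5, n=30)
print(f"95% CI for r = 0.5, n = 30: [{low:.3f}, {high:.3f}]")
```

Raising confidence_level widens the interval, and a pair counts as significant at that level when the interval excludes zero.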

What You Need to Provide

Numeric variables required

Provide a dataset with numeric columns. Specify a variables array, or all numeric columns are used. Method: pearson (default), spearman, or kendall.

The algorithm calls cor() with pairwise.complete.obs, calculates p-values in a cor.test loop over pairs, categorizes strength, flags significant correlations based on confidence_level, and generates network edges for |r| > 0.5.
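To show what pairwise complete observations means in practice, here is a minimal Python sketch of a Pearson matrix where each pair uses only the rows in which both variables are present. It is an illustration of the idea, not the R code the tool runs: the function names are hypothetical, and missing values are assumed to be encoded as None.

```python
import math


def pearson_pairwise(x, y):
    """Pearson r over rows where both values are present (None = missing).

    Returns (r, n), where n is the number of complete rows for this pair.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan"), n
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:          # a constant column has no defined r
        return float("nan"), n
    return sxy / math.sqrt(sxx * syy), n


def correlation_matrix(columns):
    """Symmetric matrix (dict of dicts) built from pairwise-complete rows."""
    names = list(columns)
    matrix = {name: {} for name in names}
    for i, a in enumerate(names):
        matrix[a][a] = 1.0            # diagonal set by convention
        for b in names[i + 1:]:
            r, _ = pearson_pairwise(columns[a], columns[b])
            matrix[a][b] = matrix[b][a] = r
    return matrix


data = {
    "x": [1, 2, 3, 4, None],
    "y": [2, 4, 6, None, 10],
    "z": [4, 3, 2, 1, 0],
}
m = correlation_matrix(data)
print(m["x"]["y"], m["x"]["z"])
```

In this toy data the x-y pair uses 3 rows while x-z uses 4, so different cells of the matrix can rest on different sample sizes.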

Schema Preview / observations × numeric features [+ label]

Quick Specs

Methods: Pearson, Spearman, Kendall
Missing: Pairwise complete obs
Network: Shows |r| > 0.5
Scatter: Top 3 significant

How We Explore Relationships

From profiling to actionable insights

1. Calculate Matrix

Select numeric columns only, use cor() with pairwise.complete.obs, loop through pairs for cor.test p-values.

2. Categorize & Test

Apply strength thresholds (negligible to very strong), extract confidence intervals for Pearson, identify significant correlations at specified confidence level.

3. Visualize Results

Create heatmap data, select top 3 significant pairs for scatter plots, build network edges for |r| > 0.5, sort by absolute correlation.
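The selection logic in this step can be sketched as two small filters. This is a hedged Python illustration: the record layout (keys a, b, r, p_value), the function names, and the example numbers are all made up for demonstration, and treating "significant" as p_value < 1 - confidence_level is an assumption about how the threshold is applied.

```python
def network_edges(pairs, threshold=0.5):
    """Pairs with |r| strictly above the threshold, strongest first."""
    edges = [p for p in pairs if abs(p["r"]) > threshold]
    return sorted(edges, key=lambda p: abs(p["r"]), reverse=True)


def top_significant(pairs, confidence_level=0.95, k=3):
    """Up to k significant pairs, ranked by absolute correlation."""
    alpha = 1.0 - confidence_level
    significant = [p for p in pairs if p["p_value"] < alpha]
    return sorted(significant, key=lambda p: abs(p["r"]), reverse=True)[:k]


# Illustrative values only; not output from a real dataset.
pairs = [
    {"a": "x", "b": "y", "r": 0.82, "p_value": 0.001},
    {"a": "x", "b": "z", "r": -0.64, "p_value": 0.010},
    {"a": "y", "b": "z", "r": 0.50, "p_value": 0.030},
    {"a": "x", "b": "w", "r": 0.31, "p_value": 0.040},
    {"a": "y", "b": "w", "r": -0.90, "p_value": 0.200},
]
print([(p["a"], p["b"]) for p in network_edges(pairs)])
print([(p["a"], p["b"]) for p in top_significant(pairs)])
```

The two lists can differ: a pair can clear the |r| > 0.5 network threshold without being significant, and a significant pair can be too weak to appear in the network.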

Why This Analysis Matters

Comprehensive correlation analysis with automatic strength categorization and significance testing for all variable pairs.

Provides a correlation matrix, a p-value matrix, and strength categories. The network visualization highlights relationships with |r| > 0.5 (moderate or stronger). The top 3 significant correlations are shown in scatter plots. Summary statistics include the strongest positive and negative correlations and the mean and median absolute correlation.
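As a rough picture of those summary statistics, the sketch below computes them from a list of (variable, variable, r) tuples in Python. The function name summarize, the tuple layout, and the example values are hypothetical; it assumes each pair appears once and the list is non-empty.

```python
from statistics import mean, median


def summarize(pairs):
    """Summary statistics over a non-empty list of (var_a, var_b, r) tuples."""
    abs_rs = [abs(r) for _, _, r in pairs]
    positives = [p for p in pairs if p[2] > 0]
    negatives = [p for p in pairs if p[2] < 0]
    return {
        # None when no pair has the given sign.
        "strongest_positive": max(positives, key=lambda p: p[2], default=None),
        "strongest_negative": min(negatives, key=lambda p: p[2], default=None),
        "mean_abs": mean(abs_rs),
        "median_abs": median(abs_rs),
    }


# Illustrative values only.
summary = summarize([("x", "y", 0.8), ("x", "z", -0.6), ("y", "z", 0.2)])
print(summary)
```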

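Note: Missing values are handled with pairwise complete observations, so each pair may be based on a different number of rows. Confidence intervals are available for Pearson only. The network threshold is fixed at 0.5, and at most 3 scatter plots are generated. Partial correlations are not computed, and p-values are not adjusted for multiple comparisons (no FDR correction).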

Ready to Explore Relationships?

Map structure, reduce redundancy, and speed up modeling

Read the article: Correlation vs Causation