Analysis overview and configuration

Configuration

Analysis Type: Mann Whitney

Company: Digital Marketing Platform

Objective: Compare total ad exposure between test groups using the non-parametric Mann-Whitney U test to account for non-normal distributions

Analysis Date: 2026-03-09

Processing Id: mann_whitney_test_20260309_131116

Total Observations: 500

Module Parameters

Parameter	Value
alternative	two.sided
confidence_level	0.95
continuity_correction	TRUE

Mann Whitney analysis for Digital Marketing Platform

Interpretation

Purpose

This analysis compares total ad exposure between two marketing groups (ad vs. psa) using the Mann-Whitney U test, a non-parametric statistical method chosen because the data violates normality assumptions. The objective is to determine whether meaningful differences exist in exposure levels between these test groups despite their unequal sample sizes and skewed distributions.

Key Findings

P-Value (0.008): Statistically significant difference detected between groups, well below the 0.05 threshold, indicating the observed difference is unlikely due to chance
Hodges-Lehmann Estimate (32): The typical ad-minus-psa difference in exposure, taken as the median of all pairwise differences between the groups, is approximately 32 units, with a 95% confidence interval spanning 8–63 units
Rank-Biserial Correlation (-0.315): Medium negative effect size indicating the psa group tends toward lower exposure values
Group Medians: Ad group median of 79 versus psa group median of 35 reflects substantially higher typical exposure in the ad condition
Sample Imbalance: 475 ad observations versus 25 psa observations creates asymmetric comparison power

Interpretation

The analysis provides strong statistical evidence that ad exposure differs between groups. The ad group has the higher median exposure (79 vs. 35), with the Hodges-Lehmann shift estimated at 32 units (95% CI: 8–63).

Data preprocessing and column mapping

Data Quality

Initial Rows: 500

Final Rows: 500

Rows Removed: 0

Retention Rate: 100%

Data Quality

Metric	Value
Initial Rows	500
Final Rows	500
Rows Removed	0
Retention Rate	100%

Processed 500 observations, retained 500 (100.0%) after cleaning

Interpretation

Purpose

This section documents the data preprocessing pipeline for the Mann-Whitney U test comparing ad and psa groups. Perfect data retention (100%) indicates no rows were removed during cleaning, meaning all 500 observations proceeded directly to statistical analysis without filtering or exclusion.

Key Findings

Initial Rows: 500 observations entered the pipeline
Final Rows: 500 observations retained for analysis (100% retention rate)
Rows Removed: 0 - No data loss occurred during preprocessing
Data Quality: No filtering, imputation, or exclusion steps were applied

Interpretation

The complete retention of all 500 rows suggests either exceptionally clean source data or minimal preprocessing requirements. This is significant for the Mann-Whitney U test results, as the full sample (475 ad, 25 psa) directly informed the statistical comparison. The absence of data removal means no selection bias was introduced through filtering, preserving the original group imbalance (95% ad vs. 5% psa) that characterizes the dataset.

Context

The lack of train/test split indicates this was a descriptive statistical analysis rather than predictive modeling. The severe group imbalance (19:1 ratio) persisted through preprocessing, which may affect the robustness of the significant p-value (0.008) despite the medium effect size observed.

Key Metrics

initial_rows: 500
final_rows: 500
rows_removed: 0

Key Findings

Finding	Value
Statistical Significance	Yes (p=0.0079)
Effect Size	Medium (r=-0.315)
ad Median	79.00 (IQR: 129.50)
psa Median	35.00 (IQR: 94.00)
Median Difference (H-L)	32.00 (95% CI: 8.00 to 63.00)
Sample Sizes	n1=475, n2=25

Summary

Bottom Line: There IS a statistically significant difference between ad and psa (p=0.0079). The effect size is medium (rank-biserial correlation = -0.315), indicating a moderate practical difference.

Key Findings:
• Compared 475 observations from ad vs 25 from psa
• Medians: 79.00 vs 35.00 (difference: 44.00)
• Medium effect (rank-biserial: -0.315)
• Non-parametric test used due to non-normal distributions

Recommendation: Based on both statistical significance and meaningful effect size, we recommend taking action based on this group difference.

Interpretation

Purpose

This analysis compares two groups (ad and psa) using a Mann-Whitney U test to determine whether meaningful differences exist between them. The test was selected because both groups violated normality assumptions, making it the appropriate non-parametric alternative to a t-test. Understanding whether these groups differ statistically and practically is essential for informed decision-making.

Key Findings

Statistical Significance: p-value of 0.008 indicates a statistically significant difference between groups (below the 0.05 threshold)
Median Difference: Ad group median is 79 versus psa group median of 35—a 44-unit gap favoring the ad group
Effect Size: Rank-biserial correlation of -0.315 represents a medium effect, suggesting the difference is large enough to matter in practice and not only statistically detectable
Sample Imbalance: Ad group (n=475) vastly outnumbers psa group (n=25), which may affect generalizability
Distribution Shape: Both groups show right-skewed, non-normal distributions with substantial variability (IQRs of 129.5 and 94 respectively)

Interpretation

The ad group demonstrates consistently higher values than the psa group across the distribution. The Hodges-Lehmann estimate of 32 (95% CI: 8

Visual comparison of distributions between two groups

Interpretation

Purpose

This distribution comparison visualizes how measurements differ between the ad and psa groups, revealing their underlying data shapes. It serves as critical visual evidence supporting the choice of non-parametric testing, since both groups violate normality assumptions (Shapiro-Wilk p-values < 0.001).

Key Findings

Skewness (ad group): 0.87 - The ad group exhibits moderate positive skew, with a longer tail extending toward higher values (max=1328)
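Spread (psa group): Comparatively lower spread, with a maximum value of 334, indicating a more compressed distribution
Value Range: Ad observations span 1 to 1,328 and psa observations 1 to 334 (see the descriptive statistics table). The range of -98.49 to 1404.07 reported for this plot extends beyond the observed ad values, so it presumably describes the plotted curve or axis and not the data; the wider ad range reflects greater variability in the larger sample (n=475 vs n=25)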
Distribution Shape: Both groups show non-normal distributions, justifying the Mann-Whitney U test over parametric alternatives
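
The skewness figures above can be illustrated with a short sketch. The report does not say which skewness estimator it used, so the moment-based coefficient below (third central moment divided by the 1.5 power of the second) is an assumption, and the sample values are hypothetical, not the report's data.

```python
def sample_skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5, using population moments."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Hypothetical exposure counts with a long right tail (not the report's data)
right_skewed = [1, 2, 2, 3, 4, 5, 8, 13, 40, 120]
symmetric = [1, 2, 3, 4, 5]

print(sample_skewness(right_skewed) > 0)  # True: the long right tail gives positive skew
print(sample_skewness(symmetric))         # 0.0: symmetric data has no skew
```

A positive value corresponds to the longer right tail described for both groups; other estimators (for example, bias-adjusted versions) would give slightly different numbers on the same data.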

Interpretation

The overlapping histograms demonstrate that the ad group has substantially greater dispersion and right-skewness compared to psa. This visual pattern is consistent with the Mann-Whitney U test result (p=0.008), which indicates a statistically significant shift between the groups. The Hodges-Lehmann estimate of 32 units is the median of all pairwise ad-minus-psa differences, with a 95% confidence interval of 8 to 63 units.

Median and interquartile range comparison between groups

Interpretation

Purpose

This section visualizes the distribution and central tendency of values across two groups (ad and psa) using box plots. It provides a clear, visual comparison of medians, spread, and outliers—essential for understanding whether the groups differ meaningfully in their typical values and variability.

Key Findings

Ad Median: 79.00 with IQR of 129.50—indicating the middle 50% of ad values spans a wider range
Psa Median: 35.00 with IQR of 94.00—showing lower central values and slightly tighter middle distribution
Median Difference: Ad group has a median 44 units higher than psa, suggesting systematically higher values
Spread Pattern: Both groups show right-skewed distributions (skew=1.0), with ad extending to 1,328 versus psa's maximum of 334

Interpretation

The box plot comparison reveals that the ad group consistently exhibits higher values than psa across the distribution. The ad median (79) is more than double the psa median (35), and the wider IQR in ad reflects greater variability in the middle 50% of observations. This visual evidence aligns with the Mann-Whitney U test result (p=0.008), confirming a statistically significant difference between groups with a medium effect size.
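
As a minimal sketch of the quantities a box plot draws, the function below computes the quartiles and IQR for one group. The quantile convention used by the report is not stated; the "inclusive" linear-interpolation method of Python's statistics module is an assumption here, and the sample values are hypothetical.

```python
import statistics


def box_summary(values):
    """Five-number summary plus IQR, using linear-interpolation quartiles."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": q3 - q1,
        "max": max(values),
    }


# Hypothetical values, not the report's data
print(box_summary([1, 3, 5, 7, 9]))

# The reported IQRs agree with Q3 - Q1 from the descriptive statistics table
assert 165 - 35.5 == 129.5  # ad group
assert 105 - 11 == 94       # psa group
```

Different quantile conventions can shift Q1 and Q3 slightly, especially for the 25-observation psa group, so small discrepancies against other software would not necessarily indicate an error.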

Rank positions showing the basis of the U statistic

Interpretation

Purpose

The rank distribution reveals how the Mann-Whitney U test assigns ranks to observations from both groups when combined. This visualization demonstrates the foundation of the statistical test: if one group systematically occupies higher or lower ranks, it indicates a meaningful difference in central tendency between the groups, independent of the original scale.

Key Findings

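Rank Range: Ranks span 4 to 500 across all observations. The mean rank of 250.5 equals (N + 1) / 2 for N = 500, as it does for any complete ranking, so it carries no information about group differences. A minimum rank of 4 instead of 1 suggests ties at the lowest value, since tied observations share an averaged rank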
Group Imbalance: Ad group dominates with 475 observations (95%) versus PSA's 25 (5%), creating inherent asymmetry in rank distribution
Rank-Biserial Correlation: -0.315 shows PSA group occupies systematically lower ranks despite smaller sample size, suggesting genuinely lower values independent of group size

Interpretation

The negative rank-biserial correlation (-0.315) indicates the PSA group concentrates in lower rank positions, meaning PSA observations tend to have smaller original values than AD observations. This rank-based difference, combined with the significant p-value (0.008), confirms a statistically meaningful shift in the distribution's location. The Mann-Whitney U statistic (7807.5) quantifies this rank separation, providing evidence that the groups differ beyond random variation.
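
The rank-based construction described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the report's own code: tied values receive the average of the ranks they occupy (midranks), U for a group is its rank sum minus n(n + 1)/2, and the sample values are hypothetical.

```python
def midranks(values):
    """Rank values 1..N; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average = (start + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = average
        start = end + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """U for group_a: count of pairs with a > b, plus half of the tied pairs."""
    ranks = midranks(list(group_a) + list(group_b))
    n_a = len(group_a)
    rank_sum_a = sum(ranks[:n_a])
    return rank_sum_a - n_a * (n_a + 1) / 2


# Hypothetical samples, not the report's data
ad = [12, 40, 40, 95, 300]
psa = [5, 12, 40]
print(mann_whitney_u(ad, psa))  # 12.5
print(mann_whitney_u(psa, ad))  # 2.5
```

The two U values always sum to n1 × n2 (here 5 × 3 = 15). For the report's groups that product is 475 × 25 = 11,875, so the reported U of 7807.5 lies above the midpoint of 5937.5.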

Context

Rank-based testing

Mann-Whitney U test statistics and p-value

Metric	Value
Mann-Whitney U	7807.50
P-Value	0.0079
Rank-Biserial Correlation	-0.315
Hodges-Lehmann Estimate	32.000
95% CI Lower	8.000
95% CI Upper	63.000

Interpretation

Purpose

This section presents the Mann-Whitney U test results, a non-parametric statistical test appropriate for comparing two independent groups with non-normal distributions. It determines whether the ad and psa groups have statistically significantly different distributions, which is essential for validating whether observed differences are genuine rather than due to random variation.

Key Findings

Mann-Whitney U Statistic: 7807.50 - Represents the test statistic calculated from ranked data across both groups
P-Value: 0.0079 - Falls below the 0.05 significance threshold, indicating strong evidence against the null hypothesis
Statistical Significance: TRUE - The difference between groups is statistically significant at the 0.05 significance level

Interpretation

The p-value of 0.0079 provides strong evidence that the ad and psa groups have meaningfully different distributions. This finding aligns with the descriptive statistics showing the ad group has a higher median (79 vs. 35) and greater spread. The Mann-Whitney U test was appropriately chosen because both groups failed normality tests (Shapiro-Wilk p-values near 0), making it more reliable than parametric alternatives.
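
To show roughly how a p-value follows from U, here is a sketch of the large-sample normal approximation with a continuity correction, matching the two.sided and continuity_correction settings in the module parameters. It is an assumption that the report used this approximation, and the variance below omits the tie correction because the report does not list tie counts.

```python
import math


def mann_whitney_p_two_sided(u, n1, n2, continuity=True):
    """Two-sided p-value for U via the normal approximation (no tie correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    distance = abs(u - mean_u)
    if continuity:
        # Shrink the distance by 0.5 to account for U being discrete
        distance = max(distance - 0.5, 0.0)
    z = distance / sd_u
    # For a standard normal Z, P(|Z| > z) equals erfc(z / sqrt(2))
    return math.erfc(z / math.sqrt(2))


# Values taken from the report: U = 7807.5, n1 = 475 (ad), n2 = 25 (psa)
p_value = mann_whitney_p_two_sided(7807.5, 475, 25)
print(round(p_value, 4))
```

Under these assumptions the result lands close to the reported p-value of 0.0079. A tie correction would reduce the variance slightly and so lower the p-value a little.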

Context

The severe sample size imbalance (475 ad vs. 25 psa observations) should be considered when interpreting results

Effect size and practical significance assessment

Interpretation

Purpose

This section quantifies the practical magnitude of the difference between the ad and psa groups beyond statistical significance. While the p-value (0.008) confirms a difference exists, effect size metrics reveal how large that difference is in real-world terms, which is essential for assessing whether the finding has meaningful practical importance.

Key Findings

Rank-Biserial Correlation: -0.315 (Medium) - Indicates a medium-strength practical difference favoring the ad group, with the negative value reflecting lower values in the psa distribution relative to ad
Hodges-Lehmann Estimate: 32 units (95% CI: 8–63) - The median of all pairwise ad-minus-psa differences is approximately 32 units, with a 95% confidence interval of 8 to 63 units
Effect Magnitude: Medium - Confirms the difference is neither negligible nor exceptionally large

Interpretation

The statistically significant p-value is paired with a medium effect size, meaning the ad group (median=79) genuinely differs from the psa group (median=35) in practical terms. The 32-unit median difference represents a meaningful gap, though the wide confidence interval (8–63) reflects uncertainty due to the small psa sample (n=25) and high variability in both groups.
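
The reported effect size can be reproduced from the test statistic. The sketch below assumes the sign convention r = 1 - 2U / (n1 × n2), with U belonging to the ad group (it exceeds the midpoint n1 × n2 / 2, and ad has the higher values); the report does not state its convention, but this one matches the reported figure.

```python
def rank_biserial_from_u(u, n1, n2):
    """Rank-biserial correlation under the assumed convention r = 1 - 2U / (n1 * n2)."""
    return 1 - 2 * u / (n1 * n2)


def share_of_pairs_favoring_first(u, n1, n2):
    """Proportion of (first, second) pairs where the first value is larger; ties count half."""
    return u / (n1 * n2)


# Values taken from the report: U = 7807.5, n1 = 475 (ad), n2 = 25 (psa)
print(round(rank_biserial_from_u(7807.5, 475, 25), 3))           # -0.315
print(round(share_of_pairs_favoring_first(7807.5, 475, 25), 3))  # 0.657
```

Read this way, in roughly 66% of the 11,875 ad/psa pairs the ad observation has the higher exposure, which is another way of stating the medium effect.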

Context

These

Descriptive statistics for each group

Group	N	Median	Q1	Q3	IQR	Min	Max	Mean	SD
ad	475	79	35.5	165	129.5	1	1328	134.9	165
psa	25	35	11	105	94	1	334	76.68	94.83

Interpretation

Purpose

This section provides descriptive statistics for each group, emphasizing median and interquartile range (IQR) rather than mean and standard deviation. This approach is essential here because both groups failed normality tests (Shapiro-Wilk p < 0.001), making median-based metrics more robust and interpretable for non-normal distributions.

Key Findings

Ad Group Median: 79 (IQR: 129.5) — substantially higher central tendency than PSA group
PSA Group Median: 35 (IQR: 94) — lower median with slightly tighter spread
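Median Difference: 44-unit gap between the group medians. The Hodges-Lehmann estimate of 32 (95% CI: 8–63) is smaller because it is the median of all pairwise ad-minus-psa differences, not the difference of the two medians; the 44-unit gap falls inside its confidence interval
Spread Comparison: The ad group has the wider IQR (129.5 vs. 94) and extends to a far higher maximum (1,328 vs. 334)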
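
A small sketch may help explain why the difference of medians and the Hodges-Lehmann estimate need not agree. The function below takes the median of every pairwise difference between the groups; the sample values are hypothetical, chosen only so that the two quantities differ.

```python
import statistics


def hodges_lehmann_shift(group_a, group_b):
    """Median of all pairwise differences a - b between two samples."""
    return statistics.median(a - b for a in group_a for b in group_b)


# Hypothetical samples, not the report's data
ad = [10, 20, 30, 40, 500]
psa = [1, 2, 30]

print(statistics.median(ad) - statistics.median(psa))  # 28: difference of medians
print(hodges_lehmann_shift(ad, psa))                   # 19: Hodges-Lehmann estimate
```

With skewed data and unequal group sizes, as in this report, a gap between the two numbers (44 vs. 32) is expected and does not signal an inconsistency.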

Interpretation

The Ad group demonstrates consistently higher values across the distribution compared to the PSA group. The Mann-Whitney U test (p = 0.008) confirms this difference is statistically significant. The negative rank-biserial correlation (−0.315, medium effect) indicates PSA values tend to rank lower, supporting the hypothesis that these groups differ meaning

Shapiro-Wilk normality tests justifying non-parametric approach

Group	Shapiro_W	Shapiro_p_value	Normality
ad	0.6908	<0.0001	Rejected
psa	0.7760	0.0001	Rejected

Interpretation

Purpose

This section validates the statistical method choice for comparing the two groups (ad vs. psa). Since parametric tests assume normally distributed data, the Shapiro-Wilk normality test determines whether a non-parametric Mann-Whitney U test is appropriate. This justification is critical for ensuring the validity of the significance findings reported in the overall analysis.

Key Findings

Group 1 (ad) Shapiro-Wilk p-value: <0.001 - Highly significant departure from normality; the distribution is substantially non-normal
Group 2 (psa) Shapiro-Wilk p-value: 0.0001 - Significant departure from normality; similarly non-normal
Either Non-Normal: TRUE - Both groups fail the normality assumption, supporting the choice of a non-parametric test

Interpretation

Both the ad group (n=475) and psa group (n=25) exhibit significant departures from normality, as evidenced by p-values far below the 0.05 threshold. This non-normality is consistent with the observed positive skewness (1.0) and right-tailed distributions visible in the boxplot data, where maximum values substantially exceed medians. The Mann-Whitney U test (p=0.008) is therefore the appropriate choice for comparing these groups
