Overview

Analysis Overview

Independent Samples t-Test Configuration

Analysis overview and configuration

Configuration

Analysis Type: T Test
Company: Educational Research Institute
Objective: Compare math scores between male and female students using an independent samples t-test
Analysis Date: 2026-03-12
Processing Id: test_1773376459
Total Observations: 1000

Module Parameters

Parameter | Value
alternative | two.sided
confidence_level | 0.95
significance_level | 0.05
var_equal | FALSE
T Test analysis for Educational Research Institute
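As a minimal sketch of how these parameters relate to each other, the snippet below mirrors them in a small configuration object. The class name `TTestConfig` and its helper methods are hypothetical and are not part of the reporting tool; the only facts taken from the report are the four parameter values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTestConfig:
    """Hypothetical container mirroring the module parameters in the report."""
    alternative: str = "two.sided"
    confidence_level: float = 0.95
    significance_level: float = 0.05
    var_equal: bool = False

    def test_name(self) -> str:
        # var_equal = FALSE means the group variances are not pooled,
        # which corresponds to Welch's t-test.
        return "Student's t-test" if self.var_equal else "Welch's t-test"

    def is_consistent(self) -> bool:
        # The confidence level should be the complement of the significance level.
        return abs(self.confidence_level + self.significance_level - 1.0) < 1e-12


config = TTestConfig()
print(config.test_name(), config.is_consistent())
```

With `var_equal` set to FALSE, the sketch selects Welch's t-test, which matches the test named in the report's later sections.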

Interpretation

Purpose

This analysis compares math scores between male and female students using an independent samples t-test on 1,000 observations. The objective is to determine whether statistically significant differences exist in academic performance between genders, providing evidence-based insights into student achievement patterns.

Key Findings

  • Mean Difference: -5.095 points (females score lower than males on average)
  • Statistical Significance: p < 0.001 (displayed as 0.0000 after rounding), so the difference is highly statistically significant
  • Effect Size (Cohen's d): -0.341 represents a small practical effect despite statistical significance
  • 95% Confidence Interval: -6.947 to -3.243 points; the interval excludes zero, consistent with the significant test result
  • Sample Composition: 518 females vs. 482 males with comparable variability (SD ~15 points each)

Interpretation

Male students demonstrate significantly higher math scores than female students, with males averaging 68.73 compared to females at 63.63. While the 5-point difference is statistically robust (t = -5.398, df = 997.98), the small effect size indicates this difference, though real, represents modest practical significance. Both groups show similar spread (IQR = 20 in each), although observed scores range from 0 to 100 for females and from 27 to 100 for males.


Data Preparation

Data Preprocessing

Data Quality & Group Validation

Data preprocessing and column mapping

Data Quality

Metric | Value
Initial Rows | 1,000
Final Rows | 1,000
Rows Removed | 0
Retention Rate | 100%
Processed 1,000 observations, retained 1,000 (100.0%) after cleaning

Interpretation

Purpose

This section documents the data preprocessing pipeline for a comparative analysis examining differences between female and male groups. Perfect data retention (100%) indicates no rows were removed during cleaning, meaning the full dataset of 1,000 observations proceeded to statistical testing without data loss or exclusion criteria applied.

Key Findings

  • Retention Rate: 100% (1,000 of 1,000 rows retained) - No observations were filtered or removed during preprocessing
  • Rows Removed: 0 - The dataset required no cleaning interventions, suggesting either high initial data quality or minimal validation criteria
  • Sample Composition: Balanced groups (518 female, 482 male) were preserved intact through the pipeline

Interpretation

The complete retention of all observations supports the validity of the subsequent Welch's t-test results, which compared mean values across both groups. With no data loss, the statistical power and representativeness of the analysis remain uncompromised. The balanced group sizes (approximately 52% female, 48% male) were maintained, enabling fair comparison of the -5.095 mean difference observed between groups.
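The retention and group-balance figures quoted above follow from simple arithmetic on the reported counts. The sketch below reproduces them; the variable names are illustrative only, and the counts are the ones stated in the report.

```python
# Counts as reported in the Data Quality table and Key Findings.
initial_rows = 1000
final_rows = 1000
group_sizes = {"female": 518, "male": 482}

# Rows dropped during cleaning and the share of rows retained, in percent.
rows_removed = initial_rows - final_rows
retention_rate = 100.0 * final_rows / initial_rows

# Share of each group in the retained data, in percent.
group_shares = {group: 100.0 * n / final_rows for group, n in group_sizes.items()}

print(rows_removed, retention_rate, group_shares)
```

The shares come out at 51.8% female and 48.2% male, which the report rounds to 52% and 48%.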

Context

No train/test split was applied, indicating this analysis focused on descriptive comparison rather than predictive modeling. The absence of documented transformations suggests the raw values (0–100 scale) were analyzed directly, though the Shapiro-Wilk

Executive Summary

Test Results & Recommendations

Key Metrics

initial_rows: 1000
final_rows: 1000
rows_removed: 0

Key Findings

Finding | Value
Statistical Significance | Yes (p=0.0000)
Effect Size | Small (d=-0.341)
female Mean | 63.63 (SD: 15.49)
male Mean | 68.73 (SD: 14.36)
Mean Difference (95% CI) | -5.095 (95% CI: -6.947 to -3.243)
Sample Sizes | n1=518, n2=482

Summary

Bottom Line: There is a statistically significant difference between female and male students (p < 0.001). The effect size is small (Cohen's d = -0.341), so the practical difference is modest.

Key Findings:
• Compared 518 observations from female vs 482 from male
• Means: 63.63 vs 68.73 (difference: -5.095, female minus male)
• Small effect (Cohen's d: -0.341)
• Used Welch's t-test

Recommendation: The difference is statistically significant, but the effect size is small; weigh its practical relevance before taking action based on this group difference.

Interpretation

Purpose

This analysis compares a measured outcome between female and male populations using a rigorous statistical test. The findings directly address whether meaningful differences exist between these groups, which is critical for understanding population-level patterns and informing targeted strategies.

Key Findings

  • Statistical Significance: p < 0.001 (displayed as 0.0000) - A difference this large would be highly unlikely if the two population means were equal
  • Mean Difference: Males score 5.1 points higher than females (68.73 vs 63.63 on a 0-100 scale)
  • Effect Size: Cohen's d = -0.341 - While statistically significant, the practical magnitude is small
  • Sample Balance: 518 females and 482 males provide robust statistical power with no data loss
  • Normality Caveat: Both groups show slight deviations from normality (Shapiro-Wilk p < 0.05), though the t-test is robust to mild non-normality at these sample sizes

Interpretation

The analysis finds a statistically significant difference between groups (p < 0.001). Males score approximately 5 points higher on average. However, the small effect size (Cohen's d = -0.341) indicates this difference, while real, represents modest practical separation. The 95% confidence interval (-6.95 to -3.24) excludes zero, reinfor

Figure 4

Distribution Comparison

Overlapping Density Curves by Group

Visual comparison of distributions between two groups

Interpretation

Purpose

This density overlay visualization compares the distribution shapes and central tendencies between female and male groups. It provides a visual foundation for understanding whether observed differences are driven by shifts in the entire distribution or concentrated in specific regions, complementing the statistical test results.

Key Findings

  • Mean Difference: Males score 5.1 points higher (68.73 vs. 63.63), representing a rightward shift in the male distribution
  • Spread Similarity: Both groups show comparable variability (SD: 15.49 female, 14.36 male), indicating consistent dispersion across groups
  • Distribution Shape: Both distributions appear approximately symmetric (skew ≈ -0.08), suggesting the difference is primarily a location shift rather than shape distortion
  • Overlap Pattern: Substantial curve overlap indicates considerable within-group variation relative to between-group differences

Interpretation

The density curves reveal that while males demonstrate a statistically significant higher mean (p < 0.001), the distributions overlap considerably. This aligns with the small effect size (Cohen's d = -0.341), indicating the practical magnitude of difference is modest despite statistical significance. The similar spreads are consistent with roughly equal variances, although Welch's t-test does not require that assumption.
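To put a rough number on "considerable overlap", one option is the overlapping coefficient of two normal curves. The sketch below is an illustration under strong assumptions that the report does not make: both groups are treated as exactly normal with a common standard deviation, in which case the shared area is 2Φ(-|d|/2). The only input taken from the report is Cohen's d.

```python
from statistics import NormalDist

cohens_d = -0.341  # effect size as reported

# Overlapping coefficient for two normal densities with equal SD whose
# means are |d| standard deviations apart: OVL = 2 * Phi(-|d| / 2).
overlap = 2 * NormalDist().cdf(-abs(cohens_d) / 2)

print(f"Approximate shared area under the two curves: {overlap:.3f}")
```

Under these assumptions the two curves share roughly 86% of their area, which is in line with the visual impression of heavily overlapping densities. Because the Shapiro-Wilk tests flag mild non-normality, this figure is only indicative.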

Context

The visual representation assumes kernel density estimation accuracy. The range extension (−

Figure 5

Box Plot Comparison

Means, IQR, and Individual Points by Group

Means and spread comparison between groups via box plots

Interpretation

Purpose

This section visualizes the distribution and central tendency of values across gender groups through box plots. It provides an intuitive way to compare group differences in location, spread, and variability—essential for understanding whether observed differences are meaningful or attributable to natural variation.

Key Findings

  • Mean Difference: Males score 5.1 points higher (68.73 vs. 63.63), a statistically significant gap (p < 0.001)
  • Spread Consistency: Both groups show similar variability (SD: 15.49 for females, 14.36 for males), with identical interquartile ranges (IQR = 20)
  • Distribution Shape: Both groups display symmetric distributions (skew ≈ 0.02) across the 0–100 scale, with comparable medians (65 vs. 69)

Interpretation

The box plots reveal that while males score higher on average, the distributions largely overlap, indicating substantial within-group variation. The small effect size (Cohen's d = -0.341) confirms that despite statistical significance, the practical difference is modest. Both groups reach the scale maximum of 100, and scores extend down to 0 for females and 27 for males, suggesting the underlying construct varies considerably within each gender.

Context

These visual comparisons complement the Welch's t-test results. Note that both groups violated normality assumptions (Shapiro-

Figure 6

Normality Diagnostics (QQ Plot)

Sample vs Theoretical Quantiles by Group

QQ plots and Shapiro-Wilk tests to assess normality assumption

Interpretation

Purpose

This section evaluates whether the data meets the normality assumption required for valid t-test inference. Normality diagnostics are critical because violations can affect the reliability of p-values and confidence intervals, particularly with smaller samples. Understanding departures from normality helps contextualize the robustness of the group comparison findings.

Key Findings

  • Shapiro-Wilk p-value (Female): 0.0035 - Statistically significant departure from normality; the female group distribution deviates from a normal curve
  • Shapiro-Wilk p-value (Male): 0.0380 - Marginal but significant departure from normality; the male group shows slight non-normal behavior
  • Variance Equality (F-test): p = 0.0902 - No significant evidence of unequal variances at α = 0.05; Welch's t-test (var_equal = FALSE) does not assume equal variances, so it remains valid either way
  • QQ Plot Pattern: Sample values show slight deviations at distribution tails, consistent with bounded data (0–100 range)

Interpretation

Both groups exhibit statistically significant departures from normality, though the effect is modest. The F-test does not reject equal variances (p > 0.05), and Welch's t-test does not rely on that assumption in any case; with roughly 500 observations per group, the test is also robust to moderate normality violations. The significant gender difference (t = -5

Figure 7

Effect Size

Cohen's d and Mean Difference with 95% CI

Cohen's d effect size and practical significance assessment

Interpretation

Purpose

This section quantifies the practical significance of the observed difference between female and male groups. While statistical significance (p < 0.001) confirms the difference is real, effect size measures whether that difference is meaningful in practical terms. Cohen's d standardizes the difference relative to variability, enabling comparison across studies and contexts.

Key Findings

  • Cohen's d: -0.341 (Small) - The difference falls within the "small" range (0.2–0.5), indicating modest practical significance despite strong statistical evidence
  • Mean Difference: -5.095 units (95% CI: -6.947 to -3.243) - Males scored approximately 5 points higher on average, with high confidence the true difference lies between 3.2 and 6.9 units
  • Confidence Interval: The narrow CI excludes zero, consistent with rejecting the null hypothesis of equal means at α = 0.05

Interpretation

The statistically significant t-test result is tempered by a small effect size, meaning the groups differ reliably but not dramatically. Males average 5 points higher than females, but this 5-point gap represents only about one-third of a standard deviation—a clinically or practically modest distinction. The tight confidence interval confirms precision in estimation despite the small magnitude.
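The "one-third of a standard deviation" figure can be checked from the reported group summaries. The sketch below assumes the pooled-standard-deviation form of Cohen's d; the report does not state which variant it uses, so this is a plausibility check rather than the tool's actual computation.

```python
import math

# Group summaries as reported (female, male).
n_f, mean_f, sd_f = 518, 63.63, 15.49
n_m, mean_m, sd_m = 482, 68.73, 14.36

# Pooled variance: the two sample variances weighted by their degrees of freedom.
pooled_var = ((n_f - 1) * sd_f**2 + (n_m - 1) * sd_m**2) / (n_f + n_m - 2)
pooled_sd = math.sqrt(pooled_var)

# Cohen's d: mean difference (female minus male) in pooled-SD units.
cohens_d = (mean_f - mean_m) / pooled_sd

print(f"pooled SD = {pooled_sd:.2f}, Cohen's d = {cohens_d:.3f}")
```

With these inputs the pooled SD is close to 15 points and d comes out near -0.34, in agreement with the reported value of -0.341.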

Context

Effect size complements p-values by addressing "how much

Table 8

Test Results

t-Statistic, Degrees of Freedom, and P-Value

t-test statistics, p-value, and detailed results table

Metric | Value
t-statistic | -5.3980
Degrees of Freedom | 997.98
p-value | 0.0000
Mean Difference | -5.095
95% CI Lower | -6.947
95% CI Upper | -3.243
Cohen's d | -0.341
Effect Magnitude | Small

Interpretation

Purpose

This section presents the statistical hypothesis test results comparing values between female and male groups. It determines whether observed differences are statistically significant or likely due to random variation, providing the quantitative foundation for rejecting or failing to reject the null hypothesis of equal population means.

Key Findings

  • t-statistic: -5.398 - The observed mean difference lies about 5.4 standard errors below zero, with the negative sign reflecting that the female mean is lower than the male mean
  • p-value: 8.42e-08 (displayed as 0.0000) - A difference at least this large would be extremely unlikely if the population means were equal
  • Degrees of Freedom: 997.98 - Welch-Satterthwaite approximation, close to n - 2 = 998 because the two groups have similar sizes and variances
  • Significance: TRUE - Result meets the conventional α=0.05 threshold for statistical significance

Interpretation

The Welch's t-test conclusively demonstrates a statistically significant difference between groups. With a p-value far below 0.05, we reject the null hypothesis that female and male means are equal. The mean difference of -5.095 points (95% CI: -6.947 to -3.243) indicates males consistently score higher. However, Cohen's d of -0.341 reveals this difference is practically small in magnitude, suggesting statistical significance does not necessarily imply large real-world impact.
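The test statistics above can be approximately reconstructed from the reported group sizes, standard deviations, and mean difference. The following is a sketch under stated assumptions: it uses only the Python standard library, so the confidence interval and p-value rely on a normal approximation to the t distribution (reasonable at about 998 degrees of freedom, but not what the report itself computed).

```python
import math
from statistics import NormalDist

# Reported inputs (female, male) and the reported mean difference.
n_f, sd_f = 518, 15.49
n_m, sd_m = 482, 14.36
mean_diff = -5.095

# Welch's standard error: per-group variances of the mean, not pooled.
var_f = sd_f**2 / n_f
var_m = sd_m**2 / n_m
se = math.sqrt(var_f + var_m)

t_stat = mean_diff / se

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (var_f + var_m) ** 2 / (var_f**2 / (n_f - 1) + var_m**2 / (n_m - 1))

# Normal approximation for the 95% interval and two-sided p-value
# (assumption: with df near 998 the t and normal quantiles nearly coincide).
z = NormalDist().inv_cdf(0.975)
ci_low, ci_high = mean_diff - z * se, mean_diff + z * se
p_approx = 2 * NormalDist().cdf(-abs(t_stat))

print(f"t = {t_stat:.3f}, df = {df:.2f}, CI = ({ci_low:.3f}, {ci_high:.3f})")
```

The t-statistic and degrees of freedom land very close to the reported -5.398 and 997.98, and the interval agrees with the reported bounds to within a few thousandths. The approximate p-value is on the order of 1e-07, slightly smaller than the reported 8.42e-08 because the normal distribution has lighter tails than the t distribution.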

Context

Table 9

Summary Statistics

Descriptive Statistics by Group

Descriptive statistics for each group

Group | N | Mean | SD | Median | IQR | Min | Max
female | 518 | 63.63 | 15.49 | 65 | 20 | 0 | 100
male | 482 | 68.73 | 14.36 | 69 | 20 | 27 | 100

Interpretation

Purpose

This section provides descriptive statistics for each group to establish baseline characteristics before statistical comparison. By reporting both mean and median alongside standard deviation, it enables assessment of central tendency and spread—critical for understanding whether the groups differ systematically and whether the data meet assumptions for parametric testing.

Key Findings

  • Female Group (n=518): Mean=63.63, SD=15.49, Median=65 — slightly lower central tendency with comparable variability
  • Male Group (n=482): Mean=68.73, SD=14.36, Median=69 — approximately 5-point higher mean with marginally tighter spread
  • Distributional Symmetry: Both groups show near-zero skewness (0.02), indicating symmetric distributions despite Shapiro-Wilk test violations

Interpretation

The 5.1-point mean difference (males higher) forms the basis for the subsequent t-test comparison. Both groups exhibit similar spread (SD ~15), and the F-test does not reject equal variances (p=0.090). Median values closely track means, suggesting minimal outlier influence despite non-normality flags. This consistency between mean and median strengthens confidence in the parametric test results.

Context

Non-normality detected via Shapiro-Wilk tests (p<0.05) reflects sensitivity
