Context and Data Preparation

Analysis Overview and Data Quality

Categorical Analysis Overview

Comprehensive Survey Response Analysis

Analysis overview and configuration

Categorical Analysis
Educational Research Institute
Analyze survey responses to identify factors affecting test performance
Module Configuration
significance_level: 0.05
min_group_size: 5
categorical_vars: gender, race/ethnicity, parental level of education, lunch, test preparation course
target_vars: math score, reading score, writing score
primary_target: math score
pass_threshold: 50
grade_boundaries: 90, 80, 70, 60
Processing ID: test_1772252404
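
The configuration above can be sketched as a small Python object, together with the grading rules it implies. The class and function names are hypothetical. Two assumptions are not stated in the report: the grade boundaries are treated as inclusive lower bounds for A, B, C, and D, and a score equal to the pass threshold counts as a pass.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ModuleConfig:
    """Settings copied from the Module Configuration listing."""
    significance_level: float = 0.05
    min_group_size: int = 5
    categorical_vars: Tuple[str, ...] = (
        "gender", "race/ethnicity", "parental level of education",
        "lunch", "test preparation course",
    )
    target_vars: Tuple[str, ...] = ("math score", "reading score", "writing score")
    primary_target: str = "math score"
    pass_threshold: float = 50
    grade_boundaries: Tuple[int, ...] = (90, 80, 70, 60)


def letter_grade(score: float, cfg: ModuleConfig) -> str:
    # Assumption: boundaries are inclusive lower bounds for A, B, C, D.
    for letter, bound in zip("ABCD", cfg.grade_boundaries):
        if score >= bound:
            return letter
    return "F"


def passed(score: float, cfg: ModuleConfig) -> bool:
    # Assumption: a score equal to the threshold counts as a pass.
    return score >= cfg.pass_threshold
```

Under these assumptions a score of 55 passes yet receives an F, which is one way a high pass rate can coexist with a large share of F grades.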
Key Insights

Categorical Analysis Overview

Analysis Overview & Setup

Purpose

This analysis examines how categorical demographic and program factors relate to student test performance across three subjects (math, reading, writing). The study uses 1,000 complete survey responses to identify which student characteristics and interventions significantly influence academic outcomes, supporting evidence-based educational decision-making.

Key Findings

  • Data Integrity: All 1,000 observations retained with zero rows removed, ensuring complete analysis without data loss
  • Categorical Variables: 5 demographic/program factors across 17 total categories analyzed (gender, race/ethnicity, parental education, lunch program, test prep)
  • Performance Metrics: Mean math score of 66.09 (SD=15.16); grade distribution heavily skewed toward lower grades (F=28.5%, A=5.2%)
  • Statistical Significance: All 15 ANOVA tests significant (p<0.05); zero significant chi-square associations among categorical variables
  • Strongest Effect: Lunch program status on math scores (η²=0.123), with standard lunch students scoring 11.11 points higher than free/reduced lunch students
  • Score Correlations: Reading-writing correlation exceptionally strong (r=0.955); math strongly correlated with both reading (r=0.818) and writing (r=0.803)

Interpretation

The analysis reveals that while

Data Preprocessing

Data Quality & Completeness

Final Observations: 1,000

Data preprocessing and column mapping

Data Pipeline
Initial Records: 1,000
Clean Records: 1,000

Column Mapping
respondent_gender -> gender
race_ethnicity -> race/ethnicity
parental_education -> parental level of education
lunch_program -> lunch
test_prep -> test preparation course
math_score -> math score
reading_score -> reading score
writing_score -> writing score
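
The column mapping above can be applied with a small helper. This is a minimal sketch that assumes each record is a dictionary keyed by the source file's column headers. The helper name `apply_column_map` is hypothetical, and the mapping direction (analysis name to source header) is inferred from the listing.

```python
# Analysis name -> column header in the source data, as listed in the report.
COLUMN_MAP = {
    "respondent_gender": "gender",
    "race_ethnicity": "race/ethnicity",
    "parental_education": "parental level of education",
    "lunch_program": "lunch",
    "test_prep": "test preparation course",
    "math_score": "math score",
    "reading_score": "reading score",
    "writing_score": "writing score",
}


def apply_column_map(rows, column_map=COLUMN_MAP):
    """Return new rows keyed by analysis name; fail loudly on missing columns."""
    renamed = []
    for row in rows:
        missing = [src for src in column_map.values() if src not in row]
        if missing:
            raise KeyError(f"missing source columns: {missing}")
        renamed.append({name: row[src] for name, src in column_map.items()})
    return renamed
```

Raising on a missing column, rather than silently filling a default, keeps a mislabeled export from reaching the statistical tests.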
1,000 Records
MCP Analytics
Key Insights

Data Preprocessing

Purpose

This section documents the data preprocessing pipeline for a 1,000-observation dataset analyzing student performance across multiple demographic and academic dimensions. Data quality and retention rates are critical because they directly affect the validity of the 15 ANOVA tests and 10 chi-square tests performed in the subsequent analysis, ensuring conclusions about group differences are based on complete, uncompromised data.

Key Findings

  • Retention Rate: 100% (1,000 of 1,000 rows retained) - No observations were removed during preprocessing, indicating either exceptionally clean source data or minimal validation criteria applied
  • Rows Removed: 0 - Complete dataset preservation suggests no missing values, duplicates, or outliers were flagged for exclusion
  • Data Integrity: Full sample size maintained across all 5 categorical variables and 3 target variables (math, reading, writing scores), enabling robust statistical testing with adequate group sizes (minimum n=5)
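
The retention figures above can be reproduced with a validation pass like the one sketched here. The report does not state which checks were applied, so the rules used (no missing required values, numeric scores within 0 to 100) and the function name `preprocess` are assumptions for illustration.

```python
def preprocess(rows, required, score_cols, lo=0, hi=100):
    """Keep rows that pass basic validation and summarize retention."""
    kept = []
    for row in rows:
        # Assumed rule 1: every required field must be present and non-empty.
        if any(row.get(col) in (None, "") for col in required):
            continue
        # Assumed rule 2: scores must be numeric and inside [lo, hi].
        try:
            scores = [float(row.get(col)) for col in score_cols]
        except (TypeError, ValueError):
            continue
        if any(s < lo or s > hi for s in scores):
            continue
        kept.append(row)
    initial = len(rows)
    summary = {
        "initial": initial,
        "clean": len(kept),
        "removed": initial - len(kept),
        "retention_pct": 100.0 * len(kept) / initial if initial else 0.0,
    }
    return kept, summary
```

A 100% retention rate from a pass like this shows only that no row violated the chosen rules; stricter rules could still remove rows.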

Interpretation

The perfect retention rate indicates the dataset entered analysis without data quality issues requiring remediation. This supports the reliability of findings showing significant ANOVA effects (all 15 tests p<0.05) and strong score correlations (reading-writing r=0.955). However, the absence of any data cleaning may suggest either pre-cleaned source data or that validation thresholds were not stringent, which could mask underlying

Executive Summary

Key Findings from Comprehensive Categorical Analysis

Executive Summary

Key Findings & Recommendations

Total Respondents: 1,000

Key Performance Indicators

Total respondents: 1,000
Significant chi-square tests: 0
Significant ANOVA tests: 15
Strongest group effect: lunch on math score (eta2=0.1231)
Average pass rate: 89.7%
Math-reading correlation: r = 0.818

Key Findings

Finding                                 Value
Total Respondents                       1,000
Categorical Variables                   5
Significant Associations (Chi-Square)   0 of 10
Significant Group Effects (ANOVA)       15 of 15
Strongest Factor                        lunch on math score (eta2=0.1231)
Primary Target Mean                     66.09 (SD: 15.16)
Score Correlations (Math-Reading)       r = 0.8176
Pass Rate                               89.7%
Grade A Students                        52
Significant Post-Hoc Pairs              43

Executive Summary

Bottom Line: Analyzed 1,000 survey responses across 5 categorical variables and 3 numeric outcomes.

Key Findings:
• 0 of 10 categorical pairs show significant associations
• 15 of 15 ANOVA tests reveal significant group differences
• Strongest factor: lunch on math score (eta2=0.1231)
• Score correlations are strong (math-reading r=0.8176)
• Pass rate: 89.7% (threshold: 50)
• 43 significant pairwise differences (Tukey HSD)

Recommendation: Focus interventions on the factors with the largest effect sizes (eta-squared). Demographic groups with low pass rates and high F-grade concentrations should be prioritized for support programs.

Key Insights

Executive Summary

Purpose

This analysis examined 1,000 respondents across 5 categorical demographic variables to identify which factors most strongly predict performance outcomes (math, reading, and writing scores). The objective was to determine whether demographic characteristics independently associate with achievement and which groups show the largest performance gaps.

Key Findings

  • Categorical Independence: 0 of 10 chi-square tests showed significant associations between demographic pairs, so there is no evidence that the categorical variables cluster together
  • Strong Group Effects: All 15 ANOVA tests were significant (p < 0.05), confirming that demographic groups differ meaningfully on all three score outcomes
  • Largest Practical Effect: Lunch program status on math scores (η² = 0.1231)—students on free/reduced lunch averaged 58.92 vs. 70.03 for standard lunch, an 11-point gap
  • Score Coherence: Math-reading correlation of 0.818 and reading-writing correlation of 0.955 indicate strong alignment across subjects
  • Performance Distribution: 89.7% pass rate overall (threshold: 50), but only 52 students earned A grades while 285 earned F grades, indicating a concentration in the lower grade bands rather than at the top

Interpretation

The chi-square tests found no significant associations among the categorical variables, yet each variable on its own is associated with achievement differences. This

Score Distributions

Density curves showing how math, reading, writing, and total scores are distributed

Score Distributions

Density Curves by Subject

Density distributions for math, reading, writing, and total scores

Primary target mean: 66.09
Primary target SD: 15.16
Total score mean: 203.31
Key Insights

Score Distributions

Purpose

This section visualizes how test scores are distributed across the student population, revealing whether performance follows normal patterns or exhibits skewness that might indicate floor/ceiling effects. Understanding score distributions is essential for identifying whether assessment difficulty is appropriately calibrated and whether demographic disparities (explored in the group comparison sections) reflect genuine performance gaps or measurement artifacts.

Key Findings

  • Math Score Mean: 66.09 (SD: 15.16) – The primary target shows moderate central tendency with substantial variability, suggesting diverse performance levels
  • Total Score Mean: 203.31 (SD: 42.77) – Composite performance across three subjects averages near two-thirds of the maximum possible range
  • Distribution Shape: Positive skew (0.66) indicates a right-tailed distribution, with more students clustering below the mean and a tail extending toward higher scores
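  • Score Range: Values span from near zero to the top of the scale, indicating full use of the scoring range without obvious ceiling effects; the reported upper values (100+ for math, 329 for total) exceed what three subject scores allow if each is out of 100, so they presumably reflect the smoothed tails of the density curves rather than observed scores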
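
The summary statistics above (mean, SD, skewness) can be computed as in the following sketch. The report does not say which skewness estimator it uses; this version uses the moment-based coefficient m3 / m2^1.5, and the function names are hypothetical.

```python
from statistics import mean, stdev


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


def describe(values):
    """Mean, sample standard deviation, and skewness of one score column."""
    return {"mean": mean(values), "sd": stdev(values), "skew": skewness(values)}
```

A positive result means the longer tail lies above the mean and a negative result means it lies below, so the sign is worth checking against the density plot.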

Interpretation

The moderate skew and spread suggest reasonably normal assessment performance without severe floor or ceiling constraints. The math score mean of 66.09 aligns with the overall pass rate of 89.7% (threshold: 50), indicating most students exceed minimum competency. However, the right skew combined with 28.

Score Correlations

Pearson correlation matrix revealing inter-subject relationships

Score Correlations

Pearson Correlation Matrix

Pearson correlation matrix among math, reading, and writing scores

Math-reading correlation: 0.818
Math-writing correlation: 0.803
Reading-writing correlation: 0.955
Key Insights

Score Correlations

Purpose

This section quantifies the strength of relationships between the three test score domains. Understanding score correlations reveals whether academic performance is domain-specific or reflects a unified underlying ability. All correlations are statistically significant (p < 0.001), indicating these relationships are robust and not due to chance.

Key Findings

  • Reading-Writing Correlation (r = 0.955): Exceptionally strong relationship; students’ reading and writing performance are nearly interchangeable, suggesting these skills are highly interdependent or measure similar cognitive abilities.
  • Math-Reading Correlation (r = 0.818): Strong positive relationship; students excelling in math tend to perform well in reading, though the relationship is notably weaker than reading-writing.
  • Math-Writing Correlation (r = 0.803): Slightly weaker than math-reading; math performance shows the most independence from writing skills among the three pairs.
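
For reference, the Pearson coefficients listed above follow the standard formula sketched below. The function names are hypothetical and the input is assumed to be equal-length lists of numeric scores.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_pairs(columns):
    """columns: dict of name -> list of scores. Returns {(a, b): r} per pair."""
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }
```

Squaring r gives the shared variance: r = 0.955 corresponds to about 91% and r = 0.818 to about 67%.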

Interpretation

The correlation hierarchy reveals that reading and writing form a tightly integrated skill cluster, while math operates somewhat independently. This pattern suggests students may have distinct mathematical aptitude separate from language-based competencies. However, all correlations exceed 0.80, confirming that strong overall academic ability manifests across all three domains. The near-perfect reading-writing correlation (0.955) indicates these subjects could be treated as a single construct in predict

Categorical Distributions

Frequency analysis showing respondent composition across all categorical variables

Categorical Distributions

Frequency Analysis by Variable

Variables Analyzed: 5

Frequency distribution of each categorical variable showing counts and percentages

Categorical variables: 5
Total categories: 17
Total respondents: 1,000
Key Insights

Categorical Distributions

Purpose

This section establishes the baseline composition of the dataset by documenting how respondents distribute across five key demographic and program variables. Understanding these categorical distributions is essential for interpreting subsequent statistical tests and identifying whether certain groups are over- or under-represented, which affects the generalizability of findings across the analysis.

Key Findings

  • Gender Distribution: Nearly balanced with 51.8% female (n=518) and 48.2% male (n=482), providing adequate representation for gender-based comparisons
  • Lunch Program Participation: About two-thirds of respondents receive standard lunch (64.5%, n=645) versus free/reduced (35.5%, n=355), so the two groups are unequal in size but both large
  • Test Preparation Course: Only 35.8% (n=358) completed test prep, while 64.2% (n=642) did not, showing low adoption of preparation resources
  • Parental Education: Distributed across six levels with associate’s degree (22.2%) and high school (19.6%) as most common, reflecting diverse educational backgrounds
  • Race/Ethnicity: Concentrated in group C (31.9%) with smaller representation in group A (8.9%), showing unequal demographic composition
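
The counts and percentages above come from a plain frequency table, sketched here with hypothetical function names. The `min_group_size` default of 5 mirrors the module configuration; how the report applies that setting is an assumption.

```python
from collections import Counter


def frequency_table(values):
    """Return (category, count, percent) tuples, most common first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [(cat, n, round(100 * n / total, 1)) for cat, n in counts.most_common()]


def undersized(table, min_group_size=5):
    """Categories with fewer respondents than the configured minimum."""
    return [cat for cat, n, _ in table if n < min_group_size]
```

For example, 518 "female" and 482 "male" values yield 51.8% and 48.2%, matching the gender split reported above.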

Interpretation

The categorical landscape reveals a dataset with balanced gender representation but substantial disparities in socioeconomic indicators (lunch program

Categorical Relationships

Chi-square independence tests reveal associations between categorical variables

Categorical Relationships

Chi-Square Independence Testing

Tests Performed: 10

Cross-tabulation analysis with chi-square independence tests and Cramer's V association strength

Total chi-square tests: 10
Significant chi-square tests: 0
Strongest association: parental level of education x test preparation course (V=0.098)
Key Insights

Categorical Relationships

Purpose

This section evaluates whether categorical demographic and program variables are statistically independent of each other. Understanding these relationships is essential for identifying confounding factors and determining whether observed performance differences across groups stem from demographic characteristics or program participation patterns.

Key Findings

  • Total Chi-Square Tests: 10 pairs tested; 0 showed statistical significance (p < 0.05)
  • Strongest Association: Parental education × test preparation course (Cramer’s V = 0.098) — still a weak association
  • Independence Pattern: No pair shows evidence of dependence at the 0.05 level, with p-values ranging from 0.06 to 0.95

Interpretation

The absence of significant associations suggests that demographic characteristics (gender, race/ethnicity, parental education) and program participation (lunch type, test prep completion) are distributed largely independently of one another within this population. This is analytically valuable because it suggests that performance differences observed across demographic groups are unlikely to be confounded by unequal distribution of test preparation or lunch program participation. The weak Cramer’s V values (all ≤ 0.10) point to minimal practical association strength.

Context

These findings assume adequate cell sizes and random sampling. The independence of categorical variables strengthens the validity of subsequent ANOVA analyses examining performance differences across demographic groups, as group membership is not systematically linked to program exposure.
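
The chi-square statistic and Cramer's V described in this section can be computed from two categorical columns as sketched below. The function name is hypothetical, no continuity correction is applied, and the p-value is omitted because the standard library has no chi-square distribution; with SciPy available, `scipy.stats.chi2.sf(chi2, dof)` would supply it.

```python
import math
from collections import Counter
from itertools import combinations


def chi_square_cramers_v(xs, ys):
    """Return (chi2, degrees_of_freedom, cramers_v) for two categorical columns."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals, col_totals = Counter(xs), Counter(ys)
    chi2 = 0.0
    for r, rn in row_totals.items():
        for c, cn in col_totals.items():
            expected = rn * cn / n  # count expected under independence
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    k = min(len(row_totals), len(col_totals)) - 1
    v = math.sqrt(chi2 / (n * k)) if k > 0 else 0.0
    return chi2, dof, v


# Five categorical variables give C(5, 2) = 10 unordered pairs to test.
VARS = ["gender", "race/ethnicity", "parental level of education",
        "lunch", "test preparation course"]
PAIRS = list(combinations(VARS, 2))
```

The pair count explains why the report lists 10 chi-square tests for 5 variables.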

Multi-Subject Comparison

Side-by-side comparison of math, reading, and writing scores across demographics

Multi-Subject Comparison

All Scores by Demographic Group

Subjects Compared: 3

All three scores compared across each demographic group

Target variables: 3
Categorical variables: 5
Key Insights

Multi-Subject Comparison

Purpose

This section compares performance across three academic subjects (math, reading, writing) within each demographic group to identify whether achievement gaps are consistent or subject-specific. Understanding these patterns reveals whether certain groups face universal barriers or experience advantages/disadvantages in particular subjects, informing targeted intervention strategies.

Key Findings

  • Gender Gap Reversal: Males score 5.1 points higher in math (68.73 vs 63.63), but females score 7.14 points higher in reading and 9.16 points higher in writing, indicating fundamentally different performance profiles by subject
  • Test Preparation Impact: Completion shows consistent gains across all three subjects, with the largest effect on writing (9.91-point gap), suggesting preparation benefits compound across domains
  • Mean Score Range: Across all demographic groups and subjects, scores range from 58.92 to 75.68 (mean=67.97), with relatively low variance (SD=4.09), indicating moderate consistency in performance patterns
  • Parental Education Influence: This variable has the most granular breakdown (18 group means: six education levels across three subjects), suggesting education level creates differentiated subject performance profiles
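
The per-subject gaps quoted above are differences between group means, computed as in this sketch. The function names are hypothetical and each row is assumed to be a dictionary holding one group label and numeric scores.

```python
from collections import defaultdict


def group_means(rows, group_col, score_cols):
    """Mean of each score column within each level of group_col."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        group = row[group_col]
        counts[group] += 1
        for col in score_cols:
            sums[group][col] += float(row[col])
    return {
        group: {col: sums[group][col] / counts[group] for col in score_cols}
        for group in counts
    }


def subject_gaps(means, group_a, group_b):
    """Per-subject difference in means (group_a minus group_b)."""
    return {col: means[group_a][col] - means[group_b][col] for col in means[group_a]}
```

A gap whose sign flips between subjects, as with the gender gap in math versus reading and writing, shows that a disparity is subject-specific rather than uniform.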

Interpretation

The data reveals that demographic disparities are not uniform across subjects. Gender shows the most dramatic subject-specific variation, with males excelling in quantitative reasoning but females demonstrating stronger literacy skills. Test preparation’s consistent positive

Group Score Comparisons

Mean scores across demographic groups with 95% confidence intervals

Group Score Comparisons

Mean Scores with Confidence Intervals

Mean scores compared across categorical groups with 95% confidence intervals

Primary target: math score
Primary target mean: 66.09
Primary target SD: 15.16
Key Insights

Group Score Comparisons

Purpose

This section identifies which demographic and program groups achieve higher or lower math scores, revealing performance disparities across the student population. Understanding these group differences is essential for identifying where targeted interventions may be needed and which factors most strongly influence academic outcomes.

Key Findings

  • Lunch Program Effect: Standard lunch students score 70.03 vs. 58.92 for free/reduced (11.11-point gap)—the largest observed difference with non-overlapping confidence intervals
  • Test Preparation Impact: Completed test prep yields 69.7 vs. 64.08 for no prep (5.62-point gap), indicating meaningful preparation benefits
  • Gender Gap: Males average 68.73 vs. females at 63.63 (5.10-point difference), statistically significant
  • Race/Ethnicity Range: Scores span 61.63–64.46 across groups, with narrower variation than lunch or test prep factors
  • Overall Spread: Mean scores range from 58.92 to 73.82 across all groups, representing substantial performance variation

Interpretation

The data show that socioeconomic status (lunch program) is the strongest differentiator of math performance: its effect size (η²=0.123) is more than twice that of any other factor, with race/ethnicity next (η²=0.055) and test preparation, parental education, and gender each near 0.03. The 11-point lunch program gap suggests resource disparities are strongly associated with achievement.
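
A group mean with a 95% confidence interval, and the overlap check mentioned for the lunch groups, can be sketched as follows. This version uses a normal approximation, which is reasonable for groups of several hundred students; the report may instead use a t-based interval, and the function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least 2 values to estimate a standard error")
    m = mean(values)
    standard_error = stdev(values) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m, m - z * standard_error, m + z * standard_error


def intervals_overlap(a, b):
    """a and b are (mean, lower, upper) tuples from mean_ci."""
    return a[1] <= b[2] and b[1] <= a[2]
```

Non-overlapping intervals are a conservative visual cue for a difference; the ANOVA results remain the formal test.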

ANOVA Testing Results

Statistical significance and effect sizes for all group comparisons

ANOVA Results

Statistical Significance of Group Differences

15
Significant Tests

ANOVA F-tests for group differences with eta-squared effect sizes

categorical_var              target_var     f_statistic  p_value  eta_squared  effect_size
gender                       math score          28.980  <0.001   0.028        Small effect
gender                       reading score       63.350  <0.001   0.060        Small effect
gender                       writing score       99.590  <0.001   0.091        Medium effect
race/ethnicity               math score          14.590  <0.001   0.055        Small effect
race/ethnicity               reading score        5.620  <0.001   0.022        Small effect
race/ethnicity               writing score        7.160  <0.001   0.028        Small effect
parental level of education  math score           6.520  <0.001   0.032        Small effect
parental level of education  reading score        9.290  <0.001   0.045        Small effect
parental level of education  writing score       14.440  <0.001   0.068        Medium effect
lunch                        math score         140.120  <0.001   0.123        Medium effect
lunch                        reading score       55.520  <0.001   0.053        Small effect
lunch                        writing score       64.160  <0.001   0.060        Medium effect
test preparation course      math score          32.540  <0.001   0.032        Small effect
test preparation course      reading score       61.960  <0.001   0.059        Small effect
test preparation course      writing score      108.350  <0.001   0.098        Medium effect
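
For a one-way ANOVA, eta-squared can be recovered from the F statistic and its degrees of freedom, which gives a quick consistency check on the table. The sketch below assumes all 1,000 observations enter every test and that the effect labels follow Cohen's conventional cutoffs (0.01, 0.06, 0.14); the report does not state its cutoffs, so both the thresholds and the function names are assumptions.

```python
def eta_squared_from_f(f_statistic, n_groups, n_total):
    """Recover eta-squared from a one-way ANOVA F statistic.

    Uses eta^2 = F*df1 / (F*df1 + df2) with df1 = k - 1 and df2 = N - k.
    """
    df_between = n_groups - 1
    df_within = n_total - n_groups
    return (f_statistic * df_between) / (f_statistic * df_between + df_within)


def effect_label(eta_sq):
    """Label an effect using Cohen's conventional cutoffs (assumed here)."""
    if eta_sq >= 0.14:
        return "Large effect"
    if eta_sq >= 0.06:
        return "Medium effect"
    if eta_sq >= 0.01:
        return "Small effect"
    return "Negligible effect"


# (factor, target, F, number of groups) taken from the table; N = 1000.
rows = [
    ("lunch", "math score", 140.12, 2),
    ("gender", "reading score", 63.35, 2),
    ("lunch", "writing score", 64.16, 2),
    ("race/ethnicity", "math score", 14.59, 5),
]
for factor, target, f_stat, k in rows:
    eta_sq = eta_squared_from_f(f_stat, k, 1000)
    print(f"{factor} / {target}: eta^2 = {eta_sq:.4f} ({effect_label(eta_sq)})")
```

Under these assumptions, the two rows that both display 0.060 fall on opposite sides of the 0.06 cutoff before rounding, which would account for one being labeled small and the other medium.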

Key Insights

ANOVA Results

Purpose

This section tests whether demographic and program factors are associated with differences in student test scores across math, reading, and writing. All 15 ANOVA tests yielded statistically significant results (p < 0.05), so every examined factor (gender, race/ethnicity, parental education, lunch program status, and test preparation) shows group differences in mean scores that are unlikely to be due to chance. Eta-squared indicates how large each difference is.

Key Findings

  • Lunch Program on Math Score: Eta² = 0.123 (medium effect), the largest effect observed, explaining 12.3% of the variance in math scores
  • Test Preparation on Writing: Eta² = 0.098 (medium effect), the second-largest effect
  • Gender on Writing Score: Eta² = 0.091 (medium effect), the third-largest effect
  • All 15 Tests Significant: Every categorical variable shows statistically reliable group differences across all three subjects

Interpretation

The significance of all 15 tests shows that student demographics and program participation are systematically associated with achievement outcomes. Lunch program status has the largest effect on math scores (η² = 0.123), while test preparation (η² = 0.098) and gender (η² = 0.091) have the largest effects on writing. Five of the 15 effects are medium and ten are small, so no single factor accounts for most of the variation in scores; with 1,000 observations, even small effects reach statistical significance.
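
The F statistic and eta-squared reported for each test both come from the same split of variation into between-group and within-group sums of squares. A minimal pure-Python sketch of that computation follows; the function name and the toy scores are hypothetical and not drawn from the survey data, and the p-value is omitted because it needs the F distribution, which the standard library does not provide.

```python
from statistics import fmean


def one_way_anova(groups):
    """Return (F statistic, eta-squared) for a list of numeric groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = fmean(all_values)

    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of each value around its group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_statistic, eta_squared


# Hypothetical toy scores, not taken from the survey data.
toy_standard = [72, 65, 80, 70, 68]
toy_free_reduced = [58, 61, 55, 66, 60]
f_stat, eta_sq = one_way_anova([toy_standard, toy_free_reduced])
print(f"F = {f_stat:.2f}, eta^2 = {eta_sq:.3f}")
```

Eta-squared is the share of total variation attributable to group membership, which is why it is read as "percent of variance explained".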

Context

Post-Hoc Pairwise Comparisons

Tukey HSD identifies which specific groups differ significantly

Post-Hoc Comparisons

Tukey HSD Pairwise Tests

Significant Pairs: 43 of 84

Tukey HSD pairwise comparisons identifying which specific groups differ

target_var     grouping_var    group1   group2      diff  p_adj   significant
math score     gender          male     female     5.100  <0.001  Yes
reading score  gender          male     female    -7.140  <0.001  Yes
writing score  gender          male     female    -9.160  <0.001  Yes
math score     race/ethnicity  group B  group A    1.820  0.872   No
math score     race/ethnicity  group C  group A    2.830  0.497   No
math score     race/ethnicity  group D  group A    5.730  0.014   Yes
math score     race/ethnicity  group E  group A   12.190  <0.001  Yes
math score     race/ethnicity  group C  group B    1.010  0.945   No
math score     race/ethnicity  group D  group B    3.910  0.044   Yes
math score     race/ethnicity  group E  group B   10.370  <0.001  Yes
math score     race/ethnicity  group D  group C    2.900  0.129   No
math score     race/ethnicity  group E  group C    9.360  <0.001  Yes
math score     race/ethnicity  group E  group D    6.460  <0.001  Yes
reading score  race/ethnicity  group B  group A    2.680  0.601   No
reading score  race/ethnicity  group C  group A    4.430  0.080   No
reading score  race/ethnicity  group D  group A    5.360  0.022   Yes
reading score  race/ethnicity  group E  group A    8.350  <0.001  Yes
reading score  race/ethnicity  group C  group B    1.750  0.678   No
reading score  race/ethnicity  group D  group B    2.680  0.295   No
reading score  race/ethnicity  group E  group B    5.680  0.004   Yes
reading score  race/ethnicity  group D  group C    0.930  0.940   No
reading score  race/ethnicity  group E  group C    3.930  0.058   No
reading score  race/ethnicity  group E  group D    3.000  0.277   No
writing score  race/ethnicity  group B  group A    2.930  0.551   No
writing score  race/ethnicity  group C  group A    5.150  0.035   Yes
writing score  race/ethnicity  group D  group A    7.470  0.001   Yes
writing score  race/ethnicity  group E  group A    8.730  <0.001  Yes
writing score  race/ethnicity  group C  group B    2.230  0.485   No
writing score  race/ethnicity  group D  group B    4.550  0.013   Yes
writing score  race/ethnicity  group E  group B    5.810  0.005   Yes
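
Each `diff` value is the mean of `group1` minus the mean of `group2`, so a negative value means `group2` scored higher and the differences within one factor should add up. The sketch below checks that for the math-score race/ethnicity rows, using only values from the table; the variable names and the 0.011 rounding tolerance are illustrative choices.

```python
# Math-score differences vs. group A, copied from the Tukey table above
# (diff = mean of group1 minus mean of group2).
diff_vs_a = {"group B": 1.82, "group C": 2.83, "group D": 5.73, "group E": 12.19}

# Differences reported directly for the remaining pairs (group1, group2).
reported = {
    ("group C", "group B"): 1.01,
    ("group D", "group B"): 3.91,
    ("group E", "group B"): 10.37,
    ("group D", "group C"): 2.90,
    ("group E", "group C"): 9.36,
    ("group E", "group D"): 6.46,
}

# Each pairwise difference should equal the gap between the two
# "vs. group A" differences, up to rounding of the published values.
mismatches = {
    pair: round(diff_vs_a[pair[0]] - diff_vs_a[pair[1]], 2)
    for pair, diff in reported.items()
    if abs((diff_vs_a[pair[0]] - diff_vs_a[pair[1]]) - diff) > 0.011
}
print("inconsistent rows:", mismatches or "none")
```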

Key Insights

Post-Hoc Comparisons

Purpose

This section identifies which specific demographic and program groups differ significantly from each other across the three test scores. After ANOVA confirms that group differences exist, Tukey’s HSD test pinpoints which pairs of groups differ, with p-values adjusted to control the family-wise false-positive rate across multiple comparisons. This granular view shows where performance gaps are most pronounced.

Key Findings

  • Significant Pairwise Comparisons: 43 of 84 tested pairs show statistically significant differences (adjusted p < 0.05), indicating substantial variation across demographic segments
  • Gender Disparities: Males score 5.10 points higher in math but 7.14 and 9.16 points lower in reading and writing, respectively, a consistent subject-specific pattern
  • Lunch Program Effect: Standard lunch students outperform free/reduced lunch students by 7–7.8 points in reading and writing, compared with 11.11 points in math
  • Test Preparation Impact: Completion of test prep correlates with 5.6–9.9 point advantages across all three subjects
  • Race/Ethnicity Variation: Groups D and E show significantly higher math scores than Groups A and B (differences of 3.9–12.2 points), while Groups A, B, and C do not differ significantly from one another
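
The total of 84 tested pairs follows from the number of levels in each categorical variable. The sketch below shows the count; the level counts for gender, lunch, test preparation, and race/ethnicity follow from the tables, while six parental-education levels is inferred from the 17 total categories stated in the overview.

```python
from math import comb

# Number of levels per categorical variable. Six parental-education levels
# is inferred from the 17 total categories stated in the overview.
levels = {
    "gender": 2,
    "race/ethnicity": 5,
    "parental level of education": 6,
    "lunch": 2,
    "test preparation course": 2,
}
targets = ["math score", "reading score", "writing score"]

# A variable with k levels contributes k-choose-2 unordered pairs.
pairs_per_target = sum(comb(k, 2) for k in levels.values())
total_pairs = pairs_per_target * len(targets)
print(pairs_per_target, total_pairs)  # 28 84
```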

Interpretation

The 43 significant comparisons confirm that

Association Patterns

Standardized residuals showing deviations from expected independence

Association Patterns

Observed vs Expected Frequencies

Strongest Association: parental level of education × test preparation course (V = 0.098)

Mosaic plot data showing observed vs expected frequencies with standardized residuals


Key Insights

Association Patterns

Purpose

This section identifies where categorical variables show the strongest associations in the dataset. The mosaic plot visualizes the relationship between gender and race/ethnicity, with residuals indicating which demographic combinations occur more or less frequently than statistical independence would predict. Understanding these patterns helps identify whether demographic groups are distributed evenly across the population or show meaningful clustering.

Key Findings

  • Strongest Association: Parental education × test preparation course (Cramér’s V = 0.098). Although this is the strongest association among the 10 pairs tested, it falls below the 0.1 level often used as the cutoff for a small effect, indicating minimal dependency between these variables
  • Gender × Race/Ethnicity Residuals: Range from -1.49 to +1.54, all within the ±2 band commonly used to flag notable cells, so no cell departs meaningfully from its expected frequency
  • Pattern Observed: All 10 chi-square tests yielded non-significant results (p > 0.05), suggesting categorical variables operate largely independently of one another

Interpretation

Across the 10 categorical variable pairs tested, none shows a statistically significant association. Residuals clustering near zero indicate that observed frequencies closely match the values expected under independence. The categorical variables (gender, race/ethnicity, parental education, lunch program status, test preparation) are therefore largely independent of one another in this sample: knowing a student's level on one variable says little about their level on another.
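
Cramér's V and the cell residuals both derive from the gap between observed counts and the counts expected under independence. The sketch below is pure Python and uses Pearson residuals, (observed − expected) / √expected, as one common definition of a standardized residual; whether the report uses this or the adjusted form is not stated, so treat that choice, the function name, and the toy counts as assumptions. It also assumes every expected count is greater than zero.

```python
from math import sqrt


def chi_square_summary(table):
    """Return (chi-square, Cramér's V, Pearson residuals) for a 2-D count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi_sq = 0.0
    residuals = []
    for i, row in enumerate(table):
        residual_row = []
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            residual = (observed - expected) / sqrt(expected)
            residual_row.append(residual)
            chi_sq += residual ** 2
        residuals.append(residual_row)

    # V rescales chi-square by sample size and the smaller table dimension.
    min_dim = min(len(table), len(table[0])) - 1
    cramers_v = sqrt(chi_sq / (n * min_dim))
    return chi_sq, cramers_v, residuals


# Hypothetical 2 x 3 table of counts, not taken from the survey data.
toy_table = [
    [30, 45, 25],
    [35, 40, 25],
]
chi_sq, cramers_v, residuals = chi_square_summary(toy_table)
flagged = [
    (i, j)
    for i, row in enumerate(residuals)
    for j, r in enumerate(row)
    if abs(r) > 2
]
print(f"chi-square = {chi_sq:.3f}, V = {cramers_v:.3f}, flagged cells: {flagged}")
```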

Context

These weak associations contrast sharply with the strong

Performance Tiers

Letter grade distribution and pass rates by demographic group

Performance Tiers

Grade Distribution by Demographics

Pass Rate: 89.7% (average)

Grade distribution (A-F) and pass/fail rates with demographic breakdowns

Grade A count: 52
Grade F count: 285

Key Insights

Performance Tiers

Purpose

This section evaluates student performance distribution across letter grades (A–F) and identifies which demographic groups concentrate in top versus bottom performance tiers. Understanding grade distribution reveals whether performance gaps observed in earlier analyses translate into meaningful disparities in final outcomes, directly addressing how socioeconomic and demographic factors influence academic achievement.

Key Findings

  • Pass Rate (≥50): 89.7%. The large majority of students meet the minimum threshold, indicating broad baseline competency across the cohort
  • Grade A Concentration: Only 52 students (5.2%) score 90 or above, a narrow top tier despite the high pass rate
  • Grade F Prevalence: 285 students (28.5%) receive an F (below 60 under the configured grade boundaries), the single largest grade category. Because the F cutoff is higher than the pass threshold of 50, a student can pass and still receive an F
  • Grade Distribution Skew: The distribution is heavily skewed toward lower grades, with 54.2% of students earning C or below, while only 19.8% earn B or above
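
The pass rate and the letter grades use different cutoffs, which is easiest to see in code. The sketch below applies the configured `grade_boundaries` (90, 80, 70, 60) and `pass_threshold` (50); it assumes each boundary is an inclusive lower bound and that both rules apply to the same score, neither of which the report states explicitly. The toy scores are hypothetical.

```python
from collections import Counter

GRADE_BOUNDARIES = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]  # from config
PASS_THRESHOLD = 50  # from config


def letter_grade(score):
    """Map a score to a letter, treating each boundary as an inclusive minimum."""
    for cutoff, grade in GRADE_BOUNDARIES:
        if score >= cutoff:
            return grade
    return "F"


def passed(score):
    """A score passes when it meets or exceeds the pass threshold."""
    return score >= PASS_THRESHOLD


# Hypothetical toy scores, not taken from the survey data.
toy_scores = [95, 83, 74, 66, 58, 52, 47]
grade_counts = Counter(letter_grade(s) for s in toy_scores)
pass_rate = sum(passed(s) for s in toy_scores) / len(toy_scores)
passing_f = [s for s in toy_scores if passed(s) and letter_grade(s) == "F"]

print(dict(grade_counts))
print(f"pass rate = {pass_rate:.1%}, passing with an F: {passing_f}")
```

Scores from 50 to 59 pass but still map to F under these rules, so a high pass rate and a large F share can coexist.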

Interpretation

The high pass rate masks a deeply stratified performance landscape. While 89.7% of students technically pass, the concentration of F grades (28.5%) and scarcity of A grades (5.2%) demonstrate that most passing students cluster in the C–D range. This pattern aligns with earlier ANOVA findings showing significant group effects by lunch program (η²=0.123
