Analysis overview and configuration
| Parameter | Value |
|---|---|
| position_adjustment | true |
| confidence_level | 0.95 |
| min_impressions | 10 |
| decision_threshold | 0.05 |
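The configuration above can be expressed as a plain dictionary (a sketch; the report's actual configuration format is not shown):

```python
# Analysis configuration, mirroring the table above. The comments describe
# how each parameter is used later in the report.
CONFIG = {
    "position_adjustment": True,   # normalize CTR by expected CTR at each position
    "confidence_level": 0.95,      # width of the reported confidence intervals
    "min_impressions": 10,         # floor for a page to enter the analysis
    "decision_threshold": 0.05,    # minimum lift treated as practically meaningful
}

print(CONFIG["confidence_level"])
```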
This analysis evaluates whether title changes improved click-through rates (CTR) across 7 pages, with statistical adjustment for average position shifts. The test compares a control group (12 initial observations) against a treatment group (7 initial observations), examining whether position-adjusted CTR improvements are statistically significant and practically meaningful.
While the treatment variant demonstrates a substantial adjusted CTR improvement and consistent position gains, the analysis lacks the statistical power to confirm that most improvements are real rather than random variation. The single promoted page shows genuine significance, but the remaining six pages need more data before their verdicts can move beyond keep-running.
Data preprocessing and column mapping
| Metric | Value |
|---|---|
| Initial Rows | 19 |
| Final Rows | 7 |
| Rows Removed | 12 |
| Retention Rate | 36.8% |
This section documents the data filtering applied before statistical analysis of the A/B test results. The 63.2% removal rate reflects quality control measures necessary to ensure only valid page-level comparisons enter the analysis. Understanding retention is critical because it directly impacts the reliability of conclusions drawn about treatment effectiveness across the tested pages.
The aggressive filtering ensures statistical validity by excluding underpowered comparisons. With only 7 pages retained, the analysis focuses on pages meeting minimum data quality standards. This explains the low average statistical power (0.165) observed across tests—even retained pages have limited treatment impressions (mean: 238 vs. control: 1,362), creating inherent power constraints that directly contribute to the 85.7% inconclusive verdict rate.
No train/test split is applied: this is an observational before/after comparison, so all retained pages enter the significance testing directly.
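The impression-floor filter described above can be sketched as follows, assuming each row carries per-variant impression counts; the field names are illustrative, and `min_impressions=10` comes from the configuration table:

```python
def filter_pages(rows, min_impressions=10):
    """Keep only pages where both variants clear the impression floor."""
    return [
        r for r in rows
        if r["control_impressions"] >= min_impressions
        and r["treatment_impressions"] >= min_impressions
    ]

# Hypothetical rows: only the first clears the floor in both variants.
rows = [
    {"page": "a.html", "control_impressions": 1303, "treatment_impressions": 282},
    {"page": "b.html", "control_impressions": 40, "treatment_impressions": 3},
]
print(filter_pages(rows))  # only a.html survives
```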
| finding | value |
|---|---|
| Overall Verdict | 1 promote, 0 rollback, 0 neutral, 6 keep running |
| Pages Tested | 7 |
| Statistically Significant | 1 (14.3%) |
| Win Rate | 100.0% (1/1) |
| Biggest Winner | linear-discriminant-analysis-lda-practical-guide-for-data-driven-decisions.html (+1.44x) |
| Biggest Loser | support-vector-machine-svm-practical-guide-for-data-driven-decisions.html (-0.31x) |
| Average Adjusted CTR Lift | 0.27x |
| Inconclusive (Need More Data) | 6 (85.7%) |
This section synthesizes the overall A/B testing results across 7 pages to assess whether the SEO title optimization strategy achieved its business objective. It provides decision-makers with a clear bottom-line assessment of test performance, statistical confidence, and readiness for deployment.
The experiment identified one clear winner: the product bundle affinity tutorial achieved a 72% adjusted CTR lift with a Bonferroni-adjusted p-value of 0.012. However, 86% of tested pages remain inconclusive due to low statistical power (average power: 0.165), suggesting insufficient sample sizes or effect sizes too small to detect reliably. The 100% win rate reflects only one significant result, limiting confidence in the broader strategy's effectiveness.
The treatment group received substantially fewer impressions (1,669 total vs. 9,534 for control), so even large observed lifts carry wide uncertainty.
Position-Adjusted CTR Lift per Page with 95% Confidence Intervals
This forest plot visualizes position-adjusted CTR lift across 7 tested pages, isolating the treatment effect from natural position-based click variations. By adjusting for search position, the analysis reveals whether observed CTR changes reflect genuine content improvements or simply result from ranking shifts. This is critical for understanding whether the treatment causally improved user engagement.
The experiment reveals mixed results when controlling for position effects. While the average lift appears positive, 86% of pages remain inconclusive—their confidence intervals encompass zero, meaning observed differences could plausibly be due to chance. The single significant winner (Shopify tutorial) demonstrates the treatment can work, but most pages require additional data to confirm whether improvements are real or statistical artifacts.
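A confidence interval that spans zero is what makes a page inconclusive here. As a minimal sketch, a 95% Wald interval on the raw CTR difference between variants looks like this (the report's intervals are on the position-adjusted scale, so its exact bounds differ):

```python
import math

def ctr_diff_ci(clicks_c, imp_c, clicks_t, imp_t, z=1.96):
    """95% Wald CI for the difference in CTR (treatment minus control)."""
    p_c = clicks_c / imp_c
    p_t = clicks_t / imp_t
    diff = p_t - p_c
    se = math.sqrt(p_c * (1 - p_c) / imp_c + p_t * (1 - p_t) / imp_t)
    return diff - z * se, diff + z * se

# LDA page, raw counts from the comparison table below:
lo, hi = ctr_diff_ci(2, 1082, 3, 195)
print(lo <= 0 <= hi)  # an interval spanning zero means inconclusive at 95%
```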
Side-by-Side Control vs Treatment Metrics (Before Position Adjustment)
| page | control_impressions | control_clicks | control_ctr | control_position | treatment_impressions | treatment_clicks | treatment_ctr | treatment_position | raw_ctr_lift | position_change |
|---|---|---|---|---|---|---|---|---|---|---|
| articles/arima-practical-guide-for-data-driven-decisions.html | 1303 | 2 | 0.0015 | 15.91 | 282 | 0 | 0 | 10.56 | -0.0015 | -5.35 |
| articles/association-rules-apriori-practical-guide-for-data-driven-decisions.html | 919 | 1 | 0.0011 | 12.73 | 291 | 1 | 0.0034 | 10.66 | 0.0023 | -2.07 |
| articles/linear-discriminant-analysis-lda-practical-guide-for-data-driven-decisions.html | 1082 | 2 | 0.0018 | 15.63 | 195 | 3 | 0.0154 | 11.61 | 0.0136 | -4.02 |
| articles/one-class-svm-practical-guide-for-data-driven-decisions.html | 1567 | 1 | 0.0006 | 9.22 | 422 | 1 | 0.0024 | 8.24 | 0.0018 | -0.98 |
| articles/session-based-recommendations-practical-guide-for-data-driven-decisions.html | 1639 | 0 | 0 | 6.21 | 262 | 0 | 0 | 8.27 | 0 | 2.06 |
| articles/support-vector-machine-svm-practical-guide-for-data-driven-decisions.html | 1331 | 3 | 0.0023 | 13.54 | 176 | 0 | 0 | 11.14 | -0.0023 | -2.4 |
| tutorials/how-to-use-product-bundle-affinity-analysis-in-shopify-step-by-step-tutorial.html | 1693 | 0 | 0 | 9.1 | 41 | 1 | 0.0244 | 6.26 | 0.0244 | -2.84 |
This section presents raw, unadjusted performance metrics to establish a baseline comparison between control and treatment periods. It serves as the foundation for understanding why position-adjusted analysis is necessary: the treatment variant achieved a higher raw CTR averaged across pages (0.65% vs 0.10%), but this difference may be partially or entirely attributable to improved search ranking (average position improved by 2.23) rather than title quality alone.
The raw metrics suggest treatment outperforms, but this comparison conflates two distinct factors: title quality and search ranking. Since treatment pages ranked higher, they received more favorable visibility, making it impossible to isolate whether improved CTR stems from stronger titles or simply from greater visibility. The position-adjusted analysis in the following sections exists to separate these two effects.
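The derived columns in the table above follow directly from click and impression counts; a sketch (field names are illustrative):

```python
def raw_metrics(c_clicks, c_imp, c_pos, t_clicks, t_imp, t_pos):
    """Compute the raw comparison columns for one page."""
    c_ctr = c_clicks / c_imp
    t_ctr = t_clicks / t_imp
    return {
        "control_ctr": round(c_ctr, 4),
        "treatment_ctr": round(t_ctr, 4),
        "raw_ctr_lift": round(t_ctr - c_ctr, 4),
        "position_change": round(t_pos - c_pos, 2),  # negative = improved rank
    }

# LDA row from the table above:
print(raw_metrics(2, 1082, 15.63, 3, 195, 11.61))
```

Note that rounding before subtracting (as the table appears to do for `raw_ctr_lift`) can differ in the last digit from subtracting first.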
Position vs CTR Scatter with Expected CTR Curve
This scatter plot isolates the relationship between search position and click-through rate to determine whether the treatment's CTR improvements stem from better positioning or from title/content changes that drive clicks independent of position. By overlaying observed data against the expected CTR curve, it reveals whether the treatment variant is outperforming or underperforming industry benchmarks at its achieved positions.
The treatment's +0.27x adjusted CTR lift appears largely attributable to improved search positioning rather than superior title or content relevance. With treatment pages ranking ~2.2 positions higher on average, they naturally capture more clicks according to industry expectations. Only the promoted tutorial page (0.72x lift) shows meaningful divergence from the curve, suggesting genuine title/content superiority beyond positional advantage.
Low absolute click volumes (no page exceeds 3 clicks in either variant) mean each plotted point carries wide sampling uncertainty, so deviations from the expected curve should be read cautiously.
Position Adjustment Calculation Breakdown
| page | variant | raw_ctr | position_val | expected_ctr | adjusted_ctr | adjustment_factor |
|---|---|---|---|---|---|---|
| articles/arima-practical-guide-for-data-driven-decisions.html | control | 0.0015 | 15.91 | 0.0059 | 0.2558 | 0.2558 |
| articles/association-rules-apriori-practical-guide-for-data-driven-decisions.html | control | 0.0011 | 12.73 | 0.0081 | 0.136 | 0.136 |
| articles/linear-discriminant-analysis-lda-practical-guide-for-data-driven-decisions.html | control | 0.0018 | 15.63 | 0.0061 | 0.2971 | 0.2971 |
| articles/one-class-svm-practical-guide-for-data-driven-decisions.html | control | 0.0006 | 9.22 | 0.015 | 0.04 | 0.04 |
| articles/session-based-recommendations-practical-guide-for-data-driven-decisions.html | control | 0 | 6.21 | 0.034 | 0 | 0 |
| articles/support-vector-machine-svm-practical-guide-for-data-driven-decisions.html | control | 0.0023 | 13.54 | 0.0075 | 0.3058 | 0.3058 |
| tutorials/how-to-use-product-bundle-affinity-analysis-in-shopify-step-by-step-tutorial.html | control | 0 | 9.1 | 0.015 | 0 | 0 |
| articles/arima-practical-guide-for-data-driven-decisions.html | treatment | 0 | 10.56 | 0.0096 | 0 | 0 |
| articles/association-rules-apriori-practical-guide-for-data-driven-decisions.html | treatment | 0.0034 | 10.66 | 0.0095 | 0.3565 | 0.3565 |
| articles/linear-discriminant-analysis-lda-practical-guide-for-data-driven-decisions.html | treatment | 0.0154 | 11.61 | 0.0089 | 1.736 | 1.736 |
| articles/one-class-svm-practical-guide-for-data-driven-decisions.html | treatment | 0.0024 | 8.24 | 0.02 | 0.12 | 0.12 |
| articles/session-based-recommendations-practical-guide-for-data-driven-decisions.html | treatment | 0 | 8.27 | 0.02 | 0 | 0 |
| articles/support-vector-machine-svm-practical-guide-for-data-driven-decisions.html | treatment | 0 | 11.14 | 0.0092 | 0 | 0 |
| tutorials/how-to-use-product-bundle-affinity-analysis-in-shopify-step-by-step-tutorial.html | treatment | 0.0244 | 6.26 | 0.034 | 0.7176 | 0.7176 |
This section isolates the title/content quality impact from ranking position effects by normalizing actual CTR against industry benchmarks for each position. Since the treatment variant achieved better average positions (9.53 vs 11.76), adjusted CTR reveals whether improved clicks stem from better rankings alone or from genuinely stronger title performance. This is critical for evaluating whether the treatment represents a true quality improvement.
The adjustment mechanism shows that treatment pages do not merely benefit from ranking improvements: the LDA article attracts clicks at 1.74x the rate expected for its position (versus 0.30x for its control), and the Shopify tutorial reaches 0.72x against a control of zero.
Significance Testing Results with P-Values and Power Analysis
| page | p_value | adjusted_p_value | ci_lower | ci_upper | power | is_significant | sample_size_adequate |
|---|---|---|---|---|---|---|---|
| articles/arima-practical-guide-for-data-driven-decisions.html | 1 | 1 | -0.0052 | 0.0021 | 0.1536 | False | True |
| articles/association-rules-apriori-practical-guide-for-data-driven-decisions.html | 0.9749 | 1 | -0.007 | 0.0117 | 0.0949 | False | True |
| articles/linear-discriminant-analysis-lda-practical-guide-for-data-driven-decisions.html | 0.0305 | 0.2137 | -0.007 | 0.034 | 0.3621 | False | True |
| articles/one-class-svm-practical-guide-for-data-driven-decisions.html | 0.8958 | 1 | -0.0046 | 0.008 | 0.1046 | False | True |
| articles/session-based-recommendations-practical-guide-for-data-driven-decisions.html | 1 | 1 | 0 | 0 | 0 | False | True |
| articles/support-vector-machine-svm-practical-guide-for-data-driven-decisions.html | 1 | 1 | -0.0071 | 0.0025 | 0.1447 | False | True |
| tutorials/how-to-use-product-bundle-affinity-analysis-in-shopify-step-by-step-tutorial.html | 0.0017 | 0.012 | -0.0353 | 0.0841 | 0.295 | True | True |
This section evaluates whether observed treatment effects are statistically reliable or likely due to chance. Using two-proportion z-tests with Bonferroni correction, it controls for false positives when testing multiple pages simultaneously. Understanding statistical significance is critical for distinguishing genuine improvements from noise in the experiment.
The low power (16.5%) means the experiment lacks sufficient sample size to reliably detect treatment effects across most pages. Six of seven pages remain inconclusive—not because the treatment failed, but because the data volume is insufficient to distinguish signal from noise. The single significant result (Shopify tutorial, adjusted p = 0.012) passed the corrected threshold, but the broader pattern suggests most observed differences are statistically indistinguishable from zero.
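A minimal sketch of the named procedure: a two-proportion z-test per page, with a Bonferroni correction across the seven pages. This is a generic implementation on raw click counts, not the report's exact code (which tests on the adjusted scale, so its p-values differ):

```python
import math

def two_prop_z(clicks_c, imp_c, clicks_t, imp_t):
    """Two-sided two-proportion z-test; returns the p-value."""
    p_c, p_t = clicks_c / imp_c, clicks_t / imp_t
    p_pool = (clicks_c + clicks_t) / (imp_c + imp_t)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / imp_c + 1 / imp_t))
    if se == 0:
        return 1.0  # no clicks in either variant: no evidence either way
    z = (p_t - p_c) / se
    # two-sided p-value via the normal CDF, Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Shopify tutorial row: 0/1693 control clicks vs 1/41 treatment clicks
p = two_prop_z(0, 1693, 1, 41)
print(p, bonferroni([p] * 7)[0])
```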
Per-Page Recommendations with Estimated Click Uplift
| page | verdict | adjusted_ctr_lift | p_value | estimated_monthly_click_uplift |
|---|---|---|---|---|
| articles/arima-practical-guide-for-data-driven-decisions.html | keep_running | -0.2558 | 1 | 0 |
| articles/association-rules-apriori-practical-guide-for-data-driven-decisions.html | keep_running | 0.2205 | 1 | 0 |
| articles/linear-discriminant-analysis-lda-practical-guide-for-data-driven-decisions.html | keep_running | 1.438 | 0.2137 | 0 |
| articles/one-class-svm-practical-guide-for-data-driven-decisions.html | keep_running | 0.08 | 1 | 0 |
| articles/session-based-recommendations-practical-guide-for-data-driven-decisions.html | keep_running | 0 | 1 | 0 |
| articles/support-vector-machine-svm-practical-guide-for-data-driven-decisions.html | keep_running | -0.3058 | 1 | 0 |
| tutorials/how-to-use-product-bundle-affinity-analysis-in-shopify-step-by-step-tutorial.html | promote | 0.7176 | 0.012 | 30 |
This section synthesizes per-page experiment results into clear verdicts based on statistical significance and effect size. It translates raw statistical findings into actionable categories—promote, rollback, neutral, or keep running—enabling stakeholders to make informed decisions about which title treatments should be deployed, reverted, or extended with additional data collection.
The experiment reveals a highly imbalanced power distribution: one clear winner emerged from statistical testing, but the majority of pages remain underpowered. The wide range of adjusted CTR lifts (-0.31x to +1.44x) underscores this variability; until additional impressions accumulate, most verdicts must remain keep_running.
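The verdict rules implied by the table above can be sketched as follows, assuming the configured decision_threshold of 0.05 applies to adjusted CTR lift and significance is judged on the Bonferroni-adjusted p-value:

```python
def verdict(adjusted_p, lift, alpha=0.05, threshold=0.05):
    """Map a page's test results to one of the four verdict categories."""
    if adjusted_p < alpha:           # statistically significant
        if lift > threshold:
            return "promote"
        if lift < -threshold:
            return "rollback"
        return "neutral"             # significant but practically negligible
    return "keep_running"            # inconclusive: gather more data

print(verdict(0.012, 0.7176))   # Shopify tutorial -> "promote"
print(verdict(0.2137, 1.438))   # LDA article -> "keep_running"
```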