Association rules reveal items that frequently occur together. With proper thresholds and validation, they drive cross‑sell bundles, promotions, and layout decisions.
Quick Overview
Inputs
- Dataset: transactions (long) or binary matrix (wide)
- Data format: transaction_items or binary_matrix
- Columns: transaction_column, item_column (long format)
- Thresholds: min_support, min_confidence, min_lift, max_length, top_n_rules
- Optional: userContext, processing_id
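To make the two input layouts concrete, here is a minimal sketch that groups long-format rows into baskets and then expands them into a wide 0/1 matrix. The sample rows and the helper names `to_baskets` and `to_binary_matrix` are illustrative assumptions, not part of the tool.

```python
from collections import defaultdict

# Hypothetical long-format rows: one (transaction_column, item_column) pair per line.
rows = [
    ("t1", "bread"), ("t1", "milk"),
    ("t2", "bread"), ("t2", "butter"), ("t2", "milk"),
    ("t3", "butter"),
]

def to_baskets(rows):
    """Group long-format rows into one set of items per transaction."""
    baskets = defaultdict(set)
    for transaction_id, item in rows:
        baskets[transaction_id].add(item)
    return dict(baskets)

def to_binary_matrix(baskets):
    """Build the wide layout: one row per transaction, one 0/1 column per item."""
    items = sorted({item for basket in baskets.values() for item in basket})
    matrix = {
        transaction_id: [1 if item in basket else 0 for item in items]
        for transaction_id, basket in baskets.items()
    }
    return items, matrix

baskets = to_baskets(rows)
items, matrix = to_binary_matrix(baskets)
```

Either layout carries the same information; the long form is usually easier to export from order tables, while the wide form is convenient for counting co-occurrences.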
What
- Convert rows to transactions and compute basket stats
- Mine rules with Apriori; compute support, confidence, lift, leverage, conviction
- Filter by lift; rank top rules by lift/confidence/support
- Extract frequent itemsets, network edges, and scatter data
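The mining and ranking steps above can be sketched in plain Python. This is a simplified level-wise Apriori under stated assumptions, not the tool's actual implementation; the function names `frequent_itemsets` and `mine_rules` and the sample baskets are hypothetical, while the threshold arguments mirror the inputs listed earlier.

```python
from itertools import combinations

def frequent_itemsets(baskets, min_support, max_length):
    """Level-wise Apriori search; returns {frozenset: support}."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)
    candidates = {frozenset([item]) for b in baskets for item in b}
    frequent, k = {}, 1
    while candidates and k <= max_length:
        # Count each candidate and keep those that meet min_support.
        level = {c: sum(c <= b for b in baskets) / n for c in candidates}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Join: merge frequent k-itemsets into (k+1)-item candidates.
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k + 1}
        # Prune: every k-item subset of a candidate must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(sub) in level for sub in combinations(c, k))
        }
        k += 1
    return frequent

def mine_rules(frequent, min_confidence, min_lift, top_n):
    """Split each frequent itemset into antecedent -> consequent rules."""
    rules = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                a = frozenset(antecedent)
                c = itemset - a
                confidence = support / frequent[a]
                lift = confidence / frequent[c]
                if confidence >= min_confidence and lift >= min_lift:
                    rules.append({"antecedent": a, "consequent": c, "support": support,
                                  "confidence": confidence, "lift": lift})
    # Rank by lift, then confidence, then support.
    rules.sort(key=lambda r: (r["lift"], r["confidence"], r["support"]), reverse=True)
    return rules[:top_n]

# Hypothetical sample baskets.
baskets = [{"bread", "milk"}, {"bread", "butter", "milk"}, {"butter"}, {"bread", "milk"}]
frequent = frequent_itemsets(baskets, min_support=0.5, max_length=3)
rules = mine_rules(frequent, min_confidence=0.6, min_lift=1.0, top_n=10)
```

The lookups `frequent[a]` and `frequent[c]` rely on the downward-closure property: every subset of a frequent itemset is itself frequent, so it was recorded at an earlier level.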
Why
- Identify cross‑sell opportunities and recommendation seeds
- Design planograms and promo pairings based on true affinities
- Prioritize rules by lift and actionability to drive revenue
Outputs
- Metrics: transactions, unique items, basket size, total/strong rules, avg confidence/lift
- Tables: top_rules, frequent_itemsets, top_items, bundle_recommendations, category_patterns, rule_summary
- Datasets: network_data, scatter_data, confidence_matrix, lift_distribution
Key Metrics
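- Support: P(A ∩ B), the fraction of baskets that contain every item in the itemset
- Confidence: P(B | A) = support(A and B)/support(A), the likelihood of B given A
- Lift: confidence(A→B)/support(B); >1 suggests positive association, 1 independence, <1 negative association
- Conviction: (1 − support(B))/(1 − confidence(A→B)); how much more often the rule would fail if A and B were independent (1 means independence, larger values a more reliable rule)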
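As a worked example of these formulas, the sketch below computes each metric for a single rule on a handful of baskets. The function name `rule_metrics` and the sample data are assumptions for illustration; leverage is included because it is listed among the computed metrics, using its usual definition support(A and B) − support(A)·support(B).

```python
import math

def rule_metrics(baskets, antecedent, consequent):
    """Compute the standard metrics for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    n = len(baskets)
    support_a = sum(a <= b for b in baskets) / n
    support_c = sum(c <= b for b in baskets) / n
    support_ac = sum((a | c) <= b for b in baskets) / n  # baskets holding all items
    if support_a == 0 or support_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = support_ac / support_a
    lift = confidence / support_c
    leverage = support_ac - support_a * support_c
    # Conviction is unbounded when the rule never fails (confidence == 1).
    conviction = math.inf if confidence == 1 else (1 - support_c) / (1 - confidence)
    return {"support": support_ac, "confidence": confidence, "lift": lift,
            "leverage": leverage, "conviction": conviction}

# Hypothetical sample: bread and milk co-occur in 2 of 5 baskets.
baskets = [{"bread", "milk"}, {"bread", "milk"}, {"bread"}, {"milk"}, {"butter"}]
metrics = rule_metrics(baskets, {"bread"}, {"milk"})
```

Here bread and milk each appear in 3 of 5 baskets and together in 2, so confidence is 2/3, lift is about 1.11, leverage is 0.04, and conviction is 1.2: a mild positive association.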
Constraints
- Min thresholds: support, confidence, lift (and conviction)
- Include/exclude items or categories; mine closed or maximal itemsets
- Slice by customer, time, or category to find targeted patterns
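The closed/maximal and include/exclude constraints can be illustrated as filters over already-mined itemsets. This is a rough sketch: the helper names and the sample counts are hypothetical, and it assumes the input holds every frequent itemset (no truncation by max_length) with integer support counts.

```python
def closed_itemsets(counts):
    """Keep itemsets that have no proper superset with the same support count."""
    return {
        s: n for s, n in counts.items()
        if not any(s < t and n == m for t, m in counts.items())
    }

def maximal_itemsets(counts):
    """Keep itemsets that have no frequent proper superset at all."""
    return {s: n for s, n in counts.items() if not any(s < t for t in counts)}

def filter_items(counts, include=None, exclude=()):
    """Drop itemsets with an excluded item; if include is given, require one of its items."""
    include = None if include is None else set(include)
    exclude = set(exclude)
    return {
        s: n for s, n in counts.items()
        if not (s & exclude) and (include is None or s & include)
    }

# Hypothetical support counts for frequent itemsets.
counts = {
    frozenset({"bread"}): 4,
    frozenset({"milk"}): 3,
    frozenset({"butter"}): 2,
    frozenset({"bread", "milk"}): 3,
}
closed = closed_itemsets(counts)
maximal = maximal_itemsets(counts)
without_butter = filter_items(counts, exclude={"butter"})
```

In this sample, milk is not closed because bread and milk together have the same count, while bread stays closed because it also occurs without milk. Maximal itemsets are always a subset of the closed ones, so they give the most compact summary at the cost of losing subset supports.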
Validation
- Holdout evaluation and backtests to check rule stability
- Guard against spurious co‑occurrence: a very popular consequent yields high confidence even without real affinity, so check lift as well as confidence
- Prioritize rules with high lift and business actionability
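One simple way to run a holdout check is to recompute lift on transactions that were not used for mining and keep only rules that clear the threshold on both sets. The sketch below assumes that criterion; the function names, the split, and the sample baskets are hypothetical.

```python
def lift(baskets, antecedent, consequent):
    """Lift of antecedent -> consequent, or None if either side never occurs."""
    n = len(baskets)
    count_a = sum(antecedent <= b for b in baskets)
    count_c = sum(consequent <= b for b in baskets)
    count_both = sum((antecedent | consequent) <= b for b in baskets)
    if count_a == 0 or count_c == 0:
        return None
    return (count_both / count_a) / (count_c / n)

def check_stability(rules, train, holdout, min_lift=1.0):
    """Report train and holdout lift per rule and whether both clear min_lift."""
    report = []
    for antecedent, consequent in rules:
        train_lift = lift(train, antecedent, consequent)
        holdout_lift = lift(holdout, antecedent, consequent)
        stable = (
            train_lift is not None and holdout_lift is not None
            and train_lift >= min_lift and holdout_lift >= min_lift
        )
        report.append((antecedent, consequent, train_lift, holdout_lift, stable))
    return report

# Hypothetical split, for example earlier transactions versus later ones.
train = [{"bread", "milk"}, {"bread", "milk"}, {"bread", "butter"},
         {"bread", "milk", "butter"}, {"eggs"}, {"eggs"}]
holdout = [{"bread", "milk"}, {"bread", "milk"}, {"butter"},
           {"butter", "eggs"}, {"eggs"}, {"bread"}]
rules = [
    (frozenset({"bread"}), frozenset({"milk"})),
    (frozenset({"bread"}), frozenset({"butter"})),
]
report = check_stability(rules, train, holdout)
```

In this sample both rules have lift 1.5 on the training baskets, but only bread→milk keeps a lift above 1 on the holdout; bread→butter never co-occurs there and would be dropped as unstable.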
Applications
- Cross‑sell bundles and recommendation seeds
- Aisle adjacency and planogram design
- Promo pairing and coupon targeting