A/B Test Statistics Calculator
Calculate statistical significance for A/B tests - know when your results are real, not random chance.
When to Use This Skill
- Test analysis - Determine if results are statistically significant
- Sample planning - Calculate required sample size before testing
- Duration estimation - Know how long to run experiments
- Power analysis - Ensure tests can detect meaningful differences
What Claude Does vs What You Decide
| Claude Does | You Decide |
|---|---|
| Structures analysis frameworks | Metric definitions |
| Identifies patterns in data | Business interpretation |
| Creates visualization templates | Dashboard design |
| Suggests optimization areas | Action priorities |
| Calculates statistical measures | Decision thresholds |
Dependencies
pip install scipy numpy click
Commands
Check Significance
python scripts/main.py significance --control 1000,50 --variant 1000,65
python scripts/main.py significance --control 5000,250 --variant 5000,300 --confidence 0.99
Calculate Sample Size
python scripts/main.py sample-size --baseline 0.05 --mde 0.02
python scripts/main.py sample-size --baseline 0.10 --mde 0.01 --power 0.90
Estimate Duration
python scripts/main.py duration --traffic 1000 --baseline 0.05 --mde 0.02
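The arithmetic behind a duration estimate can be sketched as follows. This is a minimal illustration, not the script's actual implementation: the function name `estimate_duration_days` is hypothetical, and it assumes `--traffic` means total daily visitors, that all of them enter the test, and that traffic is split evenly across variants. The input of 3,842 per variant is taken from the sample output in Example 2.

```python
import math


def estimate_duration_days(required_per_variant, daily_traffic, variants=2):
    """Days needed to collect the required sample.

    Assumes every daily visitor enters the test and traffic is split
    evenly across variants (hypothetical helper, not part of the script).
    """
    total_required = required_per_variant * variants
    # Round up: a partial day still has to be run in full.
    return math.ceil(total_required / daily_traffic)


days = estimate_duration_days(3842, 1000)
print(f"Estimated duration: ~{days} days")
```

In practice, running for whole weeks is often preferred so that weekday and weekend behavior are both covered; that rounding is not included in this sketch.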
Examples
Example 1: Analyze Test Results
Control: 1000 visitors, 50 conversions (5%)
Variant: 1000 visitors, 65 conversions (6.5%)
python scripts/main.py significance --control 1000,50 --variant 1000,65
Output:
A/B Test Results
─────────────────────────
Control: 5.00% (50/1000)
Variant: 6.50% (65/1000)
Lift: +30.0%
Statistical Analysis
─────────────────────────
p-value: 0.089
Confidence: 91.1%
Result: NOT SIGNIFICANT (need 95%)
Recommendation: Continue test for more data
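For readers who want to see how such a check can be computed, here is a minimal sketch of a pooled, two-sided two-proportion z-test using `scipy.stats.norm`. The choice of test is an assumption: this document does not show which method `scripts/main.py` uses, and the function name `two_proportion_z_test` is hypothetical. Under these assumptions the Example 1 data gives a two-sided p-value of roughly 0.15, so this sketch should not be expected to reproduce the sample output above exactly.

```python
import math

from scipy.stats import norm


def two_proportion_z_test(control_n, control_conv, variant_n, variant_conv):
    """Pooled two-sided z-test for a difference in conversion rates.

    Assumption: the real script may use a different test (one-sided,
    unpooled, chi-square, ...); this is only an illustration.
    """
    p_c = control_conv / control_n
    p_v = variant_conv / variant_n
    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (control_conv + variant_conv) / (control_n + variant_n)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / control_n + 1 / variant_n))
    z = (p_v - p_c) / se
    p_value = 2 * norm.sf(abs(z))  # two-sided tail probability
    lift = (p_v - p_c) / p_c       # relative improvement over control
    return {"control_rate": p_c, "variant_rate": p_v,
            "lift": lift, "z": z, "p_value": p_value}


result = two_proportion_z_test(1000, 50, 1000, 65)
print(f"Lift: {result['lift']:+.1%}")
print(f"z = {result['z']:.3f}, p-value = {result['p_value']:.3f}")
```

The z-test relies on a normal approximation, which is reasonable when each group has at least a few dozen conversions and non-conversions; for very small counts an exact test is usually a safer choice.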
Example 2: Plan Sample Size
Baseline 5% conversion; the goal is to detect a 20% relative lift (1 percentage point absolute)
python scripts/main.py sample-size --baseline 0.05 --mde 0.01
Output:
Sample Size Calculator
──────────────────────────────
Baseline conversion: 5.0%
Minimum detectable effect: 1.0% (20% relative)
Target conversion: 6.0%
Required per variant: 3,842 visitors
Total required: 7,684 visitors
At 1000 daily visitors: ~8 days
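One common way to estimate the required sample is the normal-approximation formula for comparing two proportions, sketched below. Everything here is an assumption rather than the script's confirmed method: the function name `sample_size_per_variant` is hypothetical, and the defaults (two-sided alpha of 0.05, power of 0.80) are conventional choices. Different formulas and defaults give noticeably different counts; with these particular assumptions the result is larger than the per-variant figure in the sample output above, so treat both as approximations.

```python
import math

from scipy.stats import norm


def sample_size_per_variant(baseline, mde, alpha=0.05, power=0.80):
    """Approximate visitors per variant for a two-sided two-proportion test.

    `mde` is an absolute difference (0.01 = one percentage point).
    Hypothetical helper; the real script may use another formula.
    """
    p1 = baseline
    p2 = baseline + mde
    z_alpha = norm.ppf(1 - alpha / 2)  # critical value for two-sided alpha
    z_beta = norm.ppf(power)           # quantile matching the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde ** 2)


n = sample_size_per_variant(0.05, 0.01)
print(f"Required per variant: {n:,}")
print(f"Total required: {2 * n:,}")
```

Because the required sample grows with the inverse square of the MDE, halving the effect you want to detect roughly quadruples the traffic needed.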
Key Concepts
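| Term | Definition |
|---|---|
| p-value | Probability of seeing a difference at least this large if there were no real difference between control and variant |
| Confidence | 1 - p-value, as reported by this tool (usually want 95%+) |
| Power | Probability of detecting a real effect of at least the MDE (usually 80%) |
| MDE | Minimum Detectable Effect: the smallest lift worth detecting, passed to `--mde` as an absolute difference (0.01 = 1 percentage point) |
| Lift | Relative improvement: (variant - control) / control |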
When Results Are Significant
| p-value | Confidence | Verdict |
|---|---|---|
| < 0.01 | 99%+ | Highly Significant ✓ |
| < 0.05 | 95%+ | Significant ✓ |
| < 0.10 | 90%+ | Marginally Significant |
| ≥ 0.10 | < 90% | Not Significant ✗ |
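The thresholds in this table translate directly into a small lookup. The sketch below is illustrative only; the function name `verdict` is hypothetical and is not taken from `scripts/main.py`.

```python
def verdict(p_value):
    """Map a p-value to the verdict labels used in the table above."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # Check the strictest threshold first so each p-value gets one label.
    if p_value < 0.01:
        return "Highly Significant"
    if p_value < 0.05:
        return "Significant"
    if p_value < 0.10:
        return "Marginally Significant"
    return "Not Significant"


for p in (0.005, 0.03, 0.089, 0.25):
    print(f"p = {p}: {verdict(p)}")
```

Which row counts as good enough to act on is a decision threshold, and it is best fixed before the test starts rather than after the results are in.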
Skill Boundaries
What This Skill Does Well
- Structuring data analysis
- Identifying patterns and trends
- Creating visualization frameworks
- Calculating statistical measures
What This Skill Cannot Do
- Access your actual data
- Replace statistical expertise
- Make business decisions
- Guarantee prediction accuracy
Related Skills
- cohort-analysis - Analyze user cohorts
- funnel-analyzer - Analyze conversion funnels
Skill Metadata
- Mode: centaur
category: analytics
subcategory: statistics
dependencies: [scipy, numpy]
difficulty: intermediate
time_saved: 3+ hours/week