statistical-analysis

Probability, distributions, hypothesis testing, and statistical inference. Use for A/B testing, experimental design, or statistical validation.

Safety Notice

This listing is imported from the skills.sh public index metadata. Review the upstream SKILL.md and repository scripts before running anything.

Copy the command below and send it to your AI assistant to install this skill:

Install skill "statistical-analysis" with this command: npx skills add pluginagentmarketplace/custom-plugin-ai-data-scientist/pluginagentmarketplace-custom-plugin-ai-data-scientist-statistical-analysis

Statistical Analysis

Apply statistical methods to understand data and validate findings.

Quick Start

from scipy import stats
import numpy as np

# Descriptive statistics
data = np.array([1, 2, 3, 4, 5])
print(f"Mean: {np.mean(data)}")
print(f"Std: {np.std(data)}")

# Hypothesis testing
group1 = [23, 25, 27, 29, 31]
group2 = [20, 22, 24, 26, 28]
t_stat, p_value = stats.ttest_ind(group1, group2)
print(f"P-value: {p_value}")

Core Tests

T-Test (Compare Means)

# One-sample: Compare to population mean
stats.ttest_1samp(data, 100)

# Two-sample: Compare two groups
stats.ttest_ind(group1, group2)

# Paired: Before/after comparison
stats.ttest_rel(before, after)

Chi-Square (Categorical Data)

from scipy.stats import chi2_contingency

observed = np.array([[10, 20], [15, 25]])
chi2, p_value, dof, expected = chi2_contingency(observed)

ANOVA (Multiple Groups)

# Compare means across three or more groups (group3: a third sample, defined like the others)
f_stat, p_value = stats.f_oneway(group1, group2, group3)

Confidence Intervals

from scipy import stats

confidence_level = 0.95
mean = np.mean(data)
se = stats.sem(data)
ci = stats.t.interval(confidence_level, df=len(data) - 1, loc=mean, scale=se)

print(f"95% CI: [{ci[0]:.2f}, {ci[1]:.2f}]")

Correlation

# Pearson (linear)
r, p_value = stats.pearsonr(x, y)

# Spearman (rank-based)
rho, p_value = stats.spearmanr(x, y)

Distributions

# Normal
x = np.linspace(-3, 3, 100)
pdf = stats.norm.pdf(x, loc=0, scale=1)

# Sampling
samples = np.random.normal(0, 1, 1000)

# Test normality
stat, p_value = stats.shapiro(data)

A/B Testing Framework

def ab_test(control, treatment, alpha=0.05):
    """
    Run A/B test with statistical significance

    Returns: significant (bool), p_value (float)
    """
    t_stat, p_value = stats.ttest_ind(control, treatment)

    significant = p_value < alpha
    improvement = (np.mean(treatment) - np.mean(control)) / np.mean(control) * 100

    return {
        'significant': significant,
        'p_value': p_value,
        'improvement': f"{improvement:.2f}%"
    }
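A quick smoke run of the same comparison on synthetic data. The means, spread, sample size, and seed below are illustrative assumptions; the computation mirrors what ab_test does internally:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
control = rng.normal(10.0, 2.0, 500)    # baseline metric
treatment = rng.normal(11.0, 2.0, 500)  # simulated ~10% lift

t_stat, p_value = stats.ttest_ind(control, treatment)
lift = (np.mean(treatment) - np.mean(control)) / np.mean(control) * 100
print(f"p={p_value:.4g}, lift={lift:.2f}%, significant={p_value < 0.05}")
```

With an effect this large (Cohen's d = 0.5) and 500 observations per group, the test is essentially guaranteed to reach significance; real experiments usually sit much closer to the decision boundary.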

Interpretation

P-value < 0.05: Reject the null hypothesis at the 5% level (statistically significant).

P-value >= 0.05: Fail to reject the null. This does not prove the null is true; the data simply lack sufficient evidence against it.
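As a concrete check of this rule, the two groups from Quick Start give t = 1.5 on 8 degrees of freedom:

```python
from scipy import stats

group1 = [23, 25, 27, 29, 31]
group2 = [20, 22, 24, 26, 28]

t_stat, p_value = stats.ttest_ind(group1, group2)
alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
# with these samples p is around 0.17, so we fail to reject H0
print(f"t={t_stat:.2f}, p={p_value:.3f} -> {decision}")
```

Note that the 3-point difference in means is not significant here only because n = 5 per group; the same difference with larger samples could be.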

Common Pitfalls

  • Multiple testing without correction
  • Small sample sizes
  • Ignoring assumptions (normality, independence)
  • Confusing correlation with causation
  • p-hacking (searching for significance)

Troubleshooting

Common Issues

Problem: Non-normal data for t-test

# Check normality of each group first
stat, p = stats.shapiro(group1)
if p < 0.05:
    # Use a non-parametric alternative
    stat, p = stats.mannwhitneyu(group1, group2)  # instead of ttest_ind

Problem: Multiple comparisons inflating false positives

from statsmodels.stats.multitest import multipletests

# Apply Bonferroni correction
p_values = [0.01, 0.03, 0.04, 0.02, 0.06]
rejected, p_adjusted, _, _ = multipletests(p_values, method='bonferroni')

Problem: Underpowered study (sample too small)

from statsmodels.stats.power import TTestIndPower

import math

# Calculate required sample size per group
power_analysis = TTestIndPower()
sample_size = power_analysis.solve_power(
    effect_size=0.5,  # Medium effect (Cohen's d)
    power=0.8,        # 80% power
    alpha=0.05        # 5% significance
)
print(f"Required n per group: {math.ceil(sample_size)}")

Problem: Heterogeneous variances

# Check with Levene's test
stat, p = stats.levene(group1, group2)
if p < 0.05:
    # Use Welch's t-test by setting equal_var=False (scipy's default assumes equal variances)
    t, p = stats.ttest_ind(group1, group2, equal_var=False)

Problem: Outliers affecting results

from scipy.stats import zscore

# Detect outliers (|z| > 3)
z_scores = np.abs(zscore(data))
clean_data = data[z_scores < 3]

# Or use robust statistics
median = np.median(data)
mad = np.median(np.abs(data - median))  # Median Absolute Deviation

Debug Checklist

  • Check sample size adequacy (power analysis)
  • Test normality assumption (Shapiro-Wilk)
  • Test homogeneity of variance (Levene's)
  • Check for outliers (z-scores, IQR)
  • Apply multiple testing correction if needed
  • Report effect sizes, not just p-values
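For the last checklist item, a pooled-SD Cohen's d can be computed by hand; scipy has no built-in for it, so the helper below is a sketch:

```python
import numpy as np

def cohens_d(a, b):
    """Standardized mean difference using the pooled sample SD."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

group1 = [23, 25, 27, 29, 31]
group2 = [20, 22, 24, 26, 28]
d = cohens_d(group1, group2)
print(f"Cohen's d: {d:.2f}")  # -> Cohen's d: 0.95
```

By Cohen's rough conventions (0.2 small, 0.5 medium, 0.8 large), this is a large effect even though the t-test on these tiny samples is not significant, which is exactly why effect sizes should be reported alongside p-values.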

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

machine learning — no summary provided by upstream source.

deep-learning — no summary provided by upstream source.

data-engineering — no summary provided by upstream source.