# Adaptive Walk-Forward Epoch Selection (AWFES)

Machine-readable reference for adaptive epoch selection within Walk-Forward Optimization (WFO). Optimizes training epochs per fold using Walk-Forward Efficiency (WFE) as the objective.
## When to Use This Skill

Use this skill when:

- Selecting optimal training epochs for ML models in WFO
- Avoiding overfitting via Walk-Forward Efficiency metrics
- Implementing per-fold adaptive epoch selection
- Computing efficient frontiers for epoch-performance trade-offs
- Carrying epoch priors across WFO folds
## Quick Start

```python
from adaptive_wfo_epoch import AWFESConfig, compute_efficient_frontier

# Generate epoch candidates from search bounds and granularity
config = AWFESConfig.from_search_space(
    min_epoch=100,
    max_epoch=2000,
    granularity=5,  # Number of frontier points
)
# config.epoch_configs → [100, 211, 447, 945, 2000] (log-spaced)

# Per-fold epoch sweep
for fold in wfo_folds:
    epoch_metrics = []
    for epoch in config.epoch_configs:
        is_sharpe, oos_sharpe = train_and_evaluate(fold, epochs=epoch)
        wfe = config.compute_wfe(is_sharpe, oos_sharpe, n_samples=len(fold.train))
        epoch_metrics.append({"epoch": epoch, "wfe": wfe, "is_sharpe": is_sharpe})

    # Select from efficient frontier
    selected_epoch = compute_efficient_frontier(epoch_metrics)

    # Carry forward to next fold as prior
    prior_epoch = selected_epoch
```
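The log-spaced candidate grid shown above can be generated in a few lines. This is a sketch of what `AWFESConfig.from_search_space()` presumably does internally; `log_spaced_epochs` is a hypothetical helper, not part of the library:

```python
import numpy as np

def log_spaced_epochs(min_epoch: int, max_epoch: int, granularity: int) -> list[int]:
    """Log-spaced candidates cover the search range evenly in ratio terms,
    so a wide epoch range is probed with few training runs."""
    points = np.logspace(np.log10(min_epoch), np.log10(max_epoch), granularity)
    return sorted({int(round(p)) for p in points})
```

For `(100, 2000, 5)` this produces values close to the `[100, 211, 447, 945, 2000]` grid above (exact rounding of interior points may differ by one).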
## Methodology Overview

### What This Is

Per-fold adaptive epoch selection where you:

1. Train models across a range of epochs (e.g., 400, 800, 1000, 2000)
2. Compute WFE = OOS_Sharpe / IS_Sharpe for each epoch count
3. Find the "efficient frontier": epochs maximizing WFE vs. training cost
4. Select the optimal epoch from the frontier for OOS evaluation
5. Carry it forward as a prior for the next fold
### What This Is NOT

- **NOT early stopping**: Early stopping monitors validation loss continuously; this evaluates discrete candidates post hoc
- **NOT Bayesian optimization**: No surrogate model; all candidates are evaluated directly
- **NOT nested cross-validation**: Uses temporal WFO, not shuffled splits
## Academic Foundations

| Concept | Citation | Key Insight |
|---|---|---|
| Walk-Forward Efficiency | Pardo (1992, 2008) | WFE = OOS_Return / IS_Return as a robustness metric |
| Deflated Sharpe Ratio | Bailey & López de Prado (2014) | Adjusts for multiple testing |
| Pareto-Optimal HP Selection | Bischl et al. (2023) | Multi-objective hyperparameter optimization |
| Warm-Starting | Nomura & Ono (2021) | Transfers knowledge between optimization runs |

See `references/academic-foundations.md` for the full literature review.
## Core Formula: Walk-Forward Efficiency

```python
def compute_wfe(
    is_sharpe: float,
    oos_sharpe: float,
    n_samples: int | None = None,
) -> float | None:
    """Walk-Forward Efficiency: measures performance transfer.

    WFE = OOS_Sharpe / IS_Sharpe

    Interpretation (guidelines, not hard thresholds):
    - WFE ≥ 0.70: Excellent transfer (low overfitting)
    - WFE 0.50-0.70: Good transfer
    - WFE 0.30-0.50: Moderate transfer (investigate)
    - WFE < 0.30: Severe overfitting (likely reject)

    The IS_Sharpe minimum is derived from the signal-to-noise ratio,
    not a fixed magic number. See compute_is_sharpe_threshold().

    Reference: Pardo (2008), "The Evaluation and Optimization of Trading Strategies"
    """
    # Data-driven threshold: IS_Sharpe must exceed the 2σ noise floor
    min_is_sharpe = compute_is_sharpe_threshold(n_samples) if n_samples else 0.1
    if abs(is_sharpe) < min_is_sharpe:
        return None  # WFE undefined: IS signal indistinguishable from noise
    return oos_sharpe / is_sharpe
```
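The docstring above references `compute_is_sharpe_threshold()`. A minimal sketch of the 2/√n rule described in guardrail G2 (the full implementation lives in `references/configuration-framework.md`):

```python
import math

def compute_is_sharpe_threshold(n_samples: int) -> float:
    """2σ noise floor for the IS Sharpe estimate.

    Under the null (no skill), the standard error of a Sharpe estimate
    scales roughly as 1/sqrt(n), so we require |IS_Sharpe| > 2/sqrt(n)
    before trusting the WFE ratio.
    """
    return 2.0 / math.sqrt(n_samples)
```

Note how the threshold adapts: 400 samples give a floor of 0.1, while 10,000 samples lower it to 0.02.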
## Principled Configuration Framework

All parameters are derived from first principles or from data characteristics. `AWFESConfig` provides unified configuration with log-spaced epoch generation, Bayesian variance derivation from the search space, and market-specific annualization factors.

See `references/configuration-framework.md` for the full `AWFESConfig` class and the `compute_is_sharpe_threshold()` implementation.
## Guardrails (Principled Guidelines)

- **G1: WFE Thresholds**: 0.30 (reject), 0.50 (warning), 0.70 (target), based on practitioner consensus
- **G2: IS_Sharpe Minimum**: Data-driven threshold; 2/sqrt(n) adapts to sample size
- **G3: Stability Penalty**: Adaptive threshold derived from WFE variance prevents epoch churn
- **G4: DSR Adjustment**: Deflated Sharpe corrects for epoch-selection multiplicity via the Gumbel distribution

See `references/guardrails.md` for full implementations of all guardrails.
## WFE Aggregation Methods

Under the null hypothesis, WFE is a ratio of two noisy estimates and follows a Cauchy distribution, which has no defined mean, so a naive average across folds is unstable. Always prefer median or pooled methods:

- **Pooled WFE**: Precision-weighted by sample size (best for variable fold sizes)
- **Median WFE**: Robust to outliers (best for suspected regime changes)
- **Weighted Mean**: Inverse-variance weighting (best for homogeneous folds)

See `references/wfe-aggregation.md` for implementations and a selection guide.
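The median and pooled methods can be sketched in a few lines. This is an illustrative implementation under the assumption that invalid folds carry `None` (as `compute_wfe` returns); the reference implementations live in `references/wfe-aggregation.md`:

```python
import statistics

def aggregate_wfe(fold_wfes: list, fold_sizes: list, method: str = "median") -> float:
    """Aggregate per-fold WFE values, skipping folds where WFE was invalid (None)."""
    valid = [(w, n) for w, n in zip(fold_wfes, fold_sizes) if w is not None]
    if method == "median":
        # Robust to a single blown-up ratio from a near-zero IS Sharpe
        return statistics.median(w for w, _ in valid)
    if method == "pooled":
        # Sample-size weighting: larger folds contribute more precise estimates
        total = sum(n for _, n in valid)
        return sum(w * n for w, n in valid) / total
    raise ValueError(f"unknown method: {method}")
```

With one outlier fold (`WFE = 5.0` on a tiny fold), the median stays near the bulk of the folds while an unweighted mean would not.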
## Efficient Frontier Algorithm

Pareto-optimal epoch selection: an epoch is on the frontier if no other epoch dominates it (better WFE AND lower training time). The `AdaptiveEpochSelector` class maintains state across folds with adaptive stability penalties.

See `references/efficient-frontier.md` for the full algorithm and carry-forward mechanism.
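The dominance check reads directly off the definition above. A minimal sketch, assuming epoch count serves as the training-cost proxy (the full algorithm, with stability penalties, is in the references):

```python
def pareto_frontier(epoch_metrics: list[dict]) -> list[dict]:
    """Keep candidates not dominated by any other candidate.

    B dominates A when B has WFE >= A's and epoch count <= A's
    (epochs as a training-time proxy), with at least one strict.
    """
    frontier = []
    for a in epoch_metrics:
        dominated = any(
            b["wfe"] >= a["wfe"] and b["epoch"] <= a["epoch"]
            and (b["wfe"] > a["wfe"] or b["epoch"] < a["epoch"])
            for b in epoch_metrics
        )
        if not dominated:
            frontier.append(a)
    return sorted(frontier, key=lambda m: m["epoch"])
```

For example, an epoch of 400 with WFE 0.5 is dominated by an epoch of 200 with WFE 0.6 (cheaper AND better), so it drops off the frontier.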
## Anti-Patterns

| Anti-Pattern | Symptom | Fix | Severity |
|---|---|---|---|
| Expanding window (range bars) | Train size grows per fold | Use a fixed sliding window | CRITICAL |
| Peak picking | Best epoch always at sweep boundary | Expand range, check for a plateau | HIGH |
| Insufficient folds | effective_n < 30 | Increase folds or data span | HIGH |
| Ignoring temporal autocorrelation | Folds correlated | Use purged CV, gap between folds | HIGH |
| Overfitting to IS | IS Sharpe >> OOS Sharpe | Reduce epochs, add regularization | HIGH |
| sqrt(252) for crypto | Inflated Sharpe | Use sqrt(365), or sqrt(7) weekly | MEDIUM |
| Single epoch selection | No uncertainty quantification | Report a confidence interval | MEDIUM |
| Meta-overfitting | Epoch selection itself overfits | Limit to 3-4 candidates max | HIGH |

**CRITICAL**: Never use an expanding window for range-bar ML training. See `references/anti-patterns.md` for the full analysis (Section 7).
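The sqrt(252) anti-pattern in the table comes down to matching the annualization factor to the venue's actual trading calendar. A minimal illustration (the helper name is ours, not from the library):

```python
import math

def annualize_sharpe(per_bar_sharpe: float, bars_per_year: float) -> float:
    """Scale a per-period Sharpe to annual terms.

    The factor must match the number of periods per year for the market:
    crypto trades ~365 days/year, equities ~252 trading days/year.
    Using the wrong factor silently biases every downstream WFE ratio.
    """
    return per_bar_sharpe * math.sqrt(bars_per_year)
```

The same daily Sharpe annualized with 365 versus 252 periods differs by a factor of sqrt(365/252) ≈ 1.20.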
## Decision Tree

See `references/epoch-selection-decision-tree.md` for the full practitioner decision tree.

```
Start
│
├─ IS_Sharpe > compute_is_sharpe_threshold(n)? ──NO──> Mark WFE invalid, use fallback
│      (threshold = 2/√n, adapts to sample size)
│  YES
│
├─ Compute WFE for each epoch
│
├─ Any WFE > 0.30? ──NO──> REJECT all epochs (severe overfit)
│      (guideline, not hard threshold)
│  YES
│
├─ Compute efficient frontier
│
├─ Apply AdaptiveStabilityPenalty
│      (threshold derived from WFE variance)
└─> Return selected epoch
```
## Integration with rangebar-eval-metrics

This skill extends rangebar-eval-metrics:

| Metric | Used For | Reference |
|---|---|---|
| `sharpe_tw` | WFE numerator (OOS) and denominator (IS) | range-bar-metrics.md |
| `n_bars` | Sample size for aggregation weights | metrics-schema.md |
| `psr`, `dsr` | Final acceptance criteria | sharpe-formulas.md |
| `prediction_autocorr` | Validate the model isn't collapsed | ml-prediction-quality.md |
| `is_collapsed` | Model health check | ml-prediction-quality.md |
| Extended risk metrics | Deep risk analysis (optional) | risk-metrics.md |
## Recommended Workflow

1. Compute base metrics using rangebar-eval-metrics:compute_metrics.py
2. Feed them to AWFES for epoch selection, with `sharpe_tw` as the primary signal
3. Validate with `psr` > 0.85 and `dsr` > 0.50 before deployment
4. Monitor `is_collapsed` and `prediction_autocorr` for model health
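Steps 3 and 4 of the workflow reduce to a single deployment gate. A sketch, assuming the metric dict uses the rangebar-eval-metrics field names listed above (`passes_deployment_gate` is a hypothetical helper):

```python
def passes_deployment_gate(m: dict) -> bool:
    """Statistical validity (PSR > 0.85, DSR > 0.50) plus model health."""
    return (
        m["psr"] > 0.85
        and m["dsr"] > 0.50
        and not m.get("is_collapsed", False)  # a collapsed model fails regardless
    )
```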
## OOS Application Phase

AWFES uses nested WFO with three data splits per fold (Train 60% / Val 20% / Test 20%) and 6% embargo gaps at each boundary. The per-fold workflow: epoch sweep on train, WFE computation on validation, Bayesian update, final model training on train+val, evaluation on test.

See `references/oos-workflow.md` for the complete workflow with diagrams, the `BayesianEpochSelector` class, and the `apply_awfes_to_test()` implementation. Also see `references/oos-application.md` for the extended reference.
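One way to carve the per-fold indices is sketched below. It assumes the 60/20/20 fractions apply to the usable (non-embargo) portion of the fold; that is our interpretation, not necessarily the canonical layout in `references/oos-workflow.md`:

```python
def nested_split_bounds(n: int, embargo: float = 0.06):
    """Index ranges for Train/Val/Test with embargo gaps at both boundaries.

    The two embargo gaps consume 2 * 6% of the fold, so the 60/20/20
    fractions are applied to the remaining 88% of the samples.
    """
    gap = int(n * embargo)
    usable = n - 2 * gap
    train_end = int(usable * 0.60)
    val_start = train_end + gap            # gap 1: train → val boundary
    val_end = val_start + int(usable * 0.20)
    test_start = val_end + gap             # gap 2: val → test boundary
    return (0, train_end), (val_start, val_end), (test_start, n)
```

The gaps purge samples whose labels could leak information across the train/val and val/test boundaries.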
## Epoch Smoothing Methods

Bayesian updating (recommended) provides principled, uncertainty-aware smoothing. Alternatives include EMA and SMA. Initialization via `AWFESConfig.from_search_space()` derives variances from the epoch range automatically.

See `references/epoch-smoothing-methods.md` for all methods, formulas, and initialization strategies. See `references/epoch-smoothing.md` for the extended mathematical analysis.
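The core of the recommended Bayesian smoothing is a conjugate Gaussian update. A sketch; the shipped `BayesianEpochSelector` may parameterize differently (e.g., in log-epoch space):

```python
def bayesian_update(prior_mean: float, prior_var: float,
                    observed_epoch: float, obs_var: float) -> tuple[float, float]:
    """Precision-weighted average of the prior and the new fold's optimal epoch.

    A tight prior (small prior_var) resists epoch churn; a noisy observation
    (large obs_var) moves the posterior only slightly. The posterior variance
    always shrinks, reflecting accumulated evidence across folds.
    """
    post_precision = 1.0 / prior_var + 1.0 / obs_var
    post_mean = (prior_mean / prior_var + observed_epoch / obs_var) / post_precision
    return post_mean, 1.0 / post_precision
```

With equal variances, the posterior mean is simply the midpoint: a prior of 1000 and an observation of 800 give 900.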
## OOS Metrics Specification

Three-tier metric hierarchy for test evaluation:

- **Tier 1 (Primary)**: `sharpe_tw`, `hit_rate`, `cumulative_pnl`, `positive_sharpe_folds`, `wfe_test`
- **Tier 2 (Risk)**: `max_drawdown`, `calmar_ratio`, `profit_factor`, `cvar_10pct`
- **Tier 3 (Statistical)**: `psr`, `dsr`, `binomial_pvalue`, `hac_ttest_pvalue`

See `references/oos-metrics-implementation.md` for full metric tables, `compute_oos_metrics()`, and fold-aggregation code. See `references/oos-metrics.md` for threshold justifications.
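Two of the simpler tiered metrics, as minimal sketches using their standard definitions (the authoritative versions are in `references/oos-metrics-implementation.md`):

```python
def hit_rate(returns: list[float]) -> float:
    """Tier 1: fraction of periods with a positive return."""
    return sum(r > 0 for r in returns) / len(returns)

def profit_factor(returns: list[float]) -> float:
    """Tier 2: gross gains divided by gross losses."""
    gains = sum(r for r in returns if r > 0)
    losses = -sum(r for r in returns if r < 0)
    return gains / losses if losses else float("inf")
```

For returns `[1.0, -1.0, 2.0, -0.5]`: hit rate is 0.5, profit factor is 3.0 / 1.5 = 2.0.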
## Look-Ahead Bias Prevention

**CRITICAL (v3 fix)**: TEST must use `prior_bayesian_epoch` (from prior folds only), NOT `val_optimal_epoch`. The Bayesian update happens AFTER test evaluation, ensuring information flows only from past to present.

See `references/look-ahead-bias-v3.md` for the v3 fix details, embargo requirements, validation checklist, and anti-patterns. See `references/look-ahead-bias.md` for detailed examples.
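The v3 ordering constraint can be shown as a toy fold loop. Fold contents and the `update` function are placeholders; only the ordering (test first, update after) is the point:

```python
def run_folds(folds: list[dict], prior_epoch: float, update) -> list[tuple]:
    """Evaluate each fold's TEST split with the prior carried from PAST folds only."""
    test_epochs = []
    for fold in folds:
        # 1. TEST sees only information from earlier folds
        test_epochs.append((fold["id"], prior_epoch))
        # 2. The Bayesian update with THIS fold's validation result happens AFTER
        prior_epoch = update(prior_epoch, fold["val_optimal_epoch"])
    return test_epochs
```

Fold 0's test uses the initial prior; fold 1's test uses the posterior from fold 0. Swapping the two lines inside the loop is exactly the look-ahead bug the v3 fix removes.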
## References

| Topic | Reference File |
|---|---|
| Academic Literature | academic-foundations.md |
| Mathematical Formulation | mathematical-formulation.md |
| Configuration Framework | configuration-framework.md |
| Guardrails | guardrails.md |
| WFE Aggregation | wfe-aggregation.md |
| Efficient Frontier | efficient-frontier.md |
| Decision Tree | epoch-selection-decision-tree.md |
| Anti-Patterns | anti-patterns.md |
| OOS Workflow | oos-workflow.md |
| OOS Application | oos-application.md |
| Epoch Smoothing Methods | epoch-smoothing-methods.md |
| Epoch Smoothing Analysis | epoch-smoothing.md |
| OOS Metrics Impl | oos-metrics-implementation.md |
| OOS Metrics Thresholds | oos-metrics.md |
| Look-Ahead Bias (v3) | look-ahead-bias-v3.md |
| Look-Ahead Bias Examples | look-ahead-bias.md |
| Feature Sets | feature-sets.md |
| xLSTM Implementation | xlstm-implementation.md |
| Range Bar Metrics | range-bar-metrics.md |
| Troubleshooting | troubleshooting.md |
## Full Citations

- Bailey, D. H., & López de Prado, M. (2014). The deflated Sharpe ratio: Correcting for selection bias, backtest overfitting and non-normality. *The Journal of Portfolio Management*, 40(5), 94-107.
- Bischl, B., et al. (2023). Multi-objective hyperparameter optimization in machine learning. *ACM Transactions on Evolutionary Learning and Optimization*.
- López de Prado, M. (2018). *Advances in Financial Machine Learning*. Wiley. Chapter 7.
- Nomura, M., & Ono, I. (2021). Warm starting CMA-ES for hyperparameter optimization. *AAAI Conference on Artificial Intelligence*.
- Pardo, R. E. (2008). *The Evaluation and Optimization of Trading Strategies* (2nd ed.). John Wiley & Sons.