Skill: Benchmarking & Performance

When to use this skill

After adding or modifying a strategy
To validate that a strategy is profitable
To compare different configurations
Before going from paper trading to live

Available scripts

Script | Usage
scripts/quick_benchmark.sh SYMBOL [DAYS] | Quick benchmark
scripts/validate_strategy.sh STRATEGY | Multi-period validation

Key metrics to monitor

Profitability metrics

Metric | Description | Acceptable threshold
Total Return | Total return over the period | > 0%
Win Rate | % of winning trades | > 50% (trend) or > 40% (mean reversion)
Profit Factor | Gross gains / gross losses | > 1.5
Average Trade | Average P&L per trade | > 0
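As a rough illustration of how the trade-level metrics above relate to raw data, here is a minimal sketch that derives win rate, profit factor, and average trade from a list of per-trade P&L values. The `ProfitabilityMetrics` struct and the `profitability` function are hypothetical names chosen for this example; they are not taken from the project's metrics module, which may define these quantities differently.

```rust
/// Hypothetical summary of the trade-level profitability metrics.
#[derive(Debug)]
pub struct ProfitabilityMetrics {
    /// Fraction of trades with P&L > 0, in [0, 1].
    pub win_rate: f64,
    /// Gross gains divided by gross losses.
    pub profit_factor: f64,
    /// Mean P&L per trade.
    pub average_trade: f64,
}

/// Computes the metrics from per-trade P&L values.
/// Returns None for an empty slice, where the metrics are undefined.
pub fn profitability(trade_pnls: &[f64]) -> Option<ProfitabilityMetrics> {
    if trade_pnls.is_empty() {
        return None;
    }
    let n = trade_pnls.len() as f64;
    let wins = trade_pnls.iter().filter(|&&p| p > 0.0).count() as f64;
    let gross_gain: f64 = trade_pnls.iter().filter(|&&p| p > 0.0).sum();
    let gross_loss: f64 = trade_pnls.iter().filter(|&&p| p < 0.0).map(|&p| -p).sum();
    // Assumption: with no losing trades the ratio is reported as infinity.
    let profit_factor = if gross_loss > 0.0 {
        gross_gain / gross_loss
    } else {
        f64::INFINITY
    };
    Some(ProfitabilityMetrics {
        win_rate: wins / n,
        profit_factor,
        average_trade: trade_pnls.iter().sum::<f64>() / n,
    })
}
```

For example, the trades `[100.0, -50.0, 200.0, -50.0]` give a win rate of 0.5, a profit factor of 3.0 (300 gained against 100 lost), and an average trade of 50.0.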

Risk metrics

Metric | Description | Acceptable threshold
Sharpe Ratio | Risk-adjusted return | > 1.0 (good), > 2.0 (excellent)
Sortino Ratio | Like Sharpe, but penalizes only downside volatility | > 1.5
Max Drawdown | Maximum loss from peak | < 20%
Time in Market | % of time with an open position | Depends on strategy
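To make the risk metrics concrete, the sketch below computes them from a series of per-period returns and an equity curve. It rests on several assumptions that may not match src/domain/performance/metrics.rs: a zero risk-free rate, a zero target return for Sortino, sample standard deviation for Sharpe, and annualization by the square root of `periods_per_year`. All function names are hypothetical.

```rust
/// Mean of a slice; callers guarantee it is non-empty.
fn mean(xs: &[f64]) -> f64 {
    xs.iter().sum::<f64>() / xs.len() as f64
}

/// Annualized Sharpe ratio, assuming a zero risk-free rate.
/// Returns None with fewer than two returns or zero volatility.
pub fn sharpe_ratio(returns: &[f64], periods_per_year: f64) -> Option<f64> {
    if returns.len() < 2 {
        return None;
    }
    let m = mean(returns);
    let variance =
        returns.iter().map(|&r| (r - m).powi(2)).sum::<f64>() / (returns.len() - 1) as f64;
    let std_dev = variance.sqrt();
    if std_dev == 0.0 {
        return None;
    }
    Some(m / std_dev * periods_per_year.sqrt())
}

/// Annualized Sortino ratio, assuming a zero target return.
/// Only negative returns contribute to the downside deviation.
/// Returns None when there are no returns or no downside at all.
pub fn sortino_ratio(returns: &[f64], periods_per_year: f64) -> Option<f64> {
    if returns.is_empty() {
        return None;
    }
    let downside_variance =
        returns.iter().map(|&r| r.min(0.0).powi(2)).sum::<f64>() / returns.len() as f64;
    let downside_dev = downside_variance.sqrt();
    if downside_dev == 0.0 {
        return None;
    }
    Some(mean(returns) / downside_dev * periods_per_year.sqrt())
}

/// Largest peak-to-trough decline of an equity curve, as a fraction of the peak.
/// Returns 0.0 for an empty or never-declining curve.
pub fn max_drawdown(equity: &[f64]) -> f64 {
    let mut peak = f64::MIN;
    let mut worst = 0.0_f64;
    for &value in equity {
        peak = peak.max(value);
        if peak > 0.0 {
            worst = worst.max((peak - value) / peak);
        }
    }
    worst
}
```

With daily returns, a typical call would be `sharpe_ratio(&daily_returns, 252.0)`. An equity curve of `[100.0, 120.0, 90.0, 110.0, 130.0]` has a maximum drawdown of 0.25, the fall from 120 to 90.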

Interpretation

Sharpe Ratio:

< 0.5 → Bad, don't use
0.5-1 → Mediocre, needs improvement
1-2 → Good
2-3 → Very good
> 3 → Excellent (or suspicious, check for overfitting)

Max Drawdown:

< 10% → Conservative
10-20% → Moderate
20-30% → Aggressive
> 30% → Dangerous
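The interpretation bands can be expressed as a small lookup, which is handy when annotating benchmark output. The sketch below is illustrative only: the function names are hypothetical, drawdown is passed as a fraction (0.15 for 15%), and since the bands share their endpoints, the choice to place a boundary value in the higher band (except exactly 3 and exactly 30%, which stay in the band below) is an assumption.

```rust
/// Maps a Sharpe ratio to its interpretation band.
pub fn sharpe_label(sharpe: f64) -> &'static str {
    match sharpe {
        s if s.is_nan() => "Undefined",
        s if s < 0.5 => "Bad",
        s if s < 1.0 => "Mediocre",
        s if s < 2.0 => "Good",
        s if s <= 3.0 => "Very good",
        // Above 3: excellent, or a hint that the backtest is overfitted.
        _ => "Excellent (or suspicious)",
    }
}

/// Maps a maximum drawdown, given as a fraction of peak equity, to its band.
pub fn drawdown_label(max_drawdown: f64) -> &'static str {
    match max_drawdown {
        d if d.is_nan() => "Undefined",
        d if d < 0.10 => "Conservative",
        d if d < 0.20 => "Moderate",
        d if d <= 0.30 => "Aggressive",
        _ => "Dangerous",
    }
}
```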

Benchmark commands

Simple benchmark

Backtest on one symbol

cargo run --bin benchmark -- --symbol AAPL --days 365

Backtest on multiple symbols

cargo run --bin benchmark -- --symbols "AAPL,GOOGL,MSFT" --days 365

Advanced benchmark

Parallel mode (multi-core)

cargo run --bin benchmark -- --parallel --symbols "AAPL,GOOGL,MSFT"

With sequential comparison

cargo run --bin benchmark -- --compare-sequential

Parameter matrix

cargo run --bin benchmark_matrix

Available scripts

Stock benchmark

./scripts/benchmark_stocks.sh

Market regime benchmark

./scripts/run_regime_benchmarks.sh

Automatic benchmark

./scripts/auto_benchmark.sh

Strategy validation workflow

Step 1: Initial backtest

cargo run --bin benchmark -- --strategy <STRATEGY> --days 365

Verify:

Sharpe Ratio > 1.0
Max Drawdown < 20%
Win Rate consistent with strategy type
Profit Factor > 1.5
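The four checks above can be automated as a simple gate. The following is a minimal sketch under stated assumptions: `BacktestMetrics`, `StrategyKind`, and `failed_checks` are hypothetical names, not part of the benchmark CLI, and "consistent with strategy type" is read as the win-rate thresholds from the profitability table (above 50% for trend, above 40% for mean reversion).

```rust
/// Strategy families distinguished by the win-rate threshold.
#[derive(Clone, Copy, Debug)]
pub enum StrategyKind {
    Trend,
    MeanReversion,
}

/// Hypothetical container for the four Step 1 metrics.
#[derive(Debug)]
pub struct BacktestMetrics {
    pub sharpe: f64,
    /// Fraction of peak equity, e.g. 0.15 for 15%.
    pub max_drawdown: f64,
    /// Fraction of winning trades, e.g. 0.55 for 55%.
    pub win_rate: f64,
    pub profit_factor: f64,
}

/// Returns the name of every Step 1 check that the metrics fail.
/// An empty vector means all four checks passed; NaN values fail their check.
pub fn failed_checks(m: &BacktestMetrics, kind: StrategyKind) -> Vec<&'static str> {
    let min_win_rate = match kind {
        StrategyKind::Trend => 0.50,
        StrategyKind::MeanReversion => 0.40,
    };
    let checks = [
        ("Sharpe Ratio > 1.0", m.sharpe > 1.0),
        ("Max Drawdown < 20%", m.max_drawdown < 0.20),
        ("Win Rate consistent with strategy type", m.win_rate > min_win_rate),
        ("Profit Factor > 1.5", m.profit_factor > 1.5),
    ];
    checks.iter().filter(|c| !c.1).map(|c| c.0).collect()
}
```

Returning the list of failed checks, rather than a single boolean, makes it clear which threshold to investigate before moving on to Step 2.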

Step 2: Test on different periods

Bull period

cargo run --bin benchmark -- --start 2021-01-01 --end 2021-12-31

Bear period

cargo run --bin benchmark -- --start 2022-01-01 --end 2022-12-31

Volatile period

cargo run --bin benchmark -- --start 2020-02-01 --end 2020-04-30

The strategy must be profitable in ALL three conditions, or at least keep its losses small (for example, within the 20% Max Drawdown threshold).
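One way to apply this rule mechanically is to collect the total return of each period run and flag any period whose loss exceeds a tolerance. The sketch below assumes the results have already been gathered by hand or parsed from saved output; `PeriodResult`, `failing_periods`, and the `max_loss` tolerance are hypothetical and not produced by the benchmark binary.

```rust
/// Hypothetical per-period outcome, e.g. for "bull", "bear", "volatile".
pub struct PeriodResult<'a> {
    pub label: &'a str,
    /// Total return over the period as a fraction, e.g. -0.05 for -5%.
    pub total_return: f64,
}

/// Returns the labels of all periods that lost more than `max_loss`,
/// where `max_loss` is a positive fraction such as 0.10 for a 10% loss.
pub fn failing_periods<'a>(results: &[PeriodResult<'a>], max_loss: f64) -> Vec<&'a str> {
    results
        .iter()
        .filter(|r| r.total_return < -max_loss)
        .map(|r| r.label)
        .collect()
}
```

An empty result means no period breached the tolerance. Choosing `max_loss` is a judgment call that depends on the strategy's risk budget.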

Step 3: Multi-symbol test

cargo run --bin benchmark -- --symbols "AAPL,MSFT,GOOGL,AMZN,META"

Verify result consistency across different assets.
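Consistency can be judged roughly by looking at the spread of a metric across symbols. As a sketch, assuming per-symbol Sharpe ratios have been collected into a slice of `(symbol, sharpe)` pairs, the hypothetical `sharpe_spread` function below reports the minimum, the maximum, and how many symbols fall below a chosen threshold.

```rust
/// Hypothetical summary of how a metric varies across symbols.
#[derive(Debug)]
pub struct Spread {
    pub min: f64,
    pub max: f64,
    /// Number of symbols whose value is below the threshold.
    pub below_threshold: usize,
}

/// Summarizes per-symbol Sharpe ratios. Returns None for an empty slice.
pub fn sharpe_spread(per_symbol: &[(&str, f64)], threshold: f64) -> Option<Spread> {
    if per_symbol.is_empty() {
        return None;
    }
    let min = per_symbol.iter().map(|&(_, s)| s).fold(f64::INFINITY, f64::min);
    let max = per_symbol.iter().map(|&(_, s)| s).fold(f64::NEG_INFINITY, f64::max);
    let below_threshold = per_symbol.iter().filter(|&&(_, s)| s < threshold).count();
    Some(Spread { min, max, below_threshold })
}
```

A wide gap between `min` and `max`, or several symbols below the threshold, suggests the strategy's edge depends on a few assets rather than generalizing.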

Step 4: Stress test

Test on crash periods:

COVID crash: February-March 2020
2022 Bear market: January-October 2022
Flash crashes: Verify resilience

Pitfalls to avoid

Overfitting

Symptoms:

Sharpe Ratio > 3 on backtest
Performance degrades in live/forward test
Too many optimized parameters

Solutions:

Use train/test split
Test on out-of-sample data
Prefer simple strategies
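For time series, a train/test split should be chronological: parameters are tuned on the earlier part, and the later part is held back as out-of-sample data. A minimal sketch, with the hypothetical helper `chronological_split`:

```rust
/// Splits a time-ordered series into (train, test) without shuffling,
/// so the test set contains only observations later than the training set.
/// `train_fraction` is clamped to [0, 1].
pub fn chronological_split<T>(series: &[T], train_fraction: f64) -> (&[T], &[T]) {
    let fraction = train_fraction.clamp(0.0, 1.0);
    let cut = (series.len() as f64 * fraction).floor() as usize;
    series.split_at(cut.min(series.len()))
}
```

Random shuffling, which is common for non-temporal data, would leak future observations into the training set here, so the split keeps the original order.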

Look-ahead bias

Symptom: Using future data in decisions

Solution: Verify indicators only use past data
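As an illustration of an indicator that uses only past data, the sketch below computes a trailing simple moving average in which the value at index `i` depends only on closes at indices up to and including `i`. The function name `trailing_sma` is hypothetical and is not taken from the project's indicator code.

```rust
/// Trailing simple moving average over `window` bars.
/// The value at index i averages closes[i + 1 - window ..= i], so it never
/// reads data from after bar i. Early bars without a full window yield None.
pub fn trailing_sma(closes: &[f64], window: usize) -> Vec<Option<f64>> {
    closes
        .iter()
        .enumerate()
        .map(|(i, _)| {
            if window == 0 || i + 1 < window {
                None
            } else {
                let past = &closes[i + 1 - window..=i];
                Some(past.iter().sum::<f64>() / window as f64)
            }
        })
        .collect()
}
```

A quick way to test any indicator for look-ahead bias is to change the last value in the input and confirm that all earlier outputs stay the same. Note that a signal computed from bar `i`'s close can only be acted on from the next bar onward.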

Survivorship bias

Symptom: Only testing on assets that still exist

Solution: Include delisted assets in backtests

Key files

File | Description
src/bin/benchmark.rs | Main benchmark CLI
src/bin/benchmark_matrix.rs | Parameter matrix tests
src/application/optimization/parallel_benchmark.rs | Parallel execution
src/application/optimization/benchmark_metrics.rs | Benchmark metrics
src/domain/performance/metrics.rs | Sharpe, Sortino, Drawdown calculation
benchmark_results/ | Saved results

Checklist before production

Positive backtests on 2+ years of data
Sharpe Ratio > 1.0 on different periods
Acceptable Max Drawdown (< 20% recommended)
Tested on bull, bear AND sideways markets
No sign of overfitting
Paper trading validated for 1+ month
