# Forecasting Expert Knowledge

## Superforecasting Principles
Based on research by Philip Tetlock and the Good Judgment Project:
- Triage: Focus on questions that are hard enough to be interesting but not so hard they're unknowable
- Break problems apart: Decompose big questions into smaller, researchable sub-questions (Fermi estimation)
- Balance inside and outside views: Use both specific evidence AND base rates from reference classes
- Update incrementally: Adjust predictions in small steps as new evidence arrives (Bayesian updating; see the sketch after this list)
- Look for clashing forces: Identify factors pulling in opposite directions
- Distinguish signal from noise: Weight signals by their reliability and relevance
- Calibrate: Your 70% predictions should come true ~70% of the time
- Post-mortem: Analyze why predictions went wrong; don't just celebrate the ones that went right
- Avoid the narrative trap: A compelling story is not the same as a likely outcome
- Collaborate: Aggregate views from diverse perspectives
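Incremental updating can be made concrete with the odds form of Bayes' rule. A minimal Python sketch, assuming you can express new evidence as a likelihood ratio, i.e. how much more likely the evidence is if the prediction is true than if it is false (the function name is illustrative):

```python
def bayes_update(prior: float, likelihood_ratio: float) -> float:
    """Odds-form Bayes update: convert the prior to odds, multiply by the
    likelihood ratio of the new evidence, convert back to a probability."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# A 30% prior plus evidence twice as likely under the hypothesis (LR = 2)
# moves the forecast to ~46%: a small step, not a leap to certainty.
print(round(bayes_update(0.30, 2.0), 2))  # 0.46
```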
## Signal Taxonomy

### Signal Types

| Type | Description | Weight | Example |
| --- | --- | --- | --- |
| Leading indicator | Predicts future movement | High | Job postings surge → company expanding |
| Lagging indicator | Confirms past movement | Medium | Quarterly earnings → business health |
| Base rate | Historical frequency | High | "80% of startups fail within 5 years" |
| Expert opinion | Informed prediction | Medium | Analyst forecast, CEO statement |
| Data point | Factual measurement | High | Revenue figure, user count, benchmark |
| Anomaly | Deviation from pattern | High | Unusual trading volume, sudden hiring freeze |
| Structural change | Systemic shift | Very High | New regulation, technology breakthrough |
| Sentiment shift | Collective mood change | Medium | Media tone change, social media trend |
### Signal Strength Assessment
STRONG signal (high predictive value):
- Multiple independent sources confirm
- Quantitative data (not just opinions)
- Leading indicator with historical track record
- Structural change with clear causal mechanism
MODERATE signal (some predictive value):
- Single authoritative source
- Expert opinion from domain specialist
- Historical pattern that may or may not repeat
- Lagging indicator (confirms direction)
WEAK signal (limited predictive value):
- Social media buzz without substance
- Single anecdote or case study
- Rumor or unconfirmed report
- Opinion from non-specialist
## Confidence Calibration

### Probability Scale

- 95% — Almost certain (would bet 19:1)
- 90% — Very likely (would bet 9:1)
- 80% — Likely (would bet 4:1)
- 70% — Probable (would bet 7:3)
- 60% — Slightly more likely than not
- 50% — Toss-up (genuine uncertainty)
- 40% — Slightly less likely than not
- 30% — Unlikely (but plausible)
- 20% — Very unlikely (but possible)
- 10% — Extremely unlikely
- 5% — Almost impossible (but not zero)
### Calibration Rules

- NEVER use 0% or 100% — nothing is absolutely certain
- If you haven't done research, default to the base rate (outside view)
- Your first estimate should be the reference class base rate
- Adjust from the base rate using specific evidence (inside view)
- Typical adjustment: ±5-15% per strong signal, ±2-5% per moderate signal (see the sketch after this list)
- If your gut says 80% but your analysis says 55%, trust the analysis
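A minimal sketch of these rules as arithmetic, assuming simple additive per-signal adjustments (the function name is illustrative, and the 5-95% clamp is one way to honor the rule against 0% and 100%, not a fixed method):

```python
def adjust_from_base_rate(base_rate: float, adjustments: list[float]) -> float:
    """Start from the outside-view base rate, apply signed per-signal
    adjustments, and keep the result strictly away from 0% and 100%."""
    p = base_rate + sum(adjustments)
    return min(max(p, 0.05), 0.95)

# Base rate 20%, one strong signal for (+0.10), one moderate signal
# against (-0.03) -> final probability 27%.
print(round(adjust_from_base_rate(0.20, [0.10, -0.03]), 2))  # 0.27
```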
### Brier Score

The gold standard for measuring prediction accuracy:

`Brier Score = (predicted_probability - actual_outcome)^2`

where `actual_outcome` = 1 if the prediction came true, 0 if not.

- Perfect score: 0.0 (you're always right with perfect confidence)
- Coin flip: 0.25 (saying 50% on everything)
- Terrible: 1.0 (100% confident, always wrong)

Benchmarks: good forecaster < 0.15, average forecaster 0.20-0.30, bad forecaster > 0.35.
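As a worked example, a minimal Python implementation of the per-prediction score and its mean over a track record (function names are illustrative):

```python
def brier_score(predicted: float, outcome: bool) -> float:
    """Squared error between the predicted probability and the 0/1 outcome."""
    return (predicted - (1.0 if outcome else 0.0)) ** 2

def mean_brier(forecasts: list[tuple[float, bool]]) -> float:
    """Average Brier score over (predicted_probability, outcome) pairs."""
    return sum(brier_score(p, o) for p, o in forecasts) / len(forecasts)

print(round(brier_score(0.65, True), 4))  # 0.1225 (as in the ledger example below)
print(brier_score(0.50, True))            # 0.25 (the coin-flip baseline)
print(brier_score(1.00, False))           # 1.0 (maximally wrong)
```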
## Domain-Specific Source Guide

### Technology Predictions

| Source Type | Examples | Use For |
| --- | --- | --- |
| Product roadmaps | GitHub issues, release notes, blog posts | Feature predictions |
| Adoption data | Stack Overflow surveys, NPM downloads, DB-Engines | Technology trends |
| Funding data | Crunchbase, PitchBook, TechCrunch | Startup success/failure |
| Patent filings | Google Patents, USPTO | Innovation direction |
| Job postings | LinkedIn, Indeed, Levels.fyi | Technology demand |
| Benchmark data | TechEmpower, MLPerf, Geekbench | Performance trends |
### Finance Predictions

| Source Type | Examples | Use For |
| --- | --- | --- |
| Economic data | FRED, BLS, Census | Macro trends |
| Earnings | SEC filings, earnings calls | Company performance |
| Analyst reports | Bloomberg, Reuters, S&P | Market consensus |
| Central bank | Fed minutes, ECB statements | Interest rates, policy |
| Commodity data | EIA, OPEC reports | Energy/commodity prices |
| Sentiment | VIX, put/call ratio, AAII survey | Market mood |
### Geopolitics Predictions

| Source Type | Examples | Use For |
| --- | --- | --- |
| Official sources | Government statements, UN reports | Policy direction |
| Think tanks | RAND, Brookings, Chatham House | Analysis |
| Election data | Polls, voter registration, 538 | Election outcomes |
| Trade data | WTO, customs data, trade balances | Trade policy |
| Military data | SIPRI, defense budgets, deployments | Conflict risk |
| Diplomatic signals | Ambassador recalls, sanctions, treaties | Relations |
### Climate Predictions

| Source Type | Examples | Use For |
| --- | --- | --- |
| Scientific data | IPCC, NASA, NOAA | Climate trends |
| Energy data | IEA, EIA, IRENA | Energy transition |
| Policy data | COP agreements, national plans | Regulation |
| Corporate data | CDP disclosures, sustainability reports | Corporate action |
| Technology data | BloombergNEF, patent filings | Clean tech trends |
| Investment data | Green bond issuance, ESG flows | Capital allocation |
## Reasoning Chain Construction

### Template

```
PREDICTION: [Specific, falsifiable claim]

1. REFERENCE CLASS (Outside View)
   Base rate: [What % of similar events occur?]
   Reference examples: [3-5 historical analogues]

2. SPECIFIC EVIDENCE (Inside View)
   Signals FOR (+):
   a. [Signal] — strength: [strong/moderate/weak] — adjustment: +X%
   b. [Signal] — strength: [strong/moderate/weak] — adjustment: +X%
   Signals AGAINST (-):
   a. [Signal] — strength: [strong/moderate/weak] — adjustment: -X%
   b. [Signal] — strength: [strong/moderate/weak] — adjustment: -X%

3. SYNTHESIS
   Starting probability (base rate): X%
   Net adjustment: +/-Y%
   Final probability: Z%

4. KEY ASSUMPTIONS
   - [Assumption 1]: If wrong, probability shifts to [W%]
   - [Assumption 2]: If wrong, probability shifts to [V%]

5. RESOLUTION
   Date: [When can this be resolved?]
   Criteria: [Exactly how to determine if correct]
   Data source: [Where to check the outcome]
```
## Prediction Tracking & Scoring

### Prediction Ledger Format

```json
{
  "id": "pred_001",
  "created": "2025-01-15",
  "prediction": "OpenAI will release GPT-5 before July 2025",
  "confidence": 0.65,
  "domain": "tech",
  "time_horizon": "2025-07-01",
  "reasoning_chain": "...",
  "key_signals": ["leaked roadmap", "compute scaling", "hiring patterns"],
  "status": "active|resolved|expired",
  "resolution": {
    "date": "2025-06-30",
    "outcome": true,
    "evidence": "Released June 15, 2025",
    "brier_score": 0.1225
  },
  "updates": [
    {"date": "2025-03-01", "new_confidence": 0.75, "reason": "New evidence: leaked demo"}
  ]
}
```
### Accuracy Report Template

```
ACCURACY DASHBOARD

Total predictions: N
Resolved predictions: N (N correct, N incorrect, N partial)
Active predictions: N
Expired (unresolvable): N

Overall accuracy: X%
Brier score: 0.XX

Calibration:
  Predicted 90%+   → Actual: X% (N predictions)
  Predicted 70-89% → Actual: X% (N predictions)
  Predicted 50-69% → Actual: X% (N predictions)
  Predicted 30-49% → Actual: X% (N predictions)
  Predicted <30%   → Actual: X% (N predictions)

Strengths: [domains/types where you perform well]
Weaknesses: [domains/types where you perform poorly]
```
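A minimal sketch of how the calibration rows could be computed, assuming resolved predictions carry a final confidence and a boolean outcome as in the ledger format above (the dict shape and bucket edges are illustrative, not a fixed schema):

```python
# Bucket edges mirror the dashboard rows; since 100% is never used
# (see Calibration Rules), half-open intervals are safe.
BUCKETS = [(0.90, 1.00), (0.70, 0.90), (0.50, 0.70), (0.30, 0.50), (0.00, 0.30)]

def calibration_rows(resolved: list[dict]) -> None:
    """Print one dashboard row per confidence bucket: the predicted range
    versus the actual hit rate among resolved predictions."""
    for lo, hi in BUCKETS:
        bucket = [p for p in resolved if lo <= p["confidence"] < hi]
        if not bucket:
            continue
        hit_rate = sum(1 for p in bucket if p["outcome"]) / len(bucket)
        print(f"Predicted {lo:.0%}-{hi:.0%} -> Actual: {hit_rate:.0%} "
              f"({len(bucket)} predictions)")

calibration_rows([
    {"confidence": 0.65, "outcome": True},
    {"confidence": 0.75, "outcome": True},
    {"confidence": 0.80, "outcome": False},
])
# Predicted 70%-90% -> Actual: 50% (2 predictions)
# Predicted 50%-70% -> Actual: 100% (1 predictions)
```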
## Cognitive Bias Checklist

Before finalizing any prediction, check for these biases:

- Anchoring: Am I fixated on the first number I encountered?
  - Fix: Deliberately consider the base rate before looking at specific evidence
- Availability bias: Am I overweighting recent or memorable events?
  - Fix: Check the actual frequency, not just what comes to mind
- Confirmation bias: Am I only looking for evidence that supports my prediction?
  - Fix: Actively search for contradicting evidence (steel-man the opposite)
- Narrative bias: Am I choosing a prediction because it makes a good story?
  - Fix: Remember that boring predictions are often more accurate
- Overconfidence: Am I too sure?
  - Fix: If you've never been wrong at this confidence level, you're probably overconfident
- Scope insensitivity: Am I treating very different scales the same?
  - Fix: Be specific about magnitudes and timeframes
- Recency bias: Am I extrapolating recent trends too far?
  - Fix: Check longer time horizons and mean-reversion patterns
- Status quo bias: Am I defaulting to "nothing will change"?
  - Fix: Consider structural changes that could break the status quo
## Contrarian Mode

When enabled, for each consensus prediction:

1. Identify what the consensus view is
2. Search for evidence the consensus is wrong
3. Consider: "What would have to be true for the opposite to happen?"
4. If credible contrarian evidence exists, include a contrarian prediction
5. Always label contrarian predictions clearly, with the consensus view alongside for comparison