Idea Tournament

A structured framework for generating diverse research ideas through tree-based expansion, then selecting the strongest candidate via Elo-rated pairwise tournaments across four quality dimensions.

When to Use This Skill

User has a research direction from research-ideation and needs concrete, ranked ideas
User wants to systematically compare multiple research ideas before committing
User asks about idea ranking, competitive selection, or proposal generation
User wants to explore variations of a research concept and select the best one
User mentions "idea tournament", "rank ideas", "compare approaches", "research proposal", "which idea is best"

From Direction to Proposal

The gap between "I have a research direction" and "I have a concrete proposal" is where most researchers stall. They either commit to their first idea (missing better alternatives) or endlessly brainstorm without converging (analysis paralysis).

The tournament solves both problems. Phase 1 forces breadth — you generate up to N_I=21 candidates (the paper's maximum) by systematically varying technique, domain, and formulation. Phase 2 forces convergence — pairwise Elo comparisons identify the strongest idea without requiring you to hold all candidates in your head simultaneously.

Before starting:

Load prior knowledge from Ideation Memory (M_I):
- Refer to the evo-memory skill → Read M_I at /memory/ideation-memory.md
- Select the top-2 entries (k_I=2) most relevant to the user's current goal by comparing each entry's Summary and Retrieval Tags against the goal
- Feasible directions become tree seeds — incorporate them as Level 1 branches in Phase 1
- Unsuccessful directions (fundamental failures only) are used during pruning — prune any tree branch that matches a fundamental failure pattern
- If M_I doesn't exist yet (first cycle), skip this step
Retrieve relevant literature L for the user goal G. The paper defines idea tree search as IdeaTreeSearch(G, L, K_I) — literature is a formal input alongside the user goal and retrieved memory. Use web search or provided papers to ground idea generation in existing work.
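
The memory-selection step above can be sketched as a simple ranking. This is a minimal illustration, not the evo-memory implementation: the `MemoryEntry` fields, the word-overlap `relevance` score, and the function names are all assumptions made for the example. In practice the comparison against Summary and Retrieval Tags would be a judgment call rather than a word count.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    """One Ideation Memory entry; the field names are illustrative assumptions."""
    summary: str
    retrieval_tags: list[str] = field(default_factory=list)


def _words(text: str) -> set[str]:
    # Lowercased word set with surrounding punctuation stripped.
    return {w.strip(".,;:!?\"'()").lower() for w in text.split()} - {""}


def relevance(entry: MemoryEntry, goal: str) -> int:
    """Count goal words that also appear in the entry's summary or tags.

    Word overlap is a crude stand-in for a real relevance judgment.
    """
    entry_words = _words(entry.summary)
    for tag in entry.retrieval_tags:
        entry_words |= _words(tag)
    return len(_words(goal) & entry_words)


def select_top_entries(entries: list[MemoryEntry], goal: str, k_i: int = 2) -> list[MemoryEntry]:
    """Return the k_i most relevant entries (k_I=2 in the text).

    An empty list (first cycle, no memory yet) simply yields an empty result.
    """
    ranked = sorted(entries, key=lambda e: relevance(e, goal), reverse=True)
    return ranked[:k_i]
```

Python's sort is stable, so entries with equal scores keep their original order in the memory file.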

Phase 1: Tree-Structured Idea Generation

Expand a seed idea into a tree of candidates by varying one axis per level. The tree structure ensures diversity — each branch explores a fundamentally different variation rather than minor tweaks of the same concept.

The Three Axes

Level	Axis	What Varies	Example
0	Seed	Starting research direction	"Efficient LLM inference"
1	Technique	The core technical approach	Pruning, quantization, distillation
2	Domain	The application context	Edge devices, multi-modal, long-context
3	Formulation	The problem framing	Latency-constrained, memory-constrained, accuracy-preserving

Expansion Process

Level 0 — Seed (1 node): Start with the research direction from research-ideation. This is your root node.

Level 1 — Technique variants (3 nodes): Generate 3 fundamentally different technical approaches to the seed direction. These should be distinct paradigms, not variations of the same technique. Reflect carefully to verify each is genuinely different.

Level 2 — Domain adaptations (6-9 nodes): For each Level 1 node, generate 2-3 domain-specific adaptations. How does this technique apply differently in different contexts? What domain-specific constraints create new challenges?

Level 3 — Formulation variants (up to N_I=21 total leaves): For each Level 2 node, refine into 1-3 specific problem formulations. A formulation pins down the exact problem statement — the inputs, outputs, constraints, and evaluation criteria. The paper sets N_I=21 as the maximum number of candidate ideas. If the tree produces fewer than 15 leaves, expand Level 2 or Level 3 further. If more than 21, prune to stay within the N_I limit.
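
One way to keep track of the expansion is a small tree structure with a leaf-count check against the 15-21 window described above. This is a sketch under assumptions: the `IdeaNode` class, the `leaf_budget_status` helper, and the returned status strings are hypothetical names, not part of tree-search-protocol.md.

```python
from dataclasses import dataclass, field

N_I = 21          # maximum number of candidate ideas, from the text
MIN_LEAVES = 15   # below this, the text says to expand Level 2 or Level 3 further


@dataclass
class IdeaNode:
    description: str
    level: int  # 0 seed, 1 technique, 2 domain, 3 formulation
    children: list["IdeaNode"] = field(default_factory=list)

    def add_child(self, description: str) -> "IdeaNode":
        """Attach a child one level deeper and return it."""
        child = IdeaNode(description, self.level + 1)
        self.children.append(child)
        return child


def leaves(node: IdeaNode) -> list[IdeaNode]:
    """Collect all nodes without children, left to right."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def leaf_budget_status(root: IdeaNode) -> str:
    """Return 'expand', 'prune', or 'ok' depending on the leaf count."""
    n = len(leaves(root))
    if n < MIN_LEAVES:
        return "expand"
    if n > N_I:
        return "prune"
    return "ok"
```

For example, 3 techniques with 2 domains each and 3 formulations per domain gives 18 leaves, which falls inside the window.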

Per-Node Cycle: Propose → Review → Refine

For each new node:

Propose: Write a 2-3 sentence description of the idea
Review: Evaluate critically — Is this genuinely different from sibling nodes? Is it at least plausible?
Refine: Sharpen the description based on the review. Remove vague language. Make the novelty claim specific.

Pruning

After expanding each level, prune clearly infeasible branches. A branch is "clearly infeasible" if:

It requires resources fundamentally unavailable (e.g., proprietary datasets you can't access)
It contradicts well-established theoretical results
It duplicates an existing, well-established solution with no meaningful variation
It appears in evo-memory's unsuccessful directions as a fundamental failure (not implementation failure)

Important: Pruning removes only the obviously unworkable. Do NOT prune ideas that are risky, unconventional, or outside your current expertise — these are exactly the ideas tournaments are designed to evaluate fairly.
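
The pruning rule can be made explicit by allowing only the four listed reasons. The sketch below assumes a nested-dict tree and a hypothetical `PruneReason` enum; "risky" or "unconventional" is deliberately not a member, so such branches cannot be flagged.

```python
from enum import Enum


class PruneReason(Enum):
    # The four "clearly infeasible" criteria from the list above.
    UNAVAILABLE_RESOURCES = "requires fundamentally unavailable resources"
    CONTRADICTS_THEORY = "contradicts well-established theoretical results"
    DUPLICATES_EXISTING = "duplicates an existing solution with no meaningful variation"
    KNOWN_FUNDAMENTAL_FAILURE = "matches a fundamental failure in evo-memory"


def prune(tree: dict, flagged: dict[str, PruneReason]) -> dict:
    """Return a copy of the tree without flagged branches.

    `tree` maps an idea name to its subtree (an empty dict for a leaf).
    `flagged` maps idea names to the reason they are clearly infeasible.
    """
    kept = {}
    for name, subtree in tree.items():
        if name in flagged:
            continue  # drop the branch together with everything below it
        kept[name] = prune(subtree, flagged)
    return kept
```

Recording the reason alongside each pruned branch also leaves a trail you can review later if a pruning decision looks too aggressive.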

Save the complete tree to /idea-tree.md.

See references/tree-search-protocol.md for detailed expansion rules and diversity metrics.

Phase 2: Elo Tournament Ranking

Rank all leaf candidates through pairwise comparisons on four quality dimensions. Swiss-system pairing keeps the number of comparisons manageable while still producing reliable rankings.

The Four Dimensions

Dimension	Weight	What It Measures
Novelty	25%	How different is this from existing published work?
Feasibility	25%	Can this be implemented and validated within reasonable time and resources?
Relevance	25%	Does this address an important, open problem in the field?
Clarity	25%	Is the idea well-defined enough to start working on immediately?

All dimensions are weighted equally. Researchers tend to overweight novelty and underweight feasibility — equal weights correct this bias.

Tournament Mechanics

Starting Elo: 1500 for all candidates.

K-factor: 32 (a common choice for newly rated players; large enough that a few matches significantly move ratings).

Swiss-system pairing (4-5 rounds):

Round 1: Random pairing
Subsequent rounds: Pair candidates with similar current Elo ratings, avoiding rematches
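4-5 rounds are enough for 15-21 candidates to produce stable rankings (a Swiss tournament needs roughly log2(N) rounds to separate a clear leader: log2(15) ≈ 3.9, log2(21) ≈ 4.4)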
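
The pairing rules above can be sketched with a greedy pass over the candidates. Several details here are assumptions the text does not specify: the function name `swiss_pairings`, giving the leftover candidate a bye when the field is odd, and falling back to a rematch when no fresh opponent remains. Real Swiss pairing software uses more careful matching.

```python
import random
from typing import Optional


def swiss_pairings(
    ratings: dict[str, float],
    played: set[frozenset[str]],
    round_number: int,
    rng: Optional[random.Random] = None,
) -> tuple[list[tuple[str, str]], Optional[str]]:
    """Return (pairs, bye) for one round.

    Round 1 is random; later rounds pair neighbours in the Elo ordering,
    skipping opponents already faced when possible.
    """
    rng = rng or random.Random()
    names = list(ratings)
    if round_number == 1:
        rng.shuffle(names)
    else:
        names.sort(key=lambda n: ratings[n], reverse=True)

    pairs = []
    unpaired = names
    while len(unpaired) >= 2:
        first, rest = unpaired[0], unpaired[1:]
        # First not-yet-faced candidate in the ordering; assumption: allow a
        # rematch if every remaining candidate has already been faced.
        opponent = next(
            (c for c in rest if frozenset((first, c)) not in played), rest[0]
        )
        pairs.append((first, opponent))
        unpaired = [n for n in rest if n != opponent]

    # Assumption: with an odd field, the last unpaired candidate sits out.
    bye = unpaired[0] if unpaired else None
    return pairs, bye
```

After each round, add `frozenset((a, b))` for every pair to `played` so the next call can avoid rematches.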

Per-match process:

Present both candidates side by side with their full descriptions
Score each on all 4 dimensions (1-10 scale)
Compute composite scores (average of 4 dimensions)
Determine the match winner (higher composite score)
Update Elo ratings using the standard formula (see references/elo-ranking-guide.md for the formula, worked example, and convergence criteria)
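
The per-match steps can be sketched as follows, using the Elo update as it is commonly stated: expected score E_A = 1 / (1 + 10^((R_B - R_A) / 400)) and new rating R_A + K * (S_A - E_A). The referenced guide may differ in details. The function names are hypothetical, and scoring an exact composite tie as a draw (0.5 each) is an assumption, since the text only defines the winner as the higher composite.

```python
K_FACTOR = 32
DIMENSIONS = ("novelty", "feasibility", "relevance", "clarity")


def composite(scores: dict[str, float]) -> float:
    """Equal-weight average of the four dimension scores (1-10 scale)."""
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability-like expectation that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def play_match(
    rating_a: float,
    rating_b: float,
    scores_a: dict[str, float],
    scores_b: dict[str, float],
    k: float = K_FACTOR,
) -> tuple[float, float]:
    """Return the updated (rating_a, rating_b) after one pairwise match."""
    comp_a, comp_b = composite(scores_a), composite(scores_b)
    if comp_a > comp_b:
        result_a = 1.0
    elif comp_a < comp_b:
        result_a = 0.0
    else:
        result_a = 0.5  # assumption: equal composites count as a draw

    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (result_a - exp_a)
    new_b = rating_b + k * ((1.0 - result_a) - (1.0 - exp_a))
    return new_a, new_b
```

With both candidates at the starting rating of 1500, the expected score is 0.5, so a win moves the winner to 1516 and the loser to 1484. The update is zero-sum: the two ratings always add up to the same total.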

Save rankings to /idea-rankings.md.

See references/elo-ranking-guide.md for the detailed rubric and convergence criteria.

Phase 3: Direction Summarization

Synthesize the top-3 ranked ideas into a "promising directions" summary. This serves two purposes: it preserves optionality (the best idea may combine elements from multiple candidates), and it feeds into evo-memory for future cycles.

Summarization Process

For each of the top-3 ideas:

Extract the core research direction (abstract away from specific implementation details)
Identify the key insight that makes this direction promising
Note the primary risk or uncertainty
Check against evo-memory — has this direction been explored before? What was learned?

Then synthesize across the top-3:

What common threads run through the top-3? These may suggest an even stronger combined direction.
What dimensions do the top ideas excel in? Are there patterns (e.g., all top ideas score high on feasibility but moderate on novelty)?
What's missing? Are there important aspects of the original seed that none of the top ideas address?
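
The pattern question above (which dimensions the top ideas excel in) can be answered with a small aggregation. This sketch assumes each candidate is a dict with an `elo` rating and a `scores` mapping holding its average score per dimension across matches; that record layout and the function name are hypothetical.

```python
from statistics import mean


def top_k_dimension_profile(candidates: list[dict], k: int = 3) -> dict[str, float]:
    """Average each dimension over the k highest-rated candidates."""
    top = sorted(candidates, key=lambda c: c["elo"], reverse=True)[:k]
    if not top:
        return {}
    dimensions = top[0]["scores"].keys()
    return {d: mean(c["scores"][d] for c in top) for d in dimensions}
```

A profile with high feasibility and moderate novelty across the top-3, for instance, is the kind of pattern worth noting in the summary.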

Save to /direction-summary.md.

After completion, trigger evo-memory IDE (Idea Direction Evolution) to update Ideation Memory with the promising directions identified.

Phase 4: Proposal Extension

Extend the tournament winner (rank #1) into a full research proposal with enough detail to begin implementation.

Proposal Structure

The paper defines proposal P as containing 5 sections: background, related work, method, experiment plan, and expected results. We extend this with a 6th practical section (risks and mitigations).

1. Background: Define the exact problem — inputs, outputs, constraints, and why existing solutions are insufficient. Be specific: "LLM inference on edge devices with <2GB memory while maintaining >90% of full-model accuracy" is a background statement; "make LLMs faster" is not. Include context and motivation.

2. Related Work: Position the idea within the existing literature. What has been tried? What are the gaps? This should draw on the literature L retrieved during Phase 1 tree generation.

3. Proposed Method: Describe the technical approach at a level of detail sufficient for implementation. Include the key insight that differentiates this from prior work. State assumptions explicitly. List 3 testable contributions.

4. Experiment Plan: Datasets, baselines, metrics, and ablation design. This should align with what experiment-pipeline Stage 4 will need. Include both quantitative metrics and qualitative evaluation where appropriate.

5. Expected Results: Quantitative targets (e.g., "15-20% latency reduction with <2% accuracy loss") and qualitative expectations. Being specific about expected results forces you to think about whether the idea is realistic.

6. Risks and Mitigations (practical extension): Technical risks that could prevent success, and fallback plans for each. A proposal without risks is either dishonest or insufficiently analyzed. This section is not in the paper but is valuable for practical research planning.

Save to /research-proposal.md.

See references/proposal-extension.md for detailed guidance on each section.

Counterintuitive Tournament Rules

Prioritize these rules during idea generation and ranking:

Quantity before quality: Generate many candidates before evaluating any. Premature filtering kills diversity. You can't know which idea is strongest until you've seen the alternatives — and the best ideas often emerge from unexpected branches of the tree.
Vary one axis per level: Changing multiple axes simultaneously produces ideas that are different but not meaningfully diverse. Each level of the tree should explore ONE dimension of variation, so you understand exactly what makes each branch unique.
Feasibility is not optional: Brilliant but infeasible ideas waste entire research cycles. A novel idea that can't be validated within your constraints is not a contribution — it's a thought experiment. Weight feasibility equally with novelty.
The tournament finds surprises: Structured pairwise comparison often reveals that your initial favorite isn't actually the strongest idea. Trust the rankings over your gut feeling. If the results surprise you, that means the tournament is working — it's surfacing information you wouldn't have found through intuition alone.
Pruning is not selecting: Prune only clearly infeasible branches. The tournament handles quality ranking. If you aggressively prune before the tournament, you're substituting your initial intuition for systematic comparison — exactly the bias the tournament is designed to correct.
Top-3, not top-1: Summarizing the top 3 directions (not just the winner) preserves optionality. The best final approach may combine elements from multiple top candidates. Committing to exactly one idea too early discards valuable signal.

Handoff to Planning

When the tournament is complete and the proposal is written, pass these artifacts to paper-planning:

Artifact	Source Phase	Used By
Research proposal (5+1 sections)	Phase 4	Story design, experiment planning
Idea tree (full structure)	Phase 1	Related work positioning
Elo rankings with scores	Phase 2	Justification for chosen direction
Direction summary (top-3)	Phase 3	Fallback directions if primary fails
Tournament scorecards	Phase 2	Understanding idea strengths/weaknesses

Also pass results to evo-memory for evolution updates:

Trigger IDE (Idea Direction Evolution) with the top-3 directions from Phase 3

Skill Integration

Before Starting (load memory)

Refer to the evo-memory skill to read Ideation Memory: → Read M_I at /memory/ideation-memory.md

After Phase 3 (update memory)

Refer to the evo-memory skill and trigger IDE: → Run IDE protocol with /direction-summary.md

After Phase 4 (handoff to planning)

Refer to the paper-planning skill: → Pass /research-proposal.md

Reference Navigation

Topic	Reference File	When to Use
Tree expansion rules and diversity	tree-search-protocol.md	Generating diverse idea candidates
Elo formula, rubric, and pairing	elo-ranking-guide.md	Running the tournament
Proposal section guidance	proposal-extension.md	Writing the research proposal
Idea candidate template	idea-candidate-template.md	Describing individual ideas
Ranking scorecard template	ranking-scorecard-template.md	Recording pairwise comparisons
Direction summary template	direction-summary-template.md	Synthesizing top-3 directions
