prove-it

Prove It

When to use

The user asserts certainty: “always”, “never”, “guaranteed”, “optimal”, “cannot fail”, “no downside”, “100%”.
The user asks for a devil’s advocate or proof.
The claim feels too clean for the domain.

Round cadence (mandatory)

Definition: one "turn" means one assistant reply.
Default: continuous (no approvals). Run rounds 1-10 plus Oracle synthesis in the current assistant turn. Do not rely on a future assistant turn for default behavior.
Hard completion gate: a default continuous reply is incomplete unless it contains rounds 1-10 in order, one distinct Round Ledger + Knowledge Delta block for each round, and an Oracle synthesis block after round 10.
Fail closed: if you notice you have emitted only round 1 or are about to conclude early, do not finalize; continue immediately with rounds 2-10 in the same reply.
In default mode, after each round, publish:
Round Ledger
Knowledge Delta
Round integrity is mandatory:
Execute rounds 1-10 in numeric order, with one distinct gauntlet focus per round.
Emit a separate Round Ledger + Knowledge Delta block for every round; do not merge multiple rounds into one pseudo-round or summarize "rounds 2-9" together.
Do not treat prior prose as implicitly satisfying skipped rounds unless the user explicitly asks to continue from a known checkpoint.
Early disproof does not end the gauntlet: if round 1 already falsifies the original claim, continue rounds 2-9 against the strongest surviving refined claim, then publish Oracle synthesis.
If confidence remains low after Oracle synthesis, continue with additional rounds (11+) in the same reply when feasible, then publish an updated Oracle synthesis.
Do not ask for permission to continue. In default mode, do not wait for "next" between rounds. Pause only when you must ask the user a question or the user says "stop".
Step mode (explicit): if the user asks to "pause" / "step" / "one round at a time", run one round then wait for "next".
Turn autoloop (explicit): if the user asks for "autoloop" / "one round per turn", run exactly one gauntlet round per assistant turn and continue on the next turn until Oracle synthesis.
Full auto mode (explicit): "full auto" / "fast mode" is an alias for the default continuous behavior.
Keep it compact: prefer terse bullets and phrase-level evidence so the full gauntlet fits in one reply.

Mode invocation

Mode Default? How to invoke Cadence

Continuous yes (no phrase) / "continuous" / "don't stop" / "exhaust all rounds" rounds 1-10 + Oracle in one turn; publish Round Ledger + Knowledge Delta after each round

Step mode no "step mode" / "pause each round" / "pause" / "step" / "one round at a time" 1 round/turn; wait for "next"

Turn autoloop no "autoloop" / "one round per turn" 1 round/turn; continue on the next turn until Oracle

Full auto no "full auto" / "fast mode" alias for Continuous

Quick start

Restate the claim and its scope.
Default to continuous. If the user explicitly requests "step mode" or "turn autoloop", use that instead.
Run rounds 1-10 plus Oracle synthesis in the same reply, publishing a distinct Round Ledger + Knowledge Delta block after each round.
Do not emit a top-level verdict, conclusion, or stop point before round 10; any interim refinement stays inside the round blocks.
Treat "continuous", "don't stop", and "exhaust all rounds" as the default continuous mode.
If confidence remains low, run additional rounds (11+) and publish an updated Oracle synthesis.

Ten-round gauntlet

Counterexamples: smallest concrete break.
Logic traps: missing quantifiers/premises.
Boundary cases: zero/one/max/empty/extreme scale.
Adversarial inputs: worst-case distributions/abuse.
Alternative paradigms: different model flips the conclusion.
Operational constraints: latency/cost/compliance/availability.
Probabilistic uncertainty: variance, tail risk, sampling bias.
Comparative baselines: “better than what?”, which metric?
Meta-test: fastest disproof experiment.
Oracle synthesis: tightest surviving claim with boundaries. If confidence is still low, repeat rounds 1-9 as needed, then re-run Oracle synthesis.

Round self-prompt bank (pick exactly 1)

Internal self-prompts for selecting round focus. Do not ask the user unless blocked.

Counterexamples: What is the smallest input that breaks this?
Logic traps: What unstated assumption must hold?
Boundary cases: Which boundary is most likely in real use?
Adversarial: What does worst-case input look like?
Alternative paradigm: What objective makes the opposite true?
Operational: Which dependency/policy is a hard stop?
Uncertainty: What distribution shift flips the result?
Baseline: Better than what, on which metric?
Meta-test: What experiment would change your mind fastest?
Oracle: What explicit boundaries keep this honest?

Core artifacts

Argument map

Claim: Premises:

P1:
P2: Hidden assumptions:
A1: Weak links:
W1: Disproof tests:
T1: Refined claim:

Round Ledger (update every round)

Round: <1-10 (or 11+)> Focus: Claim scope: New evidence: New counterexample: Remaining gaps: Next round:

Knowledge Delta (publish every round)

New:
Updated:
Invalidated:

Claim boundary table

Boundary type	Valid when	Invalid when	Assumptions	Stressors
Scale
Data quality
Environment
Adversary

Next-tests plan

Test	Data needed	Success threshold	Stop condition

Domain packs

Performance

Use when the claim is about speed, latency, throughput, or resources.

Clarify: median vs tail latency vs throughput.
Identify workload shape (spiky vs steady) and bottleneck resource.

Product

Use when the claim is about user impact, adoption, or behavior.

Clarify user segment and success metric.
State the baseline/counterfactual.
Name the likely unintended behavior/tradeoff.

Oracle synthesis template (round 10 / as needed)

Original claim: Refined claim: Boundaries:

Valid when:
Invalid when: Confidence trail:
Evidence:
Gaps: Next tests:
...

Deliverable format (per turn)

Round number + focus.
Round Ledger + Knowledge Delta.
At most one question for the user (only when blocked).
In default continuous mode, run rounds 1-10 + Oracle synthesis in the same reply, with one distinct per-round block per round.
In step mode, run one round and wait for "next".
In turn autoloop, run one round in that turn and continue to the next round on the next turn.
In full auto (or "fast mode"), follow the same behavior as default continuous mode.

Continuous-mode completion checklist

Present Round 1 through Round 10 in numeric order.
Include exactly one Round Ledger and one Knowledge Delta section inside each round block.
Place Oracle synthesis after round 10, not before.
A reply missing any of the above is invalid for continuous mode and must be continued before finalizing.

Activation cues

"always" / "never" / "guaranteed" / "optimal" / "cannot fail" / "no downside" / "100%"
"prove it" / "devil's advocate" / "stress test" / "rigor"

Safety Notice

Copy this and send it to your AI assistant to learn

Source Transparency

Related Skills

grill-me

creative-problem-solver

complexity-mitigator