evidence-binder

Evidence Binder (NO PROSE)

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "evidence-binder" with this command: npx skills add willoscar/research-units-pipeline-skills/willoscar-research-units-pipeline-skills-evidence-binder

Evidence Binder (NO PROSE)

Goal: convert a paper-level pool into a subsection-addressable evidence plan.

This skill is the bridge from “Evidence Bank” → “Writer”: the writer should only use evidence IDs bound to the current subsection.

Why this matters for writing quality:

  • Weak/undifferentiated bindings force the writer to either pad prose or cite out-of-scope.

  • Treat binding_gaps as a routing signal: fix upstream evidence/mapping instead of "writing around" missing evidence.

Inputs

  • outline/subsection_briefs.jsonl

  • outline/mapping.tsv

  • papers/evidence_bank.jsonl

  • Optional:

  • citations/ref.bib (to validate cite keys when evidence items carry citations)

Outputs

  • outline/evidence_bindings.jsonl (1 JSONL record per subsection)

  • outline/evidence_binding_report.md (summary; bullets + small tables)

  • Includes gaps (missing required evidence fields) and tag mix (selected evidence tags) so subsection-specific evidence needs are visible.

Output format (outline/evidence_bindings.jsonl )

JSONL (one object per H3 subsection). Best-effort fields (keep deterministic):

  • sub_id , title

  • paper_ids (papers in-scope for this subsection, from mapping.tsv )

  • mapped_bibkeys (bibkeys mapped to this subsection)

  • bibkeys (a selected subset to encourage subsection-first citations)

  • evidence_ids (selected evidence items from papers/evidence_bank.jsonl )

  • evidence_counts (small summary by claim_type / tag / evidence_level)

  • binding_rationale (short bullets; why the selected evidence covers this subsection’s axes / desired tags)

  • binding_gaps (list[str]; required evidence fields not covered by selected evidence; drives the evidence self-loop upstream)

A150++ density contract (default)

  • Use queries.md:per_subsection as the width contract (A150++ default: 28).

  • Bind enough evidence to make writing concretely executable without out-of-scope pressure:

  • mapped_bibkeys : >= per_subsection

  • evidence_ids : >= per_subsection - 4 (A150++: >=24)

  • bibkeys (selected): >= 20 (so each H3 has a usable citation pool, not just a long mapped list)

Binding policy (how strict to be)

  • Subsection-first by default: the writer should primarily cite bibkeys and use evidence_ids bound to this sub_id .

  • Allow limited within-chapter reuse: citations from sibling H3s within the same H2 chapter may be reused for background/evaluation framing, but:

  • keep >=3 subsection-specific citations per H3 (avoid “free cite drift”)

  • avoid cross-chapter reuse unless the outline explicitly calls for it

Workflow (NO PROSE)

  • Read outline/subsection_briefs.jsonl to understand each H3’s scope/rq/axes.

  • Read outline/mapping.tsv to know which papers belong to each subsection.

  • Read papers/evidence_bank.jsonl and select a subsection-scoped set of evidence_id items per H3.

  • If citations/ref.bib exists, sanity-check that any cite keys referenced by selected evidence items are defined.

  • Write outline/evidence_bindings.jsonl and outline/evidence_binding_report.md .

Freeze policy

  • If outline/evidence_bindings.refined.ok exists, the script will not overwrite outline/evidence_bindings.jsonl .

  • Treat this marker as an explicit refinement/completion signal (especially in strict runs): only create it after you have checked binding_gaps and tag mix look subsection-specific.

Heterogeneity sanity check (avoid recipe-like bindings)

A common hidden failure mode is mechanical uniformity: every H3 ends up with the same claim_type /tag mix , which hides what each subsection is actually missing and pushes the writer toward generic prose.

Before you mark bindings as refined:

  • Scan outline/evidence_binding_report.md : different H3 should show meaningfully different tag mix / claim_type balance.

  • If most H3 look identical, treat it as a binder smell: tighten required_evidence_fields , adjust selection rationale, or enrich the evidence bank, then rerun.

Script

Quick Start

  • python .codex/skills/evidence-binder/scripts/run.py --help

  • python .codex/skills/evidence-binder/scripts/run.py --workspace workspaces/<ws>

All Options

  • --workspace <dir> : workspace root

  • --unit-id <U###> : unit id (optional; for logs)

  • --inputs <semicolon-separated> : override inputs (rare; prefer defaults)

  • --outputs <semicolon-separated> : override outputs (rare; prefer defaults)

  • --checkpoint <C#> : checkpoint id (optional; for logs)

Examples

  • Bind evidence IDs after building the evidence bank:

  • Ensure papers/evidence_bank.jsonl exists.

  • Run: python .codex/skills/evidence-binder/scripts/run.py --workspace workspaces/<ws>

Troubleshooting

Issue: some subsections have too few evidence IDs

Fix:

  • Strengthen papers/evidence_bank.jsonl via paper-notes (more extractable evidence items).

  • Or broaden the mapped paper set for the subsection via section-mapper , then rerun binder.

Issue: binding_gaps is non-empty (missing evidence types)

What it means:

  • The subsection brief requires certain evidence fields (e.g., benchmarks/metrics/security/tooling), but the bound evidence items do not cover them.

Fix (self-loop upstream):

  • Prefer enriching papers/evidence_bank.jsonl / papers/paper_notes.jsonl for mapped papers (extract benchmark/metric/failure-mode details).

  • If the mapping is weak for that evidence type, expand outline/mapping.tsv for the subsection and rerun binder.

  • If the requirement is unrealistic for the subsection’s scope, revise outline/subsection_briefs.jsonl:required_evidence_fields and rerun binder.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Research

pdf-text-extractor

No summary provided by upstream source.

Repository SourceNeeds Review
Research

latex-compile-qa

No summary provided by upstream source.

Repository SourceNeeds Review
Research

draft-polisher

No summary provided by upstream source.

Repository SourceNeeds Review
Research

citation-verifier

No summary provided by upstream source.

Repository SourceNeeds Review