bench-debug

/bench-debug <doc_id>

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "bench-debug" with this command: npx skills add opendataloader-project/opendataloader-pdf/opendataloader-project-opendataloader-pdf-bench-debug

Compares the parser's output with the ground truth for a specific document and analyzes the causes of any failures.

Usage

/bench-debug 01030000000189

Execution Steps

Run the benchmark for the specific document

./scripts/bench.sh --doc-id <doc_id>

Compare files

  • Ground-truth: tests/benchmark/ground-truth/markdown/<doc_id>.md

  • Prediction: tests/benchmark/prediction/opendataloader/markdown/<doc_id>.md

  • Original PDF: tests/benchmark/pdfs/<doc_id>.pdf
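
As a starting point for the comparison, the sketch below builds the three paths listed above for a given document ID and produces a unified diff of the two Markdown files. The helper names `paths_for` and `diff_markdown` are hypothetical and not part of the repository; the only things taken from this skill are the directory layout and the file naming. It assumes the script is run from the repository root.

```python
import difflib
from pathlib import Path

# Directory layout copied from the file list above (relative to the repo root).
BENCH_ROOT = Path("tests/benchmark")


def paths_for(doc_id: str) -> dict[str, Path]:
    """Return the ground-truth, prediction, and PDF paths for one document.

    Hypothetical helper: it only joins path segments, it does not check
    that the files exist.
    """
    return {
        "ground_truth": BENCH_ROOT / "ground-truth" / "markdown" / f"{doc_id}.md",
        "prediction": BENCH_ROOT / "prediction" / "opendataloader" / "markdown" / f"{doc_id}.md",
        "pdf": BENCH_ROOT / "pdfs" / f"{doc_id}.pdf",
    }


def diff_markdown(truth: str, pred: str) -> list[str]:
    """Unified diff of two Markdown strings.

    Lines starting with '-' exist only in the ground truth (missing text);
    lines starting with '+' exist only in the prediction (extra text).
    """
    return list(
        difflib.unified_diff(
            truth.splitlines(),
            pred.splitlines(),
            fromfile="ground-truth",
            tofile="prediction",
            lineterm="",
        )
    )
```

To use it on a real document, read both files with `Path.read_text(encoding="utf-8")`, pass the strings to `diff_markdown`, and print the result joined by newlines. A line-based diff is only a first pass: it shows where text is missing or extra, but it does not explain table or reading-order scores by itself.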

Analyze differences

  • Missing/extra text locations

  • Table structure differences (TEDS score causes)

  • Heading level mismatches (MHS score causes)

  • Reading order errors (NID score causes)
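
For the heading-level check in particular, a quick way to spot mismatches is to extract the headings from both files and compare their levels. The sketch below is not the MHS metric and does not reproduce how the benchmark scores headings; it is an illustrative helper with hypothetical names (`headings`, `heading_mismatches`) that assumes both Markdown files use ATX-style headings (`#` through `######`) and matches headings by exact title text.

```python
import re
from typing import Optional

# ATX heading: 1-6 '#' characters, whitespace, title, optional closing '#'s.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def headings(markdown: str) -> dict[str, int]:
    """Map each heading title to its level (first occurrence wins).

    Lines inside fenced code blocks are skipped so that '#' comments in
    code are not mistaken for headings.
    """
    found: dict[str, int] = {}
    in_fence = False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING_RE.match(line)
        if match:
            found.setdefault(match.group(2), len(match.group(1)))
    return found


def heading_mismatches(
    truth: str, pred: str
) -> list[tuple[str, Optional[int], Optional[int]]]:
    """List (title, truth_level, pred_level) for every heading whose level
    differs or that appears on only one side (None marks the missing side)."""
    t, p = headings(truth), headings(pred)
    titles = list(t) + [title for title in p if title not in t]
    return [
        (title, t.get(title), p.get(title))
        for title in titles
        if t.get(title) != p.get(title)
    ]
```

A result such as `("Methods", 2, 3)` would mean the prediction demoted a level-2 heading to level 3, while `("Data", 3, None)` would mean the heading was not emitted as a heading at all. Exact title matching is a deliberate simplification; headings whose text was also altered during parsing will show up as one missing and one extra entry.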

Identify root causes

  • Which PDF elements caused the issue

  • Which Java core components are involved

Suggest improvements

  • Java classes/methods that need modification

  • Expected impact scope

Reference Files

  • ground-truth/reference.json: Per-document element info (categories, coordinates, etc.)

  • java/opendataloader-pdf-core/: Core parsing logic

Example Output

Document 01030000000189 Analysis:

Overall: 0.2763 (one of the worst-performing documents)

Issues:

  1. 2 of 3 tables not detected (TEDS: 0.15)

    • Table boundary detection failed
    • Related code: TableDetector.java
  2. Reading order errors (NID: 0.45)

    • Multi-column layout handling failed
    • Related code: ColumnDetector.java

Recommended Actions:

  • Adjust clustering threshold in TableDetector
  • Improve multi-column detection logic

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
