pdf2markdown

Convert PDF and image documents to clean Markdown via the PDF2Markdown CLI. Use when the user wants to extract text from PDFs, convert PDFs to markdown, parse document structure, or process images (JPEG, PNG, GIF, WebP, TIFF, BMP) into structured content. Also use when they say "convert this PDF", "parse this document", "extract text from PDF", "parse async", or "large file" (up to 100MB). Must be pre-installed and authenticated.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "pdf2markdown" with this command: npx skills add QThans/pdf2markdown

PDF2Markdown CLI

Convert PDF and image documents to Markdown. The CLI can be invoked as either pdf2markdown or pdf2md.

Run pdf2markdown --help or pdf2md <command> --help for options.

Prerequisites

Install and authenticate. Check with pdf2markdown --status.

pdf2markdown login
# or set PDF2MARKDOWN_API_KEY

If not ready, see rules/install.md. For output handling, see rules/security.md.
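
The readiness check above can be wrapped in a small guard before any conversion. This is a minimal sketch, not part of the CLI: the function names are hypothetical, and it assumes that `--status` exits non-zero when the CLI is not authenticated, which the skill text does not confirm.

```shell
# Print the first CLI name found on PATH. The skill text says both
# `pdf2markdown` and `pdf2md` are supported.
find_pdf2markdown_cli() {
  for _p2m_name in pdf2markdown pdf2md; do
    if command -v "$_p2m_name" >/dev/null 2>&1; then
      printf '%s\n' "$_p2m_name"
      return 0
    fi
  done
  return 1
}

# Return 0 when ready, 1 when the CLI is missing, 2 when the status check fails.
check_pdf2markdown_ready() {
  if ! _p2m_cli="$(find_pdf2markdown_cli)"; then
    echo "pdf2markdown CLI not found; see rules/install.md" >&2
    return 1
  fi
  # ASSUMPTION: `--status` exits non-zero when not authenticated.
  if ! "$_p2m_cli" --status >/dev/null 2>&1; then
    echo "status check failed; run 'pdf2markdown login' or set PDF2MARKDOWN_API_KEY" >&2
    return 2
  fi
  return 0
}
```

Calling `check_pdf2markdown_ready || exit 1` at the top of a script stops early with a pointer to the right fix. If `--status` turns out to report problems only in its output text, the second check would need to inspect that text instead.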

Workflow

| Need | Command | When |
|---|---|---|
| Convert PDF/image | parse | File under ~30MB, have path or URL |
| Large file (async) | parse-async | File over ~30MB, or sync returns file_too_large error |
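
The size rule in this table can be sketched as a helper that picks the subcommand from a file's byte count. The function names are hypothetical, and the exact cutoff is an assumption: the skill text only says "~30MB", so the sketch uses 30 * 1024 * 1024 bytes.

```shell
# ASSUMPTION: the "~30MB" sync limit is taken as 30 MiB; the real limit may differ.
SYNC_LIMIT_BYTES=$((30 * 1024 * 1024))

# Print "parse" for sizes under the limit, "parse-async" otherwise.
choose_command_for_size() {
  if [ "$1" -lt "$SYNC_LIMIT_BYTES" ]; then
    echo parse
  else
    echo parse-async
  fi
}

# Same decision for a local file; `tr` strips the padding some `wc` builds add.
choose_command_for_file() {
  choose_command_for_size "$(wc -c < "$1" | tr -d '[:space:]')"
}
```

Because the limit is approximate, a file just under it may still fail with file_too_large on the sync path; in that case, retry with parse-async as the table describes.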

Quick start

Parse (sync, files under ~30MB):

pdf2markdown document.pdf -o .pdf2markdown/output.md
pdf2markdown parse --url "https://example.com/doc.pdf" -o .pdf2markdown/doc.md
pdf2markdown parse file1.pdf file2.png -o .pdf2markdown/

# JSON output
pdf2markdown parse document.pdf --format json -o .pdf2markdown/result.json

Parse-async (large files, up to 100MB):

# Submit and wait
pdf2markdown parse-async large.pdf --wait -o .pdf2markdown/output.md
pdf2markdown parse-async --url "https://cdn.example.com/big.pdf" --wait -o .pdf2markdown/doc.md

# Submit only (poll later)
pdf2markdown parse-async large.pdf  # returns task_id
pdf2markdown parse-async <task_id> --status
pdf2markdown parse-async <task_id> --result -o .pdf2markdown/output.md

Options

| Command | Key options |
|---|---|
| parse | -u, --url, -o, --output, -f, --format (markdown, json, all), --page-images, --json, --pretty |
| parse-async | -u, --url, -o, --output, --wait, --status, --result, --poll-interval, --timeout |

Run pdf2markdown <command> --help for full details.

Output & Organization

Write results to .pdf2markdown/ with -o. Add .pdf2markdown/ to .gitignore.

pdf2markdown document.pdf -o .pdf2markdown/doc.md
pdf2markdown parse file1.pdf file2.pdf -o .pdf2markdown/
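
Creating the output directory and ignoring it in git can be done in one idempotent step. The helper below is a sketch with a hypothetical name; it uses only standard tools and does not call the CLI.

```shell
# Create .pdf2markdown/ under the given root (default: current directory)
# and make sure .gitignore lists it exactly once.
ensure_output_dir() {
  _p2m_root="${1:-.}"
  _p2m_ignore="$_p2m_root/.gitignore"
  mkdir -p "$_p2m_root/.pdf2markdown"
  touch "$_p2m_ignore"
  # -x matches the whole line, -F treats the pattern as a fixed string.
  if ! grep -qxF '.pdf2markdown/' "$_p2m_ignore"; then
    # If the file does not end in a newline, add one so the entry gets its own line.
    if [ -s "$_p2m_ignore" ] && [ -n "$(tail -c 1 "$_p2m_ignore")" ]; then
      printf '\n' >> "$_p2m_ignore"
    fi
    printf '%s\n' '.pdf2markdown/' >> "$_p2m_ignore"
  fi
}
```

Running `ensure_output_dir` more than once is safe: the directory is reused and the .gitignore entry is not duplicated.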

Naming: .pdf2markdown/{name}.md. For large outputs, use grep, head, or incremental reads rather than loading the whole file. Always quote URLs, because the shell interprets ? and & as special characters.
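
For large outputs, one way to read incrementally is to list the headings first and then print only the line range of interest. The two helpers below are a sketch with hypothetical names, using only grep and sed. They assume the converted Markdown uses `#`-style headings, so any line starting with `#` inside a code block would also be listed.

```shell
# List heading lines with their line numbers, e.g. "12:## Results".
md_outline() {
  grep -n '^#' "$1"
}

# Print lines START..END (inclusive) of a file: md_slice FILE START END
md_slice() {
  sed -n "${2},${3}p" "$1"
}
```

A typical pass is `md_outline .pdf2markdown/doc.md` to find a section's starting line, followed by `md_slice .pdf2markdown/doc.md 120 180` to read just that part.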
