PDF2Markdown CLI
Convert PDF and image documents to Markdown. Supports both pdf2markdown and pdf2md commands.
Run pdf2markdown --help or pdf2md <command> --help for options.
Prerequisites
Install and authenticate. Check with pdf2markdown --status.
pdf2markdown login
# or set PDF2MARKDOWN_API_KEY
If not ready, see rules/install.md. For output handling, see rules/security.md.
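A script can check for the environment variable before calling the CLI. This is only a sketch: `require_api_key` is a hypothetical helper, and it cannot see credentials stored by `pdf2markdown login`, so a failure here is a hint rather than proof that authentication is missing.

```shell
# Hypothetical helper: succeed only if PDF2MARKDOWN_API_KEY is set and non-empty.
# It does not detect a session created by `pdf2markdown login`.
require_api_key() {
  if [ -z "${PDF2MARKDOWN_API_KEY:-}" ]; then
    echo "PDF2MARKDOWN_API_KEY is not set; run 'pdf2markdown login' or export the key" >&2
    return 1
  fi
}
```

For a definitive answer, `pdf2markdown --status` remains the check to rely on.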
Workflow
| Need | Command | When |
|---|---|---|
| Convert PDF/image | parse | File under ~30MB; local path or URL available |
| Large file (async) | parse-async | File over ~30MB (up to 100MB), or sync returns a file_too_large error |
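The size rule in the table can be sketched as a small shell helper. This is a sketch under assumptions: `choose_command` is a hypothetical name, and the cutoff is taken as exactly 30 × 1024 × 1024 bytes, whereas the real limit is only stated as approximate.

```shell
# Hypothetical helper: print the subcommand suggested by the table above.
# Assumes "~30MB" means 30 * 1024 * 1024 bytes; the service's real limit may differ.
choose_command() {
  local file="$1"
  local limit=$((30 * 1024 * 1024))
  local size

  [ -f "$file" ] || return 1
  # wc -c may pad its output with spaces on some systems, so strip them.
  size=$(wc -c < "$file" | tr -d ' ')

  if [ "$size" -le "$limit" ]; then
    echo "parse"
  else
    echo "parse-async"
  fi
}
```

A caller would still need to add `--wait` when the helper prints `parse-async`, and should fall back to `parse-async` if a sync call returns `file_too_large` anyway.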
Quick start
Parse (sync, files up to ~30MB):
pdf2markdown document.pdf -o .pdf2markdown/output.md
pdf2markdown parse --url "https://example.com/doc.pdf" -o .pdf2markdown/doc.md
pdf2markdown parse file1.pdf file2.png -o .pdf2markdown/
# JSON output
pdf2markdown parse document.pdf --format json -o .pdf2markdown/result.json
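When each input needs its own explicitly named output, the `.pdf2markdown/{name}.md` layout used in this document can be sketched as a loop. `output_path` and `convert_all` are hypothetical helper names; the only CLI features used are `parse` and `-o`, as shown above.

```shell
# Hypothetical helper: map an input path to .pdf2markdown/<name>.md,
# dropping the directory and the last extension.
output_path() {
  local base
  base=$(basename "$1")
  echo ".pdf2markdown/${base%.*}.md"
}

# Hypothetical helper: convert each file in turn, one output per input.
convert_all() {
  local f
  # Create the output directory up front in case the CLI does not.
  mkdir -p .pdf2markdown
  for f in "$@"; do
    pdf2markdown parse "$f" -o "$(output_path "$f")"
  done
}
```

Calling `convert_all file1.pdf file2.png` would run one `parse` per file. Two inputs that share a base name would map to the same output path, so rename one of them first.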
Parse-async (large files, up to 100MB):
# Submit and wait
pdf2markdown parse-async large.pdf --wait -o .pdf2markdown/output.md
pdf2markdown parse-async --url "https://cdn.example.com/big.pdf" --wait -o .pdf2markdown/doc.md
# Submit only (poll later)
pdf2markdown parse-async large.pdf # returns task_id
pdf2markdown parse-async <task_id> --status
pdf2markdown parse-async <task_id> --result -o .pdf2markdown/output.md
Options
| Command | Key options |
|---|---|
| parse | -u, --url, -o, --output, -f, --format (markdown, json, all), --page-images, --json, --pretty |
| parse-async | -u, --url, -o, --output, --wait, --status, --result, --poll-interval, --timeout |
Run pdf2markdown <command> --help for full details.
Output & Organization
Write results to .pdf2markdown/ with -o. Add .pdf2markdown/ to .gitignore.
pdf2markdown document.pdf -o .pdf2markdown/doc.md
pdf2markdown parse file1.pdf file2.pdf -o .pdf2markdown/
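The `.gitignore` step can be made safe to repeat. The sketch below uses only standard tools; `ignore_output_dir` is a hypothetical helper name, not part of the CLI.

```shell
# Hypothetical helper: append .pdf2markdown/ to a gitignore file
# only if that exact line is not already present.
ignore_output_dir() {
  local gitignore="${1:-.gitignore}"
  touch "$gitignore"
  if ! grep -qxF '.pdf2markdown/' "$gitignore"; then
    # If the file is non-empty and lacks a trailing newline, add one first
    # so the new entry does not get glued onto the last line.
    if [ -s "$gitignore" ] && [ -n "$(tail -c 1 "$gitignore")" ]; then
      echo >> "$gitignore"
    fi
    echo '.pdf2markdown/' >> "$gitignore"
  fi
}
```

Running `ignore_output_dir` from the repository root any number of times leaves a single `.pdf2markdown/` entry.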
Naming: .pdf2markdown/{name}.md. For large outputs, use grep, head, or incremental reads instead of loading the whole file at once. Always quote URLs: the shell treats ? and & as special characters.
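One way to read a large result incrementally is to list its headings first and then print only the range of interest. `md_outline` and `md_slice` are hypothetical helpers built on `grep` and `sed`; they assume the output uses `#`-style Markdown headings.

```shell
# Hypothetical helper: list heading lines with their line numbers.
# Note: this also matches lines starting with '#' inside code blocks.
md_outline() {
  grep -n '^#' "$1"
}

# Hypothetical helper: print lines START..END (inclusive) of a file.
md_slice() {
  local file="$1" start="$2" end="$3"
  sed -n "${start},${end}p" "$file"
}
```

For example, `md_outline .pdf2markdown/doc.md` shows where each section starts, and `md_slice .pdf2markdown/doc.md 120 180` then prints just that span.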