Doc2X OCR Markdown

Overview

Convert a single PDF or image into Markdown and extract image assets with one local script:

scripts/doc2x_ocr.py

Only one credential is required:

DOC2X_APIKEY

Quick Start

Set API key:

export DOC2X_APIKEY='sk-...'

Run PDF OCR to Markdown + images:

python scripts/doc2x_ocr.py pdf ./input.pdf --outdir ./output

Run image OCR to Markdown + images:

python scripts/doc2x_ocr.py image ./page.png --outdir ./output

Workflow

Validate DOC2X_APIKEY.
Choose conversion mode from input file type.
Run scripts/doc2x_ocr.py.
Return output folder and generated Markdown path.
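The first three workflow steps can be sketched as a small pre-flight helper. This is a minimal sketch, not the script's own logic: the function names and the IMAGE_EXTS set are hypothetical, and the image types the script actually accepts may differ.

```python
import os
from pathlib import Path

# Hypothetical extension list; the script's real accepted types may differ.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def require_api_key(env=None):
    """Return DOC2X_APIKEY, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get("DOC2X_APIKEY", "").strip()
    if not key:
        raise RuntimeError("DOC2X_APIKEY is not set")
    return key


def choose_mode(path):
    """Map a file path to the script's 'pdf' or 'image' subcommand."""
    ext = Path(path).suffix.lower()
    if ext == ".pdf":
        return "pdf"
    if ext in IMAGE_EXTS:
        return "image"
    raise ValueError(f"unsupported input type: {ext or '(none)'}")


def build_command(path, outdir):
    """Build the argv list for scripts/doc2x_ocr.py (not executed here)."""
    return ["python", "scripts/doc2x_ocr.py", choose_mode(path),
            str(path), "--outdir", str(outdir)]
```

The list returned by build_command can be passed to subprocess.run once require_api_key() has succeeded.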

Modes

PDF Mode

Use the asynchronous Doc2X PDF flow:

POST /api/v2/parse/preupload
PUT file bytes to returned upload URL
Poll GET /api/v2/parse/status
Trigger export with POST /api/v2/convert/parse (to=md)
Poll GET /api/v2/convert/parse/result
Download zip, extract files, locate Markdown

Useful options:

--formula-mode dollar|normal (default dollar)
--merge-cross-page-forms
--poll-interval
--timeout
--keep-zip
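The two polling steps in the PDF flow, together with --poll-interval and --timeout, follow a common poll-until-done pattern. The sketch below illustrates that pattern only. The fetch_status callable, the "success" and "failed" status strings, and the default values are assumptions for illustration; consult references/api-quick-reference.md for the real response fields.

```python
import time


def poll_until_done(fetch_status, poll_interval=2.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it reports success, failure, or timeout.

    fetch_status is a caller-supplied function returning a dict with a
    "status" key. The values "success" and "failed" are assumed here,
    not taken from the Doc2X documentation.
    """
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "success":
            return result
        if status == "failed":
            raise RuntimeError(f"task failed: {result}")
        # Any other status is treated as "still running".
        if clock() >= deadline:
            raise TimeoutError(f"no final status after {timeout} seconds")
        sleep(poll_interval)
```

Passing sleep and clock as parameters keeps the loop testable without real waiting.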

Image Mode

Use synchronous image layout OCR:

POST /api/v2/parse/img/layout with binary image body
Write page Markdown from response
If convert_zip exists, decode and extract image resources
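The last step, decoding convert_zip and extracting its files, might look like the following. This sketch assumes convert_zip arrives as a base64-encoded zip archive; the source does not state the encoding, and extract_convert_zip is a hypothetical helper name.

```python
import base64
import io
import zipfile
from pathlib import Path


def extract_convert_zip(convert_zip_b64, dest_dir):
    """Decode a base64 zip string (assumed encoding) and extract it.

    Returns the sorted archive-relative names of the extracted files.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    raw = base64.b64decode(convert_zip_b64)
    with zipfile.ZipFile(io.BytesIO(raw)) as archive:
        archive.extractall(dest)
        # Skip directory entries, which end with a slash.
        return sorted(n for n in archive.namelist() if not n.endswith("/"))
```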

Output Contract

For an input named <name>.pdf or <name>.png, the script writes:

<outdir>/<name>/... extracted files
<outdir>/<name>/<name>.md if no Markdown file exists in extracted content
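The path rule above can be expressed as a short helper. This is a sketch under assumptions: resolve_markdown is a hypothetical name, and returning the first Markdown file in sorted order is an illustrative choice, since the source does not say how the script picks among several.

```python
from pathlib import Path


def resolve_markdown(input_path, outdir, page_markdown=""):
    """Return the Markdown path under <outdir>/<name>/.

    Writes <name>.md as a fallback only when the extracted content
    contains no Markdown file.
    """
    name = Path(input_path).stem
    target = Path(outdir) / name
    target.mkdir(parents=True, exist_ok=True)
    existing = sorted(target.rglob("*.md"))
    if existing:
        # Assumption: take the first match in sorted order.
        return existing[0]
    fallback = target / f"{name}.md"
    fallback.write_text(page_markdown, encoding="utf-8")
    return fallback
```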

The script prints a JSON summary with:

mode
uid
output_dir
markdown
zip (only when --keep-zip)
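A caller could check the summary before using it. The sketch below assumes the script's stdout contains only the JSON object, which the source does not confirm; parse_summary is a hypothetical helper.

```python
import json

# Keys listed in the output contract; "zip" is optional (--keep-zip only).
EXPECTED_KEYS = {"mode", "uid", "output_dir", "markdown"}


def parse_summary(stdout_text):
    """Parse the JSON summary and verify the documented keys are present."""
    summary = json.loads(stdout_text)
    missing = EXPECTED_KEYS - summary.keys()
    if missing:
        raise ValueError(f"summary is missing keys: {sorted(missing)}")
    return summary
```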

References

Read these files when you need deeper context:

references/api-quick-reference.md for endpoint behavior and limits
references/implementation-notes.md for how this script relates to the copied official doc2x.py

Troubleshooting

Handle parse_task_limit_exceeded or parse_concurrency_limit by reducing concurrent jobs and retrying later.
Split huge PDFs if parse timeout or page-limit errors occur.
Keep poll interval between 1 and 3 seconds for status APIs unless there is a strong reason to change.
Save outputs promptly because official docs state cloud parse results are temporary.
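The "retry later" advice for limit errors can be sketched as a retry wrapper with exponential backoff. The Doc2XError class, the delays, and the attempt count are hypothetical; only the two error codes come from the notes above, and how the real script surfaces them is not stated.

```python
import time

# Error codes named in the troubleshooting notes.
RETRYABLE_CODES = {"parse_task_limit_exceeded", "parse_concurrency_limit"}


class Doc2XError(Exception):
    """Hypothetical exception carrying an API error code."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def retry_on_limit(job, max_attempts=4, base_delay=5.0, sleep=time.sleep):
    """Run job(), retrying with doubling delays on limit errors only."""
    for attempt in range(max_attempts):
        try:
            return job()
        except Doc2XError as err:
            last_attempt = attempt == max_attempts - 1
            if err.code not in RETRYABLE_CODES or last_attempt:
                raise
            sleep(base_delay * 2 ** attempt)
```

Backoff only spaces out retries; reducing the number of concurrent jobs remains the caller's responsibility.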
