Academic Reader - Scientific Paper Parser
Convert PDF files to clean Markdown using MinerU Open API. No API key required.
Quick Start
# Parse a local PDF
mineru-open-api flash-extract report.pdf
# Parse a PDF from a URL
mineru-open-api flash-extract https://cdn-mineru.openxlab.org.cn/demo/example.pdf
# Write the output to ./output/
mineru-open-api flash-extract report.pdf -o ./output/
# Parse only pages 1-10
mineru-open-api flash-extract report.pdf --pages 1-10
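The single-file commands above extend naturally to a folder of papers. The following is a minimal sketch, not part of the CLI: `extract_all` and the `MINERU_BIN` variable are hypothetical names introduced here, and the sketch assumes that `-o` accepts the same output directory for every file.

```shell
# MINERU_BIN is a hypothetical override hook for this sketch, not a CLI feature.
# It lets you point the helper at a different binary (or a stub when testing).
MINERU_BIN=${MINERU_BIN:-mineru-open-api}

# Hypothetical helper: run flash-extract on every PDF in a directory.
# Usage: extract_all ./papers ./output/
extract_all() {
  in_dir=$1
  out_dir=$2
  for pdf in "$in_dir"/*.pdf; do
    # If the glob matched nothing, the literal pattern is left; skip it.
    [ -e "$pdf" ] || continue
    # Report a failed file on stderr and continue with the next one.
    "$MINERU_BIN" flash-extract "$pdf" -o "$out_dir" || echo "failed: $pdf" >&2
  done
}
```

Continuing past a failed file is a design choice for batch runs; stop at the first error instead if partial output would be misleading.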
Language Rule
You MUST reply to the user in the SAME language they use. This is non-negotiable.
Capabilities
- Extracts text, tables, and formulas from PDF
- Supports both local files and URLs directly
- Page range selection with --pages
- Language hint with --language (default: ch, use en for English)
- No API key, no signup, no authentication
- Max 10MB / 20 pages per document
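Because the language hint defaults to ch, English papers need the flag set explicitly. A small wrapper can make that the default for a session. This is only a sketch: `extract_english` and `MINERU_BIN` are hypothetical names, and it assumes `--language` and `--pages` can be combined in one call, which the listing above suggests but does not state.

```shell
# MINERU_BIN is a hypothetical override hook for this sketch, not a CLI feature.
MINERU_BIN=${MINERU_BIN:-mineru-open-api}

# Hypothetical wrapper: parse an English-language PDF, optionally limited
# to a page range. Assumes --language and --pages can be used together.
extract_english() {
  src=$1          # local path or URL
  pages=${2:-}    # e.g. 1-10; empty means no --pages flag is passed
  if [ -n "$pages" ]; then
    "$MINERU_BIN" flash-extract "$src" --language en --pages "$pages"
  else
    "$MINERU_BIN" flash-extract "$src" --language en
  fi
}
```

For example, `extract_english report.pdf 1-10` would run flash-extract on the first ten pages with the English hint.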
When to Use
- User asks to "read", "extract", "convert", or "parse" a PDF
- User shares a PDF file or PDF link and asks for its content
- User wants to summarize or analyze a PDF document
- User needs PDF content in Markdown format
CLI Reference
Run mineru-open-api flash-extract --help for all available options.
Data Flow
flash-extract sends the document to the MinerU API (mineru.net) for processing and returns Markdown. This is a stateless API call: no account is needed and nothing is stored persistently. MinerU is an open-source project by OpenDataLab (Shanghai AI Lab): https://github.com/opendatalab/MinerU
Notes
- Output is Markdown only; images/tables/formulas may be replaced with placeholders
- For larger files (up to 200MB/600 pages) or precision extraction with full assets, use mineru-open-api extract (requires auth via mineru-open-api auth)
- If the CLI cannot be installed via npm/uv/go, download it from https://mineru.net/ecosystem?tab=cli
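The two sets of limits can be turned into a simple decision rule. The sketch below is illustrative only: `choose_subcommand` is a hypothetical helper, it reads 1MB as 1024 * 1024 bytes (the service may count differently), and it treats the page limit as applying to the whole document rather than to a `--pages` selection.

```shell
# Hypothetical helper: pick a subcommand from file size (bytes) and page count.
# Limits come from this document: flash-extract 10MB / 20 pages,
# extract 200MB / 600 pages.
choose_subcommand() {
  bytes=$1
  pages=$2
  mb=$((1024 * 1024))   # assumption: 1MB = 1024 * 1024 bytes
  if [ "$bytes" -le $((10 * mb)) ] && [ "$pages" -le 20 ]; then
    echo "flash-extract"
  elif [ "$bytes" -le $((200 * mb)) ] && [ "$pages" -le 600 ]; then
    echo "extract"      # requires auth via mineru-open-api auth
  else
    echo "too-large"
    return 1
  fi
}
```

The byte count of a local file is available from `wc -c < report.pdf`; the page count has to come from another tool, for example `pdfinfo` from Poppler if it happens to be installed.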