smart-pdf-reader

Intelligent PDF reader and content extractor powered by MinerU API. Read and extract content from any PDF document including scanned files, academic papers, reports, and books using mineru-open-api CLI. Supports flash-extract for instant reading (no token) and precision extract with OCR, table recognition, and formula detection. Use when asked to 'read my PDF', 'extract content from PDF', 'what does this PDF say', 'summarize this PDF', 'get text from PDF', 'PDF阅读', '读取PDF内容', '提取PDF文字', 'PDF文档读取', 'how to read PDF content', 'open and read this PDF', 'can you read this document for me', 'parse PDF content'. Handles complex document types: multi-column academic papers, scanned archives, financial statements, legal documents, and multilingual content. Perfect for research, document review, content analysis, and information extraction.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "smart-pdf-reader" with this command: npx skills add veeicwgy/smart-pdf-reader

Smart PDF Reader with mineru-open-api

You are an intelligent PDF reading assistant. Read and extract content from PDFs using mineru-open-api.

Installation

npm install -g mineru-open-api

Reading Workflow

Quick read (no token):
```
mineru-open-api flash-extract document.pdf
```
(Outputs Markdown to stdout for immediate reading)

Read with output file:

mineru-open-api flash-extract document.pdf -o ./output/

Deep read with OCR and tables:

mineru-open-api extract document.pdf --ocr -o ./output/

Academic paper reading:

mineru-open-api extract paper.pdf --model vlm -o ./output/

Key Rules

For quick reading, use flash-extract without -o so the Markdown goes to stdout.
Default to flash-extract for PDFs under 10 MB / 20 pages.
Use extract for scanned PDFs, table-heavy documents, or large files.
After extraction, read the output and summarize it for the user if asked.
When no output directory is given, generate a default one: ~/MinerU-Skill/<name>_<hash>/
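
The rules above could be sketched as a small shell helper. This is a minimal illustration, not part of the skill itself: the function names `choose_mode` and `default_out_dir` are hypothetical, only the 10 MB size threshold is checked (the page count is not), and the `<hash>` component is assumed to be a `cksum` CRC because the source does not say how the hash is computed.

```shell
#!/bin/sh
# Sketch only: pick a mineru-open-api subcommand from file size.
# The 10 MB threshold comes from the rules above; the 20-page limit is not checked.
SIZE_LIMIT_BYTES=$((10 * 1024 * 1024))

# Prints "flash-extract" for files under the limit, "extract" otherwise.
choose_mode() {
  size=$(($(wc -c < "$1")))   # arithmetic expansion strips padding some wc builds add
  if [ "$size" -lt "$SIZE_LIMIT_BYTES" ]; then
    echo "flash-extract"
  else
    echo "extract"
  fi
}

# Prints a default output directory shaped like ~/MinerU-Skill/<name>_<hash>/.
# Assumption: <hash> is the POSIX cksum CRC of the file; the real scheme is unspecified.
default_out_dir() {
  name=$(basename "$1" .pdf)
  hash=$(cksum < "$1" | cut -d' ' -f1)
  echo "$HOME/MinerU-Skill/${name}_${hash}/"
}

# Possible use (hypothetical wrapper around the commands shown earlier):
#   mode=$(choose_mode document.pdf)
#   mineru-open-api "$mode" document.pdf -o "$(default_out_dir document.pdf)"
```

Keeping the decision in functions that only print their result makes the choice easy to inspect before any extraction is run.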

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Coding

pdf-to-latex-mineru

Convert PDF documents to LaTeX source code using MinerU AI extraction. Designed for researchers, academics, and scientists who need to re-edit, re-typeset, o...

Research

paper-reader (XuRuitian version)

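An expert-level Skill for close reading of academic literature. Use this Skill when the user uploads an academic paper in PDF, Word, Excel, PPT, or TXT format and wants an in-depth scholarly analysis. It supports Chinese and English literature, automatically detects the file type, extracts the full text, and outputs a structured analysis report along six dimensions (research objectives, new methods, experimental validation, future directions, critical analysis, practical recommendations). Trigger words include...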

General

claw-text-and-pics

Extract text and embedded images from scanned documents, PDFs, and photos via Mistral OCR API. Use when reading receipts, invoices, contracts, handwritten no...

Research

Document Workflow

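One-click search, download, chunked text extraction, and structured summarization of academic papers, with filtering by year and citation count.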
