pdf2word-skills

Convert scanned PDF documents into Word text documents using a free, local OCR engine or a remote API.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "pdf2word-skills" with this command: npx skills add scottkiss/pdf2word-skills

PDF to Word Converter

A skill to extract text from scanned PDF documents and convert them into reusable Word (.docx) files using the free, local docr OCR engine.

Prerequisites

Initialize the OCR engine by downloading the binaries:
```
bash scripts/install.sh
```

Install the required Python dependencies:

```
pip install -r scripts/requirements.txt
```

Usage

Run the Python script, passing the input PDF file and the desired output .docx file path. Any additional standard docr arguments (such as an engine preference) can be appended after the two paths.

```
python scripts/pdf2word.py <input.pdf> <output.docx> [docr_args...]
```

Examples

Convert a single file with the default local engine:

```
python scripts/pdf2word.py sample.pdf sample_output.docx
```
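To convert a whole folder, the same command can be driven from a small wrapper. The following is a minimal sketch, not part of the skill itself: the function names `build_commands` and `convert_all` and the folder names in the usage note are hypothetical, and it relies only on the command-line form shown under Usage.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(pdf_dir, out_dir, docr_args=()):
    """Return one pdf2word.py command line per PDF found in pdf_dir."""
    commands = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        # Each output file keeps the PDF's base name with a .docx suffix.
        docx = Path(out_dir) / (pdf.stem + ".docx")
        commands.append(
            [sys.executable, "scripts/pdf2word.py", str(pdf), str(docx), *docr_args]
        )
    return commands


def convert_all(pdf_dir, out_dir, docr_args=()):
    """Run the conversions one after another, stopping at the first failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(pdf_dir, out_dir, docr_args):
        subprocess.run(cmd, check=True)
```

Calling `convert_all("scans", "converted")` from the skill's root directory would then process every PDF in a `scans` folder. Extra docr arguments can be passed as a list, for example `["-engine", "gemini"]`.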

Using Other API Engines

By default, the script uses the local RapidOCR engine. The underlying docr tool also supports other engines like the Google Gemini API for potentially higher recognition accuracy on complex layouts.

To use Gemini, first configure your API key:

```
mkdir -p ~/.ocr
echo "gemini_api_key=your_gemini_key" > ~/.ocr/config
```
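Before switching engines, it can help to confirm that the key was actually written. The sketch below assumes the config file is a plain list of `key=value` lines, as the `echo` command above suggests. How docr itself parses the file is not documented here, and the helper names `read_config` and `has_gemini_key` are hypothetical.

```python
from pathlib import Path


def read_config(path):
    """Parse simple key=value lines into a dict, skipping lines without '='."""
    settings = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def has_gemini_key(path=None):
    """True if the file exists and holds a non-placeholder gemini_api_key."""
    path = Path(path) if path is not None else Path.home() / ".ocr" / "config"
    if not path.is_file():
        return False
    key = read_config(path).get("gemini_api_key", "")
    # "your_gemini_key" is the placeholder from the setup command above.
    return bool(key) and key != "your_gemini_key"
```

Calling `has_gemini_key()` with no argument checks `~/.ocr/config`.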

Then pass the -engine gemini argument to the script:

```
python scripts/pdf2word.py sample.pdf sample_output.docx -engine gemini
```

If your document has tables, you can force Gemini to output them in Markdown format so the script can parse them into native Word tables:

```
python scripts/pdf2word.py sample.pdf sample_output.docx -engine gemini -prompt "Extract all text and preserve tables in Markdown format using | symbols."
```
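To illustrate what turning Markdown tables into native Word tables might involve, here is a minimal sketch. It is not the script's actual code: `parse_markdown_table` and `add_word_table` are hypothetical helpers, and the only python-docx calls assumed are `Document.add_table(rows=..., cols=...)` and `table.cell(r, c).text`.

```python
def parse_markdown_table(lines):
    """Turn Markdown table lines into a list of rows (lists of cell strings)."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table line
        # Drop the outer pipes, then split the remainder into cells.
        inner = line[1:-1] if line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        # Skip the header separator row, e.g. |---|:---:|
        if all(cell and set(cell) <= {"-", ":"} for cell in cells):
            continue
        rows.append(cells)
    return rows


def add_word_table(document, rows):
    """Write parsed rows into a new table on a python-docx Document."""
    if not rows:
        return None
    n_cols = max(len(row) for row in rows)
    table = document.add_table(rows=len(rows), cols=n_cols)
    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            table.cell(r, c).text = value
    return table
```

With python-docx installed, a document created with `from docx import Document; doc = Document()` can be passed to `add_word_table(doc, parse_markdown_table(text.splitlines()))` and then saved with `doc.save("out.docx")`.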

How it Works

1. The script calls docr, which uses the specified OCR model (RapidOCR by default) to read text from the scanned PDF.
2. The extracted text is temporarily stored.
3. The python-docx library is used to read the temporary text and construct a formatted Word document.
4. Temporary files are cleaned up automatically.
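The flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's source: `convert`, `run_ocr`, `write_docx`, and `write_docx_with_python_docx` are hypothetical names, and the docr invocation is left as a caller-supplied function because its exact command line is not shown here.

```python
import tempfile
from pathlib import Path


def convert(pdf_path, docx_path, run_ocr, write_docx):
    """Run OCR into a temporary text file, then build the Word file from it.

    run_ocr(pdf_path, text_path) and write_docx(lines, docx_path) are
    hypothetical callables standing in for the docr call and the
    python-docx step.
    """
    with tempfile.TemporaryDirectory() as tmp:
        text_path = Path(tmp) / "ocr_output.txt"
        run_ocr(pdf_path, text_path)  # OCR writes the extracted text here
        lines = text_path.read_text(encoding="utf-8").splitlines()
        write_docx(lines, docx_path)  # build the Word document from the text
    # Leaving the with block deletes the temporary directory and its files.
    return len(lines)


def write_docx_with_python_docx(lines, docx_path):
    """One possible write_docx: each non-empty line becomes a paragraph."""
    from docx import Document  # python-docx, imported only when needed

    document = Document()
    for line in lines:
        if line.strip():
            document.add_paragraph(line)
    document.save(str(docx_path))
```

Using `tempfile.TemporaryDirectory` as a context manager means the intermediate text is removed even if the Word-building step raises an exception.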

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
