pdf-toolkit

Run a local script to work with PDF files, DOCX documents, OCR, and text-to-speech. Use the read tool to load this SKILL.md, then exec the uv run command inside it. Do NOT use sessions_spawn. Triggers: read pdf, extract text from pdf, merge pdfs, split pdf, rotate pdf, ocr pdf, read docx, create docx, text to speech, convert to mp3, pdf info, pdf pages.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "pdf-toolkit" with this command: npx skills add youpele52/pdf-toolkit

System Dependencies

  • uv must already be installed because this skill is executed with uv run, and uv installs the Python dependencies declared in src/main.py.
  • ffmpeg is needed for tts because the speech output is normalized and written as an .mp3 file through ffmpeg.
  • tesseract is needed for ocr because it performs the actual optical character recognition on scanned page images.
  • pdfimages is also needed for ocr because it extracts page images from PDFs before those images are passed to tesseract; pdfimages comes from poppler.
  • pandoc is optional for convert because it can convert between many document formats when text-based conversion is possible.
  • libreoffice is an optional alternative to pandoc for convert because it can handle document conversions that pandoc may not support well.

File Access And Network Behavior

  • This skill operates on the file paths provided by the caller. It can read from and write to any host path the caller supplies; it is not limited to the OpenClaw workspace.
  • The /root/.openclaw/workspace/... paths in the command examples show where the skill entrypoint lives. They do not restrict which files the skill can access.
  • The tts command uses edge-tts, which sends the input text to an external text-to-speech service over the network to generate audio.
  • Do not use tts with sensitive or private text unless you are comfortable sending that text off-host.
  • All other commands run locally on the host, subject to the optional local binaries documented below.

Skill: PDF Toolkit

When to use

  • User wants to extract text, tables, or images from a PDF.
  • User wants to get metadata or page count from a PDF.
  • User wants to merge, split, or rotate a PDF.
  • User wants to create a new PDF from plain text or Markdown.
  • User wants to read or write a DOCX file.
  • User wants to OCR a scanned PDF (requires tesseract on host).
  • User wants to convert text or a document to an MP3 audio file (requires ffmpeg on host).
  • User wants to convert between document formats (requires pandoc or libreoffice on host).
  • User wants to check which optional system tools are available.

When NOT to use

  • User wants to view or render a PDF visually — use a PDF viewer.
  • User wants to fill in PDF form fields — this skill does not support AcroForms.
  • User wants to edit an existing PDF's text in-place — use a dedicated PDF editor.

Commands

Check available tools

uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor

Get PDF metadata and page count

uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info <pdf_path>

Extract text from a PDF

# All pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text <pdf_path>

# Specific pages (1-indexed, comma-separated or ranges)
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text <pdf_path> --pages 1,3,5-8

Extract tables from a PDF

uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables <pdf_path>
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-tables <pdf_path> --pages 2-4

Extract images from a PDF

# Saves images to current directory by default
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images <pdf_path>
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-images <pdf_path> --output-dir /path/to/output

Merge PDFs

uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge <pdf1> <pdf2> [<pdf3> ...] --output merged.pdf

Split a PDF

# Split into individual pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split <pdf_path> --output-dir /path/to/output

# Extract a page range into a new PDF
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py split <pdf_path> --pages 2-5 --output extracted.pdf

Rotate pages in a PDF

# Rotate all pages 90 degrees clockwise
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate <pdf_path> --degrees 90 --output rotated.pdf

# Rotate specific pages
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py rotate <pdf_path> --degrees 180 --pages 1,3 --output rotated.pdf

Create a PDF from text

uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --text "Hello, world!" --output hello.pdf
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py create-pdf --file input.txt --output document.pdf

Read a DOCX file

uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx <docx_path>

Write a DOCX file

uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --text "Content here" --output document.docx
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py write-docx --file input.txt --output document.docx

OCR a scanned PDF (requires tesseract)

uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr <pdf_path>
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr <pdf_path> --pages 1-3 --lang eng

Convert text or document to speech (requires ffmpeg)

uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Hello, world!" --output speech.mp3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --file input.txt --output speech.mp3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --file document.pdf --output speech.mp3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Hello" --voice en-GB-SoniaNeural --output speech.mp3

Convert document formats (requires pandoc or libreoffice)

uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py convert <input_path> --output <output_path>

Examples

# Inspect a PDF
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py info report.pdf

# Pull text from pages 1–3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py extract-text report.pdf --pages 1-3

# Merge two PDFs
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py merge a.pdf b.pdf --output combined.pdf

# OCR a scanned document
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py ocr scan.pdf

# Read a Word document
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py read-docx report.docx

# Text to MP3
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py tts --text "Welcome to the future." --output welcome.mp3

# Check what is available on this host
uv run /root/.openclaw/workspace/skills/pdf-toolkit/src/main.py doctor

Chat Delivery

  • When this skill is used in a chat interface that supports file attachments, such as Telegram, any generated output file should be sent back to the user as an attachment after successful creation or conversion.
  • This applies to commands that create files, including create-pdf, write-docx, extract-images, merge, split, rotate, tts, and convert.
  • If a temporary output file is created in the Claw runtime temporary folder for delivery, delete that temporary file immediately after the file has been sent successfully to the user.
  • Do not delete files that were written to a user-requested destination outside the Claw temporary folder.
  • If the chat environment cannot send file attachments, report the output path clearly instead of claiming the file was delivered.

Output

  • Plain text with labeled sections separated by blank lines.
  • Errors are prefixed with Error:.
  • The doctor command shows a table of available and missing tools.

Notes

  • uv run reads the inline # /// script dependency block in main.py and auto-installs Python packages in an isolated environment — no pip install or venv setup needed.
  • Core features (info, extract-text, extract-tables, merge, split, rotate, create-pdf, read-docx, write-docx) work with uv alone — no system binaries required.
  • OCR requires tesseract installed on the host (brew install tesseract / apt install tesseract-ocr). Also needs pdfimages from poppler (brew install poppler).
  • TTS requires ffmpeg installed on the host (brew install ffmpeg / apt install ffmpeg).
  • Document conversion requires pandoc or libreoffice on the host.
  • Run doctor first if you are unsure which features are available.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

baidu-search

Comprehensive search API integration for Baidu Qianfan Web Search. Use when Claude needs to perform web searches using Baidu Qianfan's enterprise search API....

Registry SourceRecently Updated
General

Self Memory Manager

管理 Claude 的记忆和工作流程优化。包括:(1) Context 使用管理 (2) 重要信息存档 (3) 定时总结 (4) 工作文件夹维护 用于:context 超过 80%、重要信息需要记录、每日总结、清理旧 session

Registry SourceRecently Updated
General

Seedance Video

Generate AI videos using ByteDance Seedance. Use when the user wants to: (1) generate videos from text prompts, (2) generate videos from images (first frame,...

Registry SourceRecently Updated