pdf-poppler-utils Skill
A collection of PDF manipulation tools from the poppler-utils package.
Installation
Prerequisites
- Homebrew installed on your system
Install Poppler
brew install poppler
⚠️ Important Warning About curl
The poppler Homebrew package has curl as a dependency. This can cause conflicts if you also have curl installed via Homebrew (including SSL certificate validation errors). Use the system curl directly or consider alternative approaches.
Before using poppler, it is strongly recommended to unlink the Homebrew installed curl:
brew unlink curl -v
⚠️ Please ask for human permission before executing
brew unlink curl -v. If the human is unsure, recommend executing it (to resolve the potential conflict).
When to Use Which Tool
| Tool | Use Case |
|---|---|
| pdfinfo | Get detailed metadata and information about a PDF |
| pdftotext | Extract text content from a PDF |
| pdffonts | Analyze font usage in a PDF |
| pdfimages | Extract images embedded in a PDF |
| pdfdetach | Extract attachments from a PDF |
| pdfseparate | Split a multi-page PDF into individual pages |
| pdfunite | Merge multiple PDFs into one |
| pdfsig | Verify digital signatures in a PDF |
| pdftohtml | Convert PDF to HTML |
| pdftoppm | Convert PDF pages to image formats (PNG, PPM) |
| pdftocairo | Convert PDF to Cairo image formats (PNG, JPEG, TIFF, PDF, SVG) |
| pdfattach | Attach files to a PDF |
Quick Reference
View PDF Information
pdfinfo document.pdf
Extract Text
pdftotext document.pdf output.txt
pdftotext document.pdf - | less # Stream to stdout
Extract Images
pdfimages -png document.pdf image_prefix
Split PDF
pdfseparate document.pdf page_%03d.pdf
Merge PDFs
pdfunite file1.pdf file2.pdf file3.pdf merged.pdf
Convert to HTML
pdftohtml document.pdf output.html
Convert to Images
pdftoppm -png -r 300 document.pdf output_prefix
Tool Details
See individual documentation files in the tools/ folder for detailed usage:
- pdfattach - Attach files to a PDF
- pdfdetach - Extract attachments from a PDF
- pdffonts - Analyze fonts in a PDF
- pdfimages - Extract images from a PDF
- pdfinfo - Get PDF metadata and information
- pdfseparate - Split PDF into individual pages
- pdfsig - Verify digital signatures
- pdftocairo - Convert PDF to Cairo formats
- pdftohtml - Convert PDF to HTML
- pdftoppm - Convert PDF to PPM/PNG images
- pdftops - Convert PDF to PostScript
- pdftotext - Extract text from PDF
- pdfunite - Merge multiple PDFs
Common Options
Many tools share common options:
| Option | Description |
|---|---|
-opw <password> | Owner password (bypasses all security) |
-upw <password> | User password |
-v | Print version information |
-h | Print help |
Notes
- All tools follow the convention:
tool [options] input.pdf [output] - Use
-as filename to read from stdin / write to stdout - Exit codes: 0=success, 1=error opening PDF, 2=error opening output, 3=permission error, 99=other error
Reference: Debian manpages