pdf-filler

Extract and fill PDF AcroForm fields with a multi-backend fallback chain. Reads field schemas (text inputs, checkboxes, radio buttons, dropdowns, multi-line text) from any AcroForm PDF and fills them using user-provided JSON values. Tries pypdf, pdfrw, PyMuPDF, and pdftk in order so it works on PDFs that defeat any single library. Use when the user wants to inspect, populate, batch-fill, or programmatically complete PDF forms, or asks about extracting form field names from a PDF, filling out a PDF, generating a filled PDF, or AcroForm processing.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "pdf-filler" with this command: npx skills add qubit999/pdf-filler

pdf-filler

Operate on PDF AcroForms: list every field with its type and current value, then fill the PDF with values supplied as JSON. The skill calls a small Python package (oc-pdf-filler) that wraps a fallback chain of PDF libraries, so a single recalcitrant PDF doesn't block the workflow.

Workspace rules (read this first)

Many agent hosts run inside a sandbox that only allows reads/writes inside a specific workspace folder. Files written outside that folder show up as "Unavailable / Outside allowed folders" and the user can't download them.

The CLI enforces this for you:

  • The "workspace" is resolved from the first set environment variable in this list, falling back to the current working directory: OC_PDF_FILLER_WORKSPACE, OPENCLAW_WORKSPACE, CLAWHUB_WORKSPACE, AGENT_WORKSPACE, SKILL_WORKSPACE, WORKSPACE.
  • All output paths (schema JSON and filled PDF) are resolved relative to the workspace.
  • If you pass an absolute --output that points outside the workspace, the CLI rewrites it to the same basename inside the workspace (and prints a warning to stderr).
  • You can override the workspace explicitly with --workspace DIR.

In practice: pass relative paths (e.g. -o form_done.pdf), or omit --output entirely. The default is <input-stem>_done.pdf inside the workspace.

When to use

Trigger this skill when the user:

  • Asks to inspect, list, or extract the form fields of a PDF
  • Wants to fill out / populate / complete a PDF form programmatically
  • Mentions AcroForm, checkbox, radio button, or dropdown handling in a PDF
  • Has a batch of PDFs to fill from structured data (JSON)

Setup (once per workspace)

The skill scripts call the oc-pdf-filler Python package. Install it first:

pip install "oc-pdf-filler[all]"
# or, if working from the source repo: pip install -e ".[all]"

The [all] extra pulls in pdfrw and PyMuPDF for the full fallback chain. Install pdftk from your package manager for the last-resort backend (optional, but useful for stubborn PDFs).

Verify which backends are active:

python scripts/list_backends.py

Step 1: Extract the field schema

Always extract first so you know the exact field names and types before constructing the JSON values file. Use a workspace-relative path for the output (the CLI confines it to the workspace automatically).

python scripts/extract.py /path/to/form.pdf --output schema.json --include-values

Each entry in the resulting JSON has:

  • name: the AcroForm field name (use this verbatim as the key when filling)
  • type: one of text, checkbox, radio, choice, signature, pushbutton, unknown
  • options: for radios and checkboxes, the accepted export values; for choices, the dropdown options
  • value: current value if the form is partially filled (only when --include-values is set)
  • max_length, multiline, required, read_only: hints for validation

See references/FIELD_TYPES.md for the value contract per field type.

Step 2: Build a values JSON file

The fill input is a flat JSON object { "FieldName": value }. Example:

{
  "Name Verantwortlicher": "ACME GmbH",
  "Postleitzahl Verantwortlicher": "10115",
  "Beschäftigte": true,
  "Verarbeitungstyp": "Automatisiert"
}

A starter template is included at assets/values.example.json.

Step 3: Fill the PDF

Omit --output to get the recommended default <input-stem>_done.pdf inside the workspace, or pass a workspace-relative filename. Absolute paths outside the workspace are automatically rewritten into it (the host's sandbox would reject them otherwise).

# preferred: keep the original name with _done suffix, in the workspace
python scripts/fill.py /path/to/form.pdf values.json
# or pass an explicit relative path
python scripts/fill.py /path/to/form.pdf values.json --output form_done.pdf

By default the orchestrator uses --backend auto, walking the chain pypdf -> pdfrw -> PyMuPDF -> pdftk and stopping at the first backend that fills every field.

Useful flags:

  • --backend pymupdf -- force a specific backend (e.g. when the auto winner produces a PDF that doesn't render correctly in your viewer)
  • --best-effort -- chain backends so partial fills accumulate (use when no single backend handles every field)
  • --flatten -- bake values into the PDF so they can't be edited (best support: PyMuPDF, pdftk)
  • --strict -- exit non-zero if any requested field is missing or unfillable

The script prints a JSON summary including winning_backend, workspace, output_path (absolute path of the resulting PDF, always inside the workspace), filled, missing, failed, and per-attempt details. If filling fails, see references/BACKENDS.md for backend-specific troubleshooting tips.

Delivering the result to the user

After a successful fill, the user expects to receive the PDF as an attachment. To make that work reliably across hosts:

  1. Read output_path from the fill summary. It is guaranteed to be inside the workspace, so the host can attach it.
  2. Surface that path in your final message and use whatever attachment / file-return mechanism the host provides so the PDF arrives as a downloadable attachment.
  3. Never write the PDF outside the workspace (e.g. /tmp, /var, your home directory). Sandboxed hosts will mark it as "Unavailable / Outside allowed folders".

End-to-end example

python scripts/extract.py form.pdf -o schema.json
# ... agent inspects schema.json, builds values.json based on user input ...
python scripts/fill.py form.pdf values.json
# writes <workspace>/form_done.pdf

After filling, re-run extract.py --include-values form_done.pdf and confirm the values stuck before delivering the PDF to the user.

Notes and edge cases

  • Field names may contain spaces, German umlauts, or punctuation. Always copy them verbatim from extract.py output.
  • For radio groups, set the value to the export name of the chosen option (one of the strings in options), not a boolean.
  • Signature fields (type: signature) are reported but not auto-filled.
  • Encrypted PDFs are out of scope; the tool will surface the underlying library error.
  • Some PDF viewers cache appearance streams; if a viewer shows blank fields after filling, try opening with a different viewer or use --flatten.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Web3

Openclaw

CardZero — the first payment wallet built for AI agents. Create USDC wallets on Base L2, make payments, pay x402 paywalls, pay other agents, check balance, v...

Registry SourceRecently Updated
Web3

A24 Films

In-depth profile of A24 — the independent studio that redefined auteur-driven filmmaking and became a cultural phenomenon.

Registry SourceRecently Updated
Web3

Claw-Net

Ask a question in plain English, get data from 13,000+ APIs in one call. Crypto prices, social data, market intelligence. Every response is cryptographically...

Registry SourceRecently Updated
2891Profile unavailable
Web3

ClawPay-Hedera

Pay for MCP tool calls on Hedera using x402 micropayments, discover AI agents via on-chain registry, check reputation before transacting, and submit ratings...

Registry SourceRecently Updated
2060Profile unavailable