pdf-processing

This skill provides tools and guidance for extracting content from PDF documents.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "pdf-processing" with this command: npx skills add fredkschott/astro-skills/fredkschott-astro-skills-pdf-processing

PDF Processing

Quick Start

Use pdfplumber to extract text from the first page of a PDF:

import pdfplumber

with pdfplumber.open("document.pdf") as pdf:
    text = pdf.pages[0].extract_text()

Installation

Install the required dependencies:

pip install pdfplumber

Basic Text Extraction

For simple text extraction from a PDF:

import pdfplumber

def extract_text(pdf_path):
    """Extract all text from a PDF file."""
    text = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_text = page.extract_text()
            if page_text:
                text.append(page_text)
    return "\n\n".join(text)
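
The `if page_text` check in the function above silently skips pages that yield no text, which is typical of scanned or image-only pages. The following is a minimal sketch, assuming you want to know which pages were skipped. `collect_text` and `extract_text_with_report` are hypothetical helper names, not part of pdfplumber or of this skill.

```python
def collect_text(page_texts):
    """Join non-empty page texts and report 1-based numbers of empty pages.

    page_texts: iterable of str or None, one entry per page.
    """
    kept, empty_pages = [], []
    for number, page_text in enumerate(page_texts, start=1):
        if page_text and page_text.strip():
            kept.append(page_text)
        else:
            empty_pages.append(number)
    return "\n\n".join(kept), empty_pages


def extract_text_with_report(pdf_path):
    """Hypothetical wrapper: extract text and list pages that had none."""
    import pdfplumber  # imported here so collect_text works without it

    with pdfplumber.open(pdf_path) as pdf:
        # The generator is fully consumed while the file is still open.
        return collect_text(page.extract_text() for page in pdf.pages)
```

Pages reported as empty may need OCR or manual review, which is outside what this sketch covers.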

Table Extraction

For extracting tables from PDFs:

import pdfplumber

def extract_tables(pdf_path):
    """Extract all tables from a PDF file."""
    tables = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            page_tables = page.extract_tables()
            tables.extend(page_tables)
    return tables
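
Each table returned by `extract_tables()` is a list of rows, and each row is a list of cell values that may be `None` for empty cells. The following is a small sketch of turning one such table into dictionaries keyed by the header row. `table_to_records` is a hypothetical helper, and it assumes the first row holds the column headers, which will not be true of every PDF.

```python
def table_to_records(table):
    """Convert a list-of-rows table into a list of dicts.

    Assumes table[0] is the header row. None cells become empty strings.
    """
    if not table:
        return []
    header = [(cell or "").strip() for cell in table[0]]
    records = []
    for row in table[1:]:
        cells = [(cell or "").strip() for cell in row]
        records.append(dict(zip(header, cells)))
    return records


# Sample data in the shape extract_tables() produces for a single table.
sample = [["Name", "Qty"], ["Widget", "2"], ["Gadget", None]]
print(table_to_records(sample))
```

Rows shorter than the header simply produce dictionaries with fewer keys, because `zip` stops at the shorter sequence.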

Form Filling

For filling PDF forms, see references/FORMS.md.

Advanced Table Extraction

For complex tables with merged cells, see references/TABLES.md and run scripts/extract.py.
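
The referenced files are not reproduced here. As a rough illustration of one way merged cells are sometimes handled, the sketch below forward-fills empty cells down each column. It assumes that a vertically merged cell shows up as a value in its first row followed by `None` in the rows beneath it. `fill_merged_down` is a hypothetical helper and is not taken from references/TABLES.md or scripts/extract.py.

```python
def fill_merged_down(table):
    """Copy the last seen value down each column where a cell is None.

    Assumes None marks a cell covered by a vertical merge. Genuinely
    empty cells would be filled too, so check the results by eye.
    """
    filled = []
    last_seen = {}  # column index -> most recent non-None value
    for row in table:
        new_row = []
        for col, cell in enumerate(row):
            if cell is None:
                cell = last_seen.get(col)
            else:
                last_seen[col] = cell
            new_row.append(cell)
        filled.append(new_row)
    return filled
```

The input table is left unmodified, so the raw extraction can still be compared against the filled version.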
