defuddle-web-cleaner

---
name: defuddle-web-cleaner
description: Extract clean article content from web pages using Defuddle. Use when a user provides a URL or HTML and wants the readable article text, a Markdown version, or structured metadata. Helpful for web scraping, research workflows, note taking, Obsidian clipping, and converting web pages to Markdown.
---

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "defuddle-web-cleaner" with this command: npx skills add extrastu/defuddle


Defuddle Web Cleaner

Extract the main readable content from a web page.

This skill removes unnecessary elements such as:

  • navigation bars
  • sidebars
  • ads
  • comments
  • footers
  • social buttons

The result is clean article content.
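To make the idea of removing clutter concrete, here is a minimal sketch using only Python's standard library. It is not Defuddle's algorithm: the tag set, the class-name hints, and the names `ClutterStripper` and `strip_clutter` are assumptions chosen for illustration, and the depth tracking assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Assumed clutter markers for this sketch; Defuddle's real rules are richer.
CLUTTER_TAGS = {"nav", "aside", "footer"}
CLUTTER_CLASSES = {"ad", "ads", "comments", "social", "share"}
# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClutterStripper(HTMLParser):
    """Collects text that is not nested inside a clutter element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0  # > 0 while inside a clutter element
        self.chunks = []

    def _is_clutter(self, tag, attrs):
        classes = set((dict(attrs).get("class") or "").split())
        return tag in CLUTTER_TAGS or bool(classes & CLUTTER_CLASSES)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth or self._is_clutter(tag, attrs):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def strip_clutter(html: str) -> str:
    """Return the text outside clutter elements, one chunk per line."""
    parser = ClutterStripper()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Calling `strip_clutter` on a page with a `<nav>`, an `<article>`, and a `<footer>` keeps only the article text. A real extractor also scores content density and handles malformed markup, which this sketch does not attempt.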

Supported Inputs

  1. URL
  2. Raw HTML
  3. Web page text
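One way to tell the three input kinds apart is a pair of simple heuristics, sketched below. The function name `detect_input_type`, the regular expressions, and the returned labels are assumptions for illustration; the skill itself does not specify how detection is done.

```python
import re

# A single token starting with http:// or https:// is treated as a URL.
_URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)
# A few common opening tags are taken as evidence of HTML markup.
_TAG_RE = re.compile(
    r"<\s*(!doctype|html|head|body|article|div|p|h[1-6])\b", re.IGNORECASE
)


def detect_input_type(source: str) -> str:
    """Classify the input as 'url', 'html', or 'text' (heuristic only)."""
    text = source.strip()
    if _URL_RE.fullmatch(text):
        return "url"
    if _TAG_RE.search(text):
        return "html"
    return "text"
```

The URL check runs first, so a bare link is never mistaken for text. Anything without recognizable markup falls through to plain text.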

Output Format

Default output:

Title
Author
Site
Published date

Markdown article content

Alternative output (JSON):

{ title, author, site, description, published, content, contentMarkdown }
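The two output shapes can be modelled with one record type, as in the following sketch. The field names come from the JSON shape above; the class name `CleanedPage`, the helper names, the `Label: value` layout, and the choice to omit empty metadata fields are assumptions.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CleanedPage:
    # Field names mirror the JSON keys listed in the skill description.
    title: str = ""
    author: str = ""
    site: str = ""
    description: str = ""
    published: str = ""
    content: str = ""
    contentMarkdown: str = ""


def render_default(page: CleanedPage) -> str:
    """Metadata header followed by a blank line and the Markdown body."""
    labelled = [
        ("Title", page.title),
        ("Author", page.author),
        ("Site", page.site),
        ("Published date", page.published),
    ]
    # Assumption: fields with no value are left out of the header.
    header = [f"{label}: {value}" for label, value in labelled if value]
    return "\n".join(header) + "\n\n" + page.contentMarkdown


def render_json(page: CleanedPage) -> str:
    """Serialize every field, keeping the key order of the dataclass."""
    return json.dumps(asdict(page), ensure_ascii=False, indent=2)
```

Keeping both renderers on top of the same record means the default and JSON outputs cannot drift apart in which fields they carry.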

Processing Steps

  1. Detect input type
  2. Load page HTML
  3. Run Defuddle parser
  4. Extract metadata
  5. Convert to Markdown if requested
  6. Return clean content
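The steps above can be sketched as a small orchestration function. This is an outline under stated assumptions, not the skill's implementation: `clean_page` is a hypothetical name, and `load_html`, `parse`, and `to_markdown` are caller-supplied callables standing in for the HTTP fetch, the Defuddle parser, and a Markdown converter.

```python
from typing import Callable, Dict, Optional


def clean_page(
    source: str,
    load_html: Callable[[str], str],
    parse: Callable[[str], Dict[str, str]],
    to_markdown: Optional[Callable[[str], str]] = None,
) -> Dict[str, str]:
    """Run the six processing steps with injected (hypothetical) helpers."""
    text = source.strip()

    # Steps 1-2: fetch when the input looks like a URL, else treat it as HTML.
    is_url = text.startswith(("http://", "https://")) and " " not in text
    html = load_html(text) if is_url else text

    # Steps 3-4: the parser returns the cleaned content plus metadata.
    result = dict(parse(html))

    # Step 5: Markdown conversion happens only when a converter is supplied.
    if to_markdown is not None:
        result["contentMarkdown"] = to_markdown(result.get("content", ""))

    # Step 6: return the cleaned result.
    return result
```

Passing the helpers in as arguments keeps the sketch runnable without network access and makes each step easy to replace or test in isolation.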

Example

Input:

https://example.com/blog/ai

Output:

Title: AI is Changing Everything
Author: Jane Smith
Site: Example Blog

Markdown:

# AI is Changing Everything

Artificial intelligence is transforming industries...

Tips

Use this skill when:

  • saving articles to Obsidian
  • building research datasets
  • cleaning webpages for LLM processing
  • summarizing articles

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
