Web Scraper

# Web Scraper Skill

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "Web Scraper" with this command: npx skills add rupertnt034/rupert-web-scraper

Overview

Extract data from websites efficiently and ethically.

Capabilities

1. Data Extraction

  • Extract text content
  • Pull structured data
  • Capture tables
  • Get images/media
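As an illustration of the "capture tables" capability, here is a minimal sketch that uses only Python's standard `html.parser` module. The names `TableExtractor`, `extract_table`, and `SAMPLE_HTML` are hypothetical and not part of the skill; the sketch assumes reasonably well-formed markup with closed cell tags and no nested tables.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def extract_table(html):
    """Return the table rows found in an HTML string as lists of strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample markup, used only for illustration.
SAMPLE_HTML = (
    "<table>"
    "<tr><th>Name</th><th>Price</th></tr>"
    "<tr><td>Product A</td><td>$19.99</td></tr>"
    "</table>"
)
```

Calling `extract_table(SAMPLE_HTML)` yields one list per row, with the header row first. Real pages often need a more forgiving parser; this sketch only shows the shape of the task.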

2. Formats

  • JSON output
  • CSV export
  • Markdown
  • SQL inserts
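To make the four output formats concrete, the following sketch renders the same records as JSON, CSV, Markdown, and SQL inserts using only the standard library. All function names and the `products` table name are assumptions for illustration; the skill's actual output may differ in layout and escaping.

```python
import csv
import io
import json

# Sample records; values are illustrative only.
RECORDS = [
    {"name": "Product A", "price": "$19.99"},
    {"name": "Product B", "price": "$29.99"},
]


def to_json(records, key="products"):
    """Wrap the records in an object and pretty-print it."""
    return json.dumps({key: records}, indent=2)


def to_csv(records):
    """Render records as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_markdown(records):
    """Render records as a pipe-delimited Markdown table."""
    headers = list(records[0])
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for record in records:
        lines.append("| " + " | ".join(str(record[h]) for h in headers) + " |")
    return "\n".join(lines)


def sql_literal(value):
    """Quote a value as a SQL string literal by doubling single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def to_sql_inserts(records, table="products"):
    """Render one INSERT statement per record (hypothetical table name)."""
    statements = []
    for record in records:
        columns = ", ".join(record)
        values = ", ".join(sql_literal(v) for v in record.values())
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return "\n".join(statements)
```

The SQL helper only produces text for export. When the statements are executed against a database, parameterized queries are the safer choice.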

3. Features

  • Rate limiting
  • Caching
  • Retry logic
  • Error handling
  • Proxy support
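Retry logic is commonly implemented as exponential backoff. The sketch below is one possible shape, not the skill's actual implementation: `fetch_with_retry` and its parameters are hypothetical, and the `fetch` callable stands in for whatever performs the HTTP request.

```python
import time


def fetch_with_retry(fetch, url, retries=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch(url), retrying on network-style errors with exponential backoff.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, and so on
    between attempts. Re-raises the last error once retries are exhausted.
    """
    last_error = None
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except OSError as exc:  # urllib.error.URLError and ConnectionError are OSErrors
            last_error = exc
            if attempt < retries:
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. Only `OSError` subclasses are retried here, so programming errors still surface immediately.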

4. Ethical Scraping

  • Respect robots.txt
  • Rate limits
  • User agent rotation
  • Legal compliance
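Respecting robots.txt can be checked with Python's standard `urllib.robotparser` module. The sketch below parses rules that have already been downloaded; `build_robots_checker`, the sample rules, and the user agent string are hypothetical names chosen for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt, user_agent):
    """Return a function that reports whether user_agent may fetch a URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def allowed(url):
        return parser.can_fetch(user_agent, url)

    return allowed


# Hypothetical robots.txt content, used only for illustration.
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /private/\n"

is_allowed = build_robots_checker(SAMPLE_ROBOTS, "example-scraper")
```

With these sample rules, `is_allowed("https://example.com/products")` is true and any URL under `/private/` is refused. A scraper would run this check before every request and skip disallowed URLs.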

Usage

Commands

  • scrape [URL] for [data]
  • extract [element] from [URL]
  • get table from [URL]
  • crawl [website] depth [n]
  • export [URL] to [format]
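One way to read the `crawl [website] depth [n]` command is as a breadth-first traversal that stops following links n hops from the start page. The sketch below works under that assumption. `crawl` and `get_links` are hypothetical names, and `get_links` stands in for whatever fetches a page and returns the URLs it links to.

```python
from collections import deque


def crawl(start_url, max_depth, get_links):
    """Visit pages breadth-first, following links at most max_depth hops.

    Returns the visited URLs in visit order. Each URL is visited once.
    """
    seen = {start_url}
    visited = []
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

The `seen` set prevents revisiting pages that link to each other, which keeps the request count bounded on sites with cyclic navigation.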

Examples

Input: "scrape example.com for product names and prices"

Output:

{
  "products": [
    {"name": "Product A", "price": "$19.99"},
    {"name": "Product B", "price": "$29.99"}
  ]
}

Configuration

Rate Limits

  • Default: 1 request/second
  • Configurable range: 0.1 to 10 requests/second
  • Respect site limits
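The rate limits above can be enforced by spacing requests at a fixed minimum interval. This is a minimal sketch under that assumption; the `RateLimiter` class and its constants are hypothetical and only mirror the documented default (1 request/second) and range (0.1 to 10 requests/second).

```python
import time

MIN_RATE = 0.1      # requests per second, lower bound from the documented range
MAX_RATE = 10.0     # requests per second, upper bound from the documented range
DEFAULT_RATE = 1.0  # documented default


class RateLimiter:
    """Blocks so that consecutive wait() calls are at least 1/rate seconds apart."""

    def __init__(self, rate=DEFAULT_RATE, clock=time.monotonic, sleep=time.sleep):
        if not MIN_RATE <= rate <= MAX_RATE:
            raise ValueError(f"rate must be between {MIN_RATE} and {MAX_RATE} req/s")
        self.interval = 1.0 / rate
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Sleep until the next request is allowed, then reserve the next slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval
```

A scraper would call `limiter.wait()` immediately before each request. The `clock` and `sleep` parameters exist so the limiter can be tested without real delays.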

Output Options

  • JSON (default)
  • CSV
  • Markdown
  • SQL
  • Custom template

Best Practices

  1. Always identify yourself
  2. Cache responses
  3. Handle errors gracefully
  4. Stay within legal bounds
  5. Don't overwhelm servers
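The first two practices can be sketched with the standard library: a descriptive `User-Agent` header identifies the client, and an in-memory cache avoids refetching the same URL. `USER_AGENT`, `build_request`, `fetch`, and `CachedFetcher` are hypothetical names, and the contact URL is a placeholder.

```python
import urllib.request

# Hypothetical identifier; replace with a real name and contact address.
USER_AGENT = "example-scraper/0.1 (+https://example.com/contact)"


def build_request(url, user_agent=USER_AGENT):
    """Build a request that identifies the scraper via the User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url):
    """Fetch a URL with the identifying header and return the raw body."""
    with urllib.request.urlopen(build_request(url), timeout=10) as response:
        return response.read()


class CachedFetcher:
    """Wraps a fetch function so each URL is requested at most once."""

    def __init__(self, fetch_fn=fetch):
        self._fetch = fetch_fn
        self._cache = {}

    def get(self, url):
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]
```

This cache never expires and lives only in memory, which suits a single short run. Longer-running scrapers usually need expiry and persistent storage.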

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
