Web Scraper Skill
Overview
Extract data from websites efficiently and ethically.
Capabilities
1. Data Extraction
- Extract text content
- Pull structured data
- Capture tables
- Get images/media
2. Formats
- JSON output
- CSV export
- Markdown
- SQL inserts
3. Features
- Rate limiting
- Caching
- Retry logic
- Error handling
- Proxy support
4. Ethical Scraping
- Respect robots.txt
- Rate limits
- User agent rotation
- Legal compliance
Usage
Commands
scrape [URL] for [data]extract [element] from [URL]get table from [URL]crawl [website] depth [n]export [URL] to [format]
Examples
Input: "scrape example.com for product names and prices" Output:
{
"products": [
{"name": "Product A", "price": "$19.99"},
{"name": "Product B", "price": "$29.99"}
]
}
Configuration
Rate Limits
- Default: 1 request/second
- Configurable: 0.1-10 req/s
- Respect site limits
Output Options
- JSON (default)
- CSV
- Markdown
- SQL
- Custom template
Best Practices
- Always identify yourself
- Cache responses
- Handle errors gracefully
- Stay within legal bounds
- Don't overwhelm servers