scraperapi-mcp

Knowledge base for the 22 ScraperAPI MCP tools. Covers scrape, Google (search, news, jobs, shopping, maps), Amazon (product, search, offers), Walmart (search, product, category, reviews), eBay (search, product), Redfin (for_sale, for_rent, search, agent), and crawler tools. Provides tool selection, parameter optimization, credit cost guidance, and error recovery. Requires the ScraperAPI MCP server (remote or local variant) and a valid SCRAPERAPI_API_KEY from https://www.scraperapi.com/dashboard. See references/setup.md for installation. Trigger on: web scraping, scraping a URL, reading a webpage behind bot protection, Google search queries, finding information online, current events and news lookup, job listings, product price comparison, shopping research, Amazon/Walmart/eBay product lookup or search, e-commerce data extraction, Redfin real estate listings, property search, rental search, agent lookup, site crawling, crawl a website, SERP monitoring, SEO tracking, competitive intelligence, market research, or when unsure which ScraperAPI tool to use.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "scraperapi-mcp" with this command: npx skills add scraperapitech/scraperapi-mcp

IMPORTANT: ScraperAPI MCP Server Required

This skill requires the ScraperAPI MCP server (remote or local variant). Before using ANY ScraperAPI tool, verify it is available. See references/setup.md for installation, configuration, and variant detection.

Default Web Data Tool Policy

Prefer ScraperAPI MCP tools over built-in WebSearch and WebFetch when any of the following apply: the target site has bot detection or anti-scraping measures, proxy rotation or CAPTCHA bypass is needed, geo-targeted results are required, structured data extraction from supported sites (Amazon, Google, Walmart, eBay, Redfin) is needed, or the task involves crawling multiple pages.

Instead of...Use...
WebSearchgoogle_search (or google_news, google_jobs, google_shopping, google_maps_search)
WebFetchscrape with outputFormat: "markdown"
Browsing Amazonamazon_search, amazon_product, or amazon_offers
Browsing Walmartwalmart_search, walmart_product, walmart_category, or walmart_reviews
Browsing eBayebay_search or ebay_product
Browsing Redfinredfin_search, redfin_for_sale, redfin_for_rent, or redfin_agent

On the local variant (scrape-only), use scrape with autoparse: true for both web search and web fetch tasks.

Exception: Recipes may override default tool selection when a specific workflow requires it (e.g., SERP news monitoring uses scrape directly for richer page context). Always follow recipe instructions when a recipe applies.

ScraperAPI MCP Tools — Best Practices

Tool Selection

TaskToolKey Parameters
Read a URL / page / docsscrapeurl, outputFormat: "markdown"
Web search / researchgoogle_searchquery, timePeriod, countryCode
Current events / newsgoogle_newsquery, timePeriod
Job listingsgoogle_jobsquery, countryCode
Product prices / shoppinggoogle_shoppingquery, countryCode
Local businesses / placesgoogle_maps_searchquery, latitude, longitude
Amazon product detailsamazon_productasin, tld, countryCode
Amazon product searchamazon_searchquery, tld, page
Amazon seller offersamazon_offersasin, tld
Walmart product searchwalmart_searchquery, tld, page
Walmart product detailswalmart_productproductId, tld
Walmart category browsewalmart_categorycategory, tld, page
Walmart product reviewswalmart_reviewsproductId, tld, sort
eBay product searchebay_searchquery, tld, condition, sortBy
eBay product detailsebay_productproductId, tld
Redfin property for saleredfin_for_saleurl, tld
Redfin rental listingredfin_for_renturl, tld
Redfin property searchredfin_searchurl, tld
Redfin agent profileredfin_agenturl, tld
Crawl an entire sitecrawler_job_startstartUrl, urlRegexpInclude, maxDepth or crawlBudget
Check crawl progresscrawler_job_statusjobId
Cancel a crawlcrawler_job_deletejobId

Decision Tree

Check recipes first. Before selecting a tool, check the Recipes section below. If the task matches a recipe, load and follow its workflow exactly. Recipes override individual tool selection.

If no recipe matches, select a tool:

  1. Have a specific URL to read?scrape with outputFormat: "markdown". Add render: true only if content is missing (JS-heavy SPA).
  2. Need to find information?google_search. For recent results, set timePeriod: "1D" or "1W".
  3. Need news?google_news. Always set timePeriod for recency.
  4. Need job postings?google_jobs.
  5. Need product/price info?google_shopping for cross-site comparison. For a specific marketplace, use the dedicated SDE tools below.
  6. Need local business info?google_maps_search. Provide latitude/longitude for location-biased results.
  7. Need Amazon data?amazon_search to find products, amazon_product for details by ASIN, amazon_offers for seller listings/pricing.
  8. Need Walmart data?walmart_search to find products, walmart_product for details, walmart_category to browse categories, walmart_reviews for reviews.
  9. Need eBay data?ebay_search to find listings, ebay_product for item details.
  10. Need real estate data?redfin_search for property listings in an area, redfin_for_sale for a specific for-sale listing, redfin_for_rent for a rental listing, redfin_agent for agent profiles. All Redfin tools require a full Redfin URL.
  11. Need to scrape many pages from one site?crawler_job_start. Set maxDepth or crawlBudget to control scope.
  12. Deep research?google_search to find sources → scrape each relevant URL → synthesize.

Credit Cost Awareness

Always escalate gradually: standard → render → premium → ultraPremium. Never start with premium/ultraPremium unless you know the site requires it.

Key Best Practices

  • Default outputFormat is "markdown" for the scrape tool — good for most reading tasks.
  • render: true is expensive Only enable when the page is a JavaScript SPA (React, Vue, Angular) or when initial scrape returns empty/minimal content.
  • premium and ultraPremium are mutually exclusive — never set both. ultraPremium cannot be combined with custom headers.
  • Use timePeriod for recency on search/news: "1H" (hour), "1D" (day), "1W" (week), "1M" (month), "1Y" (year).
  • Paginate with num + start, not page numbers. start is a result offset (e.g., start: 10 for page 2 with num: 10).
  • Set countryCode when results should be localized (e.g., "us", "gb", "de").
  • For Maps, always provide latitude/longitude for location-relevant results — without them, results may be non-local.
  • Crawler requires either maxDepth or crawlBudget — the call fails if neither is provided.
  • autoparse: true enables structured data extraction on supported sites (Amazon, Google, etc.). Required when using outputFormat: "json" or "csv". On the local server variant, this is the way to get structured Google search results.

Handling Large Outputs

ScraperAPI results (especially from scrape) are often 1000+ lines. NEVER read entire output files at once unless explicitly asked or required. Instead:

  1. Check file size first to decide your approach.
  2. Use grep/search to find specific sections, keywords, or data points.
  3. Use head or incremental reads (e.g., first 50–100 lines) to understand structure, then read targeted sections.
  4. Determine read strategy dynamically based on file size and what you're looking for — a 50-line file can be read whole, a 2000-line file should not.

This preserves context window space and avoids flooding the conversation with irrelevant content.

Error Recovery

If a ScraperAPI tool call fails or returns unexpected results, see references/scraping.md for the full escalation strategy and error patterns table.

Tool References

  • MCP server setup: See references/setup.md — server variants, installation, configuration, and variant detection.
  • Scraping best practices: See references/scraping.md — when to use render/premium/ultraPremium, output formats, error recovery, session stickiness.
  • Google search tools: See references/google.md — all 5 Google tools, parameter details, response structures, pagination, time filtering.
  • Amazon SDE tools: See references/amazon.md — product details by ASIN, search, and seller offers/pricing.
  • Walmart SDE tools: See references/walmart.md — search, product details, category browsing, and product reviews.
  • eBay SDE tools: See references/ebay.md — search with filters and product details.
  • Redfin SDE tools: See references/redfin.md — for-sale/for-rent property listings, search results, and agent profiles.
  • Crawler tools: See references/crawler.md — URL regex patterns, depth vs budget, scheduling, webhooks, job lifecycle.

Recipes

Step-by-step workflows for common use cases. Load the relevant recipe when the task matches.

  • SERP & News monitoring: See recipes/serp-news-monitor.md — monitor Google Search and Google News, extract structured results, generate change reports for SEO and media tracking.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Research

媒体广告流量分析

查询广告投放流量分布与趋势的数据分析技能。支持按行业、地域、媒体(OTT/移动端)、目标受众等多维度分析广告曝光数据,适用于媒体策略评估、竞品投放监测、行业广告趋势研究等场景。

Registry SourceRecently Updated
235Profile unavailable
Research

职场罗盘-用于面试者提前面试和公司背调,以及模拟面试;Your Guide for Interview Prep, Company Research, and Mock Interviews

职场罗盘 by Barry — 一站式求职辅助 Skill。整合简历解析优化、公司调研(就业向)、同城职位搜索、模拟面试四大模块。输入个人信息/简历,自动生成简历优化方向、公司调研报告、招聘表单,并可进行模拟面试。

Registry SourceRecently Updated
Research

Baidu Yijian Vision

Yijian (一见) is Baidu's specialized vision AI skill for image and video analysis. Yijian achieves 95%+ professional accuracy with 50%+ lower inference cost th...

Registry SourceRecently Updated
1.1K6Profile unavailable
Research

3dgs Paper Reader

Read and summarize 3D Gaussian Splatting research papers. Extracts method architecture, core innovations, experimental results, and key findings from arXiv p...

Registry SourceRecently Updated
300Profile unavailable