firecrawl

Firecrawl API for web scraping and crawling. Use when user mentions "Firecrawl", "crawl website", "scrape site", or web extraction.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "firecrawl" with this command: npx skills add vm0-ai/vm0-skills/vm0-ai-vm0-skills-firecrawl

Firecrawl

Use the Firecrawl API via direct curl calls to scrape websites and extract data for AI.

Official docs: https://docs.firecrawl.dev/


When to Use

Use this skill when you need to:

  • Scrape a webpage and convert to markdown/HTML
  • Crawl an entire website and extract all pages
  • Discover all URLs on a website
  • Search the web and get full page content
  • Extract structured data using AI

Prerequisites

  1. Sign up at https://www.firecrawl.dev/
  2. Get your API key from the dashboard
export FIRECRAWL_TOKEN="fc-your-api-key"

How to Use

All examples below assume you have FIRECRAWL_TOKEN set.

Base URL: https://api.firecrawl.dev/v1


1. Scrape - Single Page

Extract content from a single webpage.

Basic Scrape

Write to /tmp/firecrawl_request.json:

{
  "url": "https://example.com",
  "formats": ["markdown"]
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json

Scrape with Options

Write to /tmp/firecrawl_request.json:

{
  "url": "https://docs.example.com/api",
  "formats": ["markdown"],
  "onlyMainContent": true,
  "timeout": 30000
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data.markdown'

Get HTML Instead

Write to /tmp/firecrawl_request.json:

{
  "url": "https://example.com",
  "formats": ["html"]
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data.html'

Get Screenshot

Write to /tmp/firecrawl_request.json:

{
  "url": "https://example.com",
  "formats": ["screenshot"]
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data.screenshot'

Scrape Parameters:

ParameterTypeDescription
urlstringURL to scrape (required)
formatsarraymarkdown, html, rawHtml, screenshot, links
onlyMainContentbooleanSkip headers/footers
timeoutnumberTimeout in milliseconds

2. Crawl - Entire Website

Crawl all pages of a website (async operation).

Start a Crawl

Write to /tmp/firecrawl_request.json:

{
  "url": "https://example.com",
  "limit": 50,
  "maxDepth": 2
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/crawl" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json

Response:

{
  "success": true,
  "id": "crawl-job-id-here"
}

Check Crawl Status

Replace <job-id> with the actual job ID returned from the crawl request:

curl -s "https://api.firecrawl.dev/v1/crawl/<job-id>" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" | jq '{status, completed, total}'

Get Crawl Results

Replace <job-id> with the actual job ID:

curl -s "https://api.firecrawl.dev/v1/crawl/<job-id>" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" | jq '.data[] | {url: .metadata.url, title: .metadata.title}'

Crawl with Path Filters

Write to /tmp/firecrawl_request.json:

{
  "url": "https://blog.example.com",
  "limit": 20,
  "maxDepth": 3,
  "includePaths": ["/posts/*"],
  "excludePaths": ["/admin/*", "/login"]
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/crawl" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json

Crawl Parameters:

ParameterTypeDescription
urlstringStarting URL (required)
limitnumberMax pages to crawl (default: 100)
maxDepthnumberMax crawl depth (default: 3)
includePathsarrayPaths to include (e.g., /blog/*)
excludePathsarrayPaths to exclude

3. Map - URL Discovery

Get all URLs from a website quickly.

Basic Map

Write to /tmp/firecrawl_request.json:

{
  "url": "https://example.com"
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/map" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.links[:10]'

Map with Search Filter

Write to /tmp/firecrawl_request.json:

{
  "url": "https://shop.example.com",
  "search": "product",
  "limit": 500
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/map" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.links'

Map Parameters:

ParameterTypeDescription
urlstringWebsite URL (required)
searchstringFilter URLs containing keyword
limitnumberMax URLs to return (default: 1000)

4. Search - Web Search

Search the web and get full page content.

Basic Search

Write to /tmp/firecrawl_request.json:

{
  "query": "AI news 2024",
  "limit": 5
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/search" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data[] | {title: .metadata.title, url: .url}'

Search with Full Content

Write to /tmp/firecrawl_request.json:

{
  "query": "machine learning tutorials",
  "limit": 3,
  "scrapeOptions": {
    "formats": ["markdown"]
  }
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/search" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data[] | {title: .metadata.title, content: .markdown[:500]}'

Search Parameters:

ParameterTypeDescription
querystringSearch query (required)
limitnumberNumber of results (default: 10)
scrapeOptionsobjectOptions for scraping results

5. Extract - AI Data Extraction

Extract structured data from pages using AI.

Basic Extract

Write to /tmp/firecrawl_request.json:

{
  "urls": ["https://example.com/product/123"],
  "prompt": "Extract the product name, price, and description"
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'

Extract with Schema

Write to /tmp/firecrawl_request.json:

{
  "urls": ["https://example.com/product/123"],
  "prompt": "Extract product information",
  "schema": {
    "type": "object",
    "properties": {
      "name": {"type": "string"},
      "price": {"type": "number"},
      "currency": {"type": "string"},
      "inStock": {"type": "boolean"}
    }
  }
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'

Extract from Multiple URLs

Write to /tmp/firecrawl_request.json:

{
  "urls": [
    "https://example.com/product/1",
    "https://example.com/product/2"
  ],
  "prompt": "Extract product name and price"
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'

Extract Parameters:

ParameterTypeDescription
urlsarrayURLs to extract from (required)
promptstringDescription of data to extract (required)
schemaobjectJSON schema for structured output

Practical Examples

Scrape Documentation

Write to /tmp/firecrawl_request.json:

{
  "url": "https://docs.python.org/3/tutorial/",
  "formats": ["markdown"],
  "onlyMainContent": true
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/scrape" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq -r '.data.markdown' > python-tutorial.md

Find All Blog Posts

Write to /tmp/firecrawl_request.json:

{
  "url": "https://blog.example.com",
  "search": "post"
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/map" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq -r '.links[]'

Research a Topic

Write to /tmp/firecrawl_request.json:

{
  "query": "best practices REST API design 2024",
  "limit": 5,
  "scrapeOptions": {"formats": ["markdown"]}
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/search" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data[] | {title: .metadata.title, url: .url}'

Extract Pricing Data

Write to /tmp/firecrawl_request.json:

{
  "urls": ["https://example.com/pricing"],
  "prompt": "Extract all pricing tiers with name, price, and features"
}

Then run:

curl -s -X POST "https://api.firecrawl.dev/v1/extract" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" -H "Content-Type: application/json" -d @/tmp/firecrawl_request.json | jq '.data'

Poll Crawl Until Complete

Replace <job-id> with the actual job ID:

while true; do
  STATUS="$(curl -s "https://api.firecrawl.dev/v1/crawl/<job-id>" -H "Authorization: Bearer $(printenv FIRECRAWL_TOKEN)" | jq -r '.status')"
  echo "Status: $STATUS"
  [ "$STATUS" = "completed" ] && break
  sleep 5
done

Response Format

Scrape Response

{
  "success": true,
  "data": {
  "markdown": "# Page Title\n\nContent...",
  "metadata": {
  "title": "Page Title",
  "description": "...",
  "url": "https://..."
  }
  }
}

Crawl Status Response

{
  "success": true,
  "status": "completed",
  "completed": 50,
  "total": 50,
  "data": [...]
}

Guidelines

  1. Rate limits: Add delays between requests to avoid 429 errors
  2. Crawl limits: Set reasonable limit values to control API usage
  3. Main content: Use onlyMainContent: true for cleaner output
  4. Async crawls: Large crawls are async; poll /crawl/{id} for status
  5. Extract prompts: Be specific for better AI extraction results
  6. Check success: Always check success field in responses

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

google-sheets

No summary provided by upstream source.

Repository SourceNeeds Review
246-vm0-ai
General

apify

No summary provided by upstream source.

Repository SourceNeeds Review
214-vm0-ai
General

hackernews

No summary provided by upstream source.

Repository SourceNeeds Review
170-vm0-ai
General

serpapi

No summary provided by upstream source.

Repository SourceNeeds Review
164-vm0-ai