Firecrawl Webhooks & Events
Overview
Handle Firecrawl webhooks for asynchronous crawl and scrape job notifications. Firecrawl supports native webhooks on crawl jobs via api.firecrawl.dev: when a crawl completes, partially completes, or fails, Firecrawl POSTs the results to your configured webhook URL.
Prerequisites
- Firecrawl API key stored in FIRECRAWL_API_KEY environment variable
- HTTPS endpoint accessible from the internet for webhook delivery
- Understanding of Firecrawl crawl vs scrape vs map endpoints
- Queue system for processing large crawl results
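Two of these prerequisites can be checked in code before a crawl is started. The following is a minimal sketch, not part of the Firecrawl SDK: `loadWebhookConfig` and `WebhookConfig` are hypothetical names, and the environment is passed in as a plain object so the check is easy to test.

```typescript
// Hypothetical helper: validates the API key and webhook URL prerequisites.
interface WebhookConfig {
  apiKey: string;
  webhookUrl: string;
}

function loadWebhookConfig(
  env: Record<string, string | undefined>,
  webhookUrl: string,
): WebhookConfig {
  const apiKey = env["FIRECRAWL_API_KEY"];
  if (!apiKey) {
    throw new Error("FIRECRAWL_API_KEY is not set");
  }

  // new URL() throws on malformed input, which is the behavior we want here.
  const parsed = new URL(webhookUrl);
  if (parsed.protocol !== "https:") {
    throw new Error(`Webhook URL must use HTTPS, got ${parsed.protocol}`);
  }

  return { apiKey, webhookUrl };
}
```

In an application this would typically be called once at startup with `process.env` as the first argument. It cannot confirm that the endpoint is reachable from the internet; that still has to be verified separately.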
Webhook Event Types
| Event | Trigger | Payload |
|-------|---------|---------|
| crawl.completed | Full crawl finishes | Array of page results, metadata |
| crawl.page | Individual page scraped | Single page markdown/HTML content |
| crawl.failed | Crawl job errors | Error message, partial results |
| crawl.started | Crawl begins | Job ID, configuration |
| batch_scrape.completed | Batch scrape finishes | Array of scraped pages |
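The event names in the table can be captured as a TypeScript union so that unknown or malformed deliveries are rejected before any handler runs. This is a sketch under assumptions: `WebhookEnvelope` and `parseEnvelope` are illustrative names, and the envelope fields (`type`, `id`, `data`) are inferred from the handler in Step 2 rather than taken from a published schema.

```typescript
// Event names copied from the table; extend the list if Firecrawl adds more.
const KNOWN_EVENT_TYPES = [
  "crawl.started",
  "crawl.page",
  "crawl.completed",
  "crawl.failed",
  "batch_scrape.completed",
] as const;

type FirecrawlEventType = (typeof KNOWN_EVENT_TYPES)[number];

// Assumed envelope shape: only the fields this guide's handler reads.
interface WebhookEnvelope {
  type: FirecrawlEventType;
  id: string;
  data: unknown;
}

// Returns a typed envelope, or null when the body is not a recognized event.
function parseEnvelope(body: unknown): WebhookEnvelope | null {
  if (typeof body !== "object" || body === null) return null;

  const { type, id, data } = body as Record<string, unknown>;
  if (typeof type !== "string" || typeof id !== "string") return null;

  const known = KNOWN_EVENT_TYPES as readonly string[];
  if (known.indexOf(type) === -1) return null;

  return { type: type as FirecrawlEventType, id, data };
}
```

Returning `null` instead of throwing lets the HTTP handler decide how to respond to unrecognized events, for example by logging and still acknowledging the delivery.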
Instructions
Step 1: Start a Crawl with Webhook
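import FirecrawlApp from "@mendable/firecrawl-js";

const firecrawl = new FirecrawlApp({
  apiKey: process.env.FIRECRAWL_API_KEY!,
});

// Start the crawl without waiting for it to finish; results arrive via the webhook.
const crawlResult = await firecrawl.asyncCrawlUrl("https://docs.example.com", {
  limit: 100,
  scrapeOptions: {
    formats: ["markdown", "html"],
  },
  webhook: "https://api.yourapp.com/webhooks/firecrawl",
});

// The SDK returns an error response instead of throwing when the request is rejected.
if (!crawlResult.success) {
  throw new Error(`Failed to start crawl: ${crawlResult.error}`);
}

console.log(`Crawl started: ${crawlResult.id}`);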
Step 2: Handle Webhook Events
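import express from "express";

const app = express();
app.use(express.json());

app.post("/webhooks/firecrawl", async (req, res) => {
  const { type, id, data } = req.body;

  // Acknowledge with HTTP 200 right away so slow processing does not delay the response.
  res.status(200).json({ received: true });

  // The response is already sent, so errors must be caught here rather than propagated.
  try {
    switch (type) {
      case "crawl.completed":
        await handleCrawlComplete(id, data);
        break;
      case "crawl.page":
        await handlePageScraped(id, data);
        break;
      case "crawl.failed":
        await handleCrawlFailed(id, data);
        break;
    }
  } catch (err) {
    console.error(`Failed to process ${type} for job ${id}:`, err);
  }
});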
async function handleCrawlComplete(jobId: string, data: any) {
  // Fall back to the array length when totalPages is absent (e.g. when called from polling).
  const { pages = [], totalPages = pages.length } = data;
  console.log(`Crawl ${jobId} complete: ${totalPages} pages scraped`);

  for (const page of pages) {
    await indexPage({
      url: page.metadata?.sourceURL,
      title: page.metadata?.title,
      markdown: page.markdown,
      scrapedAt: new Date(),
    });
  }
}
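The handler above acknowledges the request before doing any work, and the prerequisites call for a queue to absorb large crawl results. The sketch below shows that hand-off with a minimal in-memory FIFO. `InMemoryEventQueue` and `QueuedEvent` are hypothetical names, and an in-memory queue loses events on restart, so it stands in for whatever durable queue a real deployment would use.

```typescript
// Assumed event shape: the same fields the webhook handler destructures.
interface QueuedEvent {
  type: string;
  id: string;
  data: unknown;
}

class InMemoryEventQueue {
  private pending: QueuedEvent[] = [];

  // Adds an event and returns the new queue length.
  enqueue(event: QueuedEvent): number {
    this.pending.push(event);
    return this.pending.length;
  }

  get size(): number {
    return this.pending.length;
  }

  // Processes queued events one at a time, in arrival order.
  // A failing event is logged and skipped so it does not block the rest.
  async drain(handler: (event: QueuedEvent) => Promise<void>): Promise<number> {
    let processed = 0;
    let event = this.pending.shift();
    while (event !== undefined) {
      try {
        await handler(event);
        processed++;
      } catch (err) {
        console.error(`Dropping ${event.type} for job ${event.id}:`, err);
      }
      event = this.pending.shift();
    }
    return processed;
  }
}
```

With this in place, the route would call `enqueue` and return 200, while a separate loop or worker calls `drain` with the dispatch logic. Processing sequentially keeps page order stable at the cost of throughput.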
Step 3: Process Individual Pages
async function handlePageScraped(jobId: string, data: any) {
  const { markdown, metadata } = data;

  // Extract structured data from the scraped page
  await documentStore.upsert({
    url: metadata.sourceURL,
    title: metadata.title,
    content: markdown,
    links: metadata.links || [],
    statusCode: metadata.statusCode,
  });
}

async function handleCrawlFailed(jobId: string, data: any) {
  console.error(`Crawl ${jobId} failed: ${data.error}`);

  await alerting.send({
    severity: "high",
    message: `Firecrawl job ${jobId} failed`,
    error: data.error,
    partialResults: data.partialResults?.length || 0,
  });
}
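`handlePageScraped` reads `metadata.sourceURL` directly, which throws if a page arrives without metadata. One way to guard against that is to normalize each page into a fixed record shape before storing it. The sketch below assumes the field names used in this guide (`markdown`, `metadata.sourceURL`, `metadata.title`, `metadata.links`, `metadata.statusCode`); `ScrapedPage`, `DocumentRecord`, and `toDocumentRecord` are illustrative names.

```typescript
// Assumed page shape: every field is optional because payloads may be partial.
interface ScrapedPage {
  markdown?: string;
  metadata?: {
    sourceURL?: string;
    title?: string;
    statusCode?: number;
    links?: string[];
  };
}

interface DocumentRecord {
  url: string;
  title: string;
  content: string;
  links: string[];
  statusCode: number | null;
}

// Returns a record with defaults filled in, or null when there is no URL to key it on.
function toDocumentRecord(page: ScrapedPage): DocumentRecord | null {
  const url = page.metadata?.sourceURL;
  if (!url) return null;

  return {
    url,
    title: page.metadata?.title ?? url, // fall back to the URL as a title
    content: page.markdown ?? "",
    links: page.metadata?.links ?? [],
    statusCode: page.metadata?.statusCode ?? null,
  };
}
```

The handler can then skip pages for which `toDocumentRecord` returns `null` and pass the rest to the document store unchanged.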
Step 4: Poll as Fallback
async function pollCrawlStatus(jobId: string) {
  const status = await firecrawl.checkCrawlStatus(jobId);

  if (status.status === "completed") {
    await handleCrawlComplete(jobId, { pages: status.data });
  } else if (status.status === "scraping") {
    console.log(`Progress: ${status.completed}/${status.total} pages`);
    setTimeout(() => pollCrawlStatus(jobId), 5000); // poll again in 5 seconds
  }
}
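The fallback above polls every 5 seconds with no upper bound, so a stuck job would be polled forever. A bounded variant with exponential backoff might look like the sketch below. `nextPollDelayMs` and `pollUntilDone` are hypothetical helpers, the status fetcher is injected so the loop does not depend on the SDK, and treating every status other than `"scraping"` as terminal is an assumption.

```typescript
// Delay doubles with each attempt, starting at baseMs and capped at maxMs.
function nextPollDelayMs(attempt: number, baseMs = 5000, maxMs = 60000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Assumed status shape: only the fields the polling code in this guide reads.
interface CrawlStatusLike {
  status: string;
  completed?: number;
  total?: number;
  data?: unknown[];
}

async function pollUntilDone(
  fetchStatus: () => Promise<CrawlStatusLike>,
  maxAttempts = 20,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<CrawlStatusLike> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await fetchStatus();
    // Anything other than "scraping" is treated as a final state.
    if (status.status !== "scraping") return status;
    await sleep(nextPollDelayMs(attempt));
  }
  throw new Error(`Crawl still running after ${maxAttempts} polls`);
}
```

A caller would pass something like `() => firecrawl.checkCrawlStatus(jobId)` as `fetchStatus` and then branch on the returned status. The injectable `sleep` exists so tests can run without real delays.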
Error Handling
| Issue | Cause | Solution |
|-------|-------|----------|
| Webhook not received | URL unreachable | Verify endpoint is publicly accessible |
| Partial results | Page limit reached | Increase limit or filter with includePaths |
| Rate limited | Too many concurrent crawls | Queue crawl jobs, respect rate limits |
| Timeout on large sites | Crawl takes too long | Use maxDepth and includePaths filters |
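Two of the solutions above come down to narrowing the crawl with `limit`, `maxDepth`, and `includePaths`. The sketch below builds such an options object. `buildCrawlOptions` and `CrawlScope` are hypothetical names, the default limit of 100 simply mirrors the Step 1 example, and the pattern syntax accepted by `includePaths` should be checked against the Firecrawl API documentation.

```typescript
interface CrawlScope {
  limit?: number;
  maxDepth?: number;
  includePaths?: string[];
}

// Builds crawl options, omitting filters that were not requested.
function buildCrawlOptions(
  webhook: string,
  scope: CrawlScope = {},
): Record<string, unknown> {
  const options: Record<string, unknown> = {
    limit: scope.limit ?? 100, // same value as the Step 1 example
    webhook,
  };

  if (scope.maxDepth !== undefined) {
    options["maxDepth"] = scope.maxDepth;
  }
  if (scope.includePaths && scope.includePaths.length > 0) {
    options["includePaths"] = scope.includePaths;
  }
  return options;
}

// Example: restrict a crawl to a guides section, two levels deep.
const scopedOptions = buildCrawlOptions(
  "https://api.yourapp.com/webhooks/firecrawl",
  { maxDepth: 2, includePaths: ["guides/.*"] },
);
```

The resulting object is intended to be passed as the second argument to the crawl call from Step 1, merged with any `scrapeOptions` that are needed.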
Examples
Build Search Index from Crawl
async function buildSearchIndex(pages: any[]) {
  // Map each scraped page to a search document keyed by its source URL
  const documents = pages.map((page) => ({
    id: page.metadata.sourceURL,
    content: page.markdown,
    title: page.metadata.title,
    url: page.metadata.sourceURL,
  }));

  await searchEngine.indexBatch(documents);
  console.log(`Indexed ${documents.length} pages`);
}
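For large crawls, sending every document in a single `indexBatch` call may exceed the search engine's request size limits. A minimal sketch of splitting the work into fixed-size batches follows. `chunk` and `indexInBatches` are hypothetical helpers, the batch size of 50 is an arbitrary placeholder, and the indexing function is injected because `searchEngine` in this guide is itself a stand-in.

```typescript
// Splits an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) {
    throw new Error("size must be at least 1");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Indexes documents batch by batch and returns the number of batches sent.
async function indexInBatches<T>(
  documents: T[],
  indexBatch: (batch: T[]) => Promise<void>,
  batchSize = 50,
): Promise<number> {
  const batches = chunk(documents, batchSize);
  for (const batch of batches) {
    await indexBatch(batch);
  }
  return batches.length;
}
```

Inside `buildSearchIndex`, the single `indexBatch` call could be replaced with `indexInBatches(documents, (batch) => searchEngine.indexBatch(batch))`.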
Resources
- Firecrawl API Documentation
- Firecrawl Webhook Guide
- Firecrawl JS SDK
Next Steps
For deployment setup, see firecrawl-deploy-integration.
Output
- Configuration files or code changes applied to the project
- Validation report confirming correct implementation
- Summary of changes made and their rationale