felo-web-extract

Extract web page content from a URL using the Felo Web Extract API. Use it when users ask to scrape, capture, or fetch webpage content, extract article text from a URL, or convert a page to Markdown or plain text, or when they issue an explicit command such as /felo-web-extract. Supports html, text, and markdown output, plus a readability mode.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "felo-web-extract" with this command: npx skills add wangzhiming1999/felo-web-extract

Felo Web Extract Skill

When to Use

Trigger this skill when the user wants to:

  • Extract or scrape content from a webpage URL
  • Get article/main text from a link
  • Convert a webpage to Markdown or plain text
  • Capture readable content from a URL for summarization or processing

Trigger keywords (examples):

  • extract webpage, scrape URL, fetch page content, web extract, url to markdown
  • Explicit: /felo-web-extract, "use felo web extract"
  • Same intent in other languages (e.g. 网页抓取, 提取网页内容) also triggers this skill

Do NOT use for:

  • Real-time search or Q&A (use felo-search)
  • Generating slides (use felo-slides)
  • Local file content (read files directly)

Setup

1. Get API key

  1. Visit felo.ai
  2. Open Settings -> API Keys
  3. Create and copy your API key

2. Configure environment variable

Linux/macOS:

export FELO_API_KEY="your-api-key-here"

Windows PowerShell:

$env:FELO_API_KEY="your-api-key-here"

How to Execute

Option A: Use the bundled script or packaged CLI

Script (from repo):

node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com/article" [options]

Packaged CLI (after npm install -g felo-ai): same options, with short forms allowed:

felo web-extract -u "https://example.com" [options]
# Short forms: -u (url), -f (format), -t (timeout, seconds), -j (json)

Options:

| Option | Default | Description |
| --- | --- | --- |
| --url | (required) | Webpage URL to extract |
| --format | markdown | Output format: html, text, markdown |
| --target-selector | - | CSS selector: extract only this element (e.g. article.main, #content) |
| --wait-for-selector | - | Wait for this selector before extracting (e.g. dynamic content) |
| --readability | false | Enable readability processing (main content only) |
| --crawl-mode | fast | fast or fine |
| --timeout | 60000 (script) / 60 (CLI) | Request timeout: script uses milliseconds, CLI uses seconds (e.g. -t 90) |
| --json / -j | false | Print full API response as JSON |
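
The two timeout units are easy to mix up. The sketch below shows one way to normalize a CLI-style value in seconds to the milliseconds used by the script and the API. The helper name cliTimeoutToMs is hypothetical, and it is an assumption that the CLI performs a conversion like this internally.

```javascript
// Sketch: convert a CLI timeout (seconds) into milliseconds.
// Assumption: the packaged CLI converts its -t value before calling the API;
// cliTimeoutToMs is a hypothetical helper, not part of the skill.
function cliTimeoutToMs(seconds) {
  const value = Number(seconds);
  if (!Number.isFinite(value) || value <= 0) {
    throw new RangeError(`Invalid timeout in seconds: ${seconds}`);
  }
  return Math.round(value * 1000);
}

console.log(cliTimeoutToMs(90)); // 90000
```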

How to write instructions (target_selector + output_format)

When the user wants a specific part of the page or a specific output format, map the request to flags as follows:

  • Output format: "Extract as text" / "Get markdown" / "Return html" → use --format text, --format markdown, or --format html.
  • Target one element: "Only the main article" / "Just the content inside #main" / "Extract only article.main-content" → use --target-selector with the selector they give (e.g. #main, article.main-content, article .post), or a likely one such as "article.main" when they name none.

Examples of user intents and equivalent commands:

| User intent | Command |
| --- | --- |
| "Extract this page as plain text" | --url "..." --format text |
| "Get only the main content area" | --url "..." --target-selector "main" (or "article") |
| "Extract the div with id=content as markdown" | --url "..." --target-selector "#content" --format markdown |
| "Just the article body, as HTML" | --url "..." --target-selector "article .body" --format html |

Examples:

# Basic: extract as Markdown
node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com"

# Article-style with readability
node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com/article" --readability --format markdown

# Raw HTML, printed as the full JSON API response
node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com" --format html --json

# Only the element matching a CSS selector (e.g. main article)
node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com" --target-selector "article.main" --format markdown

# Specific output format + target selector
node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com" --target-selector "#content" --format text
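
The intent-to-flag mapping above can be sketched as a small function that turns an already-parsed intent into script arguments. The function buildArgs and the shape of its input object are hypothetical; only the flag names come from the options table.

```javascript
// Sketch: build script arguments from a parsed user intent.
// buildArgs and the intent object shape are hypothetical; the flags
// (--url, --format, --target-selector, --readability) are from the options table.
const FORMATS = ["html", "text", "markdown"];

function buildArgs({ url, format, targetSelector, readability } = {}) {
  if (!url) throw new Error("url is required");
  const args = ["--url", url];
  if (format) {
    if (!FORMATS.includes(format)) throw new Error(`Unsupported format: ${format}`);
    args.push("--format", format);
  }
  if (targetSelector) args.push("--target-selector", targetSelector);
  if (readability) args.push("--readability");
  return args;
}

// "Extract the div with id=content as markdown"
console.log(buildArgs({ url: "https://example.com", format: "markdown", targetSelector: "#content" }).join(" "));
```

Returning an array rather than a single string keeps selectors with spaces (such as article .post) intact when the arguments are passed to a process spawner.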

Option B: Call API with curl

curl -X POST "https://openapi.felo.ai/v2/web/extract" \
  -H "Authorization: Bearer $FELO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "output_format": "markdown", "with_readability": true}'
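
For callers who prefer JavaScript over curl, the same request could look roughly like the sketch below. It relies on the global fetch available in Node 18 and later. The function names buildExtractRequest and extractPage are hypothetical and are not taken from the bundled script.

```javascript
// Sketch: the curl call above expressed with fetch (global in Node 18+).
// buildExtractRequest and extractPage are hypothetical names.
function buildExtractRequest(apiKey, body, baseUrl = "https://openapi.felo.ai") {
  return {
    url: `${baseUrl}/v2/web/extract`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

async function extractPage(apiKey, body) {
  const { url, init } = buildExtractRequest(apiKey, body);
  const response = await fetch(url, init);
  return response.json(); // parsed API response: { code, message, data }
}

// Usage (performs a real network call, so it is left commented out):
// extractPage(process.env.FELO_API_KEY, {
//   url: "https://example.com",
//   output_format: "markdown",
//   with_readability: true,
// }).then((payload) => console.log(payload));
```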

API Reference (summary)

  • Endpoint: POST /v2/web/extract
  • Base URL: https://openapi.felo.ai. Override it with the FELO_API_BASE environment variable if needed.
  • Auth: Authorization: Bearer YOUR_API_KEY

Request body (JSON)

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| url | string | Yes | - | Webpage URL to extract |
| crawl_mode | string | No | fast | fast or fine |
| output_format | string | No | html | html, text, markdown |
| with_readability | boolean | No | - | Use readability (main content) |
| with_links_summary | boolean | No | - | Include links summary |
| with_images_summary | boolean | No | - | Include images summary |
| target_selector | string | No | - | CSS selector for target element |
| wait_for_selector | string | No | - | Wait for selector before extract |
| timeout | integer | No | - | Timeout in milliseconds |
| with_cache | boolean | No | true | Use cache |
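
A request body can be checked against the table above before it is sent, which avoids a round trip that would end in a 400. The sketch below is a client-side check only; validateBody is a hypothetical helper, and the server's actual validation rules may be stricter or different.

```javascript
// Sketch: client-side check of a request body against the parameter table.
// validateBody is hypothetical; the API's own validation may differ.
function validateBody(body) {
  const errors = [];
  if (typeof body.url !== "string" || body.url === "") {
    errors.push("url is required and must be a string");
  }
  if (body.crawl_mode !== undefined && !["fast", "fine"].includes(body.crawl_mode)) {
    errors.push("crawl_mode must be fast or fine");
  }
  if (body.output_format !== undefined && !["html", "text", "markdown"].includes(body.output_format)) {
    errors.push("output_format must be html, text, or markdown");
  }
  for (const key of ["with_readability", "with_links_summary", "with_images_summary", "with_cache"]) {
    if (body[key] !== undefined && typeof body[key] !== "boolean") errors.push(`${key} must be a boolean`);
  }
  for (const key of ["target_selector", "wait_for_selector"]) {
    if (body[key] !== undefined && typeof body[key] !== "string") errors.push(`${key} must be a string`);
  }
  if (body.timeout !== undefined && !Number.isInteger(body.timeout)) {
    errors.push("timeout must be an integer (milliseconds)");
  }
  return errors; // an empty array means the body matches the table
}

console.log(validateBody({ url: "https://example.com", output_format: "markdown" })); // []
```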

Response

Success (200):

{
  "code": 0,
  "message": "success",
  "data": {
    "content": { ... }
  }
}

Extracted content is in data.content; structure depends on output_format.
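
Reading the result might look like the sketch below. It assumes that any code other than 0 signals a failure, which the success example suggests but does not state. The helper name unwrapContent is hypothetical.

```javascript
// Sketch: pull data.content out of a parsed response.
// Assumption: a non-zero code means the request failed.
function unwrapContent(payload) {
  if (!payload || payload.code !== 0 || !payload.data) {
    throw new Error(`Extract failed: ${payload?.message ?? "unknown error"}`);
  }
  return payload.data.content;
}

console.log(unwrapContent({ code: 0, message: "success", data: { content: "# Title" } })); // # Title
```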

Error codes

| HTTP | Code | Description |
| --- | --- | --- |
| 400 | - | Parameter validation failed |
| 401 | INVALID_API_KEY | API key invalid or revoked |
| 500/502 | WEB_EXTRACT_FAILED | Extract failed (server or page error) |

Output Format

On success (script without --json):

  • Print the extracted content only (for direct use or piping).

With --json:

  • Print full API response including code, message, data.
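
The two output modes can be sketched as a single rendering function. Because the structure of data.content depends on output_format, the sketch falls back to JSON when the content is not a plain string; that fallback is an assumption, and renderOutput is a hypothetical name rather than the script's own code.

```javascript
// Sketch: choose between content-only output and the full JSON response.
// Assumption: non-string content is printed as JSON.
function renderOutput(payload, { json = false } = {}) {
  if (json) return JSON.stringify(payload, null, 2);
  const content = payload.data.content;
  return typeof content === "string" ? content : JSON.stringify(content, null, 2);
}

const sample = { code: 0, message: "success", data: { content: "Hello" } };
console.log(renderOutput(sample)); // Hello
```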

Error response to user:

## Web Extract Failed

- Error: <code or message>
- URL: <requested url>
- Suggestion: <e.g. check URL, retry, or use --timeout>
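
Filling in the template above could be done as in the following sketch. The per-status suggestions are illustrative assumptions that loosely follow the error code table, and both function names are hypothetical.

```javascript
// Sketch: render the failure template for the user.
// The suggestion text per HTTP status is an assumption, not API behavior.
function suggestionFor(status) {
  if (status === 400) return "check the URL and request parameters";
  if (status === 401) return "check that FELO_API_KEY is valid";
  if (status === 500 || status === 502) return "retry, or use --timeout";
  return "check the URL and retry";
}

function formatFailure({ status, code, message, url }) {
  const error = code || message || `HTTP ${status}`;
  return [
    "## Web Extract Failed",
    "",
    `- Error: ${error}`,
    `- URL: ${url}`,
    `- Suggestion: ${suggestionFor(status)}`,
  ].join("\n");
}

console.log(formatFailure({ status: 401, code: "INVALID_API_KEY", url: "https://example.com" }));
```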

Important Notes

  • Always check FELO_API_KEY before calling; if missing, return setup instructions.
  • For long articles or slow sites, raise --timeout (milliseconds for the script, seconds for the CLI) or set timeout (milliseconds) in the request body.
  • Use output_format: "markdown" and with_readability: true for clean article text.
  • The API may cache results; set with_cache: false in the request body only when fresh content is required (the script does not expose this option by default).
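
The key check in the first note can be sketched as below: return the key when it is present, otherwise return the setup steps from the Setup section. The helper name checkApiKey and its return shape are hypothetical.

```javascript
// Sketch: verify FELO_API_KEY before calling the API.
// checkApiKey and its { ok, key | message } shape are hypothetical.
function checkApiKey(env = process.env) {
  const key = (env.FELO_API_KEY || "").trim();
  if (key) return { ok: true, key };
  return {
    ok: false,
    message: [
      "FELO_API_KEY is not set.",
      "1. Visit felo.ai",
      "2. Open Settings -> API Keys",
      "3. Create and copy your API key",
      '4. Run: export FELO_API_KEY="your-api-key-here"',
    ].join("\n"),
  };
}

const keyStatus = checkApiKey();
if (!keyStatus.ok) console.log(keyStatus.message);
```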
