felo-web-extract

Extract web page content from a URL using the Felo Web Extract API. Use when users ask to scrape, capture, or fetch webpage content, extract article text from a URL, or convert a page to markdown or text, or when explicit commands like /felo-web-extract are used. Supports html, text, and markdown output, plus a readability mode.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "felo-web-extract" with this command: npx skills add wangzhiming1999/felo-web-extract

Felo Web Extract Skill

When to Use

Trigger this skill when the user wants to:

  • Extract or scrape content from a webpage URL
  • Get article/main text from a link
  • Convert a webpage to Markdown or plain text
  • Capture readable content from a URL for summarization or processing

Trigger keywords (examples):

  • extract webpage, scrape URL, fetch page content, web extract, url to markdown
  • Explicit: /felo-web-extract, "use felo web extract"
  • Same intent in other languages also triggers this skill (e.g. 网页抓取, "web scraping"; 提取网页内容, "extract web page content")

Do NOT use for:

  • Real-time search or Q&A (use felo-search)
  • Generating slides (use felo-slides)
  • Local file content (read files directly)

Setup

1. Get API key

  1. Visit felo.ai
  2. Open Settings -> API Keys
  3. Create and copy your API key

2. Configure environment variable

Linux/macOS:

export FELO_API_KEY="your-api-key-here"

Windows PowerShell:

$env:FELO_API_KEY="your-api-key-here"

How to Execute

Option A: Use the bundled script or packaged CLI

Script (from repo):

node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com/article" [options]

Packaged CLI (after npm install -g felo-ai): same options, with short forms allowed:

felo web-extract -u "https://example.com" [options]
# Short forms: -u (url), -f (format), -t (timeout, seconds), -j (json)

Options:

| Option | Default | Description |
|---|---|---|
| --url | (required) | Webpage URL to extract |
| --format | markdown | Output format: html, text, markdown |
| --target-selector | - | CSS selector: extract only this element (e.g. article.main, #content) |
| --wait-for-selector | - | Wait for this selector before extracting (e.g. dynamic content) |
| --readability | false | Enable readability processing (main content only) |
| --crawl-mode | fast | fast or fine |
| --timeout | 60000 (script) / 60 (CLI) | Request timeout: script uses milliseconds, CLI uses seconds (e.g. -t 90) |
| --json / -j | false | Print full API response as JSON |
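
To make the unit difference between the script and the CLI concrete, here is a minimal sketch of how parsed options might map onto the API request body. The name toRequestBody and its source argument are hypothetical, and the option-to-field mapping is inferred from the option names. It is not taken from the bundled script, which may behave differently.

```javascript
// Hypothetical helper: maps parsed command-line options to the JSON body
// for POST /v2/web/extract. The mapping is an assumption based on matching
// option names to request-body fields.
function toRequestBody(options, source = "script") {
  if (!options.url) {
    throw new Error("--url is required");
  }
  const body = {
    url: options.url,
    output_format: options.format ?? "markdown",
    crawl_mode: options.crawlMode ?? "fast",
    with_readability: options.readability ?? false,
  };
  if (options.targetSelector) body.target_selector = options.targetSelector;
  if (options.waitForSelector) body.wait_for_selector = options.waitForSelector;

  // The script takes milliseconds (default 60000); the CLI takes seconds
  // (default 60). The API field is in milliseconds, so CLI values are scaled.
  const timeout = options.timeout ?? (source === "cli" ? 60 : 60000);
  body.timeout = source === "cli" ? timeout * 1000 : timeout;
  return body;
}

const cliBody = toRequestBody({ url: "https://example.com", timeout: 90 }, "cli");
console.log(cliBody.timeout); // 90000
```

Keeping the conversion in one place avoids sending a seconds value where the API expects milliseconds.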

How to write instructions (target_selector + output_format)

When the user wants a specific part of the page or a specific output format, phrase the command like this:

  • Output format: "Extract as text" / "Get markdown" / "Return html" → use --format text, --format markdown, or --format html.
  • Target one element: "Only the main article" / "Just the content inside #main" / "Extract only article.main-content" → use --target-selector "article.main" or the selector they give (e.g. #main, .main-content, article .post).

Examples of user intents and equivalent commands:

| User intent | Command |
|---|---|
| "Extract this page as plain text" | --url "..." --format text |
| "Get only the main content area" | --url "..." --target-selector "main" (or --target-selector "article") |
| "Extract the div with id=content as markdown" | --url "..." --target-selector "#content" --format markdown |
| "Just the article body, as HTML" | --url "..." --target-selector "article .body" --format html |

Examples:

# Basic: extract as Markdown
node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com"

# Article-style with readability
node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com/article" --readability --format markdown

# Raw HTML
node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com" --format html --json

# Only the element matching a CSS selector (e.g. main article)
node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com" --target-selector "article.main" --format markdown

# Specific output format + target selector
node felo-web-extract/scripts/run_web_extract.mjs --url "https://example.com" --target-selector "#content" --format text

Option B: Call API with curl

curl -X POST "https://openapi.felo.ai/v2/web/extract" \
  -H "Authorization: Bearer $FELO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "output_format": "markdown", "with_readability": true}'

API Reference (summary)

  • Endpoint: POST /v2/web/extract
  • Base URL: https://openapi.felo.ai. Override it with the FELO_API_BASE environment variable if needed.
  • Auth: Authorization: Bearer YOUR_API_KEY
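
The endpoint, base URL override, and auth header can be combined in a small Node.js sketch. The function names buildExtractRequest and extractPage are hypothetical, and the code assumes Node.js 18 or later, where fetch is available globally.

```javascript
// Hypothetical helper: builds the URL and fetch() options for
// POST /v2/web/extract from a request body and the environment.
function buildExtractRequest(body, env = process.env) {
  if (!env.FELO_API_KEY) {
    throw new Error("FELO_API_KEY is not set");
  }
  // FELO_API_BASE overrides the default base URL when present.
  // Trailing slashes are trimmed to avoid a double slash in the path.
  const base = (env.FELO_API_BASE || "https://openapi.felo.ai").replace(/\/+$/, "");
  return {
    url: `${base}/v2/web/extract`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${env.FELO_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Sends the request and parses the JSON response.
async function extractPage(body, env = process.env) {
  const { url, init } = buildExtractRequest(body, env);
  const response = await fetch(url, init);
  return response.json();
}
```

Calling extractPage({ url: "https://example.com", output_format: "markdown", with_readability: true }) sends the same request body as the curl command above. Splitting out buildExtractRequest keeps the header and URL logic testable without network access.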

Request body (JSON)

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| url | string | Yes | - | Webpage URL to extract |
| crawl_mode | string | No | fast | fast or fine |
| output_format | string | No | html | html, text, markdown |
| with_readability | boolean | No | - | Use readability (main content) |
| with_links_summary | boolean | No | - | Include links summary |
| with_images_summary | boolean | No | - | Include images summary |
| target_selector | string | No | - | CSS selector for target element |
| wait_for_selector | string | No | - | Wait for selector before extract |
| timeout | integer | No | - | Timeout in milliseconds |
| with_cache | boolean | No | true | Use cache |
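
A client-side check that mirrors the parameter table can catch obvious mistakes before a request is sent. The sketch below is illustrative only: validateBody is a hypothetical name, the API performs its own validation, and whether the server rejects parameters not listed here is not known.

```javascript
// Expected JavaScript type for each documented request-body parameter.
const FIELD_TYPES = {
  url: "string",
  crawl_mode: "string",
  output_format: "string",
  with_readability: "boolean",
  with_links_summary: "boolean",
  with_images_summary: "boolean",
  target_selector: "string",
  wait_for_selector: "string",
  timeout: "number",
  with_cache: "boolean",
};

// Documented enumerations.
const ALLOWED_VALUES = {
  crawl_mode: ["fast", "fine"],
  output_format: ["html", "text", "markdown"],
};

// Hypothetical helper: returns a list of problems, empty when none are found.
function validateBody(body) {
  const errors = [];
  if (typeof body.url !== "string" || body.url === "") {
    errors.push("url is required and must be a string");
  }
  for (const [key, value] of Object.entries(body)) {
    if (key === "url") continue; // already checked above
    const expected = FIELD_TYPES[key];
    if (!expected) {
      errors.push(`${key} is not a documented parameter`);
    } else if (typeof value !== expected) {
      errors.push(`${key} must be a ${expected}`);
    } else if (ALLOWED_VALUES[key] && !ALLOWED_VALUES[key].includes(value)) {
      errors.push(`${key} must be one of: ${ALLOWED_VALUES[key].join(", ")}`);
    } else if (key === "timeout" && !Number.isInteger(value)) {
      errors.push("timeout must be an integer (milliseconds)");
    }
  }
  return errors;
}
```

For example, validateBody({ url: "https://example.com", output_format: "pdf" }) returns one error naming the allowed formats.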

Response

Success (200):

{
  "code": 0,
  "message": "success",
  "data": {
    "content": { ... }
  }
}

Extracted content is in data.content; structure depends on output_format.
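
Reading the payload could look like the sketch below. The name getContent is hypothetical, and treating any non-zero code as a failure is an assumption based on the success example, where code is 0. The placeholder object in the sample stands in for the real content, whose shape is not specified here.

```javascript
// Hypothetical helper: returns data.content from a parsed API response,
// or throws with the response message when the call did not succeed.
function getContent(response) {
  if (!response || response.code !== 0) {
    const reason = response?.message ?? "unknown error";
    throw new Error(`Web extract failed: ${reason}`);
  }
  if (!response.data || response.data.content === undefined) {
    throw new Error("Response has no data.content");
  }
  return response.data.content;
}

// Placeholder response; the real content structure depends on output_format.
const sampleResponse = {
  code: 0,
  message: "success",
  data: { content: { placeholder: true } },
};
const content = getContent(sampleResponse);
console.log(typeof content); // object
```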

Error codes

| HTTP | Code | Description |
|---|---|---|
| 400 | - | Parameter validation failed |
| 401 | INVALID_API_KEY | API key invalid or revoked |
| 500/502 | WEB_EXTRACT_FAILED | Extract failed (server or page error) |
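
The table can be turned into a lookup that pairs each documented status with a next step. This is a sketch: describeError is a hypothetical name, and the suggestion texts are illustrative wording, not messages returned by the API.

```javascript
// Hypothetical helper: maps an HTTP status (and optional API error code)
// to the documented description plus a suggested follow-up.
function describeError(httpStatus, code) {
  switch (httpStatus) {
    case 400:
      return {
        error: code ?? "HTTP 400",
        description: "Parameter validation failed",
        suggestion: "Check the URL and request parameters.",
      };
    case 401:
      return {
        error: code ?? "INVALID_API_KEY",
        description: "API key invalid or revoked",
        suggestion: "Check FELO_API_KEY or create a new key.",
      };
    case 500:
    case 502:
      return {
        error: code ?? "WEB_EXTRACT_FAILED",
        description: "Extract failed (server or page error)",
        suggestion: "Retry, or increase the timeout.",
      };
    default:
      // Statuses outside the table are passed through without interpretation.
      return {
        error: code ?? `HTTP ${httpStatus}`,
        description: "Undocumented error",
        suggestion: "Retry later.",
      };
  }
}

console.log(describeError(401).error); // INVALID_API_KEY
```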

Output Format

On success (script without --json):

  • Print the extracted content only (for direct use or piping).

With --json:

  • Print full API response including code, message, data.

Error response to user:

## Web Extract Failed

- Error: <code or message>
- URL: <requested url>
- Suggestion: <e.g. check URL, retry, or use --timeout>
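
A small formatter keeps the failure message consistent with the template above. The name formatFailure and the sample values are hypothetical.

```javascript
// Hypothetical helper: fills the "Web Extract Failed" template.
function formatFailure({ error, url, suggestion }) {
  return [
    "## Web Extract Failed",
    "",
    `- Error: ${error}`,
    `- URL: ${url}`,
    `- Suggestion: ${suggestion}`,
  ].join("\n");
}

console.log(
  formatFailure({
    error: "WEB_EXTRACT_FAILED",
    url: "https://example.com/article",
    suggestion: "Retry, or use --timeout with a larger value",
  })
);
```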

Important Notes

  • Always check FELO_API_KEY before calling; if missing, return setup instructions.
  • For long articles or slow sites, raise --timeout (milliseconds for the script, seconds for the CLI) or set timeout (milliseconds) in the request body.
  • Use output_format: "markdown" and with_readability: true for clean article text.
  • API may cache results; use with_cache: false in body only when fresh content is required (script does not expose this by default).
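
The key check and the recommended request settings from these notes can be sketched as two small helpers. Both names, checkApiKey and articleBody, are hypothetical, as is the fresh option. The setup text simply repeats the steps from the Setup section.

```javascript
// Hypothetical pre-flight check: returns null when the key is present,
// otherwise the setup instructions to show the user.
function checkApiKey(env = process.env) {
  if (env.FELO_API_KEY && env.FELO_API_KEY.trim() !== "") {
    return null;
  }
  return [
    "FELO_API_KEY is not set.",
    "1. Visit felo.ai",
    "2. Open Settings -> API Keys",
    "3. Create and copy your API key",
    '4. Linux/macOS: export FELO_API_KEY="your-api-key-here"',
    '   Windows PowerShell: $env:FELO_API_KEY="your-api-key-here"',
  ].join("\n");
}

// Request body for clean article text. with_cache: false is added only
// when fresh content is required; otherwise the API default applies.
function articleBody(url, { fresh = false } = {}) {
  const body = { url, output_format: "markdown", with_readability: true };
  if (fresh) body.with_cache = false;
  return body;
}
```

A caller would run checkApiKey() first and show the returned text instead of calling the API when it is not null.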
