web-content-fetcher

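Web content fetcher | When a regular crawler is blocked or filtered, use an alternative service to fetch the page content. Supports: 1) r.jina.ai, the most stable; 2) markdown.new, for Cloudflare-protected sites; 3) defuddle.md, a fallback. Trigger phrases: fetch web page content (获取网页内容), convert web page to markdown (网页转markdown), content scraping (内容抓取), fetch webpage, bypass cloudflare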

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "web-content-fetcher" with this command: npx skills add mrtommywu/web-content-fetcher

Web Content Fetcher

When the regular web_fetch/web_search tools cannot retrieve a page, use an alternative service to fetch its content as Markdown.

Supported services

Priority	Service	Usage	Best for
1	r.jina.ai	`https://r.jina.ai/{url}`	Most stable, general purpose
2	markdown.new	`https://markdown.new/{url}`	Cloudflare-protected sites
3	defuddle.md	`https://defuddle.md/{url}`	Fallback
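
The URL patterns in the table can be captured in one small helper. This is a minimal sketch, assuming the method names jina, markdown, and defuddle that appear elsewhere in this document; the function name build_reader_url is hypothetical and not part of the skill.

```shell
# Map a service name to the reader URL for a target page.
# build_reader_url is a hypothetical helper, not something the skill ships.
build_reader_url() {
  case "$1" in
    jina)     printf '%s\n' "https://r.jina.ai/$2" ;;
    markdown) printf '%s\n' "https://markdown.new/$2" ;;
    defuddle) printf '%s\n' "https://defuddle.md/$2" ;;
    *)        printf 'unknown method: %s\n' "$1" >&2; return 1 ;;
  esac
}

build_reader_url jina "https://example.com"
# prints https://r.jina.ai/https://example.com
```

In each case the target URL is appended unchanged after the service prefix, which matches the `{url}` placeholder in the table.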

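Usage

Direct invocation

When you need a page's content, try the following in order:

First, try web_fetch.
If that fails or the response is filtered, call this tool.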

# Use r.jina.ai (first choice)
curl -s "https://r.jina.ai/https://example.com"

# Use markdown.new (Cloudflare-protected sites)
curl -s "https://markdown.new/https://example.com"

# Use defuddle.md (fallback)
curl -s "https://defuddle.md/https://example.com"
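
The three commands above can be chained so that each service is tried only when the previous one fails. The following is a rough sketch for sh or bash, not the skill's own implementation: the function name fetch_with_fallback and the FETCH override variable are hypothetical, and treating "curl succeeded with a non-empty body" as success is an assumption, since a service may still return an error page with a 200 status.

```shell
# Try each service in priority order until one returns a non-empty body.
# FETCH is a hypothetical hook so the command can be swapped out (for example in tests).
# -f makes curl fail on HTTP errors, -L follows redirects, --max-time caps each attempt.
FETCH="${FETCH:-curl -fsSL --max-time 30}"

fetch_with_fallback() {
  for prefix in https://r.jina.ai https://markdown.new https://defuddle.md; do
    # $FETCH is intentionally unquoted so the command and its flags split into words.
    if body="$($FETCH "${prefix}/$1")" && [ -n "$body" ]; then
      printf '%s\n' "$body"
      return 0
    fi
    printf 'failed via %s, trying next service\n' "$prefix" >&2
  done
  return 1
}
```

A call such as `fetch_with_fallback "https://example.com" > page.md` would then write the first successful result to page.md and return a non-zero status if all three services fail.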

API format

# Basic fetch
fetch_webpage <url>

# Specify a method
fetch_webpage <url> --method jina|markdown|defuddle
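
How fetch_webpage handles its arguments is not shown here. As an illustration only, the sketch below parses the same command-line shape with a hypothetical parse_fetch_args function. Defaulting to jina when --method is omitted is an assumption, based on jina being listed as the first choice.

```shell
# Hypothetical parser for: fetch_webpage <url> [--method jina|markdown|defuddle]
# Prints "<method> <url>" on success and returns 2 on a usage error.
parse_fetch_args() {
  url=""
  method="jina"   # assumed default; the document only says jina is preferred
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --method)
        [ "$#" -ge 2 ] || { echo "--method needs a value" >&2; return 2; }
        method="$2"
        shift 2 ;;
      -*) echo "unknown option: $1" >&2; return 2 ;;
      *)  url="$1"; shift ;;
    esac
  done
  [ -n "$url" ] || { echo "usage: fetch_webpage <url> [--method jina|markdown|defuddle]" >&2; return 2; }
  case "$method" in
    jina|markdown|defuddle) printf '%s %s\n' "$method" "$url" ;;
    *) echo "unsupported method: $method" >&2; return 2 ;;
  esac
}

parse_fetch_args "https://example.com" --method defuddle
# prints: defuddle https://example.com
```

Rejecting unknown method names up front keeps a typo from silently falling through to the default service.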

Example

User: Please fetch the content of https://news.example.com/article/123
Assistant: (fetches it via r.jina.ai)

Helper script

This directory includes a fetch.sh script that can be called directly:

./fetch.sh https://example.com
./fetch.sh https://example.com jina

Fetch web content without getting blocked 🌐

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
