tavily-extract

Extract content from specific URLs using Tavily's extraction API. Returns clean markdown/text from web pages.

Install the skill with:

npx skills add matthew77/liang-tavily-extract

Tavily Extract

Extract clean content from specific URLs. Ideal when you know which pages you want content from.

Authentication

Get an API key at https://tavily.com and add it to your OpenClaw config:

{
  "skills": {
    "entries": {
      "tavily-extract": {
        "enabled": true,
        "apiKey": "tvly-YOUR_API_KEY_HERE"
      }
    }
  }
}

Or set it as an environment variable:

export TAVILY_API_KEY="tvly-YOUR_API_KEY_HERE"
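
Before running the script, it can help to confirm that the key is actually visible to Node. The following is a minimal sketch, not part of the skill: checkApiKey is a hypothetical helper, and the tvly- prefix check is an assumption based only on the placeholder shown above.

```javascript
// Hypothetical helper, not shipped with the skill.
// Assumption: real keys start with "tvly-", as the placeholder above suggests.
function checkApiKey(env) {
  const key = (env.TAVILY_API_KEY || "").trim();
  if (key === "") {
    return { ok: false, reason: "TAVILY_API_KEY is not set" };
  }
  if (key === "tvly-YOUR_API_KEY_HERE") {
    return { ok: false, reason: "TAVILY_API_KEY still holds the placeholder value" };
  }
  if (!key.startsWith("tvly-")) {
    return { ok: false, reason: "TAVILY_API_KEY does not start with tvly-" };
  }
  return { ok: true, reason: "" };
}

// Report the state of the current environment.
console.log(checkApiKey(process.env));
```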

Quick Start

Using the Script

node {baseDir}/scripts/extract.mjs "https://example.com/article"
node {baseDir}/scripts/extract.mjs "url1,url2,url3"
node {baseDir}/scripts/extract.mjs "url" --query "authentication API"

Examples

# Single URL
node {baseDir}/scripts/extract.mjs "https://docs.python.org/3/tutorial/classes.html"

# Multiple URLs
node {baseDir}/scripts/extract.mjs "https://example.com/page1,https://example.com/page2"

# With query focus
node {baseDir}/scripts/extract.mjs "https://example.com/docs" --query "authentication API"

# Advanced extraction for JS pages
node {baseDir}/scripts/extract.mjs "https://app.example.com" --depth advanced --timeout 60

Options

Option           Description                            Default
--query <text>   Rerank chunks by relevance             -
--chunks <n>     Chunks per URL (1-5, requires query)   3
--depth <mode>   Extract depth: basic or advanced       basic
--format <fmt>   Output format: markdown or text        markdown
--timeout <sec>  Max wait time (1-60 seconds)           varies
--json           Output raw JSON                        false
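
When the script is driven from another program, the documented constraints can be checked before the command is launched. This is a rough sketch under stated assumptions: buildExtractArgs and its opts object are hypothetical names, and the checks encode only the ranges listed in the table above, not the script's actual validation.

```javascript
// Hypothetical helper: turns URLs plus options into an argument list for extract.mjs.
// It mirrors the documented limits only; the real script may validate differently.
function buildExtractArgs(urls, opts = {}) {
  if (!Array.isArray(urls) || urls.length === 0) {
    throw new Error("at least one URL is required");
  }
  const args = [urls.join(",")]; // multiple URLs are passed comma-separated

  if (opts.query !== undefined) args.push("--query", String(opts.query));

  if (opts.chunks !== undefined) {
    if (opts.query === undefined) throw new Error("--chunks requires --query");
    if (!Number.isInteger(opts.chunks) || opts.chunks < 1 || opts.chunks > 5) {
      throw new Error("--chunks must be an integer from 1 to 5");
    }
    args.push("--chunks", String(opts.chunks));
  }

  if (opts.depth !== undefined) {
    if (!["basic", "advanced"].includes(opts.depth)) {
      throw new Error("--depth must be basic or advanced");
    }
    args.push("--depth", opts.depth);
  }

  if (opts.format !== undefined) {
    if (!["markdown", "text"].includes(opts.format)) {
      throw new Error("--format must be markdown or text");
    }
    args.push("--format", opts.format);
  }

  if (opts.timeout !== undefined) {
    if (!Number.isFinite(opts.timeout) || opts.timeout < 1 || opts.timeout > 60) {
      throw new Error("--timeout must be between 1 and 60 seconds");
    }
    args.push("--timeout", String(opts.timeout));
  }

  if (opts.json) args.push("--json");
  return args;
}

console.log(buildExtractArgs(["https://example.com/docs"], { query: "authentication API", chunks: 2 }));
```

The returned array could then be handed to Node's child_process.execFile together with the path that {baseDir}/scripts/extract.mjs resolves to on your machine.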

Extract Depth

Depth      When to Use
basic      Simple text extraction, faster
advanced   Dynamic/JS-rendered pages, tables, structured data
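
One way to combine the two depths is to try basic first and retry with advanced only when nothing usable comes back. The sketch below illustrates that idea and is not the skill's own logic. runExtract is a hypothetical function you would supply (for example, a wrapper that runs the script with --json and parses the output), and the results and raw_content field names are assumptions about the shape of that JSON.

```javascript
// Hypothetical sketch: runExtract(url, depth) is supplied by the caller and
// resolves to the parsed JSON response for that URL.
async function extractWithFallback(url, runExtract) {
  const basic = await runExtract(url, "basic");
  if (hasContent(basic)) {
    return { depth: "basic", response: basic };
  }
  // Basic returned nothing usable, so pay the extra cost of advanced.
  const advanced = await runExtract(url, "advanced");
  return { depth: "advanced", response: advanced };
}

// Assumption: each result carries its extracted text in a raw_content string.
function hasContent(response) {
  const results = (response && response.results) || [];
  return results.some(
    (r) => typeof r.raw_content === "string" && r.raw_content.trim() !== ""
  );
}
```

Injecting runExtract keeps the retry policy separate from how the extraction is actually invoked, so the same function works with the script or with a direct API call.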

Tips

  • Max 20 URLs per request - batch larger lists
  • Use --query + --chunks to get only relevant content
  • Try basic first, fall back to advanced if content is missing
  • Set longer --timeout for slow pages (up to 60s)
  • Check failed_results in JSON output for URLs that couldn't be extracted
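
The batching and failure-checking tips can be sketched as two small helpers. Both names are hypothetical, and failedUrls assumes each failed_results entry is either a URL string or an object with a url field, which the source does not specify.

```javascript
// Limit taken from the tip above: at most 20 URLs per request.
const MAX_URLS_PER_REQUEST = 20;

// Split a long URL list into request-sized batches.
function batchUrls(urls, size = MAX_URLS_PER_REQUEST) {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("batch size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}

// Collect the URLs that could not be extracted from a parsed --json response.
// Assumption: entries are strings or objects with a url field.
function failedUrls(response) {
  const failed = (response && response.failed_results) || [];
  return failed.map((entry) => (typeof entry === "string" ? entry : entry.url));
}
```

Each batch can be joined with commas and passed to the script as a single argument; the URLs reported by failedUrls are candidates for a retry with --depth advanced or a longer --timeout.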
