tavily-extract

Extract clean markdown or text content from specific URLs via the Tavily CLI. Use this skill when the user has one or more URLs and wants their content, says "extract", "grab the content from", "pull the text from", "get the page at", "read this webpage", or needs clean text from web pages. Handles JavaScript-rendered pages, returns LLM-optimized markdown, and supports query-focused chunking for targeted extraction. Can process up to 20 URLs in a single call.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "tavily-extract" with this command: npx skills add tavily-ai/skills/tavily-ai-skills-tavily-extract

tavily extract

Extract clean markdown or text content from one or more URLs.

Before running any command

If tvly is not found on PATH, install it first:

curl -fsSL https://cli.tavily.com/install.sh | bash && tvly login

Do not skip this step or fall back to other tools.

See tavily-cli for alternative install methods and auth options.
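The PATH check can be made explicit with a small guard; a minimal sketch using `command -v` (the `tvly_status` variable is illustrative, and the installer line is the one shown above):

```shell
# Only run the installer when tvly is genuinely absent from PATH.
if command -v tvly >/dev/null 2>&1; then
  tvly_status="installed"
else
  tvly_status="missing"   # here you would run the curl installer, then 'tvly login'
fi
echo "tvly is $tvly_status"
```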

When to use

  • You have a specific URL and want its content
  • You need text from JavaScript-rendered pages
  • Step 2 in the workflow: search → extract → map → crawl → research

Quick start

# Single URL
tvly extract "https://example.com/article" --json

# Multiple URLs
tvly extract "https://example.com/page1" "https://example.com/page2" --json

# Query-focused extraction (returns relevant chunks only)
tvly extract "https://example.com/docs" --query "authentication API" --chunks-per-source 3 --json

# JS-heavy pages
tvly extract "https://app.example.com" --extract-depth advanced --json

# Save to file
tvly extract "https://example.com/article" -o article.md

Options

Option               Description
--query              Rerank chunks by relevance to this query
--chunks-per-source  Chunks per URL (1-5, requires --query)
--extract-depth      basic (default) or advanced (for JS pages)
--format             markdown (default) or text
--include-images     Include image URLs
--timeout            Max wait time (1-60 seconds)
-o, --output         Save output to file
--json               Structured JSON output

Extract depth

Depth     When to use
basic     Simple pages, fast; try this first
advanced  JS-rendered SPAs, dynamic content, tables

Tips

  • Max 20 URLs per request — batch larger lists into multiple calls.
  • Use --query + --chunks-per-source to get only relevant content instead of full pages.
  • Try basic first, fall back to advanced if content is missing.
  • Set --timeout for slow pages (up to 60s).
  • If search results already contain the content you need (via --include-raw-content), skip the extract step.
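The 20-URL limit from the first tip can be handled with a small wrapper loop; a sketch where `urls` is a hypothetical bash array, and the `tvly` call is only echoed so the batching logic can be shown without hitting the network:

```shell
# Split a long URL list into batches of at most 20 (the per-request limit).
urls=("https://example.com/a" "https://example.com/b" "https://example.com/c")
batch_size=20
i=0
while [ "$i" -lt "${#urls[@]}" ]; do
  batch=("${urls[@]:i:batch_size}")
  # Replace 'echo' with the real call once tvly is installed and logged in:
  echo "tvly extract ${batch[*]} --json"
  i=$((i + batch_size))
done
```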

See also

  • tavily-cli: alternative install methods and auth options
  • search and research: neighboring steps in the search → extract → map → crawl → research workflow