web-search

Web search and content extraction toolkit. Use for searching documentation, facts, current information, or extracting readable content from URLs. Supports multiple providers (ddgs keyless, brave_api with key), caching, and safe defaults. Prefer this over browser-tools when no interaction is needed.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "web-search" with this command: npx skills add tmustier/web-search/tmustier-web-search-web-search

Web Search Toolkit

Search the web and extract readable content. Stable CLI with JSON output for agents.

Setup

cd {baseDir}
uv sync  # Install dependencies (once)

Optional: Set BRAVE_API_KEY for better search reliability (ddgs is keyless but flaky).

Commands

Search

{baseDir}/.venv/bin/wstk search "query"                    # Default (10 results)
{baseDir}/.venv/bin/wstk search "query" -n 5 --plain       # URLs only, one per line
{baseDir}/.venv/bin/wstk search "query" --json             # Machine-readable
{baseDir}/.venv/bin/wstk search "query" --time-range w     # Last week
{baseDir}/.venv/bin/wstk search "site:docs.python.org asyncio"  # Site-scoped

Key flags:

  • -n, --max-results <N> — Number of results (default: 10)
  • --time-range <d|w|m|y> — Filter by recency
  • --provider <ddgs|brave_api|auto> — Search provider
  • --plain — Output URLs only (for piping)
  • --json — Structured output

Pipeline (search → extract)

{baseDir}/.venv/bin/wstk pipeline "python asyncio tutorial" --json
{baseDir}/.venv/bin/wstk pipeline "python asyncio tutorial" --plan --plain

Key flags:

  • --top-k <N> — Search results to consider
  • --extract-k <N> — Number of results to extract
  • --plan — Return candidates without fetching
  • --method <http|browser|auto> — Extraction method (default: http)

Extract (fetch + extract readable content)

{baseDir}/.venv/bin/wstk extract https://example.com --plain     # Markdown output
{baseDir}/.venv/bin/wstk extract https://example.com --text      # Plain text
{baseDir}/.venv/bin/wstk extract https://example.com --json      # Full metadata
{baseDir}/.venv/bin/wstk extract ./local-file.html --plain       # From file

Key flags:

  • --markdown / --text / --both — Output format
  • --strategy <auto|readability|docs> — Extraction strategy
  • --max-chars <N> — Truncate output
  • --allow-domain <domain> — Restrict to specific domains (safety)

Fetch (raw HTTP, no extraction)

{baseDir}/.venv/bin/wstk fetch https://example.com --json        # Metadata + status
{baseDir}/.venv/bin/wstk fetch https://example.com --plain       # Path to cached body

List providers

{baseDir}/.venv/bin/wstk providers --plain

Decision Guide

  • search when you need discovery or candidate URLs.
  • pipeline when you want a one-shot search → extract bundle.
  • fetch when you need HTTP metadata or the cached body path (no extraction).
  • extract when you want readable content from a URL or local HTML.
  • render when a page is JS-only or blocked (or use extract --method browser for one-step extraction).

Common Patterns

Search → extract top result:

url=$({baseDir}/.venv/bin/wstk search "python asyncio tutorial" --plain | head -1)
{baseDir}/.venv/bin/wstk extract "$url" --plain --max-chars 8000

Search with JSON for programmatic use:

{baseDir}/.venv/bin/wstk search "openai api reference" --json | jq '.data.results[0].url'

Safe extraction (restrict domains):

{baseDir}/.venv/bin/wstk extract https://docs.python.org/3/library/asyncio.html \
  --allow-domain docs.python.org --plain

Output Formats

  • --plain — Stable text for piping (URLs for search, content for extract)
  • --json — Structured envelope: { "ok": bool, "data": {...}, "error": {...} }
  • Default — Human-readable with colors

Agent Defaults

  • Default to --json in agent wrappers; parse ok, error.code, and warnings.
  • Surface concise diagnostics by relaying error.message and error.details.reason when ok=false.
  • Use --plain only for piping, and add --no-input for non-interactive runs.
  • Consider --redact when handling sensitive URLs or content.

Exit Codes

  • 0 — Success
  • 1 — Runtime failure (network, provider error)
  • 2 — Invalid usage
  • 3 — Not found / empty result
  • 4 — Blocked / access denied
  • 5 — Needs JS rendering (page is JS-only)

Global Flags

  • --timeout <seconds> — Network timeout
  • --no-cache — Disable caching
  • --fresh — Bypass cache reads (still writes)
  • --quiet — Minimal output
  • --verbose — Debug diagnostics to stderr
  • --policy <standard|strict|permissive> — Safety defaults

References

  • references/troubleshooting.md — 403/JS-only guidance and advanced fetch flags.
  • references/providers.md — Provider selection and privacy notes.
  • docs/claude-code.md — Claude Code wrapper usage.

When to Use

  • Searching for documentation or API references
  • Looking up facts or current information
  • Extracting content from known URLs
  • Any task requiring web search without interactive browsing

Prefer browser-tools when you need: JS interaction, form filling, clicking, or visual inspection.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

web-search

No summary provided by upstream source.

Repository SourceNeeds Review
General

web-search

No summary provided by upstream source.

Repository SourceNeeds Review
-749
brave
General

web-search

No summary provided by upstream source.

Repository SourceNeeds Review