Spider CLI Extraction

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "spider-cli-extraction" with this command: npx skills add spider-rs/spider_skills/spider-rs-spider-skills-spider-cli-extraction

Overview

Use this skill to run Spider CLI workflows with explicit runtime mode control.

Canonical source for cross-agent behavior: skills/core/spider-cli-extraction.md

Load references/cli-workflows.md when you need exact command patterns or mode-selection rules.

Workflow

1. Confirm CLI availability.
   - Prefer cargo run -p spider_cli -- ... from the Spider repo root.
   - If spider is globally installed, use spider ... for quick checks.
2. Choose the task mode.
   - Use crawl to collect links.
   - Use scrape to emit per-page JSON records and optionally include HTML.
   - Use download to persist page markup to disk.
3. Select runtime execution mode.
   - Use --headless for browser-rendered mode.
   - Use --http to force HTTP-only mode.
   - Omit both for default HTTP behavior.
4. Add scope controls.
   - Set --limit, --depth, --budget, and --blacklist-url.
   - Add --respect-robots-txt when policy compliance is required.
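
The mode choices in steps 2 and 3 can be sketched as a small shell function. This is a minimal illustration, not part of the Spider repo: the name spider_args and its argument order are hypothetical, and the flag placement (runtime flag before the task subcommand) is assumed from the Quick Commands in this document. It only prints an argument string; it does not run the CLI.

```shell
# Hypothetical helper (not shipped with Spider): print CLI arguments
# for a given URL, task mode, and runtime mode.
#   $1 = URL
#   $2 = task mode: crawl | scrape | download
#   $3 = runtime mode: headless | http | default (optional)
spider_args() {
  url=$1
  task=$2
  runtime=${3:-default}

  case "$runtime" in
    headless) mode_flag="--headless" ;;  # browser-rendered mode
    http)     mode_flag="--http" ;;      # force HTTP-only mode
    default)  mode_flag="" ;;            # omit both flags
    *) echo "unknown runtime mode: $runtime" >&2; return 1 ;;
  esac

  case "$task" in
    crawl|scrape|download) ;;
    *) echo "unknown task mode: $task" >&2; return 1 ;;
  esac

  # ${mode_flag:+ ...} adds the flag only when one was selected.
  echo "--url $url${mode_flag:+ $mode_flag} $task"
}
```

One possible use is cargo run -p spider_cli -- $(spider_args https://example.com crawl headless). That form relies on shell word splitting, so it assumes the URL contains no spaces.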

Quick Commands

Crawl links (default HTTP mode)

cargo run -p spider_cli -- --url https://example.com crawl --output-links

Browser mode on demand

cargo run -p spider_cli -- --url https://example.com --headless crawl --output-links

Scrape with HTML output

cargo run -p spider_cli -- --url https://example.com scrape --output-html
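
The scope controls from the workflow can be combined with these commands. The sketch below is a dry run that prints a scope-limited crawl command for review instead of executing it. The values 20 and 2 are illustrative, and placing the scope flags before the subcommand is an assumption modeled on the --headless example. --budget and --blacklist-url are left out because their value formats are not given here.

```shell
# Illustrative values only; tune them for the target site.
LIMIT=20
DEPTH=2

# Assemble the arguments. Scope flags are placed before the subcommand,
# mirroring the --headless example (an assumption for these flags).
set -- --url https://example.com \
  --limit "$LIMIT" --depth "$DEPTH" --respect-robots-txt \
  crawl --output-links

# Dry run: show the command rather than running it.
cmd="cargo run -p spider_cli -- $*"
echo "$cmd"
```

Once the printed command looks right, it can be pasted and run from the Spider repo root.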

Script

Use scripts/spider_cli_helper.sh for wrappers:

./scripts/spider_cli_helper.sh verify-headless
./scripts/spider_cli_helper.sh crawl https://example.com --limit 20 --depth 2
./scripts/spider_cli_helper.sh scrape https://example.com --output-html --output-links

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
