sitemap-generator

Generate XML sitemaps by crawling a website. Use when a user needs to create a sitemap.xml for SEO, audit site structure, discover all pages on a domain, or generate a sitemap for submission to Google Search Console or other search engines. Crawls breadth-first (BFS) with configurable depth, page limits, and polite delays between requests.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "sitemap-generator" with this command: npx skills add Johnnywang2001/sitemap-generator

Sitemap Generator

Crawl any website and produce a standards-compliant XML sitemap ready for search engine submission.

Quick Start

python3 scripts/sitemap_gen.py https://example.com

Output: sitemap.xml in the current directory.

Commands

# Basic — crawl and write sitemap.xml
python3 scripts/sitemap_gen.py https://example.com

# Custom output path
python3 scripts/sitemap_gen.py https://example.com -o /tmp/sitemap.xml

# Limit crawl scope
python3 scripts/sitemap_gen.py https://example.com --max-pages 500 --max-depth 3

# Polite crawling with delay
python3 scripts/sitemap_gen.py https://example.com --delay 1.0

# Set SEO hints
python3 scripts/sitemap_gen.py https://example.com --changefreq daily --priority 0.8

# Verbose progress
python3 scripts/sitemap_gen.py https://example.com -v

# Pipe to stdout
python3 scripts/sitemap_gen.py https://example.com -o -

Options

Flag             Default       Description
--output, -o     sitemap.xml   Output file path (use - for stdout)
--max-pages      200           Maximum pages to crawl
--max-depth      5             Maximum link depth from start URL
--delay          0.2           Seconds between requests
--timeout        10            Request timeout in seconds
--changefreq     weekly        Sitemap changefreq hint
--priority       0.5           Sitemap priority hint (0.0–1.0)
--verbose, -v    off           Print crawl progress to stderr
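
To illustrate how --max-pages, --max-depth, and --delay might bound a breadth-first crawl, here is a minimal sketch. It is not the code in scripts/sitemap_gen.py: the function bfs_crawl and its get_links callback are hypothetical names introduced for this example, and the real script may count pages or depth differently.

```python
import time
from collections import deque
from typing import Callable, Iterable, List


def bfs_crawl(
    start_url: str,
    get_links: Callable[[str], Iterable[str]],  # hypothetical: returns links found on a page
    max_pages: int = 200,
    max_depth: int = 5,
    delay: float = 0.2,
) -> List[str]:
    """Visit pages breadth-first, bounded by page count and link depth."""
    seen = {start_url}                 # URLs already queued, to avoid revisits
    queue = deque([(start_url, 0)])    # (url, depth from the start URL)
    crawled: List[str] = []

    while queue and len(crawled) < max_pages:
        url, depth = queue.popleft()
        if crawled and delay > 0:
            time.sleep(delay)          # polite pause between consecutive requests
        links = get_links(url)
        crawled.append(url)
        if depth >= max_depth:
            continue                   # record the page but do not follow its links
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return crawled
```

Passing the link extractor as a callback keeps the traversal logic separate from HTTP and HTML parsing, so the limits can be tried against an in-memory link graph before pointing the crawl at a live site.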

Dependencies

pip install requests beautifulsoup4

Notes

  • Crawls only pages on the same domain as the start URL (external links are not followed)
  • Skips binary files (images, CSS, JS, PDFs, fonts)
  • Respects the delay setting to avoid overwhelming servers
  • Output conforms to the sitemaps.org 0.9 protocol
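
The notes above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the script's actual code: should_crawl, build_sitemap, and the SKIP_EXTENSIONS list are hypothetical names chosen for this example, and the real script's list of skipped file types may differ.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Illustrative extension list (assumption); the script's real list may differ.
SKIP_EXTENSIONS = (
    ".png", ".jpg", ".jpeg", ".gif", ".svg", ".css", ".js",
    ".pdf", ".woff", ".woff2", ".ttf",
)


def should_crawl(url: str, start_url: str) -> bool:
    """Keep http(s) URLs on the start URL's host that do not look like binary assets."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return False
    if parsed.netloc.lower() != urlparse(start_url).netloc.lower():
        return False
    return not parsed.path.lower().endswith(SKIP_EXTENSIONS)


def build_sitemap(urls, changefreq: str = "weekly", priority: float = 0.5) -> str:
    """Serialize URLs as a sitemaps.org 0.9 <urlset> document."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url  # ElementTree escapes & and < for us
        ET.SubElement(entry, "changefreq").text = changefreq
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

Building the document through ElementTree rather than string formatting means characters such as & in query strings are entity-escaped automatically, which the sitemap protocol requires.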
