sitemap-generator

Generate XML sitemaps by crawling a website. Use when a user needs to create a sitemap.xml for SEO, audit site structure, discover all pages on a domain, or generate a sitemap for submission to Google Search Console or other search engines. Handles BFS crawling with configurable depth, page limits, and polite delays.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Install skill "sitemap-generator" with this command: npx skills add johnnywang2001/sitemap-generator

Sitemap Generator

Crawl any website and produce a standards-compliant XML sitemap ready for search engine submission.

Quick Start

python3 scripts/sitemap_gen.py https://example.com

Output: sitemap.xml in the current directory.
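
For orientation, here is a minimal sketch of what a sitemaps.org 0.9 document looks like when built with Python's standard library. The function name `build_sitemap` and its parameters are hypothetical and chosen for illustration; the actual script may construct its output differently.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org 0.9 protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, changefreq="weekly", priority=0.5):
    """Return a sitemap document as UTF-8 encoded bytes (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes characters such as & in text automatically.
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "changefreq").text = changefreq
        ET.SubElement(entry, "priority").text = f"{priority:.1f}"
    # xml_declaration requires Python 3.8 or newer.
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap(["https://example.com/", "https://example.com/about"])
    print(xml_bytes.decode("utf-8"))
```

Each crawled page becomes one `url` element with `loc`, `changefreq`, and `priority` children, which is the shape search engines expect from a sitemap.xml file.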

Commands

# Basic — crawl and write sitemap.xml
python3 scripts/sitemap_gen.py https://example.com

# Custom output path
python3 scripts/sitemap_gen.py https://example.com -o /tmp/sitemap.xml

# Limit crawl scope
python3 scripts/sitemap_gen.py https://example.com --max-pages 500 --max-depth 3

# Polite crawling with delay
python3 scripts/sitemap_gen.py https://example.com --delay 1.0

# Set SEO hints
python3 scripts/sitemap_gen.py https://example.com --changefreq daily --priority 0.8

# Verbose progress
python3 scripts/sitemap_gen.py https://example.com -v

# Write to stdout (for piping into another command)
python3 scripts/sitemap_gen.py https://example.com -o -

Options

Flag            Default      Description
--output, -o    sitemap.xml  Output file path (use - for stdout)
--max-pages     200          Maximum pages to crawl
--max-depth     5            Maximum link depth from start URL
--delay         0.2          Seconds between requests
--timeout       10           Request timeout in seconds
--changefreq    weekly       Sitemap changefreq hint
--priority      0.5          Sitemap priority hint (0.0–1.0)
--verbose, -v   off          Print crawl progress to stderr
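
The flags and defaults above could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the script's actual source: the function name `build_parser` is hypothetical, and whether the real script restricts `--changefreq` to the protocol's allowed values is not stated in this document.

```python
import argparse

# Values permitted for <changefreq> by the sitemaps.org protocol.
CHANGEFREQ_VALUES = ["always", "hourly", "daily", "weekly", "monthly", "yearly", "never"]


def build_parser():
    """Build a parser mirroring the documented flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Crawl a site and write an XML sitemap.")
    parser.add_argument("url", help="Start URL to crawl")
    parser.add_argument("--output", "-o", default="sitemap.xml",
                        help="Output file path (use - for stdout)")
    parser.add_argument("--max-pages", type=int, default=200,
                        help="Maximum pages to crawl")
    parser.add_argument("--max-depth", type=int, default=5,
                        help="Maximum link depth from start URL")
    parser.add_argument("--delay", type=float, default=0.2,
                        help="Seconds between requests")
    parser.add_argument("--timeout", type=float, default=10,
                        help="Request timeout in seconds")
    # Assumption: validating against the protocol's list; the real script may not.
    parser.add_argument("--changefreq", default="weekly", choices=CHANGEFREQ_VALUES,
                        help="Sitemap changefreq hint")
    parser.add_argument("--priority", type=float, default=0.5,
                        help="Sitemap priority hint (0.0 to 1.0)")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="Print crawl progress to stderr")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["https://example.com", "--max-pages", "500", "--max-depth", "3"]
    )
    print(args)
```

Flags that are not passed fall back to the defaults listed in the table, so the example invocation overrides only the page and depth limits.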

Dependencies

pip install requests beautifulsoup4

Notes

  • Crawls only pages on the same domain as the start URL; external links are not followed
  • Skips non-HTML resources (images, CSS, JS, PDFs, fonts)
  • Waits --delay seconds between requests to avoid overwhelming servers
  • Output conforms to the sitemaps.org 0.9 protocol
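
The crawl behavior described in these notes can be sketched as a breadth-first traversal. This is an illustration under assumptions rather than the script's real code: `crawl`, `get_links`, `SKIP_EXTENSIONS`, and the in-memory `SITE` mapping are all hypothetical. In practice `get_links` would presumably fetch a page with `requests` and extract hrefs with `beautifulsoup4`, sleeping for the configured delay between requests.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse

# Hypothetical skip list; the real script's list of extensions may differ.
SKIP_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".css", ".js",
                   ".pdf", ".woff", ".woff2", ".ttf")


def crawl(start_url, get_links, max_pages=200, max_depth=5):
    """Breadth-first crawl; get_links(url) returns the hrefs found on that page."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    pages = []
    while queue and len(pages) < max_pages:
        url, depth = queue.popleft()
        pages.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        # A real crawler would call time.sleep(delay) before each fetch.
        for href in get_links(url):
            link, _fragment = urldefrag(urljoin(url, href))
            parsed = urlparse(link)
            if parsed.scheme not in ("http", "https"):
                continue  # ignore mailto:, javascript:, and similar
            if parsed.netloc != host:
                continue  # same-domain pages only
            if parsed.path.lower().endswith(SKIP_EXTENSIONS):
                continue  # skip non-HTML resources
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages


# Stand-in for a real website: page URL -> hrefs on that page.
SITE = {
    "https://example.com/": ["/about", "/logo.png", "https://other.example/x", "/blog"],
    "https://example.com/about": ["/", "/team"],
    "https://example.com/blog": ["/blog/post-1#comments"],
}

if __name__ == "__main__":
    for page in crawl("https://example.com/", lambda u: SITE.get(u, [])):
        print(page)
```

Using a queue means pages are visited in order of link depth, so when the page limit is reached the pages closest to the start URL have already been collected.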
