llms.txt Crawler Skill

This skill fetches a website's llms.txt file and crawls every page listed in it. The llms.txt format is a proposed standard that lets websites publish an LLM-friendly index of their content.

Overview

The llms.txt file typically follows this format:

    # Site Name

    ## Section Name

    - [Page Title](url): Description of the page
    - [Another Page](url): Another description

This skill parses these files and downloads all linked content.
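
The parsing step can be sketched as follows. This is a minimal illustration, not the skill's actual parser: it assumes entries are markdown list items of the form `- [Title](url): description` grouped under `##` section headings, as in the llms.txt proposal. The function name `parseLlmsTxt` and the returned field names are hypothetical.

```javascript
// Hypothetical helper: extract { section, title, url, description } entries
// from the text of an llms.txt file. Not the skill's actual implementation.
function parseLlmsTxt(text, baseUrl) {
  const entries = [];
  let section = null;
  // Matches list items such as "- [Title](https://example.com/a.md): Notes"
  const linkLine = /^\s*[-*]\s*\[([^\]]+)\]\(([^)\s]+)\)\s*(?::\s*(.*))?$/;

  for (const line of text.split(/\r?\n/)) {
    const heading = line.match(/^##\s+(.+?)\s*$/);
    if (heading) {
      section = heading[1];
      continue;
    }
    const link = line.match(linkLine);
    if (!link) continue;
    let url;
    try {
      // Resolve relative links against the site's base URL.
      url = new URL(link[2], baseUrl).href;
    } catch {
      continue; // Skip links that cannot be parsed as URLs.
    }
    entries.push({ section, title: link[1], url, description: link[3] || '' });
  }
  return entries;
}

const sample = [
  '# Site Name',
  '',
  '## Section Name',
  '',
  '- [Page Title](https://example.com/page.md): Description of the page',
  '- [Another Page](/another.md): Another description',
].join('\n');

console.log(parseLlmsTxt(sample, 'https://example.com'));
```

Relative links are resolved with the built-in `URL` class, so `/another.md` becomes `https://example.com/another.md`.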

Usage

Basic Usage

Run the crawl script with a target URL:

    cd /path/to/skills/llms-txt-crawler/scripts
    npm install  # First time only
    node crawl.js --url https://example.com

Command Line Options

| Option | Short | Description | Default |
|---|---|---|---|
| --url | -u | Base URL of the site with llms.txt | Required |
| --output | -o | Output directory for crawled files | ./output |
| --format | -f | Output format: md, json, or txt | |
| --delay | -d | Delay between requests in milliseconds | 500 |
| --concurrent | -c | Maximum concurrent requests | 3 |
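
For illustration, the options above could be parsed with Node's built-in `util.parseArgs`. This is a sketch under assumptions: `crawl.js` may well use a different argument parser, the function name `parseCliOptions` is hypothetical, and because no default for `--format` is listed, none is set here.

```javascript
const { parseArgs } = require('node:util');

// Hypothetical sketch of the option table; crawl.js may parse arguments differently.
// The `default` field of parseArgs needs Node.js 18.11 or newer.
function parseCliOptions(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      url: { type: 'string', short: 'u' },
      output: { type: 'string', short: 'o', default: './output' },
      // No default for --format is stated in the table, so none is set here.
      format: { type: 'string', short: 'f' },
      delay: { type: 'string', short: 'd', default: '500' },
      concurrent: { type: 'string', short: 'c', default: '3' },
    },
  });

  if (!values.url) throw new Error('--url is required');
  if (values.format !== undefined && !['md', 'json', 'txt'].includes(values.format)) {
    throw new Error(`Unsupported format: ${values.format}`);
  }
  const delay = Number(values.delay);
  const concurrent = Number(values.concurrent);
  if (!Number.isInteger(delay) || delay < 0) {
    throw new Error('--delay must be a non-negative integer');
  }
  if (!Number.isInteger(concurrent) || concurrent < 1) {
    throw new Error('--concurrent must be a positive integer');
  }
  return { url: values.url, output: values.output, format: values.format, delay, concurrent };
}

console.log(parseCliOptions(['--url', 'https://example.com', '-d', '1000', '-c', '2']));
```

Numeric options are read as strings and converted explicitly, so a value such as `--delay abc` is rejected instead of silently becoming `NaN`.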

Examples

Crawl agentskills.io documentation:

node crawl.js --url https://agentskills.io --output ./agentskills-docs

Crawl with custom rate limiting:

node crawl.js --url https://example.com --delay 1000 --concurrent 2
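
One way `--delay` and `--concurrent` could interact is a small worker pool, sketched below. How `crawl.js` actually schedules requests is not documented here; this sketch assumes each worker pauses for the delay between its own requests, and the name `runLimited` is hypothetical.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical sketch: run async tasks with at most `concurrent` in flight.
// Each worker waits `delayMs` before picking up its next task.
async function runLimited(tasks, { concurrent = 3, delayMs = 500 } = {}) {
  const results = new Array(tasks.length);
  let next = 0;

  async function worker() {
    while (next < tasks.length) {
      const index = next++; // No await between check and increment, so no race.
      results[index] = await tasks[index]();
      if (next < tasks.length) await sleep(delayMs);
    }
  }

  const workerCount = Math.min(concurrent, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // Results keep the order of the input tasks.
}

// Usage with stand-in tasks instead of real network requests.
const demoTasks = ['a', 'b', 'c', 'd'].map((name) => async () => `fetched ${name}`);
runLimited(demoTasks, { concurrent: 2, delayMs: 10 }).then((results) => console.log(results));
```

With `--delay 1000 --concurrent 2`, a pool like this would keep at most two requests open and space each worker's requests one second apart.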

Output as JSON:

node crawl.js --url https://example.com --format json

Output Structure

The script creates the following output structure:

    output/
    ├── llms.txt      # Original llms.txt file
    ├── index.json    # Metadata about all crawled pages
    └── pages/
        ├── page-1.md
        ├── page-2.md
        └── ...
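
As a rough sketch of how crawled entries might map onto this layout: the schema of index.json is not documented here, so the fields below (`count`, `title`, `url`, `file`) are assumptions, as is the function name `planOutput` and the idea that the file extension follows the chosen format.

```javascript
// Hypothetical sketch: assign each crawled entry a file under pages/ and
// build an index object. The real index.json schema may differ.
function planOutput(entries, outputDir = './output', format = 'md') {
  const pages = entries.map((entry, i) => ({
    title: entry.title,
    url: entry.url,
    file: `pages/page-${i + 1}.${format}`, // page-1.md, page-2.md, ...
  }));
  return {
    llmsTxtPath: `${outputDir}/llms.txt`,
    indexPath: `${outputDir}/index.json`,
    index: { count: pages.length, pages },
  };
}

const plan = planOutput([
  { title: 'Page Title', url: 'https://example.com/page.md' },
  { title: 'Another Page', url: 'https://example.com/another.md' },
]);
console.log(JSON.stringify(plan, null, 2));
```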

Error Handling

Network errors: Retries up to 3 times with exponential backoff
Rate limiting: Respects delay settings between requests
Missing pages: Logs warnings but continues crawling other pages
Invalid URLs: Skips and logs invalid URLs
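
The behaviors listed above can be sketched as follows. This is an illustration under assumptions rather than the script's real code: the helper names (`backoffDelay`, `isValidUrl`, `withRetry`, `crawlAll`) are hypothetical, the 500 ms base delay is a guess, and for simplicity every failure is retried, whereas the script may retry only network errors.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Delay before retry number `attempt` (1-based): base, 2*base, 4*base, ...
function backoffDelay(attempt, baseMs = 500) {
  return baseMs * 2 ** (attempt - 1);
}

// Returns true when `value` parses as an http(s) URL.
function isValidUrl(value) {
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false;
  }
}

// Call `task` once, then retry up to `retries` more times on failure.
async function withRetry(task, { retries = 3, baseMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      if (attempt >= retries) throw err;
      await sleep(backoffDelay(attempt + 1, baseMs));
    }
  }
}

// Skip invalid URLs, warn about pages that still fail, and keep going.
// `fetchPage` is injected so the sketch does not depend on a specific HTTP client.
async function crawlAll(urls, fetchPage, retryOptions = {}) {
  const pages = [];
  for (const url of urls) {
    if (!isValidUrl(url)) {
      console.warn(`Skipping invalid URL: ${url}`);
      continue;
    }
    try {
      pages.push({ url, body: await withRetry(() => fetchPage(url), retryOptions) });
    } catch (err) {
      console.warn(`Failed to fetch ${url}: ${err.message}`);
    }
  }
  return pages;
}
```

With a 500 ms base, the three retries would wait 500 ms, 1000 ms, and 2000 ms before the page is reported as failed.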

Integration Tips

When using this skill in an agent workflow:

1. Run the crawler first to download the content.
2. Read index.json for metadata about all crawled pages.
3. Use the downloaded markdown files for context or analysis.
