Documentation Discovery & Analysis

Overview

Discover and analyze technical documentation using four strategies, tried in this order:

llms.txt-first: Search for standardized AI-friendly documentation
Repository analysis: Use Repomix to analyze GitHub repositories
Parallel exploration: Deploy multiple Explorer agents for comprehensive coverage
Fallback research: Use Researcher agents when the other methods are unavailable

Core Workflow

Phase 1: Initial Discovery

Identify target

Extract library/framework name from user request
Note version requirements (default: latest)
Clarify scope if ambiguous
Identify whether the target is a GitHub repository or a website

Search for llms.txt (PRIORITIZE context7.com)

First: Try context7.com patterns

For GitHub repositories:

Pattern: https://context7.com/{org}/{repo}/llms.txt

For websites:

Pattern: https://context7.com/websites/{normalized-domain-path}/llms.txt

Topic-specific searches (when user asks about specific feature):

Pattern: https://context7.com/{path}/llms.txt?topic={query}
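
The three context7.com patterns above can be sketched as small URL builders. This is a minimal illustration, not an official client: the function names are hypothetical, the website path is assumed to arrive already normalized (the normalization rule is not specified here), and the topic is assumed to be sent as a standard URL-encoded query string.

```python
from urllib.parse import quote, urlencode

CONTEXT7_BASE = "https://context7.com"


def repo_llms_url(org: str, repo: str) -> str:
    """Build the llms.txt URL for a GitHub repository ({org}/{repo})."""
    return f"{CONTEXT7_BASE}/{quote(org)}/{quote(repo)}/llms.txt"


def website_llms_url(normalized_domain_path: str) -> str:
    """Build the llms.txt URL for a website.

    The caller supplies an already-normalized path; how a domain is
    normalized is left outside this sketch.
    """
    return f"{CONTEXT7_BASE}/websites/{quote(normalized_domain_path)}/llms.txt"


def with_topic(llms_url: str, topic: str) -> str:
    """Append ?topic=... for a feature-specific search."""
    query = urlencode({"topic": topic})
    return f"{llms_url}?{query}"
```

For a feature-specific request, the builders compose: `with_topic(repo_llms_url("shadcn-ui", "ui"), "date picker")`.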

Fallback: Traditional llms.txt search

WebSearch: "[library name] llms.txt site:[docs domain]"

Common patterns:

https://docs.[library].com/llms.txt
https://[library].dev/llms.txt
https://[library].io/llms.txt

→ Found? Proceed to Phase 2
→ Not found? Proceed to Phase 3
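
The fallback search and the Phase 2 / Phase 3 decision can be sketched as follows. The helper names are hypothetical, lowercasing the library name is an assumption, and `exists` is an injected probe (in practice a WebFetch call or an HTTP request) so the logic stays testable without network access.

```python
from typing import Callable, Iterable, List, Optional


def fallback_candidates(library: str) -> List[str]:
    """Expand the three common llms.txt patterns for a library name."""
    name = library.strip().lower()  # assumption: domains use the lowercase name
    return [
        f"https://docs.{name}.com/llms.txt",
        f"https://{name}.dev/llms.txt",
        f"https://{name}.io/llms.txt",
    ]


def first_available(urls: Iterable[str], exists: Callable[[str], bool]) -> Optional[str]:
    """Return the first URL the probe accepts, or None if none is reachable."""
    for url in urls:
        if exists(url):
            return url
    return None


def next_phase(llms_url: Optional[str]) -> str:
    """Found -> Phase 2, not found -> Phase 3."""
    if llms_url:
        return "Phase 2: llms.txt Processing"
    return "Phase 3: Repository Analysis"
```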

Phase 2: llms.txt Processing

Single URL:

WebFetch to retrieve content
Extract and present information

Multiple URLs (4+; see Agent Distribution Guidelines):

CRITICAL: Launch multiple Explorer agents in parallel
One agent per major documentation section (max 5 in first batch)
Each agent reads assigned URLs
Aggregate findings into consolidated report

Example:

Launch 3 Explorer agents simultaneously:

Agent 1: getting-started.md, installation.md
Agent 2: api-reference.md, core-concepts.md
Agent 3: examples.md, best-practices.md
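
A rough sketch of the parallel launch, using a thread pool as a stand-in for the agent runtime. The `explore` function is a hypothetical placeholder for one Explorer agent; a real agent would fetch and summarize its assigned URLs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def explore(urls: List[str]) -> Dict[str, object]:
    """Hypothetical stand-in for one Explorer agent reading its URLs."""
    return {"urls": urls, "summary": f"read {len(urls)} page(s)"}


def run_explorers(assignments: List[List[str]], max_agents: int = 5) -> List[dict]:
    """Run one agent per assignment concurrently, at most max_agents at a time.

    Results come back in the same order as the assignments, which keeps
    the consolidated report stable.
    """
    workers = max(1, min(max_agents, len(assignments)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(explore, assignments))


assignments = [
    ["getting-started.md", "installation.md"],
    ["api-reference.md", "core-concepts.md"],
    ["examples.md", "best-practices.md"],
]
report = run_explorers(assignments)
```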

Phase 3: Repository Analysis

When llms.txt is not found:

Find GitHub repository via WebSearch
Use Repomix to pack the repository:
    npm install -g repomix   # if needed
    git clone [repo-url] /tmp/docs-analysis
    cd /tmp/docs-analysis
    repomix --output repomix-output.xml
Read repomix-output.xml and extract documentation
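
If the packing step is driven from a script, it might look like the sketch below. It assumes `git` and `repomix` are already on the PATH (install Repomix with `npm install -g repomix` if needed); the function names are hypothetical and the repository URL in the usage note is a placeholder.

```python
import subprocess
from pathlib import Path
from typing import List, Optional, Tuple

Command = Tuple[List[str], Optional[str]]  # (argv, working directory)


def repomix_commands(repo_url: str,
                     workdir: str = "/tmp/docs-analysis",
                     output: str = "repomix-output.xml") -> List[Command]:
    """Return the clone and pack commands, each with its working directory."""
    return [
        (["git", "clone", repo_url, workdir], None),
        (["repomix", "--output", output], workdir),
    ]


def pack_repository(repo_url: str,
                    workdir: str = "/tmp/docs-analysis",
                    output: str = "repomix-output.xml") -> Path:
    """Clone the repository, run Repomix inside it, and return the output path."""
    for argv, cwd in repomix_commands(repo_url, workdir, output):
        subprocess.run(argv, cwd=cwd, check=True)  # raises if a step fails
    return Path(workdir) / output
```

Calling `pack_repository("https://github.com/example-org/example-repo")` would return the path of the packed file to read next.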

Repomix benefits:

Entire repository in single AI-friendly file
Preserves directory structure
Optimized for AI consumption

Phase 4: Fallback Research

When no GitHub repository exists:

Launch multiple Researcher agents in parallel
Focus areas: official docs, tutorials, API references, community guides
Aggregate findings into consolidated report

Agent Distribution Guidelines

1-3 URLs: Single Explorer agent
4-10 URLs: 3-5 Explorer agents (2-3 URLs each)
11+ URLs: 5-7 Explorer agents (prioritize most relevant)
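
One possible reading of these guidelines as code. The tiers come from the list above; the exact count chosen inside each range (roughly one agent per three URLs) is an assumption, as are the function names. URLs are assumed to be sorted by relevance, so each agent sees its most relevant pages first.

```python
import math
from typing import List


def agent_count(n_urls: int) -> int:
    """Pick a number of Explorer agents for a given number of URLs."""
    if n_urls <= 0:
        return 0
    if n_urls <= 3:
        return 1  # 1-3 URLs: single agent
    per_three = math.ceil(n_urls / 3)  # assumption: about 3 URLs per agent
    if n_urls <= 10:
        return min(5, max(3, per_three))  # 4-10 URLs: 3-5 agents
    return min(7, max(5, per_three))  # 11+ URLs: 5-7 agents


def assign_urls(urls: List[str]) -> List[List[str]]:
    """Distribute URLs round-robin across the chosen number of agents."""
    k = agent_count(len(urls))
    return [urls[i::k] for i in range(k)]
```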

Version Handling

Latest (default):

Search without version specifier
Use current documentation paths

Specific version:

Include version in search: [library] v[version] llms.txt
Check versioned paths: /v[version]/llms.txt
For repositories: checkout specific tag/branch
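
The version rules above could be expressed as the following helpers. The names are hypothetical; the search string and path follow the patterns listed, and `git clone --branch` is used because it accepts both branch and tag names.

```python
from typing import List, Optional


def search_query(library: str, version: Optional[str] = None) -> str:
    """Build the llms.txt search string, with a version only when requested."""
    if version is None:
        return f"{library} llms.txt"
    return f"{library} v{version} llms.txt"


def llms_path(version: Optional[str] = None) -> str:
    """Return the current or the versioned llms.txt path."""
    return "/llms.txt" if version is None else f"/v{version}/llms.txt"


def clone_command(repo_url: str, workdir: str, ref: Optional[str] = None) -> List[str]:
    """Build a git clone command, optionally pinned to a tag or branch."""
    argv = ["git", "clone"]
    if ref:
        argv += ["--branch", ref]
    return argv + [repo_url, workdir]
```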

Output Format

Documentation for [Library] [Version]

Source

Method: [llms.txt / Repository / Research]
URLs: [list of sources]
Date accessed: [current date]

Key Information

[Extracted relevant information organized by topic]

Additional Resources

[Related links, examples, references]

Notes

[Any limitations, missing information, or caveats]
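
To keep reports consistent, the template above can be filled by a small formatter. This is a sketch with hypothetical parameter names; the ISO date format is an assumption, since the template only asks for the current date.

```python
from datetime import date
from typing import Optional, Sequence


def render_report(library: str, version: str, method: str, urls: Sequence[str],
                  key_information: str, resources: Sequence[str] = (),
                  notes: str = "", accessed: Optional[date] = None) -> str:
    """Fill the report template; defaults the access date to today."""
    accessed = accessed or date.today()
    lines = [
        f"Documentation for {library} {version}",
        "",
        "Source",
        "",
        f"Method: {method}",
        "URLs: " + ", ".join(urls),
        f"Date accessed: {accessed.isoformat()}",
        "",
        "Key Information",
        "",
        key_information,
        "",
        "Additional Resources",
        "",
        *resources,
        "",
        "Notes",
        "",
        notes,
    ]
    return "\n".join(lines)
```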

Quick Reference

Tool selection:

WebSearch → Find llms.txt URLs, GitHub repositories
WebFetch → Read single documentation pages
Task (Explore) → Multiple URLs, parallel exploration
Task (Researcher) → Scattered documentation, diverse sources
Repomix → Complete codebase analysis

Popular llms.txt locations (try context7.com first):

Astro: https://context7.com/withastro/astro/llms.txt
Next.js: https://context7.com/vercel/next.js/llms.txt
Remix: https://context7.com/remix-run/remix/llms.txt
shadcn/ui: https://context7.com/shadcn-ui/ui/llms.txt
Better Auth: https://context7.com/better-auth/better-auth/llms.txt

Fall back to official sites if context7.com is unavailable:

Astro: https://docs.astro.build/llms.txt
Next.js: https://nextjs.org/llms.txt
Remix: https://remix.run/llms.txt
SvelteKit: https://kit.svelte.dev/llms.txt

Error Handling

llms.txt not accessible → Try alternative domains → Repository analysis
Repository not found → Search official website → Use Researcher agents
Repomix fails → Try /docs directory only → Manual exploration
Multiple conflicting sources → Prioritize official → Note versions
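
Each row above is an ordered chain of attempts, which can be sketched as a generic runner. The strategy callables here are stubs standing in for the real steps; the runner records why each earlier step failed so the final report can state which method was used.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Strategy = Tuple[str, Callable[[], Optional[str]]]


def run_with_fallbacks(strategies: Sequence[Strategy]) -> Tuple[Optional[str], Optional[str], List[str]]:
    """Try strategies in order; return (method used, result, failure notes)."""
    failures: List[str] = []
    for name, attempt in strategies:
        try:
            result = attempt()
        except Exception as exc:  # any failure moves on to the next strategy
            failures.append(f"{name}: {exc}")
            continue
        if result:
            return name, result, failures
        failures.append(f"{name}: no result")
    return None, None, failures


def unreachable_llms_txt() -> Optional[str]:
    """Stub: simulates an llms.txt URL that cannot be fetched."""
    raise ConnectionError("llms.txt not accessible")


method, result, failures = run_with_fallbacks([
    ("llms.txt", unreachable_llms_txt),
    ("Alternative domains", lambda: None),      # stub: nothing found
    ("Repository", lambda: "packed docs"),      # stub: succeeds
])
```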

Key Principles

Prioritize context7.com for llms.txt: most comprehensive and up-to-date aggregator
Use topic parameters when applicable: enables targeted searches with ?topic=...
Use parallel agents aggressively: faster results, better coverage
Verify official sources as fallback: use when context7.com is unavailable
Report methodology: tell the user which approach was used
Handle versions explicitly: honor a requested version, and state the version covered when defaulting to latest

Detailed Documentation

For comprehensive guides, examples, and best practices:

Workflows:

WORKFLOWS.md — Detailed workflow examples and strategies

Reference guides:

Tool Selection — Complete guide to choosing and using tools
Documentation Sources — Common sources and patterns across ecosystems
Error Handling — Troubleshooting and resolution strategies
Best Practices — 8 essential principles for effective discovery
Performance — Optimization techniques and benchmarks
Limitations — Boundaries and success criteria

docs-seeker
