SEO Skill (Agentic / Claude / Codex)
LLM-first SEO analysis skill with 16 specialized sub-skills, 10 specialist agents, and 33 scripts for website, blog, and GitHub repository optimization.
Deterministic Trigger Mapping
For prompt reliability in Codex/agent IDEs, map common user wording to a fixed workflow:
- If the user says "perform seo analysis on <url>" (or makes a similar generic SEO request with a URL), treat it as a single-URL full audit.
- If no explicit sub-skill is specified, run the full/page audit path with LLM-first reasoning and script-backed evidence.
- For full/page audits, always produce:
  - FULL-AUDIT-REPORT.md (detailed findings)
  - ACTION-PLAN.md (prioritized fixes)
- If generate_report.py is run, also return the saved HTML path (for example SEO-REPORT.html).
Available Commands
| Command | Sub-Skill | Description |
|---|---|---|
| seo audit <url> | seo-audit | Full website audit with scoring |
| seo page <url> | seo-page | Deep single-page analysis |
| seo technical <url> | seo-technical | Technical SEO checks |
| seo content <url> | seo-content | Content quality & E-E-A-T |
| seo schema <url> | seo-schema | Schema detection/validation/generation |
| seo sitemap <url> | seo-sitemap | Sitemap analysis & generation |
| seo images <url> | seo-images | Image optimization audit |
| seo geo <url> | seo-geo | AI search optimization (GEO) |
| seo programmatic <url> | seo-programmatic | Programmatic SEO safeguards |
| seo competitors <url> | seo-competitor-pages | Comparison/alternatives pages |
| seo hreflang <url> | seo-hreflang | International SEO validation |
| seo plan <url> | seo-plan | Strategic SEO planning |
| seo github <repo_or_url> | seo-github | GitHub repository discoverability, README, topics, community health, and traffic archival |
| seo article <url> | seo-article | Article data extraction & LLM optimization |
| seo links <url> | seo-links | External backlink profile & link health |
| seo aeo <url> | seo-aeo | Answer Engine Optimization (Featured Snippets, PAA, Knowledge Panel) |
Orchestration Logic
When the user requests SEO analysis, follow this routing:
Step 1 — Identify the Task
Parse the user's request to determine which sub-skill(s) to activate:
- Full audit: Read resources/skills/seo-audit.md (crawl multiple pages, delegate to agents, score and report).
- Single page: Read resources/skills/seo-page.md (deep dive on one URL).
- Specific area: Read the matching resources/skills/seo-*.md file.
- Strategic plan: Read resources/skills/seo-plan.md and the matching resources/templates/*.md for the detected industry.
- GitHub repository SEO: Read resources/skills/seo-github.md and use GitHub scripts with --provider auto for API/gh fallback.
- Generic "perform seo analysis on <url>" request: treat as a single-page full audit, read resources/skills/seo-page.md, and generate FULL-AUDIT-REPORT.md + ACTION-PLAN.md.
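The routing above can be sketched as a small lookup. This is only an illustration, not part of the skill: the names `route_request`, `Route`, and `COMMAND_TO_SKILL` are hypothetical, and the mapping is copied from the command table. Requests that do not start with a known `seo <command>` fall back to the single-page audit path.

```python
from typing import NamedTuple, Optional, Tuple

# Command keyword -> sub-skill name, copied from the "Available Commands" table.
COMMAND_TO_SKILL = {
    "audit": "seo-audit", "page": "seo-page", "technical": "seo-technical",
    "content": "seo-content", "schema": "seo-schema", "sitemap": "seo-sitemap",
    "images": "seo-images", "geo": "seo-geo", "programmatic": "seo-programmatic",
    "competitors": "seo-competitor-pages", "hreflang": "seo-hreflang",
    "plan": "seo-plan", "github": "seo-github", "article": "seo-article",
    "links": "seo-links", "aeo": "seo-aeo",
}

AUDIT_DELIVERABLES = ("FULL-AUDIT-REPORT.md", "ACTION-PLAN.md")


class Route(NamedTuple):
    skill_file: str                # sub-skill file to read first
    target: Optional[str]          # URL or owner/repo, if one was found
    deliverables: Tuple[str, ...]  # markdown files the flow must produce


def route_request(text: str) -> Route:
    """Map a user request to a sub-skill file (hypothetical helper)."""
    tokens = text.strip().split()
    # First token that looks like a URL, if any.
    target = next((t for t in tokens if t.startswith(("http://", "https://"))), None)

    # Explicit form: "seo <command> <target>".
    if len(tokens) >= 2 and tokens[0].lower() == "seo" and tokens[1].lower() in COMMAND_TO_SKILL:
        command = tokens[1].lower()
        if target is None and len(tokens) >= 3:
            target = tokens[2]  # e.g. owner/repo for "seo github"
        deliverables = AUDIT_DELIVERABLES if command in ("audit", "page") else ()
        return Route(f"resources/skills/{COMMAND_TO_SKILL[command]}.md", target, deliverables)

    # Generic request: treat as a single-page full audit.
    return Route("resources/skills/seo-page.md", target, AUDIT_DELIVERABLES)
```

For example, `route_request("seo competitors https://example.com")` selects `resources/skills/seo-competitor-pages.md`, while a generic "perform seo analysis on https://example.com" selects `resources/skills/seo-page.md` with both markdown deliverables.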
Step 2 — Collect Evidence
Primary method (LLM-first) — use the built-in read_url_content tool first:
read_url_content(url) → returns parsed HTML content directly
Use this as the baseline evidence for reasoning.
Deterministic verification (recommended when script execution is available):
# Fetch/parse raw HTML for structured checks
python3 <SKILL_DIR>/scripts/fetch_page.py <url> --output /tmp/page.html
python3 <SKILL_DIR>/scripts/parse_html.py /tmp/page.html --url <url> --json
# Optional: generate shareable HTML dashboard artifact
python3 <SKILL_DIR>/scripts/generate_report.py <url> --output SEO-REPORT.html
Do not use third-party mirrors (e.g., r.jina.ai) as primary evidence when direct site fetch or bundled scripts are available.
<SKILL_DIR> = absolute path to this skill directory (the folder containing this SKILL.md).
Step 3 — Perform LLM-First Analysis
Use the LLM as the primary SEO analyst:
- Synthesize evidence from page content, metadata, and optional script outputs.
- Produce findings with explicit proof:
  - Finding
  - Evidence (specific element, metric, or snippet)
  - Impact (why it matters for ranking/indexing/UX)
  - Fix (clear implementation step)
- Prioritize by impact and implementation effort.
- Separate confirmed issues, likely issues, and unknowns (missing data).
Always read and apply resources/references/llm-audit-rubric.md to keep scoring, severity, confidence, and output structure consistent across audit types.
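One way to keep the Finding / Evidence / Impact / Fix structure honest is to model it as a record. The sketch below is an assumption, not the rubric's actual schema: the `Finding` class and `group_by_confidence` helper are hypothetical, and the label "Confirmed" is a guess (only "Likely" and "Hypothesis" appear in this document; llm-audit-rubric.md is authoritative). Requiring evidence for confirmed findings is also a design choice made here.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# "Likely" and "Hypothesis" appear in this document; "Confirmed" is assumed.
CONFIDENCE_LEVELS = ("Confirmed", "Likely", "Hypothesis")


@dataclass
class Finding:
    finding: str   # what is wrong
    evidence: str  # specific element, metric, or snippet
    impact: str    # why it matters for ranking/indexing/UX
    fix: str       # clear implementation step
    confidence: str = "Likely"

    def __post_init__(self) -> None:
        if self.confidence not in CONFIDENCE_LEVELS:
            raise ValueError(f"unknown confidence label: {self.confidence!r}")
        # Design choice of this sketch: a confirmed issue must carry proof.
        if self.confidence == "Confirmed" and not self.evidence.strip():
            raise ValueError("confirmed findings require evidence")


def group_by_confidence(findings: Iterable[Finding]) -> Dict[str, List[Finding]]:
    """Separate confirmed issues, likely issues, and unknowns."""
    groups: Dict[str, List[Finding]] = {level: [] for level in CONFIDENCE_LEVELS}
    for item in findings:
        groups[item.confidence].append(item)
    return groups
```

Grouping before writing the report makes it straightforward to render confirmed issues, likely issues, and unknowns as separate sections.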
Step 4 — Run Baseline Verification Scripts (When execution is available)
For full/page audits, run baseline checks to avoid hypothesis-only reporting. Do not replace LLM reasoning with script-only scoring.
# Check robots.txt and AI crawler management
python3 <SKILL_DIR>/scripts/robots_checker.py <url>
# Check llms.txt for AI search readiness
python3 <SKILL_DIR>/scripts/llms_txt_checker.py <url>
# Get Core Web Vitals from PageSpeed Insights (free API, no key needed)
python3 <SKILL_DIR>/scripts/pagespeed.py <url> --strategy mobile
# Check security headers (HSTS, CSP, X-Frame-Options, etc.)
python3 <SKILL_DIR>/scripts/security_headers.py <url>
# Detect broken links on a page (404s, timeouts, connection errors)
python3 <SKILL_DIR>/scripts/broken_links.py <url> --workers 5
# Trace redirect chains, detect loops and mixed HTTP/HTTPS
python3 <SKILL_DIR>/scripts/redirect_checker.py <url>
# Analyze readability from fetched HTML (Flesch-Kincaid, grade level, sentence stats)
python3 <SKILL_DIR>/scripts/readability.py /tmp/page.html --json
# Validate Open Graph and Twitter Card meta tags
python3 <SKILL_DIR>/scripts/social_meta.py <url>
# Analyze internal link structure, find orphan pages
python3 <SKILL_DIR>/scripts/internal_links.py <url> --depth 1 --max-pages 20
# Extract article content and perform keyword research for LLM-driven optimization
python3 <SKILL_DIR>/scripts/article_seo.py <url> --keyword "<optional_target_keyword>" --json
# GitHub repository SEO (provider fallback: auto|api|gh)
# Auth setup (choose one):
# export GITHUB_TOKEN="ghp_xxx" # or export GH_TOKEN="ghp_xxx"
# gh auth login -h github.com && gh auth status -h github.com
python3 <SKILL_DIR>/scripts/github_repo_audit.py --repo <owner/repo> --provider auto --json
python3 <SKILL_DIR>/scripts/github_readme_lint.py README.md --json
python3 <SKILL_DIR>/scripts/github_community_health.py --repo <owner/repo> --provider auto --json
# Benchmark/competitor inputs should be provided by LLM/web-search discovery when possible.
# If omitted, github_seo_report.py auto-derives repo-specific benchmark queries.
python3 <SKILL_DIR>/scripts/github_search_benchmark.py --repo <owner/repo> --query "<llm_or_web_query>" --provider auto --json
python3 <SKILL_DIR>/scripts/github_competitor_research.py --repo <owner/repo> --query "<llm_or_web_query>" --provider auto --top-n 6 --json
python3 <SKILL_DIR>/scripts/github_competitor_research.py --repo <owner/repo> --competitor <owner/repo> --competitor <owner/repo> --provider auto --json
python3 <SKILL_DIR>/scripts/github_traffic_archiver.py --repo <owner/repo> --provider auto --archive-dir .github-seo-data --json
python3 <SKILL_DIR>/scripts/github_seo_report.py --repo <owner/repo> --provider auto --markdown GITHUB-SEO-REPORT.md --action-plan GITHUB-ACTION-PLAN.md --json
# Optional: increase/reduce auto-derived query volume (default: 6)
# python3 <SKILL_DIR>/scripts/github_seo_report.py --repo <owner/repo> --provider auto --auto-query-max 8 --markdown GITHUB-SEO-REPORT.md --action-plan GITHUB-ACTION-PLAN.md --json
If a check fails due to network, DNS, permissions, or API rate limits:
- Report it explicitly as an environment limitation, not a confirmed site issue.
- Keep confidence as Hypothesis for impacted categories.
- Continue with available evidence instead of stopping the audit.
- Do not enter repeated fallback loops. Retry a failed source at most once, then finalize the audit.
- Do not pivot into repeated web-search scraping loops for the same URL.
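The failure-handling rules above amount to a bounded retry: one retry at most, then record a limitation and move on. The following is a minimal sketch under assumptions; `run_check` and `CheckResult` are hypothetical names, and catching `OSError` (which covers connection, DNS, timeout, and permission errors in the standard library) is a simplification of whatever the bundled scripts actually raise.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    name: str
    output: Optional[str]          # script output when the check ran
    limitation: Optional[str]      # set when the environment blocked the check
    confidence_cap: Optional[str]  # "Hypothesis" for categories this check feeds


def run_check(name: str, check: Callable[[], str], max_retries: int = 1) -> CheckResult:
    """Run a check, retrying a failed source at most once by default."""
    last_error: Optional[Exception] = None
    for _ in range(1 + max_retries):
        try:
            return CheckResult(name, check(), None, None)
        except OSError as exc:  # network, DNS, timeout, permission errors
            last_error = exc
    # Report as an environment limitation, not as a confirmed site issue.
    note = f"Environment limitation: {type(last_error).__name__}: {last_error}"
    return CheckResult(name, None, note, "Hypothesis")
```

The caller keeps going after a limited result and lists each `limitation` string in the report's "Environment Limitations" section instead of retrying again.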
Visual analysis (requires Playwright — use conda activate pentest if available):
# Capture screenshots (desktop, laptop, tablet, mobile)
python3 <SKILL_DIR>/scripts/capture_screenshot.py <url> --all
# Analyze visual layout, above-the-fold, mobile responsiveness
python3 <SKILL_DIR>/scripts/analyze_visual.py <url> --json
HTML Report Generator — generates a self-contained interactive HTML dashboard:
# Generate full SEO report (runs scripts automatically, saves HTML to PWD)
python3 <SKILL_DIR>/scripts/generate_report.py <url>
python3 <SKILL_DIR>/scripts/generate_report.py <url> --output custom-report.html
Step 5 — Delegate to Specialist Agents
For comprehensive audits, read the relevant agent file from resources/agents/ to adopt the specialist role:
| Agent | File | Focus Area |
|---|---|---|
| Technical SEO | seo-technical.md | Crawlability, indexability, security, URLs, mobile, CWV, JS rendering |
| Content Quality | seo-content.md | E-E-A-T assessment, content metrics, AI content detection |
| Performance | seo-performance.md | Core Web Vitals (LCP, INP, CLS), optimization recommendations |
| Schema Markup | seo-schema.md | Detection, validation, generation of JSON-LD structured data |
| Sitemap | seo-sitemap.md | XML sitemap validation, generation, quality gates |
| Visual Analysis | seo-visual.md | Screenshots, above-the-fold, responsiveness, layout |
| Verifier (global) | seo-verifier.md | Deduplicate findings, suppress contradictions, and validate evidence relevance before final report |
Step 6 — Apply Quality Gates
Reference the quality standards in resources/references/:
- Content minimums: Read quality-gates.md for word counts, unique content %, title/meta requirements
- Schema validation: Read schema-types.md for active/deprecated/restricted types
- Core Web Vitals: Read cwv-thresholds.md for current metric thresholds
- E-E-A-T framework: Read eeat-framework.md for scoring criteria
- Google reference: Read google-seo-reference.md for quick reference
- LLM report rubric: Read llm-audit-rubric.md for mandatory evidence format, confidence labels, and output contract
Step 6.5 — Verify Findings (All Workflows)
Before writing final reports, run verification:
python3 <SKILL_DIR>/scripts/finding_verifier.py --findings-json <raw_findings.json> --json
Use verified output for final report tables, not raw findings.
Step 7 — Score and Report
Use numeric scores as guidance, not as a replacement for evidence quality and judgment.
Default Scoring Weights (Full Audit)
Canonical source of truth: these weights are defined here and in resources/skills/seo-audit.md. Do not modify weights in individual sub-skill files; update only these two locations to keep scores consistent.
| Category | Weight |
|---|---|
| Technical SEO | 25% |
| Content Quality | 20% |
| On-Page SEO | 15% |
| Schema / Structured Data | 15% |
| Performance (CWV) | 10% |
| Image Optimization | 10% |
| AI Search Readiness (GEO) | 5% |
If using scripts/generate_report.py, the automated dashboard uses script-level category weights defined in that script. Keep the narrative audit LLM-first and evidence-first.
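As a worked illustration of the default weights, the overall score can be computed as a weighted average. This sketch assumes each category is scored on a 0-100 scale and that every category is present; `overall_score` is a hypothetical helper, and it does not reflect the script-level weights used by generate_report.py.

```python
from typing import Dict

# Default full-audit weights (percent), copied from the table above.
WEIGHTS = {
    "Technical SEO": 25,
    "Content Quality": 20,
    "On-Page SEO": 15,
    "Schema / Structured Data": 15,
    "Performance (CWV)": 10,
    "Image Optimization": 10,
    "AI Search Readiness (GEO)": 5,
}


def overall_score(category_scores: Dict[str, float]) -> float:
    """Weighted average of 0-100 category scores, rounded to one decimal."""
    missing = set(WEIGHTS) - set(category_scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    total = sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)
    return round(total / sum(WEIGHTS.values()), 1)


example = {
    "Technical SEO": 90,
    "Content Quality": 70,
    "On-Page SEO": 80,
    "Schema / Structured Data": 60,
    "Performance (CWV)": 50,
    "Image Optimization": 100,
    "AI Search Readiness (GEO)": 40,
}
print(overall_score(example))  # 74.5
```

The arithmetic for the example is (25·90 + 20·70 + 15·80 + 15·60 + 10·50 + 10·100 + 5·40) / 100 = 7450 / 100 = 74.5.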
Step 8 — Mandatory Deliverables
For seo audit, seo page, and generic perform seo analysis on <url> flows:
- Create FULL-AUDIT-REPORT.md in the current working directory at the start of the audit, then update it as evidence is collected.
- Create ACTION-PLAN.md in the current working directory at the start of the audit, then update it with prioritized fixes.
- If the HTML dashboard was generated, include its exact saved path (for example SEO-REPORT.html or an absolute path).
- In the final response, explicitly list generated artifacts and paths.
- If technical checks are blocked by environment limits, still write both markdown files and include an "Environment Limitations" section.
Score Interpretation
| Score | Rating |
|---|---|
| 90-100 | Excellent |
| 70-89 | Good |
| 50-69 | Needs Improvement |
| 30-49 | Poor |
| 0-29 | Critical |
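The rating bands above translate directly into a threshold lookup. This is a small sketch with a hypothetical `rating_for` helper; treating fractional scores by their lower bound (so 89.5 is still "Good") is an assumption, since the table only lists whole-number ranges.

```python
from typing import List, Tuple

# (lower bound, rating), highest band first, from the table above.
RATING_BANDS: List[Tuple[int, str]] = [
    (90, "Excellent"),
    (70, "Good"),
    (50, "Needs Improvement"),
    (30, "Poor"),
    (0, "Critical"),
]


def rating_for(score: float) -> str:
    """Return the rating label for a 0-100 score."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    for lower_bound, label in RATING_BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: the 0 band matches every valid score")
```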
Industry Detection
When running seo plan, detect the business type and load the matching template:
| Industry | Template File |
|---|---|
| SaaS / Software | saas.md |
| Local Service Business | local-service.md |
| E-commerce / Retail | ecommerce.md |
| Publisher / Media | publisher.md |
| Agency / Consultancy | agency.md |
| Other / Generic | generic.md |
Detection signals:
- SaaS: pricing page, feature pages, /docs, /api, trial/demo CTAs
- Local: address, phone, Google Business Profile, service area pages
- E-commerce: product pages, cart, checkout, /collections, /categories
- Publisher: article dates, author pages, /news, high content volume
- Agency: case studies, /work, /portfolio, team pages, service offerings
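The detection signals can be treated as a simple voting scheme. The sketch below is illustrative only: the signal identifiers (such as `pricing_page` or `docs_path`) are hypothetical labels for the signals listed above, how they are extracted from a site is out of scope, and the two-match minimum before leaving generic.md is an assumed threshold.

```python
from typing import Dict, Set

# Template file -> hypothetical identifiers for the signals listed above.
INDUSTRY_SIGNALS: Dict[str, Set[str]] = {
    "saas.md": {"pricing_page", "feature_pages", "docs_path", "api_path", "trial_demo_cta"},
    "local-service.md": {"address", "phone", "google_business_profile", "service_area_pages"},
    "ecommerce.md": {"product_pages", "cart", "checkout", "collections_path", "categories_path"},
    "publisher.md": {"article_dates", "author_pages", "news_path", "high_content_volume"},
    "agency.md": {"case_studies", "work_path", "portfolio_path", "team_pages", "service_offerings"},
}


def detect_template(observed: Set[str], min_matches: int = 2) -> str:
    """Pick the template whose signals overlap most with what was observed."""
    best_template, best_count = "generic.md", 0
    for template, signals in INDUSTRY_SIGNALS.items():
        count = len(signals & observed)
        if count > best_count:  # on a tie, the earlier entry wins
            best_template, best_count = template, count
    # Too little evidence: fall back to the generic template.
    return best_template if best_count >= min_matches else "generic.md"
```

For instance, a site showing `cart`, `checkout`, and `product_pages` resolves to ecommerce.md, while a lone `phone` signal is not enough to leave generic.md.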
Schema Templates
Pre-built JSON-LD templates are available in templates.json for:
- Common: BlogPosting, Article, Organization, LocalBusiness, BreadcrumbList, WebSite (with SearchAction)
- Video: VideoObject, BroadcastEvent, Clip, SeekToAction
- E-commerce: ProductGroup (variants), OfferShippingDetails, Certification
- Other: SoftwareSourceCode, ProfilePage (E-E-A-T author pages)
Validation Scripts
Two validation scripts are available for CI/CD integration:
Pre-commit SEO Check
bash <SKILL_DIR>/scripts/pre_commit_seo_check.sh
Checks staged HTML files for: placeholder text in schema, title tag length, missing alt text, deprecated schema types, FID references (should be INP), meta description length.
Schema Validator
python3 <SKILL_DIR>/scripts/validate_schema.py <file_path>
Validates JSON-LD blocks in HTML files: JSON syntax, @context/@type presence, placeholder text, deprecated/restricted types.
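To make the checks concrete, here is a minimal standard-library sketch of JSON-LD validation. It is not the implementation of validate_schema.py: `JsonLdExtractor` and `validate_jsonld` are hypothetical names, the deprecated/restricted sets contain only the two types named in this document's Critical Rules, placeholder-text detection is omitted, and `@graph` containers are not unpacked.

```python
import json
from html.parser import HTMLParser
from typing import List

DEPRECATED_TYPES = {"HowTo"}    # per Critical Rules in this document
RESTRICTED_TYPES = {"FAQPage"}  # per Critical Rules in this document


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self) -> None:
        super().__init__()
        self.blocks: List[str] = []
        self._in_jsonld = False
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def validate_jsonld(html: str) -> List[str]:
    """Return a list of human-readable problems; empty means no issues found."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    problems: List[str] = []
    for index, raw in enumerate(parser.blocks, start=1):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            problems.append(f"block {index}: invalid JSON ({exc.msg})")
            continue
        for item in data if isinstance(data, list) else [data]:
            if not isinstance(item, dict):
                problems.append(f"block {index}: expected a JSON object")
                continue
            if "@context" not in item:
                problems.append(f"block {index}: missing @context")
            item_type = item.get("@type")
            if item_type is None:
                problems.append(f"block {index}: missing @type")
            for name in item_type if isinstance(item_type, list) else [item_type]:
                if name in DEPRECATED_TYPES:
                    problems.append(f"block {index}: deprecated type {name}")
                elif name in RESTRICTED_TYPES:
                    problems.append(f"block {index}: restricted type {name}")
    return problems
```

Returning a list of problems rather than raising keeps the function usable both in a CI gate (fail if the list is non-empty) and in a report (print each entry).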
Output Format
All sub-skill reports should use consistent severity levels:
- 🔴 Critical — Directly impacts rankings or indexing (fix immediately)
- ⚠️ Warning — Optimization opportunity (fix within 1 month)
- ✅ Pass — Meets or exceeds standards
- ℹ️ Info — Not applicable or informational only
Structure reports as:
- Summary table with element, value, and severity
- Detailed findings grouped by category
- Actionable recommendations ordered by impact
Critical Rules
- INP not FID: FID was removed September 9, 2024. The sole interactivity metric is INP (Interaction to Next Paint). Never reference FID.
- FAQ schema is restricted: FAQPage schema is limited to government and healthcare authority sites only (August 2023). Do NOT recommend for commercial sites.
- HowTo schema is deprecated: Rich results fully removed September 2023. Never recommend.
- JSON-LD only: Always use <script type="application/ld+json">. Never recommend Microdata or RDFa.
- E-E-A-T everywhere: As of December 2025, E-E-A-T applies to ALL competitive queries, not just YMYL.
- Mobile-first is complete: 100% mobile-first indexing since July 5, 2024.
- Location page limits: Warning at 30+ pages, hard stop at 50+ pages. Enforce unique content requirements.
- AI crawler management: Check robots.txt for GPTBot, ClaudeBot, PerplexityBot, Applebot-Extended, Google-Extended, Bytespider, CCBot.
- LLM-first, resilient pipeline: Start by reading the page with read_url_content, then always run relevant scripts for structured evidence. Scripts are the preferred evidence source, so use them actively. However, if any script fails (timeout, network, parsing), the LLM MUST still produce a complete analysis using its own reasoning (confidence: Likely). Never block a report on a single script failure.
- Always produce file artifacts for audit flows: FULL-AUDIT-REPORT.md and ACTION-PLAN.md are required outputs for full/page audit requests.
- Bound evidence retries: Avoid long search/retry loops. If core checks fail due to DNS/network errors, finalize promptly with confidence labels and file outputs.
- Avoid redundant web fallbacks: If direct fetch/scripts fail and one fallback also fails, stop retrying and finish the report with explicit limitations.
- Signal freshness tracking: Every reference file should contain a <!-- Updated: YYYY-MM-DD --> comment. Flag any reference file older than 90 days for review. When Google announces algorithm changes, verify affected reference files within 7 days. Key dates to track: core updates (quarterly), schema deprecations (schema-types.md), CWV threshold changes (cwv-thresholds.md).
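The freshness rule lends itself to an automated check. The following is a sketch under assumptions: `last_updated` and `needs_review` are hypothetical helpers, "older than 90 days" is read as strictly more than 90 days, and a file with a missing or malformed marker is flagged for review (a choice made here, not stated in the rule).

```python
import re
from datetime import date
from typing import Optional

# Matches the marker convention described above: <!-- Updated: YYYY-MM-DD -->
UPDATED_RE = re.compile(r"<!--\s*Updated:\s*(\d{4})-(\d{2})-(\d{2})\s*-->")


def last_updated(text: str) -> Optional[date]:
    """Return the date in the first Updated marker, or None if absent/invalid."""
    match = UPDATED_RE.search(text)
    if match is None:
        return None
    try:
        return date(int(match.group(1)), int(match.group(2)), int(match.group(3)))
    except ValueError:  # e.g. month 13
        return None


def needs_review(text: str, today: date, max_age_days: int = 90) -> bool:
    """True when the reference file is stale or carries no usable marker."""
    updated = last_updated(text)
    if updated is None:
        return True  # assumption: an unmarked file cannot be trusted as fresh
    return (today - updated).days > max_age_days
```

Passing `today` explicitly keeps the check deterministic in tests; a caller would normally supply `date.today()`.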
Dependencies
Optional Script Dependencies
- Python 3.8+
- requests (for network analysis scripts)
- beautifulsoup4 (for HTML parsing scripts)
- Playwright (for capture_screenshot.py and analyze_visual.py)
  - pip install playwright && playwright install chromium
  - Or, if using conda: conda activate pentest (if Playwright is pre-installed)
Install Script Dependencies
pip install requests beautifulsoup4