programmatic-seo

Programmatic SEO

Production-grade framework for building SEO page sets at scale. Covers the full lifecycle from keyword pattern discovery through template design, data pipeline construction, quality assurance, and post-launch optimization. Designed for deployments ranging from 50 to 100,000+ pages.

Table of Contents

When to Use vs When Not To
Initial Assessment
The 14 Playbooks
Playbook Selection Matrix
Keyword Pattern Mining
Data Pipeline Architecture
Template Design System
Quality Control Framework
Internal Linking Architecture
Indexation Strategy
Launch Sequence
Post-Launch Optimization
Anti-Patterns and Penalty Avoidance
Decision Matrix: Build vs Skip
Output Artifacts
Related Skills

When to Use vs When Not To

Use this skill when:

You have a repeating keyword pattern with 50+ variations
You have (or can acquire) structured data to populate pages
The search intent is consistent across variations
Your domain has sufficient authority to compete

Do NOT use when:

Each page requires unique editorial content (use content-creator instead)
Total addressable pages < 30 (manual content is more effective)
You lack a data source and would be generating thin placeholder content
Your domain authority is below DR 20 and competitors are DR 60+

Initial Assessment

Before designing any pSEO strategy, answer these questions. Skip nothing.

Opportunity Validation

| Question | Why It Matters | Red Flag |
|---|---|---|
| What is the repeating keyword pattern? | Defines the template structure | Pattern is vague or inconsistent |
| What is the aggregate monthly search volume? | Determines ROI ceiling | < 5,000 aggregate monthly searches |
| How many unique pages can you generate? | Scope the project | < 50 pages (too few) or > 50K without data infrastructure |
| What does the SERP look like for sample queries? | Competitive feasibility | Page 1 dominated by DR 80+ editorial content |
| Is intent informational, navigational, or transactional? | Template design | Mixed intent across the same pattern |

Data Source Evaluation

Rate your data source on this scale:

| Tier | Source Type | Defensibility | Example |
|---|---|---|---|
| S | Proprietary first-party | Unbeatable | Your product usage data, internal benchmarks |
| A | Product-derived | Strong | Aggregated user analytics, customer outcomes |
| B | User-generated | Moderate | Community reviews, submitted content |
| C | Licensed exclusive | Moderate | Paid data feed no competitor has |
| D | Public aggregated | Weak | Government data, public APIs |
| F | Scraped commodity | None | Wikipedia rewrites, copied listings |

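Rule: Do not build pSEO on Tier F data. Google's spam policies cover scraped content republished with little added value, so commodity rewrites risk demotion or removal from results. If your only data source is public and easily replicable, invest in acquiring Tier A-C data first.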

Competitive Moat Assessment

For 5 sample queries in your pattern, analyze page 1 results:

What is the average Domain Rating of ranking pages?
Are existing results programmatic or editorial?
What unique data do ranking pages provide?
What is the content depth (word count, data richness, UX quality)?

Go/No-Go threshold: If the average DR gap between you and page 1 is > 30 AND existing results have proprietary data, the opportunity requires either a differentiated approach or domain authority building first.
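A minimal sketch of the threshold above in Python. The function name and inputs are hypothetical; the DR values would come from whichever SEO tool you use, and the sample numbers are made up.

```python
from statistics import mean

def needs_differentiation(own_dr: float, page_one_drs: list[float],
                          incumbents_have_proprietary_data: bool,
                          max_gap: float = 30.0) -> bool:
    """True when the rule says: differentiate or build authority first."""
    gap = mean(page_one_drs) - own_dr
    return gap > max_gap and incumbents_have_proprietary_data

# Hypothetical sample: your DR is 35, page 1 averages 70 (gap of 35).
print(needs_differentiation(35, [80, 72, 65, 68, 65], True))   # True
print(needs_differentiation(35, [80, 72, 65, 68, 65], False))  # False
```

Run it once per sample query, or pool the page 1 results of all 5 queries before averaging.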

The 14 Playbooks

| # | Playbook | Pattern | Example | Data Requirement |
|---|---|---|---|---|
| 1 | Templates | "[Type] template" | "resume template", "invoice template" | Template files + metadata |
| 2 | Curation | "best [category]" | "best CRM for startups" | Product/service reviews + ratings |
| 3 | Conversions | "[X] to [Y]" | "100 USD to EUR" | Conversion logic/API |
| 4 | Comparisons | "[X] vs [Y]" | "Notion vs Confluence" | Feature data for both products |
| 5 | Examples | "[type] examples" | "landing page examples" | Curated example collection |
| 6 | Locations | "[service] in [city]" | "coworking in Austin" | Location-specific data |
| 7 | Personas | "[product] for [audience]" | "CRM for real estate" | Audience-specific use cases |
| 8 | Integrations | "[A] + [B] integration" | "Slack Asana integration" | Integration documentation |
| 9 | Glossary | "what is [term]" | "what is churn rate" | Domain expertise |
| 10 | Translations | Content in N languages | Localized guides | Translation + localization data |
| 11 | Directory | "[category] tools" | "AI writing tools" | Tool listings + evaluations |
| 12 | Profiles | "[entity name]" | "Stripe company profile" | Entity-level data |
| 13 | Statistics | "[topic] statistics" | "SaaS churn statistics 2026" | Verified statistical data |
| 14 | Calculators | "[topic] calculator" | "LTV calculator" | Calculation logic + inputs |

Playbook Selection Matrix

| If you have... | Primary Playbook | Secondary Layer |
|---|---|---|
| A product with many integrations | Integrations | Comparisons |
| A design/creative tool | Templates + Examples | Personas |
| A multi-segment audience | Personas | Comparisons |
| Local/regional presence | Locations | Directory |
| A tool/utility product | Calculators + Conversions | Glossary |
| Deep domain expertise | Glossary + Statistics | Curation |
| A competitor landscape to exploit | Comparisons + Curation | Directory |
| User-generated content | Examples + Directory | Profiles |

Layering rule: Combine up to 2 playbooks per page set. Example: "Best coworking spaces in [city]" = Curation + Locations.

Keyword Pattern Mining

Step 1: Pattern Identification

Extract the repeating structure from seed keywords:

Seed: "react developer salary san francisco"
Pattern: [role] salary [city]
Variables: role (200+ options), city (500+ options)
Max pages: 200 x 500 = 100,000
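The expansion above is a Cartesian product of the variable lists. A minimal sketch, assuming the pattern is written with Python format placeholders; `expand_pattern` and the sample lists are hypothetical.

```python
from itertools import product

def expand_pattern(pattern: str, variables: dict[str, list[str]]) -> list[str]:
    """Fill every combination of variable values into a keyword pattern."""
    names = list(variables)
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in product(*(variables[name] for name in names))
    ]

roles = ["react developer", "data engineer"]
cities = ["san francisco", "austin", "denver"]
keywords = expand_pattern("{role} salary {city}", {"role": roles, "city": cities})
print(len(keywords))  # 6 (2 roles x 3 cities)
print(keywords[0])    # react developer salary san francisco
```

The product gives the theoretical maximum only. Step 2 decides which combinations are worth a page.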

Step 2: Volume Distribution Analysis

Not all variable combinations have search volume. Map the distribution:

| Tier | Volume Range | Typical % of Total Pages | Strategy |
|---|---|---|---|
| Head | 1,000+ monthly | 2-5% | Priority indexation, highest content quality |
| Torso | 100-999 monthly | 15-25% | Standard template, full deployment |
| Long-tail | 10-99 monthly | 40-50% | Template with conditional content blocks |
| Zero-volume | < 10 monthly | 20-40% | Noindex OR skip unless data is uniquely valuable |
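A small sketch that assigns each keyword to a tier using the volume ranges in the table. The keyword volumes are invented sample data; real numbers would come from a keyword research tool.

```python
def volume_tier(monthly_searches: int) -> str:
    """Map a monthly search volume to the tier names used in the table."""
    if monthly_searches >= 1000:
        return "head"
    if monthly_searches >= 100:
        return "torso"
    if monthly_searches >= 10:
        return "long-tail"
    return "zero-volume"

# Hypothetical sample volumes.
volumes = {
    "react developer salary austin": 1900,
    "data engineer salary denver": 320,
    "qa analyst salary boise": 40,
    "cobol developer salary fargo": 0,
}
for keyword, volume in volumes.items():
    print(f"{keyword} -> {volume_tier(volume)}")
```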

Step 3: Intent Classification

For each pattern, verify intent consistency:

| Intent Type | Template Implications | CTA Strategy |
|---|---|---|
| Informational | Data-heavy, educational content | Newsletter, related content |
| Commercial investigation | Comparison tables, pros/cons | Free trial, demo |
| Transactional | Pricing, availability, features | Buy now, sign up |
| Navigational | Brand-specific, direct answer | Product page link |

Data Pipeline Architecture

Pipeline Design

[Data Source] → [Extraction] → [Transformation] → [Enrichment] → [Validation] → [Template Population] → [Quality Check] → [Publish]

Data Quality Gates

Every record must pass these gates before page generation:

| Gate | Check | Failure Action |
|---|---|---|
| Completeness | All required fields populated | Skip page, log for manual review |
| Accuracy | Data matches source, no staleness > 90 days | Flag for refresh |
| Uniqueness | No duplicate records | Merge or deduplicate |
| Minimum richness | Page will have > 300 words of unique content | Skip or enrich |
| Legal compliance | Data usage rights verified | Block publication |
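A sketch of how the gates might run against a single record. The schema (`name`, `city`, `price`, `last_verified`, `unique_word_count`, `usage_rights_verified`) is a hypothetical example, and the duplicate key is assumed to be name plus city.

```python
from datetime import date, timedelta

REQUIRED_FIELDS = ("name", "city", "price")  # hypothetical schema

def failed_gates(record: dict, today: date, seen_keys: set) -> list[str]:
    """Return the names of the gates this record fails (empty list = publishable)."""
    failures = []
    if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
        failures.append("completeness")
    if today - record["last_verified"] > timedelta(days=90):
        failures.append("accuracy")
    key = (record.get("name"), record.get("city"))
    if key in seen_keys:
        failures.append("uniqueness")
    seen_keys.add(key)
    if record.get("unique_word_count", 0) <= 300:
        failures.append("minimum richness")
    if not record.get("usage_rights_verified", False):
        failures.append("legal compliance")
    return failures

record = {
    "name": "Hub A", "city": "Austin", "price": None,
    "last_verified": date(2025, 1, 15),
    "unique_word_count": 180, "usage_rights_verified": True,
}
print(failed_gates(record, date(2025, 6, 1), set()))
# ['completeness', 'accuracy', 'minimum richness']
```

Returning every failed gate rather than stopping at the first makes the manual-review log more useful.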

Update Cadence

| Data Type | Recommended Update Frequency | Staleness Penalty |
|---|---|---|
| Pricing data | Weekly | High (users notice immediately) |
| Company/product data | Monthly | Medium |
| Statistical data | Quarterly | Low if year-tagged |
| Glossary/educational | Semi-annually | Very low |
| Location data | Monthly | Medium (closures, address changes) |
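One way to turn the cadence table into a refresh check. The day counts are an assumed translation of "weekly", "monthly", and so on; the data-type keys are hypothetical names.

```python
from datetime import date, timedelta

# Assumed day-count equivalents of the cadences in the table.
MAX_AGE_DAYS = {
    "pricing": 7,
    "company_product": 30,
    "statistical": 91,
    "glossary": 182,
    "location": 30,
}

def is_stale(data_type: str, last_refreshed: date, today: date) -> bool:
    """True when a record is older than the cadence for its data type."""
    return today - last_refreshed > timedelta(days=MAX_AGE_DAYS[data_type])

print(is_stale("pricing", date(2025, 3, 1), date(2025, 3, 10)))   # True (9 days old)
print(is_stale("glossary", date(2025, 3, 1), date(2025, 3, 10)))  # False
```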

Template Design System

Page Architecture

Every programmatic page must have these zones:

┌─────────────────────────────────────┐
│ Zone 1: Unique Header               │  H1 with target keyword, unique intro paragraph
├─────────────────────────────────────┤
│ Zone 2: Primary Data Section        │  The core data/content for this specific page
├─────────────────────────────────────┤
│ Zone 3: Contextual Analysis         │  Insights, comparisons, trends specific to this entity
├─────────────────────────────────────┤
│ Zone 4: Related Data                │  Adjacent data points that add depth
├─────────────────────────────────────┤
│ Zone 5: Internal Navigation         │  Related pages, breadcrumbs, category links
├─────────────────────────────────────┤
│ Zone 6: CTA                         │  Conversion element matched to intent
└─────────────────────────────────────┘

Uniqueness Requirements

Each page MUST have at least 3 of these 5 uniqueness sources:

Unique data points -- Numbers, facts, or attributes specific to this entity
Conditional content blocks -- Sections that appear/disappear based on data attributes
Calculated insights -- Derived metrics (percentages, comparisons, rankings)
Contextual recommendations -- "If X, then Y" advice blocks based on the data
User-generated content -- Reviews, comments, or community contributions
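The 3-of-5 rule can be checked mechanically before a page is rendered. A minimal sketch, assuming each page is a dict whose keys (hypothetical names) hold the content for each uniqueness source.

```python
UNIQUENESS_SOURCES = (
    "unique_data_points",
    "conditional_blocks",
    "calculated_insights",
    "contextual_recommendations",
    "user_generated_content",
)

def meets_uniqueness(page: dict, minimum: int = 3) -> bool:
    """True when at least `minimum` uniqueness sources are non-empty."""
    present = [source for source in UNIQUENESS_SOURCES if page.get(source)]
    return len(present) >= minimum

page = {
    "unique_data_points": {"median_salary": 142000},
    "conditional_blocks": ["remote_share"],
    "calculated_insights": [],          # empty, so it does not count
    "user_generated_content": ["review-1"],
}
print(meets_uniqueness(page))  # True (3 of 5 present)
```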

URL Structure

Always use subfolders for pSEO, never subdomains.

| Pattern | URL Template | Example |
|---|---|---|
| Location | /[service]/[city]/ | /coworking/austin/ |
| Comparison | /compare/[a]-vs-[b]/ | /compare/notion-vs-confluence/ |
| Integration | /integrations/[partner]/ | /integrations/slack/ |
| Glossary | /glossary/[term]/ | /glossary/churn-rate/ |
| Persona | /[product]-for-[audience]/ | /crm-for-real-estate/ |
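A sketch of building these URLs from raw entity names. The `slugify` helper is a simple illustrative implementation, not a library function; it lowercases, strips accents, and collapses everything else to hyphens.

```python
import re
import unicodedata

def slugify(value: str) -> str:
    """Lowercase ASCII slug: 'Real Estate' -> 'real-estate'."""
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")

def location_url(service: str, city: str) -> str:
    return f"/{slugify(service)}/{slugify(city)}/"

def comparison_url(a: str, b: str) -> str:
    return f"/compare/{slugify(a)}-vs-{slugify(b)}/"

def persona_url(product: str, audience: str) -> str:
    return f"/{slugify(product)}-for-{slugify(audience)}/"

print(location_url("Coworking", "Austin"))      # /coworking/austin/
print(comparison_url("Notion", "Confluence"))   # /compare/notion-vs-confluence/
print(persona_url("CRM", "Real Estate"))        # /crm-for-real-estate/
```

Two different entity names can collapse to the same slug, so check generated URLs for collisions before publishing.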

Quality Control Framework

Pre-Publication QA Checklist

Content Quality:

Each page has > 300 words of unique content (not counting shared template elements)
H1 is unique and contains the target keyword
Meta title is unique (< 60 chars) and meta description is unique (< 155 chars)
No broken data references (no empty fields rendered as "N/A" or left blank)
At least 2 conditional content blocks triggered per page
No duplicate pages targeting the same keyword

Technical SEO:

Canonical tag points to self
Hreflang tags if multilingual
Schema markup renders without errors
Page loads in < 3 seconds
Mobile responsive

Internal Linking:

Breadcrumb trail is complete
3-5 related pages linked contextually
Hub page links to this page
No orphan pages in the set

Thin Content Detection

Run this check against every generated page:

| Signal | Threshold | Action |
|---|---|---|
| Unique word count | < 200 unique words | Block publication |
| Content similarity to another page in set | > 80% Jaccard similarity | Merge or differentiate |
| Data fields populated | < 60% of template fields | Skip or enrich |
| User time-on-page (post-launch) | < 15 seconds average | Review and improve |
| Bounce rate (post-launch) | > 85% | Review intent match |
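A minimal sketch of the similarity check, computing Jaccard similarity over word sets. Using single words rather than multi-word shingles is an assumption made here for brevity; strip shared template text before comparing so the score reflects page-specific content.

```python
import re

def word_set(text: str) -> set[str]:
    """Lowercased set of the words in a page body."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def jaccard(a: str, b: str) -> float:
    """Size of the intersection over size of the union of the two word sets."""
    set_a, set_b = word_set(a), word_set(b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

austin = "best coworking spaces in austin with fast wifi and free coffee"
denver = "best coworking spaces in denver with fast wifi and free coffee"
score = jaccard(austin, denver)
print(round(score, 3))  # 0.833 (10 shared words out of 12 distinct)
print(score > 0.80)     # True: merge or differentiate
```

This pair is the city-name-swap case: only one word differs, so the pages exceed the 80% threshold.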

Internal Linking Architecture

Hub-and-Spoke Model

                ┌─────────┐
                │  HUB    │  /coworking/
                │  PAGE   │  (ranks for "coworking spaces")
                └────┬────┘
      ┌──────────────┼──────────────┐
 ┌────┴────┐    ┌────┴────┐    ┌────┴────┐
 │ SPOKE 1 │    │ SPOKE 2 │    │ SPOKE 3 │
 │ /austin/│    │ /denver/│    │ /seattle/│
 └────┬────┘    └────┬────┘    └────┬────┘
      │              │              │
 Cross-links between related spokes

Linking rules:

Hub links DOWN to every spoke (or top 50 spokes if > 200 pages)
Every spoke links UP to the hub
Spokes link ACROSS to 3-5 related spokes (geographic proximity, thematic similarity)
Deep pages link UP to their spoke AND the hub
Cross-silo links only when contextually genuine
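Two of the rules above (no orphans, every spoke links up to the hub) can be verified from a link map. A sketch, assuming `links` maps each page URL to the set of internal URLs it links to; the URLs are sample data.

```python
def linking_violations(links: dict[str, set[str]], hub: str) -> dict[str, set[str]]:
    """Find pages nothing links to, and spokes that do not link up to the hub."""
    all_targets = set().union(*links.values())
    orphans = set(links) - all_targets - {hub}
    missing_uplink = {
        page for page, targets in links.items()
        if page != hub and hub not in targets
    }
    return {"orphans": orphans, "missing_uplink": missing_uplink}

links = {
    "/coworking/": {"/coworking/austin/", "/coworking/denver/"},
    "/coworking/austin/": {"/coworking/", "/coworking/denver/"},
    "/coworking/denver/": {"/coworking/austin/"},
    "/coworking/seattle/": {"/coworking/"},
}
print(linking_violations(links, "/coworking/"))
# {'orphans': {'/coworking/seattle/'}, 'missing_uplink': {'/coworking/denver/'}}
```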

Pagination for Large Sets

If a hub page has > 50 spokes, implement paginated sub-hubs:

/coworking/ → Top cities + browse by state
/coworking/california/ → All California cities
/coworking/california/page/2/ → Paginated if > 25 cities
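A sketch of generating the paginated sub-hub URLs, assuming 25 cities per page as in the example above. Function names are hypothetical.

```python
def paginate(items: list, per_page: int = 25) -> list[list]:
    """Split a list of spokes into pages of at most `per_page` items."""
    return [items[i:i + per_page] for i in range(0, len(items), per_page)]

def sub_hub_urls(hub: str, state_slug: str, city_count: int,
                 per_page: int = 25) -> list[str]:
    """First page lives at the sub-hub root; later pages at /page/N/."""
    page_count = max(1, -(-city_count // per_page))  # ceiling division
    urls = [f"{hub}{state_slug}/"]
    urls += [f"{hub}{state_slug}/page/{n}/" for n in range(2, page_count + 1)]
    return urls

print(sub_hub_urls("/coworking/", "california", 60))
# ['/coworking/california/', '/coworking/california/page/2/', '/coworking/california/page/3/']
```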

Indexation Strategy

Crawl Budget Management

| Page Set Size | Strategy |
|---|---|
| < 500 pages | Single XML sitemap, submit all |
| 500-5,000 | Segmented sitemaps by category |
| 5,000-50,000 | Segmented sitemaps + priority scoring + IndexNow |
| 50,000+ | Programmatic sitemap generation + crawl budget monitoring + strategic noindex |
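A sketch of segmenting sitemaps by category using only the standard library. Grouping by the first path segment is an assumption about how categories map to URLs; the 50,000-URL cap per file comes from the sitemaps.org protocol. The example.com URLs are placeholders.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file limit in the sitemaps.org protocol

def segment_by_category(urls: list[str]) -> dict[str, list[str]]:
    """Group URLs by first path segment, e.g. 'coworking' or 'compare'."""
    groups = defaultdict(list)
    for url in urls:
        category = urlsplit(url).path.strip("/").split("/")[0] or "root"
        groups[category].append(url)
    return dict(groups)

def build_sitemap(urls: list[str]) -> str:
    """Serialize one <urlset> document for a single segment."""
    if len(urls) > MAX_URLS_PER_SITEMAP:
        raise ValueError("split this segment into multiple sitemap files")
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

urls = [
    "https://example.com/coworking/austin/",
    "https://example.com/coworking/denver/",
    "https://example.com/compare/notion-vs-confluence/",
]
for category, members in segment_by_category(urls).items():
    print(category, len(members))
    print(build_sitemap(members))
```

Each segment would be written to its own file and listed in a sitemap index.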

Indexation Priority

| Priority | Pages | Action |
|---|---|---|
| P0 | Hub pages | Submit immediately, internal link from homepage |
| P1 | Head-volume spokes (top 10%) | Submit in first sitemap batch |
| P2 | Torso-volume spokes | Submit in second batch, 1-2 weeks later |
| P3 | Long-tail spokes | Submit gradually over 4-6 weeks |
| P4 | Zero-volume pages | Noindex unless data is uniquely valuable |

IndexNow Integration

For large-scale updates, use IndexNow to notify search engines immediately:

POST https://api.indexnow.org/indexnow
{
  "host": "yoursite.com",
  "key": "your-api-key",
  "urlList": ["https://yoursite.com/page1", "https://yoursite.com/page2"]
}
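A sketch of preparing that request with Python's standard library. The helper name is hypothetical and the host, key, and URLs are placeholders. As far as the IndexNow documentation describes it, the key must also be served as a text file on your host, and a single POST accepts up to 10,000 URLs; verify both against the current spec before relying on them.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"
MAX_URLS_PER_POST = 10_000  # limit stated in the IndexNow documentation

def build_indexnow_request(host: str, key: str,
                           urls: list[str]) -> urllib.request.Request:
    """Prepare (but do not send) an IndexNow bulk submission."""
    if len(urls) > MAX_URLS_PER_POST:
        raise ValueError("submit in batches of 10,000 URLs or fewer")
    payload = {"host": host, "key": key, "urlList": urls}
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )

request = build_indexnow_request(
    "example.com", "your-api-key",
    ["https://example.com/coworking/austin/"],
)
print(request.get_method(), request.full_url)
# To send: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the payload easy to log and test.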

Launch Sequence

Phase 1: Pilot (Week 1-2)

Deploy 20-50 pages from head-volume tier
Submit sitemap with pilot pages only
Monitor indexation rate daily
Check for crawl errors in Search Console

Phase 2: Scale (Week 3-6)

Deploy the remaining head-volume pages and the torso-volume pages in batches of 100-500
Add cross-links between deployed pages
Monitor thin content warnings
Track impressions in Search Console

Phase 3: Long-Tail (Week 7-12)

Deploy long-tail pages
Noindex zero-volume pages (keep them crawlable but not indexed)
Begin link acquisition outreach for hub pages

Phase 4: Optimization (Ongoing)

A/B test template variations on head-volume pages
Refresh stale data on the update cadence set for each data type (see Update Cadence)
Add conditional content blocks based on engagement data
Monitor for keyword cannibalization across the set

Post-Launch Optimization

Metrics Dashboard

| Metric | Frequency | Target |
|---|---|---|
| Indexation rate | Weekly | > 90% of submitted pages indexed within 60 days |
| Organic impressions | Weekly | Trending up month-over-month |
| Average position (by tier) | Bi-weekly | Head pages: top 10; Torso: top 30 |
| Click-through rate | Monthly | > 3% for head pages |
| Bounce rate | Monthly | < 70% |
| Conversion rate | Monthly | > 1% for transactional intent |
| Pages per session | Monthly | > 1.5 |

Optimization Playbook

| Signal | Diagnosis | Action |
|---|---|---|
| Indexed but not ranking | Content quality or authority gap | Enrich content, build links to hub |
| Ranking but low CTR | Title/description not compelling | A/B test meta titles |
| Ranking but high bounce | Intent mismatch or thin content | Audit against search intent, add data |
| Deindexed after initial indexing | Thin content penalty | Improve uniqueness, reduce similarity |
| Crawled but not indexed | Quality threshold not met | Add more unique content per page |

Anti-Patterns and Penalty Avoidance

| Anti-Pattern | Why It Fails | Prevention |
|---|---|---|
| City-name swapping | Same content + different city = doorway page penalty | Each location page needs unique local data |
| Keyword stuffing in templates | Unnatural density triggers spam filters | Keep keyword density 1-2%, write naturally |
| Generating pages for zero-demand queries | Wastes crawl budget, signals low quality | Validate demand before generating |
| No internal links to pSEO pages | Orphan pages get deprioritized | Connect every page to the hub-spoke structure |
| Stale data never refreshed | Users lose trust, Google notices | Set update cadence per data type |
| All pages identical structure | Lack of variation signals automation to Google | Use 3-5 template variants |

Decision Matrix: Build vs Skip

Score each dimension 1-5, then apply the threshold.

| Dimension | Weight | 1 (Skip) | 5 (Build) |
|---|---|---|---|
| Search demand | 30% | < 1K aggregate monthly | > 50K aggregate monthly |
| Data quality | 25% | Public/scraped, easily replicated | Proprietary, defensible |
| Competitive gap | 20% | DR gap > 40, strong incumbents | DR gap < 15, weak/no incumbents |
| Template feasibility | 15% | Each page needs unique editorial | Clean template fits all variations |
| Business alignment | 10% | No conversion path from these pages | Direct path to core product |

Scoring guide:

4.0+ weighted average: Build immediately
3.0-3.9: Build if resources allow, validate with pilot first
2.0-2.9: Invest in data quality or authority first
< 2.0: Do not build
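The weighted average can be computed as below. The dimension keys are hypothetical names for the rows in the matrix. Treating each band as open at the top (so a score such as 3.95 falls in the 3.0 band) is an assumption, since the guide leaves those gaps undefined.

```python
WEIGHTS = {
    "search_demand": 0.30,
    "data_quality": 0.25,
    "competitive_gap": 0.20,
    "template_feasibility": 0.15,
    "business_alignment": 0.10,
}

def weighted_score(scores: dict[str, int]) -> float:
    """Weighted average of the 1-5 scores, using the weights in the matrix."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)

def recommendation(score: float) -> str:
    if score >= 4.0:
        return "Build immediately"
    if score >= 3.0:
        return "Build if resources allow, validate with pilot first"
    if score >= 2.0:
        return "Invest in data quality or authority first"
    return "Do not build"

scores = {
    "search_demand": 4,
    "data_quality": 4,
    "competitive_gap": 3,
    "template_feasibility": 4,
    "business_alignment": 3,
}
total = weighted_score(scores)
print(round(total, 2), recommendation(total))
# 3.7 Build if resources allow, validate with pilot first
```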

Output Artifacts

| Artifact | Format | Description |
|---|---|---|
| Opportunity Analysis | Markdown table | Keyword patterns x volume x data source x difficulty x business alignment |
| Playbook Recommendation | Decision matrix | If/then mapping with rationale and real-world examples |
| Page Template Specification | Annotated wireframe (markdown) | URL pattern, zone structure, uniqueness sources, conditional logic |
| Data Pipeline Spec | Flow diagram (text) | Source > extraction > transformation > validation > publication |
| Quality Scorecard | Checklist + thresholds | Pre-publication QA gates with pass/fail criteria |
| Indexation Plan | Phased timeline | Priority tiers, sitemap structure, crawl budget allocation |
| Post-Launch Dashboard | Metric table | KPIs, targets, review cadence, optimization triggers |

Related Skills

seo-audit -- Run after pSEO pages are live to diagnose indexation issues, thin content warnings, or ranking problems across the page set.
schema-markup -- Add structured data to pSEO templates (Product, FAQ, LocalBusiness) for rich snippet eligibility at scale.
site-architecture -- Plan hub-and-spoke structure and crawl budget management for large pSEO deployments (500+ pages).
competitor-alternatives -- Use the Comparisons playbook when building "[X] vs [Y]" pages; competitor-alternatives has dedicated comparison page frameworks.
content-creator -- Use when individual pages in the set need editorial-quality unique content beyond template generation.
