enhance-docs
Analyze documentation for readability, structure, and RAG optimization.
Parse Arguments
const args = '$ARGUMENTS'.split(' ').filter(Boolean);
const targetPath = args.find(a => !a.startsWith('--')) || '.';
const fix = args.includes('--fix');
const aiMode = args.includes('--ai');
Documentation Locations
| Type | Location | Purpose |
|---|---|---|
| User docs | docs/*.md, README.md | Human-readable guides |
| Agent docs | agent-docs/*.md | AI reference material |
| Project memory | CLAUDE.md, AGENTS.md | AI context/instructions |
Optimization Modes
AI-Only Mode (--ai)
For agent-docs and RAG-optimized documentation:
- Aggressive token reduction
- Dense information packing
- Self-contained sections for retrieval
- Optimal chunking boundaries
Both Mode (--both, default)
For user-facing documentation:
- Balance readability with AI-friendliness
- Clear structure for both humans and retrievers
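The argument parsing earlier only reads `--ai`, so the default mode is implied rather than stated in code. A minimal sketch of mode resolution follows; `resolveMode` is a hypothetical helper, and letting `--ai` win when both flags are passed is an assumption the source does not confirm.

```javascript
// Hypothetical helper: resolve the optimization mode from CLI-style flags.
// Assumption: --ai takes precedence if both --ai and --both are passed.
function resolveMode(args) {
  if (args.includes('--ai')) return 'ai';
  // Default mode, with or without an explicit --both flag.
  return 'both';
}
```

With this sketch, `resolveMode(['docs', '--fix'])` falls back to the default mode without needing an explicit `--both`.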
Workflow
- Discover - Find all .md files
- Parse - Extract structure and content
- Check - Run pattern checks based on mode
- Report - Generate markdown output
- Fix - Apply auto-fixes if --fix
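The Discover step can be sketched as a recursive directory walk. This is only an illustration: `discoverMarkdown` is a hypothetical name, and skipping `node_modules` and dot-directories is an assumption, not something the workflow specifies.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical sketch of the Discover step: collect all .md files under dir.
// Assumption: node_modules and hidden directories are not worth scanning.
function discoverMarkdown(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      if (entry.name === 'node_modules' || entry.name.startsWith('.')) continue;
      found.push(...discoverMarkdown(full));
    } else if (entry.isFile() && entry.name.toLowerCase().endsWith('.md')) {
      found.push(full);
    }
  }
  // Sort for stable, reproducible report ordering.
  return found.sort();
}
```

The later steps (Parse, Check, Report, Fix) would then run once per returned path.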
Detection Patterns
- Link Validation (HIGH)
  - Broken anchor links (link text pointing to an anchor that does not exist)
  - Links to non-existent files
  - Malformed link syntax
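As an illustration of the broken-anchor check, the sketch below collects heading slugs and flags in-page links that point to none of them. `slugify` and `findBrokenAnchors` are hypothetical names, and the slug rule (lowercase, punctuation stripped, spaces to hyphens) is an assumption modeled on common GitHub-style rendering; other renderers may differ. It also does not skip fenced code blocks.

```javascript
// Hypothetical slug rule; real renderers may generate anchors differently.
function slugify(heading) {
  return heading
    .trim()
    .toLowerCase()
    .replace(/[^\w\s-]/g, '')
    .replace(/\s+/g, '-');
}

// Return { line, anchor } for each in-page link with no matching heading.
function findBrokenAnchors(markdown) {
  const lines = markdown.split('\n');
  const slugs = new Set();
  for (const line of lines) {
    const heading = /^#{1,6}\s+(.*)$/.exec(line);
    if (heading) slugs.add(slugify(heading[1]));
  }
  const issues = [];
  lines.forEach((line, i) => {
    const linkRe = /\[[^\]]*\]\(#([^)\s]+)\)/g;
    let match;
    while ((match = linkRe.exec(line)) !== null) {
      if (!slugs.has(match[1])) issues.push({ line: i + 1, anchor: match[1] });
    }
  });
  return issues;
}
```

Checking links to non-existent files would follow the same shape, with a filesystem lookup in place of the slug set.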
- Structure Validation (HIGH)
Heading hierarchy:
- No jumps (H1 → H3 without H2)
- Single H1 per document
- Code blocks with language tags
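The heading rules above can be sketched as a single pass over the lines. `checkHeadings` is a hypothetical name, and the sketch assumes ATX-style (`#`) headings only; it skips fenced code so that shell comments are not mistaken for headings.

```javascript
// Hypothetical sketch: flag heading-level jumps and repeated H1 headings.
// Assumption: ATX headings only; setext (underlined) headings are ignored.
function checkHeadings(markdown) {
  const FENCE = '`'.repeat(3);
  const issues = [];
  let prevLevel = 0;
  let h1Count = 0;
  let inFence = false;
  markdown.split('\n').forEach((line, i) => {
    const trimmed = line.trim();
    if (trimmed.startsWith(FENCE) || trimmed.startsWith('~~~')) {
      inFence = !inFence;
      return;
    }
    if (inFence) return;
    const match = /^(#{1,6})\s+\S/.exec(line);
    if (!match) return;
    const level = match[1].length;
    if (level === 1 && ++h1Count > 1) {
      issues.push({ line: i + 1, issue: 'multiple H1' });
    }
    if (prevLevel > 0 && level > prevLevel + 1) {
      issues.push({ line: i + 1, issue: `heading jump H${prevLevel} -> H${level}` });
    }
    prevLevel = level;
  });
  return issues;
}
```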
Position-aware content (based on "lost in the middle" research):
- Critical info at START or END of document
- Supporting details in MIDDLE
- Flag important content buried in middle sections
Recommended structure:
- Overview/Purpose (START - high attention)
- Quick Start / TL;DR
- Detailed Content
- Reference / API
- Summary / Key Points (END - high attention)
- Token Efficiency (HIGH - AI Mode)
Token estimation: characters / 4 or words * 1.3
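Both heuristics fit in a few lines. The sketch below returns the two estimates side by side; `estimateTokens` is a hypothetical name, and rounding to the nearest integer is an assumption. Neither figure is an exact tokenizer count.

```javascript
// Hypothetical sketch of the two rough estimates: characters / 4 and words * 1.3.
function estimateTokens(text) {
  const words = text.split(/\s+/).filter(Boolean).length;
  return {
    byChars: Math.round(text.length / 4),
    byWords: Math.round(words * 1.3),
  };
}
```

For ordinary English prose the two numbers tend to land close together; a large gap usually points at code, URLs, or tables, where word counts are less meaningful.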
Unnecessary prose:
- "In this document..."
- "As you can see..."
- "Let's explore..."
- "It's important to note that..."
Verbose phrases:
| Verbose | Concise |
|---|---|
| "in order to" | "to" |
| "due to the fact that" | "because" |
| "has the ability to" | "can" |
| "at this point in time" | "now" |
| "for the purpose of" | "for" |
| "in the event that" | "if" |
Target: ~1500 tokens for project memory files, flexible for reference docs.
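The verbose-phrase table above maps directly onto a replacement pass. In this sketch, `VERBOSE_PHRASES` and `condense` are hypothetical names, and preserving sentence-initial capitalization is an added assumption. It does not handle inflected forms such as "have the ability to".

```javascript
// Phrase pairs taken from the verbose/concise table.
const VERBOSE_PHRASES = [
  ['in order to', 'to'],
  ['due to the fact that', 'because'],
  ['has the ability to', 'can'],
  ['at this point in time', 'now'],
  ['for the purpose of', 'for'],
  ['in the event that', 'if'],
];

// Hypothetical sketch: replace each verbose phrase, case-insensitively.
function condense(text) {
  let out = text;
  for (const [verbose, concise] of VERBOSE_PHRASES) {
    const re = new RegExp(`\\b${verbose}\\b`, 'gi');
    out = out.replace(re, (match) =>
      // Keep a capital letter when the phrase starts a sentence.
      match[0] === match[0].toUpperCase()
        ? concise[0].toUpperCase() + concise.slice(1)
        : concise
    );
  }
  return out;
}
```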
- RAG Optimization (MEDIUM - AI Mode)
Chunk size guidelines:
| Size | Issue |
|---|---|
| >1000 tokens | Too long, split into subtopics |
| <50 tokens | Too short, merge with related content |
| 200-500 tokens | Optimal for retrieval |
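A small classifier makes the thresholds concrete. It reads the first row as "more than 1000 tokens". The guidelines do not name the ranges between the listed bands (50-199 and 501-1000), so the `'acceptable'` label is an assumption, as is the function name `classifyChunk`.

```javascript
// Hypothetical sketch: map a section's token count to a chunk-size verdict.
// Assumption: sizes outside the listed bands are 'acceptable', not flagged.
function classifyChunk(tokens) {
  if (tokens > 1000) return 'too-long';
  if (tokens < 50) return 'too-short';
  if (tokens >= 200 && tokens <= 500) return 'optimal';
  return 'acceptable';
}
```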
Semantic boundaries:
- Single topic per section
- Self-contained sections (avoid "It", "This" at section start)
- Clear section titles that describe content
Context anchors:
Bad - ambiguous start
Configuration
It requires several settings...
Good - self-contained
Configuration
The plugin configuration requires several settings...
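The ambiguous-start pattern shown above is easy to detect mechanically. The sketch below checks the first non-blank line after each heading; `findAmbiguousStarts` is a hypothetical name, and only the two openers named in this document ("It", "This") are matched. It assumes ATX-style (`#`) headings.

```javascript
// Openers named in the guideline; extend the list as needed.
const AMBIGUOUS_OPENERS = /^(it|this)\b/i;

// Hypothetical sketch: flag sections whose first line opens with a pronoun
// that only makes sense with the previous section in view.
function findAmbiguousStarts(markdown) {
  const lines = markdown.split('\n');
  const issues = [];
  for (let i = 0; i < lines.length; i++) {
    if (!/^#{1,6}\s+\S/.test(lines[i])) continue;
    // Skip blank lines between the heading and its first content line.
    let j = i + 1;
    while (j < lines.length && lines[j].trim() === '') j++;
    if (j < lines.length && AMBIGUOUS_OPENERS.test(lines[j].trim())) {
      issues.push({ line: j + 1, heading: lines[i].replace(/^#+\s+/, '') });
    }
  }
  return issues;
}
```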
- Information Density (MEDIUM - AI Mode)
Prefer tables over prose:
Bad - verbose
The function accepts a path parameter which is required, a limit parameter which defaults to 10, and an optional format parameter.
Good - dense
| Param | Required | Default | Description |
|---|---|---|---|
| path | Yes | - | File path |
| limit | No | 10 | Max results |
| format | No | json | Output format |
Prefer lists over paragraphs for sequential items.
Use code blocks for examples, commands, configurations.
- Cross-Reference Quality (MEDIUM)
  - Internal links should use relative paths
  - External links should be stable (avoid commit hashes)
  - Reference sections should point to canonical sources
- Balance Suggestions (MEDIUM - Both Mode)
  - Missing section headers in long content (>500 words without heading)
  - Important information buried late in document
  - Missing TL;DR or summary for long documents
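The first check in this list (more than 500 words without a heading) can be sketched by counting words between headings. `findLongRuns` is a hypothetical name, and the sketch assumes ATX-style (`#`) headings and counts words in code blocks and tables like any other text.

```javascript
// Hypothetical sketch: report runs of text longer than maxWords
// that contain no heading. Each issue points at the run's first line.
function findLongRuns(markdown, maxWords = 500) {
  const issues = [];
  let runStart = 1;
  let words = 0;
  const flush = () => {
    if (words > maxWords) issues.push({ line: runStart, words });
  };
  markdown.split('\n').forEach((line, i) => {
    if (/^#{1,6}\s+\S/.test(line)) {
      flush();
      runStart = i + 2; // the run begins on the line after the heading
      words = 0;
    } else {
      words += line.split(/\s+/).filter(Boolean).length;
    }
  });
  flush();
  return issues;
}
```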
Auto-Fixes
| Issue | Fix |
|---|---|
| Inconsistent headings | H1 → H3 becomes H1 → H2 |
| Verbose phrases | Replace with concise alternatives |
| Missing code language | Add based on content detection |
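The heading auto-fix can be sketched as a level clamp: any heading that jumps more than one level below its predecessor is pulled up to exactly one level below. `fixHeadingJumps` is a hypothetical name. The sketch tracks the corrected level, so nested headings after a fix shift with it, and it does not skip fenced code blocks.

```javascript
// Hypothetical sketch of the heading auto-fix: H1 -> H3 becomes H1 -> H2.
// Assumption: ATX headings only; lines inside code fences are not excluded.
function fixHeadingJumps(markdown) {
  let prevLevel = 0;
  return markdown
    .split('\n')
    .map((line) => {
      const match = /^(#{1,6})(\s+\S.*)$/.exec(line);
      if (!match) return line;
      let level = match[1].length;
      if (prevLevel > 0 && level > prevLevel + 1) level = prevLevel + 1;
      prevLevel = level;
      return '#'.repeat(level) + match[2];
    })
    .join('\n');
}
```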
Output Format
Documentation Analysis: {name}
File: {path} Mode: {AI-only | Both} Tokens: ~{count}
| Certainty | Count |
|---|---|
| HIGH | {n} |
| MEDIUM | {n} |
Link Issues
| Line | Issue | Fix | Certainty |
Structure Issues
| Line | Issue | Fix | Certainty |
Efficiency Issues [AI mode]
| Line | Issue | Fix | Certainty |
RAG Issues [AI mode]
| Line | Issue | Fix | Certainty |
Pattern Statistics
| Category | Patterns | Mode | Certainty |
|---|---|---|---|
| Links | 3 | shared | HIGH |
| Structure | 4 | shared | HIGH |
| Token Efficiency | 3 | ai | HIGH |
| RAG Optimization | 3 | ai | MEDIUM |
| Information Density | 2 | ai | MEDIUM |
| Cross-Reference | 2 | shared | MEDIUM |
| Balance | 3 | both | MEDIUM |
| Total | 20 | | |
RAG Chunking
<bad_example>
Installation
[2000+ tokens of mixed content covering install, config, and usage]
</bad_example> <good_example>
Installation
[400 tokens - installation only]
Configuration
[300 tokens - config only]
Usage
[400 tokens - usage only]
</good_example>
Position-Aware Content
<bad_example>
Introduction
[Long background...]
History
[More context...]
Critical Setup Steps
[Important info buried in middle]
</bad_example> <good_example>
Quick Start (Critical)
[Important setup steps at START]
Background
[Supporting context in middle]
Reference
[Details...]
Key Reminders
[Critical points repeated at END]
</good_example>
Tables vs Prose
<bad_example>
The API accepts three parameters. The first is query which is required.
The second is limit which defaults to 10. The third is format.
</bad_example> <good_example>
| Param | Required | Default |
|---|---|---|
| query | Yes | - |
| limit | No | 10 |
| format | No | json |
</good_example>
References
- agent-docs/CONTEXT-OPTIMIZATION-REFERENCE.md - Token budgeting, position awareness, chunking
- agent-docs/PROMPT-ENGINEERING-REFERENCE.md - Structure, information density
Constraints
- Auto-fix only HIGH certainty issues
- Preserve original tone and style
- Balance AI optimization with human readability (default mode)
- Don't remove content, only restructure or condense