Agent Skill Creator Standard
Priority: P0 (CRITICAL)
Strict guidelines for High-Density Agent Skills. Maximize info/token ratio.
Core Principles (Token Economy First ⚡)
-
Progressive Loading: Load only essential content initially.
-
Lazy References: Move detailed examples to references/ .
-
Imperative Compression: Use verbs, abbreviations, bullet points.
-
Context Limits: Cursor(~100k), Claude(~200k), Windsurf(~32k).
Three-Level Loading System
-
Metadata: Triggers → AGENTS.md index (Proactive Activation)
-
SKILL.md: Body < 100 lines → Core guidelines (When triggered)
-
Resources: references/ , scripts/ , assets/ → Deep knowledge (On-demand)
Writing Rules
-
Imperative: Start with verbs. "Use BLoC" not "You should use BLoC".
-
Token Economy: Skip articles. Use standard abbreviations. Bullets > paragraphs.
-
Structure:
-
Mandatory Frontmatter (YAML: name, description, metadata labels & triggers).
-
Priority: P0 (Critical), P1 (Standard), P2 (Optional).
-
Guidelines: Imperative Do's.
-
Anti-Patterns: Strict format Don'ts.
-
References: Links to lazy-loaded files.
Strict Size Limits
Element Limit Action if Exceeded
SKILL.md total 100 lines Extract to references/
Inline code block 10 lines Extract to references/
Anti-pattern item 15 words Compress to imperative
Tables 8 rows Extract to references/
Strict Formatting Rules
-
Anti-Patterns: No X: Do Y[, not Z]. [Context <= 15 words]
-
Example: No Logic in Builder: Perform calculations in BLoC, not UI.
-
No Redundancy: Do not repeat frontmatter descriptions.
-
Oversized Skills: If SKILL.md >100 lines, extract step-by-step guides and complex scenarios to references/ .
-
Nested Formatting: Avoid Bold:
More Bold``.
Test, Measure & Iterate
After writing a skill draft, validate it before shipping:
-
Write eval cases: Create evals/evals.json with 2–3 realistic prompts (see testing.md).
-
Design trigger queries: Generate 8–10 should-trigger and 8–10 should-not-trigger queries to measure description accuracy.
-
Optimize description: Make it "pushy" — list explicit trigger contexts, not just what it does.
-
Catch regressions: Snapshot the skill before editing; compare before/after outputs for existing test cases.
-
Iterate: Re-run evals after each change; stop when trigger rate ≥ 80% on held-out queries.
See Testing, Trigger Rate & Regression Guide for eval schema, query design rules, and regression protocol.
Resources & Deep Knowledge
-
Resource Organization & Directory Structure
-
Skill Creation Lifecycle
-
Anti-Patterns Details
-
Creation Template
-
Testing, Trigger Rate & Regression Guide