rate-skill

Evaluate skill quality against best practices. Use when asked to "rate this skill", "review skill quality", "check skill formatting", "is this skill good", "evaluate SKILL.md", "grade this skill", or when validating skill files before publishing.


Rate Skill

Overview

Audit SKILL.md files against quality standards from generate-skill best practices. Provides letter grade (A-F) and actionable recommendations.

Core principle: Measure skill quality objectively to improve activation reliability and context efficiency.

When to Use

Always use when:

  • Reviewing skills before publishing
  • Validating skill structure and formatting
  • Checking if skill meets quality standards
  • User asks to "rate", "grade", or "review" a skill

Useful for:

  • Skill authors validating their work
  • Maintainers reviewing PRs with new skills
  • Quality audits of skill repositories
  • Before submitting skills to marketplaces

Avoid when:

  • Evaluating non-skill documentation
  • Reviewing code (not skill definitions)
  • General code quality auditing

How It Works

  1. Read specified SKILL.md file
  2. Evaluate against quality criteria
  3. Calculate scores per category
  4. Generate letter grade (A-F)
  5. Output findings with priorities
  6. Provide actionable recommendations
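The steps above can be sketched as a small pipeline. This is an illustrative sketch, not the skill's actual implementation: the `scorers` callables (one per category, returning 0.0-1.0) are an assumed interface, and the weights mirror the Quality Criteria table.

```python
from pathlib import Path

# Weights mirror the Quality Criteria table; they sum to 100.
WEIGHTS = {
    "length": 20, "conciseness": 20, "repetitiveness": 15,
    "structure": 15, "triggers": 15, "examples": 10, "troubleshooting": 5,
}

def rate(path: str, scorers: dict) -> dict:
    """Read a SKILL.md, score each category (0.0-1.0), weight, and total."""
    text = Path(path).read_text(encoding="utf-8")
    category_points = {
        cat: round(scorers[cat](text) * weight, 1)
        for cat, weight in WEIGHTS.items()
    }
    return {"points": category_points, "total": sum(category_points.values())}
```

A perfect scorer in every category yields a total of 100, which then maps onto the A-F grading scale.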

Quality Criteria

| Category | Weight | Criteria |
|----------|--------|----------|
| Length | 20% | Under 500 lines (or progressive disclosure) |
| Conciseness | 20% | Clear, scannable, no fluff |
| Repetitiveness | 15% | No redundant content |
| Structure | 15% | Required sections present and ordered |
| Triggers | 15% | 3-5+ specific activation phrases |
| Examples | 10% | Good/Bad code comparisons |
| Troubleshooting | 5% | Common issues addressed |

Length (20%)

Scores: A: <500 or progressive disclosure | B: 500-600 | C: 600-800 | D: 800-1000 | F: >1000

Checks: Line count, reference/ directory, progressive disclosure links
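These thresholds reduce to a direct mapping. In this sketch, progressive disclosure is assumed to be detectable as a link into a reference/ directory; the real check may be richer.

```python
import re

def length_grade(text: str) -> str:
    """Grade the Length category from line count and progressive disclosure."""
    lines = text.count("\n") + 1
    # Assumption: any reference/*.md link counts as progressive disclosure.
    has_disclosure = bool(re.search(r"reference/\S+\.md", text))
    if lines < 500 or has_disclosure:
        return "A"
    if lines <= 600:
        return "B"
    if lines <= 800:
        return "C"
    if lines <= 1000:
        return "D"
    return "F"
```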

Conciseness (20%)

Scores: A: High info density, scannable | B: Mostly concise | C: Some wordiness | D: Verbose | F: Excessive

Red flags: Long paragraphs (>5 sentences), redundant explanations, flowery language
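One way to surface the long-paragraph red flag is a naive sentence count per paragraph. The sentence splitting below is deliberately crude (it splits on `.`, `!`, `?`) and is a heuristic sketch, not a definitive detector.

```python
import re

def long_paragraphs(text: str, max_sentences: int = 5) -> list:
    """Return paragraphs whose naive sentence count exceeds the limit."""
    flagged = []
    for para in re.split(r"\n\s*\n", text):
        # Crude sentence split on ., !, or ? followed by whitespace or end.
        sentences = [s for s in re.split(r"[.!?](?:\s+|$)", para.strip()) if s]
        if len(sentences) > max_sentences:
            flagged.append(para)
    return flagged
```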

Repetitiveness (15%)

Scores: A: Zero redundancy | B: 1-2 overlaps | C: 3-4 overlaps | D: 5+ overlaps | F: Heavy redundancy

Common: Format in section AND example, repeated "use when", duplicate trigger phrases
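Exact-duplicate paragraphs, the simplest kind of overlap, can be counted as below; catching near-duplicates (the same point reworded) would need fuzzier matching.

```python
import re
from collections import Counter

def duplicate_paragraphs(text: str) -> dict:
    """Count paragraphs that appear more than once, whitespace-normalized."""
    paras = [re.sub(r"\s+", " ", p).strip() for p in re.split(r"\n\s*\n", text)]
    counts = Counter(p for p in paras if p)
    return {p: n for p, n in counts.items() if n > 1}
```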

Structure (15%)

Scores: A: All required sections | B: Missing 1 optional | C: Missing 2-3 | D: Missing required | F: Severely lacking

Required: Frontmatter, Overview, When to Use, Main content, Examples (Good/Bad), Troubleshooting, Integration
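Presence of the required pieces can be checked with simple heading matches. The heading spellings below are assumptions about how a SKILL.md names its sections ("Main content" has no fixed heading, so it is omitted here).

```python
REQUIRED_SECTIONS = [
    "Overview", "When to Use", "Examples", "Troubleshooting", "Integration",
]

def missing_sections(text: str) -> list:
    """List required pieces whose heading (or frontmatter fence) is absent."""
    missing = [] if text.lstrip().startswith("---") else ["Frontmatter"]
    for section in REQUIRED_SECTIONS:
        # Substring "# Section" also matches deeper headings like "## Section".
        if f"# {section}" not in text:
            missing.append(section)
    return missing
```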

Triggers (15%)

Scores: A: 5+ specific | B: 3-4 good | C: 2 phrases | D: 1 vague | F: None

Quality: User language ("when asked to X"), specific situations, multiple contexts, concrete not abstract
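Trigger phrases typically appear in the frontmatter description as quoted user language, so a rough count can extract quoted strings. The quoting convention is an assumption; descriptions that phrase triggers without quotes would need different parsing.

```python
import re

def trigger_phrases(description: str) -> list:
    """Extract double-quoted trigger phrases from a skill description."""
    return re.findall(r'"([^"]+)"', description)

def trigger_grade(description: str) -> str:
    """Map trigger count onto the A-F scale for this category."""
    n = len(trigger_phrases(description))
    return ("A" if n >= 5 else "B" if n >= 3 else
            "C" if n == 2 else "D" if n == 1 else "F")
```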

Examples (10%)

Scores: A: 3+ with Good/Bad | B: 2 with comparisons | C: 1 comparison | D: No comparisons | F: None

Quality: Uses labeled Good/Bad tags, includes explanations, real scenarios, syntax highlighting

Troubleshooting (5%)

Scores: A: 5+ pairs | B: 3-4 pairs | C: 1-2 basic | D: Vague | F: None

Quality: Clear problem, cause identified, solution with code, explanation

Output Format

# Skill Rating: [Letter Grade]

## Summary
- **File:** path/to/SKILL.md
- **Lines:** XXX lines
- **Overall Grade:** [A/B/C/D/F] ([Score]/100)
- **Status:** [Production Ready / Needs Work / Not Ready]

## Category Scores

| Category | Score | Grade | Status |
|----------|-------|-------|--------|
| Length | XX/20 | [A-F] | [✅/⚠️/❌] |
| Conciseness | XX/20 | [A-F] | [✅/⚠️/❌] |
| Repetitiveness | XX/15 | [A-F] | [✅/⚠️/❌] |
| Structure | XX/15 | [A-F] | [✅/⚠️/❌] |
| Triggers | XX/15 | [A-F] | [✅/⚠️/❌] |
| Examples | XX/10 | [A-F] | [✅/⚠️/❌] |
| Troubleshooting | XX/5 | [A-F] | [✅/⚠️/❌] |

## Findings by Priority

### ❌ Critical Issues (Fix Before Publishing)
1. [Issue description]
   - Impact: [Why this matters]
   - Fix: [Specific action to take]

### ⚠️ Important Issues (Should Fix)
1. [Issue description]
   - Impact: [Why this matters]
   - Fix: [Specific action to take]

### 📋 Nice to Have
1. [Suggestion]
   - Benefit: [Why this helps]

## Strengths
- [What this skill does well]
- [Another strength]

## Priority Action Items
1. [Priority 1 action]
2. [Priority 2 action]
3. [Priority 3 action]

## Estimated Improvements
- Fix critical issues: +[X] points
- Address important issues: +[X] points
- Potential grade: [Current] → [Target]

Usage

Basic rating:

/rate-skill skills/example-skill/SKILL.md

Rate after changes:

# Make improvements
[edit SKILL.md]

# Re-rate
/rate-skill skills/example-skill/SKILL.md

Compare before/after:

# Rate original
/rate-skill skills/track-session/SKILL.md

# Make improvements
[condense, remove redundancy]

# Rate again to see improvement
/rate-skill skills/track-session/SKILL.md

Grading Scale

| Grade | Score | Meaning |
|-------|-------|---------|
| A | 90-100 | Excellent - Production ready |
| B | 80-89 | Good - Minor improvements recommended |
| C | 70-79 | Acceptable - Needs work before publishing |
| D | 60-69 | Poor - Significant issues to address |
| F | 0-59 | Failing - Major overhaul needed |

Status mapping:

  • A-B: Production Ready ✅
  • C: Needs Work ⚠️
  • D-F: Not Ready ❌
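The score-to-grade and grade-to-status mappings above reduce to a single function:

```python
def grade_and_status(score: int) -> tuple:
    """Map a 0-100 score to its letter grade and publishing status."""
    if score >= 90:
        grade = "A"
    elif score >= 80:
        grade = "B"
    elif score >= 70:
        grade = "C"
    elif score >= 60:
        grade = "D"
    else:
        grade = "F"
    status = {"A": "Production Ready", "B": "Production Ready",
              "C": "Needs Work"}.get(grade, "Not Ready")
    return grade, status
```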

Examples

Example 1: Rating a High-Quality Skill

Input: /rate-skill skills/track-session/SKILL.md

Output:

# Skill Rating: A

## Summary
- **File:** skills/track-session/SKILL.md
- **Lines:** 489 lines
- **Overall Grade:** A (92/100)
- **Status:** Production Ready

## Category Scores

| Category | Score | Grade | Status |
|----------|-------|-------|--------|
| Length | 20/20 | A | Pass |
| Conciseness | 18/20 | A | Pass |
| Repetitiveness | 14/15 | A | Pass |
| Structure | 15/15 | A | Pass |
| Triggers | 15/15 | A | Pass |
| Examples | 9/10 | A | Pass |
| Troubleshooting | 5/5 | A | Pass |

## Strengths
- Excellent progressive disclosure with reference/VERIFICATION.md
- 10+ diverse trigger phrases
- Strong examples with Good/Bad comparisons

## Priority Action Items
1. Consider adding one more example for edge cases (optional)

Note: High-scoring skills get a short report focused on strengths and optional improvements.

Example 2: Rating a Skill That Needs Work

Input: /rate-skill skills/problematic-skill/SKILL.md

Output:

# Skill Rating: C

## Summary
- **File:** skills/problematic-skill/SKILL.md
- **Lines:** 742 lines
- **Overall Grade:** C (74/100)
- **Status:** Needs Work

## Findings by Priority

### Critical Issues
1. **Length: 742 lines without progressive disclosure**
   - Impact: High context usage, harder to scan
   - Fix: Move detailed content to reference/ directory

2. **Only 2 trigger phrases in description**
   - Impact: Poor activation reliability
   - Fix: Add 3-5 specific user phrases and situations

### Important Issues
1. **Verbose mode descriptions (30+ lines each)**
   - Fix: Condense to 2-3 lines per mode

## Priority Action Items
1. Implement progressive disclosure (move 200+ lines to reference/)
2. Add 3+ trigger phrases to description
3. Condense verbose sections

## Estimated Improvements
- Fix critical issues: +12 points -> 86 (B)
- Address important issues: +6 points -> 92 (A)
- Potential grade: C -> A

Note: Lower-scoring skills get detailed findings with specific fixes and an improvement roadmap.

Troubleshooting

Problem: Can't find SKILL.md file

Cause: Path incorrect or file doesn't exist.

Solution:

# Verify file exists
ls skills/skill-name/SKILL.md

# Use correct path
/rate-skill skills/skill-name/SKILL.md

Problem: Rating seems too harsh

Cause: Standards are intentionally strict; quality directly affects activation reliability.

Solution:

  • Review specific findings
  • Compare to high-quality skills
  • Focus on critical issues first
  • Remember: B grade is still "good"

Problem: Grade improved but still low

Cause: Multiple categories need attention.

Solution:

  • Focus on highest-weight categories first (Length, Conciseness)
  • Fix critical issues before nice-to-haves
  • Re-rate after each major change
  • Use "Estimated Improvements" as roadmap

Problem: Don't know how to fix an issue

Cause: Fix recommendation unclear.

Solution:

  • Check generate-skill examples for patterns
  • Review high-rated skills for reference
  • Ask for specific help on that issue
  • Consult CLAUDE.md for SkillBox guidelines

Integration

This skill works with:

  • generate-skill - Use after generating to validate quality
  • Skill development workflow - Rate before committing/publishing
  • Quality control - Gate for accepting skills into repositories
  • Continuous improvement - Track quality metrics over time

Workflow:

# Create skill
/generate-skill new-feature

# Rate it
/rate-skill skills/new-feature/SKILL.md

# Fix issues
[make improvements]

# Re-rate
/rate-skill skills/new-feature/SKILL.md

# When A or B grade, publish
git add skills/new-feature/
git commit -m "Add new-feature skill"

Quality gates:

  • A-B: Merge to main ✅
  • C: Request changes ⚠️
  • D-F: Reject until improved ❌

References

Based on:

  • generate-skill best practices
  • SkillBox CLAUDE.md guidelines
  • obra/superpowers patterns
  • Vercel agent-skills standards
