# Skill Improver

Process user feedback from skill retrospectives and update skill files to improve them over time.
## When to Use

- User asks to "review skill feedback" or "improve skills based on usage"
- You notice feedback files in `.claude/feedback/`
- User mentions a skill didn't work well or missed something
- Periodic review (e.g., monthly) to incorporate learnings
## How It Works

### Step 1: Gather Feedback

Read all feedback files in `.claude/feedback/`:

```bash
ls -la .claude/feedback/retro-*.md
```
Look for patterns (a quick frequency-count sketch follows this list):

- Multiple users reporting the same missing step → add it to the skill
- Benchmarks that don't match users' contexts → add context-specific ranges
- A confusing workflow → restructure it or add clarifications
- An incomplete skill → add the missing sections
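Spotting these patterns is ultimately a judgment call, but a rough frequency count can surface candidates. A minimal sketch, assuming each retro records its complaint on an `Improvements needed:` line (the format used in the example workflow below):

```bash
#!/bin/bash
# Collect every "Improvements needed:" line across retros and rank
# them by frequency. Assumes the retro format shown in the example
# workflow below.

grep -h "Improvements needed:" .claude/feedback/retro-*.md 2>/dev/null \
  | sort \
  | uniq -c \
  | sort -rn   # most-reported issues first
```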
### Step 2: Identify High-Impact Changes

Prioritize updates based on the tiers below; a keyword-triage sketch follows the lists.

**High priority (do first):**

- Missing critical steps that users had to figure out themselves
- Incorrect benchmarks or numbers
- Confusing workflow that requires clarification
- Safety issues or errors

**Medium priority:**

- Additional examples or templates
- Better explanations of existing steps
- Alternative approaches for different contexts

**Low priority:**

- Nice-to-have additions
- Stylistic improvements
- Minor clarifications
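A first-pass triage can be scripted. A minimal sketch, where the keyword lists are illustrative assumptions rather than a fixed taxonomy; tune them to the language your feedback actually uses:

```bash
#!/bin/bash
# Bucket retro files into the priority tiers above by keyword.
# The trigger keywords are assumptions; adjust as needed.

for file in .claude/feedback/retro-*.md; do
  [ -e "$file" ] || continue   # glob matched nothing; skip
  if grep -qiE "missing step|wrong|incorrect|unsafe|error" "$file"; then
    echo "HIGH:   $file"
  elif grep -qiE "example|template|explain|alternative" "$file"; then
    echo "MEDIUM: $file"
  else
    echo "LOW:    $file"
  fi
done
```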
### Step 3: Update Skill Files

For each skill needing updates:

#### 3a. Add a "Learnings" Section

If the skill doesn't have one, add it at the end (a small append script follows the template):

```markdown
## Learnings from Use

### [Date]: [Brief description of what was learned]
- Feedback: [What users reported]
- Update: [What we changed]
- Result: [Expected improvement]
```
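The boilerplate can be scripted. A minimal sketch that appends a dated skeleton entry; the skill path is hypothetical, and the bracketed placeholders are filled in by hand afterward:

```bash
#!/bin/bash
# Append a dated Learnings entry skeleton to a skill file.
# The path is a hypothetical example.

SKILL=".claude/skills/skill-name/SKILL.md"

cat >> "$SKILL" <<EOF

### $(date +%F): [Brief description of what was learned]
- Feedback: [What users reported]
- Update: [What we changed]
- Result: [Expected improvement]
EOF
```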
#### 3b. Update Main Content

If feedback suggests core changes:

- Add missing steps to checklists
- Update benchmarks with ranges (e.g., "20-30% for B2C, 50-70% for B2B")
- Restructure the workflow if it's confusing
- Add a "Common Pitfalls" section if users keep making the same mistake
#### 3c. Version the Change

At the top of the skill, track versions (a bump script follows):

```yaml
name: skill-name
version: 1.2.0
last_updated: 2026-01-22
changelog:
  - v1.2.0 (2026-01-22): Added missing step for X based on user feedback
  - v1.1.0 (2026-01-15): Updated benchmarks for Y context
  - v1.0.0 (2026-01-01): Initial release
```
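Bumping those fields can be scripted. A sketch using GNU sed (on macOS, use `sed -i ''`), assuming the frontmatter fields look exactly like the example above:

```bash
#!/bin/bash
# Bump the version and last_updated fields in a skill's frontmatter.
# Assumes GNU sed and the exact field names shown above.

SKILL=".claude/skills/skill-name/SKILL.md"   # hypothetical path
NEW_VERSION="1.2.0"

sed -i \
  -e "s/^version: .*/version: $NEW_VERSION/" \
  -e "s/^last_updated: .*/last_updated: $(date +%F)/" \
  "$SKILL"
```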
### Step 4: Archive Processed Feedback

Move processed feedback to an archive:

```bash
mkdir -p .claude/feedback/archive
mv .claude/feedback/retro-2026-01-22-*.md .claude/feedback/archive/
```

Keep a summary of learnings in `.claude/feedback/SUMMARY.md`:
```markdown
# Feedback Summary

## [Skill Name]

Total feedback sessions: 12
Last updated: 2026-01-22

Key learnings:
- Added step for X (reported by 3 users)
- Updated benchmarks for B2B context (reported by 5 users)
- Clarified workflow around Y (reported by 2 users)

Patterns:
- Users in enterprise contexts need higher benchmarks
- Early-stage startups need more examples
- Non-technical users need clearer explanations of jargon
```
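The session count doesn't need to be maintained by hand. A one-liner sketch, assuming archived retros keep the `retro-*.md` naming convention:

```bash
# Count archived feedback sessions for the SUMMARY.md total.
ls .claude/feedback/archive/retro-*.md 2>/dev/null | wc -l
```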
## Example Workflow

Scenario: the product-market-fit skill needs improvement.

### Step 1: Review Feedback

Read `.claude/feedback/retro-2026-01-22-143022.md`:

```markdown
## Feedback

Missed important steps? Yes.

Improvements needed: The Sean Ellis test threshold of 40% seems high for B2B
enterprise products. We're at 32% "very disappointed", but our retention is
85% D30, which is excellent. Should the skill mention that thresholds vary
by product type?
```
### Step 2: Identify the Pattern

Check the other feedback files → 3 more users report that B2B contexts need different benchmarks.
### Step 3: Update the Skill

Edit `.claude/skills/product-market-fit/SKILL.md`.

Before:

```markdown
## Sean Ellis Test (40% Rule)

"How would you feel if you could no longer use [product]?"

- ≥40% "Very disappointed" = Strong PMF
```

After:

```markdown
## Sean Ellis Test (Context-Dependent Thresholds)

"How would you feel if you could no longer use [product]?"

Thresholds by product type:
- Consumer B2C: ≥40% "Very disappointed" = Strong PMF
- SMB B2B: ≥35% "Very disappointed" = Strong PMF
- Enterprise B2B: ≥30% "Very disappointed" = Strong PMF (longer sales
  cycles, different buying psychology)

Why the difference?
- Enterprise buying decisions are more rational than emotional
- Switching costs are higher (contracts, integrations)
- Retention is a better PMF signal for B2B (see Step 2)
```
Add to the Learnings section:

```markdown
## Learnings from Use

### 2026-01-22: Refined Sean Ellis thresholds by product type
- Feedback: 4 users reported the 40% threshold was too high for B2B enterprise
- Update: Added context-specific thresholds (B2C 40%, SMB 35%, Enterprise 30%)
- Result: More accurate PMF diagnosis for different product types
```
### Step 4: Archive & Track

```bash
mv .claude/feedback/retro-2026-01-22-*.md .claude/feedback/archive/
```

Update `.claude/feedback/SUMMARY.md`:

```markdown
## product-market-fit

Total feedback sessions: 4
Last updated: 2026-01-22

Key learnings:
- Added context-specific Sean Ellis thresholds (reported by 4 users)
- B2B needs different benchmarks than B2C

Next improvements to consider:
- Add industry-specific retention benchmarks
- Include examples from different verticals
```
## Quality Checklist

Before updating any skill, ensure (a scriptable spot-check follows the list):

- Feedback is from multiple users (a pattern, not an outlier)
- Change makes the skill more accurate, not just more complex
- Benchmarks are sourced or validated (not anecdotal)
- Update is backward compatible (doesn't break existing workflows)
- Learnings section documents why the change was made
- Version number is incremented appropriately (semver)
- Processed feedback is archived, not deleted
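Only some of these items are mechanically checkable, but those can be wrapped in a quick pre-update script. A sketch, intended as a starting point rather than a complete gate:

```bash
#!/bin/bash
# Sanity-check a skill update: version field present, Learnings
# section present, feedback archive kept. Usage: ./check.sh SKILL.md

SKILL="$1"   # path to the SKILL.md being updated

grep -q "^version:" "$SKILL" || echo "WARN: no version field in $SKILL"
grep -q "Learnings from Use" "$SKILL" || echo "WARN: no Learnings section in $SKILL"
[ -d .claude/feedback/archive ] || echo "WARN: feedback archive directory missing"
```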
## Feedback Categories

Track feedback by type to identify systemic issues:

### Category 1: Missing Steps

- Example: "Skill forgot to mention we need to segment cohorts by acquisition channel"
- Action: Add the step to the checklist

### Category 2: Incorrect Benchmarks

- Example: "40% D30 retention is not 'strong' for our B2B SaaS; it's average"
- Action: Update benchmarks with context (B2C vs. B2B vs. Enterprise)

### Category 3: Confusing Workflow

- Example: "I didn't know whether to do cohort analysis before or after the Sean Ellis test"
- Action: Number the steps clearly and add a workflow diagram

### Category 4: Missing Context

- Example: "Skill assumes I have 1000+ users; what if I only have 50?"
- Action: Add an "Early Stage Adaptation" section

### Category 5: Tool-Specific Issues

- Example: "How do I calculate D30 retention in Google Analytics?"
- Action: Add an "Implementation in Common Tools" section
## Best Practices

Do:

- ✅ Look for patterns across multiple feedback sessions
- ✅ Update skills incrementally (small, tested changes)
- ✅ Document why changes were made (Learnings section)
- ✅ Preserve feedback history (archive, don't delete)
- ✅ Version skills so users know what changed

Don't:

- ❌ Update based on a single piece of feedback (it might be an outlier)
- ❌ Make skills overly complex by trying to cover every edge case
- ❌ Remove content without understanding why it was there
- ❌ Ignore feedback for more than 30 days (patterns emerge over time)
- ❌ Update without testing the new version
## Automation Ideas

### Weekly Digest (optional)

Create a script to summarize new feedback:

```bash
#!/bin/bash
# .claude/hooks/learning/weekly-feedback-digest.sh

echo "📊 Feedback Digest (Last 7 Days)"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

# List retro files modified in the last 7 days and pull the key fields.
find .claude/feedback -name "retro-*.md" -mtime -7 | while read -r file; do
  echo ""
  echo "File: $(basename "$file")"
  grep -A 5 "Skills Used" "$file"
  grep -A 3 "Improvements needed:" "$file"
done
```
### Auto-Tag for Review

When feedback mentions specific issues, auto-tag it (a sketch follows the list):

- "missing step" → tag for immediate review
- "wrong number" → tag for fact-check
- "confusing" → tag for clarity rewrite
## Success Metrics

Track improvement over time (a per-month counting sketch follows the list):

- Feedback frequency: decreasing = skills are getting better
- Repeated issues: should approach zero over time
- User satisfaction: track "Did this skill help?" responses
- Skill usage: updated skills should see increased usage
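Feedback frequency is easy to chart from the date-stamped filenames. A sketch that counts archived retros per month, relying on the `retro-YYYY-MM-DD-*` naming convention used throughout this skill:

```bash
#!/bin/bash
# Count archived retros per month from their filenames.

ls .claude/feedback/archive/retro-*.md 2>/dev/null \
  | sed -E 's/.*retro-([0-9]{4}-[0-9]{2}).*/\1/' \
  | sort | uniq -c    # sessions per month; a downward trend is good
```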
## Meta: This Skill Improves Itself

This skill should follow its own advice.

Learnings from Use:

[To be filled in as this skill gets used and improved]

Version History:

- v1.0.0 (2026-01-22): Initial release; framework for skill improvement

Next Steps:

1. Review feedback in `.claude/feedback/`
2. Identify patterns and prioritize updates
3. Update skill files with improvements
4. Document learnings
5. Archive processed feedback
6. Commit changes with a clear message (see the sketch below)
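For that final step, a commit sketch; the message format is a suggestion, not a convention this repo enforces:

```bash
# Stage the updated skill and the archived feedback together, so the
# change and its motivation land in one commit.
git add .claude/skills/ .claude/feedback/
git commit -m "skills: product-market-fit v1.2.0 - context-specific Sean Ellis thresholds"
```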
The more you use this system, the better your skills become. It's a continuous improvement loop.