file-categorization

Reusable logic for categorizing files as Command, Agent, Skill, or Documentation based on structure and content analysis

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "file-categorization" with this command: npx skills add enuno/claude-command-and-control/enuno-claude-command-and-control-file-categorization

File Categorization Skill

When to Use This Skill

  • Processing files in integration pipelines
  • Scanning directories for file organization
  • Auto-routing files to appropriate locations
  • Generating file inventory reports
  • Validating repository structure

What This Skill Does

Analyzes file structure and content to accurately categorize files into:

  • Commands - Slash command definitions
  • Agents - Agent configuration files
  • Skills - Reusable workflow automation
  • Documentation - General markdown documentation
  • Other - Uncategorized files requiring manual review

Categorization Logic

Step 1: Filename Pattern Matching

Commands:

  • Filename matches *-command.md or *command.md
  • Located in .claude/commands/ directory
  • Filename uses verb-noun pattern (e.g., integration-scan.md)

Agents:

  • Filename matches *-agent.md or *agent.md
  • Located in agents-templates/ directory
  • Contains role-based names (architect, builder, validator, etc.)

Skills:

  • Filename is SKILL.md or *-SKILL.md or *-skill.md
  • Located in skills/*/ directories
  • Contains workflow automation content

Documentation:

  • Standard .md files
  • Located in docs/ directory
  • Contains reference or tutorial content
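
The filename and location checks above can be sketched in Python roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function name `category_from_path` and the pattern table are hypothetical, and the softer heuristics (verb-noun names, role-based names, content checks) are deliberately left out.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional

# Patterns copied from the Step 1 lists; the table layout is an assumption.
FILENAME_PATTERNS = [
    ("Command", ("*command.md",)),  # also covers *-command.md
    ("Agent", ("*agent.md",)),      # also covers *-agent.md
    ("Skill", ("SKILL.md", "*-SKILL.md", "*-skill.md")),
]


def category_from_path(file_path: str) -> Optional[str]:
    """Guess a category from the filename and parent directories only."""
    path = PurePosixPath(file_path)
    name, dirs = path.name, path.parent.parts

    # Filename patterns are checked first (case-sensitive).
    for category, patterns in FILENAME_PATTERNS:
        if any(fnmatchcase(name, pattern) for pattern in patterns):
            return category

    # .claude/commands/ must appear as two consecutive path components.
    if any(a == ".claude" and b == "commands" for a, b in zip(dirs, dirs[1:])):
        return "Command"
    if "agents-templates" in dirs:
        return "Agent"
    # skills/*/ means the file sits in a subdirectory below skills/.
    if "skills" in dirs[:-1]:
        return "Skill"
    if "docs" in dirs and name.endswith(".md"):
        return "Documentation"
    return None
```

A path such as `.claude/commands/integration-scan.md` is matched by location rather than by filename here, since the name itself fits none of the glob patterns.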

Step 2: Frontmatter Analysis

Read the YAML frontmatter (if present) to identify:

Command Indicators:

---
description: "..."
allowed-tools: [...]
author: "..."
version: "X.Y"
---

Skill Indicators:

---
name: skill-name
description: "..."
---

Agent Indicators (typically a Markdown section rather than YAML frontmatter; less structured, more prose):

## Agent Identity
**Role**: [Agent Role]
**Version**: X.Y.Z
**Purpose**: [Purpose description]
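
As a rough illustration of the indicator checks above, the following Python sketch reads only the top-level keys of a leading `---` block. It is not a real YAML parser (nested values, multi-line strings, and syntax errors are not handled), and the names `read_frontmatter` and `indicator` are hypothetical; a production version would more likely rely on a proper YAML library.

```python
import re
from typing import Dict, Optional

# A leading block delimited by lines that contain only '---'.
FRONTMATTER_RE = re.compile(r"\A---[ \t]*\n(.*?)^---[ \t]*$", re.DOTALL | re.MULTILINE)


def read_frontmatter(content: str) -> Optional[Dict[str, str]]:
    """Return top-level 'key: value' pairs as raw strings, or None if no block exists."""
    match = FRONTMATTER_RE.match(content)
    if match is None:
        return None
    fields: Dict[str, str] = {}
    for line in match.group(1).splitlines():
        # Skip blank lines, indented (nested) lines, and comments.
        if not line.strip() or line[0] in " \t#":
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def indicator(content: str) -> Optional[str]:
    """Map the frontmatter and Agent Identity indicators to a category guess."""
    fields = read_frontmatter(content) or {}
    if "allowed-tools" in fields and "version" in fields:
        return "Command"
    if "name" in fields and "allowed-tools" not in fields:
        return "Skill"
    if "## Agent Identity" in content:
        return "Agent"
    return None
```

Values are kept as raw text (quotes included), which is enough for presence checks but not for validating their contents.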

Step 3: Content Structure Analysis

Commands have:

  • Workflow sections with numbered steps
  • Bash command examples (prefixed with !)
  • allowed-tools restrictions
  • Usage examples

Agents have:

  • Core Responsibilities section
  • Allowed Tools and Permissions section
  • Workflow Patterns section
  • Context Management section

Skills have:

  • "When to Use" section
  • "What This Skill Does" section
  • Step-by-step process descriptions
  • Examples with real data

Documentation has:

  • Standard markdown structure
  • Tutorial or reference content
  • No executable workflows
  • Educational purpose
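
One possible way to check for the named sections is to extract Markdown headings and count how many expected section names each category matches. The sketch below covers only Agents and Skills, since those are the categories with named sections in the lists above; matching by heading prefix and the names `headings` and `category_from_structure` are assumptions made for illustration.

```python
import re
from typing import List, Optional

# Section names from the lists above; comparing them to headings only is an assumption.
EXPECTED_SECTIONS = {
    "Agent": ["Core Responsibilities", "Allowed Tools", "Workflow Patterns", "Context Management"],
    "Skill": ["When to Use", "What This Skill Does"],
}

HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+)$", re.MULTILINE)


def headings(content: str) -> List[str]:
    """Return the text of every ATX-style Markdown heading."""
    return [match.group(1).strip() for match in HEADING_RE.finditer(content)]


def category_from_structure(content: str) -> Optional[str]:
    """Pick the category whose expected sections appear most often as headings."""
    found = headings(content)
    best, best_count = None, 0
    for category, sections in EXPECTED_SECTIONS.items():
        # A heading counts if it starts with the expected section name.
        count = sum(any(h.startswith(section) for h in found) for section in sections)
        if count > best_count:
            best, best_count = category, count
    return best
```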

Step 4: Keyword Detection

Scan content for category-specific keywords:

Command Keywords:

  • !bash, !git, !npm, etc. (shell commands)
  • "allowed-tools"
  • "Usage:", "Workflow:", "Steps:"
  • Command-line patterns

Agent Keywords:

  • "Core Responsibilities"
  • "Workflow Patterns"
  • "Context Management"
  • "Orchestrator", "Sub-Agent"
  • "Handoff", "Delegation"

Skill Keywords:

  • "When to Use"
  • "What This Skill Does"
  • "Skill" in self-references
  • Reusable workflow language

Documentation Keywords:

  • "Introduction", "Overview", "Guide"
  • "Tutorial", "Reference", "Best Practices"
  • Educational/explanatory language
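
A simple way to approximate the keyword scan is to count how many of each category's keywords occur in the content. The sketch below uses only the literal keywords listed above; the looser items ("command-line patterns", "reusable workflow language", "educational language") have no obvious literal form and are omitted. The name `keyword_hits` and the plain substring matching are assumptions.

```python
import re
from typing import Dict

KEYWORDS = {
    "Command": ["allowed-tools", "Usage:", "Workflow:", "Steps:"],
    "Agent": ["Core Responsibilities", "Workflow Patterns", "Context Management",
              "Orchestrator", "Sub-Agent", "Handoff", "Delegation"],
    "Skill": ["When to Use", "What This Skill Does"],
    "Documentation": ["Introduction", "Overview", "Guide", "Tutorial",
                      "Reference", "Best Practices"],
}

# Shell-style markers such as !bash, !git, !npm (the "etc." is not expanded here).
SHELL_BANG_RE = re.compile(r"!(?:bash|git|npm)\b")


def keyword_hits(content: str) -> Dict[str, int]:
    """Count distinct keyword matches per category (case-sensitive substrings)."""
    hits = {
        category: sum(keyword in content for keyword in keywords)
        for category, keywords in KEYWORDS.items()
    }
    if SHELL_BANG_RE.search(content):
        hits["Command"] += 1
    return hits
```

Because words like "Guide" or "Overview" appear in many files, these counts are best treated as supporting evidence next to the filename and frontmatter checks rather than as a decision on their own.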

Categorization Algorithm

function categorizeFile(filePath, content):
  category = null
  confidence = null
  reasoning = null

  // Phase 1: Filename and location
  if filename matches command patterns OR in .claude/commands/:
    category = "Command"
    confidence = "High"
    reasoning = "Filename or location matches Command"

  else if filename matches skill patterns OR in skills/*/:
    category = "Skill"
    confidence = "High"
    reasoning = "Filename or location matches Skill"

  else if filename matches agent patterns OR in agents-templates/:
    category = "Agent"
    confidence = "High"
    reasoning = "Filename or location matches Agent"

  else if in docs/:
    category = "Documentation"
    confidence = "Medium"
    reasoning = "Located in docs/"

  // Phase 2: Frontmatter analysis (refine; overrides Phase 1 on a match)
  frontmatter = extractYAML(content)
  if frontmatter contains "allowed-tools" AND "version":
    category = "Command"
    confidence = "High"
    reasoning = "Frontmatter has 'allowed-tools' and 'version'"

  else if frontmatter contains "name" (no allowed-tools):
    category = "Skill"
    confidence = "High"
    reasoning = "Frontmatter has 'name' field"

  // Phase 3: Content structure (if still uncertain)
  if confidence != "High":
    if content contains "## Agent Identity":
      category = "Agent"
      confidence = "High"
      reasoning = "Content has 'Agent Identity' section"

    else if content contains "## When to Use":
      category = "Skill"
      confidence = "Medium"
      reasoning = "Content has 'When to Use' section"

    else if content contains "!bash" OR "!git":
      category = "Command"
      confidence = "Medium"
      reasoning = "Content has shell command markers"

  // Phase 4: Fallback
  if category == null:
    category = "Other"
    confidence = "Low"
    reasoning = "Unable to determine category, manual review needed"

  return {category, confidence, reasoning}
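
For readers who prefer something executable, the four phases might be sketched in Python as below. This is an illustrative approximation rather than the skill's reference implementation: it uses the Step 1 filename patterns for Phase 1, reads frontmatter keys with a simple regex instead of a YAML parser, and the names `categorize_file` and `frontmatter_keys` are hypothetical.

```python
import re
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

FRONTMATTER_RE = re.compile(r"\A---[ \t]*\n(.*?)^---[ \t]*$", re.DOTALL | re.MULTILINE)
SKILL_NAMES = ("SKILL.md", "*-SKILL.md", "*-skill.md")


def frontmatter_keys(content):
    """Top-level keys of a leading --- block (simplified; not a YAML parser)."""
    match = FRONTMATTER_RE.match(content)
    if match is None:
        return set()
    return {
        line.partition(":")[0].strip()
        for line in match.group(1).splitlines()
        if ":" in line and not line[0].isspace()
    }


def categorize_file(file_path, content):
    path = PurePosixPath(file_path)
    name, dirs = path.name, path.parent.parts
    category, confidence, reasons = None, None, []

    # Phase 1: filename and location
    in_commands = any(a == ".claude" and b == "commands" for a, b in zip(dirs, dirs[1:]))
    if fnmatchcase(name, "*command.md") or in_commands:
        category, confidence = "Command", "High"
    elif any(fnmatchcase(name, p) for p in SKILL_NAMES) or "skills" in dirs[:-1]:
        category, confidence = "Skill", "High"
    elif fnmatchcase(name, "*agent.md") or "agents-templates" in dirs:
        category, confidence = "Agent", "High"
    elif "docs" in dirs:
        category, confidence = "Documentation", "Medium"
    if category:
        reasons.append(f"filename or location suggests {category}")

    # Phase 2: frontmatter analysis (overrides Phase 1 on a match)
    keys = frontmatter_keys(content)
    if {"allowed-tools", "version"} <= keys:
        category, confidence = "Command", "High"
        reasons.append("frontmatter has 'allowed-tools' and 'version'")
    elif "name" in keys and "allowed-tools" not in keys:
        category, confidence = "Skill", "High"
        reasons.append("frontmatter has 'name' field")

    # Phase 3: content structure, only if still uncertain
    if confidence != "High":
        if "## Agent Identity" in content:
            category, confidence = "Agent", "High"
            reasons.append("content has 'Agent Identity' section")
        elif "## When to Use" in content:
            category, confidence = "Skill", "Medium"
            reasons.append("content has 'When to Use' section")
        elif "!bash" in content or "!git" in content:
            category, confidence = "Command", "Medium"
            reasons.append("content has shell command markers")

    # Phase 4: fallback
    if category is None:
        category, confidence = "Other", "Low"
        reasons.append("Unable to determine category, manual review needed")

    return {"category": category, "confidence": confidence, "reasoning": "; ".join(reasons)}
```

Reasons are accumulated across phases, so a file whose Phase 1 guess is later overridden still shows both signals in its reasoning string.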

Output Format

For each categorized file, return:

### [Filename]
- **Category**: [Command|Agent|Skill|Documentation|Other]
- **Confidence**: [High|Medium|Low]
- **Reasoning**: [Why this category was assigned]
- **Frontmatter**: [✅ Valid | ⚠️ Malformed | ❌ Missing]
- **Required Fields**: [List of found/missing fields]
- **Recommended Location**: [Target directory path]
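
To show how a result could be turned into this report layout, here is a small Python sketch. The `CategorizationResult` dataclass, its field names, and the `render` function are hypothetical; only the output lines follow the format above.

```python
from dataclasses import dataclass, field
from typing import Dict

FRONTMATTER_LABELS = {
    "valid": "✅ Valid",
    "malformed": "⚠️ Malformed",
    "missing": "❌ Missing",
}


@dataclass
class CategorizationResult:
    filename: str
    category: str
    confidence: str
    reasoning: str
    frontmatter: str  # one of: "valid", "malformed", "missing"
    required_fields: Dict[str, bool] = field(default_factory=dict)
    recommended_location: str = "Manual review required"


def render(result: CategorizationResult) -> str:
    """Format one result as the Markdown entry described above."""
    if result.required_fields:
        fields = ", ".join(
            f"{name} {'✅' if present else '❌'}"
            for name, present in result.required_fields.items()
        )
    else:
        fields = "N/A"
    return "\n".join([
        f"### {result.filename}",
        f"- **Category**: {result.category}",
        f"- **Confidence**: {result.confidence}",
        f"- **Reasoning**: {result.reasoning}",
        f"- **Frontmatter**: {FRONTMATTER_LABELS[result.frontmatter]}",
        f"- **Required Fields**: {fields}",
        f"- **Recommended Location**: {result.recommended_location}",
    ])
```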

Example Usage

Example 1: Categorizing Integration File

Input:

File: USING-GIT-WORKTREES-SKILL.md
Content:
---
name: using-git-worktrees
description: Creates isolated git worktrees...
---

# Using Git Worktrees

## When to Use
...

Output:

### USING-GIT-WORKTREES-SKILL.md
- **Category**: Skill
- **Confidence**: High
- **Reasoning**: Filename matches skill pattern, frontmatter has 'name' field, content has "When to Use" section
- **Frontmatter**: ✅ Valid
- **Required Fields**: name ✅, description ✅
- **Recommended Location**: skills/using-git-worktrees/SKILL.md

Example 2: Categorizing Command File

Input:

File: integration-scan.md
Content:
---
description: "Scan and categorize incoming files"
allowed-tools: ["Read", "Bash(find)"]
author: "Claude Command and Control"
version: "1.0"
---

# Integration Scan

## Purpose
...

Output:

### integration-scan.md
- **Category**: Command
- **Confidence**: High
- **Reasoning**: Filename uses verb-noun pattern, frontmatter has 'allowed-tools' and 'version'
- **Frontmatter**: ✅ Valid
- **Required Fields**: description ✅, allowed-tools ✅, author ✅, version ✅
- **Recommended Location**: .claude/commands/integration-scan.md

Example 3: Uncategorizable File

Input:

File: notes.md
Content:
# Random Notes

Some thoughts about the project...

Output:

### notes.md
- **Category**: Other
- **Confidence**: Low
- **Reasoning**: No frontmatter, no structural indicators, generic content
- **Frontmatter**: ❌ Missing
- **Required Fields**: N/A
- **Recommended Location**: Manual review required

Integration with Commands

Used By

  • /integration-scan - Primary categorization logic
  • /integration-process - Determines target directory
  • /integration-validate - Validates category-specific structure

Usage Pattern

# In integration-scan command

For each file in /INTEGRATION/incoming:
  1. Read file content
  2. Use file-categorization skill
  3. Extract category and confidence
  4. Include in scan report
  5. Mark for processing if High confidence
  6. Flag for review if Medium/Low confidence
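
The loop above might look roughly like this in Python. The sketch assumes the categorizer is passed in as a callable that returns a dict with `category` and `confidence` keys; `scan_incoming` and the `process`/`review` bucket names are hypothetical, and the directory is whatever path the caller supplies.

```python
from pathlib import Path
from typing import Callable, Dict, List


def scan_incoming(incoming: Path, categorize: Callable[[str, str], dict]) -> Dict[str, List[dict]]:
    """Categorize every file in a directory and split results by confidence."""
    report: Dict[str, List[dict]] = {"process": [], "review": []}
    for path in sorted(incoming.iterdir()):
        if not path.is_file():
            continue
        content = path.read_text(encoding="utf-8")   # 1. read file content
        result = categorize(path.name, content)      # 2-3. category and confidence
        entry = {"file": path.name, **result}        # 4. include in scan report
        # 5-6. High confidence is marked for processing; the rest is flagged for review.
        bucket = "process" if result["confidence"] == "High" else "review"
        report[bucket].append(entry)
    return report
```

Passing the categorizer as a parameter keeps the scan loop independent of how categorization is implemented, which also makes it easy to exercise with a stub.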

Category-Specific Validation Rules

Commands

  • ✅ MUST have: description, allowed-tools, author, version
  • ✅ SHOULD have: workflow steps, usage examples
  • ⚠️ Check: Tool permissions not overly broad

Agents

  • ✅ MUST have: Agent Identity, Core Responsibilities, Allowed Tools
  • ✅ SHOULD have: Workflow Patterns, Context Management
  • ⚠️ Check: Role clearly defined

Skills

  • ✅ MUST have: name, description, "When to Use"
  • ✅ SHOULD have: Examples, step-by-step process
  • ⚠️ Check: Examples use real data (not placeholders)

Documentation

  • ✅ MUST have: Clear title, structured content
  • ✅ SHOULD have: Table of contents, cross-references
  • ⚠️ Check: No executable workflows (should be in Command/Skill)
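
The MUST-have rules lend themselves to a simple presence check. The Python sketch below treats Command and Skill requirements as frontmatter keys and Agent requirements as section names found in the content; that split, along with the name `missing_requirements`, is an assumption. The SHOULD and "Check" rules, and the Documentation rules, call for judgment and are not covered.

```python
from typing import Iterable, List

# MUST-have items from the rules above, grouped by where they are expected to appear.
REQUIRED_FRONTMATTER = {
    "Command": ["description", "allowed-tools", "author", "version"],
    "Skill": ["name", "description"],
}
REQUIRED_SECTIONS = {
    "Agent": ["Agent Identity", "Core Responsibilities", "Allowed Tools"],
    "Skill": ["When to Use"],
}


def missing_requirements(category: str, frontmatter_keys: Iterable[str], content: str) -> List[str]:
    """List the MUST-have items that are absent for the given category."""
    keys = set(frontmatter_keys)
    missing = [
        f"frontmatter: {key}"
        for key in REQUIRED_FRONTMATTER.get(category, [])
        if key not in keys
    ]
    missing += [
        f"section: {section}"
        for section in REQUIRED_SECTIONS.get(category, [])
        if section not in content
    ]
    return missing
```

An empty list means every MUST-have item was found; it does not say anything about the quality of the content.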

Error Handling

Malformed Frontmatter

Issue: YAML syntax error
Action: Note in categorization output
Category: "Other" with reason "Invalid frontmatter"
Recommendation: Fix YAML before processing

Conflicting Indicators

Issue: Filename says "command" but structure says "skill"
Action: Confidence = "Medium"
Reasoning: "Filename and content indicators conflict"
Recommendation: Manual review

Missing Content

Issue: File is empty or too short (<100 chars)
Action: Category = "Other"
Confidence: "Low"
Reasoning: "Insufficient content for categorization"
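
Two of these cases can be sketched as small guard functions in Python. The names `precheck` and `conflict_note` are hypothetical, measuring length after stripping whitespace is an assumption, and letting the content-based category win a conflict is also an assumption (the rule above only fixes the confidence and reasoning). Malformed frontmatter is left out because detecting it reliably needs a real YAML parser.

```python
from typing import Optional

MIN_CONTENT_CHARS = 100  # threshold from the "Missing Content" rule


def precheck(content: str) -> Optional[dict]:
    """Return an 'Other' result for empty or very short files, else None."""
    if len(content.strip()) < MIN_CONTENT_CHARS:
        return {
            "category": "Other",
            "confidence": "Low",
            "reasoning": "Insufficient content for categorization",
        }
    return None


def conflict_note(filename_category: Optional[str], content_category: Optional[str]) -> Optional[dict]:
    """Downgrade to Medium confidence when filename and content disagree."""
    if filename_category and content_category and filename_category != content_category:
        return {
            "category": content_category,  # assumption: content outranks filename
            "confidence": "Medium",
            "reasoning": "Filename and content indicators conflict",
            "recommendation": "Manual review",
        }
    return None
```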

Testing Recommendations

Test with:

  1. Typical files - Standard commands, agents, skills
  2. Edge cases - Mixed indicators, missing frontmatter
  3. Malformed files - Syntax errors, incomplete content
  4. Ambiguous files - Could fit multiple categories

Expected accuracy:

  • High confidence: >95% correct
  • Medium confidence: >80% correct
  • Low confidence: Requires manual review

Version History

1.0 (2025-11-23)

  • Initial file categorization skill
  • Four-phase categorization algorithm
  • Integration with scan/process commands
  • Comprehensive validation rules

Skill Status: Production Ready
Accuracy Target: >95% for High confidence categorizations
Dependencies: None (standalone logic)

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
