token-optimizer

Practical guide to reduce token consumption, lower AI costs, and improve Claude Code performance through file organization, context management, and strategic model selection. Backed by real experiment data. Use when user mentions "optimize tokens", "reduce costs", "Claude is slow", "too many tokens", "token budget", "context window full", "organize codebase for AI", or "reduce token consumption". Do NOT use for general coding questions, debugging, or performance optimization unrelated to AI token usage.


Token Optimizer

A comprehensive toolkit to reduce token consumption, lower AI costs, and improve Claude Code performance. Every recommendation is backed by real experiment data from a controlled comparison of monolithic vs modular code architectures.

Installation

npx skills add alexismunoz1/token-optimizer

Or manually:

cp -r token-optimizer ~/.claude/skills/

Core Features

1. File Organization Optimization

The single highest-impact optimization. Small, focused files reduce token consumption by 18.2% and noise by 92% on focused tasks (the majority of daily development work).

Core rules:

  • Maximum 150 lines per file — split by responsibility if longer
  • Single responsibility — one concern per file
  • Descriptive names in kebab-case — the filename tells the AI exactly what's inside

Real example: Fixing an email validation bug required reading 814 lines in a monolithic file (49,466 tokens) vs only 67 lines in a modular setup (40,447 tokens) — 18.2% savings, 92% less noise.
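To find candidates for splitting, a quick check like the following can help. This is a sketch: the `src` directory and the TypeScript extensions are assumptions — adjust them to your project layout.

```shell
# List source files exceeding the 150-line guideline.
# The src/ path and .ts/.tsx extensions are project-specific assumptions.
find src -type f \( -name '*.ts' -o -name '*.tsx' \) -exec wc -l {} + \
  | awk '$2 != "total" && $1 > 150 {print $2 ": " $1 " lines"}'
```

Each file this prints is a candidate for splitting by responsibility.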

For naming conventions, avoid/prefer tables, and project structure templates, see references/file-organization-guide.md

2. CLAUDE.md Optimization

A well-structured CLAUDE.md can reduce token consumption by 50-70%. Most projects have bloated CLAUDE.md files that load unnecessary context on every interaction.

Key principles:

  • Keep it under 500 lines — essentials only
  • Be specific — "PostgreSQL with Prisma" not just "database"
  • Include project structure and commands — save the AI from guessing
  • Use triggers, not full docs — reference skills/files for details, don't inline everything
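As a sketch of the "triggers, not full docs" principle, a CLAUDE.md section can point at detail files instead of inlining them. The specific versions and file names below are hypothetical examples:

```markdown
## Stack
- PostgreSQL 16 with Prisma (schema: prisma/schema.prisma)
- Next.js 14, TypeScript strict mode

## Commands
- npm run dev    # start dev server
- npm test       # run unit tests

## Details on demand
- Migration workflow: read docs/migrations.md when changing the schema
- API conventions: read docs/api-style.md when adding endpoints
```

The "details on demand" entries cost a few tokens per turn instead of loading whole documents into every interaction.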

For a ready-to-use optimized template, see references/claude-md-template.md

3. Context Management

Token waste often comes from accumulated irrelevant context, not from individual operations.

Essential commands:

| Command | When to Use | Effect |
| --- | --- | --- |
| /clear | Switching tasks, after major corrections | Resets context completely |
| /compact | Long conversation (>50 exchanges) | Compresses history, keeps essentials |
| /context | Diagnosing high token use | Shows what's consuming tokens |

Lazy loading: Don't front-load all information. One project achieved 54% reduction in initial tokens (7,584 → 3,434) by keeping only triggers in CLAUDE.md and loading details on demand.

For advanced strategies, subagent patterns, and MCP management, see references/context-management-guide.md

4. Strategic Model Selection

Choosing the right model per task type is one of the easiest cost savings to implement.

| Task Type | Model | Why |
| --- | --- | --- |
| 80% of daily tasks | Sonnet | Best cost/performance ratio |
| Complex architecture | Opus | Deeper reasoning needed |
| Simple/quick tasks | Haiku | Up to 18x cheaper than Opus |

Default to Sonnet. Escalate to Opus only for genuinely complex problems. Use Haiku for simple tasks, tests, and searches.

5. MCP & Subagent Optimization

MCP Management:

  • Keep at most 10 MCPs active at a time (no more than 80 total tools)
  • Disable MCPs not needed for the current task
  • Each unused MCP still costs tokens in tool descriptions

Subagents for verbose tasks: Use the Task tool for operations that generate large output (test runs, builds, searches). The verbose output stays in the subagent's context — only the summary returns to your main conversation.
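When a subagent is overkill, the same principle can be applied manually: capture the full log to disk and let only a short summary enter the conversation. A sketch, where `seq 1 1000` stands in for a real verbose command such as a test run:

```shell
# Capture verbose output to a file; surface only the last few lines.
# 'seq 1 1000' is a stand-in for a real command like 'npm test'.
seq 1 1000 > /tmp/verbose-output.log 2>&1
tail -n 3 /tmp/verbose-output.log
```

The full log stays available on disk for follow-up questions without ever being pasted into the context.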

Quick Wins Checklist

Apply these in order of impact:

  1. Run /context first → establishes your baseline before any changes
  2. Split large files (>150 lines) into focused modules → saves 18%+ tokens
  3. Optimize your CLAUDE.md → can reduce consumption 50-70%
  4. Use /clear between tasks → eliminates irrelevant context
  5. Use /compact in long conversations → compresses history
  6. Use subagents for verbose tasks → test output, build logs, and search results stay in subagent context instead of polluting your main conversation
  7. Use the right model → default to Sonnet for daily work, Haiku for simple tasks (18x cheaper than Opus), Opus only for genuinely complex architecture decisions
  8. Limit active MCPs to ≤10 → each unused MCP still costs tokens every turn because its tool descriptions are sent in every request

Expected Savings

Results from our controlled experiment with an 814-line TypeScript e-commerce app:

| Optimization | Impact |
| --- | --- |
| Modular files (focused tasks) | -18.2% tokens |
| Noise reduction (lines processed) | -92% |
| Optimized CLAUDE.md | -50% to -70% consumption |
| Lazy loading context | -54% initial tokens |
| Haiku vs Opus (simple tasks) | -94% cost |

Key insight: Focused tasks (bug fixes, specific changes — ~80% of daily work) benefit enormously from modular code. Cross-cutting tasks show minimal difference at small scale (+1-5%) but modular wins decisively at 5,000+ lines.

Note on scale: These results are from a controlled experiment with an 814-line codebase. At larger scales (5,000+ lines), the savings from modular architecture are even more significant because monolithic files start hitting context window limits while modular files maintain constant size (35-146 lines each).

For the complete experiment methodology and raw data, see references/metrics-report.md

Diagnostic Workflow

When activated, follow this process:

  1. Measure first: Always start by asking the user to run /context. Without a baseline number, you can't prove any optimization worked. This step is not optional.
  2. Read the user's code: Before recommending anything, look at their actual files and project structure. Scan for files >150 lines, check their CLAUDE.md size, and count active MCPs. Recommendations grounded in their real codebase are far more useful than generic advice.
  3. Identify: Determine the biggest source of waste (large files, bloated CLAUDE.md, accumulated context, too many MCPs)
  4. Recommend: Suggest the highest-impact optimization from the Quick Wins Checklist
  5. Verify: After changes, have the user re-run /context to measure improvement

Important guidelines:

  • Always diagnose first — don't dump all optimizations
  • Measure before and after — every optimization should be verified with /context
  • Focus on the user's specific problem — identify the most impactful change first
  • Be transparent about trade-offs — modular files save 18%+ on focused tasks but show minimal difference on cross-cutting tasks at small scale

Usage Examples

"My Claude Code sessions are getting expensive"

  1. Run /context to see current token consumption breakdown
  2. Audit CLAUDE.md size — if over 500 lines, trim to essentials
  3. Check for files >150 lines — identify candidates for splitting
  4. Count active MCPs — disable unused ones
  5. Review model usage — switch routine tasks to Sonnet/Haiku

"Organize my codebase for AI"

  1. Scan the project for files exceeding 150 lines
  2. Identify generic filenames (utils.ts, helpers.ts, index.ts with logic)
  3. Propose file splits by responsibility with new descriptive names
  4. Suggest a project structure following the organization guide
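Step 2 can be approximated mechanically. A sketch — the flagged names and the `src` path are assumptions to extend for your codebase:

```shell
# Flag generically named files that often hide mixed responsibilities.
# The src/ path and the name list are assumptions; extend as needed.
find src -type f \( -name 'utils.*' -o -name 'helpers.*' -o -name 'misc.*' \) -print
```

Files found this way are the usual candidates for renaming and splitting by responsibility.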

"My context window keeps filling up"

  1. Run /context to identify what's consuming tokens
  2. Check if CLAUDE.md has inline documentation that should be referenced instead
  3. Recommend /clear between tasks and /compact for long sessions
  4. Suggest moving verbose content to referenced files (lazy loading)

Troubleshooting

| Problem | Cause | Solution |
| --- | --- | --- |
| No improvement after optimizations | No baseline measurement taken | Run /context before AND after each change |
| Don't know how many tokens I'm using | Token consumption not visible by default | Use /context to see the full breakdown |
| /compact doesn't reduce enough | It compresses but keeps essentials | Use /clear if prior context is irrelevant |
| Cross-cutting tasks slower after splitting | Multiple reads needed (1-5% more tokens) | Expected and marginal; focused tasks (80% of work) still save 18%+ |

Reference Materials

  • references/file-organization-guide.md — Naming conventions, project structure templates, and implementation checklist
  • references/context-management-guide.md — Lazy loading, subagents, MCP management, and model selection strategies
  • references/metrics-report.md — Complete experiment data and methodology with raw numbers
  • references/claude-md-template.md — Ready-to-use optimized CLAUDE.md template
