Token Optimizer

A toolkit for reducing token consumption, lowering AI costs, and improving Claude Code performance. The recommendations draw on data from a controlled comparison of monolithic and modular code architectures.

Installation

npx skills add alexismunoz1/token-optimizer

Or manually:

cp -r token-optimizer ~/.claude/skills/

Core Features

1. File Organization Optimization

The single highest-impact optimization. In the experiment, small, focused files cut token consumption by 18.2% and noise (lines read that the task did not need) by 92% on focused tasks, which make up the majority of daily development work.

Core rules:

Maximum 150 lines per file — split by responsibility if longer
Single responsibility — one concern per file
Descriptive names in kebab-case — the filename tells the AI exactly what's inside
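
The 150-line rule can be checked mechanically. The following is a minimal sketch, not part of the skill itself: the function names, the default `.ts`/`.tsx` extensions, and the decision to skip `node_modules` are all assumptions made for illustration.

```python
from pathlib import Path

MAX_LINES = 150  # threshold taken from the core rules above


def count_lines(path: Path) -> int:
    """Count the lines in a text file, ignoring undecodable bytes."""
    with path.open("r", encoding="utf-8", errors="ignore") as handle:
        return sum(1 for _ in handle)


def find_oversized_files(root, extensions=(".ts", ".tsx"), max_lines=MAX_LINES):
    """Return (relative_path, line_count) pairs for files longer than max_lines.

    Results are sorted largest first, so the best split candidates come first.
    Anything under a node_modules directory is skipped (an assumption of this sketch).
    """
    root_path = Path(root)
    oversized = []
    for path in root_path.rglob("*"):
        relative = path.relative_to(root_path)
        if "node_modules" in relative.parts:
            continue
        if path.is_file() and path.suffix in extensions:
            lines = count_lines(path)
            if lines > max_lines:
                oversized.append((relative.as_posix(), lines))
    return sorted(oversized, key=lambda item: item[1], reverse=True)
```

Calling `find_oversized_files("src")` on a project would list every TypeScript file that exceeds the limit, which gives a starting list of candidates to split by responsibility.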

Real example: Fixing an email validation bug required reading 814 lines in a monolithic file (49,466 tokens) vs only 67 lines in a modular setup (40,447 tokens) — 18.2% savings, 92% less noise.
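
The two percentages in the example follow directly from the raw numbers. As a quick check, this sketch recomputes them; the helper name `percent_reduction` is an illustrative assumption, while the input figures are the ones quoted above.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return how much smaller `after` is than `before`, as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Figures from the email validation example above.
token_savings = percent_reduction(49_466, 40_447)   # tokens: monolithic vs modular
noise_reduction = percent_reduction(814, 67)        # lines read: monolithic vs modular

print(f"{token_savings:.1f}% fewer tokens")       # prints "18.2% fewer tokens"
print(f"{noise_reduction:.0f}% fewer lines read")  # prints "92% fewer lines read"
```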

For naming conventions, avoid/prefer tables, and project structure templates, see references/file-organization-guide.md

2. CLAUDE.md Optimization

A well-structured CLAUDE.md can reduce token consumption by 50-70%. Many projects have bloated CLAUDE.md files, and because the file is loaded on every interaction, the unnecessary context is paid for every time.

Key principles:

Keep it under 500 lines — essentials only
Be specific — "PostgreSQL with Prisma" not just "database"
Include project structure and commands — save the AI from guessing
Use triggers, not full docs — reference skills/files for details, don't inline everything

For a ready-to-use optimized template, see references/claude-md-template.md

3. Context Management

Token waste often comes from accumulated irrelevant context, not from individual operations.

Essential commands:

Command	When to Use	Effect
`/clear`	Switching tasks, after major corrections	Resets context completely
`/compact`	Long conversations (>50 exchanges)	Compresses history, keeps essentials
`/context`	Diagnosing high token use	Shows what's consuming tokens

Lazy loading: Don't front-load all information. One project achieved a 54% reduction in initial tokens (7,584 → 3,434) by keeping only triggers in CLAUDE.md and loading details on demand.
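
The trigger idea can be pictured as a small lookup: short keywords stay in context, and a detailed file is pulled in only when a task mentions one of them. This is a conceptual sketch only. It does not describe how Claude Code resolves references internally, and the keywords and the `references_for` helper are hypothetical; only the reference file paths come from this document.

```python
# Hypothetical trigger table: keyword -> reference file to load on demand.
# Only this small table needs to live in always-loaded context.
TRIGGERS = {
    "naming": "references/file-organization-guide.md",
    "mcp": "references/context-management-guide.md",
    "metrics": "references/metrics-report.md",
}


def references_for(task: str, triggers: dict[str, str] = TRIGGERS) -> list[str]:
    """Return only the reference files whose trigger keyword appears in the task."""
    text = task.lower()
    return [path for keyword, path in triggers.items() if keyword in text]


print(references_for("Fix the MCP config"))
# prints ['references/context-management-guide.md']
```

A task that matches no trigger loads nothing extra, which is where the saving in initial tokens comes from.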

For advanced strategies, subagent patterns, and MCP management, see references/context-management-guide.md

4. Strategic Model Selection

Choosing the right model per task type is one of the easiest cost savings to implement.

Task Type	Model	Why
80% of daily tasks	Sonnet	Best cost/performance ratio
Complex architecture	Opus	Deeper reasoning needed
Simple/quick tasks	Haiku	Up to 18x cheaper than Opus

Default to Sonnet. Escalate to Opus only for genuinely complex problems. Use Haiku for simple tasks, tests, and searches.
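
The selection rule above amounts to a default with two exceptions. A minimal sketch, assuming a hypothetical set of task labels (`"test"`, `"search"`, `"architecture"`, and so on are illustrative, not an official taxonomy):

```python
from enum import Enum


class Model(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


# Hypothetical task labels, chosen for illustration only.
SIMPLE_TASKS = {"test", "search", "rename"}
COMPLEX_TASKS = {"architecture"}


def choose_model(task_type: str) -> Model:
    """Default to Sonnet; escalate to Opus or drop to Haiku only for known cases."""
    if task_type in COMPLEX_TASKS:
        return Model.OPUS
    if task_type in SIMPLE_TASKS:
        return Model.HAIKU
    return Model.SONNET
```

Unknown task types fall through to Sonnet, which matches the advice to escalate only when a problem is genuinely complex.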

5. MCP & Subagent Optimization

MCP Management:

Keep at most 10 MCPs active at a time (at most 80 tools in total)
Disable MCPs not needed for the current task
Each unused MCP still costs tokens in tool descriptions
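
Both limits can be checked from a simple count of tools per active MCP. The sketch below assumes you have gathered those counts yourself; the `check_mcp_budget` helper and the server names in the usage note are hypothetical.

```python
MAX_ACTIVE_MCPS = 10  # limit from the guidance above
MAX_TOTAL_TOOLS = 80  # limit from the guidance above


def check_mcp_budget(tools_per_mcp: dict[str, int]) -> list[str]:
    """Return a warning for each limit that the active MCP set exceeds."""
    warnings = []
    active = len(tools_per_mcp)
    total_tools = sum(tools_per_mcp.values())
    if active > MAX_ACTIVE_MCPS:
        warnings.append(f"{active} active MCPs (limit {MAX_ACTIVE_MCPS})")
    if total_tools > MAX_TOTAL_TOOLS:
        warnings.append(f"{total_tools} total tools (limit {MAX_TOTAL_TOOLS})")
    return warnings
```

For example, `check_mcp_budget({"github": 30, "browser": 60})` returns one warning, because two MCPs are within the first limit but their 90 tools exceed the second.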

Subagents for verbose tasks: Use the Task tool for operations that generate large output (test runs, builds, searches). The verbose output stays in the subagent's context — only the summary returns to your main conversation.

Quick Wins Checklist

Apply these in order of impact:

Run /context first → establishes your baseline before any changes
Split large files (>150 lines) into focused modules → saves 18%+ tokens on focused tasks
Optimize your CLAUDE.md → can reduce consumption by 50-70%
Use /clear between tasks → eliminates irrelevant context
Use /compact in long conversations → compresses history
Use subagents for verbose tasks → test output, build logs, and search results stay in subagent context instead of polluting your main conversation
Use the right model → default to Sonnet for daily work, Haiku for simple tasks (18x cheaper than Opus), Opus only for genuinely complex architecture decisions
Limit active MCPs to ≤10 → each unused MCP still costs tokens every turn because its tool descriptions are sent in every request

Expected Savings

Results from our controlled experiment with an 814-line TypeScript e-commerce app:

Optimization	Impact
Modular files (focused tasks)	-18.2% tokens
Noise reduction (lines processed)	-92%
Optimized CLAUDE.md	-50% to -70% consumption
Lazy loading context	-54% initial tokens
Haiku vs Opus (simple tasks)	-94% cost

Key insight: Focused tasks (bug fixes and specific changes, roughly 80% of daily work) benefit enormously from modular code. On cross-cutting tasks at small scale, modular code uses slightly more tokens (1-5%), but it wins decisively at 5,000+ lines.

Note on scale: These results come from an 814-line codebase. At larger scales (5,000+ lines), the savings from modular architecture are even more significant, because monolithic files start hitting context window limits while modular files stay small (35-146 lines each).

For the complete experiment methodology and raw data, see references/metrics-report.md

Diagnostic Workflow

When activated, follow this process:

Measure first: Always start by asking the user to run /context. Without a baseline number, you can't prove any optimization worked. This step is not optional.
Read the user's code: Before recommending anything, look at their actual files and project structure. Scan for files >150 lines, check their CLAUDE.md size, and count active MCPs. Recommendations grounded in their real codebase are far more useful than generic advice.
Identify: Determine the biggest source of waste (large files, bloated CLAUDE.md, accumulated context, too many MCPs)
Recommend: Suggest the highest-impact optimization from the Quick Wins Checklist
Verify: After changes, have the user re-run /context to measure improvement
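
The Identify and Recommend steps can be sketched as threshold checks applied in checklist order. This is an illustration under stated assumptions: the `ProjectStats` fields are hypothetical and have to be gathered by inspecting the project, while the thresholds (150 lines, 500 lines, 10 MCPs) are the ones given in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProjectStats:
    """Hypothetical summary of a project, filled in after reading the user's code."""
    largest_file_lines: int
    claude_md_lines: int
    active_mcps: int


def findings(stats: ProjectStats) -> list[str]:
    """List applicable optimizations in the order of the Quick Wins Checklist."""
    results = []
    if stats.largest_file_lines > 150:
        results.append("Split files over 150 lines into focused modules")
    if stats.claude_md_lines > 500:
        results.append("Trim CLAUDE.md to essentials")
    if stats.active_mcps > 10:
        results.append("Disable MCPs not needed for the current task")
    return results


def next_step(stats: ProjectStats) -> Optional[str]:
    """Recommend only the first applicable optimization, or None if nothing applies."""
    applicable = findings(stats)
    return applicable[0] if applicable else None
```

Returning a single recommendation rather than the whole list mirrors the guideline to diagnose first and avoid dumping every optimization at once.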

Important guidelines:

Always diagnose first — don't dump all optimizations
Measure before and after — every optimization should be verified with /context
Focus on the user's specific problem — identify the most impactful change first
Be transparent about trade-offs — modular files save 18%+ on focused tasks but show minimal difference on cross-cutting tasks at small scale

Usage Examples

"My Claude Code sessions are getting expensive"

Run /context to see current token consumption breakdown
Audit CLAUDE.md size — if over 500 lines, trim to essentials
Check for files >150 lines — identify candidates for splitting
Count active MCPs — disable unused ones
Review model usage — switch routine tasks to Sonnet/Haiku

"Organize my codebase for AI"

Scan the project for files exceeding 150 lines
Identify generic filenames (utils.ts, helpers.ts, index.ts with logic)
Propose file splits by responsibility with new descriptive names
Suggest a project structure following the organization guide
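
Spotting generic filenames and proposing kebab-case replacements can be partly automated. A minimal sketch, assuming the generic stems listed above; both helper names are hypothetical, and an `index` file is flagged for manual review because the code cannot tell whether it contains logic.

```python
import re
from pathlib import PurePosixPath

# Stems named above as generic; index files still need a manual look for logic.
GENERIC_STEMS = {"utils", "helpers", "index"}


def is_generic_name(path: str) -> bool:
    """Return True when the filename says nothing about what the file contains."""
    return PurePosixPath(path).stem.lower() in GENERIC_STEMS


def to_kebab_case(name: str) -> str:
    """Convert camelCase, PascalCase, or snake_case to kebab-case."""
    # Insert a hyphen at each lowercase-or-digit to uppercase boundary.
    hyphenated = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", name)
    # Replace underscores and whitespace with hyphens, then lowercase.
    return re.sub(r"[_\s]+", "-", hyphenated).lower()


print(is_generic_name("src/utils.ts"))   # prints True
print(to_kebab_case("emailValidator"))   # prints email-validator
```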

"My context window keeps filling up"

Run /context to identify what's consuming tokens
Check if CLAUDE.md has inline documentation that should be referenced instead
Recommend /clear between tasks and /compact for long sessions
Suggest moving verbose content to referenced files (lazy loading)

Troubleshooting

Problem	Cause	Solution
No improvement after optimizations	No baseline measurement taken	Run `/context` before AND after each change
Don't know how many tokens I'm using	Token consumption not visible by default	Use `/context` to see the full breakdown
`/compact` doesn't reduce enough	Compresses but keeps essentials	Use `/clear` if prior context is irrelevant
Cross-cutting tasks slower after splitting	Multiple reads needed (1-5% more tokens)	Expected and marginal — focused tasks (80% of work) still save 18%+

Reference Materials

references/file-organization-guide.md — Naming conventions, project structure templates, and implementation checklist
references/context-management-guide.md — Lazy loading, subagents, MCP management, and model selection strategies
references/metrics-report.md — Complete experiment data and methodology with raw numbers
references/claude-md-template.md — Ready-to-use optimized CLAUDE.md template
