Cost Tracking Framework

When This Activates

This skill activates when:

User asks about API costs or spending
Concerns about expensive operations
Need to optimize token usage

Token Cost Reference

Claude Pricing (Approximate)

Model Input (per 1M tokens) Output (per 1M tokens)

Opus ~$15 ~$75

Sonnet ~$3 ~$15

Haiku ~$0.25 ~$1.25

Typical Operation Costs

Operation Tokens Approximate Cost

Simple question 500-2K $0.01-0.05

File read + analysis 2-10K $0.05-0.25

Code generation 5-20K $0.15-0.50

Multi-file refactor 20-100K $0.50-2.50

Long conversation 50-200K $1.00-5.00

Cost Optimization Strategies

Route to Local LLM (FREE)

Use local_ask for simple tasks:

FREE - no API cost

local_ask question="where is the login function?" local_ask question="explain this error" mode=explain local_review file="src/auth.ts" focus=bugs

Good for local:

Simple lookups ("where is X?")
Code explanations
Commit message generation
Quick code reviews

Use Memory Tools First

Pre-indexed memory is instant and free:

Instant, no API cost

memory_query "authentication flow" memory_functions name="handleLogin" smart_read path="src/auth.ts" detail=summary

Reduce Context Size

Use smart_read with detail=summary before detail=full
Truncate large files to relevant sections
Clear conversation when changing topics

Batch Related Questions

Instead of 5 separate messages, combine:

"Can you: 1) explain the auth flow, 2) find the login component, 3) check for security issues, and 4) suggest improvements?"

Gateway Metrics

Check current efficiency:

gateway_metrics format=summary

Returns:

Cache hit rate
Token savings
Routing breakdown (local vs API)

Cost Estimation

Before expensive operations:

This refactor will touch ~20 files. Estimated cost: $0.50-1.00 Proceed? [Y/n]

Budget Awareness

Daily Patterns

Morning: Fresh context, lower cost
Long sessions: Context grows, higher cost
After compaction: Reset context, lower cost

High-Cost Triggers

"Analyze entire codebase"
"Review all files in directory"
"Generate comprehensive documentation"
Very long conversations (>50 turns)

Saving Tips

Start fresh for new topics - Don't carry irrelevant context
Use subagents - They have focused context
Check memory first - Summaries save full file reads
Compress transcripts - Archived sessions are compressed
Local for simple tasks - Ollama is free

Monitoring Commands

Check gateway efficiency

python3 ~/.claude-dash/learning/efficiency_tracker.py --report

View session sizes

du -sh ~/.claude-dash/sessions/*

cost-tracking

Safety Notice

Copy this and send it to your AI assistant to learn

FREE - no API cost

Instant, no API cost

Check gateway efficiency

View session sizes

Source Transparency

Related Skills

session-handoff

confidence-calibration

pricing-strategy

debug-strategy