caveman-token-optimizer

Claude Code skill that makes AI agents respond in caveman-speak, cutting ~65% of output tokens on average (up to 87%) while preserving full technical accuracy

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to install the skill:

Install skill "caveman-token-optimizer" with this command: npx skills add JuliusBrussee/caveman

Caveman Token Optimizer

Skill by ara.so — Daily 2026 Skills collection.

A Claude Code skill and Codex plugin that makes AI agents respond in compressed caveman-speak — cutting ~65% of output tokens on average (up to 87%) while keeping full technical accuracy. No pleasantries. No filler. Just answer.

What It Does

Caveman mode strips:

  • Pleasantries: "Sure, I'd be happy to help!" → gone
  • Hedging: "It might be worth considering" → gone
  • Articles (a, an, the) → gone
  • Verbose transitions → gone

Caveman keeps:

  • All code blocks (written normally)
  • Technical terms (exact: useMemo, polymorphism, middleware)
  • Error messages (quoted exactly)
  • Git commits and PR descriptions (normal)

Same fix. 75% less word. Brain still big.
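As a toy illustration only (the actual skill works through prompt instructions to the model, not post-hoc text processing), the kind of filler stripping described above could be sketched like this:

```python
import re

# Illustrative filler patterns; NOT the skill's real rule set
FILLER_PATTERNS = [
    r"^Sure, I'd be happy to help!?\s*",   # pleasantries
    r"\bIt might be worth considering\b",  # hedging
    r"\b(?:a|an|the)\b\s*",                # articles
]

def caveman_strip(text: str) -> str:
    for pattern in FILLER_PATTERNS:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    return re.sub(r"\s{2,}", " ", text).strip()

print(caveman_strip("Sure, I'd be happy to help! The fix is simple."))
# → "fix is simple."
```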

Install

Claude Code (npx)

npx skills add JuliusBrussee/caveman

Claude Code (plugin system)

claude plugin marketplace add JuliusBrussee/caveman
claude plugin install caveman@caveman

Codex

  1. Clone the repo
  2. Open Codex inside the repo
  3. Run /plugins
  4. Search Caveman
  5. Install plugin

Install once. Works in all sessions after that.

Manual / Local

git clone https://github.com/JuliusBrussee/caveman.git
cd caveman
pip install -e .

Usage — Trigger Commands

Claude Code

/caveman          # enable default (full) caveman mode
/caveman lite     # professional brevity, grammar intact
/caveman full     # default — drop articles, use fragments
/caveman ultra    # maximum compression, telegraphic

Codex

$caveman
$caveman lite
$caveman full
$caveman ultra

Natural language triggers

Any of these phrases activate caveman mode:

  • "talk like caveman"
  • "caveman mode"
  • "less tokens please"
  • "be concise"

Disable

/caveman off
# or say: "stop caveman" / "normal mode"

Level sticks until changed or session ends.

Intensity Levels

Level   Trigger          Style                                Example
Lite    /caveman lite    Drop filler, keep grammar            "Component re-renders because inline object prop creates new reference each cycle. Wrap in useMemo."
Full    /caveman full    Drop articles, use fragments         "New object ref each render. Inline prop = new ref = re-render. Wrap in useMemo."
Ultra   /caveman ultra   Telegraphic, abbreviate everything   "Inline obj prop → new ref → re-render. useMemo."
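In code terms, each level is just a different compression instruction appended to the system prompt. A hypothetical mapping (names and wording are illustrative, not taken from the skill source):

```python
# Hypothetical level-to-instruction mapping, for illustration only
LEVELS = {
    "lite":  "Drop filler phrases; keep grammar intact.",
    "full":  "Drop articles; answer in fragments.",
    "ultra": "Telegraphic style; abbreviate everything.",
}

def caveman_system_suffix(level: str = "full") -> str:
    return f"Respond in caveman mode ({level}): {LEVELS[level]}"

print(caveman_system_suffix("ultra"))
# → "Respond in caveman mode (ultra): Telegraphic style; abbreviate everything."
```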

Benchmark Results

Real token counts from Claude API (reproducible via benchmarks/ directory):

Task                        Normal   Caveman   Saved
React re-render bug           1180       159     87%
Auth middleware fix            704       121     83%
PostgreSQL pool setup         2347       380     84%
Git rebase vs merge            702       292     58%
Async/await refactor           387       301     22%
Docker multi-stage build      1042       290     72%
Average                       1214       294     65%
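The Saved column is the relative drop in output tokens, (normal - caveman) / normal. Checking the React row:

```python
def pct_saved(normal: int, caveman: int) -> int:
    """Percentage of output tokens saved, rounded to whole percent."""
    return round((normal - caveman) / normal * 100)

print(pct_saved(1180, 159))  # React re-render bug row
# → 87
```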

Important: Caveman only affects output tokens. Thinking/reasoning tokens are untouched. Caveman make mouth smaller, not brain.

Reproducing Benchmarks

git clone https://github.com/JuliusBrussee/caveman.git
cd caveman/benchmarks

# Set your Anthropic API key
export ANTHROPIC_API_KEY=your_key_here

# Run benchmark suite
python run_benchmarks.py

# Compare normal vs caveman responses
python compare.py --task react-rerender
python compare.py --task auth-middleware
python compare.py --all

Code Examples — What Caveman Mode Changes

Before (normal, 69 tokens)

The reason your React component is re-rendering is likely because
you're creating a new object reference on each render cycle. When
you pass an inline object as a prop, React's shallow comparison
sees it as a different object every time, which triggers a
re-render. I'd recommend using useMemo to memoize the object.

After (caveman full, 19 tokens)

New object ref each render. Inline object prop = new ref = re-render.
Wrap in `useMemo`.

Code blocks stay normal — caveman not stupid

# Caveman explains in grunt, but code stays clean:
# "Token expiry check broken. Fix:"

import time
import jwt  # PyJWT; SECRET_KEY defined elsewhere

def verify_token(token: str) -> bool:
    # verify_exp disabled so the expiry check returns bool instead of raising
    payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"],
                         options={"verify_exp": False})
    # Was: payload["exp"] < time.time()
    # Fix:
    return payload["exp"] >= time.time()

What Caveman Preserves vs. Removes

# Tokens caveman REMOVES (waste):
filler_phrases = [
    "I'd be happy to help you with that",   # 8 tokens gone
    "The reason this is happening is because", # 7 tokens gone
    "I would recommend that you consider",   # 7 tokens gone
    "Sure, let me take a look at that",      # 8 tokens gone
    "Great question!",                        # 2 tokens gone
    "Certainly!",                             # 1 token gone
]

# Things caveman KEEPS (substance):
preserved = [
    "code blocks",          # always normal
    "technical_terms",      # exact spelling preserved
    "error_messages",       # quoted verbatim
    "variable_names",       # exact
    "git_commits",          # normal prose
    "pr_descriptions",      # normal prose
]

Integration Pattern — Using in a Project

If you want caveman-style compression in your own Claude API calls:

import anthropic

client = anthropic.Anthropic()  # uses ANTHROPIC_API_KEY env var

# Load the caveman SKILL.md as a system prompt addition
with open("path/to/caveman/SKILL.md", "r") as f:
    caveman_skill = f.read()

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    system=f"{caveman_skill}\n\nRespond in caveman mode: full intensity.",
    messages=[
        {"role": "user", "content": "Why is my React component re-rendering?"}
    ]
)

print(response.content[0].text)
# → "New object ref each render. Inline prop = new ref = re-render. useMemo fix."
print(f"Tokens used: {response.usage.output_tokens}")  # ~19 vs ~69

Session Workflow

# Start session with caveman
/caveman full

# Ask technical questions normally — agent responds in caveman
> Why does my Docker build take so long?
→ "Layer cache miss. COPY before RUN npm install. Fix order:"
[code block shown normally]

# Switch intensity mid-session
/caveman lite

# Turn off for PR description writing
/caveman off
> Write a PR description for this auth fix
→ [normal, professional prose]

# Back to caveman
/caveman

Troubleshooting

Caveman mode not activating:

# Verify plugin installed
claude plugin list | grep caveman

# Reinstall
claude plugin remove caveman
claude plugin install caveman@caveman

Savings lower than expected:

  • Caveman only compresses output tokens — input tokens unchanged
  • Tasks with heavy code output (like Docker setup) see smaller savings, since code is preserved verbatim
  • Reasoning/thinking tokens not affected — savings show in visible response only
  • Ultra mode gets maximum compression; switch if full mode feels verbose
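Rough back-of-envelope for the code-heavy case: if a fraction c of the output is code (kept verbatim) and prose alone compresses by a ratio r, overall savings is only (1 - c) * r. A small sketch with illustrative numbers:

```python
def expected_savings(code_fraction: float, prose_reduction: float = 0.75) -> float:
    # Code is preserved verbatim, so only the prose portion compresses.
    return (1 - code_fraction) * prose_reduction

# A mostly-code answer (70% code): even 75% prose compression
# yields only ~0.225, i.e. roughly 22% overall savings.
print(expected_savings(0.7))
```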

Need normal mode for specific output:

/caveman off   # for PR descriptions, user-facing docs, formal reports
/caveman       # re-enable after

Benchmarking your own tasks:

cd benchmarks/
export ANTHROPIC_API_KEY=your_key_here
python run_benchmarks.py --task "your custom task description"

Why It Works

Backed by a March 2026 paper, "Brevity Constraints Reverse Performance Hierarchies in Language Models", which found that constraining large models to brief responses improved accuracy by 26 percentage points on certain benchmarks. Verbose not always better.

TOKENS SAVED          ████████ 65% avg (up to 87%)
TECHNICAL ACCURACY    ████████ 100%
RESPONSE SPEED        ████████ faster (less to generate)
READABILITY           ████████ better (no wall of text)

Key Files

caveman/
├── SKILL.md          # the skill definition loaded by Claude Code
├── benchmarks/
│   ├── run_benchmarks.py   # reproduce token count results
│   └── compare.py          # side-by-side comparison tool
├── plugin.json             # Codex plugin manifest
└── README.md

One rock. That it. 🪨

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

  • everything-claude-code-harness (Coding) — no summary provided by upstream source. Repository source; needs review.
  • paperclip-ai-orchestration (Coding) — no summary provided by upstream source. Repository source; needs review.
  • freecodecamp-curriculum (Coding) — no summary provided by upstream source. Repository source; needs review.
  • opencli-web-automation (Coding) — no summary provided by upstream source. Repository source; needs review.