3-Layer Token Compressor — Cut AI API Costs 40-60%

Pre-process prompts through 3 compression layers before sending to paid APIs. Uses a local Ollama model to intelligently compress messages and summarize history. Same quality, fewer tokens, lower bills.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant to install the skill:

Install skill "3-Layer Token Compressor — Cut AI API Costs 40-60%" with this command: npx skills add TheShadowRose/token-compressor

3-Layer Token Compressor — Cut AI API Costs 40-60%

Pre-process prompts through 3 compression layers before sending to paid APIs. Uses a free local Ollama model to do the compression work — your paid API only sees the condensed result.

Runtime Requirements

| Requirement | Details |
| --- | --- |
| Ollama | Must be running locally (default: `localhost:11434`) |
| Local model | A small model for compression (e.g. `llama3.1:8b`). Configurable via the `compressionModel` option. |
| Node.js | 14+ |

Ollama is required at runtime. The compressor sends prompts to your local model — not to any external API.
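Before constructing the compressor, you can verify that the configured model is actually pulled. A minimal sketch, assuming Ollama's `GET /api/tags` endpoint, which returns JSON shaped like `{ models: [{ name: 'llama3.1:8b', ... }] }`; the helper name is illustrative, not part of this skill's API:

```javascript
// Check whether a model name appears in a local Ollama /api/tags response.
function hasModel(tagsResponse, modelName) {
  return (tagsResponse.models || []).some((m) => m.name === modelName);
}

// Illustrative response shape, not live data:
const tags = { models: [{ name: 'llama3.1:8b' }, { name: 'qwen2.5:3b' }] };
console.log(hasModel(tags, 'llama3.1:8b')); // true
console.log(hasModel(tags, 'mistral:7b'));  // false
```

In practice you would fetch `http://localhost:11434/api/tags` and pass the parsed JSON to this check before starting the compressor.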

What This Skill Sends to the Local Model

This skill sends the following to your local Ollama model:

| Operation | System prompt | User prompt |
| --- | --- | --- |
| Message compression | "You are a text compression tool. Output only what is asked, nothing else." | Your message + an instruction to compress |
| History summarization | Same | Old conversation turns + an instruction to summarize |

No data is sent to external APIs. All compression happens locally.
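For concreteness, here is a sketch of the kind of request body a compression call would POST to the local Ollama `/api/chat` endpoint. The system prompt is quoted from the table above; the user-side compression instruction is not documented, so the wording below is a placeholder, not the skill's actual prompt:

```javascript
// Build an Ollama /api/chat request body for one compression call.
function buildCompressionRequest(model, message) {
  return {
    model,
    stream: false,
    messages: [
      {
        role: 'system',
        // Quoted from the skill's documented system prompt.
        content: 'You are a text compression tool. Output only what is asked, nothing else.'
      },
      {
        role: 'user',
        // Hypothetical instruction; the skill's real wording may differ.
        content: `Compress the following message, preserving its meaning:\n\n${message}`
      }
    ]
  };
}

const body = buildCompressionRequest('llama3.1:8b', 'Hello, could you possibly help me out?');
console.log(body.messages[0].role); // 'system'
```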

Side Effects

| Type | Description |
| --- | --- |
| NETWORK | HTTP to `localhost:11434` only (your local Ollama instance) |
| MEMORY | Response cache stored in-memory (`Map`, configurable size/TTL) |
| DISK | None; the cache is not persisted to disk |
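The in-memory cache noted above can be pictured as a `Map` with a size cap and per-entry TTL. A minimal sketch; the class name and eviction policy (drop the oldest insertion) are assumptions, not the skill's actual implementation:

```javascript
// Size-capped, TTL-bounded in-memory cache built on a Map.
class ResponseCache {
  constructor(maxSize = 100, ttl = 3600000) {
    this.maxSize = maxSize;
    this.ttl = ttl;
    this.map = new Map(); // key -> { value, expires }
  }

  get(key) {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expires) { // expired: drop and report a miss
      this.map.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    if (this.map.size >= this.maxSize && !this.map.has(key)) {
      // Evict the oldest insertion (Map preserves insertion order).
      this.map.delete(this.map.keys().next().value);
    }
    this.map.set(key, { value, expires: Date.now() + this.ttl });
  }
}

const cache = new ResponseCache(2, 1000);
cache.set('a', 'compressed A');
cache.set('b', 'compressed B');
cache.set('c', 'compressed C'); // cap reached: evicts 'a'
console.log(cache.get('a')); // undefined
console.log(cache.get('c')); // 'compressed C'
```

Because nothing is written to disk, the cache resets whenever the process restarts.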

Setup

```javascript
const TokenCompressor = require('./src/token-compressor');

const compressor = new TokenCompressor({
  ollamaHost: 'localhost',          // default
  ollamaPort: 11434,                // default
  compressionModel: 'llama3.1:8b',  // default; any Ollama model works
  maxUncompressedTurns: 10,         // keep the last N turns verbatim
  cacheMaxSize: 100,                // max cached responses
  cacheTTL: 3600000                 // cache entry lifetime in ms (1 hour)
});
```
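To illustrate what `maxUncompressedTurns` controls: the oldest turns are sent to the local model for summarization, while the newest N are passed through verbatim. A sketch of that partition; the function name is illustrative, not the skill's actual API:

```javascript
// Split a conversation: old turns to summarize vs. recent turns kept verbatim.
function partitionHistory(turns, maxUncompressedTurns) {
  const keepFrom = Math.max(0, turns.length - maxUncompressedTurns);
  return {
    toSummarize: turns.slice(0, keepFrom), // sent to Ollama for a summary
    verbatim: turns.slice(keepFrom)        // passed through unchanged
  };
}

const turns = ['t1', 't2', 't3', 't4', 't5'];
const { toSummarize, verbatim } = partitionHistory(turns, 3);
console.log(toSummarize); // ['t1', 't2']
console.log(verbatim);    // ['t3', 't4', 't5']
```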

See README.md for full API documentation and usage examples.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

TokenRanger

Install, configure, and operate the TokenRanger OpenClaw plugin. Use when you want to reduce cloud LLM token costs by 50-80% via local Ollama context compres...

General

Token Tamer — AI API Cost Control

Monitor, budget, and optimize AI API spending across any provider. Tracks every call, enforces budgets, detects waste, provides optimization recommendations.

Automation

Tokenoptimizer

Reduce OpenClaw AI costs by 97%. Haiku model routing, free Ollama heartbeats, prompt caching, and budget controls. Go from $1,500/month to $50/month in 5 min...
