3-Layer Token Compressor — Cut AI API Costs 40-60%

Pre-process prompts through 3 compression layers before sending to paid APIs. Uses a local Ollama model to intelligently compress messages and summarize history. Same quality, fewer tokens, lower bills.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this and send it to your AI assistant:

Install the skill "3-Layer Token Compressor — Cut AI API Costs 40-60%" with this command: `npx skills add token-compressor`

3-Layer Token Compressor — Cut AI API Costs 40-60%

Pre-process prompts through 3 compression layers before sending to paid APIs. Uses a free local Ollama model to do the compression work — your paid API only sees the condensed result.

Runtime Requirements

| Requirement | Details |
| --- | --- |
| Ollama | Must be running locally (default: `localhost:11434`) |
| Local model | A small model for compression (e.g. `llama3.1:8b`). Configurable via the `compressionModel` option. |
| Node.js | 14+ |

Ollama is required at runtime. The compressor sends prompts to your local model — not to any external API.
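Since a running Ollama instance is a hard requirement, a quick liveness probe can be useful before constructing the compressor. A minimal sketch, assuming the defaults above (the `/api/tags` endpoint is part of the public Ollama HTTP API; the helper names here are illustrative and not part of this skill):

```javascript
// Build the base URL for the local Ollama instance from the same
// defaults the compressor uses (localhost:11434).
function ollamaBaseUrl({ ollamaHost = 'localhost', ollamaPort = 11434 } = {}) {
  return `http://${ollamaHost}:${ollamaPort}`;
}

// Ollama answers GET /api/tags with the locally installed models,
// so a successful response doubles as a liveness check.
// (Uses the built-in fetch, available in Node 18+.)
async function ollamaIsUp(opts) {
  try {
    const res = await fetch(`${ollamaBaseUrl(opts)}/api/tags`);
    return res.ok;
  } catch {
    return false;
  }
}
```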

What This Skill Sends to the Local Model

This skill sends the following to your local Ollama model:

| Operation | System prompt | User prompt |
| --- | --- | --- |
| Message compression | "You are a text compression tool. Output only what is asked, nothing else." | Your message + an instruction to compress |
| History summarization | Same | Old conversation turns + an instruction to summarize |

No data is sent to external APIs. All compression happens locally.
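Concretely, each local call is an HTTP POST to Ollama's `/api/generate` endpoint. A sketch of the request body under that assumption (the field names follow the public Ollama API; the compression instruction appended to your message is illustrative, not quoted from the skill):

```javascript
// Build the JSON body for a non-streaming compression request to
// Ollama's POST /api/generate endpoint.
function buildCompressionRequest(message, model = 'llama3.1:8b') {
  return {
    model,
    // System prompt as quoted in the table above.
    system: 'You are a text compression tool. Output only what is asked, nothing else.',
    // Illustrative instruction wording; the skill's actual phrasing may differ.
    prompt: `Compress the following message, preserving its meaning:\n\n${message}`,
    stream: false,
  };
}
```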

Side Effects

| Type | Description |
| --- | --- |
| NETWORK | HTTP to `localhost:11434` only — your local Ollama instance |
| MEMORY | Response cache stored in-memory (`Map`, configurable size/TTL) |
| DISK | None — cache is not persisted to disk |
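The in-memory response cache can be pictured as a `Map` holding timestamped entries, with expired entries dropped on read and the oldest entry evicted once the size cap is hit. A simplified sketch under those assumptions, not the skill's actual implementation:

```javascript
// Minimal in-memory TTL cache: a Map keyed by prompt text,
// evicting expired entries on read and the oldest entry when full.
class ResponseCache {
  constructor({ maxSize = 100, ttl = 3600000 } = {}) {
    this.maxSize = maxSize;
    this.ttl = ttl;
    this.map = new Map();
  }
  get(key) {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (Date.now() - entry.at > this.ttl) { // expired
      this.map.delete(key);
      return undefined;
    }
    return entry.value;
  }
  set(key, value) {
    if (this.map.size >= this.maxSize && !this.map.has(key)) {
      // Map preserves insertion order, so the first key is the oldest.
      this.map.delete(this.map.keys().next().value);
    }
    this.map.set(key, { value, at: Date.now() });
  }
}
```

Because nothing is written to disk, the cache is empty again on every restart.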

Setup

```js
const TokenCompressor = require('./src/token-compressor');

const compressor = new TokenCompressor({
  ollamaHost: 'localhost',          // default
  ollamaPort: 11434,                // default
  compressionModel: 'llama3.1:8b',  // default — any Ollama model works
  maxUncompressedTurns: 10,         // keep the last N turns verbatim
  cacheMaxSize: 100,                // max cached responses
  cacheTTL: 3600000                 // cache entry lifetime in ms (1 hour)
});
```

See README.md for full API documentation and usage examples.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

TokenRanger

Install, configure, and operate the TokenRanger OpenClaw plugin. Use when you want to reduce cloud LLM token costs by 50-80% via local Ollama context compres...

General

Λ-Compression — 90% - 98% Lossless Reasoning Compression

Physics-based lossless compression for AI output — prose AND structured data. Strips 60-98% of tokens with zero information loss. Prose mode compresses reaso...

General

Token Tamer — AI API Cost Control

Monitor, budget, and optimize AI API spending across any provider. Tracks every call, enforces budgets, detects waste, provides optimization recommendations.

Coding

Anthropic Token Optimizer

Reduce Anthropic API costs (cache read, compaction, context bloat) for OpenClaw agents. Use when users ask about token optimization, reducing API costs, cach...
