Compress

Compress text semantically with iterative validation, anchor checksums, and verified information preservation.

Safety Notice

This listing is from the official public ClawHub registry. Review SKILL.md and referenced scripts before running.

Copy this command and send it to your AI assistant to install the skill:

npx skills add ivangdavila/compress

⚠️ Important Limitations

This is SEMANTIC compression, not bit-perfect lossless.

  • L1-L2: Verified reconstruction, production-ready
  • L3-L4: Experimental, may lose subtle information
  • Never use for: Medical dosages, legal text, financial figures, safety-critical data

The Validation Loop

1. Compress original O → compressed C
2. Extract anchors from O (entities, numbers, dates)
3. Reconstruct C → R (without seeing O)
4. Verify: anchors match + semantic diff
5. If mismatch → refine C with missing info
6. Repeat until validated (max 3 iterations)

Convergence = verified. No convergence after 3 rounds = level too aggressive.
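As a sketch, the loop above can be expressed in Python. The `compress` and `reconstruct` callables are hypothetical stand-ins for the LLM calls, and the regex anchor extractor is a deliberately simple placeholder for real entity extraction:

```python
import re

def extract_anchors(text):
    """Pull numbers, dollar amounts, dates, and quoted names out of text.

    A simple regex stand-in for real entity extraction.
    """
    pattern = r'\$?\d[\d,/-]*|"[^"]+"'
    return set(re.findall(pattern, text))

def validated_compress(original, compress, reconstruct, max_iters=3):
    """Run the compress -> reconstruct -> verify loop.

    `compress(text, feedback)` and `reconstruct(compressed)` are
    hypothetical LLM-backed callables supplied by the caller.
    Returns (compressed, validated, iterations_used).
    """
    anchors = extract_anchors(original)
    feedback = None
    for i in range(1, max_iters + 1):
        compressed = compress(original, feedback)
        rebuilt = reconstruct(compressed)        # must not see the original
        missing = anchors - extract_anchors(rebuilt)
        if not missing:
            return compressed, True, i           # converged: anchors match
        feedback = f"Missing anchors: {sorted(missing)}"
    return compressed, False, max_iters          # level too aggressive
```

With deterministic stub callables in place of the LLM calls, the loop converges on the second iteration once the feedback names the missing anchors.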


Quick Reference

Task                            Load
Compression levels (L1-L4)      levels.md
Validation algorithm details    validation.md
Format-specific strategies      formats.md
Token budgeting and metrics     metrics.md

Compression Levels

Level   Ratio    Reliability   Use Case
L1      ~0.8x    ✅ High       Production, human-readable
L2      ~0.5x    ✅ Good       System prompts, repeated use
L3      ~0.3x    ⚠️ Moderate   Experimental, review output
L4      ~0.15x   ⚠️ Low        Research only, expect losses
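As an illustration, the ratios from the table translate directly into an expected compressed size (the helper name is hypothetical, not part of the skill):

```python
# Approximate output-size ratios from the compression-levels table.
LEVEL_RATIOS = {"L1": 0.8, "L2": 0.5, "L3": 0.3, "L4": 0.15}

def estimated_tokens(original_tokens, level):
    """Rough size of the compressed text at a given level."""
    return round(original_tokens * LEVEL_RATIOS[level])
```

For a 1,000-token input, L2 should land near 500 tokens and L4 near 150.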

Anchor Checksum System

Before compression, extract critical facts:

[ANCHORS: 3 people, $42,000, 2024-03-15, "Project Alpha"]

Reconstruction MUST reproduce these exactly. If anchors mismatch → compression failed.
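A minimal verbatim check of the anchor checksum might look like this (the helper name and sample strings are illustrative only):

```python
def check_anchors(anchors, reconstruction):
    """Return the anchors the reconstruction failed to reproduce verbatim."""
    return [a for a in anchors if a not in reconstruction]

anchors = ["3 people", "$42,000", "2024-03-15", "Project Alpha"]
rebuilt = "Project Alpha: 3 people, budget $42,000, kickoff 2024-03-15."
missing = check_anchors(anchors, rebuilt)
# an empty `missing` list means the anchor checksum passed
```

Any non-empty result means the compression failed and the compressed text must be refined with the missing facts.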


Core Rules

  1. Always validate — Never trust compression without reconstruction test
  2. Use anchors — Extract numbers, names, dates before compressing
  3. Cap at L2 for production — L3-L4 are experimental
  4. Report confidence — Include iteration count and anchor match rate
  5. Independent verification — Consider using a different model for reconstruction
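One possible shape for the confidence report that rule 4 asks for, assuming a hypothetical helper:

```python
def confidence_report(iterations, anchors_total, anchors_matched, level):
    """Summarize a compression run: iteration count and anchor match rate."""
    return {
        "level": level,
        "iterations": iterations,
        "anchor_match_rate": anchors_matched / anchors_total,
        "validated": anchors_matched == anchors_total,
    }
```

A run that matched all four anchors in two iterations reports a match rate of 1.0 and `validated: True`; any unmatched anchor drops the rate below 1.0 and flags the run as unvalidated.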

Cost-Benefit Reality

Each compression costs 3-4 LLM calls. Break-even calculation:

break_even_retrievals = compression_tokens / saved_tokens_per_use

Only cost-effective if you'll retrieve the compressed content 6-8+ times.

For one-time use → just use the original text.
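Plugging in illustrative numbers (a hypothetical helper; figures assumed, not from the skill): a 10,000-token document compressed at L2's ~0.5x ratio, with ~30,000 tokens spent across the 3-4 validation calls.

```python
import math

def break_even_retrievals(compression_cost_tokens, original_tokens, ratio):
    """How many retrievals before compression pays for itself.

    Each retrieval of the compressed text saves
    original_tokens * (1 - ratio) tokens versus sending the original.
    """
    saved_per_use = original_tokens * (1 - ratio)
    return math.ceil(compression_cost_tokens / saved_per_use)
```

With those figures, each use saves 5,000 tokens and break-even lands at 6 retrievals, consistent with the 6-8+ guidance above.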


Before Compressing

  • Content type is NOT safety-critical
  • Target level chosen (L1-L2 recommended)
  • Anchors identified (numbers, names, dates)
  • ROI makes sense (multiple retrievals expected)

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

  • Fitbit Tracker (General): Personal Fitbit integration for daily health tracking with adaptive sleep and activity reporting
  • Ollama Load Balancer (General): Ollama load balancer for Llama, Qwen, DeepSeek, and Mistral inference across multiple machines. Load balancing with auto-discovery via mDNS, health checks, q...
  • Google Merchant Center (General): Google Merchant Center integration. Manage Accounts. Use when the user wants to interact with Google Merchant Center data.
  • Twitter/X All-in-One — Search, Monitor & Publish Text & Media Posts (General): Searches and reads X (Twitter): profiles, timelines, mentions, followers, tweet search, trends, lists, communities, and Spaces. Publishes posts, likes/unlike...