# Context Compression

Reduce context size while preserving the information critical to task completion.

## Overview

Context compression is essential for long-running agent sessions. The goal is NOT maximum compression: it's preserving enough information to complete tasks without re-fetching.

**Key metric:** tokens-per-task (total tokens to complete a task), NOT tokens-per-request.
## When to Use

- Long-running conversations approaching context limits
- Multi-step agent workflows with accumulating history
- Sessions with large tool outputs
- Memory management in persistent agents
## Strategy Quick Reference

| Strategy | Compression | Interpretable | Verifiable | Best For |
|---|---|---|---|---|
| Anchored Iterative | 60-80% | Yes | Yes | Long sessions |
| Opaque | 95-99% | No | No | Storage-critical |
| Regenerative Full | 70-85% | Yes | Partial | Simple tasks |
| Sliding Window | 50-70% | Yes | Yes | Real-time chat |

**Recommended:** Anchored Iterative Summarization with probe-based evaluation.
## Anchored Summarization (RECOMMENDED)

Maintains structured, persistent summaries with forced sections:

```markdown
## Session Intent
[What we're trying to accomplish - NEVER lose this]

## Files Modified
- path/to/file.ts: Added function X, modified class Y

## Decisions Made
- Decision 1: Chose X over Y because [rationale]

## Current State
[Where we are in the task - progress indicator]

## Blockers / Open Questions
- Question 1: Awaiting user input on...

## Next Steps
- Complete X
- Test Y
```
**Why it works:**

- Structure FORCES preservation of critical categories
- Each section must be explicitly populated (can't silently drop info)
- Incremental merge (new compressions extend, don't replace)
Implementation
from dataclasses import dataclass, field from typing import Optional
@dataclass class AnchoredSummary: """Structured summary with forced sections."""
session_intent: str
files_modified: dict[str, list[str]] = field(default_factory=dict)
decisions_made: list[dict] = field(default_factory=list)
current_state: str = ""
blockers: list[str] = field(default_factory=list)
next_steps: list[str] = field(default_factory=list)
compression_count: int = 0
def merge(self, new_content: "AnchoredSummary") -> "AnchoredSummary":
"""Incrementally merge new summary into existing."""
return AnchoredSummary(
session_intent=new_content.session_intent or self.session_intent,
files_modified={**self.files_modified, **new_content.files_modified},
decisions_made=self.decisions_made + new_content.decisions_made,
current_state=new_content.current_state,
blockers=new_content.blockers,
next_steps=new_content.next_steps,
compression_count=self.compression_count + 1,
)
def to_markdown(self) -> str:
"""Render as markdown for context injection."""
sections = [
f"## Session Intent\n{self.session_intent}",
f"## Files Modified\n" + "\n".join(
f"- `{path}`: {', '.join(changes)}"
for path, changes in self.files_modified.items()
),
f"## Decisions Made\n" + "\n".join(
f"- **{d['decision']}**: {d['rationale']}"
for d in self.decisions_made
),
f"## Current State\n{self.current_state}",
]
if self.blockers:
sections.append(f"## Blockers\n" + "\n".join(f"- {b}" for b in self.blockers))
sections.append(f"## Next Steps\n" + "\n".join(
f"{i+1}. {step}" for i, step in enumerate(self.next_steps)
))
return "\n\n".join(sections)
## Compression Triggers

| Threshold | Action |
|---|---|
| 70% capacity | Trigger compression |
| 50% capacity | Target after compression |
| 10 messages minimum | Required before compressing |
| Last 5 messages | Always preserve uncompressed |
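The thresholds above can be sketched as a simple trigger check. This is a minimal illustration; the function and parameter names are assumptions, not from a specific library:

```python
def should_compress(
    used_tokens: int,
    capacity: int,
    message_count: int,
    trigger_ratio: float = 0.70,  # compress at 70% capacity
    min_messages: int = 10,       # never compress very short histories
) -> bool:
    """Return True when the history is both long enough and full enough to compress."""
    if message_count < min_messages:
        return False
    return used_tokens / capacity >= trigger_ratio


def compression_budget(capacity: int, target_ratio: float = 0.50) -> int:
    """Token budget the compressed context should fit within (the 50% target)."""
    return int(capacity * target_ratio)
```

After a compression pass, the summary plus the last 5 preserved messages should fit within `compression_budget(capacity)`.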
### CC 2.1.7: Effective Context Window

Calculate thresholds against the effective context window (the static window minus system overhead):

| Trigger | Static (CC 2.1.6) | Effective (CC 2.1.7) |
|---|---|---|
| Warning | 60% of static | 60% of effective |
| Compress | 70% of static | 70% of effective |
| Critical | 90% of static | 90% of effective |
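Assuming `system_overhead` counts the tokens consumed by the system prompt and tool definitions, the effective-context thresholds above can be computed as follows (a sketch; names are illustrative):

```python
def effective_capacity(static_capacity: int, system_overhead: int) -> int:
    """Context actually available after the system prompt and tool definitions."""
    return static_capacity - system_overhead


def thresholds(static_capacity: int, system_overhead: int) -> dict[str, int]:
    """Warning / compress / critical token counts against the effective window."""
    effective = effective_capacity(static_capacity, system_overhead)
    return {
        "warning": round(effective * 0.60),
        "compress": round(effective * 0.70),
        "critical": round(effective * 0.90),
    }
```

For example, with a 200k static window and 30k of system overhead, the effective window is 170k and compression triggers at 119k tokens rather than 140k.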
## Best Practices

### DO

- Use anchored summarization with forced sections
- Preserve recent messages uncompressed (context continuity)
- Test compression with probes, not similarity metrics
- Merge incrementally (don't regenerate from scratch)
- Track compression count and quality scores
### DON'T

- Compress system prompts (keep them at the START of context)
- Use opaque compression for critical workflows
- Compress below the point of task completion
- Trigger compression opportunistically (use fixed thresholds)
- Optimize for compression ratio over task success
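The "preserve recent messages uncompressed" rule amounts to a simple partition of the history. A minimal sketch, assuming messages are held in an ordered list:

```python
def split_for_compression(messages: list, keep_recent: int = 5) -> tuple[list, list]:
    """Partition history for a compression pass.

    Older messages become candidates for summarization; the most recent
    `keep_recent` stay verbatim to preserve context continuity.
    """
    if len(messages) <= keep_recent:
        return [], list(messages)
    return messages[:-keep_recent], messages[-keep_recent:]
```

Only the first list is handed to the summarizer; the second is re-appended after the summary.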
## Target Metrics

| Metric | Target | Red Flag |
|---|---|---|
| Probe pass rate | >90% | <70% |
| Compression ratio | 60-80% | >95% (too aggressive) |
| Task completion | Same as uncompressed | Degraded |
| Latency overhead | <2s | >5s |
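Probe-based evaluation checks whether specific facts survive compression, rather than measuring text similarity. A minimal sketch, where a case-insensitive substring match stands in for a real answer check:

```python
def probe_pass_rate(compressed_text: str, probes: list[tuple[str, str]]) -> float:
    """Fraction of probes whose expected fact is still recoverable.

    Each probe is (question, expected_fact). Here "recoverable" is
    approximated by substring matching; a production evaluator would
    ask the question against the compressed context instead.
    """
    if not probes:
        return 1.0
    hits = sum(
        1 for _question, expected in probes
        if expected.lower() in compressed_text.lower()
    )
    return hits / len(probes)
```

A result below the 0.90 target means the compression dropped task-critical information and the pass should be retried with a larger budget.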
## References

For detailed implementation and patterns, see:

- **Compression Strategies**: Detailed comparison of all strategies (anchored, opaque, regenerative, sliding window), implementation patterns, and decision flowcharts
- **Priority Management**: Compression triggers, CC 2.1.7 effective context, probe-based evaluation, OrchestKit integration
## Bundled Resources

- `assets/anchored-summary-template.md`: Template for structured compression summaries with forced sections
- `assets/compression-probes-template.md`: Probe templates for validating compression quality
- `references/compression-strategies.md`: Detailed strategy comparisons
- `references/priority-management.md`: Compression triggers and evaluation
## Related Skills

- `context-engineering`: Attention mechanics and positioning
- `memory-systems`: Persistent storage patterns
- `multi-agent-orchestration`: Context isolation across agents
- `observability-monitoring`: Tracking compression metrics
**Version:** 1.0.0 (January )

**Key Principle:** Optimize for tokens-per-task, not tokens-per-request

**Recommended Strategy:** Anchored Iterative Summarization with probe-based evaluation
## Capability Details

### anchored-summarization

**Keywords:** compress, summarize history, context too long, anchored summary

**Solves:**

- Reduce context size while preserving critical information
- Implement structured compression with required sections
- Maintain session intent and decisions through compression
### compression-triggers

**Keywords:** token limit, running out of context, when to compress

**Solves:**

- Determine when to trigger compression (70% utilization)
- Set compression targets (50% utilization)
- Preserve the last 5 messages uncompressed
### probe-evaluation

**Keywords:** evaluate compression, test compression, probe

**Solves:**

- Validate compression quality with functional probes
- Test information preservation after compression
- Achieve >90% probe pass rate