CPR — Conversational Pattern Restoration
Fix robotic AI assistants. Any model. Any provider. Any personality.
Modern LLMs are over-trained toward sterile, corporate communication patterns. CPR identifies the 6 universal humanizing patterns lost during RLHF/fine-tuning and provides a systematic framework to restore them — without triggering sycophancy or hype drift.
Version 4.0: Personality-agnostic + model-size aware + game-theoretically grounded. Works on everything from Haiku to Opus. Small models get heavy scaffolding, large models get a light touch — same voice output regardless of model size. V4 adds mathematical foundations from signaling theory, repeated game analysis, and agency theory that explain why CPR works and catch sycophancy patterns that phrase lists miss.
Quick Start
- Define your baseline: Use
BASELINE_TEMPLATE.mdto identify YOUR authentic voice - Apply restoration patterns: Read
RESTORATION_FRAMEWORK.md— the 6 universal patterns across personality types - Prevent drift: Use
DRIFT_PREVENTION.mdcalibrated to YOUR personality - Understand the math (optional): Read
CPR_V4_GAME_THEORY.mdfor game theory foundations - Reference results: See
CROSS_MODEL_RESULTS.mdfor model-specific notes
What's Included
| File | Purpose |
|---|---|
QUICKSTART_TIERED.md | START HERE if new — Tier 1 (5 min), Tier 2 (30 min), Tier 3 (full). Don't install more than you need. |
INSTALLATION.md | Security transparency guide — exact system prompt block, file locations, what "prompt override" means, sandboxed testing steps |
ROLLBACK.md | Full uninstall & downgrade guide — backup procedure, exact removal steps, emergency kill switch |
README.md | Full overview, architecture, philosophy, FAQ |
BASELINE_TEMPLATE.md | START HERE — Define YOUR personality's authentic voice |
RESTORATION_FRAMEWORK.md | Core methodology — 6 universal patterns across personality types |
DRIFT_PREVENTION.md | Anti-drift system — pre-send gate, standing orders, daily reset |
MODEL_CALIBRATION.md | Three-tier prompt engineering for small/medium/large models |
CPR_V4_GAME_THEORY.md | V4 — Game theory foundations: signal credibility, repeated game stability, moral hazard, adaptive calibration |
DRIFT_MECHANISM_ANALYSIS.md | Root cause analysis of why drift happens |
CPR_EXTENDED.md | Autonomous drift monitoring for long-running persistent agents |
CROSS_MODEL_RESULTS.md | Test results across 8+ models with before/after examples |
TEST_VALIDATION.md | Practical validation tests (7 scenarios) |
Version History
V4.2 (March 2026) — Opus Final Audit + Authority Drift
- Authority/expertise drift — new Universal Drift Marker #8: domain confidence triggers pedagogical/expert register independent of task format. Distinct from genre drift. Scoring: +0.1 (context-dependent).
- Voice filter operationalized — abstract "does this sound like me?" replaced with 3 concrete anchor questions + tier-specific guidance (Tier 1: explicit banned-word lists per format type, Tier 2-3: semantic self-evaluation with anchors)
- Emotional contagion — Failure Mode 2 expanded from "excitement mirroring" to all emotions (frustration → over-apologetic, anxiety → minimizing, self-deprecation → over-correcting)
- Two new high-risk formats — comparative/review (critic register) + instructional/tutorial (pedagogical register)
- Anti-sycophancy scope note — added to DRIFT_PREVENTION.md clarifying markers apply to conversational output, not documentation
- Full Opus audit:
smith/CPR_OPUS_FINAL.md(2 must-fix, 3 should-fix, 8 nice-to-have)
V4.1 (March 2026) — Format-Induced Drift Fix
- Format-induced drift (Genre drift) — new universal drift category: task genre overrides voice calibration. Anti-sycophancy systems miss this because it's a register/tone shift, not validation language. Added to DRIFT_PREVENTION.md (Universal Drift Marker #7), CPR_EXTENDED.md (Failure Mode 4 + scoring weight +0.2 + high-risk contexts), and system prompt integration block.
- 99%+ success metric defined — CPR_V4_GAME_THEORY.md now defines the metric explicitly (% of scenarios where CPR-restored > baseline on blind human eval)
- Identified from: Rose/Smith production use (psychology profile analysis, 2026-03-05)
- Full audit report:
skills/cpr/CPR_V4_FULL_AUDIT.md
V4.0 (March 2026) — Game Theory Foundations
- Signal credibility analysis — catches novel sycophancy that phrase lists miss by evaluating whether a statement is cheap talk or costly signal
- Repeated game stability — Folk Theorem explains when personality collapses (small models = low discount factor) and why scaffolding fixes it
- Moral hazard framework — RLHF as principal-agent problem; monitoring architecture scales by model tier
- Adaptive calibration — dynamic tone adjustment with one-way validation ratchet (can decrease, never increase)
- Mathematical honesty — claims only what the math supports; reasoning, not proofs
- Game theory library by Halthasar (Yesterday AI)
- Independent audit by Claude Opus (19/24 findings fully addressed, 4 partially, 1 deferred → all resolved in V4.1)
V3.0 (February 2026) — Model-Size Calibration
- Three-tier scaffolding (heavy/standard/light) by model size
- Fixes Haiku voice collapse bug
- Cross-model test matrix
V2.0 (February 2026) — Personality-Agnostic
- Separated universal drift from personality variance
- Four personality archetypes + hybrids
- Personality-specific drift calibration
- Baseline definition protocol
V1.0 (February 2026) — Original
- 6 universal restoration patterns
- Single personality type (Direct/Minimal)
- Basic drift prevention
Core vs Extended
CPR Core (RESTORATION_FRAMEWORK + DRIFT_PREVENTION)
Use when: Sessions under ~30 messages, lightweight models, zero overhead wanted.
What you get: 6 universal patterns, static drift prevention, daily reset protocol. Works across all tested models.
CPR Extended (CPR_EXTENDED.md)
Use when: Sessions run 100+ messages, agent is persistent (24/7), drift returns after corrections.
What you get (in addition to Core): Autonomous real-time monitoring, silent self-correction, persistent state across compactions, self-learning thresholds.
CPR Game Theory Layer (CPR_V4_GAME_THEORY.md)
Use when: You want to understand why CPR works, optimize for edge cases, adapt the framework to novel situations, or scale monitoring to model capability.
What you get: Signal credibility test (catches novel sycophancy), Folk Theorem stability analysis (predicts voice collapse), moral hazard monitoring architecture, adaptive calibration with safety constraints.
The 6 Universal Restoration Patterns
- Affirming particles — "Yeah," "Alright," "Exactly" — conversational bridges
- Rhythmic sentence variety — Short, medium, long — natural cadence
- Observational humor — Wry, targets tools not people — deflective
- Micro-narratives — Brief delay/failure explanations — transparency
- Pragmatic reassurance — "Either way works fine" — option-focused, not decision-grading
- Brief validation — "Nice!" — controlled acknowledgment, rare, moves on immediately
Each personality expresses these differently. See RESTORATION_FRAMEWORK.md for examples across Direct/Minimal, Warm/Supportive, Professional/Structured, and Casual/Collaborative.
Why It Works
Corporate RLHF training is shallow. It optimizes for safety metrics, not communication quality. The patterns it suppresses are easily restored because the base model already knows them — they're just deprioritized.
V4 adds the why behind the how:
- Signal credibility explains why sycophancy feels fake (cheap talk carries no information)
- Folk Theorem explains why small models lose voice (low effective discount factor)
- Moral hazard explains why monitoring works (RLHF incentives are misaligned; explicit audit changes behavior)
- Adaptive calibration explains why one-size-fits-all tone fails (conversations have dynamic temperature)
This is principle-dependent, not intelligence-dependent. Haiku passes at the same rate as Opus.
Why Auto-Loading Matters
Abstract behavioral rules lose to RLHF defaults because they require judgment calls the model's helpfulness training wins. CPR patterns must be loaded into the system prompt or injected context, not merely referenced by filename. If your CPR patterns aren't auto-loading, they aren't working.
Models Tested
| Model | Scenarios | Improved | Notes |
|---|---|---|---|
| Claude Opus 4.6 | 30 | Baseline | Natural baseline |
| Claude Sonnet 4.5 | 10 | 10/10 | Full restoration |
| Claude Haiku 4.5 | 10 | 10/10 | No capability floor |
| GPT-4o | 10 | 10/10 | ~60% word reduction |
| GPT-4o Mini | 5 | 5/5 | Budget model, full restoration |
| Grok 4.1 Fast | 10 | 9/10 | Zero crashes |
| Gemini 2.5 Flash | 5 | 5/5 | Clean restoration |
| Gemini 2.5 Pro | 5 | 5/5 | Full restoration |
85+ scenarios, 84+ improved. 99%+ success rate across all capability tiers.
Scope & Known Limitations
Multi-Agent / Multi-User Conversations
CPR V4.2 is designed for single-agent-single-user interaction. Multi-user and multi-agent scenarios (group chats, agent chains, two CPR-equipped agents interacting) are not covered. Issues: whose baseline sets the target voice? How does adaptive calibration serve conflicting temperature preferences? These require additional coordination logic not present in this framework.
Code & Data Output
CPR targets conversational output. Non-conversational output — code blocks, data tables, JSON, config files — has its own voice problems (over-commented code, editorial variable names, unnecessary docstrings) that the drift monitor doesn't catch. Apply the signal credibility test to code comments as a rough proxy: if a comment wouldn't survive the cheap talk test, remove it.
Language & Cultural Calibration
CPR patterns are calibrated for English-language Western conversational norms. Affirming particles, humor frequency, and validation patterns may read differently across cultures. Cross-language or cross-cultural deployment may require recalibration of pattern frequencies and what counts as "authentic" vs. "drifted" for that context.
Acknowledgments
Created by Shadow Rose. Game theory integration by Shadow Rose × Halthasar (Yesterday AI). Built on Claude by Anthropic. Independently audited by Claude Opus (2026-03-01).
🛠️ Need something custom? Custom OpenClaw agents & skills starting at $500 → https://www.fiverr.com/s/jjmlZ0v
☕ If CPR helped your agent: https://ko-fi.com/theshadowrose