Session Replay Skill
Purpose
This skill analyzes claude-trace JSONL files to provide insights into Claude Code session health, token usage patterns, error frequencies, and agent effectiveness. It complements the /transcripts command by focusing on API-level trace data rather than conversation transcripts.
When to Use This Skill
- Session debugging: Diagnose why a session was slow or failed
- Token analysis: Understand token consumption patterns
- Error patterns: Identify recurring failures across sessions
- Performance optimization: Find bottlenecks in tool usage
- Agent effectiveness: Measure which agents/tools are most productive
Quick Start
Analyze Latest Session
User: Analyze my latest session health
I'll analyze the most recent trace file:
    from pathlib import Path

    # Read latest trace file from .claude-trace/
    trace_dir = Path(".claude-trace")
    trace_files = sorted(trace_dir.glob("*.jsonl"), key=lambda f: f.stat().st_mtime)
    latest = trace_files[-1] if trace_files else None

    # Parse and analyze
    # (analyze_trace_file and format_session_report are illustrative helper
    # names; this document does not define them.)
    if latest:
        analysis = analyze_trace_file(latest)
        print(format_session_report(analysis))
Compare Multiple Sessions
User: Compare token usage across my last 5 sessions
I'll aggregate metrics across sessions:
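    from pathlib import Path

    # Trace file names embed a timestamp, so a name sort is chronological.
    trace_files = sorted(Path(".claude-trace").glob("*.jsonl"))[-5:]

    # (compare_sessions and format_comparison_table are illustrative helper
    # names; this document does not define them.)
    comparison = compare_sessions(trace_files)
    print(format_comparison_table(comparison))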
Actions
Action: health
Analyze session health metrics from a trace file.
What to do:
- Read the trace file (JSONL format)
- Extract API requests and responses
- Calculate metrics:
  - Total tokens (input/output)
  - Request count and timing
  - Error rate
  - Tool usage distribution
- Generate health report
Metrics to extract:
From each JSONL line containing a request/response pair:
    {
      "timestamp": "...",
      "request": {
        "method": "POST",
        "url": "https://api.anthropic.com/v1/messages",
        "body": {
          "model": "claude-...",
          "messages": [...],
          "tools": [...]
        }
      },
      "response": {
        "usage": { "input_tokens": N, "output_tokens": N },
        "content": [...],
        "stop_reason": "..."
      }
    }
Output format:
Session Health Report
File: log-2025-11-23-19-32-36.jsonl
Duration: 45 minutes
Token Usage:
- Input: 125,432 tokens
- Output: 34,521 tokens
- Total: 159,953 tokens
- Efficiency: 27.5% output ratio (output tokens / input tokens)
Request Stats:
- Total requests: 23
- Average latency: 2.3s
- Errors: 2 (8.7%)
Tool Usage:
- Read: 45 calls
- Edit: 12 calls
- Bash: 8 calls
- Grep: 15 calls
Health Score: 82/100 (Good)
- Minor issue: 2 errors detected
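The derived figures in the report above could be computed along these lines. This is a minimal sketch rather than the skill's actual implementation: `summarize_health` is a hypothetical helper, timestamps are assumed to be numeric epoch seconds (the trace format shown above does not specify this), and errors are detected through a `response.error` field, as in the Implementation Notes.

```python
from typing import Any, Dict, List


def summarize_health(entries: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: derive the report's headline numbers from parsed entries."""
    input_tokens = output_tokens = requests = errors = 0
    timestamps: List[float] = []
    for entry in entries:
        if "request" not in entry:
            continue
        requests += 1
        ts = entry.get("timestamp")
        if isinstance(ts, (int, float)):  # assumption: epoch seconds
            timestamps.append(float(ts))
        response = entry.get("response") or {}
        usage = response.get("usage") or {}
        input_tokens += usage.get("input_tokens", 0)
        output_tokens += usage.get("output_tokens", 0)
        if response.get("error"):
            errors += 1
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_tokens": input_tokens + output_tokens,
        # Output ratio as in the sample report: output tokens / input tokens.
        "output_ratio": output_tokens / input_tokens if input_tokens else 0.0,
        "requests": requests,
        "error_rate": errors / requests if requests else 0.0,
        # Span between the first and last timestamped request, in minutes.
        "duration_minutes": (max(timestamps) - min(timestamps)) / 60 if timestamps else 0.0,
    }
```

With the sample report's token counts (125,432 input and 34,521 output), `output_ratio` comes out at about 0.275, matching the 27.5% shown.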
Action: errors
Identify error patterns across sessions.
What to do:
- Scan trace files for error responses
- Categorize errors by type
- Identify recurring patterns
- Suggest fixes
Error categories to detect:
- Rate limit errors (429)
- Token limit exceeded
- Tool execution failures
- Timeout errors
- API errors
Output format:
Error Analysis
Sessions analyzed: 5
Total errors: 12
Error Categories:
- Rate limit (429): 5 occurrences
  - Recommendation: Add delays between requests
- Token limit: 3 occurrences
  - Recommendation: Use context management skill
- Tool failures: 4 occurrences
  - Bash timeout: 2
  - File not found: 2
  - Recommendation: Check paths before operations
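One way to bucket error responses into the categories listed above is sketched below. The field names `status_code`, `error.type`, and `error.message` are assumptions about how an error appears in a trace entry, and the substring checks are heuristics rather than a documented contract. Tool execution failures are not covered here, since they would surface in tool results rather than in API error responses.

```python
from collections import Counter
from typing import Any, Dict, List


def categorize_error(response: Dict[str, Any]) -> str:
    """Hypothetical classifier: map one error response to a category name."""
    status = response.get("status_code")  # assumed field name
    error = response.get("error") or {}
    if not isinstance(error, dict):
        # Some traces may store the error as a plain string.
        error = {"message": str(error)}
    err_type = str(error.get("type", "")).lower()
    message = str(error.get("message", "")).lower()

    if status == 429 or "rate_limit" in err_type:
        return "rate_limit"
    if "too long" in message or "token" in message:
        return "token_limit"
    if "timeout" in err_type or "timed out" in message or "timeout" in message:
        return "timeout"
    return "api_error"


def count_error_categories(entries: List[Dict[str, Any]]) -> Counter:
    """Tally categories across all entries whose response carries an error."""
    counts: Counter = Counter()
    for entry in entries:
        response = entry.get("response") or {}
        if response.get("error"):
            counts[categorize_error(response)] += 1
    return counts
```

Passing the concatenated entries of several trace files to `count_error_categories` gives per-category totals for the "Error Categories" part of the report.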
Action: compare
Compare metrics across multiple sessions.
What to do:
- Load multiple trace files
- Extract comparable metrics
- Calculate trends
- Identify anomalies
Output format:
Session Comparison
                  Session 1   Session 2   Session 3   Trend
    Tokens (total)    150K        180K        120K        -17%
    Requests          25          30          18          -28%
    Errors            2           0           1           stable
    Duration (min)    45          60          30          -33%
    Efficiency        0.27        0.32        0.35        +7%
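The document does not define how the Trend column is computed. One plausible definition, sketched here as an assumption, is the percent change from the first to the last session, with small changes reported as "stable". The helper names and the 5% band are hypothetical. This definition reproduces the Requests and Duration rows of the sample table, but not every row.

```python
from typing import List, Optional


def percent_change(values: List[float]) -> Optional[float]:
    """Percent change from the first to the last value; None if undefined."""
    if len(values) < 2 or values[0] == 0:
        return None
    return (values[-1] - values[0]) / values[0] * 100


def format_trend(values: List[float], stable_band: float = 5.0) -> str:
    """Render a trend cell; changes within +/- stable_band percent read as 'stable'."""
    change = percent_change(values)
    if change is None:
        return "n/a"
    if abs(change) < stable_band:
        return "stable"
    return f"{change:+.0f}%"
```

For example, `format_trend([25, 30, 18])` yields `-28%` and `format_trend([45, 60, 30])` yields `-33%`.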
Action: tools
Analyze tool usage patterns.
What to do:
- Extract tool calls from traces
- Calculate frequency and timing
- Identify inefficient patterns
- Suggest optimizations
Patterns to detect:
- Sequential calls that could be parallel
- Repeated reads of same file
- Excessive grep/glob calls
- Unused tool results
Output format:
Tool Usage Analysis
    Tool    Calls   Avg Time   Success Rate
    Read    45      0.1s       100%
    Edit    12      0.3s       92%
    Bash    8       1.2s       75%
    Grep    15      0.2s       100%
    Task    3       45s        100%
Optimization Opportunities:
- 5 Read calls to same file within 2 minutes
  - Consider caching strategy
- 3 sequential Bash calls could be parallelized
  - Use multiple Bash calls in single message
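The "repeated reads of same file" pattern could be detected with a sliding window over read timestamps. The sketch below assumes tool calls have already been extracted into dictionaries with `name`, `input`, and `timestamp` keys, that the Read tool's path parameter is called `file_path`, and that timestamps are epoch seconds. `find_repeated_reads` and its thresholds are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


def find_repeated_reads(
    tool_calls: List[Dict[str, Any]],
    window_seconds: float = 120.0,
    min_repeats: int = 3,
) -> List[Tuple[str, int]]:
    """Return (file_path, count) for files read at least min_repeats times within the window.

    Each call is assumed to look like
    {"name": "Read", "input": {"file_path": "..."}, "timestamp": <epoch seconds>}.
    """
    reads = defaultdict(list)
    for call in tool_calls:
        if call.get("name") != "Read":
            continue
        path = (call.get("input") or {}).get("file_path")
        if path is not None:
            reads[path].append(float(call.get("timestamp", 0)))

    flagged = []
    for path, times in reads.items():
        times.sort()
        best = 0
        start = 0
        # Sliding window: largest number of reads that fit inside window_seconds.
        for end in range(len(times)):
            while times[end] - times[start] > window_seconds:
                start += 1
            best = max(best, end - start + 1)
        if best >= min_repeats:
            flagged.append((path, best))
    # Most frequently re-read files first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

The same grouping approach extends to the other patterns, for instance counting Grep and Glob calls per request to flag excessive searching.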
Implementation Notes
Parsing JSONL Traces
claude-trace files are in JSONL format, with one request/response pair per line:

    import json
    from pathlib import Path
    from typing import Dict, List, Any

    def parse_trace_file(path: Path) -> List[Dict[str, Any]]:
        """Parse a claude-trace JSONL file."""
        entries = []
        with open(path) as f:
            for line in f:
                if line.strip():
                    try:
                        entry = json.loads(line)
                        entries.append(entry)
                    except json.JSONDecodeError:
                        continue  # Skip malformed lines
        return entries

    def extract_metrics(entries: List[Dict]) -> Dict[str, Any]:
        """Extract session metrics from trace entries."""
        metrics = {
            "total_input_tokens": 0,
            "total_output_tokens": 0,
            "request_count": 0,
            "error_count": 0,
            "tool_usage": {},
            "timestamps": [],
        }

        for entry in entries:
            if "request" in entry:
                metrics["request_count"] += 1
                metrics["timestamps"].append(entry.get("timestamp", 0))

            if "response" in entry:
                response = entry["response"]
                usage = response.get("usage", {})
                metrics["total_input_tokens"] += usage.get("input_tokens", 0)
                metrics["total_output_tokens"] += usage.get("output_tokens", 0)

                # Check for errors
                if response.get("error"):
                    metrics["error_count"] += 1

                # Count tool calls from tool_use blocks in the response content.
                # The request body's "tools" list only declares which tools are
                # available, so counting it would not measure actual usage.
                for block in response.get("content") or []:
                    if isinstance(block, dict) and block.get("type") == "tool_use":
                        name = block.get("name", "unknown")
                        metrics["tool_usage"][name] = metrics["tool_usage"].get(name, 0) + 1

        return metrics
Locating Trace Files
    def find_trace_files(trace_dir: str = ".claude-trace") -> List[Path]:
        """Find all trace files, sorted by modification time."""
        trace_path = Path(trace_dir)
        if not trace_path.exists():
            return []
        return sorted(
            trace_path.glob("*.jsonl"),
            key=lambda f: f.stat().st_mtime,
            reverse=True  # Most recent first
        )
Error Handling
Handle common error scenarios gracefully:
    import json
    from pathlib import Path
    from typing import Dict, List, Tuple

    def safe_parse_trace_file(path: Path) -> Tuple[List[Dict], List[str]]:
        """Parse trace file with error collection for malformed lines.

        Returns:
            Tuple of (valid_entries, error_messages)
        """
        entries = []
        errors = []

        if not path.exists():
            return [], [f"Trace file not found: {path}"]

        try:
            # Read as UTF-8 explicitly so the encoding error below is meaningful
            with open(path, encoding="utf-8") as f:
                for line_num, line in enumerate(f, 1):
                    if not line.strip():
                        continue
                    try:
                        entry = json.loads(line)
                        entries.append(entry)
                    except json.JSONDecodeError as e:
                        errors.append(f"Line {line_num}: Invalid JSON - {e}")
        except PermissionError:
            return [], [f"Permission denied: {path}"]
        except UnicodeDecodeError:
            return [], [f"Encoding error: {path} (expected UTF-8)"]

        return entries, errors

    def format_error_report(errors: List[str], path: Path) -> str:
        """Format error report for user display."""
        if not errors:
            return ""

        report = f"""
    Trace File Issues

    File: {path.name}
    Issues found: {len(errors)}

    """
        for error in errors[:10]:  # Limit to first 10
            report += f"- {error}\n"

        if len(errors) > 10:
            report += f"\n... and {len(errors) - 10} more issues"

        return report
Common error scenarios:
| Scenario | Cause | Handling |
| --- | --- | --- |
| Empty file | Session had no API calls | Report "No data to analyze" |
| Malformed JSON | Corrupted trace or interrupted write | Skip line, count in error report |
| Missing fields | Older trace format | Use .get() with defaults |
| Permission denied | File locked by another process | Clear error message, suggest retry |
| Encoding error | Non-UTF-8 characters | Report encoding issue |
Integration with Existing Tools
Tool Selection Matrix
| Need | Use This | Why |
| --- | --- | --- |
| "Why was my session slow?" | session-replay | API latency and token metrics |
| "What did I discuss last session?" | /transcripts | Conversation content |
| "Extract learnings from sessions" | CodexTranscriptsBuilder | Knowledge extraction |
| "Reduce my token usage" | session-replay + context_management | Metrics + optimization |
| "Resume interrupted work" | /transcripts | Context restoration |
vs. /transcripts Command
/transcripts (conversation management):
- Focuses on conversation content
- Restores session context
- Used for context preservation
- Trigger: "restore session", "continue work", "what was I doing"
session-replay skill (API-level analysis):
- Focuses on API metrics
- Analyzes performance and errors
- Used for debugging and optimization
- Trigger: "session health", "token usage", "why slow", "debug session"
vs. CodexTranscriptsBuilder
CodexTranscriptsBuilder (knowledge extraction):
- Extracts patterns from conversations
- Builds learning corpus
- Knowledge-focused
- Trigger: "extract patterns", "build knowledge base", "learn from sessions"
session-replay skill (metrics analysis):
- Extracts performance metrics
- Identifies technical issues
- Operations-focused
- Trigger: "performance metrics", "error patterns", "tool efficiency"
Combined Workflows
Workflow 1: Diagnose and Fix Token Issues
- session-replay: Analyze token usage patterns (health action)
- Identify high-token operations
- context_management skill: Apply proactive trimming
- session-replay: Compare before/after sessions (compare action)
Workflow 2: Post-Incident Analysis
- session-replay: Identify error patterns (errors action)
- /transcripts: Review conversation context around errors
- session-replay: Check tool usage around failures (tools action)
- Document findings in DISCOVERIES.md
Workflow 3: Performance Baseline
- session-replay: Analyze 5-10 recent sessions (compare action)
- Establish baseline metrics (tokens, latency, errors)
- Track deviations from baseline over time
Storage Locations
- Trace files: .claude-trace/*.jsonl
- Session logs: ~/.amplihack/.claude/runtime/logs/<session_id>/
- Generated reports: Output directly (no persistent storage needed)
Philosophy Alignment
Ruthless Simplicity
- Single-purpose: Analyze trace files only - no session management, no transcript editing
- No external dependencies: Uses only Python standard library (json, pathlib, datetime)
- Direct file parsing: No ORM, no database, no complex abstractions
- Present-moment focus: Analyzes what exists now, no future-proofing
Zero-BS Implementation
- All functions work completely: Every code example in this skill runs without modification
- Real parsing, real metrics: No mocked data, no placeholder calculations
- No stubs or placeholders: If a feature is documented, it works
- Fail fast on errors: Clear error messages, no silent failures
Brick Philosophy
- Self-contained analysis: All functionality in this single skill
- Clear inputs (trace files) and outputs (reports): No hidden state or side effects
- Regeneratable from this specification: This SKILL.md is the complete source of truth
- Isolated responsibility: Session analysis only - doesn't modify files or trigger actions
Limitations
This skill CANNOT:
- Modify trace files: Read-only analysis, no editing or deletion
- Generate traces: Use the claude-trace npm package to create trace files
- Restore sessions: Use the /transcripts command for session restoration
- Monitor in real time: Analyzes completed sessions, not live tracking
- Analyze across projects: Analyzes traces in the current project only
- Parse non-JSONL formats: Only the claude-trace JSONL format is supported
- Access remote traces: Local filesystem only, no cloud storage
Tips for Effective Analysis
- Start with health check: Run health action first
- Look for patterns: Use errors to find recurring issues
- Optimize hot spots: Use tools to find inefficiencies
- Track trends: Use compare across sessions
- Combine with transcripts: Use /transcripts for context
Common Patterns
Pattern 1: Debug Slow Session
User: My last session was really slow, analyze it
- Run health action on latest trace
- Check request latencies
- Identify tool bottlenecks
- Report findings with recommendations
Pattern 2: Reduce Token Usage
User: I'm hitting token limits, help me understand usage
- Compare token usage across sessions
- Identify high-token operations
- Suggest context management strategies
- Recommend workflow optimizations
Pattern 3: Fix Recurring Errors
User: I keep getting errors, find the pattern
- Run errors action across last 10 sessions
- Categorize and count error types
- Identify root causes
- Provide targeted fixes
Resources
- Trace directory: .claude-trace/
- Transcripts command: /transcripts
- Context management skill: context-management
- Philosophy: ~/.amplihack/.claude/context/PHILOSOPHY.md
Troubleshooting
No trace files found
Symptom: "No trace files in .claude-trace/"
Causes and fixes:
- claude-trace not enabled: Set AMPLIHACK_USE_TRACE=1 before starting session
- Wrong directory: Check you're in project root with .claude-trace/ directory
- Fresh project: Run a session with tracing enabled first
Incomplete metrics
Symptom: Missing token counts or zero values
Causes and fixes:
- Interrupted session: Trace may be incomplete if session crashed
- Streaming responses: Some streaming modes don't capture full metrics
- Older trace format: Upgrade claude-trace to latest version
Health score seems wrong
Symptom: Score doesn't match session experience
Understanding the score:
- 90-100: Excellent - low errors, good efficiency
- 70-89: Good - minor issues detected
- 50-69: Fair - significant issues worth investigating
- Below 50: Poor - likely errors or inefficiencies
Factors in health score:
- Error rate (40% weight)
- Token efficiency ratio (30% weight)
- Request success rate (20% weight)
- Tool success rate (10% weight)
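The weights above can be combined as a simple weighted sum. The document does not say how each factor is normalized, so the sketch below assumes every input is already a fraction between 0 and 1 and that the error-rate factor contributes `1 - error_rate`. Both function names are hypothetical, and the sketch does not attempt to reproduce the sample report's score of 82.

```python
def health_score(
    error_rate: float,
    token_efficiency: float,
    request_success_rate: float,
    tool_success_rate: float,
) -> float:
    """Weighted 0-100 score; every argument is assumed to be a fraction in [0, 1]."""

    def clamp(x: float) -> float:
        return max(0.0, min(1.0, x))

    score = (
        0.40 * (1.0 - clamp(error_rate))  # fewer errors is better
        + 0.30 * clamp(token_efficiency)
        + 0.20 * clamp(request_success_rate)
        + 0.10 * clamp(tool_success_rate)
    )
    return round(score * 100, 1)


def health_label(score: float) -> str:
    """Map a score to the bands listed under 'Understanding the score'."""
    if score >= 90:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Fair"
    return "Poor"
```

Under these assumptions, a session with no errors and perfect success rates but a low token efficiency ratio still loses up to 30 points, which may explain a score that feels lower than the session experience.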
Large trace files
Symptom: Analysis is slow or memory-intensive
Solutions:
- Analyze specific time range instead of full file
- Use tools action for targeted analysis
- Archive old traces: mv .claude-trace/old-*.jsonl .claude-trace/archive/
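For the first option, a generator can stream entries one line at a time and filter by time range, so the whole file never has to sit in memory. This is a sketch under assumptions: `iter_trace_entries` is a hypothetical helper and timestamps are assumed to be numeric epoch seconds. Entries without a numeric timestamp are passed through unfiltered.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator, Optional


def iter_trace_entries(
    path: Path,
    start: Optional[float] = None,
    end: Optional[float] = None,
) -> Iterator[Dict[str, Any]]:
    """Yield entries one at a time, limited to start <= timestamp <= end when given."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            try:
                entry = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines
            if not isinstance(entry, dict):
                continue  # ignore JSON values that are not objects
            ts = entry.get("timestamp")
            if isinstance(ts, (int, float)):
                if start is not None and ts < start:
                    continue
                if end is not None and ts > end:
                    continue
            yield entry
```

Because the function is a generator, metrics can be accumulated in a single pass, for example `sum(1 for _ in iter_trace_entries(path, start=t0, end=t1))` to count requests in a window.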
Remember
This skill provides session-level debugging and optimization insights. It complements transcript management with API-level visibility. Use it to diagnose issues, optimize workflows, and understand Claude Code behavior patterns.
Key Takeaway: Trace files contain the raw truth about session performance. This skill extracts actionable insights from that data.