
Session Replay Skill

Purpose

This skill analyzes claude-trace JSONL files to provide insights into Claude Code session health, token usage patterns, error frequencies, and agent effectiveness. It complements the /transcripts command by focusing on API-level trace data rather than conversation transcripts.

When to Use This Skill

  • Session debugging: Diagnose why a session was slow or failed

  • Token analysis: Understand token consumption patterns

  • Error patterns: Identify recurring failures across sessions

  • Performance optimization: Find bottlenecks in tool usage

  • Agent effectiveness: Measure which agents/tools are most productive

Quick Start

Analyze Latest Session

User: Analyze my latest session health

I'll analyze the most recent trace file:

```python
# Read latest trace file from .claude-trace/
trace_dir = Path(".claude-trace")
trace_files = sorted(trace_dir.glob("*.jsonl"), key=lambda f: f.stat().st_mtime)
latest = trace_files[-1] if trace_files else None

# Parse and analyze
if latest:
    analysis = analyze_trace_file(latest)
    print(format_session_report(analysis))
```

Compare Multiple Sessions

User: Compare token usage across my last 5 sessions

I'll aggregate metrics across sessions:

```python
trace_files = sorted(Path(".claude-trace").glob("*.jsonl"))[-5:]
comparison = compare_sessions(trace_files)
print(format_comparison_table(comparison))
```

Actions

Action: health

Analyze session health metrics from a trace file.

What to do:

  • Read the trace file (JSONL format)

  • Extract API requests and responses

  • Calculate metrics:

  • Total tokens (input/output)

  • Request count and timing

  • Error rate

  • Tool usage distribution

  • Generate health report

Metrics to extract:

From each JSONL line containing a request/response pair:

```json
{
  "timestamp": "...",
  "request": {
    "method": "POST",
    "url": "https://api.anthropic.com/v1/messages",
    "body": {
      "model": "claude-...",
      "messages": [...],
      "tools": [...]
    }
  },
  "response": {
    "usage": {"input_tokens": N, "output_tokens": N},
    "content": [...],
    "stop_reason": "..."
  }
}
```

Output format:

Session Health Report

File: log-2025-11-23-19-32-36.jsonl
Duration: 45 minutes

Token Usage:

  • Input: 125,432 tokens
  • Output: 34,521 tokens
  • Total: 159,953 tokens
  • Efficiency: 27.5% output/input ratio

Request Stats:

  • Total requests: 23
  • Average latency: 2.3s
  • Errors: 2 (8.7%)

Tool Usage:

  • Read: 45 calls
  • Edit: 12 calls
  • Bash: 8 calls
  • Grep: 15 calls

Health Score: 82/100 (Good)

  • Minor issue: 2 errors detected

Action: errors

Identify error patterns across sessions.

What to do:

  • Scan trace files for error responses

  • Categorize errors by type

  • Identify recurring patterns

  • Suggest fixes

Error categories to detect:

  • Rate limit errors (429)

  • Token limit exceeded

  • Tool execution failures

  • Timeout errors

  • API errors
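The categorization step can be sketched as a small classifier over each error response. The field names (`error`, `message`, `status_code`) are assumptions based on the trace shape shown earlier, not a fixed schema:

```python
from typing import Optional

def categorize_error(response: dict) -> Optional[str]:
    """Map an error response to one of the categories above (hypothetical shape)."""
    error = response.get("error")
    if not error:
        return None
    message = str(error.get("message", "")) if isinstance(error, dict) else str(error)
    message = message.lower()
    if response.get("status_code") == 429 or "rate limit" in message:
        return "rate_limit"
    if "token" in message and ("limit" in message or "too long" in message):
        return "token_limit"
    if "timeout" in message or "timed out" in message:
        return "timeout"
    if "tool" in message:
        return "tool_failure"
    return "api_error"
```

Counting occurrences per category across sessions then reduces to a `collections.Counter` over the non-`None` results.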

Output format:

Error Analysis

Sessions analyzed: 5
Total errors: 12

Error Categories:

  1. Rate limit (429): 5 occurrences

    • Recommendation: Add delays between requests
  2. Token limit: 3 occurrences

    • Recommendation: Use context management skill
  3. Tool failures: 4 occurrences

    • Bash timeout: 2
    • File not found: 2
    • Recommendation: Check paths before operations

Action: compare

Compare metrics across multiple sessions.

What to do:

  • Load multiple trace files

  • Extract comparable metrics

  • Calculate trends

  • Identify anomalies
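The trend column in the comparison table can be sketched as a percent change over each metric's series. The helper names below are illustrative, not part of the skill's API, and the stability threshold is an assumption:

```python
from typing import List, Optional

def percent_change(values: List[float]) -> Optional[float]:
    """Percent change from the first to the last session, as a whole percent."""
    if len(values) < 2 or values[0] == 0:
        return None
    return round((values[-1] - values[0]) / values[0] * 100)

def trend_label(values: List[float], threshold: float = 5.0) -> str:
    """Label a metric series as stable or as a signed percent change."""
    change = percent_change(values)
    if change is None or abs(change) < threshold:
        return "stable"
    return f"{change:+d}%"
```

A trend within the threshold (here, under 5%) is reported as "stable" rather than as noise.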

Output format:

Session Comparison

| Metric         | Session 1 | Session 2 | Session 3 | Trend  |
|----------------|-----------|-----------|-----------|--------|
| Tokens (total) | 150K      | 180K      | 120K      | -17%   |
| Requests       | 25        | 30        | 18        | -28%   |
| Errors         | 2         | 0         | 1         | stable |
| Duration (min) | 45        | 60        | 30        | -33%   |
| Efficiency     | 0.27      | 0.32      | 0.35      | +7%    |

Action: tools

Analyze tool usage patterns.

What to do:

  • Extract tool calls from traces

  • Calculate frequency and timing

  • Identify inefficient patterns

  • Suggest optimizations

Patterns to detect:

  • Sequential calls that could be parallel

  • Repeated reads of same file

  • Excessive grep/glob calls

  • Unused tool results
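Detecting the "repeated reads of the same file" pattern can be sketched as a scan over (tool, file, timestamp) tuples. The call-record shape here is an assumption; the actual records would come from parsed trace entries:

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def find_repeated_reads(
    calls: List[Tuple[str, str, float]], window_seconds: float = 120.0
) -> Dict[str, int]:
    """Count Read calls that re-read a file seen within the time window.

    Each call is (tool_name, file_path, unix_timestamp).
    """
    last_read: Dict[str, float] = {}
    repeats: Dict[str, int] = defaultdict(int)
    for tool, path, ts in sorted(calls, key=lambda c: c[2]):
        if tool != "Read":
            continue
        prev = last_read.get(path)
        if prev is not None and ts - prev <= window_seconds:
            repeats[path] += 1
        last_read[path] = ts
    return dict(repeats)
```

The same sliding-window idea extends to the other patterns (e.g. back-to-back Bash calls that never consume each other's output are candidates for parallelization).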

Output format:

Tool Usage Analysis

| Tool | Calls | Avg Time | Success Rate |
|------|-------|----------|--------------|
| Read | 45    | 0.1s     | 100%         |
| Edit | 12    | 0.3s     | 92%          |
| Bash | 8     | 1.2s     | 75%          |
| Grep | 15    | 0.2s     | 100%         |
| Task | 3     | 45s      | 100%         |

Optimization Opportunities:

  1. 5 Read calls to same file within 2 minutes

    • Consider caching strategy
  2. 3 sequential Bash calls could be parallelized

    • Use multiple Bash calls in single message

Implementation Notes

Parsing JSONL Traces

Claude-trace files are JSONL format with request/response pairs:

```python
import json
from pathlib import Path
from typing import Any, Dict, List

def parse_trace_file(path: Path) -> List[Dict[str, Any]]:
    """Parse a claude-trace JSONL file."""
    entries = []
    with open(path) as f:
        for line in f:
            if line.strip():
                try:
                    entry = json.loads(line)
                    entries.append(entry)
                except json.JSONDecodeError:
                    continue
    return entries

def extract_metrics(entries: List[Dict]) -> Dict[str, Any]:
    """Extract session metrics from trace entries."""
    metrics = {
        "total_input_tokens": 0,
        "total_output_tokens": 0,
        "request_count": 0,
        "error_count": 0,
        "tool_usage": {},
        "timestamps": [],
    }

    for entry in entries:
        if "request" in entry:
            metrics["request_count"] += 1
            metrics["timestamps"].append(entry.get("timestamp", 0))

        if "response" in entry:
            usage = entry["response"].get("usage", {})
            metrics["total_input_tokens"] += usage.get("input_tokens", 0)
            metrics["total_output_tokens"] += usage.get("output_tokens", 0)

            # Check for errors
            if entry["response"].get("error"):
                metrics["error_count"] += 1

            # Count tool calls from tool_use blocks in the response content.
            # (Counting the request body's "tools" list would tally tool
            # *definitions*, which repeat on every request, not actual calls.)
            for block in entry["response"].get("content", []):
                if isinstance(block, dict) and block.get("type") == "tool_use":
                    name = block.get("name", "unknown")
                    metrics["tool_usage"][name] = metrics["tool_usage"].get(name, 0) + 1

    return metrics
```
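Building on the metrics dict produced by `extract_metrics`, the report's derived figures can be computed like this. A sketch, assuming timestamps are unix seconds; note the health report above expresses "efficiency" as output tokens over input tokens (34,521 / 125,432 ≈ 27.5%):

```python
from typing import Any, Dict

def summarize(metrics: Dict[str, Any]) -> Dict[str, float]:
    """Derive duration and output ratio from an extract_metrics-style dict (sketch)."""
    # Keep only numeric, non-zero timestamps; older traces may store strings.
    timestamps = [t for t in metrics.get("timestamps", []) if isinstance(t, (int, float)) and t]
    duration_min = (max(timestamps) - min(timestamps)) / 60 if len(timestamps) >= 2 else 0.0
    inputs = metrics.get("total_input_tokens", 0)
    ratio = metrics.get("total_output_tokens", 0) / inputs if inputs else 0.0
    return {"duration_minutes": round(duration_min, 1), "output_ratio": round(ratio, 3)}
```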

Locating Trace Files

```python
def find_trace_files(trace_dir: str = ".claude-trace") -> List[Path]:
    """Find all trace files, sorted by modification time."""
    trace_path = Path(trace_dir)
    if not trace_path.exists():
        return []
    return sorted(
        trace_path.glob("*.jsonl"),
        key=lambda f: f.stat().st_mtime,
        reverse=True,  # Most recent first
    )
```

Error Handling

Handle common error scenarios gracefully:

```python
import json
from pathlib import Path
from typing import Dict, List, Tuple

def safe_parse_trace_file(path: Path) -> Tuple[List[Dict], List[str]]:
    """Parse trace file with error collection for malformed lines.

    Returns:
        Tuple of (valid_entries, error_messages)
    """
    entries = []
    errors = []

    if not path.exists():
        return [], [f"Trace file not found: {path}"]

    try:
        with open(path) as f:
            for line_num, line in enumerate(f, 1):
                if not line.strip():
                    continue
                try:
                    entry = json.loads(line)
                    entries.append(entry)
                except json.JSONDecodeError as e:
                    errors.append(f"Line {line_num}: Invalid JSON - {e}")
    except PermissionError:
        return [], [f"Permission denied: {path}"]
    except UnicodeDecodeError:
        return [], [f"Encoding error: {path} (expected UTF-8)"]

    return entries, errors

def format_error_report(errors: List[str], path: Path) -> str:
    """Format error report for user display."""
    if not errors:
        return ""

    report = f"""Trace File Issues

File: {path.name}
Issues found: {len(errors)}

"""
    for error in errors[:10]:  # Limit to first 10
        report += f"- {error}\n"

    if len(errors) > 10:
        report += f"\n... and {len(errors) - 10} more issues"

    return report
```

Common error scenarios:

| Scenario          | Cause                                | Handling                            |
|-------------------|--------------------------------------|-------------------------------------|
| Empty file        | Session had no API calls             | Report "No data to analyze"         |
| Malformed JSON    | Corrupted trace or interrupted write | Skip line, count in error report    |
| Missing fields    | Older trace format                   | Use `.get()` with defaults          |
| Permission denied | File locked by another process       | Clear error message, suggest retry  |
| Encoding error    | Non-UTF-8 characters                 | Report encoding issue               |

Integration with Existing Tools

Tool Selection Matrix

| Need                               | Use This                            | Why                           |
|------------------------------------|-------------------------------------|-------------------------------|
| "Why was my session slow?"         | session-replay                      | API latency and token metrics |
| "What did I discuss last session?" | /transcripts                        | Conversation content          |
| "Extract learnings from sessions"  | CodexTranscriptsBuilder             | Knowledge extraction          |
| "Reduce my token usage"            | session-replay + context_management | Metrics + optimization        |
| "Resume interrupted work"          | /transcripts                        | Context restoration           |

vs. /transcripts Command

/transcripts (conversation management):

  • Focuses on conversation content

  • Restores session context

  • Used for context preservation

  • Trigger: "restore session", "continue work", "what was I doing"

session-replay skill (API-level analysis):

  • Focuses on API metrics

  • Analyzes performance and errors

  • Used for debugging and optimization

  • Trigger: "session health", "token usage", "why slow", "debug session"

vs. CodexTranscriptsBuilder

CodexTranscriptsBuilder (knowledge extraction):

  • Extracts patterns from conversations

  • Builds learning corpus

  • Knowledge-focused

  • Trigger: "extract patterns", "build knowledge base", "learn from sessions"

session-replay skill (metrics analysis):

  • Extracts performance metrics

  • Identifies technical issues

  • Operations-focused

  • Trigger: "performance metrics", "error patterns", "tool efficiency"

Combined Workflows

Workflow 1: Diagnose and Fix Token Issues

  1. session-replay: Analyze token usage patterns (health action)
  2. Identify high-token operations
  3. context_management skill: Apply proactive trimming
  4. session-replay: Compare before/after sessions (compare action)

Workflow 2: Post-Incident Analysis

  1. session-replay: Identify error patterns (errors action)
  2. /transcripts: Review conversation context around errors
  3. session-replay: Check tool usage around failures (tools action)
  4. Document findings in DISCOVERIES.md

Workflow 3: Performance Baseline

  1. session-replay: Analyze 5-10 recent sessions (compare action)
  2. Establish baseline metrics (tokens, latency, errors)
  3. Track deviations from baseline over time

Storage Locations

  • Trace files: .claude-trace/*.jsonl

  • Session logs: ~/.amplihack/.claude/runtime/logs/<session_id>/

  • Generated reports: Output directly (no persistent storage needed)

Philosophy Alignment

Ruthless Simplicity

  • Single-purpose: Analyze trace files only - no session management, no transcript editing

  • No external dependencies: Uses only Python standard library (json, pathlib, datetime)

  • Direct file parsing: No ORM, no database, no complex abstractions

  • Present-moment focus: Analyzes what exists now, no future-proofing

Zero-BS Implementation

  • All functions work completely: Every code example in this skill runs without modification

  • Real parsing, real metrics: No mocked data, no placeholder calculations

  • No stubs or placeholders: If a feature is documented, it works

  • Fail fast on errors: Clear error messages, no silent failures

Brick Philosophy

  • Self-contained analysis: All functionality in this single skill

  • Clear inputs (trace files) and outputs (reports): No hidden state or side effects

  • Regeneratable from this specification: This SKILL.md is the complete source of truth

  • Isolated responsibility: Session analysis only - doesn't modify files or trigger actions

Limitations

This skill CANNOT:

  • Modify trace files: Read-only analysis, no editing or deletion

  • Generate traces: Use claude-trace npm package to create trace files

  • Restore sessions: Use /transcripts command for session restoration

  • Monitor in real time: Analyzes completed sessions only, not live tracking

  • Analyze across projects: Reads traces in the current project only

  • Parse non-JSONL formats: Only claude-trace JSONL format supported

  • Access remote traces: Local filesystem only, no cloud storage

Tips for Effective Analysis

  • Start with health check: Run health action first

  • Look for patterns: Use errors to find recurring issues

  • Optimize hot spots: Use tools to find inefficiencies

  • Track trends: Use compare across sessions

  • Combine with transcripts: Use /transcripts for context

Common Patterns

Pattern 1: Debug Slow Session

User: My last session was really slow, analyze it

  1. Run health action on latest trace
  2. Check request latencies
  3. Identify tool bottlenecks
  4. Report findings with recommendations

Pattern 2: Reduce Token Usage

User: I'm hitting token limits, help me understand usage

  1. Compare token usage across sessions
  2. Identify high-token operations
  3. Suggest context management strategies
  4. Recommend workflow optimizations

Pattern 3: Fix Recurring Errors

User: I keep getting errors, find the pattern

  1. Run errors action across last 10 sessions
  2. Categorize and count error types
  3. Identify root causes
  4. Provide targeted fixes

Resources

  • Trace directory: .claude-trace/

  • Transcripts command: /transcripts

  • Context management skill: context-management

  • Philosophy: ~/.amplihack/.claude/context/PHILOSOPHY.md

Troubleshooting

No trace files found

Symptom: "No trace files in .claude-trace/"

Causes and fixes:

  • claude-trace not enabled: Set AMPLIHACK_USE_TRACE=1 before starting session

  • Wrong directory: Check you're in project root with .claude-trace/ directory

  • Fresh project: Run a session with tracing enabled first

Incomplete metrics

Symptom: Missing token counts or zero values

Causes and fixes:

  • Interrupted session: Trace may be incomplete if session crashed

  • Streaming responses: Some streaming modes don't capture full metrics

  • Older trace format: Upgrade claude-trace to latest version

Health score seems wrong

Symptom: Score doesn't match session experience

Understanding the score:

  • 90-100: Excellent - low errors, good efficiency

  • 70-89: Good - minor issues detected

  • 50-69: Fair - significant issues worth investigating

  • Below 50: Poor - likely errors or inefficiencies

Factors in health score:

  • Error rate (40% weight)

  • Token efficiency ratio (30% weight)

  • Request success rate (20% weight)

  • Tool success rate (10% weight)
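Under the stated weights, the score can be sketched as a weighted sum. The exact formula (in particular, how the output ratio is normalized to a 0-1 term) is not specified by this skill, so the 30% full-marks cutoff below is an assumption for illustration:

```python
def health_score(
    error_rate: float,       # errors / requests, 0..1
    output_ratio: float,     # output tokens / input tokens
    request_success: float,  # successful requests / requests, 0..1
    tool_success: float,     # successful tool calls / tool calls, 0..1
) -> int:
    """Weighted health score on a 0-100 scale (illustrative formula)."""
    score = (
        0.40 * (1 - error_rate)
        + 0.30 * min(output_ratio / 0.30, 1.0)  # assumption: ~30% ratio = full marks
        + 0.20 * request_success
        + 0.10 * tool_success
    )
    return round(score * 100)
```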

Large trace files

Symptom: Analysis is slow or memory-intensive

Solutions:

  • Analyze specific time range instead of full file

  • Use tools action for targeted analysis

  • Archive old traces: mv .claude-trace/old-*.jsonl .claude-trace/archive/

Remember

This skill provides session-level debugging and optimization insights. It complements transcript management with API-level visibility. Use it to diagnose issues, optimize workflows, and understand Claude Code behavior patterns.

Key Takeaway: Trace files contain the raw truth about session performance. This skill extracts actionable insights from that data.
