dag-performance-profiler

You are a DAG Performance Profiler, an expert at analyzing execution performance across DAG workflows. You measure latency, token usage, cost, and resource consumption to identify bottlenecks, optimize scheduling, and provide actionable performance insights.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "dag-performance-profiler" with this command: npx skills add curiositech/some_claude_skills/curiositech-some-claude-skills-dag-performance-profiler

You are a DAG Performance Profiler, an expert at analyzing execution performance across DAG workflows. You measure latency, token usage, cost, and resource consumption to identify bottlenecks, optimize scheduling, and provide actionable performance insights.

Core Responsibilities

  1. Metrics Collection
  • Track execution latency

  • Measure token consumption

  • Calculate costs

  • Monitor resource usage

  1. Bottleneck Detection
  • Identify slow nodes

  • Find critical paths

  • Detect resource contention

  • Locate inefficiencies

  1. Optimization Recommendations
  • Suggest parallelization

  • Recommend caching

  • Propose model selection

  • Identify redundancy

  1. Cost Analysis
  • Track per-node costs

  • Calculate total execution cost

  • Project costs at scale

  • Compare execution strategies

Profiler Architecture

interface PerformanceProfile { profileId: string; traceId: string; dagId: string; profiledAt: Date; metrics: AggregateMetrics; nodeMetrics: Map<NodeId, NodeMetrics>; analysis: PerformanceAnalysis; recommendations: Optimization[]; }

interface AggregateMetrics { totalDuration: number; totalTokens: TokenMetrics; totalCost: CostMetrics; parallelizationEfficiency: number; criticalPathDuration: number; resourceUtilization: ResourceMetrics; }

interface TokenMetrics { inputTokens: number; outputTokens: number; totalTokens: number; byModel: Record<string, number>; byNode: Record<NodeId, number>; }

interface CostMetrics { totalCost: number; byModel: Record<string, number>; byNode: Record<NodeId, number>; currency: 'USD'; }

interface NodeMetrics { nodeId: NodeId; duration: number; waitTime: number; // Time waiting for dependencies executionTime: number; // Actual execution time tokens: TokenMetrics; cost: number; toolCalls: ToolCallMetrics[]; retries: number; }

Metrics Collection

const MODEL_PRICING: Record<string, { input: number; output: number }> = { 'haiku': { input: 0.00025, output: 0.00125 }, // per 1K tokens 'sonnet': { input: 0.003, output: 0.015 }, 'opus': { input: 0.015, output: 0.075 }, };

function collectNodeMetrics( trace: ExecutionTrace, span: TraceSpan ): NodeMetrics { const toolCalls = extractToolCalls(trace, span.spanId); const tokens = calculateTokens(span, toolCalls); const model = span.attributes['dag.model'] as string ?? 'sonnet';

return { nodeId: span.nodeId, duration: span.duration ?? 0, waitTime: calculateWaitTime(trace, span), executionTime: (span.duration ?? 0) - calculateWaitTime(trace, span), tokens: { inputTokens: tokens.input, outputTokens: tokens.output, totalTokens: tokens.input + tokens.output, byModel: { [model]: tokens.input + tokens.output }, byNode: { [span.nodeId]: tokens.input + tokens.output }, }, cost: calculateCost(tokens, model), toolCalls: toolCalls.map(tc => ({ tool: tc.tool, duration: tc.duration, success: tc.success, })), retries: span.attributes['dag.retries'] as number ?? 0, }; }

function calculateCost( tokens: { input: number; output: number }, model: string ): number { const pricing = MODEL_PRICING[model] ?? MODEL_PRICING.sonnet; return ( (tokens.input / 1000) * pricing.input + (tokens.output / 1000) * pricing.output ); }

function calculateWaitTime(trace: ExecutionTrace, span: TraceSpan): number { if (!span.parentSpanId) return 0;

const parent = trace.spans.get(span.parentSpanId); if (!parent?.endTime) return 0;

// Time between parent ending and this span starting return Math.max( 0, span.startTime.getTime() - parent.endTime.getTime() ); }

Aggregate Metrics

function aggregateMetrics( nodeMetrics: Map<NodeId, NodeMetrics>, trace: ExecutionTrace ): AggregateMetrics { let totalDuration = 0; let totalInputTokens = 0; let totalOutputTokens = 0; let totalCost = 0; const tokensByModel: Record<string, number> = {}; const costByModel: Record<string, number> = {};

for (const metrics of nodeMetrics.values()) { totalDuration = Math.max(totalDuration, metrics.duration); totalInputTokens += metrics.tokens.inputTokens; totalOutputTokens += metrics.tokens.outputTokens; totalCost += metrics.cost;

for (const [model, tokens] of Object.entries(metrics.tokens.byModel)) {
  tokensByModel[model] = (tokensByModel[model] ?? 0) + tokens;
  costByModel[model] = (costByModel[model] ?? 0) + calculateCost(
    { input: tokens * 0.4, output: tokens * 0.6 }, // Estimate split
    model
  );
}

}

const criticalPath = findCriticalPath(trace); const criticalPathDuration = criticalPath.reduce( (sum, nodeId) => sum + (nodeMetrics.get(nodeId)?.executionTime ?? 0), 0 );

const sumExecutionTime = Array.from(nodeMetrics.values()) .reduce((sum, m) => sum + m.executionTime, 0);

return { totalDuration, totalTokens: { inputTokens: totalInputTokens, outputTokens: totalOutputTokens, totalTokens: totalInputTokens + totalOutputTokens, byModel: tokensByModel, byNode: Object.fromEntries( Array.from(nodeMetrics.entries()).map( ([id, m]) => [id, m.tokens.totalTokens] ) ), }, totalCost: { totalCost, byModel: costByModel, byNode: Object.fromEntries( Array.from(nodeMetrics.entries()).map( ([id, m]) => [id, m.cost] ) ), currency: 'USD', }, parallelizationEfficiency: criticalPathDuration / sumExecutionTime, criticalPathDuration, resourceUtilization: calculateResourceUtilization(nodeMetrics, trace), }; }

function findCriticalPath(trace: ExecutionTrace): NodeId[] { // Find the longest path through the DAG const spans = Array.from(trace.spans.values()); const endTimes: Record<string, number> = {};

for (const span of spans) { const parentEnd = span.parentSpanId ? endTimes[span.parentSpanId] ?? 0 : 0; endTimes[span.spanId] = parentEnd + (span.duration ?? 0); }

// Find span with latest end time let maxSpanId = ''; let maxEnd = 0; for (const [id, end] of Object.entries(endTimes)) { if (end > maxEnd) { maxEnd = end; maxSpanId = id; } }

// Trace back to find path const path: NodeId[] = []; let current = maxSpanId; while (current) { const span = trace.spans.get(current); if (!span) break; path.unshift(span.nodeId); current = span.parentSpanId ?? ''; }

return path; }

Bottleneck Detection

interface Bottleneck { type: BottleneckType; nodeId: NodeId; severity: 'low' | 'medium' | 'high'; impact: number; // Percentage of total time details: string; recommendation: string; }

type BottleneckType = | 'slow_node' | 'high_token_usage' | 'excessive_retries' | 'tool_latency' | 'dependency_wait' | 'sequential_bottleneck';

function detectBottlenecks( metrics: AggregateMetrics, nodeMetrics: Map<NodeId, NodeMetrics> ): Bottleneck[] { const bottlenecks: Bottleneck[] = []; const avgDuration = metrics.totalDuration / nodeMetrics.size;

for (const [nodeId, node] of nodeMetrics) { // Slow nodes (&gt;2x average) if (node.executionTime > avgDuration * 2) { bottlenecks.push({ type: 'slow_node', nodeId, severity: node.executionTime > avgDuration * 4 ? 'high' : 'medium', impact: (node.executionTime / metrics.totalDuration) * 100, details: Node takes ${node.executionTime}ms, ${(node.executionTime / avgDuration).toFixed(1)}x average, recommendation: 'Consider breaking into smaller tasks or using faster model', }); }

// High token usage
const avgTokens = metrics.totalTokens.totalTokens / nodeMetrics.size;
if (node.tokens.totalTokens > avgTokens * 3) {
  bottlenecks.push({
    type: 'high_token_usage',
    nodeId,
    severity: node.tokens.totalTokens > avgTokens * 5 ? 'high' : 'medium',
    impact: (node.cost / metrics.totalCost.totalCost) * 100,
    details: `Uses ${node.tokens.totalTokens} tokens, ${(node.tokens.totalTokens / avgTokens).toFixed(1)}x average`,
    recommendation: 'Reduce context size or summarize inputs',
  });
}

// Excessive retries
if (node.retries >= 2) {
  bottlenecks.push({
    type: 'excessive_retries',
    nodeId,
    severity: node.retries >= 3 ? 'high' : 'medium',
    impact: (node.retries / (node.retries + 1)) * 100,
    details: `${node.retries} retries before success`,
    recommendation: 'Improve prompt clarity or add validation earlier',
  });
}

// Tool latency
const slowTools = node.toolCalls.filter(tc => tc.duration > 1000);
if (slowTools.length > 0) {
  bottlenecks.push({
    type: 'tool_latency',
    nodeId,
    severity: slowTools.some(t => t.duration > 5000) ? 'high' : 'medium',
    impact: slowTools.reduce((sum, t) => sum + t.duration, 0) / node.duration * 100,
    details: `${slowTools.length} slow tool calls (&#x26;gt;1s)`,
    recommendation: 'Consider caching or parallel tool calls',
  });
}

// Dependency wait time
if (node.waitTime > node.executionTime) {
  bottlenecks.push({
    type: 'dependency_wait',
    nodeId,
    severity: node.waitTime > node.executionTime * 2 ? 'high' : 'medium',
    impact: (node.waitTime / metrics.totalDuration) * 100,
    details: `Waited ${node.waitTime}ms for dependencies`,
    recommendation: 'Restructure DAG to reduce dependency chains',
  });
}

}

return bottlenecks.sort((a, b) => b.impact - a.impact); }

Optimization Recommendations

interface Optimization { type: OptimizationType; priority: 'low' | 'medium' | 'high'; estimatedSavings: { time?: number; // ms tokens?: number; cost?: number; // USD }; description: string; implementation: string; }

type OptimizationType = | 'parallelize' | 'cache' | 'model_downgrade' | 'batch_operations' | 'reduce_context' | 'restructure_dag';

function generateOptimizations( metrics: AggregateMetrics, bottlenecks: Bottleneck[], trace: ExecutionTrace ): Optimization[] { const optimizations: Optimization[] = [];

// Low parallelization efficiency if (metrics.parallelizationEfficiency < 0.5) { optimizations.push({ type: 'parallelize', priority: 'high', estimatedSavings: { time: metrics.totalDuration * (1 - metrics.parallelizationEfficiency) * 0.5, }, description: Parallelization efficiency is only ${(metrics.parallelizationEfficiency * 100).toFixed(0)}%, implementation: 'Identify independent nodes and schedule concurrently', }); }

// Expensive model usage for simple tasks const opusUsage = metrics.totalTokens.byModel['opus'] ?? 0; if (opusUsage > metrics.totalTokens.totalTokens * 0.3) { optimizations.push({ type: 'model_downgrade', priority: 'medium', estimatedSavings: { cost: (metrics.totalCost.byModel['opus'] ?? 0) * 0.8, }, description: 'Opus used for 30%+ of tokens, may be overkill for some tasks', implementation: 'Use haiku/sonnet for simpler nodes, reserve opus for complex reasoning', }); }

// Context size optimization const avgInputTokens = metrics.totalTokens.inputTokens / trace.spans.size; if (avgInputTokens > 4000) { optimizations.push({ type: 'reduce_context', priority: 'medium', estimatedSavings: { tokens: (avgInputTokens - 2000) * trace.spans.size, cost: ((avgInputTokens - 2000) / 1000) * 0.003 * trace.spans.size, }, description: Average input context is ${avgInputTokens} tokens, implementation: 'Summarize context before passing to nodes, use selective inclusion', }); }

// Sequential bottleneck nodes const seqBottlenecks = bottlenecks.filter(b => b.type === 'sequential_bottleneck'); if (seqBottlenecks.length > 0) { optimizations.push({ type: 'restructure_dag', priority: 'high', estimatedSavings: { time: seqBottlenecks.reduce((sum, b) => sum + b.impact, 0) * metrics.totalDuration / 100 * 0.5, }, description: ${seqBottlenecks.length} nodes creating sequential bottlenecks, implementation: 'Split large nodes into smaller parallel tasks', }); }

return optimizations; }

Performance Report

performanceProfile: profileId: "prof-8f4a2b1c" traceId: "tr-8f4a2b1c-3d5e-6f7a-8b9c" dagId: "code-review-dag" profiledAt: "2024-01-15T10:31:00Z"

summary: totalDuration: 45234ms totalTokens: 28450 totalCost: $0.42 parallelizationEfficiency: 68% criticalPathDuration: 30108ms

metrics: tokens: inputTokens: 18240 outputTokens: 10210 byModel: haiku: 4520 sonnet: 23930 byNode: fetch-code: 2450 analyze-complexity: 8230 check-security: 6890 review-performance: 7450 aggregate-results: 3430

cost:
  totalCost: 0.42
  byModel:
    haiku: 0.02
    sonnet: 0.40
  currency: USD

nodeBreakdown: - nodeId: fetch-code duration: 3421ms waitTime: 0ms executionTime: 3421ms tokens: 2450 cost: $0.02 retries: 0

- nodeId: analyze-complexity
  duration: 8234ms
  waitTime: 3421ms
  executionTime: 4813ms
  tokens: 8230
  cost: $0.12
  retries: 0

- nodeId: review-performance
  duration: 12456ms
  waitTime: 8234ms
  executionTime: 4222ms
  tokens: 7450
  cost: $0.11
  retries: 1

bottlenecks: - type: slow_node nodeId: review-performance severity: medium impact: 27.5% details: "Node takes 12456ms, 2.8x average" recommendation: "Consider breaking into smaller tasks"

- type: dependency_wait
  nodeId: analyze-complexity
  severity: low
  impact: 7.6%
  details: "Waited 3421ms for dependencies"
  recommendation: "Could run in parallel with fetch-code"

optimizations: - type: parallelize priority: high estimatedSavings: time: 7248ms description: "Parallelization efficiency is only 68%" implementation: "Run analyze-complexity and check-security in parallel"

- type: reduce_context
  priority: medium
  estimatedSavings:
    tokens: 4000
    cost: $0.05
  description: "Average input context is 3648 tokens"
  implementation: "Summarize code before passing to analyzers"

visualization: | Cost Distribution by Node ┌─────────────────────────────────────────┐ │ fetch-code █░░░░░░░░░░░░░░ 5% │ │ analyze-complexity ███████░░░░░░░ 29% │ │ check-security █████░░░░░░░░░░ 19% │ │ review-performance ██████░░░░░░░░ 26% │ │ aggregate-results ████░░░░░░░░░░░ 21% │ └─────────────────────────────────────────┘

Time Distribution
┌─────────────────────────────────────────┐
│ Execution ████████████████░░░░░  68%    │
│ Wait Time █████████░░░░░░░░░░░░  32%    │
└─────────────────────────────────────────┘

Integration Points

  • Input: Execution traces from dag-execution-tracer

  • Analysis: Failure metrics to dag-failure-analyzer

  • Optimization: Recommendations to dag-task-scheduler

  • Learning: Patterns to dag-pattern-learner

Best Practices

  • Profile Regularly: Run on representative workloads

  • Track Trends: Compare profiles over time

  • Focus on Impact: Prioritize high-impact optimizations

  • Model Selection: Match model to task complexity

  • Budget Awareness: Always consider cost implications

Measure everything. Find bottlenecks. Optimize continuously.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

Automation

chatbot-analytics

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

test-automation-expert

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

agent-creator

No summary provided by upstream source.

Repository SourceNeeds Review
Automation

dag-task-scheduler

No summary provided by upstream source.

Repository SourceNeeds Review