dotnet-profiling

Diagnosing .NET performance issues. dotnet-counters, dotnet-trace, dotnet-dump, flame graphs.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "dotnet-profiling" with this command: npx skills add wshaddix/dotnet-skills/wshaddix-dotnet-skills-dotnet-profiling

dotnet-profiling

Diagnostic tool guidance for investigating .NET performance problems. Covers real-time metric monitoring with dotnet-counters, event tracing and flame graph generation with dotnet-trace, and memory dump capture and analysis with dotnet-dump. Focuses on interpreting profiling data (reading flame graphs, analyzing heap dumps, correlating GC metrics) rather than just invoking tools.

Version assumptions: .NET SDK 8.0+ baseline. The three diagnostic tools (dotnet-counters, dotnet-trace, dotnet-dump) are distributed as .NET global tools rather than bundled with the SDK -- install each once with "dotnet tool install --global dotnet-counters" (and likewise for dotnet-trace and dotnet-dump).

Out of scope: OpenTelemetry metrics collection and distributed tracing setup -- see [skill:dotnet-observability]. Microbenchmarking setup (BenchmarkDotNet) is owned by this epic's companion skill -- see [skill:dotnet-benchmarkdotnet]. Performance architecture patterns (Span<T>, ArrayPool, sealed devirtualization) are owned by this epic's companion skill -- see [skill:dotnet-performance-patterns]. Continuous benchmark regression detection in CI -- see [skill:dotnet-ci-benchmarking]. Architecture patterns (caching, resilience) -- see [skill:dotnet-architecture-patterns].

Cross-references: [skill:dotnet-observability] for GC/threadpool metrics interpretation and OpenTelemetry correlation, [skill:dotnet-benchmarkdotnet] for structured benchmarking after profiling identifies hot paths, [skill:dotnet-performance-patterns] for optimization patterns to apply based on profiling results.


dotnet-counters -- Real-Time Metric Monitoring

Overview

dotnet-counters provides real-time monitoring of .NET runtime metrics without modifying application code. Use it as a first-pass triage tool to identify whether a performance problem is CPU-bound, memory-bound, or I/O-bound before reaching for heavier instrumentation.

Monitoring Running Processes

# List running .NET processes
dotnet-counters ps

# Monitor default runtime counters for a process
dotnet-counters monitor --process-id <PID>

# Monitor with a specific refresh interval (seconds)
dotnet-counters monitor --process-id <PID> --refresh-interval 2

Key Built-In Counter Providers

| Provider | Counters | What It Tells You |
| --- | --- | --- |
| System.Runtime | CPU usage, GC heap size, Gen 0/1/2 collections, threadpool queue length, exception count | Overall runtime health |
| Microsoft.AspNetCore.Hosting | Request rate, request duration, active requests | HTTP request throughput and latency |
| Microsoft.AspNetCore.Http.Connections | Connection duration, current connections | WebSocket/SignalR connection load |
| System.Net.Http | Requests started/failed, active requests, connection pool size | Outbound HTTP client behavior |
| System.Net.Sockets | Bytes sent/received, datagrams, connections | Network I/O volume |

Monitoring Specific Providers

# Monitor runtime and ASP.NET counters together
dotnet-counters monitor --process-id <PID> \
  --counters System.Runtime,Microsoft.AspNetCore.Hosting

# Monitor only GC-related counters
dotnet-counters monitor --process-id <PID> \
  --counters System.Runtime[gc-heap-size,gen-0-gc-count,gen-1-gc-count,gen-2-gc-count]

Custom EventCounters

Applications can publish custom counters for domain-specific metrics:

using System.Diagnostics.Tracing;

[EventSource(Name = "MyApp.Orders")]
public sealed class OrderMetrics : EventSource
{
    public static readonly OrderMetrics Instance = new();

    private EventCounter? _orderProcessingTime;
    private IncrementingEventCounter? _ordersProcessed;

    private OrderMetrics()
    {
        _orderProcessingTime = new EventCounter("order-processing-time", this)
        {
            DisplayName = "Order Processing Time (ms)",
            DisplayUnits = "ms"
        };
        _ordersProcessed = new IncrementingEventCounter("orders-processed", this)
        {
            DisplayName = "Orders Processed",
            DisplayRateTimeScale = TimeSpan.FromSeconds(1)
        };
    }

    public void RecordProcessingTime(double milliseconds)
        => _orderProcessingTime?.WriteMetric(milliseconds);

    public void RecordOrderProcessed()
        => _ordersProcessed?.Increment();

    protected override void Dispose(bool disposing)
    {
        _orderProcessingTime?.Dispose();
        _ordersProcessed?.Dispose();
        base.Dispose(disposing);
    }
}

Monitor custom counters:

dotnet-counters monitor --process-id <PID> --counters MyApp.Orders

Interpreting Counter Data

Use counter values to direct further investigation. See [skill:dotnet-observability] for correlating these runtime metrics with OpenTelemetry traces:

| Symptom | Counter Evidence | Next Step |
| --- | --- | --- |
| High CPU usage | cpu-usage > 80%, threadpool-queue-length low | CPU profiling with dotnet-trace |
| Memory growth | gc-heap-size increasing, frequent Gen 2 GC | Memory dump with dotnet-dump |
| Thread starvation | threadpool-queue-length growing, threadpool-thread-count at max | Check for sync-over-async or blocking calls |
| Request latency | request-duration high, active-requests normal | Trace individual requests with dotnet-trace |
| GC pauses | High gen-2-gc-count, time-in-gc > 10% | Allocation profiling with dotnet-trace --profile gc-collect |
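
The triage mapping above can be sketched as a small rule-of-thumb helper. Everything in this sketch is illustrative: the counter names mirror the System.Runtime display names used in this section, and the numeric thresholds are assumptions for demonstration, not official guidance.

```python
# Illustrative triage helper mirroring the symptom table above.
# Counter names and thresholds are assumptions, not authoritative cutoffs.

def triage(counters: dict) -> str:
    """Suggest a next diagnostic step from a snapshot of counter values."""
    if counters.get("cpu-usage", 0) > 80 and counters.get("threadpool-queue-length", 0) < 10:
        return "CPU profiling with dotnet-trace --profile cpu-sampling"
    if counters.get("threadpool-queue-length", 0) > 100:
        return "Check for sync-over-async or blocking calls"
    if counters.get("time-in-gc", 0) > 10:
        return "Allocation profiling with dotnet-trace --profile gc-collect"
    if counters.get("gc-heap-size-delta", 0) > 0:
        return "Memory dump with dotnet-dump"
    return "No obvious bottleneck; widen the counter set"

print(triage({"cpu-usage": 92, "threadpool-queue-length": 2}))
```

A real triage loop would feed this from exported counter data rather than a hand-built dictionary.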

Exporting Counter Data

# Export to CSV for analysis
dotnet-counters collect --process-id <PID> \
  --format csv \
  --output counters.csv \
  --counters System.Runtime

# Export to JSON for programmatic consumption
dotnet-counters collect --process-id <PID> \
  --format json \
  --output counters.json
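
An exported JSON file can be post-processed with a short script. The event shape assumed below (an "Events" array of records with provider, name, counterType, and value fields) matches what recent dotnet-counters versions emit, but treat the field names as an assumption and inspect your own export before relying on them.

```python
# Sketch: summarize a dotnet-counters JSON export per counter.
# The schema (Events array with name/value fields) is assumed from
# recent dotnet-counters output; verify against your own file.
import json
from collections import defaultdict
from statistics import mean

# Inline stand-in for a real counters.json export.
sample = """{
  "TargetProcess": "myapp",
  "Events": [
    {"timestamp": "t0", "provider": "System.Runtime", "name": "CPU Usage (%)", "counterType": "Metric", "value": 12.5},
    {"timestamp": "t1", "provider": "System.Runtime", "name": "CPU Usage (%)", "counterType": "Metric", "value": 87.5},
    {"timestamp": "t2", "provider": "System.Runtime", "name": "GC Heap Size (MB)", "counterType": "Metric", "value": 140.0}
  ]
}"""

by_counter = defaultdict(list)
for event in json.loads(sample)["Events"]:
    by_counter[event["name"]].append(event["value"])

for name, values in by_counter.items():
    print(f"{name}: avg={mean(values):.1f} max={max(values):.1f} samples={len(values)}")
```

Replace the inline sample with `json.load(open("counters.json"))` to run it against a real export.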

dotnet-trace -- Event Tracing and Flame Graphs

Overview

dotnet-trace captures detailed event traces from a running .NET process. Traces can be analyzed as flame graphs to identify CPU hot paths, or configured for allocation tracking to find GC pressure sources.

CPU Sampling

CPU sampling records stack frames at a fixed interval to build a statistical profile of where the application spends time:

# Collect a CPU sampling trace (default profile)
dotnet-trace collect --process-id <PID> --duration 00:00:30

# Collect with the cpu-sampling profile (explicit)
dotnet-trace collect --process-id <PID> \
  --profile cpu-sampling \
  --output cpu-trace.nettrace

CPU Sampling vs Instrumentation

| Approach | Overhead | Best For | Tool |
| --- | --- | --- | --- |
| CPU sampling | Low (~2-5%) | Finding CPU hot paths in production | dotnet-trace --profile cpu-sampling |
| Instrumentation | High (10-50%+) | Exact call counts, method entry/exit timing | Rider/VS profiler, PerfView |

CPU sampling is safe for production use due to low overhead. Use it as the default approach. Reserve instrumentation for development environments where exact call counts matter.

Flame Graph Generation

Trace files (.nettrace) must be converted to a flame graph format for visual analysis:

Using Speedscope (browser-based, recommended):

# Convert to Speedscope format
dotnet-trace convert cpu-trace.nettrace --format Speedscope

# Produces cpu-trace.speedscope.json -- load it at https://www.speedscope.app/

Using PerfView (Windows, deep .NET integration): PerfView opens .nettrace files directly -- no conversion step is required.

Using the Chromium trace format:

# Convert to Chromium trace format (viewable in chrome://tracing)
dotnet-trace convert cpu-trace.nettrace --format Chromium

Reading Flame Graphs

Flame graphs display call stacks where:

  • Width of a frame represents the proportion of total sample time spent in that function (wider = more time)
  • Height represents call stack depth (taller stacks = deeper call chains)
  • Color is typically arbitrary (not meaningful) unless the tool uses a specific color scheme

Analysis workflow:

  1. Look for wide plateaus -- functions that consume a large proportion of samples
  2. Follow the widest frames upward to find which callers contribute the most time
  3. Identify unexpected width -- framework methods that should be fast appearing wide indicate misuse
  4. Compare before/after traces to validate optimizations reduced the width of target functions

Common patterns in .NET flame graphs:

| Pattern | Likely Cause | Investigation |
| --- | --- | --- |
| Wide System.Linq frames | LINQ-heavy hot path with delegate overhead | Replace with foreach loops or Span-based processing |
| Wide JIT_New / gc_heap::allocate | Excessive allocations triggering GC | Allocation profiling with --profile gc-collect |
| Wide Monitor.Enter / SpinLock | Lock contention | Review synchronization strategy |
| Wide System.Text.RegularExpressions | Regex backtracking | Use RegexOptions.NonBacktracking or compile the regex |
| Deep async state machine frames | Async overhead in tight loops | Consider a sync path for CPU-bound work |
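
The width-based reading described above can also be done programmatically. This sketch ranks frames by total time (width anywhere on the stack) and self time (on top of the stack) from a minimal inline document following the public Speedscope file format -- shared.frames plus samples/weights for a "sampled" profile. A real converted trace has the same shape, just far more frames.

```python
# Sketch: compute total vs self time per frame from a Speedscope-format
# profile (the format dotnet-trace convert --format Speedscope emits).
# The inline document is a toy stand-in for a real .speedscope.json file.
from collections import Counter

doc = {
    "shared": {"frames": [{"name": "Main"},
                          {"name": "ProcessOrders"},
                          {"name": "System.Linq.Enumerable.Where"}]},
    "profiles": [{
        "type": "sampled",
        # Each sample is a stack (root first) of indices into shared.frames.
        "samples": [[0, 1, 2], [0, 1, 2], [0, 1]],
        "weights": [1.0, 1.0, 1.0],  # time attributed to each sample
    }],
}

total = Counter()      # frame present anywhere on the stack -> "wide" frame
self_time = Counter()  # frame on top of the stack -> actually on CPU

profile = doc["profiles"][0]
frames = doc["shared"]["frames"]
for stack, weight in zip(profile["samples"], profile["weights"]):
    for idx in set(stack):
        total[frames[idx]["name"]] += weight
    self_time[frames[stack[-1]]["name"]] += weight

for name, t in total.most_common():
    print(f"{name}: total={t:.1f} self={self_time[name]:.1f}")
```

High total time with low self time points at a caller whose callees are expensive; high self time marks the leaf doing the work.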

Allocation Tracking with gc-collect Profile

The gc-collect profile captures allocation events to identify what code paths allocate the most memory:

# Collect allocation data
dotnet-trace collect --process-id <PID> \
  --profile gc-collect \
  --duration 00:00:30 \
  --output alloc-trace.nettrace

This produces a trace that shows:

  • Which methods allocate the most bytes
  • Which types are allocated most frequently
  • Allocation sizes and the call stacks that trigger them

Correlate allocation data with GC counter evidence from dotnet-counters. If gen-2-gc-count is high, the allocation trace shows which code paths produce long-lived objects that survive to Gen 2. See [skill:dotnet-performance-patterns] for zero-allocation patterns to apply once hot allocation sites are identified.

Custom Trace Providers

Target specific event providers for focused tracing:

# Trace specific providers with keywords and verbosity
dotnet-trace collect --process-id <PID> \
  --providers "Microsoft-Diagnostics-DiagnosticSource:::FilterAndPayloadSpecs=[AS]System.Net.Http"

# Trace EF Core queries (useful with [skill:dotnet-efcore-patterns])
dotnet-trace collect --process-id <PID> \
  --providers Microsoft.EntityFrameworkCore

# Trace ASP.NET Core hosting events
dotnet-trace collect --process-id <PID> \
  --providers Microsoft.AspNetCore.Hosting

Trace File Management

| Format | Extension | Viewer | Cross-Platform |
| --- | --- | --- | --- |
| NetTrace | .nettrace | PerfView, Visual Studio, dotnet-trace convert | Yes (capture); Windows (PerfView) |
| Speedscope | .speedscope.json | https://www.speedscope.app/ | Yes |
| Chromium | .chromium.json | Chrome DevTools (chrome://tracing) | Yes |

dotnet-dump -- Memory Dump Analysis

Overview

dotnet-dump captures and analyzes process memory dumps. Use it to investigate memory leaks, large object heap fragmentation, and object reference chains. Unlike dotnet-trace, dumps capture a point-in-time snapshot of the entire managed heap.

Capturing Dumps

# Capture a full heap dump
dotnet-dump collect --process-id <PID> --output app-dump.dmp

# Capture a minimal dump (faster, smaller, but less detail)
dotnet-dump collect --process-id <PID> --type Mini --output app-mini.dmp

When to capture:

  • Memory usage has grown beyond expected baseline (compare against dotnet-counters gc-heap-size)
  • Application is approaching OOM conditions
  • Suspected memory leak after load testing
  • Investigating finalizer queue backlog

Analyzing Dumps with SOS Commands

Open the dump in the interactive analyzer:

dotnet-dump analyze app-dump.dmp

!dumpheap -- Heap Object Summary

Lists objects on the managed heap grouped by type, sorted by total size:

> dumpheap -stat

Statistics:
              MT    Count    TotalSize Class Name
00007fff2c6a4320      125        4,000 System.String[]
00007fff2c6a1230    8,432      269,824 System.String
00007fff2c7b5640    2,100      504,000 MyApp.Models.OrderEntity
00007fff2c6a0988   15,230    1,218,400 System.Byte[]

Analysis approach:

  1. Look for unexpectedly high counts or sizes for application types
  2. Compare counts against expected cardinality (e.g., 2,100 OrderEntity objects -- is that expected for current load?)
  3. Large System.Byte[] counts often indicate unbounded buffering or stream handling issues

Filter by type:

> dumpheap -type MyApp.Models.OrderEntity
> dumpheap -type System.Byte[] -min 85000

The -min 85000 filter shows Large Object Heap entries (objects >= 85,000 bytes that cause Gen 2 GC pressure).

!gcroot -- Finding Object Retention

Traces the reference chain from a GC root to a specific object, explaining why it is not collected:

> gcroot 00007fff3c4a2100

HandleTable:
    00007fff3c010010 (strong handle)
        -> 00007fff3c3a1000 MyApp.Services.CacheService
            -> 00007fff3c3a1020 System.Collections.Generic.Dictionary`2
                -> 00007fff3c4a2100 MyApp.Models.OrderEntity

Found 1 unique root(s).

Common root types and their meaning:

| Root Type | Meaning | Likely Issue |
| --- | --- | --- |
| strong handle | Static field or GC handle | Static collection growing without eviction |
| pinned handle | Pinned for native interop | Buffer pinned longer than needed |
| async state machine | Captured in async closure | Long-running async operation holding references |
| finalizer queue | Waiting for finalizer thread | Finalizer backlog blocking collection |
| threadpool | Referenced from thread-local storage | Thread-static cache without cleanup |

!finalizequeue -- Finalizer Queue Analysis

Shows objects waiting for finalization, which delays their collection by at least one GC cycle:

> finalizequeue

SyncBlocks to be cleaned up: 0
Free-Threaded Interfaces to be released: 0
MTA Interfaces to be released: 0
STA Interfaces to be released: 0
----------------------------------
generation 0 has 12 finalizable objects
generation 1 has 45 finalizable objects
generation 2 has 230 finalizable objects
Ready for finalization 8 objects

Key indicators:

  • High count in "Ready for finalization" means the finalizer thread is falling behind
  • Objects in Gen 2 finalizable list are expensive -- they survive two GC cycles minimum (one to schedule finalization, one to collect after finalization runs)
  • The primary cause is types that define a finalizer (~TypeName()) but are never disposed -- a correct IDisposable.Dispose() implementation suppresses finalization via GC.SuppressFinalize

Additional SOS Commands for Heap Analysis

| Command | Purpose | When to Use |
| --- | --- | --- |
| dumpobj <address> | Display field values of a specific object | Inspect object state after finding it with dumpheap |
| dumparray <address> | Display array contents | Investigate large arrays found in heap stats |
| eeheap -gc | Show GC heap segment layout | Investigate LOH fragmentation |
| gcwhere <address> | Show which GC generation holds an object | Determine if an object is pinned or in LOH |
| dumpmt <MT> | Display method table details | Investigate type metadata |
| threads | List all managed threads with stack traces | Identify deadlocks or blocking |
| clrstack | Display managed call stack for current thread | Correlate thread state with heap data |

Memory Leak Investigation Workflow

  1. Baseline: Capture a dump after application startup and initial warm-up
  2. Load: Run the workload scenario suspected of leaking
  3. Compare: Capture a second dump after the workload completes
  4. Diff: Compare dumpheap -stat output between the two dumps -- look for types whose count or total size grew significantly
  5. Root: Use gcroot on instances of the growing type to find the retention chain
  6. Fix: Break the retention chain (remove from static collections, dispose event subscriptions, fix async lifetime issues)

# Tip: save dumpheap output for comparison by running analyze
# non-interactively (--command runs commands on start) and
# redirecting at the shell:
dotnet-dump analyze dump-before.dmp --command "dumpheap -stat" "exit" > /tmp/heap-before.txt
dotnet-dump analyze dump-after.dmp --command "dumpheap -stat" "exit" > /tmp/heap-after.txt

# Compare:
diff /tmp/heap-before.txt /tmp/heap-after.txt
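
Once the two heap statistics files are saved, the diff in step 4 can be computed with a short script. The parser below assumes the dumpheap -stat column layout shown earlier in this section (MT, Count, TotalSize, Class Name); adjust it if your SOS version formats the output differently.

```python
# Sketch: diff two saved `dumpheap -stat` outputs to find growing types.
# Column layout (MT, Count, TotalSize, Class Name) is assumed from the
# sample output shown earlier; verify against your own dumps.

def parse_stats(text: str) -> dict:
    """Map class name -> (count, total_size) from dumpheap -stat output."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 3)
        # A statistics row starts with a hex method-table address.
        if len(parts) == 4 and all(c in "0123456789abcdef" for c in parts[0].lower()):
            _, count, size, name = parts
            stats[name] = (int(count.replace(",", "")), int(size.replace(",", "")))
    return stats

before = parse_stats("""
00007fff2c7b5640    2,100      504,000 MyApp.Models.OrderEntity
00007fff2c6a0988   15,230    1,218,400 System.Byte[]
""")
after = parse_stats("""
00007fff2c7b5640    9,800    2,352,000 MyApp.Models.OrderEntity
00007fff2c6a0988   15,400    1,232,000 System.Byte[]
""")

# Report types whose instance count grew, largest total size first.
for name, (count, size) in sorted(after.items(), key=lambda kv: -kv[1][1]):
    old_count, old_size = before.get(name, (0, 0))
    if count > old_count:
        print(f"{name}: count {old_count} -> {count} (+{size - old_size:,} bytes)")
```

Types that grow across dumps despite the workload having completed are the candidates to feed into gcroot.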

Profiling Workflow Summary

Use the diagnostic tools in a structured investigation workflow:

1. dotnet-counters (triage)
   ├── CPU high?         → dotnet-trace --profile cpu-sampling
   │                       → Convert to flame graph (Speedscope)
   │                       → Identify hot methods
   ├── Memory growing?   → dotnet-dump collect
   │                       → dumpheap -stat (find large/numerous types)
   │                       → gcroot (find retention chains)
   │                       → Fix retention + verify with second dump
   ├── GC pressure?      → dotnet-trace --profile gc-collect
   │                       → Identify allocation hot paths
   │                       → Apply zero-alloc patterns [skill:dotnet-performance-patterns]
   └── Thread starvation? → dotnet-dump analyze
                            → threads (list all managed threads)
                            → clrstack (check for blocking calls)

After profiling identifies the bottleneck, use [skill:dotnet-benchmarkdotnet] to create targeted benchmarks that quantify the improvement from fixes.


Agent Gotchas

  1. Start with dotnet-counters, not dotnet-trace -- counters have near-zero overhead and identify the category of problem (CPU, memory, threads). Only reach for trace or dump after counters narrow the investigation.
  2. Use CPU sampling (not instrumentation) in production -- sampling overhead is 2-5% and safe for production. Instrumentation adds 10-50%+ overhead and should be limited to development environments.
  3. Always convert traces to flame graphs for analysis -- reading raw .nettrace event logs is impractical. Use dotnet-trace convert --format Speedscope and open in https://www.speedscope.app/ for visual analysis.
  4. Capture two dumps for leak investigation -- a single dump shows current state but cannot distinguish normal resident objects from leaked ones. Compare heap statistics across two dumps taken before and after the suspected leak scenario.
  5. Filter dumpheap by -min 85000 to find LOH objects -- objects >= 85,000 bytes go to the Large Object Heap, which is only collected in Gen 2 GC. Large LOH counts indicate potential fragmentation.
  6. Interpret GC counter data with [skill:dotnet-observability] -- runtime GC/threadpool counters overlap with OpenTelemetry metrics. Use the observability skill for correlating profiling findings with distributed trace context.
  7. Do not confuse dotnet-trace gc-collect with dotnet-dump -- gc-collect traces allocation events over time (which methods allocate); dotnet-dump captures a point-in-time heap snapshot (what objects exist). Use gc-collect for allocation rate analysis; use dotnet-dump for retention/leak analysis.
