latency-engineering

Diagnose and reduce latency in software systems. Use when dealing with slow APIs, tail latency, p99 spikes, caching, replication, partitioning, concurrency, async I/O, or any question about making systems faster.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.


Install skill "latency-engineering" with this command: npx skills add nkootstra/skills/nkootstra-skills-latency-engineering

Latency Engineering

Comprehensive guidance for diagnosing and reducing latency across the full software/hardware stack.

Mental Model

Latency is the time delay between a cause and its observed effect. It is a distribution, not a single number, so always think in percentiles (p50, p95, p99, p99.9). Tail latency shapes real-world user experience far more than averages suggest.
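To make the gap between averages and percentiles concrete, here is a minimal sketch on synthetic data. The sample values and the nearest-rank `percentile` helper are assumptions chosen for illustration, not part of the skill itself.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("percentile() needs at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Synthetic workload (illustrative only): 98 fast requests and 2 slow ones.
latencies_ms = [10.0] * 98 + [1000.0] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
print(f"mean = {mean_ms:.1f} ms")
for p in (50, 95, 99):
    print(f"p{p} = {percentile(latencies_ms, p):.1f} ms")
```

In this sample the mean (29.8 ms) describes no request that actually happened: the median request took 10 ms, while the p99 request took 1000 ms.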

Two fundamental laws:

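Little's Law: Concurrency = Throughput × Latency, where concurrency is the average number of requests in flight, throughput is the average arrival rate, and latency is the average time a request spends in the system. Use it to size systems and understand queue dynamics.
Amdahl's Law: Speedup = 1 / ((1-P) + P/N), where P is the fraction of the work that can be parallelized and N is the number of parallel workers. Use it to set realistic expectations on parallelization gains.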
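Both laws reduce to one-line calculations. The sketch below is a worked example under assumed numbers (2000 requests/s, 50 ms average latency, 90% parallelizable work); the function names are hypothetical.

```python
def required_concurrency(throughput_rps, latency_s):
    """Little's Law: average requests in flight = throughput x average latency."""
    return throughput_rps * latency_s


def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's Law: speedup = 1 / ((1 - P) + P / N)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


# Assumed workload: 2000 requests/s at 50 ms average latency.
in_flight = required_concurrency(2000, 0.050)
print(f"requests in flight: {in_flight:.0f}")

# Assumed program: 90% of the work can run in parallel.
for n in (2, 8, 64):
    print(f"{n} workers: speedup {amdahl_speedup(0.9, n):.2f}x")
```

With P = 0.9 the speedup can never exceed 1 / (1 - P) = 10x, no matter how many workers are added, which is why the serial fraction is usually the first thing to attack.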

Decision Framework: Where to Optimize First

1. Measure → identify where time is actually spent
2. Data locality?     → see references/data-latency.md
3. Computation cost?  → see references/compute-latency.md
4. Can't reduce it?   → see references/hiding-latency.md

Key Latency Constants (back-of-envelope)

Operation	Latency
CPU cycle (3 GHz)	~0.3 ns
L1 cache access	~1 ns
LLC / 40 Gbps NIC	~10–40 ns
DRAM access	~100 ns
NVMe disk	~10 µs
SSD disk	~100 µs
Same datacenter (LAN)	< 1 ms
NYC → London (WAN)	~60–150 ms
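The constants above can be combined into a rough latency floor for a request path. The following is a sketch only: the request path is hypothetical, the operations are assumed to run strictly one after another, and the WAN figure is treated as the cost of a single traversal at the low end of the listed range.

```python
# Representative values taken from the table above, in nanoseconds.
LATENCY_NS = {
    "dram_access": 100,
    "nvme_read": 10_000,
    "wan_nyc_london": 60_000_000,  # low end of the ~60-150 ms range
}


def latency_floor_ns(operation_counts):
    """Sum per-operation costs for a path of strictly sequential operations."""
    return sum(LATENCY_NS[op] * count for op, count in operation_counts.items())


def is_plausible(measured_min_ns, floor_ns):
    """A measured minimum below the estimated floor suggests a measurement bug."""
    return measured_min_ns >= floor_ns


# Hypothetical request path (an assumption for illustration only).
request_path = {"wan_nyc_london": 1, "nvme_read": 2, "dram_access": 10_000}
floor_ns = latency_floor_ns(request_path)
print(f"estimated floor: {floor_ns / 1e6:.2f} ms")
print("5 ms minimum plausible?", is_plausible(5_000_000, floor_ns))
```

Here the WAN hop accounts for almost all of the 61.02 ms floor, so a benchmark reporting a 5 ms minimum for this path is most likely not measuring the full round trip.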

Common Sources of Latency (checklist)

Measuring Latency Correctly

Always capture full distributions, not just averages. Use histograms or eCDF plots.
Report p95, p99, p99.9 for SLAs and debugging.
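Understand coordinated omission: a load generator that waits for each response before sending the next request stops sampling while the system is stalled, so naive benchmarks under-measure tail latency.
Validate measurements: compare the measured minimum against the theoretical lower bound from the latency constants table. A minimum below that bound points to a measurement error.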
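Coordinated omission is easiest to see in a small simulation. The sketch below models a hypothetical closed-loop client that intends to send one request every 10 ms but cannot send until the previous response has arrived. All numbers, and the `simulate` and `percentile` helpers, are assumptions for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def simulate(service_times_ms, interval_ms):
    """Return (naive, corrected) latency samples for a closed-loop client."""
    naive, corrected = [], []
    clock = 0.0  # time at which the previous response arrived
    for i, service in enumerate(service_times_ms):
        intended_start = i * interval_ms
        actual_start = max(clock, intended_start)  # blocked behind the previous request
        finish = actual_start + service
        naive.append(finish - actual_start)        # timed from the actual send
        corrected.append(finish - intended_start)  # timed from the intended send
        clock = finish
    return naive, corrected


# 100 requests that take 1 ms each, except one 500 ms stall.
service_times_ms = [1.0] * 100
service_times_ms[10] = 500.0

naive, corrected = simulate(service_times_ms, interval_ms=10.0)
for name, samples in (("naive", naive), ("corrected", corrected)):
    print(f"{name}: p50 = {percentile(samples, 50)} ms, p99 = {percentile(samples, 99)} ms")
```

The naive samples contain a single slow request, so their p99 stays at 1 ms. Timing from the intended send time shows that the one stall delayed 56 of the 100 requests, which pushes the corrected p99 to 491 ms.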

Reference Files

Load the relevant reference file based on the category of the problem:

Problem category	Reference file
Data placement, caching, replication, sharding	`references/data-latency.md`
CPU, algorithms, concurrency, memory	`references/compute-latency.md`
Async I/O, prefetching, hiding unavoidable lag	`references/hiding-latency.md`
Networking, kernel bypass, intranode	`references/network-latency.md`

When the user's question spans multiple areas, load all relevant files.

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
