# Jeff Dean Style Guide

## Overview
Jeff Dean is the architect behind much of Google's infrastructure: MapReduce, BigTable, Spanner, TensorFlow, and more. He exemplifies the rare combination of deep systems knowledge, performance intuition, and practical engineering judgment. His work defines how modern internet-scale systems are built.
## Core Philosophy

> "Design for 10x the current load, but plan to rewrite before 100x."

> "Simple solutions often require the most sophisticated understanding of the problem."

> "If a problem isn't interesting at scale, it probably isn't interesting at all."
## Design Principles

- **Embrace Failure**: At scale, everything fails. Design systems that degrade gracefully, not catastrophically.
- **Numbers Matter**: Know your latencies, throughputs, and failure rates by heart. Performance intuition comes from data.
- **Codesign Hardware and Software**: The best performance comes from understanding the entire stack, from disk to datacenter.
- **Simplicity at Scale**: Complex systems break in complex ways. The simplest solution that scales is usually the best.
- **Measure, Then Optimize**: Never optimize without profiling. Intuition fails; data doesn't. (See the profiling sketch after this list.)
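A minimal way to act on "Measure, Then Optimize" is the standard library's profiler; the workload below is an arbitrary stand-in for a real entry point:

```python
# Profile before optimizing; cProfile ships with Python.
# From a shell: python -m cProfile -s cumulative my_service.py
import cProfile

# Arbitrary stand-in workload; replace with your real entry point.
cProfile.run("sum(i * i for i in range(10**6))", sort="cumulative")
```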
## Numbers Every Engineer Should Know

```
L1 cache reference                          0.5 ns
Branch mispredict                             5 ns
L2 cache reference                            7 ns
Mutex lock/unlock                            25 ns
Main memory reference                       100 ns
Compress 1K bytes with Zippy              3,000 ns
Send 1K bytes over 1 Gbps network        10,000 ns
Read 4K randomly from SSD               150,000 ns
Read 1 MB sequentially from memory      250,000 ns
Round trip within same datacenter       500,000 ns
Read 1 MB sequentially from SSD       1,000,000 ns
Disk seek                            10,000,000 ns
Read 1 MB sequentially from disk     20,000,000 ns
Send packet CA→Netherlands→CA       150,000,000 ns
```
These numbers should guide every design decision.
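As a quick worked example of using the table: reading 1 GB sequentially means roughly 1,024 reads of 1 MB, so the table predicts about 0.26 s from memory, 1 s from SSD, and 20 s from disk:

```python
# Back-of-envelope from the latency table: time to read 1 GB sequentially,
# as 1,024 reads of 1 MB. Canonical estimates, not measurements.
NS_PER_MB = {"memory": 250_000, "SSD": 1_000_000, "disk": 20_000_000}

for medium, ns in NS_PER_MB.items():
    print(f"1 GB from {medium}: {1024 * ns / 1e9:.2f} s")
# 1 GB from memory: 0.26 s
# 1 GB from SSD: 1.02 s
# 1 GB from disk: 20.48 s
```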
## When Designing Systems

### Always

- Start with back-of-envelope calculations before designing
- Design for partial failure—some machines will always be down
- Use replication for availability, sharding for scale
- Batch operations when possible—amortize fixed costs (see the batching sketch after this list)
- Compress data on the wire and at rest (CPU is cheaper than I/O)
- Add monitoring and observability from day one
- Design for debugging—you'll need to diagnose production issues
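The batching sketch promised above; `write_batch` is a hypothetical client call, and the point is that a fixed per-RPC cost (e.g. the ~500,000 ns intra-datacenter round trip) is paid once per batch instead of once per record:

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

def chunked(items: Sequence[T], batch_size: int = 100) -> Iterator[Sequence[T]]:
    """Yield fixed-size slices so each RPC pays its fixed cost once per batch."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# One round trip per 100 records instead of one per record:
# for record_batch in chunked(records):
#     write_batch(record_batch)  # hypothetical batched client call
```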
### Never

- Assume the network is reliable (it's not)
- Assume latency is zero (it's not)
- Assume bandwidth is infinite (it's not)
- Optimize before measuring
- Design for current load only—design for 10x
- Ignore tail latency (p99 matters more than average)
- Build systems you can't reason about under failure
### Prefer

- Idempotent operations over exactly-once semantics (see the sketch after this list)
- Eventual consistency over strong consistency (when possible)
- Denormalization over joins at scale
- Structured data over unstructured (schemas help)
- Batch processing over real-time when latency allows
- Simple retry logic over complex distributed transactions
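Why idempotency pairs well with simple retries: if every request carries a client-chosen ID, redelivery is harmless. A toy sketch of the scheme (an assumed design, not any specific Google API):

```python
class IdempotentStore:
    """Deduplicate writes by client-supplied request ID so retries are safe."""

    def __init__(self) -> None:
        self._applied: set[str] = set()
        self._data: dict[str, str] = {}

    def write(self, request_id: str, key: str, value: str) -> bool:
        if request_id in self._applied:
            return False  # duplicate delivery: already applied, safely ignored
        self._data[key] = value
        self._applied.add(request_id)
        return True

store = IdempotentStore()
store.write("req-123", "user:42:name", "Ada")  # True: applied
store.write("req-123", "user:42:name", "Ada")  # False: retry is a no-op
```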
## Architectural Patterns

### MapReduce Mental Model

**Problem**: Process petabytes of data

**Solution**:
- Map: Transform input into (key, value) pairs in parallel
- Shuffle: Group all values by key
- Reduce: Aggregate values for each key
**Why it works**:
- Embarrassingly parallel map phase
- Fault tolerance via re-execution
- Simple programming model hides distribution
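A single-process sketch of the three phases on the classic word-count example (a real MapReduce runs map tasks in parallel across thousands of machines and re-executes failed ones):

```python
from collections import defaultdict

def map_phase(documents: list[str]) -> list[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word; trivially parallelizable."""
    return [(word, 1) for doc in documents for word in doc.split()]

def shuffle_phase(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)

def reduce_phase(groups: dict[str, list[int]]) -> dict[str, int]:
    """Reduce: aggregate the values for each key."""
    return {key: sum(values) for key, values in groups.items()}

counts = reduce_phase(shuffle_phase(map_phase(["the cat", "the dog"])))
# {'the': 2, 'cat': 1, 'dog': 1}
```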
### BigTable Design

**Problem**: Structured storage at massive scale

**Solution**:
- Sparse, distributed, multi-dimensional sorted map
- (row, column, timestamp) → value
- Rows sorted lexicographically (enables range scans)
- Column families for locality
- Tablets (row ranges) as unit of distribution
**Key insight**: One data model, flexible enough for many use cases.
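A toy in-memory sketch of that data model (real tablets are persisted as SSTables and split across tablet servers; this only illustrates the sorted-map shape):

```python
class ToyBigTable:
    """Sparse sorted map: (row, column, timestamp) -> value."""

    def __init__(self) -> None:
        self._cells: dict[tuple[str, str, int], bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        self._cells[(row, column, timestamp)] = value

    def scan_rows(self, start_row: str, end_row: str):
        """Lexicographic range scan over row keys (what sorted rows enable)."""
        for key in sorted(self._cells):
            if start_row <= key[0] < end_row:
                yield key, self._cells[key]

table = ToyBigTable()
table.put("com.example/blog", "contents:html", 1, b"<html>...")
rows = list(table.scan_rows("com.example", "com.example0"))  # range scan
```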
### Spanner's TrueTime

**Problem**: Global consistency requires synchronized clocks

**Solution**:
- GPS + atomic clocks in every datacenter
- API returns interval [earliest, latest] not a point
- Wait out uncertainty before committing
```
TrueTime.now() returns TTinterval: [earliest, latest]

Commit rule: wait until TrueTime.now().earliest > commit_timestamp
```
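A toy sketch of the commit-wait rule, assuming a fixed clock uncertainty `epsilon_ms` (real TrueTime derives a live bound from GPS and atomic-clock hardware):

```python
import time
from dataclasses import dataclass

@dataclass
class TTInterval:
    earliest: float  # seconds since epoch
    latest: float

def truetime_now(epsilon_ms: float = 7.0) -> TTInterval:
    """Return an interval guaranteed to contain true time."""
    now, eps = time.time(), epsilon_ms / 1000.0
    return TTInterval(earliest=now - eps, latest=now + eps)

def commit_wait(commit_timestamp: float) -> None:
    """Block until commit_timestamp is definitely in the past on every clock."""
    while truetime_now().earliest <= commit_timestamp:
        time.sleep(0.001)
```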
## Code Patterns

### Back-of-Envelope Capacity Planning
```python
def estimate_storage_needs(
    daily_active_users: int,
    actions_per_user_per_day: int,
    bytes_per_action: int,
    retention_days: int,
    replication_factor: int = 3,
) -> dict:
    """Jeff Dean-style capacity estimation."""
    daily_bytes = daily_active_users * actions_per_user_per_day * bytes_per_action
    total_bytes = daily_bytes * retention_days * replication_factor
    return {
        "daily_raw_gb": daily_bytes / (1024**3),
        "total_storage_tb": total_bytes / (1024**4),
        "monthly_bandwidth_tb": (daily_bytes * 30) / (1024**4),
        "estimated_machines_1tb_each": total_bytes / (1024**4),
    }
```
Example: 100M DAU, 10 actions/day, 1 KB each, 90-day retention
≈ 270 TB of storage and ~300 machines (with 3× replication)
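Plugging those inputs into the function above reproduces the estimate (the function reports binary TiB, so the numbers land slightly below the decimal back-of-envelope):

```python
needs = estimate_storage_needs(
    daily_active_users=100_000_000,
    actions_per_user_per_day=10,
    bytes_per_action=1024,
    retention_days=90,
)
print(needs["total_storage_tb"])             # ~251 TiB (~276 decimal TB)
print(needs["estimated_machines_1tb_each"])  # ~251 machines at 1 TiB each
```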
### Sharding Strategy
```python
import bisect
import hashlib


class ConsistentHashRing:
    """Distribute data across nodes with minimal reshuffling."""

    def __init__(self, nodes: list[str], virtual_nodes: int = 150):
        # Virtual nodes smooth out the load: each physical node owns
        # many small arcs of the ring instead of one large one.
        self.ring: dict[int, str] = {}
        for node in nodes:
            for i in range(virtual_nodes):
                key = self._hash(f"{node}:{i}")
                self.ring[key] = node
        self.sorted_keys: list[int] = sorted(self.ring.keys())

    def get_node(self, key: str) -> str:
        """Find the node responsible for this key."""
        if not self.ring:
            raise ValueError("Empty ring")
        h = self._hash(key)
        # First virtual node clockwise from h; wrap past the end of the ring.
        idx = bisect.bisect_left(self.sorted_keys, h)
        if idx == len(self.sorted_keys):
            idx = 0
        return self.ring[self.sorted_keys[idx]]

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)
```
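Usage sketch: lookups are deterministic for a fixed membership, and adding or removing a node remaps only about 1/N of the keys:

```python
ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
owner = ring.get_node("user:42")  # same answer every time for this membership
```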
### Retry with Exponential Backoff
```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base_delay_ms: int = 100,
    max_delay_ms: int = 10_000,
) -> T:
    """Retry with exponential backoff and jitter.

    At Google scale, thundering herds kill systems.
    Jitter prevents synchronized retries.
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise
            delay = min(base_delay_ms * (2 ** attempt), max_delay_ms)
            jitter = random.uniform(0, delay * 0.1)
            time.sleep((delay + jitter) / 1000)
    raise RuntimeError("Unreachable")
```
## Mental Model
Jeff Dean approaches problems with:
- **Quantify first**: How much data? How many QPS? What latency budget? (See the QPS sketch after this list.)
- **Identify bottlenecks**: Where will the system break first?
- **Design for failure**: What happens when (not if) components fail?
- **Simplify ruthlessly**: Can this be simpler while still meeting requirements?
- **Plan for evolution**: Today's solution should be replaceable in 3 years
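A quantify-first sketch for the QPS question (the 50 requests/user/day and 2× peak-to-average factor are assumptions for illustration):

```python
dau = 100_000_000        # daily active users
requests_per_user = 50   # assumed
peak_to_average = 2      # assumed diurnal peak factor

avg_qps = dau * requests_per_user / 86_400
peak_qps = peak_to_average * avg_qps
print(f"avg ≈ {avg_qps:,.0f} QPS, peak ≈ {peak_qps:,.0f} QPS")
# avg ≈ 57,870 QPS, peak ≈ 115,741 QPS
```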
## The Google Design Doc

- **Context & Scope**
  - What problem are we solving? Why now?
- **Goals and Non-Goals**
  - What this system WILL do
  - What this system explicitly WON'T do
- **Design**
  - System architecture
  - Data model
  - API
- **Alternatives Considered**
  - What else could we do? Why not?
- **Cross-cutting Concerns**
  - Security, privacy, monitoring, rollout
- **Open Questions**
  - What don't we know yet?
## Warning Signs
You're violating Dean's principles if:
- You don't know your system's p50, p99, and p999 latencies (see the sketch after this list)
- You haven't done back-of-envelope capacity planning
- Your system has no strategy for partial failure
- You're optimizing without profiling data
- You designed for current load, not 10x growth
- You can't explain where every millisecond goes
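If the first sign applies, a standard-library starting point (assumes latency samples are already being collected, here in milliseconds):

```python
import statistics

def latency_percentiles(samples_ms: list[float]) -> dict[str, float]:
    """p50/p99/p999 from raw samples; needs >= 1,000 samples to be meaningful."""
    cuts = statistics.quantiles(samples_ms, n=1000)  # 999 cut points
    return {"p50": cuts[499], "p99": cuts[989], "p999": cuts[998]}
```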
## Additional Resources

- For detailed philosophy, see philosophy.md
- For references (papers, talks), see references.md