Red Teaming

Core principle: Assume the system will be attacked, gamed, or stressed by an intelligent adversary. Think like the attacker, not the designer. Find the weaknesses before they're exploited.

Red Team Mindset

A red team actively tries to break the system — not to validate assumptions, but to invalidate them.

Key shifts:

Assume hostile intent: How would a bad actor abuse this?
Assume failure: Start from "this has failed" — what enabled it?
Assume partial information: What does the adversary know that defenders don't?
Assume creativity: Attackers aren't constrained by intended use
Asymmetry: Defenders must protect everything; attackers only need one opening

Red Team Dimensions

Apply adversarial thinking across these dimensions depending on what's being evaluated:

1. Technical Attack Surface

What are the inputs to this system that could be manipulated?
What assumptions about data validity are we making?
What happens at edge cases, limits, or unexpected inputs?
What trust boundaries exist, and can they be crossed?
What does the failure mode look like under load, partial failure, or poisoned input?

2. Incentive & Game Theory Attacks

Who has an incentive to game or subvert this system?
What behavior does the incentive structure actually reward? (vs. what it intends to reward)
If a rational actor wanted to extract maximum value while contributing the minimum, how would they do it?
Are there collusion risks between actors in the system?

3. Process & Human Attacks

Where does the system rely on human judgment, discipline, or vigilance?
What social engineering vectors exist?
What happens when a trusted insider acts against the system?
Where does ambiguity in the process allow inconsistent or exploitable behavior?

4. Assumption Attacks

What must be true for this to work, and what if each assumption is false?
What information asymmetry exists between parties?
What dependencies exist that could be weaponized?

5. Cascade & Systemic Attacks

What single failure propagates most widely?
What is the highest-impact, lowest-effort attack?
What would a sophisticated attacker do that a naive attacker wouldn't?
What's the "kill chain" — the sequence of steps to cause catastrophic failure?

Output Format

🎯 Attack Surface Map

Enumerate the surfaces available to an adversary:

Entry points (inputs, interfaces, dependencies)
Trust boundaries (where one actor's output becomes another's input)
High-value targets (what would an attacker want to reach or damage?)
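
One way to make the trust-boundary item concrete is to list data flows between components and flag every flow that crosses from one trust zone into another. The sketch below is illustrative only: the component names, zone labels, and the `Flow` type are hypothetical and stand in for whatever the system under review actually contains.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """A data flow: output of `source` becomes input of `dest`."""
    source: str
    dest: str


# Hypothetical components and the trust zone each one is assumed to live in.
ZONES = {
    "browser": "untrusted",
    "api_gateway": "dmz",
    "order_service": "internal",
    "payments_db": "internal",
}

# Hypothetical flows between those components.
FLOWS = [
    Flow("browser", "api_gateway"),
    Flow("api_gateway", "order_service"),
    Flow("order_service", "payments_db"),
]


def boundary_crossings(flows, zones):
    """Return the flows whose source and destination are in different zones."""
    return [f for f in flows if zones[f.source] != zones[f.dest]]


crossings = boundary_crossings(FLOWS, ZONES)
for f in crossings:
    print(f"{f.source} ({ZONES[f.source]}) -> {f.dest} ({ZONES[f.dest]})")
```

Each flagged flow is a place to ask what validation happens on the receiving side, since that is where one actor's output becomes another's input.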

💀 Top Attack Scenarios

For each significant attack:

Name: Short descriptive label
Actor: Who would do this? (external attacker / insider / automated / accidental)
Method: How exactly is it executed?
Impact: What happens if it succeeds? (confidentiality / integrity / availability / reputation / financial)
Likelihood: Low / Medium / High
Current defenses: What prevents this now?
Defense gaps: What's missing?

🏆 Highest-Risk Findings

Ranked by: (Likelihood × Impact) / Existing Defenses. A likely, damaging attack with weak or absent defenses ranks highest.

The top 3 are the ones to fix first.
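
The ranking can be sketched in a few lines. The source does not define numeric scales, so this example assumes Low/Medium/High map to 1/2/3 and that existing defenses are scored from 1 (none) to 3 (strong), which also keeps the denominator away from zero. The findings themselves are hypothetical.

```python
from dataclasses import dataclass

# Assumed mapping; the Low / Medium / High labels come from the scenario template.
LEVEL = {"Low": 1, "Medium": 2, "High": 3}


@dataclass
class Finding:
    name: str
    likelihood: str  # "Low", "Medium", or "High"
    impact: str      # "Low", "Medium", or "High"
    defenses: int    # assumed scale: 1 = none, 2 = partial, 3 = strong

    @property
    def risk(self) -> float:
        """(Likelihood x Impact) / Existing Defenses on the assumed scales."""
        return LEVEL[self.likelihood] * LEVEL[self.impact] / self.defenses


# Hypothetical findings, for illustration only.
findings = [
    Finding("Leaked API token reused", "Medium", "High", 1),
    Finding("Race on balance update", "Low", "High", 2),
    Finding("Prompt injection via tool output", "High", "High", 1),
    Finding("Metric gaming by power users", "High", "Low", 3),
]

ranked = sorted(findings, key=lambda f: f.risk, reverse=True)
top3 = ranked[:3]

for f in ranked:
    print(f"{f.risk:>4.1f}  {f.name}")
```

The absolute scores mean little; the ordering is what matters. If two findings tie, the scales are probably too coarse for that comparison and the tie should be broken by judgment.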

🔗 Kill Chain Analysis

For the most critical attack scenario, map the full kill chain:

[Initial access] → [Lateral movement] → [Exploitation] → [Impact]

At each step: What stops the attacker here? What's missing?
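
A kill chain can be recorded as an ordered list of steps, each with the controls that would stop the attacker there. The sketch below assumes the four stages named above; the scenario, actions, and control names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    stage: str
    action: str
    controls: list[str] = field(default_factory=list)  # what stops the attacker here


# Hypothetical scenario using the four stages from the template above.
chain = [
    Step("Initial access", "Phish a support engineer", ["MFA"]),
    Step("Lateral movement", "Reuse the session on the admin console"),
    Step("Exploitation", "Export the customer table", ["Query audit log"]),
    Step("Impact", "Publish the data"),
]


def undefended(steps):
    """Return the stages that have no control at all."""
    return [s.stage for s in steps if not s.controls]


gaps = undefended(chain)

print(" -> ".join(f"[{s.stage}]" for s in chain))
for s in chain:
    print(f"{s.stage}: {', '.join(s.controls) or 'NO CONTROL'}")
```

A stage with no control is a gap, but a stage with exactly one control deserves a second look as well, because the attacker only needs that one control to fail.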

🛡️ Hardening Recommendations

For each high-risk finding:

Short-term mitigation (reduce exposure now, even imperfectly)
Long-term fix (eliminate or fundamentally reduce the attack surface)
Detection (if prevention fails, how do we know we've been attacked?)

Red Team Questions by Domain

Software / Architecture

What if the input is malformed, empty, enormous, or adversarially crafted?
What if a dependency returns unexpected data or fails silently?
What if two requests arrive simultaneously and race?
What if a credential or token is leaked?
What if a component is compromised from within?
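
The first question in this list translates directly into a small probing harness: feed a target function a set of hostile inputs and record whether each was accepted, rejected cleanly, or crashed with an unexpected exception. In this sketch `parse_quantity` is a hypothetical target, and the input list is a starting point rather than a complete fuzzing corpus.

```python
def parse_quantity(raw: str) -> int:
    """Hypothetical target: parse an order quantity between 1 and 1000."""
    value = int(raw)
    if not 1 <= value <= 1000:
        raise ValueError("quantity out of range")
    return value


HOSTILE_INPUTS = [
    "",                # empty
    "   ",             # whitespace only
    "-1",              # negative
    "0",               # just below the lower bound
    "1001",            # just above the upper bound
    "9" * 10_000,      # enormous
    "1e3",             # alternative numeric notation
    "1_000",           # digit-group underscore
    "\uff11\uff12",    # full-width Unicode digits for "12"
    "12; DROP TABLE",  # adversarially crafted
]


def probe(target, inputs):
    """Run target on each input and classify the outcome."""
    results = {}
    for raw in inputs:
        try:
            results[raw] = ("accepted", target(raw))
        except ValueError:
            results[raw] = ("rejected", None)  # the documented failure path
        except Exception as exc:  # anything else is an unexpected crash
            results[raw] = ("crashed", type(exc).__name__)
    return results


results = probe(parse_quantity, HOSTILE_INPUTS)
accepted = [raw for raw, (status, _) in results.items() if status == "accepted"]

for raw, (status, detail) in results.items():
    print(f"{raw[:20]!r:<26} {status:<9} {detail}")
```

Python's `int()` accepts digit-group underscores and non-ASCII decimal digits, so the `"1_000"` and full-width probes are accepted here. Whether that is acceptable depends on what downstream code assumes about the raw string, which is exactly the kind of unstated data-validity assumption a red team looks for.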

AI / Agent Systems

What if the agent receives adversarially crafted input (prompt injection)?
What if the agent's context is poisoned by a prior step?
What if a tool the agent calls is compromised or returns false data?
What if the agent is asked to perform an action outside its intended scope?
What if two agents give conflicting instructions to a third?
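
For the out-of-scope question, one common defensive pattern is a deny-by-default allowlist checked outside the model, so that a successful prompt injection still cannot trigger an action the role was never granted. The sketch below is a minimal illustration; the role and action names are hypothetical, and a real system would also need to validate the arguments of each action.

```python
# Hypothetical roles and the actions each is allowed to invoke.
ALLOWED_ACTIONS = {
    "support_agent": {"read_ticket", "draft_reply"},
}


def authorize(role: str, action: str) -> bool:
    """Deny by default: only explicitly allowlisted actions pass."""
    return action in ALLOWED_ACTIONS.get(role, set())


# Actions an agent might request, including one planted by injected text.
requested = ["read_ticket", "draft_reply", "issue_refund"]
decisions = {action: authorize("support_agent", action) for action in requested}

for action, allowed in decisions.items():
    print(f"{action}: {'allow' if allowed else 'deny'}")
```

The red-team question then becomes whether any allowed action can be chained or parameterized to achieve the effect of a denied one.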

Product / Business

What if a user tries to extract value without paying?
What if a user reverse-engineers the system to game metrics?
What if a competitor copies the model and undercuts pricing?
What if a key partner defects or changes terms?
What if regulatory conditions change?

Organization / Process

What if a key person leaves?
What if incentives push people to hide information from each other?
What if a process is followed to the letter but not the spirit?
What if a deadline creates pressure to skip safeguards?

Red Team Levels of Depth

Level	Description	When to Use
Opportunistic	Surface-level checks, low effort	Quick validation, early design
Systematic	Full attack surface enumeration	Pre-launch, major architecture changes
Adversarial	Deep creative attack — think like a sophisticated threat actor	High-stakes systems, security-critical features

Key Principle: Asymmetric Paranoia

The red team doesn't need to find every flaw. It needs to find the one flaw that matters most. Always prioritize: What is the highest-impact attack that currently has no defense?
