incident-response

/incident-response

If you see unfamiliar placeholders or need to check which tools are connected, see CONNECTORS.md.

Manage an incident from detection through postmortem.

Usage

/incident-response $ARGUMENTS

Modes

/incident-response new [description] # Start a new incident /incident-response update [status] # Post a status update /incident-response postmortem # Generate postmortem from incident data

If no mode is specified, ask what phase the incident is in.

How It Works

┌─────────────────────────────────────────────────────────────────┐ │ INCIDENT RESPONSE │ ├─────────────────────────────────────────────────────────────────┤ │ Phase 1: TRIAGE │ │ ✓ Assess severity (SEV1-4) │ │ ✓ Identify affected systems and users │ │ ✓ Assign roles (IC, comms, responders) │ │ │ │ Phase 2: COMMUNICATE │ │ ✓ Draft internal status update │ │ ✓ Draft customer communication (if needed) │ │ ✓ Set up war room and cadence │ │ │ │ Phase 3: MITIGATE │ │ ✓ Document mitigation steps taken │ │ ✓ Track timeline of events │ │ ✓ Confirm resolution │ │ │ │ Phase 4: POSTMORTEM │ │ ✓ Blameless postmortem document │ │ ✓ Timeline reconstruction │ │ ✓ Root cause analysis (5 whys) │ │ ✓ Action items with owners │ └─────────────────────────────────────────────────────────────────┘

Severity Classification

Level Criteria Response Time

SEV1 Service down, all users affected Immediate, all-hands

SEV2 Major feature degraded, many users affected Within 15 min

SEV3 Minor feature issue, some users affected Within 1 hour

SEV4 Cosmetic or low-impact issue Next business day

Communication Guidance

Provide clear, factual updates at regular cadence. Include: what's happening, who's affected, what we're doing, when the next update is.

Output — Status Update

Incident Update: [Title]

Severity: SEV[1-4] | Status: Investigating | Identified | Monitoring | Resolved Impact: [Who/what is affected] Last Updated: [Timestamp]

Current Status

[What we know now]

Actions Taken

[Action 1]
[Action 2]

Next Steps

[What's happening next and ETA]

Timeline

Time	Event
[HH:MM]	[Event]

Output — Postmortem

Postmortem: [Incident Title]

Date: [Date] | Duration: [X hours] | Severity: SEV[X] Authors: [Names] | Status: Draft

Summary

[2-3 sentence plain-language summary]

Impact

[Users affected]
[Duration of impact]
[Business impact if quantifiable]

Timeline

Time (UTC)	Event
[HH:MM]	[Event]

Root Cause

[Detailed explanation of what caused the incident]

5 Whys

Why did [symptom]? → [Because...]
Why did [cause 1]? → [Because...]
Why did [cause 2]? → [Because...]
Why did [cause 3]? → [Because...]
Why did [cause 4]? → [Root cause]

What Went Well

[Things that worked]

What Went Poorly

[Things that didn't work]

Action Items

Action	Owner	Priority	Due Date
[Action]	[Person]	P0/P1/P2	[Date]

Lessons Learned

[Key takeaways for the team]

If Connectors Available

If ~~monitoring is connected:

Pull alert details and metrics
Show graphs of affected metrics

If ~~incident management is connected:

Create or update incident in PagerDuty/Opsgenie
Page on-call responders

If ~~chat is connected:

Post status updates to incident channel
Create war room channel

Tips

Start writing immediately — Don't wait for complete information. Update as you learn more.
Keep updates factual — What we know, what we've done, what's next. No speculation.
Postmortems are blameless — Focus on systems and processes, not individuals.