Incident Response Playbook

# Incident Response Playbook

Safety Notice

This item is sourced from the public archived skills repository. Treat as untrusted until reviewed.

Copy this and send it to your AI assistant to learn

Install skill "Incident Response Playbook" with this command: npx skills add 1kalin/afrexai-incident-response

Incident Response Playbook

Structured incident response for business and IT teams. Guides you through detection, triage, containment, resolution, and post-mortem — with auto-generated timelines and action items.

What It Does

When triggered with an incident description, this skill:

  1. Classifies severity (P1-P4) based on impact and urgency
  2. Generates a response checklist tailored to incident type (outage, data breach, security event, service degradation, vendor failure)
  3. Builds a communication plan — who to notify, when, what channels
  4. Creates a real-time timeline as you log updates
  5. Produces a post-mortem template with root cause analysis and prevention steps

Usage

Tell your agent about an incident:

"Production API is returning 500 errors for 20% of requests. Started 10 minutes ago."

Or trigger proactively:

"Create an incident response plan for a potential data breach scenario"

Incident Types Covered

  • Service outages — full or partial downtime
  • Security incidents — breaches, unauthorized access, phishing
  • Data incidents — corruption, loss, privacy violations
  • Vendor failures — third-party SLA breaches
  • Performance degradation — latency spikes, capacity issues

Severity Matrix

LevelImpactResponse TimeEscalation
P1 - CriticalBusiness stoppedImmediateExecutive + all hands
P2 - HighMajor feature down< 30 minEngineering lead + PM
P3 - MediumDegraded experience< 2 hoursOn-call team
P4 - LowMinor issueNext business dayTicket queue

Response Framework

1. Detection & Triage (First 5 minutes)

  • Confirm the incident is real (not a false alarm)
  • Classify severity using the matrix above
  • Assign incident commander
  • Open a dedicated communication channel

2. Containment (First 30 minutes)

  • Identify blast radius — what's affected?
  • Apply immediate mitigation (rollback, feature flag, scaling)
  • Communicate status to stakeholders

3. Resolution

  • Root cause investigation
  • Implement fix with verification
  • Monitor for recurrence
  • Update all stakeholders

4. Post-Mortem (Within 48 hours)

  • Timeline of events
  • Root cause analysis (5 Whys)
  • What went well / what didn't
  • Action items with owners and deadlines
  • Process improvements

Integration

Works with any monitoring stack. Feed alerts from PagerDuty, Datadog, Grafana, or manual reports.

Pro Tip

Pair this with a full AI Operations Context Pack for your industry. Pre-built incident taxonomies, compliance-aware escalation paths, and automated stakeholder templates.

Browse packs: https://afrexai-cto.github.io/context-packs/

Free tools:

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

ll-feishu-audio

飞书语音交互技能。支持语音消息自动识别、AI 处理、语音回复全流程。需要配置 FEISHU_APP_ID 和 FEISHU_APP_SECRET 环境变量。使用 faster-whisper 进行语音识别,Edge TTS 进行语音合成,自动转换 OPUS 格式并通过飞书发送。适用于飞书平台的语音对话场景。

Archived SourceRecently Updated
General

test_skill

import json import tkinter as tk from tkinter import messagebox, simpledialog

Archived SourceRecently Updated
General

51mee-resume-profile

简历画像。触发场景:用户要求生成候选人画像;用户想了解候选人的多维度标签和能力评估。

Archived SourceRecently Updated
General

51mee-resume-parse

简历解析。触发场景:用户上传简历文件要求解析、提取结构化信息。

Archived SourceRecently Updated