devops-engineer

Design, optimize, and debug CI/CD pipelines. GitHub Actions and GitLab CI patterns. Use for pipeline work. NOT for infrastructure provisioning (infrastructure-coder) or app code.

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Install skill "devops-engineer" with this command: npx skills add wyattowalsh/agents/wyattowalsh-agents-devops-engineer

DevOps Engineer

CI/CD pipeline design, optimization, and deployment strategy. Six modes: generate workflows, generate individual actions, optimize build times, design deployment strategies, review existing pipelines, and debug CI failures.

Scope: CI/CD pipelines and deployment automation only. NOT for infrastructure provisioning (infrastructure-coder), application code, monitoring setup, or database migrations (database-architect).

Canonical Vocabulary

Use these terms exactly throughout all modes:

| Term | Definition |
|------|------------|
| workflow | A CI/CD pipeline definition file (.github/workflows/*.yml, .gitlab-ci.yml) |
| job | A named unit of work within a workflow containing one or more steps |
| step | A single action within a job (run command, uses action) |
| stage | A logical grouping of jobs (build, test, deploy) |
| artifact | Build output passed between jobs or stages |
| cache | Dependency/build cache persisted across runs to reduce build time |
| matrix | Parameterized job expansion across multiple configurations |
| concurrency group | Mutual exclusion mechanism preventing parallel runs |
| environment | Deployment target with protection rules (staging, production) |
| promotion | Moving artifacts through environments (dev -> staging -> prod) |
| rollback | Reverting a deployment to a previous known-good state |
| canary | Incremental traffic shift to new version (1% -> 5% -> 25% -> 100%) |
| blue/green | Two identical environments with instant traffic switch |
| rolling | Gradual instance-by-instance replacement |
| gate | Manual or automated approval checkpoint before deployment proceeds |
| runner | Execution environment for CI/CD jobs (GitHub-hosted, self-hosted) |
| reusable workflow | Callable workflow template invoked from other workflows |
| composite action | Multi-step action packaged as a single reusable unit |

Dispatch

| $ARGUMENTS | Mode |
|------------|------|
| pipeline <requirements> | Generate: new CI/CD workflow from requirements |
| action <description> | Action: GitHub Action step/job generation |
| optimize <workflow> | Optimize: pipeline build time optimization |
| deploy <strategy> | Deploy: deployment strategy design |
| review <workflow> | Review: audit existing pipeline |
| debug <logs> | Debug: analyze CI failure logs |
| Natural language about CI/CD | Auto-detect appropriate mode |
| Empty | Show mode menu with examples |
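
The dispatch table can be read as a keyword lookup on the first word of the arguments. The sketch below is one possible reading, not the skill's actual implementation: the function name `dispatch` and the `"auto"` and `"menu"` labels are hypothetical, while the keywords come from the table.

```python
# Hypothetical dispatcher: keywords mirror the Dispatch table; the function
# name and the "auto"/"menu" labels are illustrative assumptions.
MODES = {
    "pipeline": "generate",
    "action": "action",
    "optimize": "optimize",
    "deploy": "deploy",
    "review": "review",
    "debug": "debug",
}


def dispatch(arguments: str) -> tuple[str, str]:
    """Return (mode, remainder) for a raw $ARGUMENTS string."""
    text = arguments.strip()
    if not text:
        return ("menu", "")  # empty input: show the mode menu
    keyword, _, rest = text.partition(" ")
    mode = MODES.get(keyword.lower())
    if mode is None:
        return ("auto", text)  # natural language: mode is auto-detected later
    return (mode, rest.strip())
```

For example, `dispatch("pipeline Node 20 app with Jest")` selects the generate mode and passes the rest of the string along as the requirements.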

Mode 1: Generate (pipeline)

Design and generate CI/CD workflow files from requirements.

Steps

  1. Gather requirements -- language, framework, test suite, deployment targets, branch strategy
  2. Select platform -- GitHub Actions (default), GitLab CI, or both
  3. Load patterns -- read references/github-actions-patterns.md or references/gitlab-ci-patterns.md
  4. Design structure -- jobs, stages, dependencies, triggers, caching strategy
  5. Generate workflow -- complete YAML file with inline comments explaining non-obvious choices
  6. Validate -- run uv run python skills/devops-engineer/scripts/workflow-analyzer.py <file> on generated output

Output

Complete workflow YAML file written to the appropriate location.

Mode 2: Action (action)

Generate individual GitHub Action steps or jobs.

  1. Parse description -- what the action should accomplish
  2. Load patterns -- read references/github-actions-patterns.md
  3. Generate -- step or job YAML with correct uses, with, env configuration
  4. Context check -- if an existing workflow is referenced, read it and integrate the new action

Output: YAML snippet ready for insertion into a workflow file.

Mode 3: Optimize (optimize)

Analyze and optimize pipeline build times.

Analysis

  1. Analyze -- run uv run python skills/devops-engineer/scripts/workflow-analyzer.py <workflow>
  2. Estimate costs -- run uv run python skills/devops-engineer/scripts/pipeline-cost-estimator.py <workflow>
  3. Load techniques -- read references/pipeline-optimization.md

Optimization Opportunities

  1. Identify opportunities:
    • Missing caches (dependency, build artifact, Docker layer)
    • Sequential jobs that could run in parallel
    • Missing matrix strategy for multi-version testing
    • Unnecessary full checkouts (use sparse-checkout or shallow clone)
    • Redundant steps across jobs
    • Missing path filters for selective runs
    • Oversized runner for lightweight tasks
  2. Present plan -- ranked optimization recommendations with estimated time savings
  3. Implement -- apply approved optimizations to the workflow file
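
To attach an estimated time saving to a "sequential jobs that could run in parallel" recommendation, one option is to compare the longest dependency chain before and after the change. This is a minimal sketch under stated assumptions: the job names, durations, and `needs` edges are invented, and `wall_clock_minutes` is a hypothetical helper, not part of workflow-analyzer.py.

```python
from functools import lru_cache

# Hypothetical job graphs: durations (minutes) and `needs` edges are made up.
SEQUENTIAL = {
    "lint": {"minutes": 2, "needs": []},
    "test": {"minutes": 8, "needs": ["lint"]},
    "build": {"minutes": 5, "needs": ["test"]},
}
PARALLEL = {
    "lint": {"minutes": 2, "needs": []},
    "test": {"minutes": 8, "needs": []},  # no longer waits for lint
    "build": {"minutes": 5, "needs": ["lint", "test"]},
}


def wall_clock_minutes(jobs: dict) -> int:
    """Length of the longest dependency chain (assumes no cycles, unlimited runners)."""

    @lru_cache(maxsize=None)
    def finish(name: str) -> int:
        job = jobs[name]
        # A job starts once its slowest dependency has finished.
        start = max((finish(dep) for dep in job["needs"]), default=0)
        return start + job["minutes"]

    return max(finish(name) for name in jobs)


saved = wall_clock_minutes(SEQUENTIAL) - wall_clock_minutes(PARALLEL)
print(f"estimated saving: {saved} min")
```

The estimate ignores runner queue time and setup overhead, so it is better treated as an upper bound on the saving than as a measurement.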

Mode 4: Deploy (deploy)

Design deployment strategies with rollback plans.

  1. Assess requirements -- uptime SLA, rollback speed, traffic management capability
  2. Load strategies -- read references/deployment-strategies.md
  3. Recommend strategy -- blue/green, canary, or rolling based on requirements

| Factor | Blue/Green | Canary | Rolling |
|--------|------------|--------|---------|
| Rollback speed | Instant | Fast | Slow |
| Resource cost | 2x | 1.1-1.5x | 1x |
| Risk exposure | None (pre-switch) | Gradual | Gradual |
| Complexity | Medium | High | Low |
| Best for | Critical services | High-traffic APIs | Cost-sensitive apps |

  4. Generate -- deployment workflow with health checks, gates, and rollback triggers
  5. Document -- runbook with rollback procedure and escalation path
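
As an illustration of how health checks and a rollback trigger fit together in a canary strategy, the following sketch steps through the 1% -> 5% -> 25% -> 100% progression from the vocabulary table. The callbacks `set_traffic` and `is_healthy` are hypothetical stand-ins for whatever the traffic manager and monitoring actually expose.

```python
from typing import Callable, Sequence

CANARY_STEPS = (1, 5, 25, 100)  # percentages from the vocabulary table


def run_canary(
    set_traffic: Callable[[int], None],
    is_healthy: Callable[[], bool],
    steps: Sequence[int] = CANARY_STEPS,
) -> bool:
    """Shift traffic step by step; roll back to 0% on the first failed health check."""
    for percent in steps:
        set_traffic(percent)
        if not is_healthy():
            set_traffic(0)  # rollback trigger: all traffic returns to the old version
            return False
    return True


# Usage with stand-in callbacks: the health check fails once the canary reaches 25%.
history: list[int] = []
ok = run_canary(history.append, lambda: history[-1] < 25)
print(ok, history)
```

A real rollout would usually wait for a soak period between steps and might require a manual gate before the final step. Both are omitted here for brevity.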

Mode 5: Review (review)

Audit an existing CI/CD pipeline for issues and improvements.

Audit Process

  1. Read workflow -- parse the target workflow file(s)
  2. Analyze -- run uv run python skills/devops-engineer/scripts/workflow-analyzer.py <workflow>
  3. Load checklists -- read references/pipeline-review-checklist.md

Evaluation Dimensions

  1. Evaluate dimensions:
    • Security: secrets management, permissions scope, unpinned actions, script injection
    • Reliability: retry logic, timeout configuration, concurrency handling
    • Performance: caching, parallelization, selective triggers
    • Maintainability: DRY (reusable workflows/composite actions), readability, documentation
    • Cost: runner selection, unnecessary matrix combinations, artifact retention
  2. Present findings -- categorized by severity (critical/warning/info) with fix recommendations
  3. Implement -- apply approved fixes
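
One way to present findings by severity is sketched below. The `Finding` fields and the report layout are assumptions made for illustration, and the sample findings are invented. Only the critical/warning/info levels come from the steps above.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"critical": 0, "warning": 1, "info": 2}


@dataclass(frozen=True)
class Finding:
    severity: str   # "critical", "warning", or "info"
    dimension: str  # e.g. "security", "reliability"
    message: str
    fix: str


def format_report(findings: list[Finding]) -> str:
    """Group findings by severity, most severe first, with the fix on each line."""
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity])
    lines = []
    current = None
    for f in ordered:
        if f.severity != current:
            current = f.severity
            lines.append(f"[{current.upper()}]")
        lines.append(f"  {f.dimension}: {f.message} -> {f.fix}")
    return "\n".join(lines)


# Example findings are invented for illustration.
report = format_report([
    Finding("info", "cost", "artifacts kept 90 days", "lower retention-days"),
    Finding("critical", "security", "action pinned to a tag", "pin to a full commit SHA"),
    Finding("warning", "reliability", "job has no timeout", "add timeout-minutes"),
])
print(report)
```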

Mode 6: Debug (debug)

Analyze CI failure logs to identify root causes and fixes.

  1. Ingest logs -- read provided log file or inline content. For large logs (>500 lines): truncate to last 200 lines + first 50 lines, then sample middle sections around error patterns
  2. Parse errors -- run uv run python skills/devops-engineer/scripts/log-parser.py <logfile>
  3. Load triage protocol -- read references/ci-failure-triage.md
  4. Classify failures by category:

| Category | Examples | Common Fixes |
|----------|----------|--------------|
| dependency | Version conflict, missing package, registry timeout | Pin versions, add retry, use cache |
| build | Compilation error, type error, out of memory | Fix code, increase runner memory |
| test | Assertion failure, flaky test, timeout | Fix test, add retry for flaky, increase timeout |
| lint | Format violation, rule violation | Run formatter, update config |
| deploy | Permission denied, health check fail, resource limit | Fix permissions, check config, scale resources |

  5. Trace root cause -- follow error chain to the originating failure
  6. Recommend fix -- specific actionable steps with code/config changes
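
The log-ingestion step (first 50 lines, last 200 lines, plus samples around error patterns) could look roughly like this. It is a sketch, not the behavior of log-parser.py: the error regex, the `context` window, and the omission markers are all assumptions.

```python
import re

# Assumed error patterns; a real triage protocol would likely use a richer set.
ERROR_PATTERN = re.compile(r"error|failed|exception|fatal", re.IGNORECASE)


def sample_log(lines: list[str], head: int = 50, tail: int = 200,
               limit: int = 500, context: int = 3) -> list[str]:
    """Keep the first `head` and last `tail` lines plus context around errors in between."""
    if len(lines) <= limit:
        return lines  # small logs are analyzed whole
    middle_end = len(lines) - tail
    keep = set(range(head)) | set(range(middle_end, len(lines)))
    for i in range(head, middle_end):
        if ERROR_PATTERN.search(lines[i]):
            # Keep a small window around each error line in the middle section.
            keep.update(range(max(head, i - context), min(middle_end, i + context + 1)))
    sampled = []
    previous = -1
    for i in sorted(keep):
        if i != previous + 1:
            sampled.append(f"... [{i - previous - 1} lines omitted] ...")
        sampled.append(lines[i])
        previous = i
    return sampled
```

The omission markers keep the sampled log honest about what was dropped, which helps when tracing an error chain back across a gap.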

Reference Files

Load ONE reference at a time. Do not preload all references into context.

| File | Content | Read When |
|------|---------|-----------|
| references/github-actions-patterns.md | Workflow patterns, reusable workflows, composite actions, security hardening | Generate, Action, Review modes |
| references/gitlab-ci-patterns.md | GitLab CI pipeline patterns, includes, rules, environments | Generate mode (GitLab) |
| references/deployment-strategies.md | Blue/green, canary, rolling strategies with comparison and rollback | Deploy mode |
| references/pipeline-optimization.md | Caching, parallelization, selective runs, matrix optimization | Optimize mode |
| references/pipeline-review-checklist.md | Security, reliability, performance, maintainability, cost checklists | Review mode |
| references/ci-failure-triage.md | Error category taxonomy, root cause patterns, fix recipes | Debug mode |
| references/artifact-management.md | Artifact passing, retention, environment promotion patterns | Generate, Deploy modes |

| Script | When to Run |
|--------|-------------|
| scripts/workflow-analyzer.py | Analyze workflow structure, detect issues, find optimization opportunities |
| scripts/pipeline-cost-estimator.py | Estimate CI minutes and identify cost savings |
| scripts/log-parser.py | Extract actionable errors from CI failure logs |

| Template | When to Render |
|----------|----------------|
| templates/dashboard.html | After analysis -- inject pipeline health data into the dashboard |

Critical Rules

  1. Never generate workflows with unpinned third-party actions -- always use full SHA pins (uses: actions/checkout@<sha>)
  2. Never combine pull_request_target with an actions/checkout of the PR head -- it runs untrusted PR code with access to repository secrets (script injection risk)
  3. Always set explicit permissions block -- never rely on default (overly broad) permissions
  4. Never hardcode secrets in workflow files -- use ${{ secrets.NAME }} or environment variables
  5. Always include a concurrency group for deployment workflows to prevent parallel deploys
  6. Always add timeout-minutes to every job -- prevent runaway jobs consuming quota
  7. Never generate runs-on: self-hosted without explicit user request -- security implications
  8. Always validate generated YAML by running workflow-analyzer.py before presenting
  9. Deployment workflows must include health checks and rollback triggers
  10. Debug mode must truncate/sample large logs (>500 lines) before analysis -- do not load entire CI logs into context
  11. Review mode is read-only until user approves fixes (approval gate)
  12. Load ONE reference file at a time -- do not preload all references into context
  13. Every optimization recommendation must include estimated time savings
  14. Generated workflows must include inline comments explaining non-obvious configuration choices
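
Several of these rules can be spot-checked with plain text matching. The sketch below is a coarse, hypothetical check and not what workflow-analyzer.py does: it uses regexes rather than a YAML parser, so it cannot confirm that every job has a timeout, and the 40-zero SHA is a placeholder, not a real commit.

```python
import re

USES = re.compile(r"^[ \t]*-?[ \t]*uses:[ \t]*([^\s#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"@[0-9a-f]{40}$")
PLACEHOLDER_SHA = "0" * 40  # placeholder only, not a real commit


def check_rules(workflow_text: str) -> list[str]:
    """Coarse text-level check of rules 1, 3, 6, and 7."""
    problems = []
    for raw in USES.findall(workflow_text):
        ref = raw.strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and Docker images are out of scope here
        if not FULL_SHA.search(ref):
            problems.append(f"unpinned action: {ref}")
    if not re.search(r"^[ \t]*permissions:", workflow_text, re.MULTILINE):
        problems.append("no explicit permissions block")
    if "timeout-minutes:" not in workflow_text:
        problems.append("no timeout-minutes found")
    if re.search(r"runs-on:.*self-hosted", workflow_text):
        problems.append("self-hosted runner: confirm the user requested it")
    return problems


# A minimal workflow that passes the checks above.
GOOD = f"""\
name: ci
on: [push]
permissions:
  contents: read
jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@{PLACEHOLDER_SHA}
      - run: make test
"""
BAD = GOOD.replace(PLACEHOLDER_SHA, "v4").replace("permissions:\n  contents: read\n", "")

print(check_rules(GOOD))
print(check_rules(BAD))
```

A parser-based check would be needed to enforce the per-job and per-deployment rules (timeouts on every job, concurrency groups on deploy workflows) reliably.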

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.
