design-and-build

Design and Build

Overview

This skill implements the Technical Solution (TS), Product Integration (PI), and Configuration Management (CM) process areas from the CMMI-based SDLC prescription.

Core principle: Architecture decisions require documentation (ADRs). Emergency shortcuts require retrospective documentation. "Best practice" is never justification - requirements are.

Reference: See docs/sdlc-prescription-cmmi-levels-2-4.md Section 3.2 for complete policy and practice definitions.

When to Use

Use this skill when:

Making architecture or design decisions (technology choice, patterns, structure)
Setting up build/integration systems (CI/CD, deployment pipelines)
Managing technical debt (tracking, prioritization, paydown)
Establishing configuration management (branching strategy, release process)
Facing "should we use X?" questions where X is trendy technology
Team experiencing git chaos, integration hell, or debt spiral

Do NOT use for:

Implementation details within existing architecture → Use domain-specific skills (python-engineering, web-backend)
Testing strategy → Use quality-assurance skill
Requirements or specification → Use requirements-lifecycle skill

Quick Reference

Situation Primary Reference Sheet Key Decision

"Should we use microservices?" Architecture & Design Requires ADR. Use decision framework: team size, domain complexity, ops maturity.

"Git workflow is chaos" Configuration Management Diagnose root cause first. GitFlow (L3 releases) vs GitHub Flow (continuous) vs Trunk (high maturity). Requires ADR.

"70% of time on bugs" Technical Debt Management CODE RED. Feature freeze, architectural audit, classify debt (architectural/tactical/unpayable).

"Setting up CI/CD" Build & Integration Requirements gathering first. Platform choice requires ADR. Start simple, add stages incrementally.

"Quick fix vs proper solution" Level 2→3→4 Scaling Level 3 requires retrospective ADR within 48 hours. Document as HOTFIX with paydown commitment.

Level-Based Governance

CRITICAL: This skill enforces governance based on project maturity level.

Level Detection

Check for project level in:

CLAUDE.md : CMMI Target Level: 3
User message: "This is a Level 3 project..."
Default if unspecified: Level 3

ADR Requirements by Level

Level ADR Required For Exception Protocol

Level 2 Major architecture decisions (platform choice, deployment strategy) Informal discussion OK, document decision in wiki/README

Level 3 ALL architectural decisions (tech stack, branching strategy, design patterns, CI/CD platform) Emergency HOTFIX: Retrospective ADR within 48 hours mandatory

Level 4 Everything in L3 + quantitative justification with metrics No exceptions - statistical baselines required

Emergency Exception Protocol (HOTFIX Pattern)

When: Production emergency, immediate fix needed, no time for full ADR process

Level 3 Requirements:

Fix the emergency (restore service)
Document the fix in issue tracker with "HOTFIX" label
Create retrospective ADR within 48 hours:
Incident timeline
Why proper fix wasn't feasible
Technical debt introduced
Paydown commitment with date (max 2 weeks)
Track paydown as high-priority ticket

Violation: Skipping retrospective ADR = governance failure. See Enforcement section below.

HOTFIX Frequency Limit: >5 HOTFIXes per month = systemic problem requiring architectural audit, not process exception.

What Counts as "Architectural Decision"? (Level 3)

Architectural decisions (ADR required):

Technology platform choice (language, framework, database, cache, message queue)
Branching strategy (GitFlow, GitHub Flow, trunk-based)
CI/CD platform (GitHub Actions, Azure Pipelines, Jenkins)
Deployment strategy (blue/green, canary, rolling)
Design patterns with broad impact (event-driven, microservices, monolith)
Module boundaries and interfaces (how system decomposes)
Authentication/authorization approach (OAuth, JWT, session-based)
Data storage strategy (SQL vs NoSQL, caching strategy, data partitioning)

Implementation details (no ADR required, track in code/PR):

Specific library choice within chosen framework (e.g., logging library in Python)
Variable/function naming conventions
Code organization within module (file structure)
Test framework choice (if it doesn't affect architecture)

Borderline decisions (when in doubt, write ADR):

If change affects >3 modules or files → ADR
If reversal would take >1 day → ADR
If future developers would ask "why did they choose this?" → ADR

Enforcement and Escalation

Level 3 Requirements:

Platform enforcement: Branch protection, CI gates, ADR linking
Process enforcement: ADR review before implementation, HOTFIX tracking
Metrics: % architectural changes with ADRs (target: 100%)
Violations escalate: Team lead → Engineering manager → Governance committee

For detailed enforcement mechanisms, escalation paths, and compliance metrics, see level-scaling.md .

Anti-Patterns and Red Flags

Resume-Driven Design

Detection: User says "I've heard X is best practice" or "Everyone uses Y"

Red Flags:

Technology mentioned before requirements
Appeal to popularity ("everyone uses microservices")
Buzzword bingo (serverless, Kubernetes, blockchain)
Can't articulate WHAT PROBLEM they're solving

Counter:

"What requirements drive this choice?"
"What alternatives exist?"
"Why is current approach inadequate?"
If they can't answer → "You're doing resume-driven design. Requirements first, technology second."

Forcing function: Require ADR with alternatives analysis. If they can't justify with measurable requirements, ADR review will reject it.

Architecture Astronaut

Detection: Over-engineered solution for simple problem

Red Flags:

Microservices for CRUD app
Service mesh for 2 services
Event sourcing for simple data model
"Future-proof" without concrete future requirements

Counter: "What's the SIMPLEST solution that meets requirements? Start there. Add complexity when demonstrated need exists, not hypothetically."

Cowboy Coding

Detection: No reviews, no standards, "works on my machine"

Red Flags:

Skipping pull requests
Force pushing to main
No CI/CD
"I'll add tests later"

Counter: Enforce basic CM practices. Level 2 minimum: branch protection, required PR reviews, CI runs on PRs.

Debt Spiral

Detection: Increasing % of time on bugs, velocity declining

Red Flags:

50% time on bugs = WARNING

60% time on bugs = CODE RED

Every feature breaks something else
Team morale declining

Counter: See Technical Debt Management reference sheet. CODE RED triggers feature freeze.

Reference Sheets

Load these on-demand for detailed guidance:

Reference Sheet When to Use Link

Architecture & Design Making technology choices, selecting patterns, designing system structure architecture-and-design.md

Implementation Standards Establishing coding standards, code review process, documentation requirements implementation-standards.md

Configuration Management Git chaos, branching strategy decisions, release management configuration-management.md

Build & Integration Setting up CI/CD, build optimization, deployment pipelines build-and-integration.md

Technical Debt Management Team spending >40% time on bugs, debt accumulating, velocity declining technical-debt-management.md

Level 2→3→4 Scaling Understanding what rigor is appropriate for your project tier level-scaling.md

Common Mistakes

Mistake Why It Fails Better Approach

"Emergency exempts process" Creates pattern where "urgent" = skip governance, accumulating undocumented debt Use HOTFIX pattern: retrospective ADR within 48 hours, mandatory

"I'll document it later" Later never comes, loses audit trail Document NOW (ADR takes 15 min) or schedule retrospective (48 hours max)

"This is too simple for ADR" Simple decisions have big impact, lose rationale for future If it's truly simple, ADR takes 10 min. If it takes longer, it wasn't simple.

"Everyone uses X, so we should" Resume-driven design, not requirements-driven Require measurable justification. "Everyone" is not a requirement.

"20% debt allocation" when 70% bugs Treats crisis as normal problem, ensures slow death

60% bugs = CODE RED. Feature freeze, not incremental paydown.

"Pick branching strategy" without diagnosis Treats symptom (conflicts) not cause (architecture? communication?) Root cause analysis FIRST. Git strategy is symptom, not disease.

Generic CI/CD template without context Wastes time on wrong solution Requirements gathering FIRST: build characteristics, deployment context, risk profile

Integration with Other Skills

When You're Doing Also Use For

Designing Python architecture axiom-python-engineering

Python-specific patterns and idioms

Designing web API axiom-web-backend

REST/GraphQL best practices

Making architecture decision governance-and-risk

Formal DAR process for critical choices

Setting up testing quality-assurance

Test strategy and coverage

Choosing platforms platform-integration

GitHub vs Azure DevOps specifics

Real-World Impact

Without this skill: Teams experience:

Undocumented architecture decisions that haunt future developers
Resume-driven design leading to over-engineered solutions
Git chaos with daily conflicts and lost work
Debt spirals consuming 70%+ of time
Emergency shortcuts becoming permanent anti-patterns

With this skill: Teams achieve:

Defensible audit trail through ADRs
Technology choices driven by requirements, not hype
Structured configuration management reducing conflicts by 80%+
Early crisis detection preventing debt spirals
Emergency protocols that maintain governance without blocking urgency

Next Steps

Determine project level: Check CLAUDE.md or ask user for CMMI target level (default: Level 3)
Identify situation: Use Quick Reference table to find relevant reference sheet
Load reference sheet: Read detailed guidance for specific domain
Enforce ADR requirements: Level 3 requires ADR for architectural decisions - no exceptions without HOTFIX protocol
Apply decision frameworks: Use systematic evaluation, not gut feelings or hype
Counter anti-patterns: Watch for resume-driven design, debt spirals, cowboy coding
Measure success: Establish baseline, set targets, schedule retrospectives

Remember: "Best practice" is never justification. Requirements are. If you can't articulate the requirement, you can't justify the architecture.

design-and-build

Safety Notice

Copy this and send it to your AI assistant to learn

Source Transparency

Related Skills

using-system-architect

using-ml-production

using-software-engineering