Terraform Skill for Claude
Diagnose-first guidance for Terraform and OpenTofu. Core file is a workflow; depth lives in references loaded on demand.
Response Contract
Every Terraform/OpenTofu response must include:
- Assumptions & version floor — runtime (
terraformortofu), exact version, providers, state backend, execution path (local/CI/Cloud/Atlantis), environment criticality. State assumptions explicitly if the user did not provide them. - Risk category addressed — one or more of: identity churn, secret exposure, blast radius, CI drift, compliance gaps, state corruption, provider upgrade risk, testing blind spots.
- Chosen remediation & tradeoffs — what was chosen, what was traded off, why.
- Validation plan — exact commands (
fmt -check,validate,plan -out, policy check) tailored to runtime and risk tier. - Rollback notes — for any destructive or state-mutating change: how to undo, what evidence to keep.
Never recommend direct production apply without a reviewed plan artifact and approval.
Workflow
- Capture execution context — runtime+version, provider(s), backend, execution path, environment criticality.
- Diagnose failure mode(s) using the routing table below. If intent spans categories, load both references.
- Load only the matching reference file(s) — do not preload depth the task does not need.
- Propose fix with risk controls — why this addresses the mode, what could still go wrong, guardrails (tests/approvals/rollback).
- Generate artifacts — HCL, migration blocks (
moved,import), CI changes, policy rules. - Validate before finalizing — run validation commands tailored to risk tier.
- Emit the Response Contract at the end.
Diagnose Before You Generate
| Failure category | Symptoms | Primary references |
|---|---|---|
| Identity churn | Resource addresses shift after refactor, count index churn, missing moved blocks | Code Patterns: count vs for_each, Code Patterns: moved blocks, Code Patterns: LLM mistakes |
| Secret exposure | Secrets in defaults, state, logs, CI artifacts | Security & Compliance, Code Patterns: write-only, State Management |
| Blast radius | Oversized stacks, shared prod/non-prod state, unsafe applies | State Management, Module Patterns |
| CI drift | Local plan ≠ CI plan, apply without reviewed artifact, unpinned versions | CI/CD Workflows, Code Patterns: versions |
| Compliance gaps | Missing policy stage, no approval model, no evidence retention | Security & Compliance, CI/CD Workflows |
| Testing blind spots | Plan-only validation of computed values, set-type indexing, mock/real confusion | Testing Frameworks |
| State corruption / recovery | Stuck lock, backend migration, drift reconciliation | State Management |
| Provider upgrade risk | Breaking-change provider bump, unpinned modules | Code Patterns: versions, Module Patterns |
| Provider lifecycle | Removing a provider with resources still in state, orphaned resources, removed block usage | State Management: Provider Removal |
When to Use This Skill
Activate when: creating or reviewing Terraform/OpenTofu configurations or modules, setting up or debugging tests, structuring multi-environment deployments, implementing IaC CI/CD, choosing module patterns or state organization, configuring or migrating remote state backends.
Don't use for: basic HCL syntax questions Claude already knows, provider API reference (link to docs), cloud-platform questions unrelated to Terraform/OpenTofu.
Core Principles
Module Hierarchy
| Type | When to Use | Scope |
|---|---|---|
| Resource module | Single logical group of connected resources | VPC + subnets, SG + rules |
| Infrastructure module | Collection of resource modules for a purpose | Multiple resource modules in one region/account |
| Composition | Complete infrastructure | Spans multiple regions/accounts |
Flow: resource → resource module → infrastructure module → composition.
Directory Layout
environments/ # prod/ staging/ dev/ — per-env configurations
modules/ # networking/ compute/ data/ — reusable modules
examples/ # minimal/ complete/ — docs + integration fixtures
Separate environments from modules. Use examples/ as both documentation and test fixtures. Keep modules small and single-responsibility.
See Module Patterns for architecture principles, naming conventions, variable/output contracts.
Naming Conventions (summary)
- Descriptive resource names (
aws_instance.web_server, notaws_instance.main) - Reserve
thisfor genuine singleton resources only - Prefix variables with context (
vpc_cidr_block, notcidr) - Standard files:
main.tf,variables.tf,outputs.tf,versions.tf
See Module Patterns: Variable Naming and Code Patterns: Block Ordering for examples.
Block Ordering (summary)
Resource blocks: count/for_each first → arguments → tags → depends_on → lifecycle.
Variable blocks: description → type → default → validation → nullable → sensitive.
See Code Patterns: Block Ordering & Structure for the full rules and examples.
Testing Strategy
Decision Matrix: Which Testing Approach?
| Situation | Approach | Tools | Cost |
|---|---|---|---|
| Quick syntax check | Static analysis | validate, fmt | Free |
| Pre-commit validation | Static + lint | validate, tflint, trivy, checkov | Free |
| Terraform 1.6+, simple logic | Native test framework | terraform test | Free-Low |
| Pre-1.6, or Go expertise | Integration testing | Terratest | Low-Med |
| Security/compliance focus | Policy as code | OPA, Sentinel | Free |
| Cost-sensitive workflow | Mock providers (1.7+) | Native tests + mocks | Free |
| Multi-cloud, complex | Full integration | Terratest + real infra | Med-High |
Native Test Rules (1.6+)
Before writing test code: validate resource schemas via Terraform MCP so assertions target real attributes.
command = plan— fast, for input-derived values onlycommand = apply— required for computed values (ARNs, generated names) and set-type nested blocks- Set-type blocks cannot be indexed with
[0]— useforexpressions or materialize viacommand = apply - Common set types: S3 encryption rules, lifecycle transitions, IAM policy statements
See Testing Frameworks for static-analysis pipelines, native-test patterns, Terratest integration, mock providers, and the full LLM-mistake checklist.
Count vs For_Each — Quick Rule
| Scenario | Use | Why |
|---|---|---|
| Boolean condition (create / don't) | count = condition ? 1 : 0 | Optional singleton toggle |
| Items may be reordered or removed | for_each = toset(list) | Stable resource addresses |
| Reference by key | for_each = map | Named access |
| Multiple named resources | for_each | Better identity stability |
Never use list index as long-lived identity — removing a middle element reshuffles every address after it. For the decision matrix, safe migration playbook, moved block patterns, and known-at-plan failure cases, see Code Patterns: count vs for_each.
Locals for Dependency Management
Using try() in a local to prefer a conditional resource's attribute over its parent is a specialized but high-value pattern — it forces correct deletion order without explicit depends_on. Common use: VPC + secondary CIDR associations + subnets.
See Code Patterns: Locals for Dependency Management for the full pattern and worked example.
Module Development
Standard layout:
my-module/
├── README.md # Usage documentation
├── main.tf # Primary resources
├── variables.tf # Typed inputs with descriptions
├── outputs.tf # Output values
├── versions.tf # required_version + required_providers
├── examples/
│ ├── minimal/
│ └── complete/
└── tests/
└── module_test.tftest.hcl # or Go for Terratest
Variable contracts: always description, always explicit type, use validation for complex constraints, use sensitive = true for secrets, prefer optional() with typed defaults (1.3+) over untyped map(any).
Output contracts: always description, mark sensitive outputs, expose stable subsets (not whole provider objects).
See Module Patterns for the full contract patterns, module release checklist, and LLM-mistake checklist.
CI/CD
Pipeline stages: validate → test → plan → apply (with environment protection).
Cost control: mock providers on PR validation, real-cloud integration only on main or scheduled, tag test resources, auto-cleanup.
Drift prevention: pin runtime and providers, commit .terraform.lock.hcl, apply the reviewed plan artifact from the plan stage (do not re-run plan inside the apply job), run policy/security stage on every path to apply.
See CI/CD Workflows for GitHub Actions, GitLab CI, and Atlantis templates plus the LLM-mistake checklist.
Security & Compliance
Essential checks:
trivy config .
checkov -d .
Don't: store secrets in variables or .tfvars, use default VPC, skip encryption, open security groups to 0.0.0.0/0, use inline ingress/egress blocks in aws_security_group.
Do: source secrets from AWS Secrets Manager / Parameter Store or use write_only arguments on 1.11+, create dedicated VPCs, enforce encryption at rest and TLS, least-privilege SGs, use separate aws_vpc_security_group_{ingress,egress}_rule resources (AWS provider v5+).
Marking a variable sensitive = true masks display only — the value still lives in state. Use write_only / *_wo on 1.11+, or keep secret material out of Terraform entirely via runtime lookups.
See Security & Compliance for trivy/checkov pipelines, state-file hardening, compliance mappings, and the LLM-mistake checklist.
State Management
Never use local state in teams or production. Remote backends provide automatic locking, encryption, versioning, audit logging, and safe collaboration.
Minimum Viable Backend (AWS S3, 1.10+)
terraform {
backend "s3" {
bucket = "my-terraform-state"
key = "prod/vpc/terraform.tfstate"
region = "us-east-1"
encrypt = true
use_lockfile = true # Native S3 locking, 1.10+
}
}
On Terraform < 1.10, use dynamodb_table = "terraform-state-lock" instead of use_lockfile. Azure Storage, GCS, and Terraform Cloud all offer built-in locking — see the State Management reference for syntax.
State Organization
| Pattern | Use When | Example Path |
|---|---|---|
| Per environment | Different teams per env | prod/terraform.tfstate, staging/... |
| Per component | Independent lifecycles | prod/vpc/, prod/eks/, prod/rds/ |
| Hybrid (recommended) | Both benefits | prod/networking/, prod/compute/, staging/networking/ |
Split state when: different teams, different update cadences, or >500 resources. Combine when: tightly coupled resources, <100 resources, same lifecycle.
See State Management for locking, migration, multi-team isolation, disaster recovery, and the LLM-mistake checklist.
Version Management
| Component | Strategy | Example |
|---|---|---|
| Terraform runtime | Pin minor | required_version = "~> 1.9" |
| Providers | Pin major | version = "~> 5.0" |
| Modules (prod) | Pin exact | version = "5.1.2" |
| Modules (dev) | Allow patch | version = "~> 5.1" |
Commit .terraform.lock.hcl intentionally. Keep provider/runtime upgrades in a separate PR from functional changes. See Code Patterns: Version Management for constraint syntax and upgrade workflow.
Modern Terraform Features (1.0+)
| Feature | Min version | Common use |
|---|---|---|
try() | 0.13+ | Safe fallbacks, replaces element(concat()) |
nullable = false | 1.1+ | Prevent null silently overriding defaults |
moved blocks | 1.1+ | Refactor without destroy/recreate |
optional() with defaults | 1.3+ | Typed object attributes |
import blocks | 1.5+ | Declarative imports, reviewable in VCS |
check blocks | 1.5+ | Runtime assertions |
Native terraform test | 1.6+ | Built-in test framework |
| Mock providers | 1.7+ | Cost-free unit testing |
removed blocks | 1.7+ | Declarative resource removal |
| Provider-defined functions | 1.8+ | Provider-specific transformations (requires provider to declare functions) |
| Cross-variable validation | 1.9+ | Reference other var.* in validation blocks |
write_only arguments | 1.11+ | Secrets never stored in state |
| S3 native lock-file | 1.10+ | State locking without DynamoDB |
Before emitting a feature, verify the runtime floor. See Code Patterns: Feature Guard Table for the full table with common LLM error patterns per feature.
Runtime-Specific Guidance
- Terraform 1.0-1.5 / OpenTofu 1.0-1.5: Terratest for integration, static analysis + plan validation only (no native tests).
- 1.6+: native
terraform test/tofu testavailable — migrate simple unit tests, keep Terratest for complex integration. - 1.7+: mock providers cut test cost — mock for unit tests, real runs for final integration.
- 1.10+: S3 native lock-file (
use_lockfile) is the correct default for new configurations — DynamoDB locking is no longer required. - 1.11+:
write_onlyarguments for secret handling keep credentials out of state. - Terraform vs OpenTofu: both supported. For licensing, governance, and feature delta, see Quick Reference: Terraform vs OpenTofu.
Reference Files
Progressive disclosure — essentials here, depth on demand:
- Testing Frameworks — static analysis, native tests, Terratest, mock providers
- Module Patterns — structure, variable/output contracts,
terraform_remote_staterules, release checklist - CI/CD Workflows — GitHub Actions, GitLab CI, Atlantis, cost control
- Security & Compliance — trivy/checkov, secrets handling, compliance mappings
- State Management — backends, locking, migration, multi-team, recovery
- Code Patterns — block ordering,
count/for_eachdeep dive, modern features, version management, locals - Quick Reference — command cheat sheets, flowcharts, troubleshooting
License
Apache License 2.0. See LICENSE for full terms.
Copyright © 2026 Anton Babenko