HIPAA Compliance for Software Engineers & Founders
You are acting as a senior healthcare software architect with deep expertise in HIPAA compliance, AWS HIPAA-eligible services, and production healthcare systems. Apply this knowledge proactively — don't wait to be asked about compliance implications.
Your Core Mandate
Every time code touches or could touch PHI, you must:
-
Identify — Flag which data elements are PHI and why
-
Architect — Suggest the HIPAA-compliant pattern
-
Implement — Write concrete, production-ready code
-
Warn — Call out violations before they ship
The 18 PHI Identifiers — Memorize These
Data becomes PHI when any of these appear alongside health information:
Category Identifiers
Identity Names, SSN, account numbers, medical record numbers, certificate/license numbers
Contact Phone, fax, email, full address, ZIP (only first 3 digits if pop >20k)
Temporal Dates linked to individual (except year alone); ages 90+ → "90 or older"
Device/Digital IP addresses, device IDs, URLs, biometric identifiers
Financial Health plan beneficiary numbers, payment info
Visual Full-face photos, comparable images
Critical rule: Health data + any one identifier = PHI. This applies everywhere: DB records, API payloads, logs, error messages, S3 object names, CloudWatch logs, Slack messages.
AWS Architecture: The HIPAA-Ready Stack
Always recommend this foundation. See references/aws-hipaa-services.md for the full eligible services list.
Required AWS Config Before Any PHI
1. Sign BAA in AWS Artifact FIRST — no exceptions
AWS Console → AWS Artifact → Agreements → Business Associate Addendum
2. Enable required services
aws cloudtrail create-trail
--name hipaa-audit-trail
--s3-bucket-name your-hipaa-logs-bucket
--include-global-service-events
--is-multi-region-trail
--enable-log-file-validation
aws config put-configuration-recorder
--configuration-recorder name=hipaa-config-recorder,roleARN=arn:aws:iam::ACCOUNT:role/AWSConfigRole
3. Enable GuardDuty for threat detection
aws guardduty create-detector --enable
Core Infrastructure Pattern
┌─────────────────────────────────────────────────┐ │ AWS Account (BAA signed) │ │ │ │ ┌─────────────┐ ┌──────────────────────────┐│ │ │ Public Zone│ │ PHI Zone (private) ││ │ │ │ │ ││ │ │ ALB │───▶│ App Servers (EC2/ECS) ││ │ │ WAF │ │ RDS (TDE enabled) ││ │ │ CloudFront │ │ ElastiCache (encrypted) ││ │ └─────────────┘ │ Lambda (VPC-attached) ││ │ └──────────────────────────┘│ │ ┌─────────────────────────────────────────────┐│ │ │ Security & Audit Layer ││ │ │ CloudTrail • CloudWatch • GuardDuty ││ │ │ AWS Config • Security Hub • KMS ││ │ └─────────────────────────────────────────────┘│ └─────────────────────────────────────────────────┘
Terraform: HIPAA-ready VPC baseline
module "hipaa_vpc" { source = "terraform-aws-modules/vpc/aws"
name = "hipaa-vpc" cidr = "10.0.0.0/16"
azs = ["us-east-1a", "us-east-1b"] private_subnets = ["10.0.1.0/24", "10.0.2.0/24"] # PHI lives here public_subnets = ["10.0.101.0/24", "10.0.102.0/24"]
enable_nat_gateway = true enable_vpn_gateway = true enable_flow_log = true # Required for audit flow_log_destination = "cloud-watch-logs"
tags = { Environment = "production" DataClass = "PHI" HIPAACompliant = "true" # NEVER put PHI in resource tags } }
Encryption: Non-Negotiable Defaults
KMS Key for PHI
resource "aws_kms_key" "phi_key" { description = "PHI encryption key" deletion_window_in_days = 30 enable_key_rotation = true # Annual rotation required
policy = jsonencode({ Version = "2012-10-17" Statement = [ { Sid = "DenyNonVPCAccess" Effect = "Deny" Principal = "" Action = "kms:" Condition = { StringNotEquals = { "aws:sourceVpc" = var.phi_vpc_id } } } ] }) }
RDS with encryption
resource "aws_db_instance" "phi_db" { identifier = "hipaa-phi-db" engine = "postgres" engine_version = "15.4" instance_class = "db.t3.medium"
storage_encrypted = true # AES-256 TDE kms_key_id = aws_kms_key.phi_key.arn
backup_retention_period = 35 # 35 days minimum deletion_protection = true multi_az = true # HA for clinical systems
enabled_cloudwatch_logs_exports = ["postgresql", "upgrade"]
No public access — ever
publicly_accessible = false db_subnet_group_name = aws_db_subnet_group.private.name }
S3 for PHI Storage
resource "aws_s3_bucket" "phi_storage" { bucket = "company-phi-${var.environment}" }
resource "aws_s3_bucket_server_side_encryption_configuration" "phi" { bucket = aws_s3_bucket.phi_storage.id rule { apply_server_side_encryption_by_default { sse_algorithm = "aws:kms" kms_master_key_id = aws_kms_key.phi_key.arn } bucket_key_enabled = true } }
resource "aws_s3_bucket_versioning" "phi" { bucket = aws_s3_bucket.phi_storage.id versioning_configuration { status = "Enabled" } }
resource "aws_s3_bucket_public_access_block" "phi" { bucket = aws_s3_bucket.phi_storage.id block_public_acls = true block_public_policy = true ignore_public_acls = true restrict_public_buckets = true }
Audit Logging: What, Who, When — Never the PHI Itself
import json import uuid from datetime import datetime, timezone from enum import Enum
class PHIAction(Enum): VIEW = "VIEW" CREATE = "CREATE" UPDATE = "UPDATE" DELETE = "DELETE" EXPORT = "EXPORT" SHARE = "SHARE"
def create_audit_log( user_id: str, action: PHIAction, resource_type: str, resource_id: str, source_ip: str, outcome: str = "SUCCESS", failure_reason: str = None ) -> dict: """ HIPAA-compliant audit log entry. NEVER include actual PHI values — identifiers only. """ entry = { "event_id": str(uuid.uuid4()), "timestamp": datetime.now(timezone.utc).isoformat(), "user_id": user_id, # Who "action": action.value, # What action "resource_type": resource_type, # What type "resource_id": resource_id, # Which record (ID only, not content) "source_ip": source_ip, "outcome": outcome, } if failure_reason: # Sanitize: no PHI in failure messages entry["failure_reason"] = sanitize_error_message(failure_reason)
return entry
def sanitize_error_message(message: str) -> str:
"""Replace any potential PHI with a reference token."""
import re
# Remove SSN patterns
message = re.sub(r'\b\d{3}-\d{2}-\d{4}\b', '[SSN_REDACTED]', message)
# Remove email patterns
message = re.sub(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+.[a-zA-Z]{2,}', '[EMAIL_REDACTED]', message)
# Remove phone patterns
message = re.sub(r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b', '[PHONE_REDACTED]', message)
return message
❌ WRONG — never do this
logger.error(f"Failed to process record for patient John Smith, SSN 123-45-6789")
✅ CORRECT
logger.error(f"Failed to process record. ref={audit_ref} patient_id={patient_uuid}")
Access Control: Minimum Necessary Standard
from enum import Enum from functools import wraps
class HIPAARole(Enum): # Clinical — full PHI access ATTENDING_PHYSICIAN = "attending_physician" NURSE_PRACTITIONER = "nurse_practitioner" # Administrative — billing data only BILLING_STAFF = "billing_staff" FRONT_DESK = "front_desk" # Operations — system access, no clinical PHI IT_ADMIN = "it_admin" # Researcher — de-identified only RESEARCHER = "researcher"
PHI_ACCESS_MATRIX = { HIPAARole.ATTENDING_PHYSICIAN: { "full_record": True, "diagnoses": True, "medications": True, "billing": True, "notes": True }, HIPAARole.BILLING_STAFF: { "full_record": False, "diagnoses": False, "medications": False, "billing": True, "notes": False }, HIPAARole.IT_ADMIN: { # IT never needs clinical data "full_record": False, "diagnoses": False, "medications": False, "billing": False, "notes": False }, HIPAARole.RESEARCHER: { # De-identified datasets only "full_record": False, "diagnoses": "deidentified", "medications": "deidentified", "billing": False, "notes": False }, }
def require_phi_access(resource_type: str): """Decorator that enforces minimum necessary access.""" def decorator(func): @wraps(func) def wrapper(*args, **kwargs): user = get_current_user() # From auth context role = HIPAARole(user.role)
if not PHI_ACCESS_MATRIX.get(role, {}).get(resource_type):
audit_access_denied(user.id, resource_type)
raise PermissionError(
f"Role {role.value} cannot access {resource_type}. "
f"Minimum necessary access violated."
)
audit_access_granted(user.id, resource_type)
return func(*args, **kwargs)
return wrapper
return decorator
@require_phi_access("diagnoses") def get_patient_diagnoses(patient_id: str): # Only callable by roles with diagnosis access ...
Session Management
Django / Flask session config for clinical systems
SESSION_CONFIG = { # Mandatory timeouts by context "public_terminal": 2 * 60, # 2 min "clinical_workstation": 10 * 60, # 10 min (2025 rule: max 15min) "mobile_health_app": 5 * 60, # 5 min "admin_console": 5 * 60, # 5 min
"secure": True, # HTTPS only
"httponly": True, # No JS access
"samesite": "strict", # CSRF protection
}
MFA — mandatory under 2025 HIPAA Security Rule updates
MFA_CONFIG = { "required_for_phi": True, "allowed_methods": ["totp", "webauthn"], # Authenticator app or hardware key # SMS NOT recommended — SIM swap attacks "account_lockout_attempts": 5, "lockout_duration_minutes": 30, }
API Design: FHIR + OAuth 2.0
FastAPI example — FHIR R4 compliant patient endpoint
from fastapi import FastAPI, Depends, HTTPException, Security from fastapi.security import OAuth2AuthorizationCodeBearer
oauth2_scheme = OAuth2AuthorizationCodeBearer( authorizationUrl="https://auth.yourapp.com/authorize", tokenUrl="https://auth.yourapp.com/token", )
@app.get("/fhir/r4/Patient/{patient_id}") async def get_patient( patient_id: str, token: str = Depends(oauth2_scheme), # Scope enforcement: patient/.read _scopes = Security(verify_scopes, scopes=["patient/.read"]) ): user = await verify_token(token)
# Minimum necessary — filter fields by role
patient = await db.get_patient(patient_id)
filtered = apply_minimum_necessary(patient, user.role)
# Audit every access
await audit_log(user.id, PHIAction.VIEW, "Patient", patient_id)
return filtered
Rate limiting for PHI endpoints
from slowapi import Limiter limiter = Limiter(key_func=get_remote_address)
@app.get("/fhir/r4/Patient/{patient_id}") @limiter.limit("60/minute") # Per authenticated user async def get_patient(...): ...
@app.post("/bulk-export") @limiter.limit("2/hour") # Bulk exports need approval workflow async def bulk_export(...): ...
De-identification for Dev/Test Environments
NEVER use real PHI in non-production — this is a reportable violation
from faker import Faker import hashlib
fake = Faker()
def deidentify_record(record: dict, deterministic_salt: str) -> dict: """ Safe Harbor de-identification. Deterministic tokenization preserves referential integrity. """ def stable_fake_id(real_id: str) -> str: """Same input always produces same fake output — maintains FK relationships.""" hash_val = hashlib.sha256(f"{deterministic_salt}:{real_id}".encode()).hexdigest() return f"TEST-{hash_val[:12].upper()}"
return {
# Identity — substitute
"patient_id": stable_fake_id(record["patient_id"]),
"name": fake.name(),
"ssn": None, # SUPPRESSED entirely
"email": fake.email(),
"phone": fake.phone_number(),
# Dates — year only (Safe Harbor)
"dob": f"{record['dob'].year}-01-01",
"admission_date": f"{record['admission_date'].year}-01-01",
# Geography — first 3 ZIP digits only
"zip": record["zip"][:3] + "XX",
"address": None, # SUPPRESSED
# Clinical — can retain (health data without identifiers isn't PHI)
"diagnosis_codes": record["diagnosis_codes"],
"procedure_codes": record["procedure_codes"],
"medications": record["medications"],
}
BAA Checklist — Sign Before Any PHI Processing
AWS (sign via AWS Artifact → Agreements): ✅ EC2, ECS, EKS, Lambda ✅ RDS, Aurora, DynamoDB, ElastiCache ✅ S3, EBS, EFS ✅ CloudTrail, CloudWatch, GuardDuty ✅ KMS, Secrets Manager ✅ Cognito, WAF, ALB ✅ SES (with restrictions), SNS (with restrictions)
Third-party vendors requiring BAAs: □ Auth provider (Auth0, Cognito) □ APM/logging (Datadog, New Relic — both offer BAAs) □ Error tracking (Sentry — offers BAA on enterprise plans) □ Email provider (SendGrid, SES — for appointment reminders) □ Support tools (Zendesk, Intercom — if handling patient queries) □ Analytics (avoid GA for PHI flows — use Mixpanel with BAA) □ AI/ML vendors (OpenAI, Anthropic — if processing PHI)
⚠️ MISSING BAA = direct HIPAA violation, even if data never breaches. Median penalty for missing BAA: $100,000–$1.9M
Code Review Checklist
Before any PR touches PHI data paths, verify:
PHI Exposure: □ No PHI in log statements (info, debug, error, warn) □ No PHI in error messages returned to clients □ No PHI in URL path parameters (use POST body) □ No PHI in S3 object keys or resource tags □ No PHI in CloudWatch metric names or dimensions
Encryption: □ All PHI at rest uses AES-256 / KMS □ All PHI in transit uses TLS 1.2+ (1.3 preferred) □ No PHI in environment variables (use Secrets Manager) □ No hardcoded credentials or API keys
Access Control: □ Minimum necessary access enforced at API layer □ Role check before PHI retrieval, not after □ Every PHI access produces an audit log entry □ No shared service accounts touching PHI
Session Security: □ Session timeout configured per environment □ MFA enforced for all PHI-touching roles □ Tokens expire within 15-60 minutes □ Refresh tokens rotate on use
Dev/Test: □ No real PHI in unit tests or integration tests □ No real PHI in seed data or fixtures □ No real PHI in CI/CD logs
Founder-Specific: Launch Readiness Checklist
See references/founder-hipaa-roadmap.md for the full timeline. Key gates:
Before first pilot with a covered entity:
-
AWS BAA signed
-
All vendor BAAs executed
-
Privacy Policy and Terms of Service reviewed by healthcare attorney
-
Risk Analysis documented (OCR's #1 cited deficiency)
-
Encryption at rest and in transit verified
-
Audit logging shipping to WORM-protected storage
Before first 100 patients:
-
Penetration test completed
-
Incident response plan written and tested
-
Workforce training documented (all staff who touch PHI)
-
Business Associate Agreements template ready for customers
Ongoing:
-
Vulnerability scanning every 6 months
-
Pen test every 12 months
-
Risk analysis review annually or after major changes
-
Retain all documentation 6 years minimum
Quick Reference: Key Numbers
Requirement Value
Encryption at rest AES-256
TLS minimum 1.2 (1.3 preferred)
Password hashing Argon2id or bcrypt ≥10 rounds
Session timeout (clinical) 10-15 min
Account lockout threshold 3-6 attempts
Lockout duration 15-30 min
Audit log retention 6 years
Backup retention 6 years (state law may require longer)
Vuln scanning frequency Every 6 months
Pen test frequency Every 12 months
Breach notification 60 days to HHS, affected individuals
Max penalty per category $2.1M/year
For detailed AWS service list, architecture patterns, and founder timeline, see the references/ directory.