Promotion Pipeline
The homelab uses an OCI artifact promotion pipeline for immutable, auditable deployments. Changes flow through three stages: build, validate in integration, promote to live. This skill covers end-to-end tracing and debugging.
Pipeline Overview
PR merged to main (kubernetes/ changed) | v build-platform-artifact.yaml (GHA)
- Discovers latest stable tag in GHCR, bumps patch
- Pushes OCI artifact with tag X.Y.Z-rc.N
- Adds tags: sha-<short>, integration-<short> | v Integration Cluster
- OCIRepository polls GHCR with semver ">= 0.0.0-0" (includes RCs)
- Detects new X.Y.Z-rc.N (higher than previous stable)
- Flux reconciles platform Kustomization | v Flux Alert (validation-success)
- Watches platform Kustomization for "Reconciliation finished"
- Fires repository_dispatch to GitHub (event_type: Kustomization/platform.flux-system)
- Idempotency guard: workflow skips if artifact already has validated-<sha> tag | v tag-validated-artifact.yaml (GHA)
- Finds integration-<sha> artifact, extracts RC tag
- Strips RC suffix: X.Y.Z-rc.N --> X.Y.Z
- Tags artifact: validated-<sha> + X.Y.Z (stable semver) | v Live Cluster
- OCIRepository polls GHCR with semver ">= 0.0.0" (stable only)
- Detects new X.Y.Z stable tag
- Flux reconciles platform (production deployment)
Artifact Tagging Strategy
Each artifact accumulates tags as it progresses through the pipeline:
Tag Created By Stage Purpose
X.Y.Z-rc.N
build workflow Build Pre-release semver for integration polling
sha-<7char>
build workflow Build Immutable commit reference
integration-<7char>
build workflow Build Marks artifact for integration consumption
validated-<7char>
tag workflow Promotion Traceability for validated artifacts
X.Y.Z
tag workflow Promotion Stable semver for live polling
Version numbering: The build workflow queries GHCR for the highest stable X.Y.Z tag, bumps patch to X.Y.(Z+1) , then creates X.Y.(Z+1)-rc.N . When validated, the RC suffix is stripped to produce X.Y.(Z+1) .
Source Types by Cluster
Cluster Source Type Semver Constraint What It Accepts
dev GitRepository N/A Git main branch directly
integration OCIRepository
= 0.0.0-0
All versions including pre-releases (-rc.N )
live OCIRepository
= 0.0.0
Stable versions only (no -rc suffix)
The semver constraint is set in the config module (infrastructure/modules/config/main.tf ) and applied via flux-operator bootstrap. The -0 suffix in >= 0.0.0-0 is what allows pre-release versions per semver specification.
Tracing a Change End-to-End
Stage 1: GitHub Actions Build
Check if build workflow triggered
gh run list --workflow=build-platform-artifact.yaml --limit=5
View specific run details
gh run view <run-id>
Check workflow logs
gh run view <run-id> --log
The build triggers on push to main when kubernetes/** files change. If no Kubernetes files changed, the workflow does not run.
Stage 2: OCI Artifact in GHCR
List recent artifacts and their tags
flux list artifact oci://ghcr.io/<owner>/homelab/platform --limit=10
Find artifact for a specific commit
flux list artifact oci://ghcr.io/<owner>/homelab/platform | grep <short-sha>
Stage 3: Integration Cluster Pickup
Check OCIRepository status (is it seeing the new artifact?)
KUBECONFIG=~/.kube/integration.yaml kubectl get ocirepository -n flux-system -o wide
Check what version is currently deployed
KUBECONFIG=~/.kube/integration.yaml kubectl get ocirepository flux-system -n flux-system -o jsonpath='{.status.artifact.revision}'
Check platform Kustomization reconciliation
KUBECONFIG=~/.kube/integration.yaml kubectl get kustomization platform -n flux-system
Force reconciliation if stuck
KUBECONFIG=~/.kube/integration.yaml flux reconcile source oci flux-system -n flux-system
Stage 4: Validation Alert
Check the validation-success Alert status
KUBECONFIG=~/.kube/integration.yaml kubectl describe alert validation-success -n flux-system
Check the github-dispatch Provider
KUBECONFIG=~/.kube/integration.yaml kubectl get providers -n flux-system
Check if Alert fired recently (events)
KUBECONFIG=~/.kube/integration.yaml kubectl get events -n flux-system --field-selector involvedObject.name=validation-success
Stage 5: Tag Workflow
Check if tag workflow triggered
gh run list --workflow=tag-validated-artifact.yaml --limit=5
If using workflow_dispatch for manual promotion
gh workflow run tag-validated-artifact.yaml -f artifact_sha=<7char-sha>
Stage 6: Live Cluster Pickup
Check OCIRepository status
KUBECONFIG=~/.kube/live.yaml kubectl get ocirepository -n flux-system -o wide
Check current deployed version
KUBECONFIG=~/.kube/live.yaml kubectl get ocirepository flux-system -n flux-system -o jsonpath='{.status.artifact.revision}'
Check platform Kustomization
KUBECONFIG=~/.kube/live.yaml kubectl get kustomization platform -n flux-system
Debugging: Artifact Stuck in Integration
Is the OCI artifact in GHCR? | +-- NO --> Check build-platform-artifact workflow | - Did the workflow trigger? (push to main with kubernetes/ changes) | - Check GHCR auth: GITHUB_TOKEN must have packages:write | - Check workflow logs for "flux push artifact" errors | +-- YES -> Is integration OCIRepository seeing it? | +-- NO --> Check semver constraint | - Must be ">= 0.0.0-0" to accept RC versions | - Run: kubectl get ocirepository -n flux-system -o yaml | grep semver | - Check OCIRepository .status.conditions for errors | +-- YES -> Is platform Kustomization reconciling? | +-- NO --> Check Kustomization status | - kubectl describe kustomization platform -n flux-system | - Look for dependency failures, schema errors | +-- YES -> Is the Alert firing repository_dispatch? | +-- NO --> Check Alert and Provider | - Alert "validation-success" must watch platform Kustomization | - Provider "github-dispatch" needs flux-system secret with GitHub token | - Token needs repo scope for repository_dispatch | +-- YES -> Check tag-validated-artifact workflow - Idempotency guard: already has validated-<sha> tag? - Check workflow logs for tag errors
Debugging: Live Not Updating
Is the artifact tagged with stable semver (X.Y.Z)? | +-- NO --> Promotion did not complete | - Check tag-validated-artifact workflow ran successfully | - Verify it created both validated-<sha> and X.Y.Z tags | +-- YES -> Is live OCIRepository seeing the stable tag? | +-- NO --> Check semver constraint | - Must be ">= 0.0.0" (excludes pre-releases) | - Verify the stable tag is higher than current deployed version | - Force poll: flux reconcile source oci flux-system -n flux-system | +-- YES -> Is Kustomization reconciling? | +-- NO --> Check Kustomization status and dependencies +-- YES -> Deployment should be in progress - Check HelmRelease statuses: flux get helmreleases -A - Check for failing health checks blocking rollout
Canary-Checker Validation
The platform-validation Canary in the monitoring namespace runs health checks every 60 seconds:
Check Type What It Validates
kubernetes-api
HTTP Kubernetes API responds (200 or 401)
flux-pods-healthy
Kubernetes All Flux pods in Running state with Ready condition
Check canary status
KUBECONFIG=~/.kube/integration.yaml kubectl get canaries -n monitoring
Check individual check results
KUBECONFIG=~/.kube/integration.yaml kubectl describe canary platform-validation -n monitoring
Check canary-checker metrics in Prometheus
canary_check{name="platform-validation"} == 0 means healthy
Alerts fire if canary checks fail:
Alert Condition Severity
CanaryCheckFailure
canary_check == 1 for 2m critical
CanaryCheckHighFailureRate
20% failure rate over 15m warning
Manual Promotion (Emergency)
When automatic promotion fails, manually tag the artifact:
Authenticate to GHCR
echo $GITHUB_TOKEN | docker login ghcr.io -u $GITHUB_USER --password-stdin
Find the integration artifact
flux list artifact oci://ghcr.io/<owner>/homelab/platform | grep integration
Tag manually (replace <sha> with 7-char commit SHA)
flux tag artifact
oci://ghcr.io/<owner>/homelab/platform:integration-<sha>
--tag validated-<sha>
flux tag artifact
oci://ghcr.io/<owner>/homelab/platform:integration-<sha>
--tag <X.Y.Z> # The stable semver to assign
Alternatively, use workflow_dispatch to trigger the tag workflow manually:
gh workflow run tag-validated-artifact.yaml -f artifact_sha=<7char-sha>
Rollback Procedure
Option 1: Pin OCIRepository to a Specific Version
Find previous stable artifact
flux list artifact oci://ghcr.io/<owner>/homelab/platform | grep -E '^\d+.\d+.\d+$'
Patch live OCIRepository to pin a specific tag
KUBECONFIG=~/.kube/live.yaml kubectl patch ocirepository flux-system -n flux-system
--type=merge
-p '{"spec":{"ref":{"tag":"<previous-stable-tag>"}}}'
Remember to revert the pin after fixing the issue -- otherwise new promotions will be ignored.
Option 2: Revert the PR and Let Pipeline Run
The safest rollback is to revert the breaking PR on main. The pipeline will build a new artifact with the reverted state, which will naturally promote through integration to live.
Option 3: Re-tag a Previous Artifact
Tag a known-good artifact with a higher stable semver
flux tag artifact
oci://ghcr.io/<owner>/homelab/platform:validated-<old-sha>
--tag <higher-X.Y.Z>
This works because the live OCIRepository picks the highest semver. Ensure the new tag is higher than the current one.
Common Failure Modes
Symptom Cause Fix
Build succeeds, integration does not update OCIRepository semver does not match RC tags Verify >= 0.0.0-0 in OCIRepository spec
Validation passes, live does not update Tag workflow did not create stable semver tag Check tag-validated-artifact workflow logs
repository_dispatch not received by GHA GitHub token in flux-system secret lacks repo scope Update token with correct scopes
Tag workflow fires repeatedly (~10min) Alert fires on every Flux reconciliation cycle Normal -- idempotency guard skips already-validated artifacts
Artifact push fails in build workflow GHCR auth issue Check GITHUB_TOKEN has packages:write permission
Live picks up wrong version Semver ordering issue with RC numbering Verify stable tag is strictly higher than current
Integration shows "no matching artifact" OCIRepository URL or semver misconfigured Check oci_url and oci_semver in cluster bootstrap config
Key Files Reference
File Purpose
.github/workflows/build-platform-artifact.yaml
Build and push OCI artifact on merge to main
.github/workflows/tag-validated-artifact.yaml
Promote validated artifact (tag stable semver)
kubernetes/platform/config/flux-notifications/canary-alert.yaml
Alert that triggers repository_dispatch
kubernetes/platform/config/flux-notifications/github-provider.yaml
GitHub dispatch provider for Flux alerts
kubernetes/platform/config/canary-checker/platform-health.yaml
Platform health validation checks
infrastructure/modules/config/main.tf
OCI semver constraints per cluster
infrastructure/modules/bootstrap/resources/instance-oci.yaml.tftpl
OCIRepository bootstrap template
Cross-References
Document Focus
.github/CLAUDE.md
Complete pipeline architecture and debugging guide
kubernetes/clusters/CLAUDE.md
Per-cluster source types and promotion path
kubernetes/platform/CLAUDE.md
Flux patterns, version management
flux-gitops skill Adding Helm releases and ResourceSet patterns