Upgrading Karpenter on EKS

Goal

Safely upgrade Karpenter on an EKS cluster. Version-agnostic — the examples use 1.8.1 → 1.12.1; substitute your own source/target versions.

Official docs

Upgrade guide

Compatibility matrix

CRD ownership fix

Prerequisites

Check K8s ↔ Karpenter compatibility (matrix). E.g. K8s 1.35 needs Karpenter ≥ 1.9, so 1.8.x is below the minimum.
You can jump straight to a later 1.x — CRDs are additive within 1.x, so 1.8 → 1.12 in one hop is fine. But per-version actions are cumulative — apply the notes for every minor in between (see Version notes below), don’t skip them.
Check for new IAM permissions for every version you cross (see Version notes below). For example, 1.11 and 1.12 each add an EC2 Describe permission. If you manage the controller role via the terraform-aws-modules/eks karpenter submodule, recent module versions already include them.
Adopt/upgrade all CRDs the chart ships, not a hardcoded list — newer versions add more. As of 1.12 there are 4: ec2nodeclasses.karpenter.k8s.aws, nodepools.karpenter.sh, nodeclaims.karpenter.sh, nodeoverlays.karpenter.sh.

Set these once (used throughout):

VERSION=1.12.1
KARPENTER_NS=karpenter   # your Karpenter namespace (upstream default: kube-system)

Version notes (cumulative)

Apply every minor's notes, not just the target's

Upgrading 1.8 → 1.12 means applying the required changes from 1.9, 1.10, 1.11 and 1.12 — they stack. Read the upgrade guide for each version between your source and target.

Version	What it needs
1.9	IAM policy split into multiple managed policies (structural only — no new permissions). Re-check if you manage the controller policy yourself.
1.10	Extra EventBridge rule (`detail-type`) for Capacity Reservation interruption warnings.
1.11	New IAM permission `ec2:DescribePlacementGroups` (+ placement-group resource ARN).
1.12	New IAM permission `ec2:DescribeInstanceStatus`; CA-bundle drift can replace existing nodes; optional ARC zonal-shift (`arc-zonal-shift:GetManagedResource`); installs the `nodeoverlays` CRD — alpha feature, but the CRD is always present, so adopt it (see Gotchas).
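
To make the cumulative rule concrete, here is a minimal sketch of a helper that lists every minor release crossed between a source and a target version. The name `minors_crossed` is hypothetical, and the helper assumes both versions are written as MAJOR.MINOR.PATCH and share the same major.

```shell
# Hypothetical helper: print each minor crossed when going from $1 to $2.
# Assumes both arguments look like MAJOR.MINOR.PATCH with the same major.
minors_crossed() {
  from_minor=$(printf '%s' "$1" | cut -d. -f2)
  to_minor=$(printf '%s' "$2" | cut -d. -f2)
  major=$(printf '%s' "$2" | cut -d. -f1)
  m=$((from_minor + 1))
  while [ "$m" -le "$to_minor" ]; do
    printf '%s.%s\n' "$major" "$m"
    m=$((m + 1))
  done
}

# prints 1.9, 1.10, 1.11 and 1.12, one per line
minors_crossed 1.8.1 1.12.1
```

Each printed minor corresponds to one row of the table above whose notes need to be applied.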

Background: why CRDs need a separate release

Karpenter’s main chart installs its CRDs only on first install and never upgrades them. Upgrade just the controller and the CRDs silently stay on the old version. The fix is the dedicated karpenter-crd chart, managed as its own release alongside the controller.
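
One way to see which situation a cluster is in is to check whether a CRD already records a Helm release as its owner. The sketch below is an assumption-laden illustration: `crd_helm_owner` is a hypothetical helper name, and it only reads the `meta.helm.sh/release-name` annotation that Step 2 sets.

```shell
# Hypothetical helper: print the Helm release recorded on a CRD.
# Empty output means no Helm release is recorded as the owner.
crd_helm_owner() {
  kubectl get crd "$1" \
    -o jsonpath='{.metadata.annotations.meta\.helm\.sh/release-name}'
}

# Usage: crd_helm_owner nodepools.karpenter.sh
```

If the output is `karpenter-crd`, the CRD is already tracked by the dedicated release and the adoption step can be skipped for it.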

Upgrade order

Always upgrade CRDs first, then the controller, and keep both on the same version.

Steps

Pre-flight + backup
      │
      ▼
Adopt CRDs into karpenter-crd          (one-time per cluster)
      │
      ▼
Upgrade karpenter-crd  ──▶  Upgrade karpenter controller     ← CRDs ALWAYS first
      │
      ▼
Verify   (revert the controller if needed)

Step 1: Pre-flight

kubectl config current-context        # make sure it's the right cluster!
kubectl -n "$KARPENTER_NS" get pods
kubectl get nodepools,nodeclaims,ec2nodeclasses,nodeoverlays -A -o yaml \
  > karpenter-cr-backup-$(date +%F).yaml

Step 2: Adopt existing CRDs into `karpenter-crd` (one time per cluster)

If the CRDs were originally created by the main chart (or kubectl apply), the first karpenter-crd apply fails with invalid ownership metadata. Hand ownership to Helm once:

zsh word-splitting

List the CRDs inline — zsh doesn’t word-split $VAR, so an unquoted $CRDS is passed as a single argument and kubectl returns NotFound.

kubectl label crd \
  ec2nodeclasses.karpenter.k8s.aws nodepools.karpenter.sh nodeclaims.karpenter.sh nodeoverlays.karpenter.sh \
  app.kubernetes.io/managed-by=Helm --overwrite
kubectl annotate crd \
  ec2nodeclasses.karpenter.k8s.aws nodepools.karpenter.sh nodeclaims.karpenter.sh nodeoverlays.karpenter.sh \
  meta.helm.sh/release-name=karpenter-crd --overwrite
kubectl annotate crd \
  ec2nodeclasses.karpenter.k8s.aws nodepools.karpenter.sh nodeclaims.karpenter.sh nodeoverlays.karpenter.sh \
  meta.helm.sh/release-namespace="$KARPENTER_NS" --overwrite
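
As an alternative to listing the names inline three times, the same adoption can be sketched with an array, which expands to one argument per element in both bash and zsh. `adopt_crds` is a hypothetical wrapper name, and `$KARPENTER_NS` is assumed to be set as in "Set these once".

```shell
# "${CRDS[@]}" expands to one argument per element in bash and zsh,
# unlike an unquoted scalar $CRDS in zsh.
CRDS=(
  ec2nodeclasses.karpenter.k8s.aws
  nodepools.karpenter.sh
  nodeclaims.karpenter.sh
  nodeoverlays.karpenter.sh
)

# Hypothetical wrapper around the label/annotate commands above.
adopt_crds() {
  kubectl label crd "$@" app.kubernetes.io/managed-by=Helm --overwrite
  kubectl annotate crd "$@" \
    meta.helm.sh/release-name=karpenter-crd \
    meta.helm.sh/release-namespace="$KARPENTER_NS" --overwrite
}

# Usage: adopt_crds "${CRDS[@]}"
```

Both annotations are set in a single `kubectl annotate` call here, which is equivalent to the two separate calls above.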

Step 3: Upgrade CRDs, then the controller

Apply the CRD chart first, then the controller — both on the same $VERSION:

# 1) CRDs first
helm upgrade --install karpenter-crd oci://public.ecr.aws/karpenter/karpenter-crd \
  --namespace "$KARPENTER_NS" --create-namespace --version "$VERSION"
 
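# 2) then the controller (keep your existing config)
# note: --reuse-values does not pick up defaults newly added in the target chart; -f your-values.yaml avoids that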
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace "$KARPENTER_NS" --version "$VERSION" --reuse-values   # or -f your-values.yaml

Managing via IaC

If Karpenter is deployed through Terraform/Terragrunt/Argo/Flux, don’t run helm by hand — bump the chart version of both releases (CRD release first) and apply in that order.

Step 4: Verify

kubectl -n "$KARPENTER_NS" rollout status deploy/karpenter --timeout=300s
kubectl -n "$KARPENTER_NS" logs deploy/karpenter --tail=100 | grep -iE "error|panic" || echo ok
kubectl get nodepools,nodeclaims -A   # all Ready

Benign startup race
A one-off "creating scheduler, no nodepools found" log line right after a restart means the NodePool cache has not synced yet. It stops within a minute. Confirm it cleared:
kubectl -n "$KARPENTER_NS" logs deploy/karpenter --since=2m | grep -i "no nodepools" || echo ok
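
It can also help to confirm that the controller is actually running the target version. The following is a minimal sketch with hypothetical helper names. It assumes the Deployment is named `karpenter` (as in the commands above), that the controller is its first container, and that the image tag equals the chart version, optionally followed by a digest.

```shell
# Hypothetical helper: print the controller image currently deployed.
running_karpenter_image() {
  kubectl -n "$KARPENTER_NS" get deploy karpenter \
    -o jsonpath='{.spec.template.spec.containers[0].image}'
}

# image_matches_version IMAGE VERSION: succeeds if the image tag is exactly
# VERSION, with or without a trailing @sha256:... digest.
image_matches_version() {
  case "$1" in
    *":$2"|*":$2@"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   image_matches_version "$(running_karpenter_image)" "$VERSION" \
#     && echo "controller is on $VERSION"
```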

Rollback

Set $VERSION back to the previous controller version and re-run only the controller command from Step 3. Leave the CRDs on the newer version: within 1.x they are additive, so the older controller tolerates them.
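
A rollback along these lines might look like the sketch below. `rollback_controller` is a hypothetical name, the helm flags mirror the controller command from Step 3, and the karpenter-crd release is deliberately not touched.

```shell
# rollback_controller PREVIOUS_VERSION: re-apply only the controller chart
# at the older version; the CRD release stays on the newer version.
rollback_controller() {
  helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
    --namespace "$KARPENTER_NS" --version "$1" --reuse-values
}

# Usage: rollback_controller 1.8.1
```

After the rollback, the checks from Step 4 apply unchanged.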

Watch-outs

Drift-driven node replacement

A new version may re-resolve AMIs or change drift hashing, so existing nodes can be marked drifted and replaced (one at a time, with drain). Normal — bound it with NodePool disruption.budgets if the timing is bad.

replicas: 1 means a few seconds with no provisioning during the rollout. Run 2 replicas for HA.
webhook.enabled: false is correct for 1.x — leave it.
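
The disruption budget mentioned under drift-driven node replacement can be set with a patch such as the one sketched here. `limit_disruption` is a hypothetical helper, and the NodePool name in the usage line is only an example. Note that a merge patch replaces the whole `budgets` list, so any existing entries are overwritten.

```shell
# limit_disruption NODEPOOL NODES: set a single disruption budget on a NodePool.
# NODES is a count ("1") or a percentage ("10%"); "0" blocks voluntary disruption.
# Caution: this replaces any budgets already defined on the NodePool.
limit_disruption() {
  kubectl patch nodepool "$1" --type merge \
    -p "{\"spec\":{\"disruption\":{\"budgets\":[{\"nodes\":\"$2\"}]}}}"
}

# Usage: limit_disruption default 1
```

If the NodePools are managed through IaC, the same `spec.disruption.budgets` change belongs in the manifest rather than in an ad hoc patch.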

Gotchas

Adopt all CRDs, not a fixed list. 1.12 ships a 4th CRD, nodeoverlays.karpenter.sh. The chart always installs it — the NodeOverlay feature is alpha and off by default, but the CRD is present regardless, so the karpenter-crd release must own it too. The upstream troubleshooting page is simply out of date — it still lists only 3 and omits nodeoverlays, so its copy-paste commands miss it and the first apply fails on that CRD’s ownership. Adopt the full current set — 4 as of 1.12.
zsh word-splitting. An unquoted $CRDS is passed as one argument in zsh → kubectl ... NotFound. List CRDs inline, or use an array.
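
To avoid a fixed list altogether, the installed Karpenter CRDs can be discovered from the cluster. This is a minimal sketch: `karpenter_crds` is a hypothetical helper that filters on the two API groups named in this guide. It only finds CRDs that already exist in the cluster, which are the only ones that need adopting.

```shell
# Hypothetical helper: list the Karpenter CRDs currently installed,
# matching the karpenter.sh and karpenter.k8s.aws API groups.
karpenter_crds() {
  kubectl get crd -o name \
    | sed 's|^customresourcedefinition.apiextensions.k8s.io/||' \
    | grep -E '\.karpenter\.(sh|k8s\.aws)$'
}

# Usage: adopt each discovered CRD one at a time (works in bash and zsh).
# karpenter_crds | while read -r crd; do
#   kubectl label crd "$crd" app.kubernetes.io/managed-by=Helm --overwrite
# done
```

Reading one name per line sidesteps the zsh word-splitting problem, because no unquoted variable expansion is involved.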

Appendix — hotfix CRDs without the release

To upgrade the CRDs right now without touching the release, apply them straight from the tagged source (replace vX.Y.Z):

BASE=https://raw.githubusercontent.com/aws/karpenter-provider-aws/vX.Y.Z/pkg/apis/crds
kubectl apply --server-side --force-conflicts \
  -f $BASE/karpenter.sh_nodepools.yaml \
  -f $BASE/karpenter.sh_nodeclaims.yaml \
  -f $BASE/karpenter.sh_nodeoverlays.yaml \
  -f $BASE/karpenter.k8s.aws_ec2nodeclasses.yaml

Last resort

This bypasses Helm ownership — prefer the karpenter-crd release so the CRDs stay tracked. Use only as an emergency hotfix.
