Upgrading to 21.0.0 wants to rebuild entire EKS cluster #3429

@ggillies

Description

When attempting to upgrade to version 21.0.0 of the terraform-aws-eks module, combined with version 6.4 of the AWS Terraform provider, Terraform now wants to tear down and completely rebuild my EKS cluster running in EKS Auto Mode.

Versions

  • Module version [Required]: 21.0.0

  • Terraform version: v1.9.6

  • Provider version(s):
  • provider registry.terraform.io/cloudflare/cloudflare v5.7.1
  • provider registry.terraform.io/hashicorp/archive v2.7.1
  • provider registry.terraform.io/hashicorp/aws v6.4.0
  • provider registry.terraform.io/hashicorp/cloudinit v2.3.7
  • provider registry.terraform.io/hashicorp/local v2.5.3
  • provider registry.terraform.io/hashicorp/null v3.2.4
  • provider registry.terraform.io/hashicorp/random v3.7.2
  • provider registry.terraform.io/hashicorp/time v0.13.1
  • provider registry.terraform.io/hashicorp/tls v4.1.0
  • provider registry.terraform.io/hashicorp/vault v5.1.0

Reproduction Code [Required]

This is taken from the docs for a simple EKS Auto Mode cluster:

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "21.0.0"

  name = "myapp-${var.environment}"

  endpoint_public_access       = true
  endpoint_public_access_cidrs = local.public_cidrs

  upgrade_policy = {
    support_type = "STANDARD"
  }

  enable_cluster_creator_admin_permissions = true

  access_entries = local.access_entries

  compute_config = {
    enabled    = true
    node_pools = ["general-purpose"]
  }

  vpc_id     = var.vpc.id
  subnet_ids = var.private_subnet_ids
}
```

Steps to reproduce the behavior:

  • Have the module block above in your Terraform codebase.
  • Upgrade to AWS provider 6.x and version 21.0.0 of the terraform-aws-eks module.
  • Run `terraform plan`.

Expected behavior

The EKS cluster is updated in place; no destroy/recreate is planned.

Actual behavior

Terraform plans to destroy and recreate (replace) the EKS cluster.

Terminal Output Screenshot(s)

```
  # module.eks[0].module.eks.aws_eks_cluster.this[0] must be replaced
+/- resource "aws_eks_cluster" "this" {
      ~ arn                           = "arn:aws:eks:eu-central-1:REMOVED:cluster/REMOVED" -> (known after apply)
      ~ certificate_authority         = [
          - {
              - data = "REMOVED"
            },
        ] -> (known after apply)
      + cluster_id                    = (known after apply)
      ~ created_at                    = "2025-05-23T02:01:23Z" -> (known after apply)
      ~ endpoint                      = "https://REMOVED.gr7.eu-central-1.eks.amazonaws.com" -> (known after apply)
      ~ id                            = "REMOVED" -> (known after apply)
      ~ identity                      = [
          - {
              - oidc = [
                  - {
                      - issuer = "https://REMOVED"
                    },
                ]
            },
        ] -> (known after apply)
        name                          = "REMOVED"
      ~ platform_version              = "eks.28" -> (known after apply)
      ~ status                        = "ACTIVE" -> (known after apply)
        tags                          = {
            "terraform-aws-modules" = "eks"
        }
      ~ version                       = "1.31" -> (known after apply)
        # (5 unchanged attributes hidden)

      ~ compute_config {
          - node_role_arn = "arn:aws:iam::REMOVED:role/REMOVED-eks-auto-20250523020048533600000003" -> null # forces replacement
            # (2 unchanged attributes hidden)
        }

      ~ kubernetes_network_config {
          ~ service_ipv4_cidr = "172.20.0.0/16" -> (known after apply)
          + service_ipv6_cidr = (known after apply)
            # (1 unchanged attribute hidden)

            # (1 unchanged block hidden)
        }

      - timeouts {}

      ~ vpc_config {
          ~ cluster_security_group_id = "REMOVED" -> (known after apply)
          ~ vpc_id                    = "REMOVED" -> (known after apply)
            # (5 unchanged attributes hidden)
        }

        # (4 unchanged blocks hidden)
    }
```

Additional context

This problem was noted on the pull request for 21.0.0 here

I believe it's caused by the change in variables.tf that changed the compute_config default from {} to a typed object(...), combined with the change in main.tf. Before, it was:

```hcl
node_role_arn = local.auto_mode_enabled && length(try(compute_config.value.node_pools, [])) > 0 ? try(compute_config.value.node_role_arn, aws_iam_role.eks_auto[0].arn, null) : null
```

but now it's:

```hcl
node_role_arn = compute_config.value.node_pools != null ? try(compute_config.value.node_role_arn, aws_iam_role.eks_auto[0].arn, null) : null
```

I think the try() expression now always returns null: with the typed object default, compute_config.value.node_role_arn evaluates to null (rather than raising an error) when left unset, so the first argument of try() always "succeeds" and the role ARN fallback is never reached.
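A minimal sketch of that behavior (variable names and attributes are illustrative, not the module's exact definition): with a typed object whose attributes are optional, an unset attribute exists and is null, so an attribute lookup no longer errors.

```hcl
# Hypothetical reduced example of the typed-object default behavior.
variable "compute_config" {
  type = object({
    enabled       = optional(bool, false)
    node_pools    = optional(list(string))
    node_role_arn = optional(string)
  })
  default = {}
}

# var.compute_config.node_role_arn evaluates to null (no error),
# so try() returns null instead of falling through to the fallback.
output "role" {
  value = try(var.compute_config.node_role_arn, "fallback-arn")
}
```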

Before, the default was {}, so the attribute lookup failed and try() fell through to aws_iam_role.eks_auto[0].arn. I think we need to use coalesce() instead, which ignores nulls.
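The difference between the two functions can be checked in `terraform console`: `try()` treats null as a perfectly valid result and stops at the first expression that doesn't error, while `coalesce()` skips null arguments.

```hcl
# terraform console (results shown as comments)
# > try(null, "fallback")
# null        -> null is not an error, so try() returns it
# > coalesce(null, "fallback")
# "fallback"  -> coalesce() returns the first non-null argument
```

One caveat with the suggested fix: coalesce() raises an error when all of its arguments are null, so the existing trailing null default in the expression would need separate handling (e.g. wrapping the coalesce() in a try()).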
