Upgrade GKE cluster gke-fleet-dev to Kubernetes 1.34 #978

Open
plural-copilot[bot] wants to merge 1 commit into main from agent/upgrade-gke-fleet-dev-k8s-1.34-1778613385327

@plural-copilot

Summary

This PR upgrades the GKE cluster gke-fleet-dev from Kubernetes 1.33 to 1.34 by updating the nextKubernetesVersion configuration in the ServiceDeployment.

Changes

  • Updated nextKubernetesVersion from "1.33" to "1.34" in bootstrap/fleets/gke-fleet/clusters/dev.yaml

Impact

This change will trigger:

  1. GKE control plane upgrade to Kubernetes 1.34
  2. Blue/green node pool rotation to deploy nodes running 1.34
  3. Zero-downtime upgrade via the Terraform blue/green strategy

Stack Reference

Plural Stack: gke-fleet-cluster-dev

Update nextKubernetesVersion from 1.33 to 1.34 in the ServiceDeployment
configuration. This will trigger the GKE control plane upgrade and blue/green
node pool rotation to Kubernetes 1.34.

@plural-copilot plural-copilot Bot left a comment

This PR was generated by the claude Plural Agent Runtime. Here is some information that may help you evaluate the AI's performance:

Name            Details
💬 Prompt       ## Task: Upgrade GKE cluster gke-fleet-dev from Kubernetes 1.33 to 1.34...
🔗 Run history  View run history

@plural-copilot plural-copilot Bot left a comment

Plural has generated a plan for this PR, full details here

Plan Details
Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the
last "terraform apply" which may have affected this plan:

  # module.gke.google_container_cluster.primary has changed
  ~ resource "google_container_cluster" "primary" {
        id                                       = "projects/pluralsh-test-384515/locations/us-central1/clusters/gke-fleet-dev"
      ~ master_version                           = "1.33.5-gke.2392000" -> "1.33.10-gke.1067000"
        name                                     = "gke-fleet-dev"
      ~ node_version                             = "1.33.5-gke.2392000" -> "1.33.9-gke.1060000"
        # (37 unchanged attributes hidden)

      ~ node_pool {
            name                        = "default-pool"
          ~ version                     = "1.33.5-gke.2392000" -> "1.33.10-gke.1067000"
            # (7 unchanged attributes hidden)

            # (4 unchanged blocks hidden)
        }
      ~ node_pool {
            name                        = "blue"
          ~ version                     = "1.33.8-gke.1026000" -> "1.33.9-gke.1060000"
            # (7 unchanged attributes hidden)

            # (5 unchanged blocks hidden)
        }
      ~ node_pool {
            name                        = "green"
          ~ version                     = "1.33.8-gke.1026000" -> "1.33.9-gke.1060000"
            # (7 unchanged attributes hidden)

            # (5 unchanged blocks hidden)
        }

        # (28 unchanged blocks hidden)
    }


Unless you have made equivalent changes to your configuration, or ignored the
relevant attributes using ignore_changes, the following plan may include
actions to undo or respond to these changes.

─────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # plural_cluster.this will be updated in-place
  ~ resource "plural_cluster" "this" {
      ~ agent_deployed = true -> (known after apply)
        id             = "5e1553c4-aa35-49c6-aae6-7a12551a0341"
      ~ kubeconfig     = {
          ~ token                  = (sensitive value)
            # (2 unchanged attributes hidden)
        }
        name           = "gke-fleet-dev"
        tags           = {
            "fleet" = "gke-fleet"
            "tier"  = "dev"
        }
        # (7 unchanged attributes hidden)
    }

  # module.gke.google_container_node_pool.pools["blue"] will be updated in-place
  ~ resource "google_container_node_pool" "pools" {
        id                          = "projects/pluralsh-test-384515/locations/us-central1/clusters/gke-fleet-dev/nodePools/blue"
        name                        = "blue"
      ~ version                     = "1.33.9-gke.1060000" -> "1.33"
        # (10 unchanged attributes hidden)

        # (6 unchanged blocks hidden)
    }

  # module.gke.google_container_node_pool.pools["green"] will be updated in-place
  ~ resource "google_container_node_pool" "pools" {
        id                          = "projects/pluralsh-test-384515/locations/us-central1/clusters/gke-fleet-dev/nodePools/green"
        name                        = "green"
      ~ version                     = "1.33.9-gke.1060000" -> "1.33"
        # (10 unchanged attributes hidden)

        # (6 unchanged blocks hidden)
    }

Plan: 0 to add, 3 to change, 0 to destroy.
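
A reviewer who wants to gate merges on the plan above could check its closing summary line mechanically. The following is a minimal sketch, not part of the Plural tooling: the function names `plan_counts` and `is_in_place_only` are hypothetical, and the only assumption about Terraform is the wording of the summary line shown in this plan.

```python
import re

# Matches the counts in the summary line Terraform prints at the end of a plan,
# e.g. "Plan: 0 to add, 3 to change, 0 to destroy."
PLAN_SUMMARY = re.compile(r"(\d+) to add, (\d+) to change, (\d+) to destroy")


def plan_counts(plan_text: str) -> dict:
    """Extract the add/change/destroy counts from plan output (hypothetical helper)."""
    match = PLAN_SUMMARY.search(plan_text)
    if match is None:
        raise ValueError("no plan summary line found")
    add, change, destroy = (int(group) for group in match.groups())
    return {"add": add, "change": change, "destroy": destroy}


def is_in_place_only(plan_text: str) -> bool:
    """True when the plan creates and destroys nothing (hypothetical helper)."""
    counts = plan_counts(plan_text)
    return counts["add"] == 0 and counts["destroy"] == 0


summary_line = "Plan: 0 to add, 3 to change, 0 to destroy."
print(plan_counts(summary_line))
print(is_in_place_only(summary_line))
```

For this plan the check passes, since all three resources are updated in place.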

@plural-copilot plural-copilot Bot left a comment

Plural AI has generated a summary of what this plan entails here

Plan Summary

Summary

GKE cluster gke-fleet-dev is performing a blue/green node pool rotation for Kubernetes version management. This is a low-risk, configuration-only change with no immediate infrastructure impact.

Root Cause

The Terraform plan reflects drift reconciliation from out-of-band Kubernetes version upgrades that already occurred in GKE. The node pools have been upgraded outside of Terraform (master: 1.33.10, nodes: 1.33.9), and Terraform is now adjusting the version configuration to align with the cluster's blue/green deployment strategy by setting node pool versions to 1.33 (which pins to the master's minor version).
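
The relationship between a full GKE version string and a minor-version alias such as 1.33 can be illustrated with a small comparison. This is a sketch under stated assumptions: `parse_version` and `satisfies_alias` are hypothetical helpers, the version format is inferred from the strings in the plan output, and nothing here models how the Google provider itself resolves the alias.

```python
import re

# Format inferred from the plan output: "1.33.9-gke.1060000" or the alias "1.33".
GKE_VERSION = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:-gke\.(\d+))?$")


def parse_version(version: str) -> tuple:
    """Split a version into (major, minor, patch, gke_build); absent parts are None."""
    match = GKE_VERSION.match(version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) if part is not None else None for part in match.groups())


def satisfies_alias(actual: str, alias: str) -> bool:
    """True if every component the alias specifies equals the one in `actual`."""
    actual_parts = parse_version(actual)
    alias_parts = parse_version(alias)
    return all(want is None or want == got
               for got, want in zip(actual_parts, alias_parts))


print(satisfies_alias("1.33.9-gke.1060000", "1.33"))
print(satisfies_alias("1.33.9-gke.1060000", "1.34"))
```

The node versions reported in the drift block fall within the 1.33 alias but not within 1.34, which is the distinction the plan summary relies on.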

Key Evidence

  • No resources are being created or destroyed - Plan shows 0 to add, 0 to destroy, 3 to change
  • Changes are configuration-only - The plan only updates the version attribute on two node pools (blue and green) from 1.33.9-gke.1060000 → 1.33 and refreshes the plural_cluster resource
  • Drift detected externally - Terraform detected that GKE auto-upgraded the master from 1.33.5 → 1.33.10 and nodes from 1.33.5/1.33.8 → 1.33.9/1.33.10 outside of Terraform's control
  • Blue/green strategy in action - The code uses local.active_node_group and local.drain_node_group logic to determine which node pool should be active based on whether the version number is even/odd, implementing a zero-downtime upgrade pattern
  • Version simplification - Changing from specific patch versions (e.g., 1.33.9-gke.1060000) to minor version (1.33) allows GKE to manage patch updates within the same minor version automatically

Contextual Observations

  • This is a dev environment - Cluster tags indicate tier = dev, so the blast radius is limited to non-production workloads
  • Blue/green node pools are designed for safe upgrades - The cluster uses two node pools (blue and green) that alternate as active/drain during upgrades. Based on version 1.33 (odd minor version), green should be the active pool and blue should be draining
  • No immediate node replacements - Since the actual GKE node versions are already at 1.33.9, this configuration change won't trigger node pool recreation or rolling updates
  • Plural cluster resource refresh - The plural_cluster resource will be updated with the current cluster state (kubeconfig token refresh), which is a normal state refresh operation
  • Auto-upgrade is disabled - The code explicitly sets auto_upgrade = false for node pools, giving you control over upgrade timing
  • Workload identity configured - The cluster has proper IAM bindings for Plural stacks and external-dns, indicating this is an actively managed fleet cluster
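
The even/odd selection described above can be sketched as follows. The actual definitions of local.active_node_group and local.drain_node_group are not shown in this PR, so this is an assumption-laden illustration: it takes the parity of the minor version and maps odd to green, as the observation about 1.33 suggests. The function names are hypothetical.

```python
def minor_version(version: str) -> int:
    """Return the minor component of a version such as "1.33" or "1.33.9-gke.1060000"."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least major.minor, got {version!r}")
    return int(parts[1])


def node_groups(version: str) -> dict:
    """Pick active and draining pools from minor-version parity.

    Assumption: odd minor -> green active, even minor -> blue active.
    The real Terraform locals may be defined differently.
    """
    active = "green" if minor_version(version) % 2 == 1 else "blue"
    drain = "blue" if active == "green" else "green"
    return {"active": active, "drain": drain}


print(node_groups("1.33"))
print(node_groups("1.34"))
```

Under this assumed mapping, moving from 1.33 to 1.34 swaps the roles, so blue would become the active pool and green would drain.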
