Upgrade kubernetes nodes to cgroupv2 #12638

@xin-hedera

Description

Problem

Kubernetes will drop cgroupv1 support soon, and some node pools in both our preprod and prod clusters are still not on cgroupv2.

Solution

Upgrade the node pools citus-coordinator, citus-worker, and k6-pool in all our k8s clusters from cgroupv1 to cgroupv2.

Run ./tools/cluster-management/upgrade-k8s-version-citus.sh to upgrade to a newer recommended version. When prompted with "Enter path to Linux config file (leave blank to skip):", provide the path to a YAML file with the following content:

linuxConfig:
  cgroupMode: 'CGROUP_MODE_V2'
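As a minimal sketch, the config file can be created from the shell before running the script. The file path below is an assumption chosen for illustration; any readable path should work.

```shell
# Write the Linux node config to a file.
# NOTE: the default path is a hypothetical choice, not something the script requires.
config_file="${CONFIG_FILE:-/tmp/cgroupv2-linux-config.yaml}"

cat > "$config_file" <<'EOF'
linuxConfig:
  cgroupMode: 'CGROUP_MODE_V2'
EOF

echo "Wrote $config_file"
```

The printed path is what would be entered at the script's prompt.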

Note that for most existing clusters, the node pool version is behind the recommended version. However, the newer standalone staging clusters are all up to date, so for those we may need to either run the steps manually without upgrading the k8s version or upgrade to a newer non-recommended patch release.

For mainnet-staging-na, we can run the following command for each related node pool with the required system config file, once all workloads are torn down.

$ gcloud container node-pools update NAME --cluster CLUSTER --project PROJECT --region REGION --system-config-from-file SYSTEM_CONFIG_FILE
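The command above could be applied to the three node pools named in this issue with a small loop. This is a sketch only: `update_pool` and `DRY_RUN` are hypothetical names introduced here, and CLUSTER, PROJECT, REGION, and SYSTEM_CONFIG_FILE remain placeholders to be replaced. By default it only prints the commands.

```shell
# Placeholders; set these to real values before running for real.
CLUSTER="${CLUSTER:-CLUSTER}"
PROJECT="${PROJECT:-PROJECT}"
REGION="${REGION:-REGION}"
SYSTEM_CONFIG_FILE="${SYSTEM_CONFIG_FILE:-SYSTEM_CONFIG_FILE}"
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: update one node pool with the system config file.
update_pool() {
  local pool="$1"
  local cmd=(gcloud container node-pools update "$pool"
    --cluster "$CLUSTER" --project "$PROJECT" --region "$REGION"
    --system-config-from-file "$SYSTEM_CONFIG_FILE")
  if [ "$DRY_RUN" = "1" ]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Node pools listed in the Solution section of this issue.
for pool in citus-coordinator citus-worker k6-pool; do
  update_pool "$pool"
done
```

Running with `DRY_RUN=0` would execute the updates, so it may be worth reviewing the printed commands first.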

For the preprod cluster, find a time slot with no impact on perf tests, since only the perf env in the preprod cluster depends on the citus pools.

Alternatives

No response


Action Items:

  • Upgrade GKE Cluster and node-pools to latest GKE v1.34: 1.34.3-gke.1318000

    • mainnet-staging-na
    • mirrornode-preprod-2024-02
    • staging-sm
    • staging-lg
    • staging-council
    • shadow-previewnet
    • previewnet
    • testnet-eu
    • testnet-na
    • mainnet-eu
    • mainnet-na
  • Set CGROUP_MODE_V2 in any node pool that is still V1

    • mainnet-staging-na
    • mirrornode-preprod-2024-02
    • staging-sm
    • staging-lg
    • staging-council
    • shadow-previewnet
    • previewnet
    • testnet-eu
    • testnet-na
    • mainnet-eu
    • mainnet-na
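To confirm which cgroup version a node is actually running, one option is to check the filesystem type mounted at /sys/fs/cgroup from a shell on that node. The Kubernetes documentation describes this check: `cgroup2fs` indicates cgroup v2 and `tmpfs` indicates cgroup v1. The `cgroup_version` helper below is a hypothetical name used only for this sketch.

```shell
# Hypothetical helper: map the /sys/fs/cgroup filesystem type to a cgroup version.
cgroup_version() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# Run on the node itself (e.g. from a debug shell); falls back to "unknown"
# if the path or the stat flags are unavailable.
fs_type="$(stat -fc %T /sys/fs/cgroup/ 2>/dev/null || echo unknown)"
echo "cgroup version: $(cgroup_version "$fs_type")"
```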

Metadata

Assignees

Labels

enhancement (Type: New feature), ops (Tasks relating to network operations)

Projects

Status

👷 In progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
