
migrations: add migrations for prefer-closest-numa-nodes and max-allowable-numa-nodes #4778

Merged
piyush-jena merged 3 commits into bottlerocket-os:develop from piyush-jena:add-k8s-settings on Apr 21, 2026

piyush-jena commented Mar 3, 2026

Issue number:

Related to: #4750


Description of changes:
Add two topology manager policy options:

  1. max-allowable-numa-nodes (GA in Kubernetes 1.35+)
  2. prefer-closest-numa-nodes (GA in Kubernetes 1.32+)

Testing done:
Migration testing:

  1. Before upgrade
[ssm-user@control]$ apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "36151b8b",
    "pretty_name": "Bottlerocket OS 1.56.0 (aws-k8s-1.35)",
    "variant_id": "aws-k8s-1.35",
    "version_id": "1.56.0"
  }
}
[ssm-user@control]$ apiclient set \
  kubernetes.cpu-manager-policy=static \
  kubernetes.topology-manager-policy="best-effort" \
  kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes="true"
Failed to change settings: Failed PATCH request to '/settings/keypair?tx=apiclient-set-KXBvywfwgeVYcZdS': Status 400 when PATCHing /settings/keypair?tx=apiclient-set-KXBvywfwgeVYcZdS: Unable to match your input to the data model.  We may not have enough type information.  Please try the --json input form.  Cause: Error during deserialization: unknown field `topology-manager-policy-options`, expected one of `cluster-name`, `cluster-certificate`, `api-server`, `node-labels`, `node-taints`, `static-pods`, `authentication-mode`, `bootstrap-token`, `standalone-mode`, `eviction-hard`, `eviction-soft`, `eviction-soft-grace-period`, `eviction-max-pod-grace-period`, `kube-reserved`, `system-reserved`, `allowed-unsafe-sysctls`, `server-tls-bootstrap`, `cloud-provider`, `registry-qps`, `registry-burst`, `event-qps`, `event-burst`, `kube-api-qps`, `kube-api-burst`, `container-log-max-size`, `container-log-max-files`, `container-log-max-workers`, `container-log-monitor-interval`, `cpu-cfs-quota-enforced`, `cpu-manager-policy`, `cpu-manager-reconcile-period`, `cpu-manager-policy-options`, `topology-manager-scope`, `topology-manager-policy`, `pod-pids-limit`, `image-gc-high-threshold-percent`, `image-gc-low-threshold-percent`, `image-minimum-gc-age`, `image-maximum-gc-age`, `provider-id`, `log-level`, `credential-providers`, `server-certificate`, `server-key`, `shutdown-grace-period`, `shutdown-grace-period-for-critical-pods`, `memory-manager-reserved-memory`, `memory-manager-policy`, `reserved-cpus`, `memory-swap-behavior`, `hostname-override-source`, `seccomp-default`, `device-ownership-from-security-context`, `single-process-oom-kill`, `static-pods-enabled`, `max-pods`, `cluster-dns-ip`, `cluster-domain`, `node-ip`, `pod-infra-container-image`, `hostname-override`, `ids-per-pod`, `max-parallel-image-pulls` at line 1 column 118

bash-5.2# updog check-update -a --json
[
  {
    "variant": "aws-k8s-1.35",
    "arch": "x86_64",
    "version": "1.57.0",
    "max_version": "1.57.0",
    "waves": {
      "0": "2026-03-09T23:16:35.592575499Z",
      "20": "2026-03-10T02:16:35.592575499Z",
      "102": "2026-03-10T22:16:35.592575499Z",
      "307": "2026-03-11T22:16:35.592575499Z",
      "819": "2026-03-13T22:16:35.592575499Z",
      "1228": "2026-03-14T22:16:35.592575499Z",
      "1843": "2026-03-15T22:16:35.592575499Z"
    },
    "images": {
      "boot": "bottlerocket-aws-k8s-1.35-x86_64-1.57.0-54e01036-boot.ext4.lz4",
      "root": "bottlerocket-aws-k8s-1.35-x86_64-1.57.0-54e01036-root.ext4.lz4",
      "hash": "bottlerocket-aws-k8s-1.35-x86_64-1.57.0-54e01036-root.verity.lz4"
    }
  }
]
  2. After upgrading to v1.57.0
[ssm-user@control]$ apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "54e01036",
    "pretty_name": "Bottlerocket OS 1.57.0 (aws-k8s-1.35)",
    "variant_id": "aws-k8s-1.35",
    "version_id": "1.57.0"
  }
}
[ssm-user@control]$ apiclient set \
  kubernetes.cpu-manager-policy=static \
  kubernetes.topology-manager-policy="best-effort" \
  kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes="true"
[ssm-user@control]$ apiclient get settings.kubernetes
{
  "settings": {
    "kubernetes": {
      "authentication-mode": "aws",
      "cloud-provider": "external",
      "cluster-dns-ip": "10.100.0.10",
      "cluster-domain": "cluster.local",
      "cpu-manager-policy": "static",
      "credential-providers": {
        "ecr-credential-provider": {
          "cache-duration": "12h",
          "enabled": true,
          "image-patterns": [
            "*.dkr.ecr.*.amazonaws.com",
            "*.dkr.ecr.*.amazonaws.com.cn",
            "*.dkr.ecr.*.amazonaws.eu",
            "*.dkr-ecr.*.on.aws",
            "*.dkr-ecr.*.on.amazonwebservices.com.cn",
            "*.dkr.ecr-fips.*.amazonaws.com",
            "*.dkr.ecr-fips.*.amazonaws.eu",
            "*.dkr.ecr.*.cloud.adc-e.uk",
            "*.dkr.ecr-fips.*.cloud.adc-e.uk",
            "*.dkr.ecr.*.c2s.ic.gov",
            "*.dkr.ecr-fips.*.c2s.ic.gov",
            "*.dkr.ecr.*.sc2s.sgov.gov",
            "*.dkr.ecr-fips.*.sc2s.sgov.gov",
            "*.dkr.ecr.*.csp.hci.ic.gov",
            "*.dkr.ecr-fips.*.csp.hci.ic.gov",
            "public.ecr.aws"
          ]
        }
      },
      "device-ownership-from-security-context": true,
      "hostname-override": "ip-172-31-10-220.us-west-2.compute.internal",
      "hostname-override-source": "private-dns-name",
      "max-pods": 29,
      "node-ip": "172.31.10.220",
      "provider-id": "aws:///us-west-2c/i-0fce061d5c684b9a8",
      "seccomp-default": true,
      "server-tls-bootstrap": true,
      "shutdown-grace-period": "150s",
      "shutdown-grace-period-for-critical-pods": "30s",
      "standalone-mode": false,
      "topology-manager-policy": "best-effort",
      "topology-manager-policy-options": {
        "prefer-closest-numa-nodes": true
      }
    }
  }
}

bash-5.2# cat /etc/kubernetes/kubelet/config
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: "/etc/kubernetes/pki/ca.crt"
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
clusterDomain: cluster.local
clusterDNS:
- 10.100.0.10
kubeReserved:
  cpu: "70m"
  memory: "574Mi"
  ephemeral-storage: "1Gi"
kubeReservedCgroup: "/runtime"
cpuCFSQuota: true
cpuManagerPolicy: static
topologyManagerPolicy: best-effort
topologyManagerPolicyOptions:
  prefer-closest-numa-nodes: "true"
podPidsLimit: 1048576
providerID: aws:///us-west-2c/i-0fce061d5c684b9a8
resolvConf: "/run/netdog/resolv.conf"
hairpinMode: hairpin-veth
readOnlyPort: 0
cgroupDriver: systemd
cgroupRoot: "/"
runtimeRequestTimeout: 15m
protectKernelDefaults: true
serializeImagePulls: false
seccompDefault: true
serverTLSBootstrap: true
tlsCipherSuites:
- TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
- TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
volumePluginDir: "/var/lib/kubelet/plugins/volume/exec"
maxPods: 29
staticPodPath: "/etc/kubernetes/static-pods/"
shutdownGracePeriod: 150s
shutdownGracePeriodCriticalPods: 30s
failSwapOn: false
failCgroupV1: false
featureGates:
  DynamicResourceAllocation: true
  MutableCSINodeAllocatableCount: true
  3. After downgrading back to v1.56.0
[ssm-user@control]$ apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "36151b8b",
    "pretty_name": "Bottlerocket OS 1.56.0 (aws-k8s-1.35)",
    "variant_id": "aws-k8s-1.35",
    "version_id": "1.56.0"
  }
}
[ssm-user@control]$ apiclient get settings.kubernetes
{
  "settings": {
    "kubernetes": {
      "authentication-mode": "aws",
      "cloud-provider": "external",
      "cluster-dns-ip": "10.100.0.10",
      "cluster-domain": "cluster.local",
      "cpu-manager-policy": "static",
      "credential-providers": {
        "ecr-credential-provider": {
          "cache-duration": "12h",
          "enabled": true,
          "image-patterns": [
            "*.dkr.ecr.*.amazonaws.com",
            "*.dkr.ecr.*.amazonaws.com.cn",
            "*.dkr.ecr.*.amazonaws.eu",
            "*.dkr-ecr.*.on.aws",
            "*.dkr-ecr.*.on.amazonwebservices.com.cn",
            "*.dkr.ecr-fips.*.amazonaws.com",
            "*.dkr.ecr-fips.*.amazonaws.eu",
            "*.dkr.ecr.*.cloud.adc-e.uk",
            "*.dkr.ecr-fips.*.cloud.adc-e.uk",
            "*.dkr.ecr.*.c2s.ic.gov",
            "*.dkr.ecr-fips.*.c2s.ic.gov",
            "*.dkr.ecr.*.sc2s.sgov.gov",
            "*.dkr.ecr-fips.*.sc2s.sgov.gov",
            "*.dkr.ecr.*.csp.hci.ic.gov",
            "*.dkr.ecr-fips.*.csp.hci.ic.gov",
            "public.ecr.aws"
          ]
        }
      },
      "device-ownership-from-security-context": true,
      "hostname-override": "ip-172-31-10-220.us-west-2.compute.internal",
      "hostname-override-source": "private-dns-name",
      "max-pods": 29,
      "node-ip": "172.31.10.220",
      "provider-id": "aws:///us-west-2c/i-0fce061d5c684b9a8",
      "seccomp-default": true,
      "server-tls-bootstrap": true,
      "shutdown-grace-period": "150s",
      "shutdown-grace-period-for-critical-pods": "30s",
      "standalone-mode": false,
      "topology-manager-policy": "best-effort"
    }
  }
}

bash-5.2# cat /etc/kubernetes/kubelet/config
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: "/etc/kubernetes/pki/ca.crt"
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
clusterDomain: cluster.local
clusterDNS:
- 10.100.0.10
kubeReserved:
  cpu: "70m"
  memory: "574Mi"
  ephemeral-storage: "1Gi"
kubeReservedCgroup: "/runtime"
cpuCFSQuota: true
cpuManagerPolicy: static
topologyManagerPolicy: best-effort
podPidsLimit: 1048576
providerID: aws:///us-west-2c/i-0fce061d5c684b9a8
resolvConf: "/run/netdog/resolv.conf"
hairpinMode: hairpin-veth
readOnlyPort: 0
cgroupDriver: systemd
cgroupRoot: "/"
runtimeRequestTimeout: 15m
protectKernelDefaults: true
serializeImagePulls: false
seccompDefault: true
serverTLSBootstrap: true
tlsCipherSuites:
- TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
- TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
volumePluginDir: "/var/lib/kubelet/plugins/volume/exec"
maxPods: 29
staticPodPath: "/etc/kubernetes/static-pods/"
shutdownGracePeriod: 150s
shutdownGracePeriodCriticalPods: 30s
failSwapOn: false
failCgroupV1: false
featureGates:
  DynamicResourceAllocation: true
  MutableCSINodeAllocatableCount: true
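
The manual comparison of the two kubelet config dumps above could be scripted. The following is a minimal sketch, not part of this PR: `policy_options` is a hypothetical helper that scans the text of `/etc/kubernetes/kubelet/config` for the `topologyManagerPolicyOptions` mapping, and the two sample strings are short excerpts of the dumps shown above.

```python
def policy_options(config_text: str) -> dict[str, str]:
    """Return the topologyManagerPolicyOptions mapping found in a kubelet config dump.

    Hypothetical helper: a line scan that is good enough for the flat layout
    shown in this PR, not a general YAML parser.
    """
    options: dict[str, str] = {}
    in_block = False
    for line in config_text.splitlines():
        if line.startswith("topologyManagerPolicyOptions:"):
            in_block = True
            continue
        if in_block:
            # Entries of the mapping are indented; the first unindented line ends it.
            if line.startswith(" ") and ":" in line:
                key, _, value = line.strip().partition(":")
                options[key.strip()] = value.strip().strip('"')
            else:
                in_block = False
    return options


# Excerpt of the config after upgrading to v1.57.0.
after_upgrade = """\
cpuManagerPolicy: static
topologyManagerPolicy: best-effort
topologyManagerPolicyOptions:
  prefer-closest-numa-nodes: "true"
podPidsLimit: 1048576
"""

# Excerpt of the config after downgrading back to v1.56.0.
after_downgrade = """\
cpuManagerPolicy: static
topologyManagerPolicy: best-effort
podPidsLimit: 1048576
"""

print(policy_options(after_upgrade))
print(policy_options(after_downgrade))
```

An empty result for the downgraded config matches what the dump above shows: the option block is gone once the node is back on v1.56.0.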

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

Comment thread sources/Cargo.toml Outdated
Comment thread Twoliter.lock Outdated
piyush-jena marked this pull request as draft on March 23, 2026 21:57
piyush-jena force-pushed the add-k8s-settings branch 4 times, most recently from a819e70 to 5ecbe8e, on April 9, 2026 22:30
piyush-jena marked this pull request as ready for review on April 9, 2026 22:31
piyush-jena force-pushed the add-k8s-settings branch 4 times, most recently from 7d6a6ec to ced4869, on April 10, 2026 02:00
Comment thread sources/Cargo.toml
Signed-off-by: Piyush Jena <jepiyush@amazon.com>
Add AddSettingsMigration for:
- settings.kubernetes.topology-manager-policy-options
- settings.kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes
- settings.kubernetes.topology-manager-policy-options.max-allowable-numa-nodes

Signed-off-by: Piyush Jena <jepiyush@amazon.com>
Signed-off-by: Piyush Jena <jepiyush@amazon.com>
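
The round trip in the migration testing suggests what an add-settings migration has to do: nothing on upgrade, and removal of the new keys on downgrade so the older data model does not see fields it cannot deserialize. Below is a minimal sketch of that idea in Python. It assumes a flat key-to-value view of the datastore; `migrate_forward` and `migrate_backward` are hypothetical names, and the real migration is implemented in Bottlerocket's own migration tooling, which is not reproduced here.

```python
# Keys listed in the commit message above.
NEW_SETTINGS = [
    "settings.kubernetes.topology-manager-policy-options",
    "settings.kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes",
    "settings.kubernetes.topology-manager-policy-options.max-allowable-numa-nodes",
]


def migrate_forward(datastore: dict) -> dict:
    """Upgrade: the older version cannot have stored the new keys, so copy as-is."""
    return dict(datastore)


def migrate_backward(datastore: dict, new_settings=NEW_SETTINGS) -> dict:
    """Downgrade: drop every key the older data model does not know about."""
    return {key: value for key, value in datastore.items() if key not in new_settings}


# Assumed datastore contents, modeled on the settings set during testing.
datastore = {
    "settings.kubernetes.topology-manager-policy": "best-effort",
    "settings.kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes": True,
}

downgraded = migrate_backward(datastore)
print(downgraded)
```

In this sketch the downgraded datastore keeps `topology-manager-policy` and loses the policy option, which is the same shape as the `apiclient get settings.kubernetes` output after the downgrade to v1.56.0.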
piyush-jena marked this pull request as draft on April 16, 2026 21:14
piyush-jena commented

This PR is pending a bottlerocket-core-kit release carrying this change: bottlerocket-os/bottlerocket-core-kit#901

ytsssun commented Apr 20, 2026

Q: from the core-kit change, it seems the settings are a no-op on versions that do not support them. Can you add test results confirming that the settings do not get rendered there?

piyush-jena commented Apr 21, 2026

@ytsssun Yes, I should have attached results in the core-kit PR. Here they are:

bash-5.2# apiclient set settings.kubernetes.topology-manager-policy-options.max-allowable-numa-nodes="8"
bash-5.2# apiclient set settings.kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes=true
bash-5.2# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "ec6cc83b-dirty",
    "pretty_name": "Bottlerocket OS 1.60.0 (aws-k8s-1.30)",
    "variant_id": "aws-k8s-1.30",
    "version_id": "1.60.0"
  }
}
bash-5.2# cat /etc/kubernetes/kubelet/config
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: "/etc/kubernetes/pki/ca.crt"
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
clusterDomain: cluster.local
clusterDNS:
- 10.100.0.10
kubeReserved:
  cpu: "80m"
  memory: "893Mi"
  ephemeral-storage: "1Gi"
kubeReservedCgroup: "/runtime"
cpuCFSQuota: true
cpuManagerPolicy: none
podPidsLimit: 1048576
providerID: aws:///us-west-2c/i-0ea28b43f4df4dfd2
resolvConf: "/run/netdog/resolv.conf"
hairpinMode: hairpin-veth
readOnlyPort: 0
cgroupDriver: systemd
cgroupRoot: "/"
runtimeRequestTimeout: 15m
protectKernelDefaults: true
serializeImagePulls: false
seccompDefault: false
serverTLSBootstrap: true
tlsCipherSuites:
- TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
- TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
volumePluginDir: "/var/lib/kubelet/plugins/volume/exec"
maxPods: 58
staticPodPath: "/etc/kubernetes/static-pods/"
failSwapOn: false

Neither setting is rendered in the 1.30 variant.

[root@admin]# sheltie
bash-5.2# apiclient set settings.kubernetes.topology-manager-policy-options.max-allowable-numa-nodes="8"
bash-5.2# apiclient set settings.kubernetes.topology-manager-policy-options.prefer-closest-numa-nodes=true
bash-5.2# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "ec6cc83b-dirty",
    "pretty_name": "Bottlerocket OS 1.60.0 (aws-k8s-1.34)",
    "variant_id": "aws-k8s-1.34",
    "version_id": "1.60.0"
  }
}
bash-5.2# cat /etc/kubernetes/kubelet/config
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: "/etc/kubernetes/pki/ca.crt"
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
clusterDomain: cluster.local
clusterDNS:
- 10.100.0.10
kubeReserved:
  cpu: "80m"
  memory: "893Mi"
  ephemeral-storage: "1Gi"
kubeReservedCgroup: "/runtime"
cpuCFSQuota: true
cpuManagerPolicy: none
topologyManagerPolicyOptions:
  prefer-closest-numa-nodes: "true"
podPidsLimit: 1048576
providerID: aws:///us-west-2b/i-02a8b38b5fbb5f05f
resolvConf: "/run/netdog/resolv.conf"
hairpinMode: hairpin-veth
readOnlyPort: 0
cgroupDriver: systemd
cgroupRoot: "/"
runtimeRequestTimeout: 15m
protectKernelDefaults: true
serializeImagePulls: false
seccompDefault: true
serverTLSBootstrap: true
tlsCipherSuites:
- TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
- TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
- TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305
- TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
volumePluginDir: "/var/lib/kubelet/plugins/volume/exec"
maxPods: 58
staticPodPath: "/etc/kubernetes/static-pods/"
shutdownGracePeriod: 150s
shutdownGracePeriodCriticalPods: 30s
failSwapOn: false
featureGates:
  DynamicResourceAllocation: true
  MutableCSINodeAllocatableCount: true

In the 1.34 variant, only one of the two settings, prefer-closest-numa-nodes, is rendered.
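
The gating seen in the two runs above can be summarized in a few lines. This is a sketch under stated assumptions, not the core-kit template itself: the minimum versions come from the PR description (1.32 and 1.35), `rendered_options` is a hypothetical name, and the 1.35 result is what those thresholds imply rather than something shown in the test output.

```python
# Minimum Kubernetes minor version per option, taken from the PR description.
MIN_MINOR = {
    "prefer-closest-numa-nodes": 32,
    "max-allowable-numa-nodes": 35,
}


def as_kubelet_string(value) -> str:
    """The kubelet config dumps above show option values as strings, e.g. "true"."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def rendered_options(variant_id: str, requested: dict) -> dict:
    """Return the options that would be rendered for a variant such as aws-k8s-1.34.

    Assumes the variant ID ends in "1.<minor>".
    """
    minor = int(variant_id.rsplit(".", 1)[1])
    return {
        name: as_kubelet_string(value)
        for name, value in requested.items()
        if name in MIN_MINOR and minor >= MIN_MINOR[name]
    }


# The two settings applied with apiclient in the runs above.
requested = {"max-allowable-numa-nodes": "8", "prefer-closest-numa-nodes": True}

for variant in ("aws-k8s-1.30", "aws-k8s-1.34", "aws-k8s-1.35"):
    print(variant, rendered_options(variant, requested))
```

For `aws-k8s-1.30` the result is empty and for `aws-k8s-1.34` it contains only `prefer-closest-numa-nodes`, matching the two kubelet configs above.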

ytsssun left a comment

LGTM

piyush-jena marked this pull request as ready for review on April 21, 2026 23:58
piyush-jena merged commit e2cf5d1 into bottlerocket-os:develop on Apr 21, 2026
2 checks passed