Description
/kind bug
What steps did you take and what happened:
Updating the MachinePool, AWSMachinePool or KubeadmConfig resources does not trigger an instanceRefresh on the AWS ASG.
I expect that, with awsmachinepool.spec.refreshPreferences.disable left at its default value or set to false, changes to the MachinePool, AWSMachinePool, and KubeadmConfig would automatically trigger an instance refresh that rotates the nodes in the pool onto the updated settings. Currently, I must start instance refreshes manually through the AWS UI or CLI for instances to be replaced when my specs change.
What did you expect to happen:
These are the MachinePool, AWSMachinePool, and KubeadmConfig I'm working with.
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  annotations:
    cluster.x-k8s.io/replicas-managed-by: external-autoscaler
  name: worker-private-efficient
spec:
  clusterName: awscmhdev2
  replicas: 2
  template:
    metadata: {}
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfig
          name: worker-private-efficient
      clusterName: awscmhdev2
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: AWSMachinePool
        name: worker-private-efficient
      version: v1.24.7
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSMachinePool
metadata:
  annotations:
    cluster-api-provider-aws: "true"
  name: worker-private-efficient
spec:
  additionalTags:
    k8s.io/cluster-autoscaler/awscmhdev2: owned
    k8s.io/cluster-autoscaler/enabled: "true"
  awsLaunchTemplate:
    additionalSecurityGroups:
    - filters:
      - name: tag:Name
        values:
        - capi-hostports
      - name: tag:network-zone
        values:
        - qa
      - name: tag:region
        values:
        - us-east-2
    ami:
      id: ami-0f6b6efcd422b9d85
    iamInstanceProfile: nodes.cluster-api-provider-aws.sigs.k8s.io
    instanceType: c5a.2xlarge
    rootVolume:
      deviceName: /dev/sda1
      encrypted: true
      iops: 16000
      size: 100
      throughput: 1000
      type: gp3
    sshKeyName: ""
  capacityRebalance: true
  defaultCoolDown: 5m0s
  maxSize: 5
  minSize: 0
  mixedInstancesPolicy:
    overrides:
    - instanceType: c5a.2xlarge
    - instanceType: m5a.2xlarge
  subnets:
  - filters:
    - name: availability-zone-id
      values:
      - use2-az1
    - name: tag:type
      values:
      - nodes
  - filters:
    - name: availability-zone-id
      values:
      - use2-az2
    - name: tag:type
      values:
      - nodes
  - filters:
    - name: availability-zone-id
      values:
      - use2-az3
    - name: tag:type
      values:
      - nodes
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfig
metadata:
  name: worker-private-efficient
spec:
  files:
  - content: |
      vm.max_map_count=262144
    path: /etc/sysctl.d/90-vm-max-map-count.conf
  - content: |
      fs.inotify.max_user_instances=256
    path: /etc/sysctl.d/91-fs-inotify.conf
  format: cloud-config
  joinConfiguration:
    nodeRegistration:
      kubeletExtraArgs:
        cloud-provider: aws
        eviction-hard: memory.available<500Mi,nodefs.available<10%
        kube-reserved: cpu=500m,memory=2Gi,ephemeral-storage=1Gi
        node-labels: role.node.kubernetes.io/worker=true
        protect-kernel-defaults: "true"
        system-reserved: cpu=500m,memory=1Gi,ephemeral-storage=1Gi
      name: '{{ ds.meta_data.local_hostname }}'
  preKubeadmCommands:
  - sudo systemctl restart systemd-sysctl
I have not set disable: true in my refreshPreferences in the AWSMachinePool spec.
$ kubectl explain awsmachinepool.spec.refreshPreferences.disable
KIND: AWSMachinePool
VERSION: infrastructure.cluster.x-k8s.io/v1beta2
FIELD: disable <boolean>
DESCRIPTION:
Disable, if true, disables instance refresh from triggering when new launch
templates are detected. This is useful in scenarios where ASG nodes are
externally managed.
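For anyone reproducing this, the value on the live object can be checked with a small helper like the one below. This is only a sketch: the function name show_refresh_preferences is made up for illustration, and the "default" namespace is an assumption, since the manifests above do not set one.

```shell
# Hypothetical helper: print spec.refreshPreferences of an AWSMachinePool.
# Empty output means the field is unset, so the defaults apply.
# $1 = AWSMachinePool name, $2 = namespace (assumed to be "default" if omitted).
show_refresh_preferences() {
  kubectl get awsmachinepool "$1" --namespace "${2:-default}" \
    --output jsonpath='{.spec.refreshPreferences}'
}

# Example call (not executed here):
#   show_refresh_preferences worker-private-efficient
```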
This is the current state of the runcmd in the LaunchTemplate in AWS, version 1566.
runcmd:
- "sudo systemctl restart systemd-sysctl"
- kubeadm join --config /run/kubeadm/kubeadm-join-config.yaml && echo success > /run/cluster-api/bootstrap-success.complete
I apply a change to add a command to the KubeadmConfig, such as this.
preKubeadmCommands:
- sudo systemctl restart systemd-sysctl
- echo "hello world"
I see that the LaunchTemplate has a new version, then wait 10 minutes.
I notice that there is no active instance refresh started for my ASG in the instance refresh tab, and my instances are still using the old LaunchTemplate version.
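The same observations can be made from the AWS CLI instead of the console. The helpers below are a rough sketch: the function names are invented for illustration, and the ASG name passed in is a placeholder, since the actual ASG name is not shown in this report. The last function is the manual workaround mentioned earlier.

```shell
# Hypothetical helpers; pass the ASG name as $1 (placeholder, not taken from this report).

# List instance refreshes recorded for the ASG; an empty table means none were started.
list_instance_refreshes() {
  aws autoscaling describe-instance-refreshes \
    --auto-scaling-group-name "$1" \
    --query 'InstanceRefreshes[].[InstanceRefreshId,Status,StartTime,PercentageComplete]' \
    --output table
}

# Show the launch template version each instance in the ASG was launched from.
list_instance_template_versions() {
  aws autoscaling describe-auto-scaling-groups \
    --auto-scaling-group-names "$1" \
    --query 'AutoScalingGroups[0].Instances[].[InstanceId,LaunchTemplate.Version]' \
    --output table
}

# Manual workaround: start an instance refresh with the ASG's default preferences.
start_instance_refresh() {
  aws autoscaling start-instance-refresh --auto-scaling-group-name "$1"
}

# Example calls (not executed here):
#   list_instance_refreshes <asg-name>
#   list_instance_template_versions <asg-name>
```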
Environment:
- Cluster-api-provider-aws version: v2.0.2
- Kubernetes version: (use kubectl version):
$ kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.25.1
Kustomize Version: v4.5.7
Server Version: v1.24.7