Open
Labels
kind/bug, needs-priority, needs-triage
Description
/kind bug
AWSManagedMachinePool becomes unresponsive with a bad bootstrap (e.g. a failed network call to AWS EKS, or an invalid command that causes a cloud-init failure on AL2).
How to reproduce:
- Create/Update MachinePool, AWSManagedMachinePool, and oneof EKSConfig with an invalid bootstrap/pre-bootstrap command, e.g. `./bin/exec-something-does-not-exist.exeonalinuxsurelydoesnotwork`
- Observe the event on AWSManagedMachinePool:
  `EKSNodegroupReconciliationFailed: failed to wait for nodegroup to be active: failed to wait for EKS nodegroup "eks-pool": request cancelled while waiting, context canceled`
Reconciliation is stuck waiting for the EKS node group to leave the "updating" phase. It never will, because the update cannot complete: new nodes never successfully join to replace the existing ones. Recovering requires running manual commands, since the invalid bootstrap can no longer be fixed through CAPA resources.
- Cluster-api-provider-aws version: v2.9.1
- Kubernetes version (use `kubectl version`): 1.30+
- OS (e.g. from `/etc/os-release`): AL2