
Machine with cloud-init 23.3.0 or newer fails to join cluster #4745

@dlipovetsky


/kind bug

What steps did you take and what happened:

I used https://github.com/kubernetes-sigs/image-builder/ to create an Ubuntu 20.04 AMI with the latest available cloud-init package, 23.3.3. The machine fails to join the cluster.

What did you expect to happen:

The machine should join the cluster.

Anything else you would like to add:

In #1490, CAPA began writing sensitive user-data to AWS Secrets Manager (#1924 added support for an alternative, the SSM Parameter Store). CAPA replaced the user-data produced by CABPK with a mechanism to fetch the user-data from the service. This mechanism relied on an "include" that would, by design, fail the first time cloud-init ran. CAPA relied on cloud-init ignoring the failure.

As of canonical/cloud-init#367, cloud-init stopped ignoring the failure by default, but introduced a feature flag that let cloud-init ignore the failure, as it had in the past. With the default settings, the cloud-init boot failed, so kubernetes-sigs/image-builder#406 used the feature flag as a workaround.

More recently, as of canonical/cloud-init#4228, the feature flag itself was removed. Without the feature flag, the existing workaround has no effect, and cloud-init boot fails.
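To check whether a given image is likely to hit this, one can compare its cloud-init version against the threshold in the issue title. The following is a minimal sketch, not part of CAPA or image-builder: the helper names `parse_version` and `is_affected` are hypothetical, and the `23.3.0` cutoff is taken from the title rather than verified against the cloud-init changelog.

```python
import re

# Assumption: first affected release, taken from the issue title.
FIRST_AFFECTED = (23, 3, 0)


def parse_version(text):
    """Return the first dotted version found in text as a 3-tuple of ints.

    A missing patch component is treated as 0, so "23.3" becomes (23, 3, 0).
    """
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"no version found in {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(text):
    """True if the cloud-init version in text is at or above the assumed cutoff."""
    return parse_version(text) >= FIRST_AFFECTED
```

The input can be a bare version such as `23.3.3`, or a longer string that contains one, for example the package version reported by the image's package manager.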

@supershal and I looked into this issue, and filed kubernetes-sigs/image-builder#1333. We finally understand the root cause.

Most CAPA-maintained AMIs were created with cloud-init 22.4.2, instead of the default cloud-init version.

Environment:

  • Cluster-api-provider-aws version: main
  • Kubernetes version: (use kubectl version): v1.27.8
  • OS (e.g. from /etc/os-release): Ubuntu 20.04


Labels

  • kind/bug: Categorizes issue or PR as related to a bug.
  • lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
  • needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
