-
Notifications
You must be signed in to change notification settings - Fork 635
📖 KEP: adds nodedm proposal #5678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
+302
−0
Merged
Changes from 1 commit
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,311 @@ | ||
--- | ||
title: Proposal EKS Support in CAPA for nodeadm | ||
authors: | ||
- "@faiq" | ||
reviewers: | ||
creation-date: 2025-09-22 | ||
last-updated: 2025-09-22 | ||
status: proposed | ||
see-also: | ||
- https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/3518 | ||
replaces: [] | ||
superseded-by: [] | ||
--- | ||
|
||
|
||
## Table of Contents | ||
- [Summary](#summary) | ||
- [Motivation](#motivation) | ||
- [Goals](#goals) | ||
- [Non-Goals](#non-goals) | ||
- [Proposal](#proposal) | ||
- [User Stories](#user-stories) | ||
- [Alternatives](#alternatives) | ||
|
||
|
||
## Summary | ||
|
||
Currently, EKS support in the Cluster API Provider for AWS (CAPA) is broken for Amazon Linux 2023 (AL2023) because the `bootstrap.sh` script is no longer supported. This proposal introduces a new Custom Resource Definition (CRD), `NodeadmConfig`, to handle the new `nodeadm` bootstrapping method required by AL2023. This approach is favored over modifying the existing `EKSConfig` type to maintain a cleaner API, avoid fields that are not cross-compatible between bootstrapping methods, and simplify the future deprecation of the `bootstrap.sh` implementation. | ||
|
||
----- | ||
|
||
## Motivation | ||
|
||
Currently EKS support in CAPA is broken for AL2023 (Amazon Linux 2023) because the bootstrapping method that was previously being used to provision EKS nodes is no longer supported `bootstrap.sh`. Users who are using AL2023 see errors like this on the worker nodes: | ||
|
||
```bash | ||
[root@localhost bin]# /etc/eks/bootstrap.sh default_dk-eks-133-control-plane | ||
|
||
|
||
\!\!\!\!\!\!\!\!\!\! | ||
\!\!\!\!\!\!\!\!\!\! ERROR: bootstrap.sh has been removed from AL2023-based EKS AMIs. | ||
\!\!\!\!\!\!\!\!\!\! | ||
\!\!\!\!\!\!\!\!\!\! EKS nodes are now initialized by nodeadm. | ||
\!\!\!\!\!\!\!\!\!\! | ||
\!\!\!\!\!\!\!\!\!\! To migrate your user data, see: | ||
\!\!\!\!\!\!\!\!\!\! | ||
\!\!\!\!\!\!\!\!\!\! https://awslabs.github.io/amazon-eks-ami/nodeadm/ | ||
\!\!\!\!\!\!\!\!\!\! | ||
|
||
```` | ||
|
||
In CAPA our implementation of the EKS bootstrapping method is currently tied to the `bootstrap.sh` script and is implemented by the `EKSConfig` type. | ||
|
||
Additionally, the EKS team is not publishing any more AmazonLinux (AL2) AMIs after November 26th, 2025, and Kubernetes version 1.32 is the last version for which AL2 AMIs will be released. This makes the transition to a new bootstrapping method for AL2023 urgent. | ||
|
||
### Goals | ||
|
||
* Restore the ability to provision EKS nodes using CAPA with AL2023 AMIs. | ||
* Introduce a new, clean API (`NodeadmConfig`) specifically for the `nodeadm` bootstrap method. | ||
* Provide a clear upgrade path for users moving from `EKSConfig` (`bootstrap.sh`) to `NodeadmConfig` (`nodeadm`). | ||
* Make future deprecation of the `bootstrap.sh` implementation in `EKSConfig` easier. | ||
|
||
### Non-Goals | ||
* Create a metatype that can handle both bootstrap.sh and nodeadm. | ||
* Handle Operating Systems with different bootstrapping mechanisms like bottlerocket. | ||
----- | ||
|
||
## Proposal | ||
|
||
This KEP proposes a new type that handles bootstrapping with `nodeadm` alone. This new type, `NodeadmConfig`, will wrap the API implementation for the Nodeadm option as a bootstrap provider. | ||
|
||
This approach is proposed due to drawbacks with the alternative of modifying the existing `EKSConfig` type, which would involve the introduction of new fields to distinguish between bootstrap methods and lead to a confusing API where some fields are only valid for one method. | ||
|
||
Examples of fields in the existing API that are no longer valid with `nodeadm`: | ||
|
||
* `ContainerRuntime` | ||
* `DNSClusterIP` | ||
* `DockerConfigJSON` | ||
* `APIRetryAttempts` | ||
* `PostBootstrapCommands` | ||
* `BootstrapCommandOverride` | ||
|
||
The **pros** of this approach are: | ||
|
||
* A cleaner API that’s more descriptive for each bootstrap method. | ||
* A new implementation will make deprecating EKSConfig’s `bootstrap.sh` implementation easier. | ||
|
||
The **cons** are: | ||
|
||
* The scope of work to support EKS nodes grows significantly and is pushed out. | ||
* Users need to know that setting `AWSMachine.Spec.Ami.EKSLookupType` to `AmazonLinux` won’t work with the new `nodeadm` bootstrap method. | ||
|
||
### User Stories | ||
|
||
* As a cluster admin I need to provision nodes to my EKS cluster using Kubernetes 1.33 or higher | ||
* As a cluster admin I need to provision EKS worker nodes using the latest AL2023 AMIs | ||
* As a cluster admin I need to upgrade my existing EKS cluster nodes from an AL2-based version (e.g., 1.32) to an AL2023-based version (e.g., 1.33) with minimal disruption. | ||
|
||
### API Design | ||
|
||
On a high level this new type `NodeadmConfig` wraps the API implementation for the Nodeadm option as a bootstrap provider. | ||
|
||
```go | ||
// NodeadmConfigSpec defines the desired state of NodeadmConfig. | ||
type NodeadmConfigSpec struct { | ||
// Kubelet contains options for kubelet. | ||
// +optional | ||
Kubelet *KubeletOptions `json:"kubelet,omitempty"` | ||
|
||
// Containerd contains options for containerd. | ||
// +optional | ||
Containerd *ContainerdOptions `json:"containerd,omitempty"` | ||
|
||
// Instance contains options for the node's operating system and devices. | ||
// +optional | ||
Instance *InstanceOptions `json:"instance,omitempty"` | ||
|
||
// FeatureGates holds key-value pairs to enable or disable application features. | ||
// +optional | ||
FeatureGates map[Feature]bool `json:"featureGates,omitempty"` | ||
|
||
// PreBootstrapCommands specifies extra commands to run before bootstrapping nodes. | ||
// +optional | ||
PreBootstrapCommands []string `json:"preBootstrapCommands,omitempty"` | ||
|
||
// Files specifies extra files to be passed to user_data upon creation. | ||
// +optional | ||
Files []File `json:"files,omitempty"` | ||
|
||
// Users specifies extra users to add. | ||
// +optional | ||
Users []User `json:"users,omitempty"` | ||
|
||
// NTP specifies NTP configuration. | ||
// +optional | ||
NTP *NTP `json:"ntp,omitempty"` | ||
|
||
// DiskSetup specifies options for the creation of partition tables and file systems on devices. | ||
// +optional | ||
DiskSetup *DiskSetup `json:"diskSetup,omitempty"` | ||
|
||
// Mounts specifies a list of mount points to be setup. | ||
// +optional | ||
Mounts []MountPoints `json:"mounts,omitempty"` | ||
} | ||
``` | ||
|
||
----- | ||
|
||
## Design Details | ||
|
||
### Upgrade Strategy | ||
|
||
A valid concern that CAPA users will have is upgrading existing clusters to machines that use the new bootstrap `Nodeadm` CRD. This is largely assuaged by CAPI's immutable design where new machines will be deployed by referencing new templates. In summary, upgrading should not be a problem. | ||
faiq marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
||
#### MachineDeployment Upgrade Example | ||
|
||
A user with a `MachineDeployment` using `EKSConfig` for Kubernetes v1.32 would upgrade to v1.33 by creating a new `NodeadmConfigTemplate` and updating the `MachineDeployment` to reference it and the new Kubernetes version. New machines are rolled out according to the `MachineDeployment` update strategy. | ||
|
||
**Before (v1.32 with `EKSConfigTemplate`):** | ||
|
||
```yaml | ||
apiVersion: bootstrap.cluster.x-k8s.io/v1beta2 | ||
kind: EKSConfigTemplate | ||
metadata: | ||
name: default132 | ||
spec: | ||
template: | ||
spec: | ||
postBootstrapCommands: | ||
- "echo \"bye world\"" | ||
--- | ||
apiVersion: cluster.x-k8s.io/v1beta1 | ||
kind: MachineDeployment | ||
metadata: | ||
name: default | ||
spec: | ||
clusterName: default | ||
template: | ||
spec: | ||
bootstrap: | ||
configRef: | ||
apiVersion: bootstrap.cluster.x-k8s.io/v1beta2 | ||
kind: EKSConfigTemplate | ||
name: default132 | ||
infrastructureRef: | ||
kind: AWSMachineTemplate | ||
name: default132 | ||
version: v1.32.0 | ||
```` | ||
|
||
**After (v1.33 with `NodeadmConfigTemplate`):** | ||
|
||
```yaml | ||
apiVersion: bootstrap.cluster.x-k8s.io/v1beta2 | ||
kind: NodeadmConfigTemplate | ||
metadata: | ||
name: default | ||
spec: | ||
template: | ||
spec: | ||
preBootstrapCommands: | ||
- "echo \"hello world\"" | ||
--- | ||
apiVersion: cluster.x-k8s.io/v1beta1 | ||
kind: MachineDeployment | ||
metadata: | ||
name: default | ||
spec: | ||
clusterName: default | ||
template: | ||
spec: | ||
bootstrap: | ||
configRef: | ||
apiVersion: bootstrap.cluster.x-k8s.io/v1beta2 | ||
kind: NodeadmConfigTemplate | ||
name: default | ||
infrastructureRef: | ||
kind: AWSMachineTemplate | ||
name: default | ||
version: v1.33.0 | ||
``` | ||
|
||
#### MachinePool Upgrade Example | ||
|
||
The flow would be very similar for `MachinePools`. A user would update the `MachinePool` resource to reference a new `NodeadmConfigTemplate` and the target Kubernetes version. | ||
|
||
**Before (v1.32 with `EKSConfigTemplate`):** | ||
|
||
```yaml | ||
apiVersion: bootstrap.cluster.x-k8s.io/v1beta2 | ||
kind: EKSConfigTemplate | ||
metadata: | ||
name: default-132 | ||
spec: | ||
template: | ||
spec: {} | ||
--- | ||
apiVersion: cluster.x-k8s.io/v1beta1 | ||
kind: MachinePool | ||
metadata: | ||
name: default | ||
spec: | ||
clusterName: default | ||
template: | ||
spec: | ||
bootstrap: | ||
configRef: | ||
apiVersion: bootstrap.cluster.x-k8s.io/v1beta2 | ||
kind: EKSConfigTemplate | ||
name: default-132 | ||
infrastructureRef: | ||
kind: AWSMachinePool | ||
name: default | ||
version: v1.32.0 | ||
``` | ||
|
||
**After (v1.33 with `NodeadmConfigTemplate`):** | ||
|
||
```yaml | ||
apiVersion: bootstrap.cluster.x-k8s.io/v1beta2 | ||
kind: NodeadmConfigTemplate | ||
metadata: | ||
name: default-133 | ||
spec: | ||
template: | ||
spec: | ||
preBootstrapCommands: | ||
- "echo \"hello from v1.33.0\"" | ||
--- | ||
apiVersion: cluster.x-k8s.io/v1beta1 | ||
kind: MachinePool | ||
metadata: | ||
name: default | ||
spec: | ||
clusterName: default | ||
template: | ||
spec: | ||
bootstrap: | ||
configRef: | ||
apiVersion: bootstrap.cluster.x-k8s.io/v1beta2 | ||
kind: NodeadmConfigTemplate | ||
name: default-133 | ||
infrastructureRef: | ||
kind: AWSMachinePool | ||
name: default | ||
version: v1.33.0 | ||
``` | ||
|
||
### Test Plan | ||
|
||
* Unit tests for the new code. | ||
* Integration tests for new Nodeadm Controller. | ||
* E2e tests exercising the migration from EKSConfig to NodeadmConfig, | ||
|
||
|
||
## Drawbacks | ||
|
||
* The scope of work to support EKS nodes grows significantly and is pushed out. | ||
* Users need to know that setting `AWSMachine.Spec.Ami.EKSLookupType` to `AmazonLinux` won’t work with the new `nodeadm` bootstrap method. | ||
faiq marked this conversation as resolved.
Outdated
Show resolved
Hide resolved
|
||
|
||
----- | ||
|
||
## Alternatives | ||
|
||
The primary alternative considered was to modify the existing `EKSConfig` type to support `nodeadm`. Currently, there’s work being done upstream to address this gap. On a high level, [this PR](https://github.com/kubernetes-sigs/cluster-api-provider-aws/pull/5553) is adding a new bootstrapping implementation to the existing `EKSConfig` type with some additional API fields to distinguish between bootstrap methods. | ||
|
||
However, there are some drawbacks with this implementation regarding the API design: | ||
|
||
* **Introduction of new fields to distinguish between bootstrap methods**: This complicates the API. | ||
* **Fields that are valid for `bootstrap.sh` are not valid for `nodeadm` and vice versa**: This would lead to a confusing user experience where users could set fields that have no effect for their chosen bootstrap method. |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.