Skip to content

Conversation

@faiq
Copy link
Contributor

@faiq faiq commented Oct 7, 2025

What type of PR is this?
/kind feature

What this PR does / why we need it:

This PR implements the nodeadm config type outlined by the KEP #5678

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Special notes for your reviewer:

Checklist:

  • squashed commits
  • includes documentation
  • includes emoji in title
  • adds unit tests
  • adds or updates e2e tests

Release note:

Implement nodeadm bootstrapping type

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. release-note Denotes a PR that will be considered when it comes time to generate release notes. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 7, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign neolit123 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested review from AndiDog and nrb October 7, 2025 20:30
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 7, 2025
@faiq
Copy link
Contributor Author

faiq commented Oct 7, 2025

testing with this manifest

apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta2
    kind: AWSManagedControlPlane
    name: default-control-plane
  infrastructureRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta2
    kind: AWSManagedControlPlane
    name: default-control-plane
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta2
kind: AWSManagedControlPlane
metadata:
  name: default-control-plane
spec:
  addons:
  - name: kube-proxy
    version: v1.32.0-eksbuild.2
  network:
    cni:
      cniIngressRules:
      - description: kube-proxy metrics
        fromPort: 10249
        protocol: tcp
        toPort: 10249
      - description: NVIDIA Data Center GPU Manager metrics
        fromPort: 9400
        protocol: tcp
        toPort: 9400
      - description: Prometheus node exporter metrics
        fromPort: 9100
        protocol: tcp
        toPort: 9100
  region: us-west-2
  sshKeyName: ""
  version: v1.33.0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSMachineTemplate
metadata:
  name: default
spec:
  template:
    spec:
      cloudInit:
        insecureSkipSecretsManager: true
      ami:
        eksLookupType: AmazonLinux2023
      instanceMetadataOptions:
        httpTokens: required
        httpPutResponseHopLimit: 2
      iamInstanceProfile: nodes.cluster-api-provider-aws.sigs.k8s.io
      instanceType: m5a.16xlarge
      rootVolume:
        size: 80
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta2
kind: NodeadmConfigTemplate
metadata:
  name: default
spec:
  template:
    spec: {}
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: default
spec:
  clusterName: default
  replicas: 1
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta2
          kind: NodeadmConfigTemplate
          name: default
      clusterName: default
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
        kind: AWSMachineTemplate
        name: default
      version: v1.33.0

@faiq
Copy link
Contributor Author

faiq commented Oct 7, 2025

/retest

2 similar comments
@faiq
Copy link
Contributor Author

faiq commented Oct 8, 2025

/retest

@faiq
Copy link
Contributor Author

faiq commented Oct 8, 2025

/retest

@faiq faiq force-pushed the faiq/nodeadm-upstream branch 2 times, most recently from 8f854bd to 81f3664 Compare October 8, 2025 20:43
@faiq
Copy link
Contributor Author

faiq commented Oct 8, 2025

/test ?

@k8s-ci-robot
Copy link
Contributor

@faiq: The following commands are available to trigger required jobs:

/test pull-cluster-api-provider-aws-build
/test pull-cluster-api-provider-aws-build-docker
/test pull-cluster-api-provider-aws-e2e-blocking
/test pull-cluster-api-provider-aws-test
/test pull-cluster-api-provider-aws-verify

The following commands are available to trigger optional jobs:

/test pull-cluster-api-provider-aws-apidiff-main
/test pull-cluster-api-provider-aws-e2e
/test pull-cluster-api-provider-aws-e2e-clusterclass
/test pull-cluster-api-provider-aws-e2e-conformance
/test pull-cluster-api-provider-aws-e2e-conformance-with-ci-artifacts
/test pull-cluster-api-provider-aws-e2e-eks
/test pull-cluster-api-provider-aws-e2e-eks-gc
/test pull-cluster-api-provider-aws-e2e-eks-testing

Use /test all to run the following jobs that were automatically triggered:

pull-cluster-api-provider-aws-apidiff-main
pull-cluster-api-provider-aws-build
pull-cluster-api-provider-aws-build-docker
pull-cluster-api-provider-aws-e2e-blocking
pull-cluster-api-provider-aws-test
pull-cluster-api-provider-aws-verify

In response to this:

/test ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@faiq
Copy link
Contributor Author

faiq commented Oct 8, 2025

/test /test pull-cluster-api-provider-aws-e2e-eks

@dsanders1234
Copy link

Does this work with AWSManagedMachinePool ?

@faiq faiq changed the title 🌱 feat: starts bootstrapping new type with kubebuilder 🌱 feat: implements nodeadm bootstrapping type Oct 9, 2025
@faiq faiq force-pushed the faiq/nodeadm-upstream branch from 81f3664 to 59ecae0 Compare October 9, 2025 19:01
@faiq
Copy link
Contributor Author

faiq commented Oct 9, 2025

@dsanders1234 try this

 apiVersion: cluster.x-k8s.io/v1beta1
 kind: MachinePool
 metadata:
   name: default
 spec:
   clusterName: default
   template:
     spec:
       bootstrap:
         #dataSecretName: ""
         configRef:
           apiVersion: bootstrap.cluster.x-k8s.io/v1beta2
           kind: NodeadmConfig
           name: default
       clusterName: default
       infrastructureRef:
         apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
         kind: AWSManagedMachinePool
         name: default
       version: v1.33.0
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta2
kind: AWSManagedMachinePool
metadata:
  name: default
spec:
  roleName: "nodes.cluster-api-provider-aws.sigs.k8s.io"
  scaling:
    minSize: 1
    maxSize: 3
  amiType: CUSTOM
  awsLaunchTemplate:
    ami:
      eksLookupType: AmazonLinux2023
    instanceMetadataOptions:
      httpTokens: required
      httpPutResponseHopLimit: 2
    instanceType: "m5a.16xlarge"
    rootVolume:
      size: 80
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta2
kind: NodeadmConfig
metadata:
  name: default
spec:
  kubelet:
    config:
      evictionHard:
        memory.available: "2000Mi"

@faiq
Copy link
Contributor Author

faiq commented Oct 9, 2025

/test pull-cluster-api-provider-aws-e2e-eks

@faiq faiq force-pushed the faiq/nodeadm-upstream branch 2 times, most recently from 10503ae to 476eb43 Compare October 9, 2025 20:20
@faiq
Copy link
Contributor Author

faiq commented Oct 9, 2025

/test pull-cluster-api-provider-aws-e2e-eks

@faiq
Copy link
Contributor Author

faiq commented Oct 9, 2025

/retest

3 similar comments
@faiq
Copy link
Contributor Author

faiq commented Oct 9, 2025

/retest

@faiq
Copy link
Contributor Author

faiq commented Oct 9, 2025

/retest

@faiq
Copy link
Contributor Author

faiq commented Oct 9, 2025

/retest

@faiq faiq force-pushed the faiq/nodeadm-upstream branch from 476eb43 to 234d905 Compare October 10, 2025 02:57
@faiq
Copy link
Contributor Author

faiq commented Nov 11, 2025

/test pull-cluster-api-provider-aws-e2e-eks

@matzegebbe
Copy link
Contributor

First of all, thanks for the work on this PR - I'm currently testing the build artifacts from this branch of @faiq in our environment.

During evaluation, I encountered RBAC-related errors such as:

nodeadmconfigs.bootstrap.cluster.x-k8s.io is forbidden: User "system:serviceaccount:capa-system:capa-controller-manager" cannot list resource "nodeadmconfigs" in API group "bootstrap.cluster.x-k8s.io" at the cluster scope

but we can of course fix that, along with a few other small stumbling blocks with the new CRDs and configuration.

While I'm working through these issues and completing our internal validation, I wanted to kindly ask whether there are any remaining blockers preventing this PR from being approved and merged.

EKS nodes images based on Amazon Linux 2 will go out of support on November 26 2025, and we are already running into additional problems with AL2 and the latest runc version (see: awslabs/amazon-eks-ami#2498
). In my opinion, ensuring that Cluster API and CAPA offer solid support for running fully supported machine images on AWS is critical for the project’s overall reliability and long-term relevance.

Thanks in advance for any update or clarification on the merge timeline ✌️

@faiq
Copy link
Contributor Author

faiq commented Nov 17, 2025

While I'm working through these issues and completing our internal validation, I wanted to kindly ask whether there are any remaining blockers preventing this PR from being approved and merged.

As far as I can tell, there are none other than the open source maintainers being busy both with their day jobs. If you can please push in the k8s slack for some reviews. Also, feel free to review this as well. All comments welcome.

@faiq faiq force-pushed the faiq/nodeadm-upstream branch from 9f4ef81 to 483bbb3 Compare November 20, 2025 21:26
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 21, 2025
@faiq faiq force-pushed the faiq/nodeadm-upstream branch from 483bbb3 to 439664b Compare November 26, 2025 17:08
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 26, 2025
@faiq
Copy link
Contributor Author

faiq commented Nov 26, 2025

/test pull-cluster-api-provider-aws-e2e-eks

@faiq
Copy link
Contributor Author

faiq commented Nov 26, 2025

/retest

@faiq faiq force-pushed the faiq/nodeadm-upstream branch from 439664b to b685eea Compare November 26, 2025 17:42
@faiq
Copy link
Contributor Author

faiq commented Nov 26, 2025

/test pull-cluster-api-provider-aws-e2e-eks

@faiq
Copy link
Contributor Author

faiq commented Nov 26, 2025

/retest

1 similar comment
@faiq
Copy link
Contributor Author

faiq commented Nov 26, 2025

/retest

@faiq faiq force-pushed the faiq/nodeadm-upstream branch from b685eea to 0dd3488 Compare November 26, 2025 23:32
@faiq
Copy link
Contributor Author

faiq commented Nov 26, 2025

/test pull-cluster-api-provider-aws-e2e-eks

@AndiDog AndiDog changed the title 🌱 feat: implements nodeadm bootstrapping type 🌱 Implement nodeadm bootstrapping type Nov 28, 2025
Copy link
Contributor

@AndiDog AndiDog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only reviewed a few pieces for now, might come back to this again later. Someone with more EKS experience would be great as reviewer.

@@ -0,0 +1,70 @@
apiVersion: cluster.x-k8s.io/v1beta1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please add a non-ClusterClass template as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can. I feel like it will also need another set of tests, which seems like a lot here considering I've already added 2 e2e tests. Maybe we can follow up in another PR?

@faiq faiq force-pushed the faiq/nodeadm-upstream branch from 0dd3488 to 1ca7f1c Compare December 1, 2025 16:11
@faiq
Copy link
Contributor Author

faiq commented Dec 1, 2025

/test pull-cluster-api-provider-aws-e2e-eks

@faiq
Copy link
Contributor Author

faiq commented Dec 1, 2025

/retest

1 similar comment
@faiq
Copy link
Contributor Author

faiq commented Dec 1, 2025

/retest

@faiq
Copy link
Contributor Author

faiq commented Dec 1, 2025

/test pull-cluster-api-provider-aws-e2e-eks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. needs-priority release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants