@justinsb
Contributor

@justinsb justinsb commented Aug 16, 2025

  • Initial spike: GCPMachinePool

  • GCPMachinePool: generated code/manifests

This continues the work started by @BrennenMM7 in #901. I also combined in the support from cluster-api-provider-aws to see what we want to borrow from it, and will whittle away the code from it that we don't need.


NONE

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 16, 2025
@netlify

netlify bot commented Aug 16, 2025

Deploy Preview for kubernetes-sigs-cluster-api-gcp ready!

Name Link
🔨 Latest commit c122d09
🔍 Latest deploy log https://app.netlify.com/projects/kubernetes-sigs-cluster-api-gcp/deploys/68ee404a3ab16f00083b51af
😎 Deploy Preview https://deploy-preview-1506--kubernetes-sigs-cluster-api-gcp.netlify.app

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 16, 2025
@k8s-ci-robot k8s-ci-robot requested a review from cpanato August 16, 2025 14:33
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: justinsb

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot requested a review from dims August 16, 2025 14:33
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Aug 16, 2025
@justinsb
Contributor Author

This PR is WIP while I whittle down the unneeded code from cluster-api-provider-aws and generally make this reviewable. But I am uploading as this is a checkpoint that works (in a limited way!)

@justinsb justinsb force-pushed the machinepool branch 5 times, most recently from 428790f to 5906e99 Compare August 16, 2025 19:42
@justinsb justinsb changed the title WIP: Minimal MachinePool support Minimal MachinePool support Aug 23, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 23, 2025
@justinsb
Contributor Author

Removing the WIP. I will still try to whittle down the code by extracting helpers etc, but it's already approaching the reviewable ballpark!

@justinsb
Contributor Author

So the linter is blowing up on the TODO comments. How do we want to track next steps in code? If we don't want to do // TODO because golangci, maybe we do // TASK?

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 23, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 29, 2025
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 6, 2025
@justinsb justinsb force-pushed the machinepool branch 2 times, most recently from 3eea450 to 87cedb8 Compare September 28, 2025 11:18
resources:
- gcpmachinepools
verbs:
- delete
Contributor Author

This is not folded into the same block as e.g. gcpmachines because we don't need create. I'm not sure that we need create on e.g. gcpclusters either, but ... that's a separate issue.

}

// FUTURE: do we need to verify that the instances are actually running?
machinePoolScope.GCPMachinePool.Spec.ProviderIDList = providerIDList
Contributor Author

I really don't like this very much (it shouldn't be spec, it requires us to poll the cloud API), but it seems to be the MachinePool contract.

@justinsb
Contributor Author

So I think I finally have this working with an e2e test in #1539 and (hopefully) in a mergeable state.

The big thing for the e2e test was populating spec.providerIDList. I don't love that contract, but it is the MachinePool contract.
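
For readers following along, here is a minimal sketch of what populating spec.providerIDList could look like. It is not the code in this PR. It assumes the `gce://<project>/<zone>/<instance-name>` provider ID form, and `instanceRef`, `providerID` and `buildProviderIDList` are hypothetical names introduced only for illustration; in the real controller the instances would come from polling the cloud API.

```go
package main

import (
	"fmt"
	"sort"
)

// instanceRef is a hypothetical, minimal description of one VM in the
// managed instance group. It is not a type from this PR.
type instanceRef struct {
	Project string
	Zone    string
	Name    string
}

// providerID formats a single provider ID, assuming the
// gce://<project>/<zone>/<instance-name> form.
func providerID(i instanceRef) string {
	return fmt.Sprintf("gce://%s/%s/%s", i.Project, i.Zone, i.Name)
}

// buildProviderIDList converts the observed instances into the list that
// would be written to spec.providerIDList. The result is sorted so that
// repeated reconciles over the same instances produce an identical list.
func buildProviderIDList(instances []instanceRef) []string {
	ids := make([]string, 0, len(instances))
	for _, i := range instances {
		ids = append(ids, providerID(i))
	}
	sort.Strings(ids)
	return ids
}

func main() {
	// Example input; the project, zone and instance names are made up.
	instances := []instanceRef{
		{Project: "my-project", Zone: "us-central1-b", Name: "pool-b7x2"},
		{Project: "my-project", Zone: "us-central1-a", Name: "pool-a1c9"},
	}
	for _, id := range buildProviderIDList(instances) {
		fmt.Println(id)
	}
}
```

Sorting is a design choice in this sketch rather than something the contract is known to require: it avoids spurious spec updates when the cloud API returns instances in a different order.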

}

// Logger is a concrete logger using logr underneath.
type Logger struct {
Contributor

Just curious why we are adding a new logger here? And is it going to replace the loggers in other controllers?

Contributor Author

Yeah, I added this because CAPA does it, but I also think it's a good call @bochengchu - it feels orthogonal. I think I will remove it.

@bochengchu
Contributor

LGTM

@justinsb
Contributor Author

justinsb commented Oct 3, 2025

I removed the logging that I agree is orthogonal - thanks @bochengchu

@justinsb
Contributor Author

justinsb commented Oct 6, 2025

/retest

@justinsb
Contributor Author

So I think this is looking reasonable (IMO) - the failing test is apidiff and we are adding to the API, so that is expected!

cc @csantanapr @cpanato

@justinsb
Contributor Author

I think once we get #1542 and kubernetes/test-infra#35686 in we should see sensible results from apidiff, but ... it will still fail because we are changing the API (at least AIUI)

In order for nodes to be associated to the MachinePool, we need to populate the
spec.providerIDList field.  This field is known to the MachinePool controller.
@justinsb justinsb force-pushed the machinepool branch 2 times, most recently from 62f145b to 35204fe Compare October 14, 2025 12:16
@damdo
Member

damdo commented Oct 14, 2025

@k8s-ci-robot
Contributor

@damdo: GitHub didn't allow me to assign the following users: barbacbd.

Note that only kubernetes-sigs members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @cpanato @salasberryfin @damdo @barbacbd @theobarberbany

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@justinsb
Contributor Author

Thanks to @salasberryfin for merging the other two PRs, apidiff is now completing. It's actually passing, so it is not testing our public API, but rather our internal API, which is ... not what I assumed.

But in any case, please take a look - would be great to get this in!

@damdo
Member

damdo commented Oct 24, 2025

@justinsb are you expecting this to go in first and the E2Es from @bochengchu to go in after, or something else? What's the best strategy here :) LMK

@justinsb
Contributor Author

Hi! So there are two workstreams: MachinePool and HA Internal Load Balancer.

I'm doing MachinePool and @bochengchu is doing HA Internal Load Balancer.

(Currently) MachinePool is split into two PRs:

It seemed like a good idea at the time to split out the tests to keep the code smaller, though actually as a reviewer I'm not sure it would have made my life easier, so ... sorry :-). LMK if you want me to combine them, but #1539 is passing on top of this PR, so if we could approve this one and I will rebase the tests, it would be great to get this in!

I can look now at @bochengchu's PRs. I still have approval rights in this repo as a cluster-lifecycle lead, so I can approve them if nobody objects. I was waiting for an e2e test before doing so. I think the implementation is in #1533 and the tests are in #1550. In that case I think we are right to split them: I will ok-to-test #1550 now, we expect tests to fail until #1533 is in, and then I guess we can ask for the tests to be rebased on top of the implementation. Then we have two test runs, one with the fix and one without, and hopefully the one with the fix passes and the other fails :-)

TLDR: For MachinePool, an lgtm/approve would be great here, I can then rebase the MachinePool tests. If you and others don't object, I can review and approve the Load Balancer fixes.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

7 participants