feat: add support for gke clusterclass #1442
Skipping CI for Draft Pull Request.
/retest
/retest
Thanks a lot for working on this. This PR has been open for a while; in my opinion we should merge it as is and improve it in follow-up PRs if needed. The feature is experimental anyway.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: alexander-demicev, salasberryfin
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/remove-lifecycle-stale
/hold
/retest
Signed-off-by: Carlos Salas <[email protected]>
/lgtm
/unhold
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR adds support for managed clusters (GKE) provisioning using ClusterClass, implementing the template types required to satisfy the CAPI contract.
New templates for GKE cluster provisioning via ClusterClass are added to `./templates`:
- `./templates/cluster-template-gke-clusterclass.yaml` and `./templates/cluster-template-gke-topology.yaml`
- `./templates/cluster-template-gke-autopilot-clusterclass.yaml` and `./templates/cluster-template-gke-autopilot-topology`
I've tried these two templates with the current CAPI and CAPG and both clusters are provisioned successfully.
The CAPI contract for using ClusterClass requires that a bootstrap object is passed in the infrastructure definition. This is a problem when provisioning managed clusters, which do not need cluster bootstrapping. Other providers have worked around this requirement in different ways: CAPZ references Kubeadm (which then creates a dependency on the provider) and CAPA implements its own version of a bootstrap provider for EKS. I think the simplest solution is to create a "semi-dummy" `GKEConfig` controller that reports the `Ready` status to core CAPI when `GKEMachinePools` are available. This controller does not apply changes other than reporting the readiness of the cluster's infrastructure. When using GKE Autopilot there is no need to reference a bootstrap object, as the underlying infrastructure is completely managed by GCP, as can be seen in the new templates.

Pending: adding e2e tests
Existing GKE E2E tests are not running because of issues with GCP permissions, which we're trying to get fixed here. There is a separate PR to enable these tests on CI. Considering GKE is still an experimental feature of CAPG, I'd suggest we try to get this feature added and, in parallel, work on getting the E2E permissions fixed. This is especially relevant because this PR adds a lot of new code, and keeping it up to date for a long time may become quite complex.
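For readers unfamiliar with the "semi-dummy" `GKEConfig` controller described above, the following is a minimal sketch of the idea only, not the code in this PR. The types `MachinePool` and `GKEConfigStatus` and the function `reconcileReadiness` are hypothetical stand-ins, and the rule "ready only when at least one pool exists and all pools are ready" is an assumption about what "available" means.

```go
package main

import "fmt"

// MachinePool is a hypothetical stand-in for the observed state of a GKE machine pool.
type MachinePool struct {
	Name  string
	Ready bool
}

// GKEConfigStatus is a hypothetical stand-in for the status reported back to core CAPI.
type GKEConfigStatus struct {
	Ready bool
}

// reconcileReadiness only derives a readiness status; it changes no infrastructure.
// Assumption: the config is ready when at least one pool exists and every pool is ready.
func reconcileReadiness(pools []MachinePool) GKEConfigStatus {
	if len(pools) == 0 {
		return GKEConfigStatus{Ready: false}
	}
	for _, p := range pools {
		if !p.Ready {
			return GKEConfigStatus{Ready: false}
		}
	}
	return GKEConfigStatus{Ready: true}
}

func main() {
	pools := []MachinePool{
		{Name: "pool-a", Ready: true},
		{Name: "pool-b", Ready: false},
	}
	fmt.Println("ready:", reconcileReadiness(pools).Ready)

	// Once the remaining pool becomes available, readiness flips.
	pools[1].Ready = true
	fmt.Println("ready:", reconcileReadiness(pools).Ready)
}
```

A real controller would read the pool state from the API server and write the status back through the client; the sketch leaves that out to keep the readiness rule visible.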
Which issue(s) this PR fixes:
Fixes #1387
Special notes for your reviewer:
Changes affect the experimental GKE feature, which hopefully helps with publishing changes in the API.
NOTE: After bumping to CAPI v1.10 this PR required a number of changes. Since some of these had already been applied in the existing GKE (non-ClusterClass) logic, there was some overlap of code, and I've switched some methods to functions so that they can be reused across different resources (e.g. `GCPManagedMachinePool` and `GCPManagedMachinePoolTemplate` have equivalent `ValidateUpdate` logic, and it would not make sense to maintain separate code for each resource). Some of the changes also move to validating fields in the specification individually, rather than bulk-validating multiple fields at the same time: this makes the code more verbose, but I consider it clearer overall, and it definitely makes it more reusable and scalable down the line.

Follow-ups:
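As an illustration of the methods-to-functions change in the note above, here is a rough sketch of the pattern. It is not the code in this PR: the trimmed-down `PoolSpec`, its fields, the immutability rules, and the simplified `ValidateUpdate` signatures are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// PoolSpec is a hypothetical, trimmed-down spec shared by both resource kinds.
type PoolSpec struct {
	MachineType string
	DiskSizeGB  int64
}

// Hypothetical stand-ins for the machine pool and its template.
type ManagedMachinePool struct{ Spec PoolSpec }
type ManagedMachinePoolTemplate struct{ Spec PoolSpec }

// One function per field, so each check can be reused and tested on its own.
// Assumption for the sketch: both fields are immutable on update.
func validateMachineTypeImmutable(oldSpec, newSpec PoolSpec) error {
	if oldSpec.MachineType != newSpec.MachineType {
		return errors.New("spec.machineType is immutable")
	}
	return nil
}

func validateDiskSizeImmutable(oldSpec, newSpec PoolSpec) error {
	if oldSpec.DiskSizeGB != newSpec.DiskSizeGB {
		return errors.New("spec.diskSizeGB is immutable")
	}
	return nil
}

// validateSpecUpdate is a plain function rather than a method, so both
// resources can call it. errors.Join returns nil when every check passes.
func validateSpecUpdate(oldSpec, newSpec PoolSpec) error {
	return errors.Join(
		validateMachineTypeImmutable(oldSpec, newSpec),
		validateDiskSizeImmutable(oldSpec, newSpec),
	)
}

// Both resources delegate to the same shared function.
func (p *ManagedMachinePool) ValidateUpdate(old *ManagedMachinePool) error {
	return validateSpecUpdate(old.Spec, p.Spec)
}

func (t *ManagedMachinePoolTemplate) ValidateUpdate(old *ManagedMachinePoolTemplate) error {
	return validateSpecUpdate(old.Spec, t.Spec)
}

func main() {
	oldPool := &ManagedMachinePool{Spec: PoolSpec{MachineType: "e2-medium", DiskSizeGB: 100}}
	newPool := &ManagedMachinePool{Spec: PoolSpec{MachineType: "e2-standard-4", DiskSizeGB: 100}}
	fmt.Println(newPool.ValidateUpdate(oldPool))
}
```

The per-field functions are more verbose than one bulk check, but each error names exactly one field and the same checks serve both resource kinds.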
TODOs:
Release note: