Commit bee8d4f

Document how to change a ClusterClass
1 parent 3ef369e commit bee8d4f

File tree

1 file changed

+165
-7
lines changed

docs/book/src/tasks/experimental-features/cluster-class/change-clusterclass.md

Lines changed: 165 additions & 7 deletions
@@ -1,32 +1,190 @@
11
# Changing a ClusterClass
22

3-
When you change a ClusterClass, the system validates the required changes according to the [compatibility rules defined in the ClusterClass proposal](https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/202105256-cluster-class-and-managed-topologies.md#clusterclass-compatibility).
3+
## Selecting a strategy
44

5-
According to [Cluster API operational practices](https://cluster-api.sigs.k8s.io/tasks/updating-machine-templates.html), the recommended way for updating templates is by template rotation (create a new template, update the template reference in the ClusterClass, and then delete the old template).
5+
When planning a change to a ClusterClass, users should always take into consideration
6+
how those changes might impact the existing Clusters already using the ClusterClass, if any.
7+
8+
There are two strategies for defining how a ClusterClass change rolls out to existing Clusters:
9+
10+
- Roll out ClusterClass changes to existing Clusters in a controlled/incremental fashion.
11+
- Roll out ClusterClass changes to all existing Clusters immediately.
12+
13+
The first strategy is the recommended choice for people starting with ClusterClass; it
14+
requires the users to create a new ClusterClass with the expected changes, and then
15+
[rebase](#rebase) each Cluster to use the newly created ClusterClass.
16+
17+
By splitting the change to the ClusterClass and its rollout
18+
to Clusters into separate steps, the user reduces the risk of introducing unexpected
19+
changes on existing Clusters, or at least limit the blast radius of those changes
20+
to the small number of Clusters already rebased (the approach is similar to a canary deployment).
21+
22+
The second strategy listed above instead requires changing a ClusterClass "in place", which can
23+
be simpler and faster than creating a new ClusterClass. However, this approach
24+
means that changes are immediately propagated to all the Clusters already using the
25+
modified ClusterClass. Any operation involving many Clusters at the same time has intrinsic risks,
26+
and it can heavily impact the underlying infrastructure if the operation triggers
27+
Machine rollouts across the entire fleet of Clusters.
28+
29+
Regardless of which strategy you choose to implement your changes to a ClusterClass,
30+
please make sure to:
31+
32+
- [Plan ClusterClass changes](#planning-clusterclass-changes) before applying them.
33+
- Understand what [Compatibility Checks](#compatibility-checks) are and how to prevent changes
34+
that can lead to non-functional Clusters.
35+
36+
If instead you are interested in understanding more about which kind of
37+
effects you should expect on the Clusters, or if you are interested in additional details
38+
about the internals of the topology reconciler you can start reading the notes in the
39+
[Plan ClusterClass changes](#planning-clusterclass-changes) documentation or looking at the [reference](#reference)
40+
documentation at the end of this page.
41+
42+
## Changing ClusterClass templates
43+
44+
Templates are an integral part of a ClusterClass, and thus the same considerations
45+
described in the previous paragraph apply. When changing
46+
a template referenced in a ClusterClass users should also always plan for how the
47+
change should be propagated to the existing Clusters and choose the strategy that best
48+
suits expectations.
49+
50+
According to the [Cluster API operational practices](../../updating-machine-templates.md),
51+
the recommended way for updating templates is by template rotation:
52+
- Create a new template
53+
- Update the template reference in the ClusterClass
54+
- Delete the old template
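
The reference update in the middle step above can be sketched in Python as a pure transformation of the ClusterClass manifest. This is only an illustration. It assumes the ClusterClass has been loaded as a plain dictionary laid out as `spec.workers.machineDeployments[].template.infrastructure.ref`. The helper `rotate_infrastructure_ref` and all object names are hypothetical. Creating the new template and deleting the old one still happen outside this function.

```python
import copy


def rotate_infrastructure_ref(cluster_class, md_class, new_template_name):
    """Return a copy of cluster_class in which the MachineDeploymentClass
    named md_class references new_template_name (hypothetical helper)."""
    updated = copy.deepcopy(cluster_class)  # never mutate the caller's manifest
    for md in updated["spec"]["workers"]["machineDeployments"]:
        if md["class"] == md_class:
            md["template"]["infrastructure"]["ref"]["name"] = new_template_name
            return updated
    raise KeyError(f"MachineDeploymentClass {md_class!r} not found")


# Hypothetical ClusterClass fragment; only the fields used here are shown.
cluster_class = {
    "kind": "ClusterClass",
    "metadata": {"name": "my-cluster-class"},
    "spec": {
        "workers": {
            "machineDeployments": [
                {
                    "class": "default-worker",
                    "template": {
                        "infrastructure": {
                            "ref": {"kind": "DockerMachineTemplate", "name": "worker-v1"}
                        }
                    },
                }
            ]
        }
    },
}

rotated = rotate_infrastructure_ref(cluster_class, "default-worker", "worker-v2")
```

Returning a modified copy keeps the original manifest available for comparison before anything is applied.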
55+
56+
<aside class="note">
57+
<h1>In place template mutations</h1>
58+
59+
In case a provider supports in place template mutations, the Cluster API topology controller
60+
will adapt to them during the next reconciliation, but the system is not watching for those changes.
61+
This means that when the underlying template is updated, the changes
62+
may not be reflected immediately; they are picked up during the next full reconciliation.
63+
The maximum time for the next full reconciliation is equal to the CAPI controller
64+
sync period (defaults to 10 minutes).
65+
66+
</aside>
667

768
<aside class="note warning">
69+
<h1>Reusing templates across ClusterClasses</h1>
70+
71+
As already discussed in [writing a cluster class](write-clusterclass.md), while it is technically possible to
72+
re-use a template across ClusterClasses, this practice is not recommended because it makes it difficult
73+
to reason about the impact that changing such a template can have on existing Clusters.
74+
75+
</aside>
76+
77+
Also in case of changes to the ClusterClass templates, please make sure to:
78+
79+
- [Plan ClusterClass changes](#planning-clusterclass-changes) before applying them.
80+
- Understand what [Compatibility Checks](#compatibility-checks) are and how to prevent changes
81+
that can lead to non-functional Clusters.
82+
83+
You can learn more about this by reading the notes in the [Plan ClusterClass changes](#planning-clusterclass-changes) documentation or
84+
looking at the [reference](#reference) documentation at the end of this page.
85+
86+
## Rebase
87+
88+
Rebasing is an operational practice for transitioning a Cluster from one ClusterClass to another,
89+
and the operation can be triggered by simply changing the value in `Cluster.spec.topology.class`.
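
As a rough illustration of how small a rebase is, the Python sketch below builds a JSON merge patch that touches only `Cluster.spec.topology.class` and applies it to an in-memory Cluster manifest. The helpers `rebase_patch` and `apply_merge_patch`, and the Cluster and ClusterClass names, are hypothetical. The merge logic follows JSON merge patch semantics (RFC 7386) and is not taken from Cluster API itself.

```python
def rebase_patch(new_class):
    """Build a JSON merge patch that only changes Cluster.spec.topology.class."""
    return {"spec": {"topology": {"class": new_class}}}


def apply_merge_patch(target, patch):
    """Apply a JSON merge patch (RFC 7386 semantics) and return the result."""
    if not isinstance(patch, dict):
        return patch  # scalars and lists replace the target wholesale
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null removes the key
        else:
            result[key] = apply_merge_patch(result.get(key), value)
    return result


# Hypothetical Cluster manifest; only the relevant fields are shown.
cluster = {
    "kind": "Cluster",
    "metadata": {"name": "my-cluster"},
    "spec": {"topology": {"class": "my-class-v1", "version": "v1.28.0"}},
}

rebased = apply_merge_patch(cluster, rebase_patch("my-class-v2"))
```

Every other field of the Cluster, such as `spec.topology.version`, is left as it was. A patch body of this shape could also be sent with `kubectl patch --type merge`.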
890

91+
Also in this case, please make sure to:
92+
93+
- [Plan ClusterClass changes](#planning-clusterclass-changes) before applying them.
94+
- Understand what [Compatibility Checks](#compatibility-checks) are and how to prevent changes
95+
that can lead to non-functional Clusters.
96+
97+
You can learn more about this by reading the notes in the [Plan ClusterClass changes](#planning-clusterclass-changes) documentation or
98+
looking at the [reference](#reference) documentation at the end of this page.
99+
100+
## Compatibility Checks
101+
102+
When changing a ClusterClass, the system validates the required changes according to
103+
a set of "compatibility rules" in order to prevent changes which would lead to a non-functional
104+
Cluster, e.g. changing the InfrastructureProvider from AWS to Azure.
105+
106+
If the proposed changes are evaluated as dangerous, the operation is rejected.
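
To make the idea of a compatibility rule concrete, the Python sketch below rejects a template reference change when the `kind` or the API group differs, which is the kind of change the AWS to Azure example describes. It is a simplified assumption for illustration only. The authoritative rule set is the one in the ClusterClass proposal linked in this section, and the helpers `api_group` and `check_ref_compatible` are hypothetical.

```python
def api_group(api_version):
    """Return the API group of an apiVersion such as 'group/v1beta1'."""
    return api_version.split("/")[0] if "/" in api_version else ""


def check_ref_compatible(old_ref, new_ref):
    """Return the reasons why replacing old_ref with new_ref would be rejected.

    An empty list means the change passes this simplified check.
    """
    problems = []
    if old_ref["kind"] != new_ref["kind"]:
        problems.append(f"kind changed: {old_ref['kind']} -> {new_ref['kind']}")
    if api_group(old_ref["apiVersion"]) != api_group(new_ref["apiVersion"]):
        problems.append("API group changed")
    return problems


# Hypothetical references; names are for illustration only.
aws_v1 = {"apiVersion": "infrastructure.cluster.x-k8s.io/v1beta1",
          "kind": "AWSClusterTemplate", "name": "cluster-v1"}
aws_v2 = {"apiVersion": "infrastructure.cluster.x-k8s.io/v1beta1",
          "kind": "AWSClusterTemplate", "name": "cluster-v2"}
azure = {"apiVersion": "infrastructure.cluster.x-k8s.io/v1beta1",
         "kind": "AzureClusterTemplate", "name": "cluster-v1"}

rotation_problems = check_ref_compatible(aws_v1, aws_v2)   # template rotation
provider_problems = check_ref_compatible(aws_v1, azure)    # provider switch
```

Under these assumptions a plain template rotation yields no problems, while switching provider yields one.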
107+
108+
<aside class="note warning">
9109
<h1>Warning</h1>
10110

11-
Changing a ClusterClass triggers changes on all the Clusters using the ClusterClass.
111+
In the current implementation there are no compatibility rules for changes to provider
112+
templates, so you should refer to the provider documentation to avoid
113+
potentially dangerous changes on those objects.
12114

13115
</aside>
14116

15-
If changes are evaluated as potentially leading to a non-functional Cluster, the operation is rejected. It is important to note that the current implementation ensures only a minimal set of compatibility rules are applied; most importantly, there are no provider specific rules at present, so you should refer to the provider documentation for preventing potentially dangerous changes on your infrastructure.
117+
For additional info see [compatibility rules](https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/proposals/202105256-cluster-class-and-managed-topologies.md#clusterclass-compatibility)
118+
defined in the ClusterClass proposal.
119+
120+
## Planning ClusterClass changes
121+
122+
It is highly recommended to always generate a plan for ClusterClass changes before applying them,
123+
no matter if you are creating a new ClusterClass and rebasing Clusters or if you are changing
124+
your ClusterClass in place.
125+
126+
The clusterctl tool provides a new alpha command for this operation, [clusterctl alpha topology plan](../../../clusterctl/commands/alpha-topology-plan.md).
127+
128+
The output of this command will provide you with all the details about how those changes would impact
129+
Clusters, but the following notes can help you to understand what you should
130+
expect when planning your ClusterClass changes:
131+
132+
- Users should expect the resources in a Cluster (e.g. MachineDeployments) to behave consistently
133+
no matter if a change is applied via a ClusterClass or directly as you do in a Cluster without
134+
a ClusterClass. In other words, if someone changes something on a KCP object triggering a
135+
control plane Machines rollout, you should expect the same to happen when the same change
136+
is applied to the KCP template in ClusterClass.
16137

138+
- Users should expect the Cluster topology to change consistently irrespective of how the change has been
139+
implemented inside the ClusterClass; in other words, if you change a template field "in place", if you
140+
rotate the template referenced in the ClusterClass by pointing to a new template with the same field
141+
changed, or if you change the same field via a patch, the effects on the Cluster are the same.
17142

18-
Once the changes are applied, the topology controller reacts as described in the following table.
143+
- Users should expect the Cluster topology to change consistently irrespective of how the change has been
144+
applied to the ClusterClass. In other words, if you change a template field "in place", or if you
145+
rotate the template referenced in the ClusterClass by pointing to a new template with the same field
146+
changed, or if you change the same field via a patch, the effects on the Cluster are the same.
147+
148+
See [reference](#reference) for more details.
149+
150+
## Reference
151+
152+
### Effects on the Clusters
153+
154+
The following table documents the effects each ClusterClass change can have on a Cluster.
19155

20156
| Changed field | Effects on Clusters |
21157
|-------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
22158
| infrastructure.ref | Corresponding InfrastructureCluster objects are updated (in place update). |
23159
| controlPlane.metadata | If labels/annotations are added, changed or deleted the ControlPlane objects are updated (in place update).<br /><br /> In case of KCP, corresponding controlPlane Machines are updated (rollout) only when adding or changing labels or annotations; deleted labels should be removed manually from Machines, otherwise they go away automatically at the next Machine rotation. |
24160
| controlPlane.ref | Corresponding ControlPlane objects are updated (in place update). <br /> If updating ControlPlane objects implies changes in the spec, the corresponding ControlPlane Machines are updated accordingly (rollout). |
25161
| controlPlane.machineInfrastructure.ref | If the referenced template has changes only in metadata labels or annotations, the corresponding InfrastructureMachineTemplates are updated (in place update). <br /> <br />If the referenced template has changes in the spec:<br /> - Corresponding InfrastructureMachineTemplates are rotated (create new, delete old).<br /> - Corresponding ControlPlane objects are updated with the reference to the newly created template (in place update).<br /> - The corresponding controlPlane Machines are updated accordingly (rollout). |
26-
| workers.machineDeployments | If a new MachineDeploymentClass is added, no changes are triggered to the Clusters. <br />If an existing MachineDeploymentClass is changed, effect depends on the type of change (see below). <br /><br />Note: Deleting an existing MachineDeploymentClass is not supported.  |
162+
| workers.machineDeployments | If a new MachineDeploymentClass is added, no changes are triggered to the Clusters. <br />If an existing MachineDeploymentClass is changed, effect depends on the type of change (see below). |
27163
| workers.machineDeployments[].metadata | If labels/annotations are added, changed or deleted the MachineDeployment objects are updated (in place update) and corresponding worker Machines are updated (rollout). |
28164
| workers.machineDeployments[].bootstrap.ref | If the referenced template has changes only in metadata labels or annotations, the corresponding BootstrapTemplates are updated (in place update).<br /> <br />If the referenced template has changes in the spec:<br /> - Corresponding BootstrapTemplates are rotated (create new, delete old). <br /> - Corresponding MachineDeployment objects are updated with the reference to the newly created template (in place update). <br /> - The corresponding worker Machines are updated accordingly (rollout). |
29165
| workers.machineDeployments[].infrastructure.ref | If the referenced template has changes only in metadata labels or annotations, the corresponding InfrastructureMachineTemplates are updated (in place update). <br /> <br />If the referenced template has changes in the spec:<br /> - Corresponding InfrastructureMachineTemplates are rotated (create new, delete old).<br /> - Corresponding MachineDeployment objects are updated with the reference to the newly created template (in place update). <br /> - The corresponding worker Machines are updated accordingly (rollout). |
30166

167+
### How the topology controller reconciles template fields
168+
169+
The topology reconciler enforces values defined in the ClusterClass templates into the topology
170+
owned objects in a Cluster.
171+
172+
A simple way to understand this is to run `kubectl get -o json` on the templates referenced in a ClusterClass;
173+
then you can consider the topology reconciler to be authoritative on all the values
174+
under `spec`. Being authoritative means that the user cannot manually change those values in
175+
the object derived from the template in a specific Cluster (and if they do so the value gets reconciled
176+
to the value defined in the ClusterClass).
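
The notion of "authoritative" described above can be modelled with a few lines of Python. This is a simplified mental model, not the topology controller's implementation. The function `enforce_template` and the field names `machineType`, `disk`, and `providerID` are hypothetical, and lists are treated as single values for brevity.

```python
def enforce_template(template_spec, current_spec):
    """Return current_spec with every value defined in template_spec re-applied.

    Fields present in the template are authoritative and overwrite drift;
    fields the template does not define are left untouched.
    """
    result = dict(current_spec)
    for key, desired in template_spec.items():
        existing = result.get(key)
        if isinstance(desired, dict) and isinstance(existing, dict):
            result[key] = enforce_template(desired, existing)  # recurse into maps
        else:
            result[key] = desired  # authoritative value wins
    return result


# Values the template defines for the derived object (hypothetical fields).
template_spec = {"machineType": "large", "disk": {"sizeGiB": 100}}

# Derived object after a manual edit to an authoritative field.
current_spec = {
    "machineType": "small",                      # manual change, will be reverted
    "disk": {"sizeGiB": 100, "type": "ssd"},     # "type" is not in the template
    "providerID": "example://instance-1",        # not in the template
}

reconciled = enforce_template(template_spec, current_spec)
```

In this model the manual change to `machineType` is reverted to the template value, while `disk.type` and `providerID` survive because the template does not define them.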
177+
178+
<aside class="note">
179+
<h1>What about patches?</h1>
180+
181+
The considerations above apply also when using patches, the only difference being that the
182+
authoritative fields should be determined by applying patches on top of the `kubectl get -o json` output.
183+
184+
</aside>
185+
186+
A corollary of the behaviour described above is that it is technically possible to change non-authoritative
187+
fields in the object derived from the template in a specific Cluster, but we advise against using this possibility
188+
or making ad-hoc changes in generated objects unless otherwise needed for a workaround. It is always
189+
preferable to improve ClusterClasses by supporting new Cluster variants in a reusable way.
31190

32-
Note: In case a provider supports in place template mutations, the Cluster API topology controller will adapt to them at the next reconciliation, but the system is not watching for those specific changes. When the underlying template is updated in this way the changes may not be reflected immediately, but will be put in place at the next full reconciliation. The maximum time for the next reconciliation to take place is related to the CAPI controller sync period - 10 minutes by default.
