Commit a06fe76

Address feedback
1 parent 6180abe commit a06fe76

3 files changed

+64
-53
lines changed

docs/book/src/tasks/experimental-features/runtime-sdk/implement-lifecycle-hooks.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -336,7 +336,7 @@ For additional details, you can see the full schema in <button onclick="openSwag
336336
This hook is called after all the workers have been upgraded to the version specified in `spec.topology.version`
337337
or to an intermediate version in the upgrade plan, and:
338338
- if the upgrade plan is completed and the entire cluster is at `spec.topology.version`, immediately before calling the AfterClusterUpgrade hook
339-
- if the upgrade plan is not complete and the entrire cluster is now at one of the intermediate versions, immediately before calling BeforeControlPlaneUpgrade hook for the next intermediate step
339+
- if the upgrade plan is not complete and the entire cluster is now at one of the intermediate versions, immediately before calling BeforeControlPlaneUpgrade hook for the next intermediate step
340340

341341
Runtime Extension implementers can use this hook to execute post-upgrade add-on tasks; if the upgrade plan is not completed,
342342
this hook allows blocking upgrades to the next version of the control plane until everything is ready.

docs/proposals/20210526-cluster-class-and-managed-topologies.md

Lines changed: 11 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -380,18 +380,17 @@ as well as in
380380
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
381381
kind: VSphereClusterTemplate
382382
name: vsphere-prod-cluster-template
383-
upgrade:
384-
versions:
385-
- v1.28.0
386-
- v1.29.0
387-
- v1.30.0
388-
- v1.30.1
389-
- v1.31.2
390-
- ...
391-
```
392-
393-
see [proposal: Chained and efficient upgrades for Clusters with managed topologies](20250513-chained-and-efficient-upgrades-for-clusters-with-managed-topologies.md) for more options
394-
for configuring Kubernetes version upgrade of clusters using managed topologies.
383+
kubernetesVersions:
384+
- v1.28.0
385+
- v1.29.0
386+
- v1.30.0
387+
- v1.30.1
388+
- v1.31.2
389+
- ...
390+
```
391+
392+
see [proposal: Chained and efficient upgrades for Clusters with managed topologies](20250513-chained-and-efficient-upgrades-for-clusters-with-managed-topologies.md) for more options
393+
for configuring Kubernetes version upgrade of clusters using managed topologies.
395394
396395
2. User creates a cluster using the class name and defining the topology.
397396
```yaml

docs/proposals/20250513-chained-and-efficient-upgrades-for-clusters-with-managed-topologies.md

Lines changed: 52 additions & 40 deletions
Original file line number | Diff line number | Diff line change
@@ -10,7 +10,7 @@ last-updated: 2025-05-13
1010
status: implementable
1111
see-also:
1212
- "/docs/proposals/20210526-cluster-class-and-managed-topologies.md"
13-
- "/docs/proposals/20220414-runtime-hooks.md"
13+
- "/docs/proposals/20220414-lifecycle-hooks.md"
1414
---
1515

1616
# Chained and efficient upgrades for Clusters with managed topologies
@@ -46,15 +46,15 @@ see-also:
4646
## Glossary
4747

4848
- **Chained upgrade**: an upgrade sequence that goes from one Kubernetes version to another
49-
by passing through a set of intermediate versions. e.g., Upgrading from v1.31.0 to v1.33.0 requires
50-
a chained upgrade with the following intermediate steps: v1.31.0 (initial state) -> v1.32.0 (intermediate version)
49+
by passing through a set of intermediate versions. e.g., Upgrading from v1.31.0 (current state) to v1.34.0 (target version) requires
50+
a chained upgrade with the following steps: v1.32.0 (first intermediate version) -> v1.33.0 (second intermediate version)
5151
-> v1.34.0 (target version)
5252

53-
- **Efficient upgrade**: a chained upgrade where workers node skips some of the intermediate versions,
54-
when allowed by the [Kubernetes version skew policy](https://kubernetes.io/releases/version-skew-policy/).
53+
- **Upgrade plan**: the sequence of intermediate versions ... target version that a Cluster must upgrade to when
54+
performing a chained upgrade;
5555

56-
- **Upgrade plan**: the sequence of intermediate versions that a Cluster must upgrade to when
57-
performing a chained upgrade; when the chained upgrade is also an efficient upgrade,
56+
- **Efficient upgrade**: a chained upgrade where worker nodes skip some of the intermediate versions,
57+
when allowed by the [Kubernetes version skew policy](https://kubernetes.io/releases/version-skew-policy/); when the chained upgrade is also an efficient upgrade,
5858
the upgrade plan for worker machines is a subset of the upgrade plan for control plane machines.
5959

6060
## Summary
@@ -78,14 +78,14 @@ by more than one minor Kubernetes version by performing chained and efficient up
7878

7979
### Goals
8080

81-
- Allow Cluster API users using managed topologies to perform chained upgrades.
82-
- Automatically perform efficient upgrades whenever possible.
83-
- Allow Cluster API users to influence the upgrade plan considering e.g. availability of machines images for
84-
the intermediate versions.
81+
When using clusters with managed topologies:
82+
- Allow Cluster API users to perform chained upgrades.
83+
- Automatically perform chained upgrades in an efficient way by skipping worker upgrades whenever possible.
84+
- Allow Cluster API users to influence the upgrade plan, e.g. by considering the availability of machine images for the intermediate versions.
8585

8686
### Future Work
8787

88-
- Consider if and how to allow users to change the target version while a chained upgrade is being performed.
88+
- Consider if and how to allow users to change the desired version while a chained upgrade is in progress.
8989

9090
### Non-Goals
9191

@@ -104,8 +104,8 @@ by more than one minor Kubernetes version by performing chained and efficient up
104104
- As a cluster class author, I want to be able to specify the Kubernetes versions that the system might use as
105105
intermediate or target versions for chained upgrades of a Cluster using a specific cluster class.
106106

107-
- As a developer building on top of Cluster API, I want that lifecycle hooks allow orchestration of my external process
108-
during different steps of a chained upgrade.
107+
- As a developer building on top of Cluster API, I want lifecycle hooks to allow orchestration of external processes,
108+
e.g. add-on management, during the different steps of a chained upgrade.
109109

110110
### Implementation Details/Notes/Constraints
111111

@@ -129,21 +129,22 @@ kind: ClusterClass
129129
metadata:
130130
name: quick-start-runtimesdk
131131
spec:
132-
upgrade:
133-
versions:
134-
- v1.28.0
135-
- v1.29.0
136-
- v1.30.0
137-
- v1.30.1
138-
- v1.31.2
139-
- ...
132+
kubernetesVersions:
133+
- v1.28.0
134+
- v1.29.0
135+
- v1.30.0
136+
- v1.30.1
137+
- v1.31.2
138+
- ...
140139
```
141140
142141
When computing the upgrade plan from Kubernetes vA to vB, Cluster API will use the latest version for each minor in
143142
between vA and vB.
144143
145-
In the example above, the upgrade plan from v1.28.0 to v1.31.2, will be: v1.29.0 -> v1.30.1 -> v1.31.2
146-
(by convention, the current version is omitted by the upgrade plan, the target version is included).
144+
In the example above, the upgrade plan from v1.28.0 - current version - to v1.31.2 - target version -, will be:
145+
v1.29.0 -> v1.30.1 -> v1.31.2
146+
147+
Note: by convention, the current version is omitted from the upgrade plan, while the target version is included.
147148
148149
Note: Cluster API cannot determine the list of available Kubernetes versions automatically, because the versions that can be used
149150
in a Cluster API management cluster depend on external factors, e.g., the availability of machine images for a Kubernetes version.
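
The plan computation described in this hunk (pick the latest listed version for each minor between vA and vB, omit the current version, include the target) can be illustrated with a minimal Python sketch. This is not the Cluster API implementation; `parse` and `compute_upgrade_plan` are hypothetical names introduced only for illustration, and the sketch assumes plain `vMAJOR.MINOR.PATCH` strings without pre-release suffixes.

```python
def parse(version):
    """Turn 'v1.30.1' into the comparable tuple (1, 30, 1)."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def compute_upgrade_plan(available, current, target):
    """Hypothetical helper: latest listed version per intermediate minor, then the target."""
    cur, tgt = parse(current), parse(target)
    latest = {}  # (major, minor) -> latest listed version string for that minor
    for version in available:
        parsed = parse(version)
        key = parsed[:2]
        # Keep only minors strictly between the current and the target minor.
        if cur[:2] < key < tgt[:2]:
            if key not in latest or parsed > parse(latest[key]):
                latest[key] = version
    # By convention the current version is omitted and the target is included.
    return [latest[key] for key in sorted(latest)] + [target]


versions = ["v1.28.0", "v1.29.0", "v1.30.0", "v1.30.1", "v1.31.2"]
plan = compute_upgrade_plan(versions, "v1.28.0", "v1.31.2")
print(" -> ".join(plan))
```

With the version list from the ClusterClass example, the sketch yields the same plan stated in the text: v1.29.0, then v1.30.1, then v1.31.2.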
@@ -160,14 +161,14 @@ metadata:
160161
spec:
161162
upgrade:
162163
external:
163-
getUpgradePlanExtension: get-upgrade-plan.foo
164+
generateUpgradePlanExtension: get-upgrade-plan.foo
164165
```
165166
166167
Example Request:
167168
168169
```yaml
169170
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
170-
kind: GetUpgradePlanRequest
171+
kind: GenerateUpgradePlanRequest
171172
settings: <Runtime Extension settings>
172173
cluster:
173174
apiVersion: cluster.x-k8s.io/v1beta1
@@ -187,7 +188,7 @@ Example Response:
187188
188189
```yaml
189190
apiVersion: hooks.runtime.cluster.x-k8s.io/v1alpha1
190-
kind: GetUpgradePlanResponse
191+
kind: GenerateUpgradePlanResponse
191192
status: Success # or Failure
192193
message: "error message if status == Failure"
193194
controlPlaneVersions:
@@ -197,6 +198,9 @@ controlPlaneVersions:
197198
- v1.33.0
198199
```
199200
201+
Note: in this case the system will infer the list of intermediate versions for workers from the list of control plane versions, taking
202+
care of performing the minimum number of worker upgrades by taking into account the [Kubernetes version skew policy](https://kubernetes.io/releases/version-skew-policy/).
203+
200204
Implementers of this runtime extension can also support more sophisticated use cases, e.g.
201205
202206
- Go through more patch releases for a minor if necessary, e.g., v1.30.0 -> v1.30.1 -> etc.
@@ -209,6 +213,9 @@ Implementers of this runtime extension can also support more sophisticated use c
209213
- ...
210214
```
211215
216+
Note: in this case the system will infer the list of intermediate versions for workers from the list of control plane versions, taking
217+
care of performing the minimum number of worker upgrades by taking into account the [Kubernetes version skew policy](https://kubernetes.io/releases/version-skew-policy/).
218+
212219
- Force workers to upgrade to specific versions, e.g., force workers upgrade to v1.30.0 when doing v1.29.0 -> v1.32.3
213220
(in this example, worker upgrade to 1.30.0 is not required by the [Kubernetes version skew policy](https://kubernetes.io/releases/version-skew-policy/), so it would
214221
be skipped under normal circumstances).
@@ -223,6 +230,9 @@ Implementers of this runtime extension can also support more sophisticated use c
223230
- v1.30.0
224231
```
225232
233+
Note: in this case the system will take into consideration the provided `workersVersions`, but if required by the [Kubernetes version skew policy](https://kubernetes.io/releases/version-skew-policy/),
234+
it will also add any necessary intermediate versions for workers inferred from the list of control plane versions.
235+
226236
- Force workers to upgrade to all the intermediate steps (opt out from efficient upgrades).
227237

228238
```yaml
@@ -240,11 +250,13 @@ Implementers of this runtime extension can also support more sophisticated use c
240250
Please note:
241251
- If both the list of Kubernetes versions and the runtime extension definition are left empty in a cluster class,
242252
Cluster API will behave as it does today: only upgrades to the next minor are allowed for the corresponding clusters.
243-
- If the list of Kubernetes versions is defined in a ClusterClass, the system is going to use this info to:
244-
- Validate the target version for an upgrade of a corresponding cluster
245-
- Check if there is a valid upgrade path from the current version to the target version.
246-
- If instead, the ClusterClass is reading upgrade plans from a runtime extension, the system is NOT going to use it
247-
to validate the target version for an upgrade of a corresponding cluster.
253+
- If the list of Kubernetes versions is defined in a ClusterClass, the system is going to use this info also in
254+
the Cluster validation webhook in order to:
255+
- Validate the initial version of a corresponding cluster (on create)
256+
- Validate the target version for an upgrade of a corresponding cluster (on update)
257+
- Check if there is a valid upgrade path from the current version to the target version (on update)
258+
- If instead, the ClusterClass is reading upgrade plans from a runtime extension, the Cluster validation webhook is
259+
NOT going to call this runtime extension, and thus it won't validate the initial/target version of a corresponding cluster.
248260
- This limitation is driven by the fact that adding nested http calls into webhooks might lead to performance
249261
issues; also, in most cases advanced users already are implementing additional checks for cluster upgrades, and they
250262
need full flexibility in how to integrate the upgrade plan checks.
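
The webhook checks listed above for a ClusterClass that defines a list of Kubernetes versions can be sketched as follows. This is an illustrative Python sketch, not the actual validation webhook: `validate_version` is a hypothetical name, and the sketch assumes that "a valid upgrade path" means every minor between the current and the target version has at least one listed version, which the proposal text does not spell out.

```python
def minor_of(version):
    """Return (major, minor) for a 'vMAJOR.MINOR.PATCH' string."""
    major, minor, _patch = version.lstrip("v").split(".")
    return int(major), int(minor)


def validate_version(kubernetes_versions, new_version, old_version=None):
    """Hypothetical check; returns a list of error strings (empty means valid).

    old_version=None models a create, anything else models an update.
    """
    errors = []
    # On create and on update: the requested version must be listed.
    if new_version not in kubernetes_versions:
        errors.append(f"{new_version} is not listed in the ClusterClass")
        return errors
    if old_version is None or old_version == new_version:
        return errors
    # On update: assumed path rule, each intermediate minor needs a listed version.
    listed_minors = {minor_of(v) for v in kubernetes_versions}
    (major, start), (_, end) = minor_of(old_version), minor_of(new_version)
    for minor in range(start + 1, end):
        if (major, minor) not in listed_minors:
            errors.append(f"no listed version for v{major}.{minor}")
    return errors


versions = ["v1.28.0", "v1.29.0", "v1.30.0", "v1.30.1", "v1.31.2"]
print(validate_version(versions, "v1.31.2", old_version="v1.28.0"))
print(validate_version(["v1.28.0", "v1.31.2"], "v1.31.2", old_version="v1.28.0"))
```

When the upgrade plan comes from a runtime extension instead, none of these checks would run in the webhook, as stated in the text.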
@@ -258,12 +270,12 @@ Please note:
258270
The topology controller is the component responsible for orchestrating upgrades for clusters using a managed topology,
259271
and it will be improved to:
260272
- compute the upgrade plan (when an upgrade is required/in progress)
261-
- perform the upgrade sequence accordingly
273+
- perform the chained upgrade going through all the intermediate steps in the upgrade plan
262274

263275
While the first change can be inferred from the previous paragraph, the second change requires some additional details.
264276

265277
The topology controller is already capable of performing two atomic operations used during upgrades, "upgrade control
266-
plane" and "upgrade workers"; as of today, these two operations are performed sequentially, one after the other.
278+
plane" and "upgrade workers". Today for an upgrade we run "upgrade control plane" and then "upgrade workers".
267279

268280
This proposal is planning to use existing "upgrade control plane" and "upgrade workers" primitives multiple times
269281
to perform chained and efficient upgrades, e.g., v1.29.0 -> v1.33.0 will be executed as:
@@ -273,7 +285,7 @@ to perform chained and efficient upgrades, e.g., v1.29.0 -> v1.33.0 will be exec
273285
| CP upgrade v1.29.0 -> v1.30.0 | workers can remain on v1.29.0 |
274286
| CP upgrade v1.30.0 -> v1.31.0 | workers can remain on v1.29.0 |
275287
| CP upgrade v1.31.0 -> v1.32.0 | |
276-
| Workers upgrade v1.31.0 -> v1.32.0 | workers must upgrade to prevent violation of Kubernetes version skew rules |
288+
| Workers upgrade v1.29.0 -> v1.32.0 | workers must upgrade to prevent violation of Kubernetes version skew rules |
277289
| CP upgrade v1.32.0 -> v1.33.0 | |
278290
| Workers upgrade v1.32.0 -> v1.33.0 | |
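
The sequence in the table above can be reproduced with a short Python sketch. It is a sketch under stated assumptions, not the topology controller's logic: `plan_steps` and `max_skew` are hypothetical names, and the sketch assumes workers may lag the control plane by at most three minor versions, which is consistent with the table (workers stay on v1.29.0 until the control plane reaches v1.32.0).

```python
def minor(version):
    """Return the minor number of a 'vMAJOR.MINOR.PATCH' string."""
    return int(version.lstrip("v").split(".")[1])


def plan_steps(current, control_plane_versions, max_skew=3):
    """Hypothetical helper: interleave CP and worker upgrades, skipping workers when allowed."""
    steps = []
    cp = workers = current
    last = len(control_plane_versions) - 1
    for i, next_version in enumerate(control_plane_versions):
        steps.append(f"CP upgrade {cp} -> {next_version}")
        cp = next_version
        if i == last:
            # Workers always end on the target version.
            workers_must_upgrade = True
        else:
            # Workers must catch up if the following CP upgrade would exceed the skew.
            following = control_plane_versions[i + 1]
            workers_must_upgrade = minor(following) - minor(workers) > max_skew
        if workers_must_upgrade and workers != cp:
            steps.append(f"Workers upgrade {workers} -> {cp}")
            workers = cp
    return steps


for step in plan_steps("v1.29.0", ["v1.30.0", "v1.31.0", "v1.32.0", "v1.33.0"]):
    print(step)
```

Setting `max_skew=0` in this sketch makes workers follow every control plane step, which corresponds to opting out of efficient upgrades.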
279291

@@ -296,9 +308,9 @@ are upgraded to the intermediate/target version of this iteration, which is poss
296308
`topology.cluster.x-k8s.io/hold-upgrade-sequence` annotations are removed.
297309

298310
However, it is worth noting that:
299-
- While performing different "upgrade workers" iterations, the target version all Machine deployment should upgrade to also changes.
311+
- While performing different "upgrade workers" operations, the target version that all MachineDeployments should upgrade to also changes.
300312
- `topology.cluster.x-k8s.io/defer-upgrade` and `topology.cluster.x-k8s.io/hold-upgrade-sequence` annotations must be
301-
applied before each upgrade step (lifecycle hooks described in the next paragraph can be used to orchestrate this process).
313+
applied before each upgrade operation (lifecycle hooks described in the next paragraph can be used to orchestrate this process).
302314

303315
#### Lifecycle hooks
304316

@@ -320,7 +332,7 @@ More specifically:
320332
request and response payload will be similar to corresponding messages for `BeforeControlPlaneUpgrade`
321333
- A new `AfterWorkersUpgrade` hook will be added and called after each "upgrade workers" step;
322334
request and response payload will be similar to corresponding messages for `AfterControlPlaneUpgrade`, but the
323-
hook will be considered blocking only for the intermediate steps of the upgrade (not blocking for the final step).
335+
hook will be considered blocking only for the intermediate steps of the upgrade.
324336
- `AfterClusterUpgrade` will remain as of today, but the system will ensure that a new upgrade
325337
can't start until `AfterClusterUpgrade` is completed.
326338

@@ -354,7 +366,7 @@ allowing users to implement additional pre-upgrade checks.
354366

355367
## Alternatives
356368

357-
An alternative to option leveraging a new CRs to define the list of Kubernetes version to be used for upgrade plans
369+
An alternative option to leverage a new CR to define the list of Kubernetes versions to be used for upgrade plans
358370
was considered.
359371

360372
However, the option was discarded because it seems more consistent having the list of
@@ -363,7 +375,7 @@ how a managed topology should behave.
363375

364376
## Upgrade Strategy
365377

366-
No particular upgrade considerations are required, this feature will available to users upgrading to
378+
No particular upgrade considerations are required; this feature will be available to users upgrading to
367379
Cluster API v1.11.
368380

369381
However, it is required to enhance ClusterClasses with the information required to compute upgrade plans,
