@@ -154,18 +154,24 @@ Items marked with (R) are required *prior to targeting to a milestone / release*

## Summary

- We want to increase granularity of the upgrade sequence in Kubernetes, to make
- upgrades safer, by means of introducing a "compatibility version". A
- "compatibility version" is distinct from the binary version of a Kubernetes
- component, effectively emulating the APIs and features from a given Kubernetes
- release. can
+ We intend to introduce a "compatibility version" option to Kubernetes control
+ plane components to make upgrades safer by increasing the granularity of steps
+ available to cluster administrators. The "compatibility version" is distinct from
+ the binary version of a Kubernetes component, and can be used to emulate the
+ behavior (APIs, features, ...) of a prior Kubernetes release version.

## Motivation

The notion of more granular steps in Kubernetes upgrades is attractive because
it is more rigorous about how we step through a Kubernetes control-plane
upgrade, introducing potentially corrupting data (i.e. data only present in N+1
- and not in N) only in later stages of the upgrade process.
+ and not in N) only in later stages of the upgrade process.
+
+ For example, upgrading from Kubernetes 1.30 to 1.31 while keeping the compatibility
+ version at 1.30 would enable a cluster administrator to validate that the new
+ Kubernetes binary version is working as desired before exposing any feature changes
+ introduced in 1.31 to cluster users, and without writing any data to storage at
+ newer API versions.

This extra step increases the granularity of our upgrade sequence so that
(1) failures are more easily diagnosed (since we have more granular steps, we
@@ -187,16 +193,44 @@ Kubernetes control-plane, by means of:

### Goals

- - introduce the metadata necessary to toggle features/APIs/storage-versions/CEL features at a specific released of Kubernetes
-
+ - Introduce the metadata necessary to configure features/APIs/storage-versions/CEL
+   features to match the behavior of an older Kubernetes release version
+ - A Kubernetes binary with compatibility version set to N will pass the
+   conformance and e2e tests from Kubernetes release version N.
+ - A Kubernetes binary with compatibility version set to N does not enable any
+   changes (storage versions, CEL features, features) that would prevent it
+   from being rolled back to N-1.
+ - The most recent Kubernetes version supports compatibility version being set to
+   the full range of supported versions (N..N-3).

### Non-Goals

- - changes to CAPI/kubeadm to absorb the compatibility versions will be addressed in a separate KEP
+ - Changes to CAPI/kubeadm/KIND/minikube to absorb the compatibility versions
+   will be addressed separately from this KEP

## Proposal

+ Kubernetes components (apiservers, controller managers, schedulers) will offer a
+ `--compatibility-version` flag that can be set to any of the previous three
+ minor versions. If unset, the compatibility version defaults to the minor
+ version of the binary.
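
As an illustration of the flag semantics described above, the following is a minimal sketch, not the KEP's implementation. The names `MinorVersion`, `parseMinorVersion`, `resolveCompatibilityVersion` and `maxSkew` are hypothetical, and the sketch assumes the allowed range is the binary's own minor version plus the three minors before it (N..N-3), all within the same major version.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// MinorVersion is a hypothetical major.minor pair used only for this sketch.
type MinorVersion struct {
	Major, Minor int
}

func (v MinorVersion) String() string { return fmt.Sprintf("%d.%d", v.Major, v.Minor) }

// maxSkew is the assumed number of older minor versions that may be emulated.
const maxSkew = 3

// parseMinorVersion parses a "MAJOR.MINOR" string such as "1.30".
func parseMinorVersion(s string) (MinorVersion, error) {
	parts := strings.Split(s, ".")
	if len(parts) != 2 {
		return MinorVersion{}, fmt.Errorf("invalid version %q: want MAJOR.MINOR", s)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return MinorVersion{}, fmt.Errorf("invalid major in %q: %w", s, err)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil {
		return MinorVersion{}, fmt.Errorf("invalid minor in %q: %w", s, err)
	}
	return MinorVersion{Major: major, Minor: minor}, nil
}

// resolveCompatibilityVersion applies the rules from the paragraph above:
// an unset flag defaults to the binary's minor version, and a set flag must
// fall within [binary.Minor-maxSkew, binary.Minor].
func resolveCompatibilityVersion(flagValue string, binary MinorVersion) (MinorVersion, error) {
	if flagValue == "" {
		return binary, nil
	}
	v, err := parseMinorVersion(flagValue)
	if err != nil {
		return MinorVersion{}, err
	}
	if v.Major != binary.Major || v.Minor > binary.Minor || v.Minor < binary.Minor-maxSkew {
		return MinorVersion{}, fmt.Errorf(
			"compatibility version %s is outside the supported range %d.%d..%s",
			v, binary.Major, binary.Minor-maxSkew, binary)
	}
	return v, nil
}

func main() {
	binary := MinorVersion{Major: 1, Minor: 31}
	for _, flagValue := range []string{"", "1.30", "1.28", "1.27", "1.32"} {
		v, err := resolveCompatibilityVersion(flagValue, binary)
		if err != nil {
			fmt.Printf("flag=%q rejected: %v\n", flagValue, err)
			continue
		}
		fmt.Printf("flag=%q resolved to %s\n", flagValue, v)
	}
}
```

Rejecting out-of-range values at startup, rather than clamping them, is a design choice of this sketch; the text above does not say which behavior is intended.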
+
+ Features will be versioned, i.e.:
+
+ ```go
+ type FeatureSpec struct {
+ 	// ...
+
+ 	// Version indicates the version this feature spec was introduced.
+ 	Version semver.Version
+ }
+ ```
+
+ When a component starts, feature gates will be compared against the compatibility version to
+ determine which features to enable for that compatibility version.
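
The comparison described above could look roughly like the following sketch. It is an assumption-laden illustration rather than the proposed implementation: it uses a simplified hypothetical `Version` type instead of `semver.Version` so that it is self-contained, adds a hypothetical `Default` field to `FeatureSpec`, and assumes that a feature introduced after the compatibility version is simply treated as disabled. The feature names are invented.

```go
package main

import "fmt"

// Version is a simplified stand-in for semver.Version (assumption for this sketch).
type Version struct {
	Major, Minor int
}

// LessThan reports whether v sorts before o.
func (v Version) LessThan(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	return v.Minor < o.Minor
}

// FeatureSpec mirrors the shape shown above; Default is a hypothetical field
// standing in for the elided "..." contents.
type FeatureSpec struct {
	Default bool
	// Version indicates the version this feature spec was introduced.
	Version Version
}

// enabledAt reports whether a feature should be on when emulating compat.
// Assumption: specs introduced after compat are ignored, so the feature is off.
func enabledAt(spec FeatureSpec, compat Version) bool {
	if compat.LessThan(spec.Version) {
		return false
	}
	return spec.Default
}

func main() {
	// Hypothetical feature names and versions, for illustration only.
	names := []string{"ExampleOldFeature", "ExampleNewFeature"}
	specs := map[string]FeatureSpec{
		"ExampleOldFeature": {Default: true, Version: Version{Major: 1, Minor: 29}},
		"ExampleNewFeature": {Default: true, Version: Version{Major: 1, Minor: 31}},
	}

	for _, compat := range []Version{{Major: 1, Minor: 30}, {Major: 1, Minor: 31}} {
		for _, name := range names {
			fmt.Printf("compat=%d.%d %s enabled=%t\n",
				compat.Major, compat.Minor, name, enabledAt(specs[name], compat))
		}
	}
}
```

With this rule, a 1.31 binary running at compatibility version 1.30 would leave a feature introduced in 1.31 off until the compatibility version is raised.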

+ Similarly, StorageVersions, APIs and CEL features will be versioned so that they can be
+ configured to match a compatibility version.

### User Stories (Optional)

@@ -209,6 +243,18 @@ bogged down.

#### Story 1

+ A cluster administrator is running Kubernetes 1.30.12 and wishes to perform a cautious
+ upgrade to 1.31.5 using the smallest upgrade steps possible, validating the health
+ of the cluster between each step.
+
+ - For each control plane component, in the [recommended
+   order](https://kubernetes.io/releases/version-skew-policy/):
+   - Upgrades binary to `kubernetes-1.31.5` but sets `--compatibility-version=1.30`
+   - Verifies that the cluster is healthy
+ - Next, for each control plane component:
+   - Sets `--compatibility-version=1.31`
+   - Verifies that the cluster is healthy
+
#### Story 2

### Notes/Constraints/Caveats (Optional)
@@ -222,6 +268,21 @@ This might be a good place to talk about core concepts and how they relate.

### Risks and Mitigations

+ Risk: Introducing this change increases the maintenance burden on Kubernetes
+ maintainers.
+
+ Why we think this is manageable:
+
+ - We already author features to be gated. The only change here is to include
+   enough information about features so that they can be selectively enabled/disabled
+   based on compatibility version.
+ - We already manually deprecate/remove features. This change will instead
+   leave features in code longer, and require feature gates to track at which
+   version a feature is deprecated/removed. The total maintenance work is
+   about the same.
+ - Some maintenance becomes simpler as the additional version data about
+   features makes them easier to reason about and keep track of.
+
<!--
What are the risks of this proposal, and how do we mitigate? Think broadly.
For example, consider both security and how this will impact the larger