Commit c783802

4033: add cri_losing_support metric and clarify support for containerd 1.7 will be dropped in 1.36
Signed-off-by: Peter Hunt <[email protected]>
1 parent 2895ae2 commit c783802

2 files changed: +14 -11 lines changed

keps/sig-node/4033-group-driver-detection-over-cri/README.md

Lines changed: 12 additions & 8 deletions
@@ -186,10 +186,15 @@ cgroupDriver option has been deprecated and will be dropped in a future release.
 The `--cgroup-driver` flag and the cgroupDriver configuration option will be
 deprecated when support for the feature is graduated to GA.
 The configurations flags (and the related fallback behavior) will be removed in
-a later release as per the [Kubernetes deprecation policy][deprecation-policy].
+Kubernetes 1.36. This aligns well with containerd v1.7 going out of support, which is the last
+remaining supported CRI that doesn't have support for this field.
 At the point the kubelet refuses to start if the CRI runtime does not support
 the feature.

+Between version 1.34 and 1.36, the kubelet will emit a counter metric (`cri_losing_support`) when a CRI implementation is
+used that doesn't have support for the RuntimeConfig CRI call. This metric will have a label describing the version support will be dropped by.
+If one node in a cluster has containerd running with 1.7, the metric will look like `cri_losing_support{,version="1.36"} 1`.
+
 Kubelet startup is modified so that connection to the CRI server (container
 runtime) is established and RuntimeConfig is queried before initializing the
 kubelet internal container-manager which is responsible for kubelet-side cgroup
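The startup behavior described in this hunk can be read as a small decision table. The sketch below is only an illustration of that reading, not the kubelet's code: the function name, the `StartupDecision` type, and the two version constants are hypothetical, and the boundaries (metric from 1.34, fallback removed at 1.36) are assumptions taken from the KEP wording.

```python
from dataclasses import dataclass
from typing import Optional

# Assumptions drawn from the KEP text, not from kubelet source code.
METRIC_FROM_MINOR = 34   # cri_losing_support is described as emitted from 1.34
REMOVAL_MINOR = 36       # flag fallback is described as removed in 1.36


@dataclass
class StartupDecision:
    cgroup_driver: Optional[str]  # driver the kubelet would use; None if it refuses to start
    emit_metric: bool             # whether cri_losing_support would be incremented
    fatal: bool                   # whether startup fails


def decide_cgroup_driver(runtime_driver: Optional[str],
                         configured_driver: str,
                         kubelet_minor: int) -> StartupDecision:
    """Hypothetical model of the fallback logic.

    runtime_driver is None when the runtime does not support the
    RuntimeConfig CRI call (for example containerd 1.7).
    """
    if runtime_driver is not None:
        # The runtime reported its driver, so the configured flag is not consulted.
        return StartupDecision(runtime_driver, emit_metric=False, fatal=False)
    if kubelet_minor >= REMOVAL_MINOR:
        # Fallback removed: the kubelet refuses to start.
        return StartupDecision(None, emit_metric=False, fatal=True)
    # Fallback still available: use the flag and, from 1.34 on, flag the node via the metric.
    return StartupDecision(configured_driver,
                           emit_metric=kubelet_minor >= METRIC_FROM_MINOR,
                           fatal=False)
```

For example, `decide_cgroup_driver(None, "cgroupfs", 35)` falls back to the configured driver and marks the metric for emission, while the same call with minor version 36 returns a fatal decision.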
@@ -198,8 +203,6 @@ succeed, an error (error response or timeout) is regarded as a failed
 initialization of the runtime service and kubelet will exit with an error
 message and an error code.

-[deprecation-policy]: https://kubernetes.io/docs/reference/using-api/deprecation-policy/#deprecating-a-flag-or-cli
-
 ### Test Plan

 [x] I/we understand the owners of the involved components may require updates to
@@ -329,8 +332,8 @@ CgroupDriver as they must be today.

 ###### What specific metrics should inform a rollback?

-Nodes being in NotReady state with kubelet logs indicating an error in the
-RuntimeConfig CRI request, making kubelet fail to start.
+`cri_losing_support` metric will be populated on nodes where the CRI implementation will one day lose support. After 1.36, kubelet will fatally error,
+so admins should upgrade their out of support CRI implementations (if `version==1.36`).

 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?

@@ -357,8 +360,8 @@ info`).

 ###### How can someone using this feature know that it is working for their instance?

-No metrics will expose this. Examining kubelet logs whould inform
-that the cgroup driver setting instructed by the runtime is being used.
+The metric `cri_losing_support` when `version == 1.36` will indicate those nodes will be out of support in 1.36.
+If that metric is unpopulated, the feature is on (as it's GA) and the flag fallback is not being used.

 After GA, the CgroupDriver configuration option and the `--cgroup-driver` flag
 will be removed in a future release, in accordance with the
@@ -378,7 +381,8 @@ N/A.

 ###### Are there any missing metrics that would be useful to have to improve observability of this feature?

-N/A.
+The metric `cri_losing_support` when `version == 1.36` will indicate those nodes will be out of support in 1.36.
+If that metric is unpopulated, the feature is on (as it's GA) and the flag fallback is not being used.

 ### Dependencies

keps/sig-node/4033-group-driver-detection-over-cri/kep.yaml

Lines changed: 2 additions & 3 deletions
@@ -38,6 +38,5 @@ feature-gates:
 - kubelet
 disable-supported: false

-# The following PRR answers are required at beta release
-# No metrics will be added for this feature, as it's an internal
-# implementation detail
+metrics:
+- cri_losing_support
