keps/sig-node/2625-cpumanager-policies-thread-placement/README.md
+49 -25 (49 additions & 25 deletions)
@@ -29,6 +29,7 @@
 - [Graduation Criteria of Options](#graduation-criteria-of-options)
 - [Graduation of Options to <code>Beta-quality</code> (non-hidden)](#graduation-of-options-to-beta-quality-non-hidden)
 - [Graduation of Options from <code>Beta-quality</code> to <code>G.A-quality</code>](#graduation-of-options-from-beta-quality-to-ga-quality)
+- [Removal of the CPUManagerPolicyAlphaOptions and CPUManagerPolicyBetaOptions feature gates](#removal-of-the-cpumanagerpolicyalphaoptions-and-cpumanagerpolicybetaoptions-feature-gates)
@@ -248,8 +254,7 @@ NOTE: Even though the feature gate is enabled by default the user still has to e
 The alpha-quality options are hidden by default and only if the `CPUManagerPolicyAlphaOptions` feature gate is enabled the user has the ability to use them.
 The beta-quality options are visible by default, and the feature gate allows a positive acknowledgement that non stable features are being used, and also allows to optionally turn them off.
 Based on the graduation criteria described below, a policy option will graduate from a group to the other (alpha to beta).
-We plan to removete the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` after all options graduated to stable, after a feature cycle passes without new planned options, and not before 1.28, to give ample time to the work in progress option to graduate at least to beta.
-- Since the feature that allows the ability to customize the behaviour of CPUManager static policy as well as the CPUManager Policy option `full-pcpus-only` were both introduced in 1.22 release and meet the above graduation criterion, `full-pcpus-only` would be considered as a non-hidden option i.e. available to be used when explicitly used along with `CPUManagerPolicyOptions` Kubelet flag in the kubelet config or command line argument called `cpumanager-policy-options` .
+- Since the feature that allows the ability to customize the behaviour of CPUManager static policy as well as the CPUManager Policy option `full-pcpus-only` were both introduced in 1.22 release and meet the above graduation criterion, `full-pcpus-only` would be considered as a non-hidden option i.e. available to be used when explicitly used along with `CPUManagerPolicyOptions` Kubelet flag in the kubelet configuration or command line argument called `cpumanager-policy-options` .
 - The introduction of this new feature gate gives us the ability to move the feature to beta and later stable without implying all that the options are beta or stable.
 
 The graduation Criteria of options is described below:
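As an illustration of the opt-in described in the hunk above, the following minimal sketch builds a kubelet configuration that enables the `full-pcpus-only` option. The field names (`cpuManagerPolicy`, `cpuManagerPolicyOptions`, `reservedSystemCPUs`) come from the `KubeletConfiguration` v1beta1 API as understood here and are not part of this diff. The helper name and the reserved CPU value are hypothetical, and feature gates are left at their defaults.

```python
import json


def kubelet_config_full_pcpus_only(reserved_cpus="0"):
    """Return a KubeletConfiguration dict enabling `full-pcpus-only`.

    Assumption: the static policy needs some CPUs reserved for system use;
    `reserved_cpus` is an illustrative value, not a recommendation.
    """
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        # Policy options only apply to the static CPUManager policy.
        "cpuManagerPolicy": "static",
        # Option values are strings, mirroring the key=value form used on
        # the command line.
        "cpuManagerPolicyOptions": {"full-pcpus-only": "true"},
        "reservedSystemCPUs": reserved_cpus,
    }


if __name__ == "__main__":
    # Printed as JSON; converting it to YAML for the kubelet config file
    # is left to the reader.
    print(json.dumps(kubelet_config_full_pcpus_only(), indent=2))
```

Generating the fragment from a function keeps the option map in one place, which may help when more policy options are combined later.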
@@ -262,6 +267,22 @@ The graduation Criteria of options is described below:
 - [X] Allowing time for feedback (1 year) on the policy option.
 - [X] Risks have been addressed.
 
+### Removal of the CPUManagerPolicyAlphaOptions and CPUManagerPolicyBetaOptions feature gates
+
+This KEP added the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` group feature gates alongside the usual changes required to enable the `full-pcpus-only` option.
+
+We plan to remove the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` feature gates after all options have graduated to stable.
+We will defer the additional work of removing the gates to the last graduating option. In case two or more options graduate to GA and thus render the gates obsolete, a new minimal KEP should be issued to remove the gates.
+
+The SIG-node community is considering a future redesign of resource management based on NRI and DRA technologies (possibly extended).
+There have been conversations about this topic since the 1.27 cycle, and [past attempts already](https://github.com/kubernetes/enhancements/issues/3675).
+We thus expect a gradual slowdown of additions to cpumanager, including policy options. At the time of writing (1.33 cycle) we have 6 policy options at various degrees of maturity.
+We expect the rate of proposals for new options to slow down greatly, and to stop entirely once the community moves to the future, as yet unplanned, resource management architecture.
+
+For the reasons above, we believe it's unlikely we will need to add back the `CPUManagerPolicyAlphaOptions` and `CPUManagerPolicyBetaOptions` feature gates once removed.
+Should new options be proposed and agreed on by the community, the recommendation is to graduate them using specific feature gates, per the standard process.
+Considering the expected slowdown, we expect the standard graduation process to be much more manageable in the future, if needed at all.
+
 ### Upgrade / Downgrade Strategy
 
 We expect no impact. The new policies are opt-in and separated by the existing ones.
@@ -317,48 +338,51 @@ Kubelet may fail to start. The kubelet may crash.
 
 ###### What specific metrics should inform a rollback?
 
-The number of pod ending up in Failed for SMTAlignmentError could be used to decide a rollback.
+We can use `cpu_manager_pinning_errors_total` to see all the allocation errors, although it does not distinguish the specific reason.
+In addition, we can use the logs: the number of pods ending up in Failed for SMTAlignmentError could be used to decide a rollback.
 
 ###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
 
 Not Applicable.
 
 ###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
+
 No.
 
 ### Monitoring requirements
 
 ###### How can an operator determine if the feature is in use by workloads?
 
+- Check the metric `container_aligned_compute_resources_count` with the label `boundary=physical_cpu`
 - Inspect the kubelet configuration of the nodes: check feature gates and usage of the new options
 
 ###### How can someone using this feature know that it is working for their instance?
 
-- [ ] Events
-  - Event Reason:
-- [ ] API .status
-  - Condition name:
-  - Other field:
-- [ ] Other (treat as last resort)
+- [X] Other (treat as last resort)
   - Details:
+    - check metrics and their interplay:
+      * the metric `container_aligned_compute_resources_count` with the label `boundary=physical_cpu`
+      * the metric `cpu_manager_pinning_requests_total`
+      * the metric `cpu_manager_pinning_errors_total`
 
 ###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?
 
 N/A.
 
 ###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
 
-- [] Metrics
+- [X] Metrics
   - Metric name:
-  - [Optional] Aggregation method:
-  - Components exposing the metric:
-- [ ] Other (treat as last resort)
-  - Details:
+    * the metric `container_aligned_compute_resources_count` with the label `boundary=physical_cpu`
+    * the metric `cpu_manager_pinning_requests_total`
+    * the metric `cpu_manager_pinning_errors_total`
+  - Components exposing the metric: kubelet
 
 ###### Are there any missing metrics that would be useful to have to improve observability of this feature?
 
-TBD
-
+We can break down the pinning errors total with a new metric like `cpu_manager_errors_count` or
+`container_aligned_compute_resources_failure_count`, using the same labels as we use for `container_aligned_compute_resources_count`.
+These metrics will be added before graduating to GA.
 
 ### Dependencies
 
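The "metrics and their interplay" item in this hunk can be made concrete with a small sketch. It parses a kubelet metrics scrape in the Prometheus text format and derives two signals: the share of pinning requests that failed, and the number of containers aligned at the `physical_cpu` boundary. Several things here are assumptions: the sample scrape is made up, samples are assumed to carry no trailing timestamp, and the function names are hypothetical. The metric names are used exactly as written in the text; a real endpoint may expose them with a component prefix, so they are passed as parameters.

```python
import re

# One sample line: metric name, optional {labels}, value.
# Assumption: no trailing timestamp after the value.
_SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def iter_samples(scrape_text):
    """Yield (name, labels, value) for each sample line in the scrape."""
    for line in scrape_text.splitlines():
        match = _SAMPLE_RE.match(line.strip())
        if match is None:
            continue  # blank line, "# HELP"/"# TYPE" comment, or unparsable
        name, raw_labels, value = match.groups()
        yield name, dict(_LABEL_RE.findall(raw_labels or "")), float(value)


def metric_total(scrape_text, metric_name, **wanted_labels):
    """Sum all samples of `metric_name` whose labels include `wanted_labels`."""
    return sum(
        value
        for name, labels, value in iter_samples(scrape_text)
        if name == metric_name
        and all(labels.get(key) == val for key, val in wanted_labels.items())
    )


def pinning_error_ratio(
    scrape_text,
    requests_metric="cpu_manager_pinning_requests_total",
    errors_metric="cpu_manager_pinning_errors_total",
):
    """Return errors / requests, or 0.0 when no pinning was requested."""
    requests = metric_total(scrape_text, requests_metric)
    if requests == 0:
        return 0.0
    return metric_total(scrape_text, errors_metric) / requests


# Made-up scrape, for illustration only.
SAMPLE_SCRAPE = """\
# TYPE cpu_manager_pinning_requests_total counter
cpu_manager_pinning_requests_total 40
# TYPE cpu_manager_pinning_errors_total counter
cpu_manager_pinning_errors_total 2
# TYPE container_aligned_compute_resources_count counter
container_aligned_compute_resources_count{boundary="physical_cpu"} 7
"""

if __name__ == "__main__":
    print(pinning_error_ratio(SAMPLE_SCRAPE))
    print(metric_total(SAMPLE_SCRAPE,
                       "container_aligned_compute_resources_count",
                       boundary="physical_cpu"))
```

A rising error ratio alongside a flat aligned-container count would suggest that allocations are being rejected rather than satisfied. Any threshold used to trigger a rollback is a local policy choice that this KEP does not define.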
@@ -394,7 +418,7 @@ No.
 
 ###### Can enabling / using this feature result in resource exhaustion of some node resources (PIDs, sockets, inodes, etc.)?
 
-TBD
+No.
 
 ### Troubleshooting
 
@@ -404,11 +428,11 @@ No effect.
 
 ###### What are other known failure modes?
 
-No known failure mode. (TBD)
+Allocation failures can prevent workloads from reaching the Running state. The only remediation is to disable the features and restart the kubelets.
 
 ###### What steps should be taken if SLOs are not being met to determine the problem?
 
-N/A (TBD)
+Inspect the metrics and possibly the logs to learn the failure reason.
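One way to learn the failure reason is to count the pods that ended up in Failed for SMTAlignmentError, as the rollback answer in this diff suggests. The sketch below rests on two assumptions: that the input is a pod list in the JSON shape produced by `kubectl get pods -A -o json`, and that the rejection reason surfaces in `status.reason`. The function name and the sample data are hypothetical.

```python
import json


def count_failed_pods(pod_list_json, reason="SMTAlignmentError"):
    """Count pods whose phase is Failed and whose status reason matches."""
    pods = json.loads(pod_list_json).get("items", [])
    return sum(
        1
        for pod in pods
        if pod.get("status", {}).get("phase") == "Failed"
        and pod.get("status", {}).get("reason") == reason
    )


# Made-up pod list, for illustration only.
SAMPLE_POD_LIST = json.dumps({
    "items": [
        {"metadata": {"name": "a"},
         "status": {"phase": "Failed", "reason": "SMTAlignmentError"}},
        {"metadata": {"name": "b"}, "status": {"phase": "Running"}},
        {"metadata": {"name": "c"},
         "status": {"phase": "Failed", "reason": "Evicted"}},
    ]
})

if __name__ == "__main__":
    print(count_failed_pods(SAMPLE_POD_LIST))
```

Comparing this count with `cpu_manager_pinning_errors_total` can indicate whether SMT alignment explains most of the allocation errors or whether other reasons are involved.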