KEP-2625: Update CPU Manager Policy Options 1.23 Beta
- This PR updates the KEP to capture the policy name `full-pcpus-only`
based on the implementation merged in 1.22 release.
- Explains how this feature is being used for the introduction of another policy option
- Changes pertaining to promotion to Beta
- Checks relevant fields in the Release Signoff Checklist
Signed-off-by: Swati Sehgal <[email protected]>
keps/sig-node/2625-cpumanager-policies-thread-placement/README.md (21 additions & 19 deletions)
@@ -14,7 +14,7 @@
 - [Risks and Mitigations](#risks-and-mitigations)
 - [Design Details](#design-details)
 - [Proposed Change](#proposed-change)
-- [Implementation strategy of reject-non-smt-aligned CPU Manager policy option](#implementation-strategy-of-reject-non-smt-aligned-cpu-manager-policy-option)
+- [Implementation strategy of full-pcpus-only CPU Manager policy option](#implementation-strategy-of-full-pcpus-only-cpu-manager-policy-option)
 - [Resource Accounting](#resource-accounting)
 - [Alternatives](#alternatives)
 - [Add extra resources](#add-extra-resources)
@@ -43,16 +43,16 @@
 Items marked with (R) are required *prior to targeting to a milestone / release*.

-- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements](https://github.com/kubernetes/enhancements/issues/2404)
-- [ ] (R) KEP approvers have approved the KEP status as `implementable`
-- [ ] (R) Design details are appropriately documented
-- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
-- [ ] (R) Graduation criteria is in place
-- [ ] (R) Production readiness review completed
+- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements](https://github.com/kubernetes/enhancements/issues/2404)
+- [X] (R) KEP approvers have approved the KEP status as `implementable`
+- [X] (R) Design details are appropriately documented
+- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
+- [X] (R) Graduation criteria is in place
+- [X] (R) Production readiness review completed
 - [ ] Production readiness review approved
-- [ ] "Implementation History" section is up-to-date for milestone
+- [X] "Implementation History" section is up-to-date for milestone
 - ~~ [] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] ~~
-- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+- [X] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
@@ -114,30 +114,30 @@ The impact in the shared codebase will be addressed enhancing the current testsu
 We propose to
 - add a new flag in Kubelet called `CPUManagerPolicyOptions` in the kubelet config or command line argument called `cpumanager-policy-options` which allows the user to specify the CPU Manager policy option.
-- add a new cpu manager option called `reject-non-smt-aligned`; if present, this option will enable further refinements of the existing static policy.
+- add a new cpu manager option called `full-pcpus-only`; if present, this option will enable further refinements of the existing static policy.
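
As an illustration of the proposal above, here is a minimal Go sketch of how a set of policy options could be validated once it reaches the CPU Manager. It assumes the options arrive as a map from option name to a boolean-like string value. `StaticPolicyOptions` and `ParsePolicyOptions` are hypothetical names used only for this sketch, not the kubelet's actual API.

```go
package main

import (
	"fmt"
	"strconv"
)

// FullPCPUsOnlyOption is the option name used in this KEP.
const FullPCPUsOnlyOption = "full-pcpus-only"

// StaticPolicyOptions is a hypothetical typed view of the parsed options.
type StaticPolicyOptions struct {
	FullPhysicalCPUsOnly bool
}

// ParsePolicyOptions is a hypothetical helper. It assumes options are given
// as name -> value pairs, parses each value as a boolean, and rejects any
// option name it does not know.
func ParsePolicyOptions(raw map[string]string) (StaticPolicyOptions, error) {
	var opts StaticPolicyOptions
	for name, value := range raw {
		switch name {
		case FullPCPUsOnlyOption:
			enabled, err := strconv.ParseBool(value)
			if err != nil {
				return StaticPolicyOptions{}, fmt.Errorf("bad value %q for option %q: %w", value, name, err)
			}
			opts.FullPhysicalCPUsOnly = enabled
		default:
			return StaticPolicyOptions{}, fmt.Errorf("unsupported CPU Manager policy option %q", name)
		}
	}
	return opts, nil
}

func main() {
	opts, err := ParsePolicyOptions(map[string]string{FullPCPUsOnlyOption: "true"})
	fmt.Println(opts.FullPhysicalCPUsOnly, err)
}
```

Rejecting unknown names up front means a misspelled option fails loudly at startup instead of being silently ignored.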

 The static policy allocates CPUs using a topology-aware best-fit allocation. This enhancement wants to provide stronger guarantees by restricting the allocation of threads.
 The aim is to achieve the isolation for workloads managed by Kubernetes. The other part of isolation is (as of now) not managed by Kubernetes, as described in [Explicitly Reserved CPU List](https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list) and [Static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy).

-Let's summarize the key properties of the `reject-non-smt-aligned` option:
+Let's summarize the key properties of the `full-pcpus-only` option:
 - Preserve all the properties of the `static` policy.
 - Never allocate less than a physical-cpu worth amount of cores.
 - With this requirement enforced, the CPUManager allocation algorithm will guarantee avoidance of physical core sharing.
 - Should the node not have enough free physical cores, the Pod will be put in Failed state, with `SMTAlignmentError` as reason.
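
The "whole physical cores only" property listed above comes down to a divisibility check. The Go sketch below illustrates it under the assumption that every core exposes the same number of hardware threads. `isSMTAligned` is a hypothetical helper name for this sketch, not a function taken from the kubelet.

```go
package main

import "fmt"

// isSMTAligned reports whether a request for requestedCPUs virtual CPUs can
// be satisfied using only whole physical cores, assuming every core exposes
// threadsPerCore hardware threads. Non-positive inputs are treated as not
// aligned (an assumption made for this sketch).
func isSMTAligned(requestedCPUs, threadsPerCore int) bool {
	if requestedCPUs <= 0 || threadsPerCore <= 0 {
		return false
	}
	return requestedCPUs%threadsPerCore == 0
}

func main() {
	// On a 2-way SMT system, only even requests fill whole cores.
	for _, n := range []int{2, 5, 6} {
		fmt.Printf("request=%d aligned=%t\n", n, isSMTAligned(n, 2))
	}
}
```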

-### Implementation strategy of reject-non-smt-aligned CPU Manager policy option
+### Implementation strategy of full-pcpus-only CPU Manager policy option

-- In order to introduce the SMT-alignment check in CPU Manager, we introduce a new flag in Kubelet to allow the user to specify `cpumanager-policy-options` which when specified with `reject-non-smt-aligned` as its value provides the capability to modify the behaviour of static policy to strictly guarantee allocation of whole cores to a workload.
+- In order to introduce the SMT-alignment check in CPU Manager, we introduce a new flag in Kubelet to allow the user to specify `cpumanager-policy-options` which when specified with `full-pcpus-only` as its value provides the capability to modify the behaviour of static policy to strictly guarantee allocation of whole cores to a workload.
 - The `CPUManagerPolicyOptions` received from the kubelet config/command line args is propagated to the Container Manager.
 - The responsibility of admission control is centralized in containermanager. The resource managers and/or the resource allocation orchestrator (Topology Manager) still have the responsibility of running the checks to admit the pods, but the handling of these errors and the building of the pod lifecycle result are now factored in containermanager.
 - Prior to this feature, the Container Manager admission handler was delegated to the topology manager if the latter was enabled. This worked well under the assumption that only Topology Manager had the ability to reject admissions with pods. But with the introduction of this feature, the CPU Manager also needs the ability to possibly reject pods if strict SMT alignment is requested. In order to do so, we introduce a new error and let it drive the rejection. Due to an already existing dependency between CPUManager and TopologyManager as the former imports the latter in order to support the `topologymanager.HintProvider` interface, container manager is considered the appropriate place for performing admission control.
-- When `reject-non-smt-aligned` policy option is specified along with `static` CPU Manager policy, an additional check in the allocation logic of the `static` policy ensures that CPUs would be allocated such that full cores are allocated. Because of this check, a pod would never have to acquire single threads with the aim to fill partially-allocated cores.
+- When `full-pcpus-only` policy option is specified along with `static` CPU Manager policy, an additional check in the allocation logic of the `static` policy ensures that CPUs would be allocated such that full cores are allocated. Because of this check, a pod would never have to acquire single threads with the aim to fill partially-allocated cores.
 - In case request translates to partial occupancy of the cores, the Pod will not be admitted and would fail with `SMTAlignmentError`.
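
The flow described in the bullets above (the resource manager runs the check, a dedicated error drives the rejection, and the container manager builds the admission result) can be sketched in Go as follows. This is an illustration under stated assumptions and not the kubelet implementation. `AdmitResult`, `allocate` and `admit` are hypothetical names, the fields of `SMTAlignmentError` are invented for the sketch, and the fallback reason string is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
)

// SMTAlignmentError is a hypothetical error type named after the failure
// reason mentioned in the KEP; its fields are assumptions of this sketch.
type SMTAlignmentError struct {
	RequestedCPUs  int
	ThreadsPerCore int
}

func (e SMTAlignmentError) Error() string {
	return fmt.Sprintf("requested %d CPUs, not a multiple of %d threads per core",
		e.RequestedCPUs, e.ThreadsPerCore)
}

// AdmitResult is a hypothetical, simplified pod admission result.
type AdmitResult struct {
	Admit   bool
	Reason  string
	Message string
}

// allocate stands in for the static policy's allocation step: it only runs
// the alignment check and reports a typed error on failure.
func allocate(requestedCPUs, threadsPerCore int) error {
	if threadsPerCore <= 0 {
		return fmt.Errorf("invalid topology: %d threads per core", threadsPerCore)
	}
	if requestedCPUs%threadsPerCore != 0 {
		return SMTAlignmentError{RequestedCPUs: requestedCPUs, ThreadsPerCore: threadsPerCore}
	}
	return nil
}

// admit plays the role of the centralized admission handler: the resource
// manager runs the check, and this function turns the error into a result.
func admit(requestedCPUs, threadsPerCore int) AdmitResult {
	err := allocate(requestedCPUs, threadsPerCore)
	if err == nil {
		return AdmitResult{Admit: true}
	}
	var smtErr SMTAlignmentError
	if errors.As(err, &smtErr) {
		return AdmitResult{Reason: "SMTAlignmentError", Message: err.Error()}
	}
	// Placeholder reason for any other failure (an assumption of this sketch).
	return AdmitResult{Reason: "UnexpectedAdmissionError", Message: err.Error()}
}

func main() {
	fmt.Printf("%+v\n", admit(5, 2))
	fmt.Printf("%+v\n", admit(6, 2))
}
```

Keeping the error typed lets the admission handler choose the rejection reason without the CPU Manager needing to know how pod lifecycle results are built.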


 ### Resource Accounting

-To illustrate the behaviour of the `reject-non-smt-aligned` policy option, we will consider the following CPU topology. We will use as example a CPU package with 16 physical cores, 2-way SMT-capable.
+To illustrate the behaviour of the `full-pcpus-only` policy option, we will consider the following CPU topology. We will use as example a CPU package with 16 physical cores, 2-way SMT-capable.

 

@@ -162,11 +162,11 @@ spec:
 cpu: "5"
 ```

-The `reject-non-smt-aligned` policy option will cause the pod to be rejected since it doesn't request enough cores to consume all virtual threads exposed by the CPU.
+The `full-pcpus-only` policy option will cause the pod to be rejected since it doesn't request enough cores to consume all virtual threads exposed by the CPU.

 would need to make sure the remaining core on the half-allocated physical CPU is left unallocated to avoid noisy neighbours.

-
+

 The container will then actually get more virtual cores (6) than it is requesting (5).
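
The gap between requested and reserved virtual cores in this scenario is a round-up to whole physical cores. The short Go sketch below works through the arithmetic for the 2-way SMT example; `cpusToReserve` is a hypothetical helper name, and the sketch assumes a positive, uniform thread count per core.

```go
package main

import "fmt"

// cpusToReserve rounds a request up to a whole number of physical cores and
// returns the resulting count of virtual CPUs. It assumes threadsPerCore > 0.
func cpusToReserve(requested, threadsPerCore int) int {
	cores := (requested + threadsPerCore - 1) / threadsPerCore // ceiling division
	return cores * threadsPerCore
}

func main() {
	requested := 5
	reserved := cpusToReserve(requested, 2)
	// 5 requested CPUs need 3 physical cores, which is 6 virtual CPUs.
	fmt.Printf("requested=%d reserved=%d unused=%d\n", requested, reserved, reserved-requested)
}
```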
172
172
@@ -250,7 +250,7 @@ We would like to mention a further extension of this work, which we are *not* pr
 A further subset of the latency sensitive class of workload we identified (CNF, HFT) benefits most of non-SMT system, delivering the best possible performance here.
 For these applications, just disabling SMT at machine level solves the need of the workload, but overall creates worse usage of hardware resources and poorer container density.

-Another policy option, or a further refinement of `reject-non-smt-aligned`, which enables non-SMT emulation on SMT-enabled system would allow to accommodate these needs, but this would cause even more significant resource accounting mismatches
+Another policy option, or a further refinement of `full-pcpus-only`, which enables non-SMT emulation on SMT-enabled system would allow to accommodate these needs, but this would cause even more significant resource accounting mismatches
 as described above. Furthermore, at the moment of writing we are still assessing how large is the set of the classes which benefit of these extra guarantees.

 For all these reasons we postponed this work to a later date.
@@ -268,6 +268,7 @@ The [implementation PR](https://github.com/kubernetes/kubernetes/pull/101432) wi
 #### Alpha to Beta Graduation
 - [X] Gather feedback from the consumer of the policy.
 - [X] No major bugs reported in the previous cycle.
+- [X] Use of this policy option to further configure the behavior of CPU manager. Another CPUManager policy option `distribute-cpus-across-numa` is being proposed in 1.23 release to distribute CPUs across NUMA nodes instead of packing them.

 #### Beta to G.A Graduation
 - [X] Allowing time for feedback (1 year).
@@ -332,7 +333,7 @@ No changes needed
 ### Troubleshooting

 * **How does this feature react if the API server and/or etcd is unavailable?**: No effect.
-* **What are other known failure modes?** TBD
+* **What are other known failure modes?** No known failure mode.
 * **What steps should be taken if SLOs are not being met to determine the problem?** N/A
 - 2021-05-10: KEP update to rename `smtalign` to `reject-non-smt-aligned` for better clarity and address review comments
 - 2021-05-11: KEP update to add the `configurable` alias and address review comments
 - 2021-05-13: KEP update to postpone the `configurable` alias, per review comments
+- 2021-09-02: KEP update to capture the policy name `full-pcpus-only` based on the implementation merged in 1.22, explain how this feature is being used for introduction of another policy option and updates pertaining to promotion of the feature to Beta.