Commit 45cb658

KEP-2625: Update CPU Manager Policy Options 1.23 Beta
- This PR updates the KEP to capture the policy name `full-pcpus-only` based on the implementation merged in the 1.22 release.
- Explains how this feature is being used for the introduction of another policy option.
- Changes pertaining to promotion to Beta.
- Checks relevant fields in the Release Signoff Checklist.

Signed-off-by: Swati Sehgal <[email protected]>
1 parent 4dd81f5 commit 45cb658

3 files changed: +26 -21 lines changed

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
 kep-number: 2625
 alpha:
   approver: "@johnbelamaric"
+beta:
+  approver: "@johnbelamaric"

keps/sig-node/2625-cpumanager-policies-thread-placement/README.md

Lines changed: 21 additions & 19 deletions
@@ -14,7 +14,7 @@
 - [Risks and Mitigations](#risks-and-mitigations)
 - [Design Details](#design-details)
 - [Proposed Change](#proposed-change)
-- [Implementation strategy of reject-non-smt-aligned CPU Manager policy option](#implementation-strategy-of-reject-non-smt-aligned-cpu-manager-policy-option)
+- [Implementation strategy of full-pcpus-only CPU Manager policy option](#implementation-strategy-of-full-pcpus-only-cpu-manager-policy-option)
 - [Resource Accounting](#resource-accounting)
 - [Alternatives](#alternatives)
 - [Add extra resources](#add-extra-resources)
@@ -43,16 +43,16 @@

 Items marked with (R) are required *prior to targeting to a milestone / release*.

-- [ ] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements](https://github.com/kubernetes/enhancements/issues/2404)
-- [ ] (R) KEP approvers have approved the KEP status as `implementable`
-- [ ] (R) Design details are appropriately documented
-- [ ] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
-- [ ] (R) Graduation criteria is in place
-- [ ] (R) Production readiness review completed
+- [X] (R) Enhancement issue in release milestone, which links to KEP dir in [kubernetes/enhancements](https://github.com/kubernetes/enhancements/issues/2404)
+- [X] (R) KEP approvers have approved the KEP status as `implementable`
+- [X] (R) Design details are appropriately documented
+- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
+- [X] (R) Graduation criteria is in place
+- [X] (R) Production readiness review completed
 - [ ] Production readiness review approved
-- [ ] "Implementation History" section is up-to-date for milestone
+- [X] "Implementation History" section is up-to-date for milestone
 - ~~ [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io] ~~
-- [ ] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
+- [X] Supporting documentation e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes

 [kubernetes.io]: https://kubernetes.io/
 [kubernetes/enhancements]: https://git.k8s.io/enhancements
@@ -114,30 +114,30 @@ The impact in the shared codebase will be addressed enhancing the current testsu

 We propose to
 - add a new flag in Kubelet called `CPUManagerPolicyOptions` in the kubelet config or command line argument called `cpumanager-policy-options` which allows the user to specify the CPU Manager policy option.
-- add a new cpu manager option called `reject-non-smt-aligned`; if present, this option will enable further refinements of the existing static policy.
+- add a new cpu manager option called `full-pcpus-only`; if present, this option will enable further refinements of the existing static policy.

 The static policy allocates CPUs using a topology-aware best-fit allocation. This enhancement wants to provide stronger guarantees by restricting the allocation of threads.
 The aim is to achieve the isolation for workloads managed by Kubernetes. The other part of isolation is (as of now) not managed by Kubernetes, as described in [Explicitly Reserved CPU List](https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list) and [Static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy).

-Let's summarize the key properties of the `reject-non-smt-aligned` option:
+Let's summarize the key properties of the `full-pcpus-only` option:
 - Preserve all the properties of the `static` policy.
 - Never allocate less than a physical-cpu worth amount of cores.
 - With this requirement enforced, the CPUManager allocation algorithm will guarantee avoidance of physical core sharing.
 - Should the node not have enough free physical cores, the Pod will be put in Failed state, with `SMTAlignmentError` as reason.

-### Implementation strategy of reject-non-smt-aligned CPU Manager policy option
+### Implementation strategy of full-pcpus-only CPU Manager policy option

-- In order to introduce the SMT-alignment check in CPU Manager, we introduce a new flag in Kubelet to allow the user to specify `cpumanager-policy-options` which when specified with `reject-non-smt-aligned` as its value provides the capability to modify the behaviour of static policy to strictly guarantee allocation of whole cores to a workload.
+- In order to introduce the SMT-alignment check in CPU Manager, we introduce a new flag in Kubelet to allow the user to specify `cpumanager-policy-options` which when specified with `full-pcpus-only` as its value provides the capability to modify the behaviour of static policy to strictly guarantee allocation of whole cores to a workload.
 - The `CPUManagerPolicyOptions` received from the kubelet config/command line args is propogated to the Container Manager.
 - The responsibility of admission control is centralized in containermanager. The resource managers and/or the resource allocation orchestrator (Topology Manager) still have the responsibility of running the checks to admit the pods, but the handling of these errors and the building of the pod lifecycle result are now factored in containermanager.
 - Prior to this feature, the Container Manager admission handler was delegated to the topology manager if the latter was enabled. This worked well under the assumption that only Topology Manager had the ability to reject admissions with pods. But with the introduction of this feature, the CPU Manager also needs the ability to possibly reject pods if strict SMT alignment is requested. In order to do so, we introduce a new error and let it drive the rejection. Due to an already existing dependency between CPUManager and TopologyManager as the former imports the latter in order to support the `topologymanager.HintProvider` interface, container manager is considered as the appropriate for performing admission control.
-- When `reject-non-smt-aligned` policy option is specified along with `static` CPU Manager policy, an additional check in the allocation logic of the `static` policy ensures that CPUs would be allocated such that full cores are allocated. Because of this check, a pod would never have to acquire single threads with the aim to fill partially-allocated cores.
+- When `full-pcpus-only` policy option is specified along with `static` CPU Manager policy, an additional check in the allocation logic of the `static` policy ensures that CPUs would be allocated such that full cores are allocated. Because of this check, a pod would never have to acquire single threads with the aim to fill partially-allocated cores.
 - In case request translates to partial occupancy of the cores, the Pod will not be admitted and would fail with `SMTAlignmentError`.


 ### Resource Accounting

-To illustrate the behaviour of the `reject-non-smt-aligned` policy option, we will consider the following CPU topology. We will use as example a CPU package with 16 physical cores, 2-way SMT-capable.
+To illustrate the behaviour of the `full-pcpus-only` policy option, we will consider the following CPU topology. We will use as example a CPU package with 16 physical cores, 2-way SMT-capable.

 ![Example Topology](smtalign-topology.png)

@@ -162,11 +162,11 @@ spec:
 cpu: "5"
 ```

-The `reject-non-smt-aligned` policy option will cause the pod to be rejected since it doesn't request enough cores to consume all virtual threads exposed by the CPU.
+The `full-pcpus-only` policy option will cause the pod to be rejected since it doesn't request enough cores to consume all virtual threads exposed by the CPU.

 would need to make sure the remaining core on the half-allocated physical CPU is left unallocated to avoid noisy neighbours.

-![Example core allocation with the reject-non-smt-aligned policy option when requesting a odd number of cores](smtalign-allocation-odd-cores.png)
+![Example core allocation with the full-pcpus-only policy option when requesting a odd number of cores](smtalign-allocation-odd-cores.png)

 The container will then actually get more virtual cores (6) than what is requesting (5).

@@ -250,7 +250,7 @@ We would like to mention a further extension of this work, which we are *not* pr
 A further subset of the latency sensitive class of workload we identified (CNF, HFT) benefits most of non-SMT system, delivering the best possible performance here.
 For these applications, just disabling SMT at machine level solves the need of the workload, but overall creates worse usage of hardware resources and poorer container density.

-Another policy option, or a further refinement of `reject-non-smt-aligned`, which enables non-SMT emulation on SMT-enabled system would allow to accommodate these needs, but this would cause even more significant resource accounting mismatches
+Another policy option, or a further refinement of `full-pcpus-only`, which enables non-SMT emulation on SMT-enabled system would allow to accommodate these needs, but this would cause even more significant resource accounting mismatches
 as described above. Furthermore, at the moment of writing we are still assessing how large is the set of the classes which benefit of these extra guarantees.

 For all these reasons we postponed this work to a later date.
@@ -268,6 +268,7 @@ The [implementation PR](https://github.com/kubernetes/kubernetes/pull/101432) wi
 #### Alpha to Beta Graduation
 - [X] Gather feedback from the consumer of the policy.
 - [X] No major bugs reported in the previous cycle.
+- [X] Use of this policy option to further configure the behavior of CPU manager. Another CPUManager policy option `distribute-cpus-across-numa` is being proposed in 1.23 release to distribute CPUs across NUMA nodes instead of packing them.

 #### Beta to G.A Graduation
 - [X] Allowing time for feedback (1 year).
@@ -332,7 +333,7 @@ No changes needed
 ### Troubleshooting

 * **How does this feature react if the API server and/or etcd is unavailable?**: No effect.
-* **What are other known failure modes?** TBD
+* **What are other known failure modes?** No known failure mode.
 * **What steps should be taken if SLOs are not being met to determine the problem?** N/A

 [supported limits]: https://git.k8s.io/community//sig-scalability/configs-and-limits/thresholds.md
@@ -349,3 +350,4 @@ No changes needed
 - 2021-05-10: KEP update to add to rename the `smtalign` to `reject-non-smt-aligned` for better clarity and address review comments
 - 2021-05-11: KEP update to add to the `configurable` alias and address review comments
 - 2021-05-13: KEP update to postpone the `configurable` alias, per review comments
+- 2021-09-02: KEP update to capture the policy name `full-pcpus-only` based on the implementation merged in 1.22, explain how this feature is being used for introduction of another policy option and updates pertaining to promotion of the feature to Beta.
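
Reviewer's note: the admission rule that the README changes above describe for `full-pcpus-only` can be sketched in a few lines of Go. This is a minimal illustration and not the kubelet code. The function `checkFullPCPUsOnly`, the shape of the `SMTAlignmentError` type, and the error text are assumptions made for this example; only the error name and the rule (a request that would partially occupy a physical core is rejected) come from the KEP text.

```go
package main

import "fmt"

// SMTAlignmentError is a hypothetical stand-in for the error named in the KEP.
// Its fields and message are assumptions made for this sketch.
type SMTAlignmentError struct {
	RequestedCPUs int
	CPUsPerCore   int
}

func (e SMTAlignmentError) Error() string {
	return fmt.Sprintf("SMTAlignmentError: requested %d CPUs is not a multiple of %d CPUs per core",
		e.RequestedCPUs, e.CPUsPerCore)
}

// checkFullPCPUsOnly (hypothetical helper) rejects any request that would leave
// a physical core partially allocated, i.e. a request that is not a whole
// multiple of the number of hardware threads per core.
func checkFullPCPUsOnly(requestedCPUs, cpusPerCore int) error {
	if cpusPerCore <= 0 {
		return fmt.Errorf("invalid CPUs per core: %d", cpusPerCore)
	}
	if requestedCPUs%cpusPerCore != 0 {
		return SMTAlignmentError{RequestedCPUs: requestedCPUs, CPUsPerCore: cpusPerCore}
	}
	return nil
}

func main() {
	// 2-way SMT, as in the KEP's example topology.
	for _, req := range []int{4, 5, 6} {
		if err := checkFullPCPUsOnly(req, 2); err != nil {
			fmt.Printf("cpu=%d: rejected: %v\n", req, err)
			continue
		}
		fmt.Printf("cpu=%d: admitted\n", req)
	}
}
```

With the KEP's example of a 2-way SMT package, a container requesting `cpu: "5"` fails this check, while requests for 4 or 6 CPUs pass.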

keps/sig-node/2625-cpumanager-policies-thread-placement/kep.yaml

Lines changed: 3 additions & 2 deletions
@@ -7,6 +7,7 @@ owning-sig: sig-node
 participating-sigs: []
 status: implementable
 creation-date: "2021-04-14"
+last-updated: "2021-09-02"
 reviewers:
 - "@klueska"
 approvers:
@@ -17,12 +18,12 @@ see-also: []
 replaces: []

 # The target maturity stage in the current dev cycle for this KEP.
-stage: alpha
+stage: beta

 # The most recent milestone for which work toward delivery of this KEP has been
 # done. This can be the current (upcoming) milestone, if it is being actively
 # worked on.
-latest-milestone: "v1.22"
+latest-milestone: "v1.23"

 # The milestone at which this feature was, or is targeted to be, at each stage.
 milestone:
