Commit bd3a401

doc: renamed option for clarity, address comments
Worth pointing out that cpumanager is _already_ smt-aligning. This option rejects workloads which cannot be aligned because they request an odd number of cores.
Signed-off-by: Francesco Romani <[email protected]>
1 parent 7982c8d commit bd3a401

File tree

1 file changed, 20 additions and 13 deletions:
  • keps/sig-node/2625-cpumanager-policies-thread-placement

keps/sig-node/2625-cpumanager-policies-thread-placement/README.md

@@ -14,7 +14,7 @@
1414
- [Risks and Mitigations](#risks-and-mitigations)
1515
- [Design Details](#design-details)
1616
- [Proposed Change](#proposed-change)
17-
- [Implementation strategy of smtalign CPU Manager policy option](#implementation-strategy-of-smtalign-cpu-manager-policy-option)
17+
- [Implementation strategy of reject-non-smt-aligned CPU Manager policy option](#implementation-strategy-of-reject-non-smt-aligned-cpu-manager-policy-option)
1818
- [Resource Accounting](#resource-accounting)
1919
- [Alternatives](#alternatives)
2020
- [Add extra resources](#add-extra-resources)
@@ -70,8 +70,8 @@ to consider thread-level allocation, to avoid physical CPU sharing and prevent p
7070

7171
### Goals
7272

73-
* Allow the workload to request the core allocation at hardware-thread level, avoiding noisy neighbours situations
74-
* Allow the workload to request full physical core allocation, to enable more efficient cache sharing
73+
* Prevent workloads from requesting cores that don't consume a full CPU by rejecting them.
74+
This guarantees that no physical core is shared among different containers, which improves cache efficiency and mitigates the noisy neighbours problem.
7575

7676
## Proposal
7777

@@ -109,7 +109,7 @@ The impact in the shared codebase will be addressed enhancing the current testsu
109109

110110
### Proposed Change
111111

112-
We propose to add a new flag in Kubelet called `CPUManagerPolicyOptions` in the kubelet config or command line argument called `cpumanager-policy-options` which allows the user to specify the CPU Manager policy option. If the value of this option is specified to be `smtalign`, it results in further refinements of the existing static policy.
112+
We propose to add a new flag in Kubelet called `CPUManagerPolicyOptions` in the kubelet config or command line argument called `cpumanager-policy-options` which allows the user to specify the CPU Manager policy option. If the value of this option is specified to be `reject-non-smt-aligned`, it results in further refinements of the existing static policy.
113113
The static policy allocates CPUs using a topology-aware best-fit allocation. This enhancement wants to provide stronger guarantees by restricting the allocation of threads.
114114
The aim is to achieve the isolation for workloads managed by Kubernetes. The other part of isolation is (as of now) not managed by Kubernetes, as described in [Explicitly Reserved CPU List](https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list) and [Static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy).
115115

@@ -119,20 +119,20 @@ Key properties:
119119
- With this requirement enforced, the cpumanager allocation algorithm will guarantee avoidance of physical core sharing.
120120
- Should the node not have enough free physical cores, the Pod will be put in Failed state, with `SMTAlignmentError` as reason.
121121

122-
### Implementation strategy of smtalign CPU Manager policy option
122+
### Implementation strategy of reject-non-smt-aligned CPU Manager policy option
123123

124-
- In order to introduce SMT-alignment in CPU Manager, we introduce a new flag in Kubelet to allow the user to specify `cpumanager-policy-options` which when specified with `smtalign` as its value provides the capability to modify the behaviour of static policy to strictly guarantee allocation of whole cores to a workload.
124+
- In order to introduce SMT-alignment in CPU Manager, we introduce a new flag in Kubelet to allow the user to specify `cpumanager-policy-options` which when specified with `reject-non-smt-aligned` as its value provides the capability to modify the behaviour of static policy to strictly guarantee allocation of whole cores to a workload.
125125
- The `CPUManagerPolicyOptions` received from the kubelet config/command line args is propagated to the Container Manager.
126126
- The responsibility of admission control is centralized in containermanager. The resource managers and/or the resource allocation orchestrator (Topology Manager) still have the responsibility of running the checks to admit the pods, but the handling of these errors and the building of the pod lifecycle result are now factored in containermanager.
127127
- Prior to this feature, the Container Manager admission handler was delegated to the topology manager if the latter was enabled. This worked well under the assumption that only Topology Manager had the ability to reject pods at admission. But with the introduction of this feature, the CPU Manager also needs the ability to reject pods if strict SMT alignment is requested. In order to do so, we introduce a new error and let it drive the rejection. Because cpumanager already depends on topologymanager (the former imports the latter in order to support the topologymanager.HintProvider interface), container manager is considered the appropriate place for performing admission control.
128-
- When `smtalign` policy option is specified along with `static` CPU Manager policy, an additional check in the allocation logic of the `static` policy ensures that CPUs would be allocated such that full cores are allocated. Because of this check, a pod would never have to acquire single threads with the aim to fill partially-allocated cores.
128+
- When `reject-non-smt-aligned` policy option is specified along with `static` CPU Manager policy, an additional check in the allocation logic of the `static` policy ensures that CPUs would be allocated such that full cores are allocated. Because of this check, a pod would never have to acquire single threads with the aim to fill partially-allocated cores.
129129
- In case the request translates to partial occupancy of the cores, the Pod will not be admitted and will fail with `SMTAlignmentError`.
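
The admission check described in the list above can be sketched in Go as follows. This is only an illustration under stated assumptions, not the kubelet implementation: the names `smtAlignmentError` and `checkSMTAlignment` are hypothetical, and the sketch assumes that "aligned" means the requested CPU count is an exact multiple of the number of hardware threads per physical core.

```go
package main

import "fmt"

// smtAlignmentError is a hypothetical error type for this sketch. The KEP
// only names `SMTAlignmentError` as the reason reported for the failed Pod.
type smtAlignmentError struct {
	requestedCPUs int
	cpusPerCore   int
}

func (e smtAlignmentError) Error() string {
	return fmt.Sprintf("SMTAlignmentError: requested %d CPUs, not a multiple of %d threads per core",
		e.requestedCPUs, e.cpusPerCore)
}

// checkSMTAlignment returns an error when the request cannot be satisfied
// with whole physical cores only (assumption: alignment is a simple
// divisibility check on the requested CPU count).
func checkSMTAlignment(requestedCPUs, cpusPerCore int) error {
	if cpusPerCore <= 0 {
		return fmt.Errorf("invalid cpusPerCore: %d", cpusPerCore)
	}
	if requestedCPUs%cpusPerCore != 0 {
		return smtAlignmentError{requestedCPUs, cpusPerCore}
	}
	return nil
}

func main() {
	// 2-way SMT, as in the example topology used by the KEP.
	for _, req := range []int{4, 5, 6} {
		if err := checkSMTAlignment(req, 2); err != nil {
			fmt.Printf("cpu=%d rejected: %v\n", req, err)
			continue
		}
		fmt.Printf("cpu=%d admitted\n", req)
	}
}
```

With 2-way SMT, a request for 5 CPUs fails the check while requests for 4 or 6 pass, which matches the commit message's point that the option rejects requests for an odd number of cores.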
130130

131131

132132

133133
### Resource Accounting
134134

135-
To illustrate the behaviour of the `smtalign` policy option, we will consider the following CPU topology. We will use as example a CPU package with 16 physical cores, 2-way SMT-capable.
135+
To illustrate the behaviour of the `reject-non-smt-aligned` policy option, we will consider the following CPU topology. We will use as example a CPU package with 16 physical cores, 2-way SMT-capable.
136136

137137
![Example Topology](smtalign-topology.png)
138138

@@ -157,9 +157,11 @@ spec:
157157
cpu: "5"
158158
```
159159
160-
The `smtalign` policy option would need to make sure the remaining core on the half-allocated physical CPU is left unallocated to avoid noisy neighbours.
160+
The `reject-non-smt-aligned` policy option will cause the pod to be rejected since it doesn't request enough cores to consume all virtual threads exposed by the CPU.
161161

162-
![Example core allocation with the smtalign policy option when requesting a odd number of cores](smtalign-allocation-odd-cores.png)
162+
would need to make sure the remaining core on the half-allocated physical CPU is left unallocated to avoid noisy neighbours.
163+
164+
![Example core allocation with the reject-non-smt-aligned policy option when requesting an odd number of cores](smtalign-allocation-odd-cores.png)
163165

164166
The container will then actually get more virtual cores (6) than it requested (5).
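
The arithmetic behind the 5 versus 6 figure can be sketched in Go. The helper name `wholeCoreCPUs` is hypothetical and not part of the KEP. The sketch assumes that occupying whole physical cores means rounding the request up to the next multiple of the threads per core, and that both arguments are positive.

```go
package main

import "fmt"

// wholeCoreCPUs rounds a CPU request up to the next multiple of
// threadsPerCore, i.e. the number of virtual cores consumed when only
// whole physical cores are used. Both arguments are assumed positive.
func wholeCoreCPUs(requested, threadsPerCore int) int {
	cores := (requested + threadsPerCore - 1) / threadsPerCore // ceiling division
	return cores * threadsPerCore
}

func main() {
	// 5 requested CPUs on a 2-way SMT package occupy 3 physical cores.
	fmt.Println(wholeCoreCPUs(5, 2))
}
```

A request that is already a multiple of the thread count is returned unchanged, so only misaligned requests lead to the accounting mismatch discussed in this section.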
165167

@@ -175,7 +177,11 @@ This approach follows what the Topology Manager already does.
175177

176178
### Alternatives
177179

178-
The only drawback of the proposed admission handler is that pods might have to overallocate resources.
180+
We acknowledge a few drawbacks of the proposed approach:
181+
- pods that are rejected due to an AdmissionError do not get automatically rescheduled. Workloads that need to be rescheduled must
182+
use extra Kubernetes facilities, for example Deployments.
183+
- pods might have to overallocate resources.
184+
179185
We evaluated possible alternatives to the extra admission control, but we eventually discarded all of them. We document them in this section.
180186

181187
#### Add extra resources
@@ -229,7 +235,7 @@ We would like to mention a further extension of this work, which we are *not* pr
229235
A further subset of the latency sensitive class of workloads we identified (CNF, HFT) benefits most from a non-SMT system, which delivers the best possible performance for them.
230236
For these applications, just disabling SMT at machine level solves the need of the workload, but overall creates worse usage of hardware resources and poorer container density.
231237

232-
Another policy option, or a further refinement of `smtalign`, which enables non-SMT emulation on SMT-enabled system would allow to accommodate these needs, but this would cause even more significant resource accounting mismatches
238+
Another policy option, or a further refinement of `reject-non-smt-aligned`, which enables non-SMT emulation on SMT-enabled system would allow to accommodate these needs, but this would cause even more significant resource accounting mismatches
233239
as described above. Furthermore, at the time of writing we are still assessing how large the set of workload classes that benefit from these extra guarantees is.
234240

235241
For all these reasons we postponed this work to a later date.
@@ -267,7 +273,7 @@ No changes needed
267273
- [X] Feature gate (also fill in values in `kep.yaml`).
268274
- Feature gate name: `CPUManagerPolicyOptions`.
269275
- Components depending on the feature gate: kubelet
270-
- [X] Change the kubelet configuration to set the cpumanager policy option to `smtalign`
276+
- [X] Change the kubelet configuration to set the cpumanager policy option to `reject-non-smt-aligned`
271277
* **Does enabling the feature change any default behavior?**
272278
- Yes, it makes the behaviour of the `cpumanager` static policy more restrictive and can lead to pod admission rejection.
273279
* **Can the feature be disabled once it has been enabled (i.e. can we rollback the enablement)?**
@@ -324,3 +330,4 @@ No changes needed
324330
- 2021-04-22: KEP updated to clarify the `smtaware` policy after discussion on sig-node and to postpone the `smtisolate` policy
325331
- 2021-05-04: KEP updated to change name from `smtaware` to `smtalign`. In addition to this we capture changes in the implementation details including the introduction of a new flag in Kubelet called `cpumanager-policy-options` to allow the user to specify `smtalign` as a value to enable this capability.
326332
- 2021-05-06: KEP update to add the feature gate and clarify PRR answers.
333+
- 2021-05-10: KEP updated to rename `smtalign` to `reject-non-smt-aligned` for better clarity and to address review comments
