During the review it was pointed out that existing operators (software
components or humans) base logic on the policy name. Options may change
the behaviour of the policy in a non-backward-compatible way[1], thus
we need a way to signal this to consumers; we agreed to add an alias
for the static policy whose sole purpose is to make this change evident.
+++
[1] the change we are proposing in this KEP is not, but the cpumanager
infra change we are proposing which in turn makes the change possible
would enable these future options.
Signed-off-by: Francesco Romani <[email protected]>
keps/sig-node/2625-cpumanager-policies-thread-placement/README.md
Lines changed: 28 additions & 15 deletions
Original file line number
Diff line number
Diff line change
@@ -59,19 +59,20 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
59
59
60
60
## Summary
61
61
62
-
We propose a change in cpumanager to make the behaviour of latency-sensitive applications more predictable when running on SMT-enabled systems.
62
+
We propose a change in CPUManager to make the behaviour of latency-sensitive applications more predictable when running on SMT-enabled systems.
63
63
64
64
## Motivation
65
65
66
-
Latency-sensitive applications want to have exclusive CPU allocation to enable performance isolation and to meet their latency requirements.
67
-
The static policy of the cpumanager already allows to prevent virtual CPU sharing.
66
+
Latency-sensitive applications want exclusive CPU allocation for better isolation, so that they can meet their latency and performance requirements.
67
+
The static policy of the CPUManager already makes it possible to prevent virtual CPU sharing.
68
68
However, for some classes of these latency-sensitive applications running on simultaneous multithreading (SMT) enabled systems, it is also beneficial
69
69
to consider thread-level allocation, to avoid physical CPU sharing and prevent possible interference caused by noisy neighbours.
70
70
71
71
### Goals
72
72
73
73
* Reject workloads whose CPU request does not consume full physical cores.
74
-
This guarantees that no physical core is shared among different containers, which improves cache efficiency and mitigates the noisy neighbours problem.
74
+
This guarantees that no physical core is shared among different containers, which improves cache efficiency and mitigates the interference
75
+
with other workloads that can consume resources of the same physical core, e.g. first level caches.
75
76
76
77
## Proposal
77
78
@@ -109,35 +110,45 @@ The impact in the shared codebase will be addressed enhancing the current testsu
109
110
110
111
### Proposed Change
111
112
112
-
We propose to add a new flag in Kubelet called `CPUManagerPolicyOptions` in the kubelet config or command line argument called `cpumanager-policy-options` which allows the user to specify the CPU Manager policy option. If the value of this option is specified to be `reject-non-smt-aligned`, it results in further refinements of the existing static policy.
113
+
We propose to
114
+
- add a new _alias_ to the existing static policy, called `configurable`
115
+
- add a new flag in Kubelet called `CPUManagerPolicyOptions` in the kubelet config or command line argument called `cpumanager-policy-options` which allows the user to specify the CPU Manager policy option.
116
+
- finally, add a new cpu manager option called `reject-non-smt-aligned`; if present, this option will enable further refinements of the existing static policy.
117
+
118
+
The addition of an alias for the `static` policy is motivated by the fact that the new CPUManager options can alter the behaviour of the policy in a backward-incompatible way.
119
+
While the change we are proposing in this KEP is fully backward compatible, other future options may introduce changes in behaviour.
120
+
Some operators, however, may not notice the extra flags that potentially change the behaviour of the static policy.
121
+
Examples of such operators are software components that vendor an older kubelet configuration.
122
+
By adding the new policy name, we make it possible for those operators to trivially detect the change and to avoid making wrong assumptions.
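
As a minimal sketch of how a consumer could use the policy name to detect that options may be in effect, consider the following Go snippet. The constant and function names are hypothetical and are not part of the kubelet API; only the policy names `static` and `configurable` come from this proposal.

```go
package main

// Policy names as used in this proposal. The identifiers themselves are
// hypothetical and exist only for illustration.
const (
	policyStatic       = "static"
	policyConfigurable = "configurable" // proposed alias of the static policy
)

// behavesLikeStatic reports whether the named policy uses the static
// allocation logic. The alias shares that logic with `static`.
func behavesLikeStatic(policyName string) bool {
	return policyName == policyStatic || policyName == policyConfigurable
}

// optionsMayApply reports whether a consumer should expect policy options
// to influence CPU allocation. Under this proposal, options are only
// considered when the `configurable` alias is selected.
func optionsMayApply(policyName string) bool {
	return policyName == policyConfigurable
}
```

A component that only understands the plain `static` behaviour can then treat `configurable` as a signal to inspect the options before making assumptions.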
123
+
113
124
The static policy allocates CPUs using a topology-aware best-fit allocation. This enhancement wants to provide stronger guarantees by restricting the allocation of threads.
114
125
The aim is to achieve the isolation for workloads managed by Kubernetes. The other part of isolation is (as of now) not managed by Kubernetes, as described in [Explicitly Reserved CPU List](https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list) and [Static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy).
115
126
116
-
Key properties:
127
+
Let's summarize the key properties of the `reject-non-smt-aligned` option:
117
128
- Preserve all the properties of the `static` policy.
118
129
- Never allocate less than a physical-cpu worth amount of cores.
119
-
- With this requirement enforced, the cpumanager allocation algorithm will guarantee avoidance of physical core sharing.
130
+
- With this requirement enforced, the CPUManager allocation algorithm will guarantee avoidance of physical core sharing.
120
131
- Should the node not have enough free physical cores, the Pod will be put in Failed state, with `SMTAlignmentError` as reason.
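
The request-side part of these properties can be sketched in Go as follows. This is an illustration under stated assumptions, not the kubelet implementation: the error type, its fields and the function name are hypothetical, and the check on the number of free physical cores on the node is not modelled.

```go
package main

import "fmt"

// SMTAlignmentError is a hypothetical error type; the proposal only names
// `SMTAlignmentError` as the reason reported for the failed Pod.
type SMTAlignmentError struct {
	RequestedCPUs  int
	ThreadsPerCore int
}

func (e SMTAlignmentError) Error() string {
	return fmt.Sprintf("requested %d CPUs, which is not a multiple of %d threads per core",
		e.RequestedCPUs, e.ThreadsPerCore)
}

// checkSMTAlignment rejects requests that cannot be satisfied with whole
// physical cores only, i.e. requests that are not a multiple of the number
// of hardware threads per core.
func checkSMTAlignment(requestedCPUs, threadsPerCore int) error {
	if threadsPerCore <= 0 {
		return fmt.Errorf("invalid threads per core: %d", threadsPerCore)
	}
	if requestedCPUs%threadsPerCore != 0 {
		return SMTAlignmentError{RequestedCPUs: requestedCPUs, ThreadsPerCore: threadsPerCore}
	}
	return nil
}
```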
121
132
122
133
### Implementation strategy of reject-non-smt-aligned CPU Manager policy option
123
134
124
-
- In order to introduce SMT-alignment in CPU Manager, we introduce a new flag in Kubelet to allow the user to specify `cpumanager-policy-options` which when specified with `reject-non-smt-aligned` as its value provides the capability to modify the behaviour of static policy to strictly guarantee allocation of whole cores to a workload.
135
+
- To introduce the SMT-alignment check in CPU Manager, we add a new Kubelet flag, `cpumanager-policy-options`. When it is set to `reject-non-smt-aligned`, the static policy strictly guarantees the allocation of whole cores to a workload.
136
+
- We add a new policy called `configurable`. The options, if given, are considered only if this policy is selected; otherwise, they are ignored.
125
137
- The `CPUManagerPolicyOptions` received from the kubelet config/command line args is propagated to the Container Manager.
126
138
- The responsibility of admission control is centralized in containermanager. The resource managers and/or the resource allocation orchestrator (Topology Manager) still have the responsibility of running the checks to admit the pods, but the handling of these errors and the building of the pod lifecycle result are now factored in containermanager.
127
-
- Prior to this feature, the Container Manager admission handler was delegated to the topology manager if the latter was enabled. This worked well under the assumption that only Topology Manager had the ability to reject admissions with pods. But with the introduction of this feature, the CPU Manager also needs the ability to possibly reject pods if strict SMT alignment is requested. In order to do so, we introduce a new error and let it drive the rejection. Due to an already existing dependency between cpumanager and topologymanager as the former imports the latter in order to support the topologymanager.HintProvider interface, container manager is considered as the appropriate for performing admission control.
139
+
- Prior to this feature, the Container Manager admission handler was delegated to the Topology Manager if the latter was enabled. This worked well under the assumption that only the Topology Manager could reject pods at admission. With the introduction of this feature, the CPU Manager also needs the ability to reject pods if strict SMT alignment is requested. To do so, we introduce a new error and let it drive the rejection. Because CPUManager already depends on TopologyManager (the former imports the latter in order to support the `topologymanager.HintProvider` interface), the container manager is considered the appropriate place to perform admission control.
128
140
- When `reject-non-smt-aligned` policy option is specified along with `static` CPU Manager policy, an additional check in the allocation logic of the `static` policy ensures that CPUs would be allocated such that full cores are allocated. Because of this check, a pod would never have to acquire single threads with the aim to fill partially-allocated cores.
129
141
- If the request translates to partial occupancy of the cores, the Pod will not be admitted and will fail with `SMTAlignmentError`.
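
The centralized handling described above can be sketched as follows: the resource managers run the checks, while a single place turns their errors into the pod admission result. All identifiers are hypothetical and simplified for illustration; in particular, the generic `AdmissionError` reason is a placeholder and not a name taken from the kubelet.

```go
package main

import "errors"

// errSMTAlignment is a hypothetical sentinel standing in for the new error
// introduced by this proposal.
var errSMTAlignment = errors.New("SMTAlignmentError")

// admitResult is a simplified stand-in for the pod admission result.
type admitResult struct {
	Admit   bool
	Reason  string
	Message string
}

// admitPod runs the allocation check supplied by a resource manager and
// builds the admission result in one place, so that every manager's
// errors are handled consistently.
func admitPod(allocate func() error) admitResult {
	err := allocate()
	if err == nil {
		return admitResult{Admit: true}
	}
	reason := "AdmissionError" // placeholder for any other failure
	if errors.Is(err, errSMTAlignment) {
		reason = "SMTAlignmentError"
	}
	return admitResult{Admit: false, Reason: reason, Message: err.Error()}
}
```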
130
142
131
143
132
-
133
144
### Resource Accounting
134
145
135
146
To illustrate the behaviour of the `reject-non-smt-aligned` policy option, we will consider the following CPU topology. As an example, we will use a CPU package with 16 physical cores, 2-way SMT-capable.
136
147
137
148

138
149
139
150
140
-
Let's consider a single container, requesting 5 isolated cores.
151
+
Let's consider a single container, requesting 5 exclusive cores.
141
152
142
153
```yaml
143
154
apiVersion: v1
@@ -172,7 +183,7 @@ The container will then actually get more virtual cores (6) than what is request
172
183
173
184
`threads_per_cpu` is typically 2 on x86_64 with SMT enabled, but this number is not fixed and can change in future hardware implementations.
174
185
175
-
In order to make the resource reporting consistent, and avoiding cascading changes in the system, we enforce the request constraints ad admission time.
186
+
In order to make the resource reporting consistent, and to avoid cascading changes in the system, we enforce the request constraints at admission time.
176
187
This approach follows what the Topology Manager already does.
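
To make the accounting mismatch concrete, the following Go sketch computes how many virtual CPUs a request would occupy if only whole physical cores were handed out. The function name is hypothetical; `threadsPerCPU` mirrors the `threads_per_cpu` value discussed above.

```go
package main

// virtualCPUsForFullCores returns the number of virtual CPUs needed to
// satisfy a request when only whole physical cores may be allocated.
// The request is rounded up to the next multiple of threadsPerCPU.
func virtualCPUsForFullCores(requested, threadsPerCPU int) int {
	cores := (requested + threadsPerCPU - 1) / threadsPerCPU // ceiling division
	return cores * threadsPerCPU
}
```

With 2-way SMT, a request for 5 CPUs would occupy 3 physical cores, that is 6 virtual CPUs, which is more than what was requested. Rejecting such a request at admission time keeps the reported and the actually consumed resources identical.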
177
188
178
189
### Alternatives
@@ -187,7 +198,7 @@ We evaluated possible alternatives to the extra admission control, but we eventu
187
198
#### Add extra resources
188
199
189
200
We can add a new extended resource alongside `cpu` - [which on baremetal represents virtual threads](https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#cpu-units), to represent
190
-
physical CPUs. However having two resources to represent the same hardware entity is confusing and cumbersome. We believe cpumanager should keep consuming the core `cpu` resource for consistency reasons.
201
+
physical CPUs. However having two resources to represent the same hardware entity is confusing and cumbersome. We believe CPUManager should keep consuming the core `cpu` resource for consistency reasons.
191
202
192
203
Considering only the new extended resource is not feasible either, because it would prevent the pod from being in the Guaranteed QoS class, and would void the desirable property of keeping all the guarantees
193
204
the static policy provides.
@@ -273,9 +284,10 @@ No changes needed
273
284
- [X] Feature gate (also fill in values in `kep.yaml`).
274
285
- Feature gate name: `CPUManagerPolicyOptions`.
275
286
- Components depending on the feature gate: kubelet
276
-
- [X] Change the kubelet configuration to set the cpumanager policy option to `reject-non-smt-aligned`
287
+
- [X] Change the kubelet configuration to set the CPUManager policy to `configurable`
288
+
- [X] Change the kubelet configuration to add the CPUManager policy option `reject-non-smt-aligned`
277
289
* **Does enabling the feature change any default behavior?**
278
-
- Yes, it makes the behaviour of the `cpumanager` static policy more restrictive and can lead to pod admission rejection.
290
+
- Yes, it makes the behaviour of the CPUManager static policy more restrictive and can lead to pod admission rejection.
279
291
* **Can the feature be disabled once it has been enabled (i.e. can we rollback the enablement)?**
280
292
- Yes, disabling the feature gate shuts down the feature completely; alternatively,
281
293
- Yes, through kubelet configuration - switch to a different policy.
@@ -331,3 +343,4 @@ No changes needed
331
343
- 2021-05-04: KEP updated to change name from `smtaware` to `smtalign`. In addition to this we capture changes in the implementation details including the introduction of a new flag in Kubelet called `cpumanager-policy-options` to allow the user to specify `smtalign` as a value to enable this capability.
332
344
- 2021-05-06: KEP update to add the feature gate and clarify PRR answers.
333
345
- 2021-05-10: KEP update to rename `smtalign` to `reject-non-smt-aligned` for better clarity and address review comments
346
+
- 2021-05-11: KEP update to add the `configurable` alias and address review comments