Commit 2dc4f56

Clean up policy/node-resource-managers.md
1 parent 3ba842d commit 2dc4f56

1 file changed: +60 -47 lines changed

content/en/docs/concepts/policy/node-resource-managers.md

Lines changed: 60 additions & 47 deletions
@@ -9,16 +9,18 @@ weight: 50
 
 <!-- overview -->
 
-In order to support latency-critical and high-throughput workloads, Kubernetes offers a suite of Resource Managers. The managers aim to co-ordinate and optimise node's resources alignment for pods configured with a specific requirement for CPUs, devices, and memory (hugepages) resources.
+In order to support latency-critical and high-throughput workloads, Kubernetes offers a suite of
+Resource Managers. The managers aim to co-ordinate and optimise the alignment of node's resources for pods
+configured with a specific requirement for CPUs, devices, and memory (hugepages) resources.
 
 <!-- body -->
 
 ## Hardware topology alignment policies
 
 _Topology Manager_ is a kubelet component that aims to coordinate the set of components that are
-responsible for these optimizations. The the overall resource management process is governed using
-the policy you specify.
-To learn more, read [Control Topology Management Policies on a Node](/docs/tasks/administer-cluster/topology-manager/).
+responsible for these optimizations. The overall resource management process is governed using
+the policy you specify. To learn more, read
+[Control Topology Management Policies on a Node](/docs/tasks/administer-cluster/topology-manager/).
 
 ## Policies for assigning CPUs to Pods
 
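To illustrate the policy selection this hunk refers to (an editorial sketch, not part of the diff): the Topology Manager policy is chosen in the kubelet configuration. A minimal sketch, assuming the `topologyManagerPolicy` field of the KubeletConfiguration v1beta1 API and using `best-effort` purely as an example value:

```yaml
# Illustrative KubeletConfiguration fragment; the chosen value is only an
# example, other policies include "none", "restricted" and "single-numa-node".
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
topologyManagerPolicy: best-effort
```
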
@@ -29,27 +31,30 @@ hardware (for example, sharing CPUs across multiple Pods) or allocate hardware b
 resource (for example, assigning one of more CPUs for a Pod's exclusive use).
 
 By default, the kubelet uses [CFS quota](https://en.wikipedia.org/wiki/Completely_Fair_Scheduler)
-to enforce pod CPU limits.  When the node runs many CPU-bound pods, the workload can move to different CPU cores depending on
-whether the pod is throttled and which CPU cores are available at scheduling time. Many workloads are not sensitive to this migration and thus
+to enforce pod CPU limits.  When the node runs many CPU-bound pods, the workload can move to
+different CPU cores depending on whether the pod is throttled and which CPU cores are available
+at scheduling time. Many workloads are not sensitive to this migration and thus
 work fine without any intervention.
 
-However, in workloads where CPU cache affinity and scheduling latency significantly affect workload performance, the kubelet allows alternative CPU
+However, in workloads where CPU cache affinity and scheduling latency significantly affect
+workload performance, the kubelet allows alternative CPU
 management policies to determine some placement preferences on the node.
 This is implemented using the _CPU Manager_ and its policy.
 There are two available policies:
 
 - `none`: the `none` policy explicitly enables the existing default CPU
-affinity scheme, providing no affinity beyond what the OS scheduler does
-automatically.  Limits on CPU usage for
-[Guaranteed pods](/docs/concepts/workloads/pods/pod-qos/) and
-[Burstable pods](/docs/concepts/workloads/pods/pod-qos/)
-are enforced using CFS quota.
+  affinity scheme, providing no affinity beyond what the OS scheduler does
+  automatically.  Limits on CPU usage for
+  [Guaranteed pods](/docs/concepts/workloads/pods/pod-qos/) and
+  [Burstable pods](/docs/concepts/workloads/pods/pod-qos/)
+  are enforced using CFS quota.
 - `static`: the `static` policy allows containers in `Guaranteed` pods with integer CPU
-`requests` access to exclusive CPUs on the node. This exclusivity is enforced
-using the [cpuset cgroup controller](https://www.kernel.org/doc/Documentation/cgroup-v2.txt).
+  `requests` access to exclusive CPUs on the node. This exclusivity is enforced
+  using the [cpuset cgroup controller](https://www.kernel.org/doc/Documentation/cgroup-v2.txt).
 
 {{< note >}}
-System services such as the container runtime and the kubelet itself can continue to run on these exclusive CPUs.  The exclusivity only extends to other pods.
+System services such as the container runtime and the kubelet itself can continue to run on
+these exclusive CPUs.  The exclusivity only extends to other pods.
 {{< /note >}}
 
 CPU Manager doesn't support offlining and onlining of CPUs at runtime.
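
As a quick illustration of how one of the two policies described above is selected (again, an editorial sketch rather than part of this commit): the policy is set through the `cpuManagerPolicy` field in the kubelet configuration file.

```yaml
# Illustrative KubeletConfiguration fragment: switch from the default "none"
# policy to the "static" policy so that Guaranteed pods with integer CPU
# requests can receive exclusive CPUs.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
```
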
@@ -64,12 +69,12 @@ CPUs reserved by these options are taken, in integer quantity, from the initial
 core ID.  This shared pool is the set of CPUs on which any containers in
 `BestEffort` and `Burstable` pods run. Containers in `Guaranteed` pods with fractional
 CPU `requests` also run on CPUs in the shared pool. Only containers that are
-both part of a `Guaranteed` pod and have integer CPU `requests` are assigned
+part of a `Guaranteed` pod and have integer CPU `requests` are assigned
 exclusive CPUs.
 
 {{< note >}}
 The kubelet requires a CPU reservation greater than zero when the static policy is enabled.
-This is because zero CPU reservation would allow the shared pool to become empty.
+This is because a zero CPU reservation would allow the shared pool to become empty.
 {{< /note >}}
 
 As `Guaranteed` pods whose containers fit the requirements for being statically
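
The note above requires a non-zero CPU reservation whenever the static policy is in use. A hedged sketch of one way to provide it, using the `kubeReserved` and `systemReserved` KubeletConfiguration fields (the quantities are placeholders, not values from the original page):

```yaml
# Illustrative only: the kubelet takes the reserved CPUs, in integer quantity,
# out of the shared pool, so the total reservation must be greater than zero.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
kubeReserved:
  cpu: "500m"
systemReserved:
  cpu: "500m"
```
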
@@ -144,7 +149,6 @@ The pod above runs in the `Guaranteed` QoS class because `requests` are equal to
 And the container's resource limit for the CPU resource is an integer greater than
 or equal to one. The `nginx` container is granted 2 exclusive CPUs.
 
-
 ```yaml
 spec:
   containers:
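
The manifest above is cut off by the diff context. A sketch of a complete pod spec matching the description (integer CPU `requests` equal to `limits`, so the pod is `Guaranteed` and the container is granted 2 exclusive CPUs); the metadata and memory values are assumptions for illustration, not taken from the original file:

```yaml
# Illustrative reconstruction, not the exact example from the page.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: "2"          # integer CPU request...
        memory: "200Mi"
      limits:
        cpu: "2"          # ...equal to the limit, so 2 exclusive CPUs
        memory: "200Mi"
```
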
@@ -163,7 +167,6 @@ The pod above runs in the `Guaranteed` QoS class because `requests` are equal to
 But the container's resource limit for the CPU resource is a fraction. It runs in
 the shared pool.
 
-
 ```yaml
 spec:
   containers:
@@ -182,27 +185,38 @@ equal to one. The `nginx` container is granted 2 exclusive CPUs.
 
 #### Static policy options {#cpu-policy-static--options}
 
-The behavior of the static policy can be fine-tuned using the CPU Manager policy options.
-The following policy options exist for the static CPU management policy:
-{{/* options in alphabetical order */}}
+Here are the available policy options for the static CPU management policy,
+listed in alphabetical order:
 
 `align-by-socket` (alpha, hidden by default)
-: Align CPUs by physical package / socket boundary, rather than logical NUMA boundaries (available since Kubernetes v1.25)
+: Align CPUs by physical package / socket boundary, rather than logical NUMA boundaries
+(available since Kubernetes v1.25)
+
 `distribute-cpus-across-cores` (alpha, hidden by default)
-: Allocate virtual cores, sometimes called hardware threads, across different physical cores (available since Kubernetes v1.31)
+: Allocate virtual cores, sometimes called hardware threads, across different physical cores
+(available since Kubernetes v1.31)
+
 `distribute-cpus-across-numa` (alpha, hidden by default)
-: Spread CPUs across different NUMA domains, aiming for an even balance between the selected domains (available since Kubernetes v1.23)
+: Spread CPUs across different NUMA domains, aiming for an even balance between the selected domains
+(available since Kubernetes v1.23)
+
 `full-pcpus-only` (beta, visible by default)
 : Always allocate full physical cores (available since Kubernetes v1.22)
+
 `strict-cpu-reservation` (alpha, hidden by default)
-: Prevent all the pods regardless of their Quality of Service class to run on reserved CPUs (available since Kubernetes v1.32)
+: Prevent all the pods regardless of their Quality of Service class to run on reserved CPUs
+(available since Kubernetes v1.32)
+
 `prefer-align-cpus-by-uncorecache` (alpha, hidden by default)
-: Align CPUs by uncore (Last-Level) cache boundary on a best-effort way (available since Kubernetes v1.32)
+: Align CPUs by uncore (Last-Level) cache boundary on a best-effort way
+(available since Kubernetes v1.32)
 
 You can toggle groups of options on and off based upon their maturity level
 using the following feature gates:
+
 * `CPUManagerPolicyBetaOptions` (default enabled). Disable to hide beta-level options.
 * `CPUManagerPolicyAlphaOptions` (default disabled). Enable to show alpha-level options.
+
 You will still have to enable each option using the `cpuManagerPolicyOptions` field in the
 kubelet configuration file.
 
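Putting the two knobs from this hunk together (an illustrative sketch, not part of the commit): the feature gate makes a group of options visible, and `cpuManagerPolicyOptions` then enables an individual option. The option chosen below is just one example from the list above:

```yaml
# Illustrative KubeletConfiguration fragment: expose alpha-level options via
# the feature gate, then turn one of them on for the static policy.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
featureGates:
  CPUManagerPolicyAlphaOptions: true
cpuManagerPolicyOptions:
  distribute-cpus-across-numa: "true"
```
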
@@ -253,10 +267,10 @@ than number of NUMA nodes.
 
 If the `distribute-cpus-across-cores` policy option is specified, the static policy
 will attempt to allocate virtual cores (hardware threads) across different physical cores.
-By default, the `CPUManager` tends to pack cpus onto as few physical cores as possible,
-which can lead to contention among cpus on the same physical core and result
+By default, the `CPUManager` tends to pack CPUs onto as few physical cores as possible,
+which can lead to contention among CPUs on the same physical core and result
 in performance bottlenecks. By enabling the `distribute-cpus-across-cores` policy,
-the static policy ensures that cpus are distributed across as many physical cores
+the static policy ensures that CPUs are distributed across as many physical cores
 as possible, reducing the contention on the same physical core and thereby
 improving overall performance. However, it is important to note that this strategy
 might be less effective when the system is heavily loaded. Under such conditions,
@@ -268,33 +282,32 @@ better performance under high load conditions.
 
 The `reservedSystemCPUs` parameter in [KubeletConfiguration](/docs/reference/config-api/kubelet-config.v1beta1/),
 or the deprecated kubelet command line option `--reserved-cpus`, defines an explicit CPU set for OS system daemons
-and kubernetes system daemons. More details of this parameter can be found on the
+and kubernetes system daemons. More details of this parameter can be found on the
 [Explicitly Reserved CPU List](/docs/tasks/administer-cluster/reserve-compute-resources/#explicitly-reserved-cpu-list) page.
-By default this isolation is implemented only for guaranteed pods with integer CPU requests not for burstable and best-effort pods
-(and guaranteed pods with fractional CPU requests). Admission is only comparing the cpu requests against the allocatable cpus.
-Since the cpu limit is higher than the request, the default behaviour allows burstable and best-effort pods to use up the capacity
+By default, this isolation is implemented only for guaranteed pods with integer CPU requests not for burstable and best-effort pods
+(and guaranteed pods with fractional CPU requests). Admission is only comparing the CPU requests against the allocatable CPUs.
+Since the CPU limit is higher than the request, the default behaviour allows burstable and best-effort pods to use up the capacity
 of `reservedSystemCPUs` and cause host OS services to starve in real life deployments.
 If the `strict-cpu-reservation` policy option is enabled, the static policy will not allow
 any workload to use the CPU cores specified in `reservedSystemCPUs`.
 
 ##### `prefer-align-cpus-by-uncorecache`
 
 If the `prefer-align-cpus-by-uncorecache` policy is specified, the static policy
-will allocate CPU resources for individual containers such that all CPUs assigned
-to a container share the same uncore cache block (also known as the Last-Level Cache
-or LLC). By default, the `CPUManager` will tightly pack CPU assignments which can
-result in containers being assigned CPUs from multiple uncore caches. This option
-enables the `CPUManager` to allocate CPUs in a way that maximizes the efficient use
-of the uncore cache. Allocation is performed on a best-effort basis, aiming to
-affine as many CPUs as possible within the same uncore cache. If the container's
-CPU requirement exceeds the CPU capacity of a single uncore cache, the `CPUManager`
-minimizes the number of uncore caches used in order to maintain optimal uncore
-cache alignment. Specific workloads can benefit in performance from the reduction
-of inter-cache latency and noisy neighbors at the cache level. If the `CPUManager`
-cannot align optimally while the node has sufficient resources, the container will
+will allocate CPU resources for individual containers such that all CPUs assigned
+to a container share the same uncore cache block (also known as the Last-Level Cache
+or LLC). By default, the `CPUManager` will tightly pack CPU assignments which can
+result in containers being assigned CPUs from multiple uncore caches. This option
+enables the `CPUManager` to allocate CPUs in a way that maximizes the efficient use
+of the uncore cache. Allocation is performed on a best-effort basis, aiming to
+affine as many CPUs as possible within the same uncore cache. If the container's
+CPU requirement exceeds the CPU capacity of a single uncore cache, the `CPUManager`
+minimizes the number of uncore caches used in order to maintain optimal uncore
+cache alignment. Specific workloads can benefit in performance from the reduction
+of inter-cache latency and noisy neighbors at the cache level. If the `CPUManager`
+cannot align optimally while the node has sufficient resources, the container will
 still be admitted using the default packed behavior.
 
-
 ## Memory Management Policies
 
 {{< feature-state feature_gate_name="MemoryManager" >}}
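
For the `reservedSystemCPUs` and `strict-cpu-reservation` behaviour covered in this last hunk, a minimal sketch (the CPU IDs are placeholders; since `strict-cpu-reservation` is alpha, the alpha options feature gate also has to be enabled):

```yaml
# Illustrative KubeletConfiguration fragment: reserve CPUs 0 and 1 for system
# daemons and, with strict-cpu-reservation, keep all pods off those cores.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
reservedSystemCPUs: "0,1"
featureGates:
  CPUManagerPolicyAlphaOptions: true
cpuManagerPolicyOptions:
  strict-cpu-reservation: "true"
```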
