File: modules/nodes-nodes-resources-configuring-about.adoc (17 additions, 54 deletions)
@@ -9,14 +9,13 @@ CPU and memory resources reserved for node components in {product-title} are bas
 
 [options="header",cols="1,2"]
 |===
-
 |Setting |Description
 
 |`kube-reserved`
-| Resources reserved for node components. Default is none.
+| This setting is not used with {product-title}. Add the CPU and memory resources that you planned to reserve to the `system-reserved` setting.
 
 |`system-reserved`
-| Resources reserved for the remaining system components. Default settings depend on the {product-title} and Machine Config Operator versions. Confirm the default `systemReserved` parameter on the `machine-config-operator` repository.
+| This setting identifies the resources to reserve for the node components and system components. The default settings depend on the {product-title} and Machine Config Operator versions. Confirm the default `systemReserved` parameter on the `machine-config-operator` repository.
 |===
 
 If a flag is not set, the defaults are used. If none of the flags are set, the
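
For orientation, the reservation described above is carried in a `KubeletConfig` custom resource. A minimal sketch, assuming the usual `machineConfigPoolSelector` layout and reusing the `custom-kubelet: small-pods` label and values from the fragment changed later in this PR (the CR name is a placeholder):

[source,yaml]
----
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-allocatable          # placeholder name
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: small-pods # label on the target machine config pool
  kubeletConfig:
    systemReserved:              # CPU and memory withheld for node and system components
      cpu: 1000m
      memory: 1Gi
----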
@@ -29,65 +28,33 @@ introduction of allocatable resources.
 An allocated amount of a resource is computed based on the following formula:
-The withholding of `Hard-Eviction-Thresholds` from allocatable is a change in behavior to improve
-system reliability now that allocatable is enforced for end-user pods at the node level.
-The `experimental-allocatable-ignore-eviction` setting is available to preserve legacy behavior,
-but it will be deprecated in a future release.
+The withholding of `Hard-Eviction-Thresholds` from `Allocatable` improves system reliability because the value for `Allocatable` is enforced for pods at the node level.
 ====
 
-If `[Allocatable]` is negative, it is set to *0*.
+If `Allocatable` is negative, it is set to `0`.
 
-Each node reports system resources utilized by the container runtime and kubelet.
-To better aid your ability to configure `--system-reserved` and `--kube-reserved`,
-you can introspect corresponding node's resource usage using the node summary API,
-which is accessible at `/api/v1/nodes/<node>/proxy/stats/summary`.
+Each node reports the system resources that are used by the container runtime and kubelet. To simplify configuring the `system-reserved` parameter, view the resource use for the node by using the node summary API. The node summary is available at `/api/v1/nodes/<node>/proxy/stats/summary`.
 
 [id="allocate-node-enforcement_{context}"]
 == How nodes enforce resource constraints
 
-The node is able to limit the total amount of resources that pods
-may consume based on the configured allocatable value. This feature significantly
-improves the reliability of the node by preventing pods from starving
-system services (for example: container runtime, node agent, etc.) for resources.
-It is strongly encouraged that administrators reserve
-resources based on the desired node utilization target
-in order to improve node reliability.
-
-The node enforces resource constraints using a new *cgroup* hierarchy
-that enforces quality of service. All pods are launched in a
-dedicated cgroup hierarchy separate from system daemons.
-
-Optionally, the node can be made to enforce kube-reserved and system-reserved by
-specifying those tokens in the enforce-node-allocatable flag. If specified, the
-corresponding `--kube-reserved-cgroup` or `--system-reserved-cgroup` needs to be provided.
-In future releases, the node and container runtime will be packaged in a common cgroup
-separate from `system.slice`. Until that time, we do not recommend users
-change the default value of enforce-node-allocatable flag.
-
-Administrators should treat system daemons similar to Guaranteed pods. System daemons
-can burst within their bounding control groups and this behavior needs to be managed
-as part of cluster deployments. Enforcing system-reserved limits
-can lead to critical system services being CPU starved or OOM killed on the node. The
-recommendation is to enforce system-reserved only if operators have profiled their nodes
-exhaustively to determine precise estimates and are confident in their ability to
-recover if any process in that group is OOM killed.
-
-As a result, we strongly recommended that users only enforce node allocatable for
-`pods` by default, and set aside appropriate reservations for system daemons to maintain
-overall node reliability.
+The node is able to limit the total amount of resources that pods can consume based on the configured allocatable value. This feature significantly improves the reliability of the node by preventing pods from using CPU and memory resources that are needed by system services such as the container runtime and node agent. To improve node reliability, administrators should reserve resources based on a target for resource use.
+
+The node enforces resource constraints by using a new cgroup hierarchy that enforces quality of service. All pods are launched in a dedicated cgroup hierarchy that is separate from system daemons.
+
+Administrators should treat system daemons similar to pods that have a guaranteed quality of service. System daemons can burst within their bounding control groups and this behavior must be managed as part of cluster deployments. Reserve CPU and memory resources for system daemons by specifying the amount of CPU and memory resources in `system-reserved`.
+
+Enforcing `system-reserved` limits can prevent critical system services from receiving CPU and memory resources. As a result, a critical system service can be ended by the out-of-memory killer. The recommendation is to enforce `system-reserved` only if you have profiled the nodes exhaustively to determine precise estimates and you are confident that critical system services can recover if any process in that group is ended by the out-of-memory killer.
 
 [id="allocate-eviction-thresholds_{context}"]
 == Understanding Eviction Thresholds
 
-If a node is under memory pressure, it can impact the entire node and all pods running on
-it. If a system daemon is using more than its reserved amount of memory, an OOM
-event may occur that can impact the entire node and all pods running on it. To avoid
-(or reduce the probability of) system OOMs the node provides out-of-resource handling.
+If a node is under memory pressure, it can impact the entire node and all pods running on the node. For example, a system daemon that uses more than its reserved amount of memory can trigger an out-of-memory event. To avoid or reduce the probability of system out-of-memory events, the node provides out-of-resource handling.
 
 You can reserve some memory using the `--eviction-hard` flag. The node attempts to evict
 pods whenever memory availability on the node drops below the absolute value or percentage.
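
As a quick reference for the node summary API mentioned above, a raw request sketch, with `<node>` standing in for an actual node name:

[source,terminal]
----
$ oc get --raw /api/v1/nodes/<node>/proxy/stats/summary
----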
@@ -98,16 +65,12 @@ before reaching out of memory conditions are not available for pods.
 The following is an example to illustrate the impact of node allocatable for memory:
 
 * Node capacity is `32Gi`
-* --kube-reserved is `2Gi`
-* --system-reserved is `1Gi`
+* --system-reserved is `3Gi`
 * --eviction-hard is set to `100Mi`.
 
-For this node, the effective node allocatable value is `28.9Gi`. If the node
-and system components use up all their reservation, the memory available for pods is `28.9Gi`,
-and kubelet will evict pods when it exceeds this usage.
+For this node, the effective node allocatable value is `28.9Gi`. If the node and system components use all their reservation, the memory available for pods is `28.9Gi`, and kubelet evicts pods when it exceeds this threshold.
 
-If you enforce node allocatable (`28.9Gi`) via top level cgroups, then pods can never exceed `28.9Gi`.
-Evictions would not be performed unless system daemons are consuming more than `3.1Gi` of memory.
+If you enforce node allocatable, `28.9Gi`, with top-level cgroups, then pods can never exceed `28.9Gi`. Evictions are not performed unless system daemons consume more than `3.1Gi` of memory.
 
 If system daemons do not use up all their reservation, with the above example,
 pods would face memcg OOM kills from their bounding cgroup before node evictions kick in.
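
The `28.9Gi` figure follows directly from the example values, assuming the withheld amounts are simply subtracted from capacity:

----
Allocatable = Node Capacity - system-reserved - Hard-Eviction-Thresholds
            = 32Gi - 3Gi - 100Mi
            = 28.9Gi (approximately)
----

Evictions start only after system daemons consume more than their `3Gi` reservation plus the `100Mi` threshold, which is the `3.1Gi` cited above.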
File: modules/nodes-nodes-resources-configuring-setting.adoc (3 additions, 7 deletions)
@@ -12,8 +12,7 @@ As an administrator, you can set these using a custom resource (CR) through a se
 
 .Prerequisites
 
-. To help you determine setting for `--system-reserved` and `--kube-reserved` you can introspect the corresponding node's resource usage
-using the node summary API, which is accessible at `/api/v1/nodes/<node>/proxy/stats/summary`. Enter the following command for your node:
+. To help you determine values for the `system-reserved` setting, you can introspect the resource use for a node by using the node summary API. Enter the following command for your node:
 +
 [source,terminal]
 ----
@@ -117,11 +116,8 @@ spec:
       custom-kubelet: small-pods <2>
   kubeletConfig:
     systemReserved:
-      cpu: 500m
-      memory: 512Mi
-    kubeReserved:
-      cpu: 500m
-      memory: 512Mi
+      cpu: 1000m
+      memory: 1Gi
 ----
 <1> Assign a name to CR.
 <2> Specify the label from the Machine Config Pool.
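
A usage sketch for the CR above, assuming the target machine config pool is `worker` and the CR is saved locally (the pool name and file name are illustrative):

[source,terminal]
----
$ oc label machineconfigpool worker custom-kubelet=small-pods
$ oc create -f <file_name>.yaml
----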
File: modules/setting-up-cpu-manager.adoc (2 additions, 2 deletions)
@@ -80,7 +80,7 @@ This adds the CPU Manager feature to the kubelet config and, if needed, the Mach
 "name": "cpumanager-enabled",
 "uid": "7ed5616d-6b72-11e9-aae1-021e1ce18878"
 }
-],
+]
 ----
 
 . Check the worker for the updated `kubelet.conf`:
@@ -241,7 +241,7 @@ Allocated resources:
 cpu 1440m (96%) 1 (66%)
 ----
 +
-This VM has two CPU cores. You set `kube-reserved` to 500 millicores, meaning half of one core is subtracted from the total capacity of the node to arrive at the `Node Allocatable` amount. You can see that `Allocatable CPU` is 1500 millicores. This means you can run one of the CPU Manager pods since each will take one whole core. A whole core is equivalent to 1000 millicores. If you try to schedule a second pod, the system will accept the pod, but it will never be scheduled:
+This VM has two CPU cores. The `system-reserved` setting reserves 500 millicores, meaning that half of one core is subtracted from the total capacity of the node to arrive at the `Node Allocatable` amount. You can see that `Allocatable CPU` is 1500 millicores. This means you can run one of the CPU Manager pods since each will take one whole core. A whole core is equivalent to 1000 millicores. If you try to schedule a second pod, the system will accept the pod, but it will never be scheduled:
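
For context on "each will take one whole core": a pod only receives an exclusive core from the CPU Manager static policy if it has a guaranteed quality of service with an integer CPU request, roughly along these lines (the pod name and image are placeholders):

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: cpumanager-pod       # placeholder name
spec:
  containers:
  - name: cpumanager
    image: <image>           # placeholder image
    resources:
      requests:
        cpu: "1"             # whole core; requests equal limits for guaranteed QoS
        memory: 1Gi
      limits:
        cpu: "1"
        memory: 1Gi
----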
File: nodes/nodes/nodes-nodes-resources-configuring.adoc (3 additions, 10 deletions)
@@ -1,18 +1,11 @@
-
-:context: nodes-nodes-resources-configuring
 [id="nodes-nodes-resources-configuring"]
 = Allocating resources for nodes in an {product-title} cluster
 include::modules/common-attributes.adoc[]
+:context: nodes-nodes-resources-configuring
 
 toc::[]
 
-
-To provide more reliable scheduling and minimize node resource overcommitment,
-each node can reserve a portion of its resources for use by all underlying node
-components (such as kubelet, kube-proxy) and the remaining system
-components (such as *sshd*, *NetworkManager*) on the host. Once specified, the
-scheduler has more information about the resources (e.g., memory, CPU) a node
-has allocated for pods.
+To provide more reliable scheduling and minimize node resource overcommitment, reserve a portion of the CPU and memory resources for use by the underlying node components, such as `kubelet` and `kube-proxy`, and the remaining system components, such as `sshd` and `NetworkManager`. By specifying the resources to reserve, you provide the scheduler with more information about the remaining CPU and memory resources that a node has available for use by pods.
 
 // The following include statements pull in the module files that comprise
 // the assembly. Include any combination of concept, procedure, or reference
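
After a reservation like the one in this PR is applied, the difference shows up in the node's `Capacity` and `Allocatable` stanzas; a quick way to inspect them (the node name is a placeholder):

[source,terminal]
----
$ oc describe node <node_name>
----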