
Commit 2d58a56

Merge pull request #41348 from jeana-redhat/OSDOCS-3141_cluster_autoscaler_scale_down_threshold
OSDOCS-3141: Cluster autoscaler node utilization threshold in 4.10
2 parents: 108fa0f + 55a42e8

3 files changed (+13, -9 lines)

machine_management/applying-autoscaling.adoc

Lines changed: 2 additions & 2 deletions
@@ -15,8 +15,6 @@ You can configure the cluster autoscaler only in clusters where the machine API
 
 include::modules/cluster-autoscaler-about.adoc[leveloffset=+1]
 
-include::modules/machine-autoscaler-about.adoc[leveloffset=+1]
-
 [id="configuring-clusterautoscaler"]
 == Configuring the cluster autoscaler
 
@@ -37,6 +35,8 @@ include::modules/deploying-resource.adoc[leveloffset=+2]
 
 * After you configure the cluster autoscaler, you must configure at least one machine autoscaler.
 
+include::modules/machine-autoscaler-about.adoc[leveloffset=+1]
+
 [id="configuring-machineautoscaler"]
 == Configuring the machine autoscalers
 
modules/cluster-autoscaler-about.adoc

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ Ensure that the `maxNodesTotal` value in the `ClusterAutoscaler` resource defini
 
 Every 10 seconds, the cluster autoscaler checks which nodes are unnecessary in the cluster and removes them. The cluster autoscaler considers a node for removal if the following conditions apply:
 
-* The sum of CPU and memory requests of all pods running on the node is less than 50% of the allocated resources on the node.
+* The node utilization is less than the _node utilization level_ threshold for the cluster. The node utilization level is the sum of the requested resources divided by the allocated resources for the node. If you do not specify a value in the `ClusterAutoscaler` custom resource, the cluster autoscaler uses a default value of `0.5`, which corresponds to 50% utilization.
* The cluster autoscaler can move all pods running on the node to the other nodes.
 * The cluster autoscaler does not have scale down disabled annotation.
 
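The reworded bullet defines the _node utilization level_ only in prose, and a TODO comment added in the next file notes that a visual formula is still pending. As an illustrative sketch only, not part of this commit, the condition could be written as:

----
% Node utilization level, as described in the bullet above (sketch):
\[
  U_{\mathrm{node}}
    = \frac{\sum_{p \,\in\, \mathrm{pods\ on\ node}} \mathrm{requests}(p)}
           {\mathrm{allocatable}(\mathrm{node})}
\]
% The node is a scale-down candidate when
% U_node < utilizationThreshold (default 0.5).
----

For example, if a node has 8 allocatable cores and the pods on it request a total of 3 cores, its utilization level is 3/8 = 0.375, which falls below the default `0.5` threshold, so the node is a removal candidate when the other two conditions also hold.
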
modules/cluster-autoscaler-cr.adoc

Lines changed: 10 additions & 6 deletions
@@ -3,6 +3,7 @@
 // * machine_management/applying-autoscaling.adoc
 // * post_installation_configuration/cluster-tasks.adoc
 
+:_content-type: REFERENCE
 [id="cluster-autoscaler-cr_{context}"]
 = ClusterAutoscaler resource definition
 
@@ -38,26 +39,29 @@ spec:
     delayAfterDelete: 5m <13>
     delayAfterFailure: 30s <14>
     unneededTime: 5m <15>
+    utilizationThreshold: 0.4 <16>
 ----
 <1> Specify the priority that a pod must exceed to cause the cluster autoscaler to deploy additional nodes. Enter a 32-bit integer value. The `podPriorityThreshold` value is compared to the value of the `PriorityClass` that you assign to each pod.
 <2> Specify the maximum number of nodes to deploy. This value is the total number of machines that are deployed in your cluster, not just the ones that the autoscaler controls. Ensure that this value is large enough to account for all of your control plane and compute machines and the total number of replicas that you specify in your `MachineAutoscaler` resources.
 <3> Specify the minimum number of cores to deploy in the cluster.
 <4> Specify the maximum number of cores to deploy in the cluster.
 <5> Specify the minimum amount of memory, in GiB, in the cluster.
 <6> Specify the maximum amount of memory, in GiB, in the cluster.
-<7> Optionally, specify the type of GPU node to deploy. Only `nvidia.com/gpu` and `amd.com/gpu` are valid types.
+<7> Optional: Specify the type of GPU node to deploy. Only `nvidia.com/gpu` and `amd.com/gpu` are valid types.
 <8> Specify the minimum number of GPUs to deploy in the cluster.
 <9> Specify the maximum number of GPUs to deploy in the cluster.
 <10> In this section, you can specify the period to wait for each action by using any valid link:https://golang.org/pkg/time/#ParseDuration[ParseDuration] interval, including `ns`, `us`, `ms`, `s`, `m`, and `h`.
 <11> Specify whether the cluster autoscaler can remove unnecessary nodes.
-<12> Optionally, specify the period to wait before deleting a node after a node has recently been _added_. If you do not specify a value, the default value of `10m` is used.
-<13> Specify the period to wait before deleting a node after a node has recently been _deleted_. If you do not specify a value, the default value of `10s` is used.
-<14> Specify the period to wait before deleting a node after a scale down failure occurred. If you do not specify a value, the default value of `3m` is used.
-<15> Specify the period before an unnecessary node is eligible for deletion. If you do not specify a value, the default value of `10m` is used.
+<12> Optional: Specify the period to wait before deleting a node after a node has recently been _added_. If you do not specify a value, the default value of `10m` is used.
+<13> Optional: Specify the period to wait before deleting a node after a node has recently been _deleted_. If you do not specify a value, the default value of `0s` is used.
+<14> Optional: Specify the period to wait before deleting a node after a scale down failure occurred. If you do not specify a value, the default value of `3m` is used.
+<15> Optional: Specify the period before an unnecessary node is eligible for deletion. If you do not specify a value, the default value of `10m` is used.
+<16> Optional: Specify the _node utilization level_ below which an unnecessary node is eligible for deletion. The node utilization level is the sum of the requested resources divided by the allocated resources for the node, and must be a value greater than `0` but less than `1`. If you do not specify a value, the cluster autoscaler uses a default value of `0.5`, which corresponds to 50% utilization.
+// Might be able to add a formula to show this visually, but need to look into asciidoc math formatting and what our tooling supports.
 
 [NOTE]
 ====
 When performing a scaling operation, the cluster autoscaler remains within the ranges set in the `ClusterAutoscaler` resource definition, such as the minimum and maximum number of cores to deploy or the amount of memory in the cluster. However, the cluster autoscaler does not correct the current values in your cluster to be within those ranges.
 
-The minimum and maximum CPUs, memory, and GPU values are determined by calculating those resources on all nodes in the cluster, even if the cluster autoscaler does not manage the nodes. For example, the control plane nodes are considered in the total memory in the cluster, even though the cluster autoscaler does not manage the control plane nodes.
+The minimum and maximum CPUs, memory, and GPU values are determined by calculating those resources on all nodes in the cluster, even if the cluster autoscaler does not manage the nodes. For example, the control plane nodes are considered in the total memory in the cluster, even though the cluster autoscaler does not manage the control plane nodes.
 ====