
Commit 8b0bdda

Merge pull request #40704 from Amrita42/HPA-2688
OSDOCS-2688: Conceptual information about HPA
2 parents bc9dce0 + c59bd96 commit 8b0bdda

5 files changed, +88 −2 lines changed

images/HPAflow.png (36.3 KB)
modules/nodes-pods-autoscaling-best-practices-hpa.adoc

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
// Module included in the following assemblies:
//
// * nodes/nodes-pods-autoscaling-about.adoc

:_content-type: CONCEPT
[id="nodes-pods-autoscaling-best-practices-hpa_{context}"]
= Best practices

.All pods must have resource requests configured
The HPA makes a scaling decision based on the observed CPU or memory utilization values of pods in an {product-title} cluster. Utilization values are calculated as a percentage of the resource requests of each pod.
Missing resource request values prevent the HPA from calculating utilization accurately, which can lead to suboptimal scaling decisions.
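For illustration, a minimal pod spec that sets the CPU and memory requests the HPA needs. The pod name, container name, and values are placeholders, not from the source:

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: example-pod # hypothetical name
spec:
  containers:
  - name: app # hypothetical container
    image: quay.io/example/app:latest # placeholder image
    resources:
      requests:
        cpu: 500m     # utilization is reported relative to this value
        memory: 256Mi
----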
.Configure the cool down period
During horizontal pod autoscaling, scaling events might occur in rapid succession without a time gap between them. Configure the cool down period to prevent frequent replica fluctuations.
You can specify a cool down period by configuring the `stabilizationWindowSeconds` field. The stabilization window is used to restrict the fluctuation of the replica count when the metrics used for scaling keep fluctuating.
The autoscaling algorithm uses this window to infer a previous desired state and avoid unwanted changes to workload scale.
For example, a stabilization window is specified for the `scaleDown` field:

[source,yaml]
----
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
----
In the above example, all desired states for the past 5 minutes are considered. This approximates a rolling maximum, and avoids having the scaling algorithm frequently remove pods only to trigger recreating an equivalent pod just moments later.
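Putting these pieces together, a hypothetical `autoscaling/v2` `HorizontalPodAutoscaler` object could combine a CPU utilization target with the `scaleDown` stabilization window. The object name and scale target are placeholders, not from the source:

[source,yaml]
----
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app # hypothetical target workload
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
----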
modules/nodes-pods-autoscaling-requests-and-limits-hpa.adoc

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
// Module included in the following assemblies:
//
// * nodes/nodes-pods-autoscaling-about.adoc

:_content-type: CONCEPT
[id="nodes-pods-autoscaling-requests-and-limits-hpa_{context}"]
= About requests and limits

The scheduler uses the resource requests that you specify for containers in a pod to decide which node to place the pod on. The kubelet enforces the resource limit that you specify for a container to ensure that the container does not use more than the specified limit.
The kubelet also reserves the requested amount of that system resource specifically for that container to use.
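As an illustration, a hypothetical container spec with both values set. The scheduler places the pod based on `requests`, and the kubelet enforces `limits`; all names and values are placeholders:

[source,yaml]
----
spec:
  containers:
  - name: app # hypothetical container
    image: quay.io/example/app:latest # placeholder image
    resources:
      requests:
        cpu: 250m     # used for scheduling and reserved for the container
        memory: 128Mi
      limits:
        cpu: 500m     # kubelet prevents usage above these values
        memory: 256Mi
----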
.Using resource metrics

In the pod specifications, you must specify the resource requests, such as CPU and memory. The HPA uses this specification to determine the resource utilization and then scales the target up or down.

For example, the HPA object uses the following metric source:

[source,yaml]
----
type: Resource
resource:
  name: cpu
  target:
    type: Utilization
    averageUtilization: 60
----
In this example, the HPA keeps the average utilization of the pods in the scaling target at 60%. Utilization is the ratio of the pod's current resource usage to its resource request.
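The utilization calculation can be sketched as a quick check. The numbers below are hypothetical; the real controller averages live metrics across all targeted pods:

```python
# Utilization is current usage divided by the container's resource request,
# expressed as a percentage. All values here are hypothetical.

def utilization_percent(current_usage: float, request: float) -> float:
    """Return resource utilization as a percentage of the request."""
    return 100 * current_usage / request

# A pod requesting 500m CPU that currently uses 300m is at 60% utilization,
# which matches the averageUtilization: 60 target above.
print(utilization_percent(300, 500))
```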
modules/nodes-pods-autoscaling-workflow-hpa.adoc

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
// Module included in the following assemblies:
//
// * nodes/nodes-pods-autoscaling-about.adoc

:_content-type: CONCEPT
[id="nodes-pods-autoscaling-workflow-hpa_{context}"]
= How does the HPA work?

The horizontal pod autoscaler (HPA) extends the concept of pod auto-scaling. The HPA lets you create and manage a group of load-balanced pods. The HPA automatically increases or decreases the number of pods when a given CPU or memory threshold is crossed.

.High level workflow of the HPA
image::HPAflow.png[workflow]
The HPA is an API resource in the Kubernetes autoscaling API group. The autoscaler works as a control loop with a default sync period of 15 seconds. During this period, the controller manager queries the CPU utilization, the memory utilization, or both, against what is defined in the YAML file for the HPA.
The controller manager obtains the utilization metrics from the resource metrics API for per-pod resource metrics, such as CPU or memory, for each pod that is targeted by the HPA.

If a utilization value target is set, the controller calculates the utilization value as a percentage of the equivalent resource request on the containers in each pod. The controller then takes the average of utilization across all targeted pods and produces a ratio that is used to scale the number of desired replicas.
The HPA is configured to fetch metrics from `metrics.k8s.io`, which is provided by the metrics server. Because of the dynamic nature of metrics evaluation, the number of replicas can fluctuate during scaling for a group of replicas.
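The replica calculation described above follows the scaling formula documented for the Kubernetes HPA. A small sketch with hypothetical numbers:

```python
import math

# Scaling rule documented for the Kubernetes HPA:
# desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)
def desired_replicas(current_replicas: int, current_value: float, desired_value: float) -> int:
    """Return the replica count the HPA would aim for."""
    return math.ceil(current_replicas * current_value / desired_value)

# Four replicas averaging 90% CPU utilization against a 60% target
# scale out to six replicas; 4 * 90 / 60 = 6.
print(desired_replicas(4, 90, 60))
```

In practice the controller also applies tolerances, min/max replica bounds, and the stabilization window before acting on this value.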
[NOTE]
====
To implement the HPA, all targeted pods must have a resource request set on their containers.
====

nodes/pods/nodes-pods-autoscaling.adoc

Lines changed: 10 additions & 2 deletions
@@ -10,7 +10,7 @@ As a developer, you can use a horizontal pod autoscaler (HPA) to
 specify how {product-title} should automatically increase or decrease the scale of
 a replication controller or deployment configuration, based on metrics collected
 from the pods that belong to that replication controller or deployment
-configuration. You can create an HPA for any `Deployment`, `DeploymentConfig`,
+configuration. You can create an HPA for any `Deployment`, `DeploymentConfig`,
 `ReplicaSet`, `ReplicationController`, or `StatefulSet` object.

 [NOTE]
@@ -26,6 +26,12 @@ these objects, see xref:../../applications/deployments/what-deployments-are.adoc

 include::modules/nodes-pods-autoscaling-about.adoc[leveloffset=+1]

+include::modules/nodes-pods-autoscaling-workflow-hpa.adoc[leveloffset=+1]
+
+include::modules/nodes-pods-autoscaling-requests-and-limits-hpa.adoc[leveloffset=+1]
+
+include::modules/nodes-pods-autoscaling-best-practices-hpa.adoc[leveloffset=+1]
+
 include::modules/nodes-pods-autoscaling-policies.adoc[leveloffset=+2]

 include::modules/nodes-pods-autoscaling-creating-web-console.adoc[leveloffset=+1]
@@ -41,5 +47,7 @@ include::modules/nodes-pods-autoscaling-status-viewing.adoc[leveloffset=+2]

 == Additional resources

-For more information on replication controllers and deployment controllers,
+* For more information on replication controllers and deployment controllers,
 see xref:../../applications/deployments/what-deployments-are.adoc#what-deployments-are[Understanding deployments and deployment configs].
+
+* For an example on the usage of HPA, see https://cloud.redhat.com/blog/horizontal-pod-autoscaling-of-quarkus-application-based-on-memory-utilization[Horizontal Pod Autoscaling of Quarkus Application Based on Memory Utilization].
