@@ -18,21 +18,36 @@ a new alpha feature first available in Kubernetes 1.33.
## What is it?
[Horizontal Pod Autoscaling](/docs/tasks/run-application/horizontal-pod-autoscale/)
- (HPA) is a well-known Kubernetes feature that allows your workload to
+ is a well-known Kubernetes feature that allows your workload to
automatically resize by adding or removing replicas based on resource
utilization.

- To decide how many replicas a workload requires, users configure their HPA
- with a metric (e.g. CPU utilization) and an expected value for this metric (e.g.
- 80%). The HPA updates the number of replicas based on the ratio between the
- current and desired metric value. (For example, if there are currently 100
- replicas, the CPU utilization is 84%, and the desired utilization is 80%, the
- HPA will ask for \\(100 \times (84/80)\\) replicas).
+ Let's say you have a web application running in a Kubernetes cluster with 50
+ replicas. You configure the Horizontal Pod Autoscaler (HPA) to scale based on
+ CPU utilization, with a target of 75% utilization. Now, imagine that the current
+ CPU utilization across all replicas is 90%, which is higher than the desired
+ 75%. The HPA will calculate the required number of replicas using the formula:
+ ```math
+ desiredReplicas = \left\lceil currentReplicas \times \frac{currentMetricValue}{desiredMetricValue} \right\rceil
+ ```
+
+ In this example:
+ ```math
+ 50 \times (90/75) = 60
+ ```
+
+ So, the HPA will increase the number of replicas from 50 to 60 to reduce the
+ load on each pod. Similarly, if the CPU utilization were to drop below 75%, the
+ HPA would scale down the number of replicas accordingly. The Kubernetes
+ documentation provides a
+ [detailed description of the scaling algorithm](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details).
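As a rough sketch of the formula above (a hypothetical helper for illustration, not the actual kube-controller-manager code):

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas applies the HPA scaling formula:
// ceil(currentReplicas * currentMetricValue / desiredMetricValue).
func desiredReplicas(currentReplicas int, currentMetric, desiredMetric float64) int {
	return int(math.Ceil(float64(currentReplicas) * currentMetric / desiredMetric))
}

func main() {
	// The example from the text: 50 replicas at 90% CPU, targeting 75%.
	fmt.Println(desiredReplicas(50, 90, 75)) // 60
}
```

The ceiling matters when the ratio doesn't divide evenly: 100 replicas at 84% utilization with an 80% target yields ceil(100 × 84/80) = 105.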
In order to avoid replicas being created or deleted whenever a small metric
fluctuation occurs, Kubernetes applies a form of hysteresis: it only changes the
- number of replicas when the the current and desired metric values differ by more
- than 10%.
+ number of replicas when the current and desired metric values differ by more
+ than 10%. In the example above, the ratio between the current and desired
+ metric values is \\(90/75\\), or 20% above target. Since this exceeds the 10%
+ tolerance, the scale-up action will proceed.
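This tolerance check can be sketched as follows (a simplified illustration, assuming the check compares the metric ratio's distance from 1.0 against the tolerance; not the controller's actual code):

```go
package main

import (
	"fmt"
	"math"
)

// withinTolerance reports whether the current/desired metric ratio is close
// enough to 1.0 that the HPA skips scaling (default tolerance: 0.10).
func withinTolerance(currentMetric, desiredMetric, tolerance float64) bool {
	return math.Abs(currentMetric/desiredMetric-1.0) <= tolerance
}

func main() {
	fmt.Println(withinTolerance(90, 75, 0.10)) // false: 20% above target, so the HPA acts
	fmt.Println(withinTolerance(84, 80, 0.10)) // true: only 5% off, so no scaling occurs
}
```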
This default tolerance of 10% is cluster-wide; in older Kubernetes releases, it
could not be fine-tuned. It's a suitable value for most usage, but too coarse
tolerance: 0
```
- Consider the previous scenario where the ratio of current to desired metric
- values is \\(84/80\\), a 5% increase. With the default 10% scale-up tolerance,
- no scaling occurs. However, with the HPA configured as shown, featuring a 0%
- scale-up tolerance, the 5% increase triggers scaling.
-
## I want all the details!
Get all the technical details by reading