| 1 | +--- |
| 2 | +layout: blog |
| 3 | +title: "Kubernetes v1.33: HorizontalPodAutoscaler Configurable Tolerance" |
| 4 | +slug: kubernetes-1-33-hpa-configurable-tolerance |
| 5 | +# after the v1.33 release, set a future publication date and remove the draft marker |
| 6 | +# the release comms team can confirm which date has been assigned |
| 7 | +# |
| 8 | +# PRs to remove the draft marker should be opened BEFORE release day |
| 9 | +draft: true |
| 10 | +math: true # for formulae |
| 11 | +date: XXXX-XX-XX |
| 12 | +author: "Jean-Marc François (Google)" |
| 13 | +--- |
| 14 | + |
| 15 | +This post describes _configurable tolerance for horizontal Pod autoscaling_, |
| 16 | +a new alpha feature first available in Kubernetes 1.33. |
| 17 | + |
| 18 | +## What is it? |
| 19 | + |
| 20 | +[Horizontal Pod Autoscaling](/docs/tasks/run-application/horizontal-pod-autoscale/) |
| 21 | +is a well-known Kubernetes feature that allows your workload to |
| 22 | +automatically resize by adding or removing replicas based on resource |
| 23 | +utilization. |
| 24 | + |
| 25 | +Let's say you have a web application running in a Kubernetes cluster with 50 |
| 26 | +replicas. You configure the Horizontal Pod Autoscaler (HPA) to scale based on |
| 27 | +CPU utilization, with a target of 75% utilization. Now, imagine that the current |
| 28 | +CPU utilization across all replicas is 90%, which is higher than the desired |
| 29 | +75%. The HPA will calculate the required number of replicas using the formula: |
| 30 | +```math |
| 31 | +desiredReplicas = \left\lceil currentReplicas \times \frac{currentMetricValue}{desiredMetricValue} \right\rceil |
| 32 | +``` |
| 33 | + |
| 34 | +In this example: |
| 35 | +```math |
| 36 | +50 \times (90/75) = 60 |
| 37 | +``` |
| 38 | + |
| 39 | +So, the HPA will increase the number of replicas from 50 to 60 to reduce the |
| 40 | +load on each pod. Similarly, if the CPU utilization were to drop below 75%, the |
| 41 | +HPA would scale down the number of replicas accordingly. The Kubernetes |
| 42 | +documentation provides a |
| 43 | +[detailed description of the scaling algorithm](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details). |
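
The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the controller's actual code; the function name `desired_replicas` and its parameter names are made up for this example.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, desired_metric: float) -> int:
    """Hypothetical helper: ceil(currentReplicas * currentMetricValue / desiredMetricValue)."""
    # Multiply before dividing to limit floating-point rounding on exact cases.
    return math.ceil(current_replicas * current_metric / desired_metric)


# The example from the text: 50 replicas, 90% utilization, 75% target.
print(desired_replicas(50, 90, 75))
```

With the numbers from the example, the call returns 60, matching the calculation above.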
| 44 | + |
| 45 | +In order to avoid replicas being created or deleted whenever a small metric |
| 46 | +fluctuation occurs, Kubernetes applies a form of hysteresis: it only changes the |
| 47 | +number of replicas when the current and desired metric values differ by more |
| 48 | +than 10%. In the example above, since the ratio between the current and desired |
| 49 | +metric values is \\(90/75\\), or 20% above target, exceeding the 10% tolerance, |
| 50 | +the scale-up action will proceed. |
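
As a rough sketch of that check, assuming the tolerance is compared against the distance between the metric ratio and 1 (the helper name `should_scale` is hypothetical, and the real controller handles more cases, such as missing metrics and unready Pods):

```python
def should_scale(current_metric: float, desired_metric: float, tolerance: float = 0.10) -> bool:
    """Hypothetical helper: True when the metric ratio falls outside the tolerance band."""
    ratio = current_metric / desired_metric
    # Inside the band [1 - tolerance, 1 + tolerance], the replica count is left alone.
    return abs(ratio - 1.0) > tolerance


print(should_scale(90, 75))  # ratio 1.2, outside the 10% band
print(should_scale(80, 75))  # ratio ~1.067, inside the 10% band
```

The first call reports that scaling proceeds, as in the example above; the second stays within the default tolerance, so the replica count would not change.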
| 51 | + |
| 52 | +This default tolerance of 10% is cluster-wide; in older Kubernetes releases, it |
| 53 | +could not be tuned for individual HPAs. It's a suitable value for most use cases, but too coarse |
| 54 | +for large deployments, where a 10% tolerance represents tens of pods. As a |
| 55 | +result, the community has long |
| 56 | +[asked](https://github.com/kubernetes/kubernetes/issues/116984) to be able to |
| 57 | +tune this value. |
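
To make the scale of the problem concrete, here is a hypothetical case (the numbers are chosen purely for illustration): a deployment with 500 replicas, a 75% target, and a current utilization of 82%. The ratio stays inside the 10% tolerance, so nothing happens, even though the formula alone would ask for 47 more Pods:

```math
\frac{82}{75} \approx 1.093 < 1.10, \qquad \left\lceil 500 \times \frac{82}{75} \right\rceil = \lceil 546.67 \rceil = 547
```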
| 58 | + |
| 59 | +In Kubernetes v1.33, this is now possible. |
| 60 | + |
| 61 | +## How do I use it? |
| 62 | + |
| 63 | +After enabling the `HPAConfigurableTolerance` |
| 64 | +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) in |
| 65 | +your Kubernetes v1.33 cluster, you can add your desired tolerance to your |
| 66 | +HorizontalPodAutoscaler object. |
| 67 | + |
| 68 | +Tolerances appear under the `spec.behavior.scaleDown` and |
| 69 | +`spec.behavior.scaleUp` fields and can thus be different for scale up and scale |
| 70 | +down. A typical usage would be to specify a small tolerance on scale up (to |
| 71 | +react quickly to spikes), but a larger one on scale down (to avoid adding and removing |
| 72 | +replicas too quickly in response to small metric fluctuations). |
| 73 | + |
| 74 | +For example, an HPA with a tolerance of 5% on scale-down, and no tolerance on |
| 75 | +scale-up, would look like the following: |
| 76 | + |
| 77 | +```yaml |
| 78 | +apiVersion: autoscaling/v2 |
| 79 | +kind: HorizontalPodAutoscaler |
| 80 | +metadata: |
| 81 | + name: my-app |
| 82 | +spec: |
| 83 | + ... |
| 84 | + behavior: |
| 85 | + scaleDown: |
| 86 | + tolerance: 0.05 |
| 87 | + scaleUp: |
| 88 | + tolerance: 0 |
| 89 | +``` |
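
To see how the two tolerances in the manifest above might play out, here is a simplified model in Python. It assumes that the scale-up tolerance applies when the metric ratio is above 1 and the scale-down tolerance when it is below 1; the function name and parameters are hypothetical, and the real controller also accounts for stabilization windows, scaling policies, and replica bounds.

```python
import math


def replicas_with_tolerance(
    current_replicas: int,
    current_metric: float,
    desired_metric: float,
    scale_up_tolerance: float = 0.10,
    scale_down_tolerance: float = 0.10,
) -> int:
    """Hypothetical model of per-direction tolerances (not the controller's code)."""
    ratio = current_metric / desired_metric
    # Above target: skip scaling if the overshoot is within the scale-up tolerance.
    if ratio > 1.0 and ratio - 1.0 <= scale_up_tolerance:
        return current_replicas
    # Below target: skip scaling if the undershoot is within the scale-down tolerance.
    if ratio < 1.0 and 1.0 - ratio <= scale_down_tolerance:
        return current_replicas
    return math.ceil(current_replicas * current_metric / desired_metric)


# Tolerances from the manifest: 5% on scale-down, none on scale-up; target is 75%.
for metric in (76, 72, 69):
    print(metric, replicas_with_tolerance(100, metric, 75, 0.0, 0.05))
```

Under these assumptions, a utilization of 76% triggers a scale-up right away, 72% is absorbed by the 5% scale-down tolerance, and 69% falls outside it and leads to a scale-down.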
| 90 | + |
| 91 | +## I want all the details! |
| 92 | + |
| 93 | +Get all the technical details by reading |
| 94 | +[KEP-4951](https://github.com/kubernetes/enhancements/tree/master/keps/sig-autoscaling/4951-configurable-hpa-tolerance) |
| 95 | +and follow [issue 4951](https://github.com/kubernetes/enhancements/issues/4951) |
| 96 | +to be notified of the feature's graduation. |