Merge pull request #50000 from jm-franc/configurable-tolerance-blog

k8s-ci-robot · web-flow · commit 2247c6c90db8 · 2025-04-14T01:30:49.000-07:00
Add HPA 'configurable tolerance' blog post (KEP-4951).
diff --git a/content/en/blog/_posts/XXXX-XX-XX-hpa-configurable-tolerance.md b/content/en/blog/_posts/XXXX-XX-XX-hpa-configurable-tolerance.md
@@ -0,0 +1,96 @@
+---
+layout: blog
+title: "Kubernetes v1.33: HorizontalPodAutoscaler Configurable Tolerance"
+slug: kubernetes-1-33-hpa-configurable-tolerance
+# after the v1.33 release, set a future publication date and remove the draft marker
+# the release comms team can confirm which date has been assigned
+#
+# PRs to remove the draft marker should be opened BEFORE release day
+draft: true
+math: true # for formulae
+date: XXXX-XX-XX
+author: "Jean-Marc François (Google)"
+---
+
+This post describes _configurable tolerance for horizontal Pod autoscaling_,
+a new alpha feature first available in Kubernetes 1.33.
+
+## What is it?
+
+[Horizontal Pod Autoscaling](/docs/tasks/run-application/horizontal-pod-autoscale/)
+is a well-known Kubernetes feature that allows your workload to
+automatically resize by adding or removing replicas based on resource
+utilization.
+
+Let's say you have a web application running in a Kubernetes cluster with 50
+replicas. You configure the Horizontal Pod Autoscaler (HPA) to scale based on
+CPU utilization, with a target of 75% utilization. Now, imagine that the current
+CPU utilization across all replicas is 90%, which is higher than the desired
+75%. The HPA will calculate the required number of replicas using the formula:
+```math
+desiredReplicas = \left\lceil currentReplicas \times \frac{currentMetricValue}{desiredMetricValue} \right\rceil
+```
+
+In this example:
+```math
+50 \times (90/75) = 60
+```
+
+So, the HPA will increase the number of replicas from 50 to 60 to reduce the
+load on each pod. Similarly, if the CPU utilization were to drop below 75%, the
+HPA would scale down the number of replicas accordingly. The Kubernetes
+documentation provides a
+[detailed description of the scaling algorithm](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details).
+
+In order to avoid replicas being created or deleted whenever a small metric
+fluctuation occurs, Kubernetes applies a form of hysteresis: it only changes the
+number of replicas when the current and desired metric values differ by more
+than 10%. In the example above, since the ratio between the current and desired
+metric values is \\(90/75\\), or 20% above target, exceeding the 10% tolerance,
+the scale-up action will proceed.
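+
+Written as a check on the ratio (a sketch in the notation of the formula above,
+where _tolerance_ is a symbol introduced here for illustration), the HPA skips
+scaling when:
+```math
+\left| \frac{currentMetricValue}{desiredMetricValue} - 1 \right| \le tolerance
+```
+
+With the numbers from the example:
+```math
+\left| \frac{90}{75} - 1 \right| = 0.2 > 0.1
+```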
+
+This default tolerance of 10% is cluster-wide; in older Kubernetes releases, it
+could not be fine-tuned. It's a suitable value for most use cases, but too coarse
+for large deployments, where a 10% tolerance represents tens of pods. As a
+result, the community has long
+[asked](https://github.com/kubernetes/kubernetes/issues/116984) to be able to
+tune this value.
+
+In Kubernetes v1.33, this is now possible.
+
+## How do I use it?
+
+After enabling the `HPAConfigurableTolerance`
+[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) in
+your Kubernetes v1.33 cluster, you can set the desired tolerance on your
+HorizontalPodAutoscaler object.
+
+Tolerances appear under the `spec.behavior.scaleDown` and
+`spec.behavior.scaleUp` fields and can thus be different for scale up and scale
+down. A typical use would be to specify a small tolerance on scale up (to
+react quickly to spikes), but a higher one on scale down (to avoid adding and
+removing replicas too quickly in response to small metric fluctuations).
+
+For example, an HPA with a tolerance of 5% on scale-down, and no tolerance on
+scale-up, would look like the following:
+
+```yaml
+apiVersion: autoscaling/v2
+kind: HorizontalPodAutoscaler
+metadata:
+  name: my-app
+spec:
+  ...
+  behavior:
+    scaleDown:
+      tolerance: 0.05
+    scaleUp:
+      tolerance: 0
+```
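+
+As a rough illustration of how these two values would play out (a sketch,
+assuming the 75% CPU target from the earlier example and hypothetical
+utilization readings): a reading of 72% falls within the 5% scale-down
+tolerance, so no replicas are removed, whereas a reading of 70% falls outside it:
+```math
+1 - \frac{72}{75} = 0.04 \le 0.05 \qquad 1 - \frac{70}{75} \approx 0.067 > 0.05
+```
+
+With a scale-up tolerance of 0, any reading above the 75% target is eligible to
+trigger a scale-up.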
+
+## I want all the details!
+
+Get all the technical details by reading
+[KEP-4951](https://github.com/kubernetes/enhancements/tree/master/keps/sig-autoscaling/4951-configurable-hpa-tolerance)
+and follow [issue 4951](https://github.com/kubernetes/enhancements/issues/4951)
+to be notified of the feature graduation.