Commit d10786f

authored
Merge pull request #44959 from mocdaniel/autoscaling-overview
Adds a new concept page for autoscaling
2 parents 4dd24fd + cdb2b06 commit d10786f

1 file changed

+146
-0
lines changed

@@ -0,0 +1,146 @@
1+
---
2+
title: Autoscaling Workloads
3+
description: >-
4+
With autoscaling, you can automatically update your workloads in one way or another. This allows your cluster to react to changes in resource demand more elastically and efficiently.
5+
content_type: concept
6+
weight: 40
7+
---
8+
9+
<!-- overview -->
10+
11+
In Kubernetes, you can _scale_ a workload depending on the current demand of resources.
12+
This allows your cluster to react to changes in resource demand more elastically and efficiently.
13+
14+
When you scale a workload, you can either increase or decrease the number of replicas managed by
15+
the workload, or adjust the resources available to the replicas in-place.
16+
17+
The first approach is referred to as _horizontal scaling_, while the second is referred to as
18+
_vertical scaling_.
19+
20+
There are manual and automatic ways to scale your workloads, depending on your use case.
21+
22+
<!-- body -->
23+
24+
## Scaling workloads manually
25+
26+
Kubernetes supports _manual scaling_ of workloads. Horizontal scaling can be done
27+
using the `kubectl` CLI.
28+
For vertical scaling, you need to _patch_ the resource definition of your workload.
29+
30+
See below for examples of both strategies.
31+
32+
- **Horizontal scaling**: [Running multiple instances of your app](/docs/tutorials/kubernetes-basics/scale/scale-intro/)
33+
- **Vertical scaling**: [Resizing CPU and memory resources assigned to containers](/docs/tasks/configure-pod-container/resize-container-resources)
34+
35+
## Scaling workloads automatically
36+
37+
Kubernetes also supports _automatic scaling_ of workloads, which is the focus of this page.
38+
39+
The concept of _Autoscaling_ in Kubernetes refers to the ability to automatically update an
40+
object that manages a set of Pods (for example a
41+
{{< glossary_tooltip text="Deployment" term_id="deployment" >}}).
42+
43+
### Scaling workloads horizontally
44+
45+
In Kubernetes, you can automatically scale a workload horizontally using a _HorizontalPodAutoscaler_ (HPA).
46+
47+
It is implemented as a Kubernetes API resource and a {{< glossary_tooltip text="controller" term_id="controller" >}}
48+
and periodically adjusts the number of {{< glossary_tooltip text="replicas" term_id="replica" >}}
49+
in a workload to match observed resource utilization such as CPU or memory usage.
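
As an illustration of how a replica count can follow observed utilization, here is a minimal Python sketch of the proportional rule that the HorizontalPodAutoscaler documentation describes (desired replicas = ceil(current replicas × current metric / target metric)). The function name, parameter names, and the handling of the tolerance band are assumptions made for this example; it is not the controller's actual code and ignores details such as missing metrics, pod readiness, and stabilization windows.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.1) -> int:
    """Return the replica count suggested by a proportional scaling rule.

    Hypothetical helper for illustration only. `tolerance` models the band
    around the target within which no scaling happens (assumed 10% here).
    """
    if current_replicas <= 0 or target_metric <= 0:
        raise ValueError("replicas and target metric must be positive")

    ratio = current_metric / target_metric
    # Close enough to the target: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up.
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # 4 replicas at 90% average CPU with a 60% target
    print(desired_replicas(4, 90, 60))
```

With these inputs the ratio is 1.5, so the sketch suggests growing from 4 to 6 replicas; a reading of 62% against the same target stays inside the tolerance band and leaves the count unchanged.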
50+
51+
There is a [walkthrough tutorial](/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough) of configuring a HorizontalPodAutoscaler for a Deployment.
52+
53+
### Scaling workloads vertically
54+
55+
{{< feature-state for_k8s_version="v1.25" state="stable" >}}
56+
57+
You can automatically scale a workload vertically using a _VerticalPodAutoscaler_ (VPA).
58+
Unlike the HPA, the VPA doesn't come with Kubernetes by default, but is a separate project
59+
that can be found [on GitHub](https://github.com/kubernetes/autoscaler/tree/9f87b78df0f1d6e142234bb32e8acbd71295585a/vertical-pod-autoscaler).
60+
61+
Once installed, it adds {{< glossary_tooltip text="CustomResourceDefinitions" term_id="customresourcedefinition" >}}
62+
(CRDs) to your cluster. You can then create custom resources for your workloads that define _how_ and _when_ to scale the resources of the managed replicas.
63+
64+
{{< note >}}
65+
You will need to have the [Metrics Server](https://github.com/kubernetes-sigs/metrics-server)
66+
installed in your cluster for the VPA to work.
67+
{{< /note >}}
68+
69+
At the moment, the VPA can operate in four different modes:
70+
71+
{{< table caption="Different modes of the VPA" >}}
72+
Mode | Description
73+
:----|:-----------
74+
`Auto` | Currently equivalent to `Recreate`; this might change to in-place updates in the future.
75+
`Recreate` | The VPA assigns resource requests on pod creation and also updates them on existing pods by evicting them when the requested resources differ significantly from the new recommendation.
76+
`Initial` | The VPA only assigns resource requests on pod creation and never changes them later.
77+
`Off` | The VPA does not automatically change the resource requirements of the pods. The recommendations are calculated and can be inspected in the VPA object.
78+
{{< /table >}}
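
The table above can be restated as a small decision function. The following Python sketch only encodes what the table says; the enum, the function name, and the returned keys are hypothetical and do not correspond to any VPA API.

```python
from enum import Enum


class UpdateMode(Enum):
    """The four VPA modes listed in the table above."""
    AUTO = "Auto"
    RECREATE = "Recreate"
    INITIAL = "Initial"
    OFF = "Off"


def vpa_behavior(mode: UpdateMode) -> dict:
    """Summarize what the VPA does to Pods in a given mode.

    Illustrative only: the keys of the returned dict are invented
    for this example.
    """
    # Per the table, `Auto` currently behaves like `Recreate`.
    effective = UpdateMode.RECREATE if mode is UpdateMode.AUTO else mode
    return {
        # Requests are assigned when a Pod is created.
        "sets_requests_on_creation": effective in (UpdateMode.RECREATE,
                                                   UpdateMode.INITIAL),
        # Existing Pods are evicted so they restart with new requests.
        "evicts_existing_pods": effective is UpdateMode.RECREATE,
    }


if __name__ == "__main__":
    for mode in UpdateMode:
        print(mode.value, vpa_behavior(mode))
```

In `Off` mode both flags are false: the VPA only calculates recommendations, which you can inspect in the VPA object.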
79+
80+
#### Requirements for in-place resizing
81+
82+
{{< feature-state for_k8s_version="v1.27" state="alpha" >}}
83+
84+
Resizing a workload in-place **without** restarting the {{< glossary_tooltip text="Pods" term_id="pod" >}}
85+
or their {{< glossary_tooltip text="Containers" term_id="container" >}} requires Kubernetes version 1.27 or later.<br />
86+
Additionally, the `InPlacePodVerticalScaling` feature gate needs to be enabled.
87+
88+
{{< feature-gate-description name="InPlacePodVerticalScaling" >}}
89+
90+
### Autoscaling based on cluster size
91+
92+
For workloads that need to be scaled based on the size of the cluster (for example
93+
`cluster-dns` or other system components), you can use the
94+
[_Cluster Proportional Autoscaler_](https://github.com/kubernetes-sigs/cluster-proportional-autoscaler).<br />
95+
Just like the VPA, it is not part of the Kubernetes core, but hosted as its
96+
own project on GitHub.
97+
98+
The Cluster Proportional Autoscaler watches the number of schedulable {{< glossary_tooltip text="nodes" term_id="node" >}}
99+
and cores and scales the number of replicas of the target workload accordingly.
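
To make the relationship between cluster size and replica count concrete, here is a minimal Python sketch modeled on the linear mode described in the Cluster Proportional Autoscaler project's README: the replica count is the larger of the node-based and core-based estimates, clamped to optional bounds. The function and parameter names are assumptions written in Python style; they are not the project's configuration keys.

```python
import math
from typing import Optional


def linear_replicas(nodes: int,
                    cores: int,
                    nodes_per_replica: float,
                    cores_per_replica: float,
                    min_replicas: int = 1,
                    max_replicas: Optional[int] = None) -> int:
    """Estimate replicas from cluster size using a linear rule.

    Hypothetical helper for illustration; not the autoscaler's own code.
    """
    by_nodes = math.ceil(nodes / nodes_per_replica)
    by_cores = math.ceil(cores / cores_per_replica)
    # Take whichever dimension demands more replicas, but never
    # go below the configured minimum.
    replicas = max(by_nodes, by_cores, min_replicas)
    if max_replicas is not None:
        replicas = min(replicas, max_replicas)
    return replicas


if __name__ == "__main__":
    # 4 schedulable nodes with 13 cores in total
    print(linear_replicas(nodes=4, cores=13,
                          nodes_per_replica=1, cores_per_replica=2))
```

For 4 nodes and 13 cores with one replica per node and one per 2 cores, the core-based estimate (7) wins over the node-based one (4).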
100+
101+
If the number of replicas should stay the same, you can scale your workloads vertically according to the cluster size using
102+
the [_Cluster Proportional Vertical Autoscaler_](https://github.com/kubernetes-sigs/cluster-proportional-vertical-autoscaler).
103+
The project is **currently in beta** and can be found on GitHub.
104+
105+
While the Cluster Proportional Autoscaler scales the number of replicas of a workload, the Cluster Proportional Vertical Autoscaler
106+
adjusts the resource requests for a workload (for example a Deployment or DaemonSet) based on the number of nodes and/or cores
107+
in the cluster.
108+
109+
### Event-driven autoscaling
110+
111+
It is also possible to scale workloads based on events, for example using the
112+
[_Kubernetes Event Driven Autoscaler_ (**KEDA**)](https://keda.sh/).
113+
114+
KEDA is a CNCF graduated project that enables you to scale your workloads based on the number
115+
of events to be processed, for example the number of messages in a queue. A wide range
116+
of adapters is available for different event sources.
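
The idea of scaling on pending events can be sketched as a simple proportional rule: run enough replicas that each one handles at most a target number of queued messages. The Python below is a simplified illustration under that assumption; the function name, parameters, and default bounds are hypothetical and are not KEDA's API or its exact algorithm.

```python
import math


def replicas_for_backlog(pending_messages: int,
                         target_per_replica: int,
                         min_replicas: int = 0,
                         max_replicas: int = 10) -> int:
    """Return a replica count sized to the current message backlog.

    Hypothetical helper for illustration only. The default bounds
    (0 and 10) are assumptions chosen for this example.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    if pending_messages < 0:
        raise ValueError("pending_messages cannot be negative")

    wanted = math.ceil(pending_messages / target_per_replica)
    # Clamp the result to the allowed range.
    return max(min_replicas, min(wanted, max_replicas))


if __name__ == "__main__":
    # 23 queued messages, each replica should handle about 5
    print(replicas_for_backlog(23, 5))
```

An empty queue yields the minimum (zero replicas with the default bounds assumed here), while a large backlog is capped at the maximum.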
117+
118+
### Autoscaling based on schedules
119+
120+
Another strategy for scaling your workloads is to **schedule** the scaling operations, for example in order to
121+
reduce resource consumption during off-peak hours.
122+
123+
Similar to event-driven autoscaling, such behavior can be achieved using KEDA in conjunction with
124+
its [`Cron` scaler](https://keda.sh/docs/2.13/scalers/cron/). The `Cron` scaler allows you to define schedules
125+
(and time zones) for scaling your workloads in or out.
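
As a rough illustration of schedule-based scaling, the following Python sketch picks a replica count depending on whether the current time falls inside a daily window. It deliberately uses plain times of day instead of cron expressions, and expects the caller to pass a `datetime` already expressed in the desired time zone; all names are hypothetical and unrelated to the `Cron` scaler's configuration.

```python
from datetime import datetime, time


def scheduled_replicas(now: datetime,
                       window_start: time,
                       window_end: time,
                       peak_replicas: int,
                       off_peak_replicas: int) -> int:
    """Choose a replica count based on a daily time window.

    Hypothetical helper for illustration only. The window includes
    `window_start` and excludes `window_end`.
    """
    current = now.time()
    if window_start <= window_end:
        in_window = window_start <= current < window_end
    else:
        # The window wraps past midnight, for example 22:00 to 06:00.
        in_window = current >= window_start or current < window_end
    return peak_replicas if in_window else off_peak_replicas


if __name__ == "__main__":
    business_hours = (time(8, 0), time(18, 0))
    print(scheduled_replicas(datetime(2024, 1, 15, 10, 0),
                             *business_hours,
                             peak_replicas=10, off_peak_replicas=1))
```

Here a workload runs 10 replicas between 08:00 and 18:00 and a single replica otherwise.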
126+
127+
## Scaling cluster infrastructure
128+
129+
If scaling workloads isn't enough to meet your needs, you can also scale your cluster infrastructure itself.
130+
131+
Scaling the cluster infrastructure normally means adding or removing {{< glossary_tooltip text="nodes" term_id="node" >}}.
132+
This can be done using one of two available autoscalers:
133+
134+
- [**Cluster Autoscaler**](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler)
135+
- [**Karpenter**](https://github.com/kubernetes-sigs/karpenter?tab=readme-ov-file)
136+
137+
Both autoscalers watch for pods marked as _unschedulable_ and for _underutilized_ nodes, and then add or
138+
remove nodes as needed.
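
The decision described above can be sketched as follows. This Python example is a deliberately simplified model, not the logic of either project: the function name, the return shape, and the 50% utilization threshold are assumptions made for illustration, and real autoscalers also consider factors such as whether the Pods on a node can be rescheduled elsewhere.

```python
from __future__ import annotations


def plan_node_change(unschedulable_pods: int,
                     node_utilization: dict[str, float],
                     threshold: float = 0.5) -> tuple[str, list[str]]:
    """Decide whether to add or remove nodes.

    Hypothetical helper for illustration only. `node_utilization` maps
    node names to a utilization fraction between 0.0 and 1.0.
    """
    # Pending Pods that cannot be scheduled take priority: add capacity.
    if unschedulable_pods > 0:
        return ("add", [])
    # Otherwise look for nodes that are candidates for removal.
    underutilized = sorted(name for name, used in node_utilization.items()
                           if used < threshold)
    if underutilized:
        return ("remove", underutilized)
    return ("none", [])


if __name__ == "__main__":
    print(plan_node_change(0, {"node-a": 0.8, "node-b": 0.2}))
```

In this model, unschedulable Pods always trigger a scale-up, and nodes are only considered for removal when nothing is waiting to be scheduled.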
139+
140+
## {{% heading "whatsnext" %}}
141+
142+
- Learn more about scaling horizontally
143+
  - [Scale a StatefulSet](/docs/tasks/run-application/scale-stateful-set/)
144+
  - [HorizontalPodAutoscaler Walkthrough](/docs/tasks/run-application/horizontal-pod-autoscale-walkthrough/)
145+
- [Resize Container Resources In-Place](/docs/tasks/configure-pod-container/resize-container-resources/)
146+
- [Autoscale the DNS Service in a Cluster](/docs/tasks/administer-cluster/dns-horizontal-autoscaling/)
