
Commit 80527d5

docs: Provide autoscaling guidance for self-hosted gateway
Signed-off-by: Tom Kerkhove <[email protected]>
1 parent 6aa777b commit 80527d5

File tree

1 file changed: +30 −1 lines

articles/api-management/how-to-self-hosted-gateway-on-kubernetes-in-production.md

Lines changed: 30 additions & 1 deletion
@@ -34,7 +34,36 @@ By default, a self-hosted gateway is deployed with a **RollingUpdate** deploymen
## Autoscaling

While we provide [guidance on the minimum number of replicas](#number-of-replicas) for the self-hosted gateway, we recommend that you use autoscaling to meet the demand of your traffic more proactively.

There are two ways to autoscale the self-hosted gateway horizontally:

- Autoscale based on resource usage (CPU & memory)
- Autoscale based on the number of requests per second

This is possible through native Kubernetes functionality or by using [Kubernetes Event-driven Autoscaling (KEDA)](https://keda.sh/), a CNCF incubation project that strives to make application autoscaling simple.

> [!NOTE]
> KEDA is an open-source technology that isn't covered by Azure support and must be operated by customers.

## Resource-based autoscaling
Kubernetes allows you to autoscale the self-hosted gateway based on resource usage by using a [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/). It allows you to [define CPU and memory thresholds](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-resource-metrics) and the number of replicas to scale out or in.
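
As a rough illustration of this approach, the following is a minimal sketch of a Horizontal Pod Autoscaler that scales on CPU utilization. The deployment name `apim-gateway` and all thresholds are illustrative assumptions, not values from this article:

```yaml
# Hypothetical example: scale a self-hosted gateway deployment on CPU usage.
# The deployment name "apim-gateway" and the thresholds below are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: apim-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: apim-gateway
  minReplicas: 3           # keep a minimum number of replicas for availability
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70%
```

You would apply a manifest like this with `kubectl apply -f hpa.yaml`; the autoscaler requires container resource requests to be set so that utilization can be computed.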

An alternative is to use Kubernetes Event-driven Autoscaling (KEDA), which allows you to scale workloads based on a [variety of scalers](https://keda.sh/docs/latest/scalers/), including CPU and memory.
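
If KEDA is your autoscaler of choice, the equivalent resource-based trigger can be sketched as a `ScaledObject`. Again, the target deployment name and the values are assumptions for illustration:

```yaml
# Hypothetical example: KEDA ScaledObject using the built-in "cpu" scaler.
# Target deployment name and replica counts are assumptions.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: apim-gateway-scaler
spec:
  scaleTargetRef:
    name: apim-gateway     # the self-hosted gateway deployment
  minReplicaCount: 3
  maxReplicaCount: 10
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"        # target average CPU utilization percentage
```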

> [!TIP]
> If you're already using KEDA to scale other workloads, we recommend using KEDA as a unified app autoscaler. If that's not the case, then we strongly suggest relying on the native Kubernetes functionality through the Horizontal Pod Autoscaler.
## Traffic-based autoscaling

Kubernetes does not provide an out-of-the-box mechanism for traffic-based autoscaling.

Kubernetes Event-driven Autoscaling (KEDA) provides a few ways that can help with traffic-based autoscaling:

- You can scale based on metrics from a Kubernetes ingress if they're available in [Prometheus](https://keda.sh/docs/latest/scalers/prometheus/) or [Azure Monitor](https://keda.sh/docs/latest/scalers/azure-monitor/) by using an out-of-the-box scaler.
- You can install the [HTTP add-on](https://github.com/kedacore/http-add-on), which is available in beta and scales based on the number of requests per second.
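
As an illustration of the first option, a KEDA `ScaledObject` with a Prometheus trigger might look like the sketch below. The Prometheus server address, the ingress metric in the query, and the threshold are all assumptions that depend on your monitoring setup:

```yaml
# Hypothetical example: scale on requests per second reported by Prometheus.
# Server address, query, and threshold are assumptions for your environment.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: apim-gateway-traffic-scaler
spec:
  scaleTargetRef:
    name: apim-gateway
  minReplicaCount: 3
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        query: sum(rate(nginx_ingress_controller_requests[1m]))  # requests/sec at the ingress
        threshold: "100"   # add a replica for roughly every 100 requests/sec
```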

## Container resources

By default, the YAML file provided in the Azure portal doesn't specify container resource requests.
