
Commit b6628c5

Merge pull request #190453 from tomkerkhove/autoscaling-shgw
docs: Provide autoscaling guidance for API Management's self-hosted gateway
2 parents: 20e4744 + 2f97a10

File tree

1 file changed: +32 −0 lines changed

articles/api-management/how-to-self-hosted-gateway-on-kubernetes-in-production.md

@@ -32,6 +32,38 @@ The minimum number of replicas suitable for production is three, preferably comb

By default, a self-hosted gateway is deployed with a **RollingUpdate** deployment [strategy](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy). Review the default values and consider explicitly setting the [maxUnavailable](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-unavailable) and [maxSurge](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-surge) fields, especially when you're using a high replica count.
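As a sketch, these fields could be set on the gateway Deployment as follows; the deployment name and the values shown are illustrative, not defaults from the portal-provided YAML:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: self-hosted-gateway        # illustrative name
spec:
  replicas: 3                      # recommended production minimum
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1            # at most one replica taken down at a time
      maxSurge: 1                  # at most one extra replica created during a rollout
  # selector and pod template omitted for brevity
```

With a high replica count, pinning these values keeps rollouts predictable instead of relying on the percentage-based defaults.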
## Autoscaling

While we provide [guidance on the minimum number of replicas](#number-of-replicas) for the self-hosted gateway, we recommend that you also use autoscaling to meet the demand of your traffic more proactively.

There are two ways to autoscale the self-hosted gateway horizontally:

- Autoscale based on resource usage (CPU and memory)
- Autoscale based on the number of requests per second

This is possible through native Kubernetes functionality, or by using [Kubernetes Event-driven Autoscaling (KEDA)](https://keda.sh/). KEDA is a CNCF Incubation project that strives to make application autoscaling simple.

> [!NOTE]
> KEDA is an open-source technology that isn't covered by Azure support and must be operated by customers.

### Resource-based autoscaling

Kubernetes allows you to autoscale the self-hosted gateway based on resource usage by using a [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/). You can [define CPU and memory thresholds](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-resource-metrics), as well as the number of replicas to scale out or in.
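For illustration, a minimal HorizontalPodAutoscaler targeting the gateway Deployment might look like this; the names, replica bounds, and CPU threshold are assumptions, not prescribed values:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: self-hosted-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: self-hosted-gateway      # illustrative deployment name
  minReplicas: 3                   # aligns with the recommended production minimum
  maxReplicas: 10                  # illustrative upper bound
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70     # scale out when average CPU use exceeds 70%
```

Setting `minReplicas` to the recommended production minimum ensures autoscaling never drops below the baseline availability guidance.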

An alternative is to use Kubernetes Event-driven Autoscaling (KEDA), which allows you to scale workloads based on a [variety of scalers](https://keda.sh/docs/latest/scalers/), including CPU and memory.
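As a sketch, the equivalent with KEDA uses a ScaledObject with the CPU scaler; the names and values here are illustrative:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: self-hosted-gateway
spec:
  scaleTargetRef:
    name: self-hosted-gateway      # illustrative deployment name
  minReplicaCount: 3
  maxReplicaCount: 10
  triggers:
  - type: cpu
    metricType: Utilization
    metadata:
      value: "70"                  # target average CPU utilization percentage
```

Under the hood, KEDA creates and manages a Horizontal Pod Autoscaler for the target workload.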

> [!TIP]
> If you're already using KEDA to scale other workloads, we recommend using KEDA as a unified app autoscaler. If that isn't the case, we strongly suggest relying on the native Kubernetes functionality through the Horizontal Pod Autoscaler.

### Traffic-based autoscaling

Kubernetes doesn't provide an out-of-the-box mechanism for traffic-based autoscaling.

Kubernetes Event-driven Autoscaling (KEDA) provides a few ways that can help with traffic-based autoscaling:

- You can scale based on metrics from a Kubernetes ingress if they're available in [Prometheus](https://keda.sh/docs/latest/scalers/prometheus/) or [Azure Monitor](https://keda.sh/docs/latest/scalers/azure-monitor/) by using an out-of-the-box scaler.
- You can install the [HTTP add-on](https://github.com/kedacore/http-add-on), which is available in beta and scales based on the number of requests per second.
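To illustrate the Prometheus option, a ScaledObject could query an ingress request-rate metric like this; the server address, metric name, and threshold are assumptions that depend entirely on your monitoring setup:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: self-hosted-gateway
spec:
  scaleTargetRef:
    name: self-hosted-gateway      # illustrative deployment name
  minReplicaCount: 3
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring.svc:9090     # illustrative address
      query: sum(rate(nginx_ingress_controller_requests[2m]))  # illustrative ingress metric
      threshold: "100"             # target requests per second per replica
```

KEDA scales out when the query result divided by the current replica count exceeds the threshold.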
## Container resources
By default, the YAML file provided in the Azure portal doesn't specify container resource requests.
