
Commit 80527d5

docs: Provide autoscaling guidance for self-hosted gateway
Signed-off-by: Tom Kerkhove <[email protected]>
1 parent 6aa777b commit 80527d5

File tree

1 file changed: +30 −1 lines

articles/api-management/how-to-self-hosted-gateway-on-kubernetes-in-production.md

Lines changed: 30 additions & 1 deletion
@@ -34,7 +34,36 @@ By default, a self-hosted gateway is deployed with a **RollingUpdate** deploymen
## Autoscaling

While we provide [guidance on the minimum number of replicas](#number-of-replicas) for the self-hosted gateway, we recommend that you use autoscaling to meet the demand of your traffic more proactively.

There are two ways to autoscale the self-hosted gateway horizontally:

- Autoscale based on resource usage (CPU & memory)
- Autoscale based on the number of requests per second

This is possible through native Kubernetes functionality or by using [Kubernetes Event-driven Autoscaling (KEDA)](https://keda.sh/), a CNCF incubation project that strives to make application autoscaling simple.

> [!NOTE]
> KEDA is an open-source technology that isn't covered by Azure support and must be operated by customers.

## Resource-based autoscaling
Kubernetes allows you to autoscale the self-hosted gateway based on resource usage by using a [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/). It allows you to [define CPU and memory thresholds](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-resource-metrics) and the number of replicas to scale out or in.
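
As a rough illustration of this approach, the following is a minimal sketch of a Horizontal Pod Autoscaler that scales on CPU utilization. The deployment name `apim-gateway` and all thresholds are illustrative assumptions, not values from this article:

```yaml
# Hypothetical example: scale a self-hosted gateway deployment on CPU usage.
# The deployment name "apim-gateway" and the thresholds below are assumptions.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: apim-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: apim-gateway
  minReplicas: 3           # keep a minimum number of replicas for availability
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70%
```

You would apply a manifest like this with `kubectl apply -f hpa.yaml`; the autoscaler requires container resource requests to be set so that utilization can be computed.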

An alternative is to use Kubernetes Event-driven Autoscaling (KEDA), which allows you to scale workloads based on a [variety of scalers](https://keda.sh/docs/latest/scalers/), including CPU and memory.
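
If KEDA is your autoscaler of choice, the equivalent resource-based trigger can be sketched as a `ScaledObject`. Again, the target deployment name and the values are assumptions for illustration:

```yaml
# Hypothetical example: KEDA ScaledObject using the built-in "cpu" scaler.
# Target deployment name and replica counts are assumptions.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: apim-gateway-scaler
spec:
  scaleTargetRef:
    name: apim-gateway     # the self-hosted gateway deployment
  minReplicaCount: 3
  maxReplicaCount: 10
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"        # target average CPU utilization percentage
```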

> [!TIP]
> If you're already using KEDA to scale other workloads, we recommend using KEDA as a unified app autoscaler. If that's not the case, then we strongly suggest relying on the native Kubernetes functionality through the Horizontal Pod Autoscaler.
## Traffic-based autoscaling

Kubernetes does not provide an out-of-the-box mechanism for traffic-based autoscaling.

Kubernetes Event-driven Autoscaling (KEDA) provides a few ways that can help with traffic-based autoscaling:

- You can scale based on metrics from a Kubernetes ingress if they're available in [Prometheus](https://keda.sh/docs/latest/scalers/prometheus/) or [Azure Monitor](https://keda.sh/docs/latest/scalers/azure-monitor/) by using an out-of-the-box scaler.
- You can install the [HTTP add-on](https://github.com/kedacore/http-add-on), which is available in beta and scales based on the number of requests per second.
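
As an illustration of the first option, a KEDA `ScaledObject` with a Prometheus trigger might look like the sketch below. The Prometheus server address, the ingress metric in the query, and the threshold are all assumptions that depend on your monitoring setup:

```yaml
# Hypothetical example: scale on requests per second reported by Prometheus.
# Server address, query, and threshold are assumptions for your environment.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: apim-gateway-traffic-scaler
spec:
  scaleTargetRef:
    name: apim-gateway
  minReplicaCount: 3
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        query: sum(rate(nginx_ingress_controller_requests[1m]))  # requests/sec at the ingress
        threshold: "100"   # add a replica for roughly every 100 requests/sec
```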

## Container resources

By default, the YAML file provided in the Azure portal doesn't specify container resource requests.
