Skip to content

Commit 667db81

Browse files
authored
Merge pull request #183423 from tomkerkhove/shgw-apim-kubernetes-prod-guidance
docs: Provide dedicated page for production considerations for self-hosted gateway for Azure API Management
2 parents 4ab697e + c0c8681 commit 667db81

6 files changed

+132
-71
lines changed

articles/api-management/TOC.yml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -332,6 +332,10 @@
332332
href: api-management-howto-migrate.md
333333
- name: Use managed identities for Azure resources
334334
href: api-management-howto-use-managed-service-identity.md
335+
- name: Self-hosted gateway
336+
items:
337+
- name: Guidance for running on Kubernetes
338+
href: how-to-self-hosted-gateway-on-kubernetes-in-production.md
335339
- name: Integrate with Service Fabric
336340
items:
337341
- name: Service Fabric integration overview

articles/api-management/how-to-deploy-self-hosted-gateway-azure-arc.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -110,3 +110,4 @@ To enable monitoring of the self-hosted gateway, configure the following Log Ana
110110
* To learn more about the self-hosted gateway, see [Azure API Management self-hosted gateway overview](self-hosted-gateway-overview.md).
111111
* Discover all [Azure Arc-enabled Kubernetes extensions](../azure-arc/kubernetes/extensions.md).
112112
* Learn more about [Azure Arc-enabled Kubernetes](../azure-arc/kubernetes/overview.md).
113+
* Learn more about guidance to [run the self-hosted gateway on Kubernetes in production](how-to-self-hosted-gateway-on-kubernetes-in-production.md).

articles/api-management/how-to-deploy-self-hosted-gateway-azure-kubernetes-service.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -68,6 +68,7 @@ This article provides the steps for deploying self-hosted gateway component of A
6868

6969
* To learn more about the self-hosted gateway, see [Azure API Management self-hosted gateway overview](self-hosted-gateway-overview.md).
7070
* Learn [how to deploy API Management self-hosted gateway to Azure Arc-enabled Kubernetes clusters](how-to-deploy-self-hosted-gateway-azure-arc.md).
71+
* Learn more about guidance to [run the self-hosted gateway on Kubernetes in production](how-to-self-hosted-gateway-on-kubernetes-in-production.md).
7172
* Learn more about [Azure Kubernetes Service](../aks/intro-kubernetes.md).
7273
* Learn [how to configure and persist logs in the cloud](how-to-configure-cloud-metrics-logs.md).
7374
* Learn [how to configure and persist logs locally](how-to-configure-local-metrics-logs.md).

articles/api-management/how-to-deploy-self-hosted-gateway-kubernetes.md

Lines changed: 1 addition & 67 deletions
Original file line numberDiff line numberDiff line change
@@ -58,74 +58,8 @@ This article describes the steps for deploying the self-hosted gateway component
5858
> Run the `kubectl logs deployment/<gateway-name>` command to view logs from a randomly selected pod if there's more than one.
5959
> Run `kubectl logs -h` for a complete set of command options, such as how to view logs for a specific pod or container.
6060

61-
## Production deployment considerations
62-
63-
### Access token
64-
Without a valid access token, a self-hosted gateway can't access and download configuration data from the endpoint of the associated API Management service. The access token can be valid for a maximum of 30 days. It must be regenerated, and the cluster configured with a fresh token, either manually or via automation before it expires.
65-
66-
When you're automating token refresh, use [this management API operation](/rest/api/apimanagement/current-ga/gateway/generate-token) to generate a new token. For information on managing Kubernetes secrets, see the [Kubernetes website](https://kubernetes.io/docs/concepts/configuration/secret).
67-
68-
### Namespace
69-
Kubernetes [namespaces](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/) help with dividing a single cluster among multiple teams, projects, or applications. Namespaces provide a scope for resources and names. They can be associated with a resource quota and access control policies.
70-
71-
The Azure portal provides commands to create self-hosted gateway resources in the **default** namespace. This namespace is automatically created, exists in every cluster, and can't be deleted.
72-
Consider [creating and deploying](https://www.kubernetesbyexample.com/) a self-hosted gateway into a separate namespace in production.
73-
74-
### Number of replicas
75-
The minimum number of replicas suitable for production is two.
76-
77-
By default, a self-hosted gateway is deployed with a **RollingUpdate** deployment [strategy](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy). Review the default values and consider explicitly setting the [maxUnavailable](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-unavailable) and [maxSurge](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-surge) fields, especially when you're using a high replica count.
78-
79-
### Container resources
80-
By default, the YAML file provided in the Azure portal doesn't specify container resource requests.
81-
82-
It's impossible to reliably predict and recommend the amount of per-container CPU and memory resources and the number of replicas required for supporting a specific workload. Many factors are at play, such as:
83-
84-
- Specific hardware that the cluster is running on.
85-
- Presence and type of virtualization.
86-
- Number and rate of concurrent client connections.
87-
- Request rate.
88-
- Kind and number of configured policies.
89-
- Payload size and whether payloads are buffered or streamed.
90-
- Backend service latency.
91-
92-
We recommend setting resource requests to two cores and 2 GiB as a starting point. Perform a load test and scale up/out or down/in based on the results.
93-
94-
### Container image tag
95-
The YAML file provided in the Azure portal uses the **latest** tag. This tag always references the most recent version of the self-hosted gateway container image.
96-
97-
Consider using a specific version tag in production to avoid unintentional upgrade to a newer version.
98-
99-
You can [download a full list of available tags](https://mcr.microsoft.com/v2/azure-api-management/gateway/tags/list).
100-
101-
### DNS policy
102-
DNS name resolution plays a critical role in a self-hosted gateway's ability to connect to dependencies in Azure and dispatch API calls to backend services.
103-
104-
The YAML file provided in the Azure portal applies the default [ClusterFirst](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy) policy. This policy causes name resolution requests not resolved by the cluster DNS to be forwarded to the upstream DNS server that's inherited from the node.
105-
106-
To learn about name resolution in Kubernetes, see the [Kubernetes website](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service). Consider customizing [DNS policy](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy) or [DNS configuration](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-config) as appropriate for your setup.
107-
108-
### External traffic policy
109-
The YAML file provided in the Azure portal sets `externalTrafficPolicy` field on the [Service](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#service-v1-core) object to `Local`. This preserves caller IP address (accessible in the [request context](api-management-policy-expressions.md#ContextVariables)) and disables cross node load balancing, eliminating network hops caused by it. Be aware, that this setting might cause asymmetric distribution of traffic in deployments with unequal number of gateway pods per node.
110-
111-
### Custom domain names and SSL certificates
112-
113-
If you use custom domain names for the API Management endpoints, especially if you use a custom domain name for the Management endpoint, you might need to update the value of `config.service.endpoint` in the **\<gateway-name\>.yaml** file to replace the default domain name with the custom domain name. Make sure that the Management endpoint can be accessed from the pod of the self-hosted gateway in the Kubernetes cluster.
114-
115-
In this scenario, if the SSL certificate that's used by the Management endpoint isn't signed by a well-known CA certificate, you must make sure that the CA certificate is trusted by the pod of the self-hosted gateway.
116-
117-
### Configuration backup
118-
To learn about self-hosted gateway behavior in the presence of a temporary Azure connectivity outage, see [Self-hosted gateway overview](self-hosted-gateway-overview.md#connectivity-to-azure).
119-
120-
Configure a local storage volume for the self-hosted gateway container, so it can persist a backup copy of the latest downloaded configuration. If connectivity is down, the storage volume can use the backup copy upon restart. The volume mount path must be <code>/apim/config</code>. See an example on [GitHub](https://github.com/Azure/api-management-self-hosted-gateway/blob/master/examples/self-hosted-gateway-with-configuration-backup.yaml).
121-
To learn about storage in Kubernetes, see the [Kubernetes website](https://kubernetes.io/docs/concepts/storage/volumes/).
122-
123-
### Local logs and metrics
124-
The self-hosted gateway sends telemetry to [Azure Monitor](api-management-howto-use-azure-monitor.md) and [Azure Application Insights](api-management-howto-app-insights.md) according to configuration settings in the associated API Management service.
125-
When [connectivity to Azure](self-hosted-gateway-overview.md#connectivity-to-azure) is temporarily lost, the flow of telemetry to Azure is interrupted and the data is lost for the duration of the outage.
126-
Consider [setting up local monitoring](how-to-configure-local-metrics-logs.md) to ensure the ability to observe API traffic and prevent telemetry loss during Azure connectivity outages.
127-
12861
## Next steps
12962

13063
* To learn more about the self-hosted gateway, see [Self-hosted gateway overview](self-hosted-gateway-overview.md).
13164
* Learn [how to deploy API Management self-hosted gateway to Azure Arc-enabled Kubernetes clusters](how-to-deploy-self-hosted-gateway-azure-arc.md).
65+
* Learn more about guidance for [running the self-hosted gateway on Kubernetes in production](how-to-self-hosted-gateway-on-kubernetes-in-production.md).
Lines changed: 120 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,120 @@
1+
---
2+
title: Self-hosted gateway on Kubernetes in production | Azure API Management
3+
description: Learn about guidance to run an API Management self-hosted gateway on Kubernetes for production workloads
4+
author: tomkerkhove
5+
manager: mrcarlosdev
6+
ms.service: api-management
7+
ms.workload: mobile
8+
ms.topic: article
9+
ms.author: tomkerkhove
10+
ms.date: 12/17/2021
11+
---
12+
13+
# Guidance for running self-hosted gateway on Kubernetes in production
14+
15+
In order to run the self-hosted gateway in production, there are various aspects to take in to mind. For example, it should be deployed in a highly-available manner, use configuration backups to handle temporary disconnects and many more.
16+
17+
This article provides guidance on how to run [self-hosted gateway](./self-hosted-gateway-overview.md) on Kubernetes for production workloads to ensure that it will run smoothly and reliably.
18+
19+
## Access token
20+
Without a valid access token, a self-hosted gateway can't access and download configuration data from the endpoint of the associated API Management service. The access token can be valid for a maximum of 30 days. It must be regenerated, and the cluster configured with a fresh token, either manually or via automation before it expires.
21+
22+
When you're automating token refresh, use [this management API operation](/rest/api/apimanagement/current-ga/gateway/generate-token) to generate a new token. For information on managing Kubernetes secrets, see the [Kubernetes website](https://kubernetes.io/docs/concepts/configuration/secret).
23+
24+
## Namespace
25+
Kubernetes [namespaces](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/) help with dividing a single cluster among multiple teams, projects, or applications. Namespaces provide a scope for resources and names. They can be associated with a resource quota and access control policies.
26+
27+
The Azure portal provides commands to create self-hosted gateway resources in the **default** namespace. This namespace is automatically created, exists in every cluster, and can't be deleted.
28+
Consider [creating and deploying](https://www.kubernetesbyexample.com/) a self-hosted gateway into a separate namespace in production.
29+
30+
## Number of replicas
31+
The minimum number of replicas suitable for production is two.
32+
33+
By default, a self-hosted gateway is deployed with a **RollingUpdate** deployment [strategy](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy). Review the default values and consider explicitly setting the [maxUnavailable](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-unavailable) and [maxSurge](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-surge) fields, especially when you're using a high replica count.
34+
35+
## Container resources
36+
By default, the YAML file provided in the Azure portal doesn't specify container resource requests.
37+
38+
It's impossible to reliably predict and recommend the amount of per-container CPU and memory resources and the number of replicas required for supporting a specific workload. Many factors are at play, such as:
39+
40+
- Specific hardware that the cluster is running on.
41+
- Presence and type of virtualization.
42+
- Number and rate of concurrent client connections.
43+
- Request rate.
44+
- Kind and number of configured policies.
45+
- Payload size and whether payloads are buffered or streamed.
46+
- Backend service latency.
47+
48+
We recommend setting resource requests to two cores and 2 GiB as a starting point. Perform a load test and scale up/out or down/in based on the results.
49+
50+
## Container image tag
51+
The YAML file provided in the Azure portal uses the **latest** tag. This tag always references the most recent version of the self-hosted gateway container image.
52+
53+
Consider using a specific version tag in production to avoid unintentional upgrade to a newer version.
54+
55+
You can [download a full list of available tags](https://mcr.microsoft.com/v2/azure-api-management/gateway/tags/list).
56+
57+
> [!TIP]
58+
> When installing with Helm, image tagging is optimized for you. The Helm chart's application version pins the gateway to a given version and does not rely on `latest`.
59+
>
60+
> Learn more on how to [install an API Management self-hosted gateway on Kubernetes with Helm](how-to-deploy-self-hosted-gateway-kubernetes-helm.md).
61+
62+
## DNS policy
63+
DNS name resolution plays a critical role in a self-hosted gateway's ability to connect to dependencies in Azure and dispatch API calls to backend services.
64+
65+
The YAML file provided in the Azure portal applies the default [ClusterFirst](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy) policy. This policy causes name resolution requests not resolved by the cluster DNS to be forwarded to the upstream DNS server that's inherited from the node.
66+
67+
To learn about name resolution in Kubernetes, see the [Kubernetes website](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service). Consider customizing [DNS policy](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy) or [DNS configuration](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-config) as appropriate for your setup.
68+
69+
## External traffic policy
70+
The YAML file provided in the Azure portal sets `externalTrafficPolicy` field on the [Service](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#service-v1-core) object to `Local`. This preserves caller IP address (accessible in the [request context](api-management-policy-expressions.md#ContextVariables)) and disables cross node load balancing, eliminating network hops caused by it. Be aware, that this setting might cause asymmetric distribution of traffic in deployments with unequal number of gateway pods per node.
71+
72+
## Custom domain names and SSL certificates
73+
74+
If you use custom domain names for the API Management endpoints, especially if you use a custom domain name for the Management endpoint, you might need to update the value of `config.service.endpoint` in the **\<gateway-name\>.yaml** file to replace the default domain name with the custom domain name. Make sure that the Management endpoint can be accessed from the pod of the self-hosted gateway in the Kubernetes cluster.
75+
76+
In this scenario, if the SSL certificate that's used by the Management endpoint isn't signed by a well-known CA certificate, you must make sure that the CA certificate is trusted by the pod of the self-hosted gateway.
77+
78+
## Configuration backup
79+
80+
Configure a local storage volume for the self-hosted gateway container, so it can persist a backup copy of the latest downloaded configuration. If connectivity is down, the storage volume can use the backup copy upon restart. The volume mount path must be <code>/apim/config</code>. See an example on [GitHub](https://github.com/Azure/api-management-self-hosted-gateway/blob/master/examples/self-hosted-gateway-with-configuration-backup.yaml).
81+
To learn about storage in Kubernetes, see the [Kubernetes website](https://kubernetes.io/docs/concepts/storage/volumes/).
82+
83+
> [!NOTE]
84+
> To learn about self-hosted gateway behavior in the presence of a temporary Azure connectivity outage, see [Self-hosted gateway overview](self-hosted-gateway-overview.md#connectivity-to-azure).
85+
86+
## Local logs and metrics
87+
The self-hosted gateway sends telemetry to [Azure Monitor](api-management-howto-use-azure-monitor.md) and [Azure Application Insights](api-management-howto-app-insights.md) according to configuration settings in the associated API Management service.
88+
When [connectivity to Azure](self-hosted-gateway-overview.md#connectivity-to-azure) is temporarily lost, the flow of telemetry to Azure is interrupted and the data is lost for the duration of the outage.
89+
Consider [setting up local monitoring](how-to-configure-local-metrics-logs.md) to ensure the ability to observe API traffic and prevent telemetry loss during Azure connectivity outages.
90+
91+
## High availability
92+
The self-hosted gateway is a crucial component in the infrastructure and has to be highly available. However, failure will and can happen.
93+
94+
Consider protecting the self-hosted gateway against [disruption](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/).
95+
96+
> [!TIP]
97+
> When installing with Helm, easily enable high available scheduling by enabling the `highAvailability.enabled` configuration option.
98+
>
99+
> Learn more on how to [install an API Management self-hosted gateway on Kubernetes with Helm](how-to-deploy-self-hosted-gateway-kubernetes-helm.md).
100+
101+
### Protecting against node failure
102+
To prevent being affected due to data center or node failures, consider using a Kubernetes cluster that uses availability zones to achieve high availability on the node-level.
103+
104+
Availability zones allow you to schedule the self-hosted gateway's pod on nodes spread across the zones by using:
105+
- [Pod Topology Spread Constraints](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/) (Recommended - Kubernetes v1.19+)
106+
- [Pod Anti-Affinity](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/)
107+
108+
> [!Note]
109+
> If you are using Azure Kubernetes Service, learn how to use availability zones in [this article](./../aks/availability-zones.md).
110+
111+
### Protecting against pod disruption
112+
113+
Pods can experience disruption due to [various](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#voluntary-and-involuntary-disruptions) reasons such as manual pod deletion, node maintenance, etc.
114+
115+
Consider using [Pod Disruption Budgets](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#pod-disruption-budgets) to enforce a minimum number of pods to be available at any given time.
116+
117+
## Next steps
118+
119+
* To learn more about the self-hosted gateway, see [Self-hosted gateway overview](self-hosted-gateway-overview.md).
120+
* Learn [how to deploy API Management self-hosted gateway to Azure Arc-enabled Kubernetes clusters](how-to-deploy-self-hosted-gateway-azure-arc.md).

0 commit comments

Comments
 (0)