---
title: Deployment and cluster reliability best practices for Azure Kubernetes Service (AKS)
titleSuffix: Azure Kubernetes Service
description: Learn the best practices for deployment and cluster reliability for Azure Kubernetes Service (AKS) workloads.
ms.topic: conceptual
ms.date: 01/31/2024
---

# Deployment and cluster reliability best practices for Azure Kubernetes Service (AKS)

## Deployment level best practices

### Pod Disruption Budgets (PDBs)

> **Best practice guidance**
>
> Use Pod Disruption Budgets (PDBs) to ensure that a minimum number of pods remain available during *voluntary disruptions*, such as upgrade operations or accidental pod deletions.

[Pod Disruption Budgets (PDBs)](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/#pod-disruption-budgets) allow you to define how deployments or replica sets respond during voluntary disruptions, such as upgrade operations or accidental pod deletions. Using PDBs, you can define a minimum number of available pods or a maximum number of unavailable pods.

For example, let's say you need to perform a cluster upgrade and already have a PDB defined. Before performing the cluster upgrade, the Kubernetes scheduler ensures that the minimum number of pods defined in the PDB are available. If the upgrade would cause the number of available pods to fall below the minimum defined in the PDB, the scheduler schedules extra pods on other nodes before allowing the upgrade to proceed.

In the following example PDB definition file, the `minAvailable` field sets the minimum number of pods that must remain available during voluntary disruptions:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: mypdb
spec:
  minAvailable: 3 # Minimum number of pods that must remain available
  selector:
    matchLabels:
      app: myapp
```

For more information, see [Plan for availability using PDBs](./operator-best-practices-scheduler.md#plan-for-availability-using-pod-disruption-budgets) and [Specifying a Disruption Budget for your Application](https://kubernetes.io/docs/tasks/run-application/configure-pdb/).

### Pod CPU and memory limits

> **Best practice guidance**
>
> Set pod CPU and memory limits for all pods to ensure that pods don't consume all resources on a node and to provide protection during service threats, such as DDoS attacks.

Pod CPU and memory limits define the maximum amount of CPU and memory a pod can use. When a pod exceeds its defined limits, it gets marked for removal. For more information, see [CPU resource units in Kubernetes](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu) and [Memory resource units in Kubernetes](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#meaning-of-memory).

Setting CPU and memory limits helps you maintain node health and minimizes impact to other pods on the node. Avoid setting a pod limit higher than your nodes can support. Each AKS node reserves a set amount of CPU and memory for the core Kubernetes components. If you set a pod limit higher than the node can support, your application might try to consume too many resources and negatively impact other pods on the node. Cluster administrators can set resource quotas on a namespace to require that all pods in that namespace define resource requests and limits. For more information, see [Enforce resource quotas in AKS](./operator-best-practices-scheduler.md#enforce-resource-quotas).

In the following example pod definition file, the `resources` section sets the CPU and memory limits for the pod:

```yaml
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
  - name: mypod
    image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 250m
        memory: 256Mi
```

For more information, see [Assign CPU Resources to Containers and Pods](https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/) and [Assign Memory Resources to Containers and Pods](https://kubernetes.io/docs/tasks/configure-pod-container/assign-memory-resource/).

### Pod anti-affinity

> **Best practice guidance**
>
> Use pod anti-affinity to ensure that pods are spread across nodes for node-down scenarios.

You can use the `nodeSelector` field in your pod specification to specify the node labels you want the target node to have. Kubernetes only schedules the pod onto nodes that have the specified labels. Anti-affinity expands the types of constraints you can define and gives you more control over the selection logic. Anti-affinity allows you to constrain pods against labels on other pods. For more information, see [Affinity and anti-affinity in Kubernetes](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity).
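
The following sketch shows one way to apply this guidance. The deployment name and `app: myapp` label are placeholders for illustration: the pod template uses `podAntiAffinity` with the `kubernetes.io/hostname` topology key so that replicas carrying the same label aren't scheduled onto the same node.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: myapp
            topologyKey: kubernetes.io/hostname # Schedule each replica on a different node
      containers:
      - name: myapp
        image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
```

If you'd rather the spread be a soft preference, for example when the cluster might have fewer nodes than replicas, use `preferredDuringSchedulingIgnoredDuringExecution` instead of the `required` form.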

### Pod anti-affinity across availability zones

> **Best practice guidance**
>
> Use pod anti-affinity across availability zones to ensure that pods are spread across availability zones for zone-down scenarios.

When you deploy your application across multiple availability zones, you can use pod anti-affinity to ensure that pods are spread across availability zones. This practice helps ensure that your application remains available in the event of a zone-down scenario. For more information, see [Best practices for multiple zones](https://kubernetes.io/docs/setup/best-practices/multiple-zones/) and [Overview of availability zones for AKS clusters](./availability-zones.md#overview-of-availability-zones-for-aks-clusters).
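
As a minimal sketch building on the previous example (the `app: myapp` label is again a placeholder), the `affinity` section in the pod template's `spec` can target the `topology.kubernetes.io/zone` topology key. The `preferred` form keeps pods schedulable even when there are fewer zones than replicas:

```yaml
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: myapp
        topologyKey: topology.kubernetes.io/zone # Prefer spreading replicas across zones
```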

### Readiness and liveness probes

> **Best practice guidance**
>
> Configure readiness and liveness probes to improve resiliency under high load and to reduce unnecessary container restarts.

#### Readiness probes

In Kubernetes, the kubelet uses readiness probes to know when a container is ready to start accepting traffic. A pod is considered *ready* when all of its containers are ready. When a pod is *not ready*, it's removed from service load balancers. For more information, see [Readiness Probes in Kubernetes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-readiness-probes).

For containerized applications that serve traffic, you should verify that your container is ready to handle incoming requests. [Azure Container Instances](../container-instances/container-instances-overview.md) supports readiness probes, which let you configure conditions under which your container can't be accessed.

The following example YAML snippet shows a readiness probe configuration:

```yaml
readinessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy # Probe succeeds if this file exists in the container
  initialDelaySeconds: 5 # Wait 5 seconds after the container starts before probing
  periodSeconds: 5 # Probe every 5 seconds
```

For more information, see [Configure readiness probes](../container-instances/container-instances-readiness-probe.md).

#### Liveness probes

In Kubernetes, the kubelet uses liveness probes to know when to restart a container. If a container fails its liveness probe, the container is restarted. For more information, see [Liveness Probes in Kubernetes](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/).

Containerized applications that run for extended periods of time can end up in broken states that can only be repaired by restarting the container. [Azure Container Instances](../container-instances/container-instances-overview.md) supports liveness probes, which let you configure conditions under which your container is restarted.

The following example YAML snippet shows a liveness probe configuration:

```yaml
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy # Probe fails and the container restarts if this file is missing
```

For more information, see [Configure liveness probes](../container-instances/container-instances-liveness-probe.md).

### Pre-stop hooks

> **Best practice guidance**
>
> Use pre-stop hooks to ensure graceful termination of your containers during pod shutdown.

A `PreStop` hook is called immediately before a container is terminated due to an API request or management event, such as a liveness probe failure. The pod's termination grace period countdown begins before the `PreStop` hook is executed, so the container eventually terminates within the termination grace period. For more information, see [Container lifecycle hooks](https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#container-hooks) and [Termination of Pods](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination).
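
The following is a minimal sketch of a `preStop` exec hook; the pod name, the 10-second drain, and the 60-second grace period are illustrative values, not prescribed ones. The hook delays shutdown so in-flight requests have time to drain before the container receives the termination signal:

```yaml
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  terminationGracePeriodSeconds: 60 # Total time allowed for the PreStop hook plus container shutdown
  containers:
  - name: mypod
    image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 10"] # Give in-flight requests time to drain
```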

### Multi-replica applications

> **Best practice guidance**
>
> Deploy at least two replicas of your application to ensure high availability and resiliency in node-down scenarios.

When you create an application in AKS and choose an Azure region during resource creation, it's a single-region app. In the event of a disaster that causes the region to become unavailable, your application also becomes unavailable. If you create an identical deployment in a secondary Azure region, your application becomes less susceptible to a single-region disaster, and any data replication across the regions lets you recover your last application state.
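
Within a single cluster, the `replicas` field on a Deployment controls how many copies of your pods run. The following is a minimal sketch in which the name, label, and replica count are placeholders:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3 # Keep more than one replica so the app survives a single node failure
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
```

Combining multiple replicas with the pod anti-affinity shown earlier helps ensure the replicas land on different nodes.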

For more information, see [Recommended active-active high availability solution overview for AKS](./active-active-solution.md) and [Running Multiple Instances of your Application](https://kubernetes.io/docs/tutorials/kubernetes-basics/scale/scale-intro/).

## Cluster level best practices

### Availability zones

Use at least two availability zones for your cluster to maintain resiliency in zone-down scenarios.

For more information, see [Running in multiple zones](https://kubernetes.io/docs/setup/best-practices/multiple-zones/).

### Premium Disks

Use Premium disks to achieve 99.9% availability on a single VM.

For more information, see [Use Premium SSD v2 disks on AKS](https://learn.microsoft.com/en-us/azure/aks/use-premium-v2-disks).

### Application dependencies

Make sure that application dependencies, such as databases, are also availability zone resilient. Dependencies that aren't zone resilient can make your application unavailable even when the cluster itself survives a zone outage.

### Auto-scale imbalance

When you use autoscaling with availability zones, use one node pool in each zone so that scaling stays balanced across zones. You can also use topology spread constraints to keep pods evenly distributed.

For more information, see [Pod Topology Spread Constraints](https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/) and [Horizontal Pod Autoscaling](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/).
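
As a minimal sketch (the `app: myapp` label is a placeholder), a topology spread constraint in the pod template's `spec` keeps replicas balanced across zones:

```yaml
topologySpreadConstraints:
- maxSkew: 1 # Allow at most a one-pod difference between zones
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: ScheduleAnyway # Treat the spread as a preference, not a hard requirement
  labelSelector:
    matchLabels:
      app: myapp
```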

### Image versions

Don't use the `latest` tag for container images. Pin images to a specific version so that deployments are predictable and rollbacks are possible.

For more information, see [Images in Kubernetes](https://kubernetes.io/docs/concepts/containers/images/).
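
For example (the container name is a placeholder), reference an explicit image tag in the pod spec instead of `:latest`:

```yaml
containers:
- name: myapp
  # Pin an explicit, immutable tag instead of relying on :latest
  image: mcr.microsoft.com/oss/nginx/nginx:1.15.5-alpine
```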

### Standard tier for production

Use the Standard pricing tier for production workloads.

For more information, see [Free and Standard pricing tiers for AKS cluster management](https://learn.microsoft.com/en-us/azure/aks/free-standard-pricing-tiers).

### maxUnavailable

Set `maxUnavailable` on your deployments to control how many pods can be unavailable during a rolling upgrade, so that a minimum number of pods remains available.

For more information, see [Max Unavailable](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#max-unavailable).
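
As a minimal sketch (the replica count and values are illustrative), the rolling update strategy in a Deployment spec might look like this:

```yaml
spec:
  replicas: 4
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1 # At most one pod can be unavailable during the rollout
      maxSurge: 1 # At most one extra pod can be created above the desired count
```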

### Accelerated Networking

Use Accelerated Networking to get lower latency, reduced jitter, and decreased CPU utilization on your VMs.

For more information, see [Accelerated Networking overview](https://learn.microsoft.com/en-us/azure/virtual-network/accelerated-networking-overview?tabs=redhat).

### Standard Load Balancer

Use the Standard Load Balancer SKU, which supports multiple availability zones, HTTP probes, and deployments across multiple data centers.

For more information, see [Use a standard load balancer in AKS](https://learn.microsoft.com/en-us/azure/aks/load-balancer-standard).

### Dynamic IP for Azure CNI

If you use Azure CNI, enable dynamic IP allocation to help prevent IP exhaustion in your AKS clusters.

For more information, see [Configure Azure CNI networking for dynamic allocation of IPs](https://learn.microsoft.com/en-us/azure/aks/configure-azure-cni-dynamic-ip-allocation).

### Container insights

Use Container insights, Prometheus, or other monitoring tools to track cluster performance.

For more information, see [Enable monitoring for Kubernetes clusters](https://learn.microsoft.com/en-us/azure/azure-monitor/containers/kubernetes-monitoring-enable?tabs=cli).

### Scale-down mode

Use scale-down mode to control whether nodes are deleted or deallocated when scaling down.

For more information, see [Use Scale-down Mode to delete or deallocate nodes in AKS](https://learn.microsoft.com/en-us/azure/aks/scale-down-mode).

### Azure policies

Use Azure Policy to enforce compliance across your clusters.

For more information, see [Secure your AKS clusters with Azure Policy](https://learn.microsoft.com/en-us/azure/aks/use-azure-policy).

### System node pools

#### Do not use taints

Don't add taints to system node pools.

For more information, see [Manage system node pools in AKS](https://learn.microsoft.com/en-us/azure/aks/use-system-pools?tabs=azure-cli).

#### Autoscaler for system node pools

Use the cluster autoscaler for system node pools so they can scale to meet demand.

For more information, see [Manage system node pools in AKS](https://learn.microsoft.com/en-us/azure/aks/use-system-pools?tabs=azure-cli) and [Kubernetes Event-driven Autoscaling (KEDA) add-on](https://learn.microsoft.com/en-us/azure/aks/keda-about).

#### At least two nodes in system node pools

Use at least two nodes in system node pools to ensure resiliency in node-down scenarios.

For more information, see [Manage system node pools in AKS](https://learn.microsoft.com/en-us/azure/aks/use-system-pools?tabs=azure-cli).

### Container images

Only use allowed container images in your clusters.

For more information, see [Best practices for container image management in AKS](https://learn.microsoft.com/en-us/azure/aks/operator-best-practices-container-image-management) and [Use Image Integrity to validate signed images](https://learn.microsoft.com/en-us/azure/aks/image-integrity?tabs=azure-cli).

### Image pulls

Don't allow unauthenticated image pulls.

For more information, see [Artifact Streaming on AKS](https://learn.microsoft.com/en-us/azure/aks/artifact-streaming).

### v5 SKU VMs

Use v4 or v5 VM SKUs, which offer better reliability and less impact from updates.

For more information, see [Best practices for performance and scaling for small to medium workloads](https://learn.microsoft.com/en-us/azure/aks/best-practices-performance-scale), [Best practices for performance and scaling for large workloads](https://learn.microsoft.com/en-us/azure/aks/best-practices-performance-scale-large), and [Best practices for running AKS at scale](https://learn.microsoft.com/en-us/azure/aks/operator-best-practices-run-at-scale).

#### Do not use B series VMs

Don't use B series VMs. They're low performance and don't work well with AKS.