articles/aks/availability-zones.md
Lines changed: 18 additions & 19 deletions
@@ -4,15 +4,15 @@ description: Learn how to create a cluster that distributes nodes across availab
4
4
services: container-service
5
5
ms.custom: fasttrack-edit
6
6
ms.topic: article
7
-
ms.date: 06/24/2019
7
+
ms.date: 02/27/2020
8
8
9
9
---
10
10
11
11
# Create an Azure Kubernetes Service (AKS) cluster that uses availability zones
12
12
13
-
An Azure Kubernetes Service (AKS) cluster distributes resources such as the nodes and storage across logical sections of the underlying Azure compute infrastructure. This deployment model makes sure that the nodes run across separate update and fault domains in a single Azure datacenter. AKS clusters deployed with this default behavior provide a high level of availability to protect against a hardware failure or planned maintenance event.
13
+
An Azure Kubernetes Service (AKS) cluster distributes resources such as nodes and storage across logical sections of the underlying Azure infrastructure. When availability zones are used, this deployment model ensures that nodes in a given availability zone are physically separated from nodes in another availability zone. AKS clusters deployed with multiple availability zones configured across a cluster provide a higher level of availability to protect against a hardware failure or a planned maintenance event.
14
14
15
-
To provide a higher level of availability to your applications, AKS clusters can be distributed across availability zones. These zones are physically separate datacenters within a given region. When the cluster components are distributed across multiple zones, your AKS cluster is able to tolerate a failure in one of those zones. Your applications and management operations continue to be available even if one entire datacenter has a problem.
15
+
By defining node pools in a cluster to span multiple zones, nodes in a given node pool can continue operating even if a single zone goes down. Your applications can remain available through a physical failure in a single datacenter, provided they are orchestrated to tolerate the failure of a subset of nodes.
16
16
17
17
This article shows you how to create an AKS cluster and distribute the node components across availability zones.
18
18
@@ -37,40 +37,36 @@ AKS clusters can currently be created using availability zones in the following
37
37
38
38
The following limitations apply when you create an AKS cluster using availability zones:
39
39
40
-
* You can only enable availability zones when the cluster is created.
40
+
* You can only define availability zones when the cluster or node pool is created.
41
41
* Availability zone settings can't be updated after the cluster is created. You also can't update an existing, non-availability zone cluster to use availability zones.
42
-
* You can't disable availability zones for an AKS cluster once it has been created.
43
-
* The node size (VM SKU) selected must be available across all availability zones.
44
-
* Clusters with availability zones enabled require use of Azure Standard Load Balancers for distribution across zones.
45
-
* You must use Kubernetes version 1.13.5 or greater in order to deploy Standard Load Balancers.
46
-
47
-
AKS clusters that use availability zones must use the Azure load balancer *standard* SKU, which is the default value for the load balancer type. This load balancer type can only be defined at cluster create time. For more information and the limitations of the standard load balancer, see [Azure load balancer standard SKU limitations][standard-lb-limitations].
42
+
* The node size (VM SKU) you select must be available in all of the selected availability zones.
43
+
* Clusters with availability zones enabled require use of Azure Standard Load Balancers for distribution across zones. This load balancer type can only be defined at cluster create time. For more information and the limitations of the standard load balancer, see [Azure load balancer standard SKU limitations][standard-lb-limitations].
48
44
49
45
### Azure disks limitations
50
46
51
-
Volumes that use Azure managed disks are currently not zonal resources. Pods rescheduled in a different zone from their original zone can't reattach their previous disk(s). It's recommended to run stateless workloads that don't require persistent storage that may come across zonal issues.
47
+
Volumes that use Azure managed disks are currently not zone-redundant resources. Volumes cannot be attached across zones and must be co-located in the same zone as the node hosting the target pod.
52
48
53
-
If you must run stateful workloads, use taints and tolerations in your pod specs to tell the Kubernetes scheduler to create pods in the same zone as your disks. Alternatively, use network-based storage such as Azure Files that can attach to pods as they're scheduled between zones.
49
+
If you must run stateful workloads, use node pool taints and tolerations in pod specs to group pod scheduling in the same zone as your disks. Alternatively, use network-based storage such as Azure Files that can attach to pods as they're scheduled between zones.
54
50
55
51
## Overview of availability zones for AKS clusters
56
52
57
-
Availability zones is a high-availability offering that protects your applications and data from datacenter failures. Zones are unique physical locations within an Azure region. Each zone is made up of one or more datacenters equipped with independent power, cooling, and networking. To ensure resiliency, there’s a minimum of three separate zones in all enabled regions. The physical separation of availability zones within a region protects applications and data from datacenter failures. Zone-redundant services replicate your applications and data across availability zones to protect from single-points-of-failure.
53
+
Availability zones are a high-availability offering that protects your applications and data from datacenter failures. Zones are unique physical locations within an Azure region. Each zone is made up of one or more datacenters equipped with independent power, cooling, and networking. To ensure resiliency, there's a minimum of three separate zones in all zone-enabled regions. The physical separation of availability zones within a region protects applications and data from datacenter failures.
58
54
59
55
For more information, see [What are availability zones in Azure?][az-overview].
60
56
61
57
AKS clusters that are deployed using availability zones can distribute nodes across multiple zones within a single region. For example, a cluster in the *East US 2* region can create nodes in all three availability zones in *East US 2*. This distribution of AKS cluster resources improves cluster availability as they're resilient to failure of a specific zone.
62
58
63
59

64
60
65
-
In a zone outage, the nodes can be rebalanced manually or using the cluster autoscaler. If a single zone becomes unavailable, your applications continue to run.
61
+
If a single zone becomes unavailable, your applications continue to run if the cluster is spread across multiple zones.
66
62
67
63
## Create an AKS cluster across availability zones
68
64
69
-
When you create a cluster using the [az aks create][az-aks-create] command, the `--zones` parameter defines which zones agent nodes are deployed into. The AKS control plane components for your cluster are also spread across zones in the highest available configuration when you define the `--zones` parameter at cluster creation time.
65
+
When you create a cluster using the [az aks create][az-aks-create] command, the `--zones` parameter defines which zones agent nodes are deployed into. The control plane components, such as etcd, are spread across three zones if you define the `--zones` parameter at cluster creation time. The specific zones that the control plane components are spread across are independent of the zones explicitly selected for the initial node pool.
70
66
71
-
If you don't define any zones for the default agent pool when you create an AKS cluster, the AKS control plane components for your cluster will not use availability zones. You can add additional node pools using the [az aks nodepool add][az-aks-nodepool-add] command and specify `--zones` for those new nodes, however the control plane components remain without availability zone awareness. You can't change the zone awareness for a node pool or the AKS control plane components once they're deployed.
67
+
If you don't define any zones for the default agent pool when you create an AKS cluster, control plane components are not guaranteed to spread across availability zones. You can add additional node pools using the [az aks nodepool add][az-aks-nodepool-add] command and specify `--zones` for new nodes, but it will not change how the control plane has been spread across zones. Availability zone settings can only be defined at cluster or node pool create-time.
72
68
73
-
The following example creates an AKS cluster named *myAKSCluster* in the resource group named *myResourceGroup*. A total of *3* nodes are created - one agent in zone *1*, one in *2*, and then one in *3*. The AKS control plane components are also distributed across zones in the highest available configuration since they're defined as part of the cluster create process.
69
+
The following example creates an AKS cluster named *myAKSCluster* in the resource group named *myResourceGroup*. A total of *3* nodes are created - one agent in zone *1*, one in *2*, and then one in *3*.
74
70
75
71
```azurecli-interactive
76
72
az group create --name myResourceGroup --location eastus2
@@ -87,6 +83,8 @@ az aks create \
87
83
88
84
It takes a few minutes to create the AKS cluster.
89
85
86
+
When deciding which zone a new node should belong to, an AKS node pool uses the [best effort zone balancing offered by the underlying Azure Virtual Machine Scale Sets][vmss-zone-balancing]. An AKS node pool is considered "balanced" if each zone has the same number of VMs, or +/- 1 VM, as all other zones in the scale set.
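
The balance rule can be illustrated with a few lines of Python. This is a minimal sketch, not AKS or scale set code: the function name `is_zone_balanced` and the dictionary of per-zone VM counts are assumptions made for illustration, and the sketch reads "+/- 1 VM" as "the largest and smallest zone counts differ by at most one".

```python
def is_zone_balanced(vms_per_zone):
    """Return True when the per-zone VM counts differ by at most one.

    vms_per_zone maps a zone name (hypothetical keys such as "1", "2", "3")
    to the number of VMs the node pool currently has in that zone.
    """
    counts = list(vms_per_zone.values())
    if not counts:
        return True  # an empty pool has nothing to be out of balance
    return max(counts) - min(counts) <= 1


if __name__ == "__main__":
    # Counts of 2, 2, 1 are within one VM of each other.
    print(is_zone_balanced({"1": 2, "2": 2, "3": 1}))
    # Counts of 3, 1, 1 are not.
    print(is_zone_balanced({"1": 3, "2": 1, "3": 1}))
```

Under this reading, a five-node pool spread 2/2/1 counts as balanced, while a 3/1/1 spread does not.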
87
+
90
88
## Verify node distribution across zones
91
89
92
90
When the cluster is ready, list the agent nodes in the scale set to see what availability zone they're deployed in.
As you can see, we now have two additional nodes in zones 1 and 2. You can deploy an application consisting of three replicas. We will use NGINX as example:
145
+
We now have two additional nodes in zones 1 and 2. You can deploy an application consisting of three replicas. We will use NGINX as an example:
148
146
149
147
```console
150
148
kubectl run nginx --image=nginx --replicas=3
151
149
```
152
150
153
-
If you verify that nodes where your pods are running, you will see that the pods are running on the pods corresponding to three different availability zones. For example with the command `kubectl describe pod | grep -e "^Name:" -e "^Node:"` you would get an output similar to this:
151
+
If you view the nodes where your pods are running, you see that the pods run on nodes in three different availability zones. For example, with the command `kubectl describe pod | grep -e "^Name:" -e "^Node:"` you would get an output similar to this:
154
152
155
153
```console
156
154
Name: nginx-6db489d4b7-ktdwg
@@ -182,6 +180,7 @@ This article detailed how to create an AKS cluster that uses availability zones.
articles/aks/troubleshooting.md
Lines changed: 5 additions & 33 deletions
@@ -130,10 +130,10 @@ Based on the output of the cluster status:
130
130
131
131
## I'm receiving errors that my service principal was not found when I try to create a new cluster without passing in an existing one.
132
132
133
-
When creating an AKS cluster it requires a service principal to create resources on your behalf. AKS offers the ability to have a new one created at cluster creation time, but this requires Azure Active Directory to fully propagate the new service principal in a reasonable time in order to have the cluster succeed in creation. When this propagation takes too long, the cluster will fail validation to create as it cannot find an available service principal to do so.
133
+
Creating an AKS cluster requires a service principal to create resources on your behalf. AKS can create a new one at cluster creation time, but this requires Azure Active Directory to propagate the new service principal within a reasonable time for cluster creation to succeed. If regional propagation exceeds timeout thresholds, cluster creation fails validation because AKS cannot find an available service principal.
134
134
135
135
Use the following workarounds for this:
136
-
1. Use an existing service principal which has already propagated across regions and exists to pass into AKS at cluster create time.
136
+
1. Use an existing service principal to pass to AKS at cluster create time.
137
137
2. If using automation scripts, add time delays between service principal creation and AKS cluster creation.
138
138
3. If using Azure portal, return to the cluster settings during create and retry the validation page after a few minutes.
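
The time-delay workaround for automation scripts can be sketched as a generic retry helper. This is only an illustration under stated assumptions: `retry_with_delay` and its parameters are hypothetical names, the attempt count and delay are arbitrary defaults rather than recommended values, and `operation` stands in for whatever step in your own script creates the cluster.

```python
import time


def retry_with_delay(operation, attempts=5, delay_seconds=30, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    operation is any callable that raises an exception on failure, for
    example a hypothetical wrapper around your cluster-creation step.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception as error:  # broad on purpose: this is only a sketch
            last_error = error
            if attempt < attempts:
                # Wait before retrying so the new service principal
                # has time to propagate.
                sleep(delay_seconds)
    raise last_error
```

The `sleep` parameter is injectable so the helper can be exercised without real waiting; in a real script you would leave it at the default.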
139
139
@@ -154,41 +154,14 @@ Verify that your settings are not conflicting with any of the required or option
154
154
| 1.14 | 1.14.2 or later |
155
155
156
156
157
-
### What versions of Kubernetes have Azure Disk support on the Sovereign Cloud?
157
+
### What versions of Kubernetes have Azure Disk support on the Sovereign Clouds?
158
158
159
159
| Kubernetes version | Recommended version |
160
160
| -- | :--: |
161
161
| 1.12 | 1.12.0 or later |
162
162
| 1.13 | 1.13.0 or later |
163
163
| 1.14 | 1.14.0 or later |
164
164
165
-
166
-
### WaitForAttach failed for Azure Disk: parsing "/dev/disk/azure/scsi1/lun1": invalid syntax
167
-
168
-
In Kubernetes version 1.10, MountVolume.WaitForAttach may fail with an the Azure Disk remount.
169
-
170
-
On Linux, you may see an incorrect DevicePath format error. For example:
171
-
172
-
```console
173
-
MountVolume.WaitForAttach failed for volume "pvc-f1562ecb-3e5f-11e8-ab6b-000d3af9f967" : azureDisk - Wait for attach expect device path as a lun number, instead got: /dev/disk/azure/scsi1/lun1 (strconv.Atoi: parsing "/dev/disk/azure/scsi1/lun1": invalid syntax)
174
-
Warning FailedMount 1m (x10 over 21m) kubelet, k8s-agentpool-66825246-0 Unable to mount volumes for pod
175
-
```
176
-
177
-
On Windows, you may see a wrong DevicePath(LUN) number error. For example:
178
-
179
-
```console
180
-
Warning FailedMount 1m kubelet, 15282k8s9010 MountVolume.WaitForAttach failed for volume "disk01" : azureDisk - WaitForAttach failed within timeout node (15282k8s9010) diskId:(andy-mghyb
This issue has been fixed in the following versions of Kubernetes:
185
-
186
-
| Kubernetes version | Fixed version |
187
-
| -- | :--: |
188
-
| 1.10 | 1.10.2 or later |
189
-
| 1.11 | 1.11.0 or later |
190
-
| 1.12 and later | N/A |
191
-
192
165
### Failure when setting uid and gid in mountOptions for Azure Disk
193
166
194
167
Azure Disk uses the ext4,xfs filesystem by default and mountOptions such as uid=x,gid=x can't be set at mount time. For example, if you tried to set mountOptions uid=999,gid=999, you would see an error like:
@@ -322,7 +295,6 @@ This issue has been fixed in the following versions of Kubernetes:
322
295
323
296
If you are using a version of Kubernetes that does not have the fix for this issue and your node VM has an obsolete disk list, you can mitigate the issue by detaching all non-existing disks from the VM as a single, bulk operation. **Individually detaching non-existing disks may fail.**
324
297
325
-
326
298
### Large number of Azure Disks causes slow attach/detach
327
299
328
300
When the number of Azure Disks attached to a node VM is larger than 10, attach and detach operations may be slow. This is a known issue and there are no workarounds at this time.
@@ -379,7 +351,7 @@ Recommended settings:
379
351
| 1.12.0 - 1.12.1 | 0755 |
380
352
| 1.12.2 and later | 0777 |
381
353
382
-
If using a cluster with Kubernetes version 1.8.5 or greater and dynamically creating the persistent volume with a storage class, mount options can be specified on the storage class object. The following example sets *0777*:
354
+
Mount options can be specified on the storage class object. The following example sets *0777*:
383
355
384
356
```yaml
385
357
kind: StorageClass
@@ -475,7 +447,7 @@ To update your Azure secret file, use `kubectl edit secret`. For example: