Commit 1998135

Merge pull request #209353 from schaffererin/upgrade-cluster-update
Updating fix to GitHub issue 95910 - Upgrade an AKS cluster
2 parents 5d00064 + 740092b commit 1998135

articles/aks/upgrade-cluster.md

Lines changed: 35 additions & 25 deletions
@@ -17,7 +17,7 @@ For AKS clusters that use multiple node pools or Windows Server nodes, see [Upgr

 ### [Azure CLI](#tab/azure-cli)

-This article requires that you are running the Azure CLI version 2.0.65 or later. Run `az --version` to find the version. If you need to install or upgrade, see [Install Azure CLI][azure-cli-install].
+This article requires that you're running the Azure CLI version 2.34.1 or later. Run `az --version` to find the version. If you need to install or upgrade, see [Install Azure CLI][azure-cli-install].

 ### [Azure PowerShell](#tab/azure-powershell)

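The version thresholds this commit introduces (a 2.34.1 minimum for the article, and `az upgrade` being available only from 2.11.0) can be read as a small decision rule. The following is a minimal sketch of that rule, not part of the commit; the function names and the returned strings are hypothetical and chosen only for illustration.

```python
# Thresholds taken from the text of the updated article.
MIN_ARTICLE_VERSION = (2, 34, 1)  # minimum Azure CLI version the article requires
AZ_UPGRADE_ADDED = (2, 11, 0)     # first Azure CLI version that has `az upgrade`


def parse_version(text):
    """Turn a dotted version string such as '2.0.81' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def cli_update_advice(installed):
    """Return a hypothetical label describing how to update the Azure CLI."""
    version = parse_version(installed)
    if version >= MIN_ARTICLE_VERSION:
        return "ok"
    if version >= AZ_UPGRADE_ADDED:
        return "run az upgrade"
    # Versions before 2.11.0 have no `az upgrade`, so a reinstall is needed.
    return "reinstall"


print(cli_update_advice("2.0.81"))
```

The CLI version 2.0.81 shown in the example error output added by this commit falls below 2.11.0, so under this sketch it would need a reinstall rather than `az upgrade`.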
@@ -40,7 +40,7 @@ az aks get-upgrades --resource-group myResourceGroup --name myAKSCluster --outpu

 > [!NOTE]
 > When you upgrade a supported AKS cluster, Kubernetes minor versions can't be skipped. All upgrades must be performed sequentially by major version number. For example, upgrades between *1.14.x* -> *1.15.x* or *1.15.x* -> *1.16.x* are allowed, however *1.14.x* -> *1.16.x* is not allowed.
->
+>
 > Skipping multiple versions can only be done when upgrading from an _unsupported version_ back to a _supported version_. For example, an upgrade from an unsupported *1.10.x* -> a supported *1.15.x* can be completed if available.

 The following example output shows that the cluster can be upgraded to versions *1.19.1* and *1.19.3*:
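The note in this hunk describes a rule that can be checked mechanically: a supported cluster can move at most one minor version at a time, while an unsupported cluster may skip several. The sketch below is one way to express that rule, assuming version strings of the form `major.minor.patch`. The function names are hypothetical, the rejection of downgrades is an assumption the article does not state, and the check says nothing about whether a target version is actually offered by `az aks get-upgrades`.

```python
def major_minor(version):
    """Extract (major, minor) as ints from a string such as '1.14.x' or '1.18.10'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def is_upgrade_allowed(current, target, current_is_supported=True):
    """Apply the minor-version rule from the note to a proposed upgrade."""
    cur = major_minor(current)
    tgt = major_minor(target)
    if tgt < cur:
        # Assumption: moving to an older minor version is not an upgrade.
        return False
    if not current_is_supported:
        # Skipping versions is only possible from an unsupported version.
        return True
    # Supported clusters: same major version, at most one minor version ahead.
    return tgt[0] == cur[0] and tgt[1] - cur[1] <= 1


print(is_upgrade_allowed("1.14.x", "1.15.x"))
print(is_upgrade_allowed("1.14.x", "1.16.x"))
print(is_upgrade_allowed("1.10.x", "1.15.x", current_is_supported=False))
```

The three calls mirror the three examples in the note: the first and third are allowed, the second is not.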
@@ -51,28 +51,37 @@ Name ResourceGroup MasterVersion Upgrades
 default myResourceGroup 1.18.10 1.19.1, 1.19.3
 ```

-The following output shows that no upgrades are available (or it may also be possible that cli is not upgraded):
+The following example output means that the appservice-kube extension isn't compatible with your Azure CLI version (a minimum of version 2.34.1 is required):
+
+```console
+The 'appservice-kube' extension is not compatible with this version of the CLI.
+You have CLI core version 2.0.81 and this extension requires a min of 2.34.1.
+Table output unavailable. Use the --query option to specify an appropriate query. Use --debug for more info.
+```
+
+If you receive this output, you need to update your Azure CLI version. The `az upgrade` command was added in version 2.11.0 and doesn't work with versions prior to 2.11.0. Older versions can be updated by reinstalling Azure CLI as described in [Install the Azure CLI](/cli/azure/install-azure-cli). If your Azure CLI version is 2.11.0 or later, you'll receive a message to run `az upgrade` to upgrade Azure CLI to the latest version.
+
+If your Azure CLI is updated and you receive the following example output, it means that no upgrades are available:

 ```console
 ERROR: Table output unavailable. Use the --query option to specify an appropriate query. Use --debug for more info.
 ```

-> [!IMPORTANT]
-> If no upgrade is available, create a new cluster with a supported version of Kubernetes and migrate your workloads from the existing cluster to the new cluster. Attempting to upgrade a cluster to a newer Kubernetes version when `az aks get-upgrades` shows no upgrades available is not supported.
+If no upgrades are available, create a new cluster with a supported version of Kubernetes and migrate your workloads from the existing cluster to the new cluster. It's not supported to upgrade a cluster to a newer Kubernetes version when `az aks get-upgrades` shows that no upgrades are available.

 ### [Azure PowerShell](#tab/azure-powershell)

 To check which Kubernetes releases are available for your cluster, use the [Get-AzAksUpgradeProfile][get-azaksupgradeprofile] command. The following example checks for available upgrades to *myAKSCluster* in *myResourceGroup*:

 ```azurepowershell-interactive
-Get-AzAksUpgradeProfile -ResourceGroupName myResourceGroup -ClusterName myAKSCluster |
+Get-AzAksUpgradeProfile -ResourceGroupName myResourceGroup -ClusterName myAKSCluster |
 Select-Object -Property Name, ControlPlaneProfileKubernetesVersion -ExpandProperty ControlPlaneProfileUpgrade |
 Format-Table -Property *
 ```

 > [!NOTE]
 > When you upgrade a supported AKS cluster, Kubernetes minor versions can't be skipped. All upgrades must be performed sequentially by major version number. For example, upgrades between *1.14.x* -> *1.15.x* or *1.15.x* -> *1.16.x* are allowed, however *1.14.x* -> *1.16.x* is not allowed.
->
+>
 > Skipping multiple versions can only be done when upgrading from an _unsupported version_ back to a _supported version_. For example, an upgrade from an unsupported *1.10.x* -> a supported *1.15.x* can be completed if available.

 The following example output shows that the cluster can be upgraded to versions *1.19.1* and *1.19.3*:
@@ -84,27 +93,26 @@ default 1.18.10 1.19.1
 default 1.18.10 1.19.3
 ```

-> [!IMPORTANT]
-> If no upgrade is available, create a new cluster with a supported version of Kubernetes and migrate your workloads from the existing cluster to the new cluster. Attempting to upgrade a cluster to a newer Kubernetes version when `Get-AzAksUpgradeProfile` shows no upgrades available is not supported.
+If no upgrade is available, create a new cluster with a supported version of Kubernetes and migrate your workloads from the existing cluster to the new cluster. It's not supported to upgrade a cluster to a newer Kubernetes version when `Get-AzAksUpgradeProfile` shows that no upgrades are available.

 ---

 ## Customize node surge upgrade

-> [!Important]
+> [!IMPORTANT]
 > Node surges require subscription quota for the requested max surge count for each upgrade operation. For example, a cluster that has 5 node pools, each with a count of 4 nodes, has a total of 20 nodes. If each node pool has a max surge value of 50%, additional compute and IP quota of 10 nodes (2 nodes * 5 pools) is required to complete the upgrade.
 >
 > If using Azure CNI, validate there are available IPs in the subnet as well to [satisfy IP requirements of Azure CNI](configure-azure-cni.md).

-By default, AKS configures upgrades to surge with one extra node. A default value of one for the max surge settings will enable AKS to minimize workload disruption by creating an extra node before the cordon/drain of existing applications to replace an older versioned node. The max surge value may be customized per node pool to enable a trade-off between upgrade speed and upgrade disruption. By increasing the max surge value, the upgrade process completes faster, but setting a large value for max surge may cause disruptions during the upgrade process.
+By default, AKS configures upgrades to surge with one extra node. A default value of one for the max surge settings will enable AKS to minimize workload disruption by creating an extra node before the cordon/drain of existing applications to replace an older versioned node. The max surge value may be customized per node pool to enable a trade-off between upgrade speed and upgrade disruption. By increasing the max surge value, the upgrade process completes faster, but setting a large value for max surge may cause disruptions during the upgrade process.

 For example, a max surge value of 100% provides the fastest possible upgrade process (doubling the node count) but also causes all nodes in the node pool to be drained simultaneously. You may wish to use a higher value such as this for testing environments. For production node pools, we recommend a max_surge setting of 33%.

 AKS accepts both integer values and a percentage value for max surge. An integer such as "5" indicates five extra nodes to surge. A value of "50%" indicates a surge value of half the current node count in the pool. Max surge percent values can be a minimum of 1% and a maximum of 100%. A percent value is rounded up to the nearest node count. If the max surge value is lower than the current node count at the time of upgrade, the current node count is used for the max surge value.

 During an upgrade, the max surge value can be a minimum of 1 and a maximum value equal to the number of nodes in your node pool. You can set larger values, but the maximum number of nodes used for max surge won't be higher than the number of nodes in the pool at the time of upgrade.

-> [!Important]
+> [!IMPORTANT]
 > The max surge setting on a node pool is persistent. Subsequent Kubernetes upgrades or node version upgrades will use this setting. You may change the max surge value for your node pools at any time. For production node pools, we recommend a max-surge setting of 33%.

 Use the following commands to set max surge values for new or existing node pools.
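The max surge arithmetic described in this hunk (integer or percent input, percent rounded up, result limited to between 1 and the pool's node count) can be worked through in a few lines. This is a rough sketch based on the "During an upgrade" paragraph, not AKS's actual implementation; the function name `surge_nodes` is hypothetical and the pool is assumed to have at least one node.

```python
def surge_nodes(max_surge, node_count):
    """Estimate how many surge nodes a pool uses for a given max surge setting.

    max_surge is a string such as "5" (node count) or "50%" (percent of pool).
    """
    if max_surge.endswith("%"):
        percent = int(max_surge[:-1])
        if not 1 <= percent <= 100:
            raise ValueError("max surge percent must be between 1% and 100%")
        # Round up to a whole node using integer arithmetic.
        requested = (node_count * percent + 99) // 100
    else:
        requested = int(max_surge)
    # At least 1 node, and never more than the nodes currently in the pool.
    return max(1, min(requested, node_count))


# Quota example from the note: 5 node pools of 4 nodes each, max surge 50%.
extra_quota = sum(surge_nodes("50%", 4) for _ in range(5))
print(extra_quota)  # 10
```

Under the same assumptions, the recommended production value of 33% on a 4-node pool rounds up to 2 surge nodes, and an integer setting of "5" on that pool is capped at 4.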
@@ -123,11 +131,12 @@ az aks nodepool update -n mynodepool -g MyResourceGroup --cluster-name MyManaged

 ### [Azure CLI](#tab/azure-cli)

-With a list of available versions for your AKS cluster, use the [az aks upgrade][az-aks-upgrade] command to upgrade. During the upgrade process, AKS will:
-- add a new buffer node (or as many nodes as configured in [max surge](#customize-node-surge-upgrade)) to the cluster that runs the specified Kubernetes version.
-- [cordon and drain][kubernetes-drain] one of the old nodes to minimize disruption to running applications (if you're using max surge, it will [cordon and drain][kubernetes-drain] as many nodes at the same time as the number of buffer nodes specified).
-- When the old node is fully drained, it will be reimaged to receive the new version and it will become the buffer node for the following node to be upgraded.
-- This process repeats until all nodes in the cluster have been upgraded.
+With a list of available versions for your AKS cluster, use the [az aks upgrade][az-aks-upgrade] command to upgrade. During the upgrade process, AKS will:
+
+- Add a new buffer node (or as many nodes as configured in [max surge](#customize-node-surge-upgrade)) to the cluster that runs the specified Kubernetes version.
+- [Cordon and drain][kubernetes-drain] one of the old nodes to minimize disruption to running applications. If you're using max surge, it will [cordon and drain][kubernetes-drain] as many nodes at the same time as the number of buffer nodes specified.
+- When the old node is fully drained, it will be reimaged to receive the new version, and it will become the buffer node for the following node to be upgraded.
+- This process repeats until all nodes in the cluster have been upgraded.
 - At the end of the process, the last buffer node will be deleted, maintaining the existing agent node count and zone balance.

 [!INCLUDE [alias minor version callout](./includes/aliasminorversion/alias-minor-version-upgrade.md)]
@@ -161,11 +170,12 @@ myAKSCluster eastus myResourceGroup 1.19.1 Succeeded

 ### [Azure PowerShell](#tab/azure-powershell)

-With a list of available versions for your AKS cluster, use the [Set-AzAksCluster][set-azakscluster] cmdlet to upgrade. During the upgrade process, AKS will:
-- add a new buffer node (or as many nodes as configured in [max surge](#customize-node-surge-upgrade)) to the cluster that runs the specified Kubernetes version.
-- [cordon and drain][kubernetes-drain] one of the old nodes to minimize disruption to running applications (if you're using max surge it will [cordon and drain][kubernetes-drain] as many nodes at the same time as the number of buffer nodes specified).
-- When the old node is fully drained, it will be reimaged to receive the new version and it will become the buffer node for the following node to be upgraded.
-- This process repeats until all nodes in the cluster have been upgraded.
+With a list of available versions for your AKS cluster, use the [Set-AzAksCluster][set-azakscluster] cmdlet to upgrade. During the upgrade process, AKS will:
+
+- Add a new buffer node (or as many nodes as configured in [max surge](#customize-node-surge-upgrade)) to the cluster that runs the specified Kubernetes version.
+- [Cordon and drain][kubernetes-drain] one of the old nodes to minimize disruption to running applications. If you're using max surge, it will [cordon and drain][kubernetes-drain] as many nodes at the same time as the number of buffer nodes specified.
+- When the old node is fully drained, it will be reimaged to receive the new version, and it will become the buffer node for the following node to be upgraded.
+- This process repeats until all nodes in the cluster have been upgraded.
 - At the end of the process, the last buffer node will be deleted, maintaining the existing agent node count and zone balance.

 [!INCLUDE [alias minor version callout](./includes/aliasminorversion/alias-minor-version-upgrade.md)]
@@ -199,7 +209,7 @@ myAKSCluster eastus 1.19.1 Succeeded myakscluster-dns-379cb

 ## View the upgrade events

-When you upgrade your cluster, the following Kubenetes events may occur on each node:
+When you upgrade your cluster, the following Kubernetes events may occur on each node:

 - Surge – Create surge node.
 - Drain – Pods are being evicted from the node. Each pod has a 30-minute timeout to complete the eviction.
@@ -224,13 +234,13 @@ default 9m22s Normal Surge node/aks-nodepool1-96663640-vmss000002 Created a surg

 ## Set auto-upgrade channel

-In addition to manually upgrading a cluster, you can set an auto-upgrade channel on your cluster. For more information, see [Auto-upgrading an AKS cluster][aks-auto-upgrade].
+In addition to manually upgrading a cluster, you can set an auto-upgrade channel on your cluster. For more information, see [Auto-upgrading an AKS cluster][aks-auto-upgrade].

 ## Special considerations for node pools that span multiple Availability Zones

 AKS uses best-effort zone balancing in node groups. During an Upgrade surge, zone(s) for the surge node(s) in virtual machine scale sets is unknown ahead of time. This can temporarily cause an unbalanced zone configuration during an upgrade. However, AKS deletes the surge node(s) once the upgrade has been completed and preserves the original zone balance. If you desire to keep your zones balanced during upgrade, increase the surge to a multiple of three nodes. Virtual machine scale sets will then balance your nodes across Availability Zones with best-effort zone balancing.

-If you have PVCs backed by Azure LRS Disks, they’ll be bound to a particular zone and may fail to recover immediately if the surge node doesn’t match the zone of the PVC. This could cause downtime on your application when the Upgrade operation continues to drain nodes but the PVs are bound to a zone. To handle this case and maintain high availability, configure a [Pod Disruption Budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) on your application. This allows Kubernetes to respect your availability requirements during Upgrade's drain operation.
+If you have PVCs backed by Azure LRS Disks, they’ll be bound to a particular zone, and they may fail to recover immediately if the surge node doesn’t match the zone of the PVC. This could cause downtime on your application when the Upgrade operation continues to drain nodes but the PVs are bound to a zone. To handle this case and maintain high availability, configure a [Pod Disruption Budget](https://kubernetes.io/docs/tasks/run-application/configure-pdb/) on your application. This allows Kubernetes to respect your availability requirements during Upgrade's drain operation.

 ## Next steps
