
Commit cdcd100

Merge pull request #258532 from Nickomang/aks-drain-timeout
Added configurable drain timeout
2 parents 7ca2b32 + db87a56 commit cdcd100

2 files changed (+22 −1 lines changed)

articles/aks/upgrade-aks-cluster.md

Lines changed: 19 additions & 0 deletions
@@ -132,6 +132,7 @@ During the cluster upgrade process, AKS performs the following operations:
* Add a new buffer node (or as many nodes as configured in [max surge](#customize-node-surge-upgrade)) to the cluster that runs the specified Kubernetes version.
* [Cordon and drain][kubernetes-drain] one of the old nodes to minimize disruption to running applications. If you're using max surge, it [cordons and drains][kubernetes-drain] as many nodes at the same time as the number of buffer nodes specified.
* For long-running pods, you can configure the node drain timeout, which allows a custom wait time for pod eviction and graceful termination per node. If not specified, the default is 30 minutes.
* When the old node is fully drained, it's reimaged to receive the new version and becomes the buffer node for the following node to be upgraded.
* This process repeats until all nodes in the cluster have been upgraded.
* At the end of the process, the last buffer node is deleted, maintaining the existing agent node count and zone balance.
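The process above runs when you upgrade the cluster's Kubernetes version. As a minimal sketch (resource names and version are placeholders, not from this commit):

```azurecli-interactive
# Check which Kubernetes versions the cluster can upgrade to
az aks get-upgrades --resource-group MyResourceGroup --name MyManagedCluster --output table

# Trigger the upgrade to one of the versions returned above
az aks upgrade --resource-group MyResourceGroup --name MyManagedCluster --kubernetes-version 1.28.3
```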
@@ -229,6 +230,24 @@ AKS accepts both integer values and a percentage value for max surge. An integer
az aks nodepool update -n mynodepool -g MyResourceGroup --cluster-name MyManagedCluster --max-surge 5
```
#### Set node drain timeout value
A long-running workload on a pod can result in one of the following cases:
- Your pod takes a long time to come up, such as when restoring a database.
- Your pod relies on graceful termination and takes a long time to shut down.
In these scenarios, you can configure a node drain timeout that AKS respects during the upgrade workflow. If you want fast upgrades and are confident that your pods start and terminate quickly, set a low drain timeout. A higher drain timeout increases how long you wait before discovering an issue. If no node drain timeout value is specified, the default is 30 minutes.
To set a node drain timeout for new or existing node pools, use the [`az aks nodepool add`][az-aks-nodepool-add] or [`az aks nodepool update`][az-aks-nodepool-update] command:
```azurecli-interactive
# Set drain timeout for a new node pool
az aks nodepool add -n mynodepool -g MyResourceGroup --cluster-name MyManagedCluster --drain-timeout 100
# Update drain timeout for an existing node pool
az aks nodepool update -n mynodepool -g MyResourceGroup --cluster-name MyManagedCluster --drain-timeout 45
```
## View upgrade events

* View upgrade events using the `kubectl get events` command.
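As a minimal sketch, you can list events while the upgrade runs to observe cordon and drain activity on nodes:

```azurecli-interactive
# List recent events in the current namespace
kubectl get events

# Or watch continuously during the upgrade
kubectl get events --watch
```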

articles/aks/upgrade-cluster.md

Lines changed: 3 additions & 1 deletion
@@ -43,11 +43,12 @@ Persistent volume claims (PVCs) backed by Azure locally redundant storage (LRS)
## Optimize upgrades to improve performance and minimize disruptions
The combination of [Planned Maintenance Window][planned-maintenance], [Max Surge](./upgrade-aks-cluster.md#customize-node-surge-upgrade), [Pod Disruption Budget][pdb-spec], and [node drain timeout][drain-timeout] can significantly increase the likelihood of node upgrades completing successfully by the end of the maintenance window while also minimizing disruptions.
* [Planned Maintenance Window][planned-maintenance] enables service teams to schedule auto-upgrade during a pre-defined window, typically a low-traffic period, to minimize workload impact. We recommend a window duration of at least *four hours*.
* [Max Surge](./upgrade-aks-cluster.md#customize-node-surge-upgrade) on the node pool allows requesting extra quota during the upgrade process and limits the number of nodes selected for upgrade simultaneously. A higher max surge results in a faster upgrade process. We don't recommend setting it at 100%, as it upgrades all nodes simultaneously, which can cause disruptions to running applications. We recommend a max surge quota of *33%* for production node pools.
* [Pod Disruption Budget][pdb-spec] is set for service applications and limits the number of pods that can be down during voluntary disruptions, such as AKS-controlled node upgrades. It can be configured as `minAvailable` replicas, indicating the minimum number of application pods that need to be active, or `maxUnavailable` replicas, indicating the maximum number of application pods that can be terminated, ensuring high availability for the application. Refer to the guidance provided for configuring [Pod Disruption Budgets (PDBs)][pdb-concepts]. PDB values should be validated to determine the settings that work best for your specific service.
* [Node drain timeout][drain-timeout] on the node pool configures how long to wait for pod eviction and graceful termination per node during upgrades, which typically applies to long-running workloads. While a node drain timeout (in minutes) is in effect, AKS continues to honor pod disruption budgets. If not specified, the default is 30 minutes.
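Taken together, the settings above can be sketched as follows (resource, pool, and label names are illustrative, not from this commit):

```azurecli-interactive
# Schedule auto-upgrade into a low-traffic window (at least four hours recommended)
az aks maintenanceconfiguration add -g MyResourceGroup --cluster-name MyManagedCluster -n default --weekday Sunday --start-hour 1

# Limit how many nodes upgrade at once (33% recommended for production node pools)
az aks nodepool update -n mynodepool -g MyResourceGroup --cluster-name MyManagedCluster --max-surge 33%

# Keep a minimum number of application replicas available during drains
kubectl create poddisruptionbudget myapp-pdb --selector=app=myapp --min-available=2

# Give long-running pods extra time to terminate gracefully during drains
az aks nodepool update -n mynodepool -g MyResourceGroup --cluster-name MyManagedCluster --drain-timeout 60
```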
## Next steps

@@ -63,5 +64,6 @@ This article listed different upgrade options for AKS clusters. To learn more ab
<!-- LINKS - internal -->
[aks-tutorial-prepare-app]: ./tutorial-kubernetes-prepare-app.md
[nodepool-upgrade]: manage-node-pools.md#upgrade-a-single-node-pool
[drain-timeout]: ./upgrade-aks-cluster.md#set-node-drain-timeout-value
[planned-maintenance]: planned-maintenance.md
[specific-nodepool]: node-image-upgrade.md#upgrade-a-specific-node-pool
