|
| 1 | +--- |
| 2 | +title: Update and reboot nodes with kured in Azure Kubernetes Service (AKS) |
| 3 | +description: Learn how to update nodes and automatically reboot them with kured in Azure Kubernetes Service (AKS) |
| 4 | +services: container-service |
| 5 | +author: iainfoulds |
| 6 | + |
| 7 | +ms.service: container-service |
| 8 | +ms.topic: article |
| 9 | +ms.date: 11/06/2018 |
| 10 | +ms.author: iainfou |
| 11 | +--- |
| 12 | + |
| 13 | +# Apply security and kernel updates to nodes in Azure Kubernetes Service (AKS) |
| 14 | + |
| 15 | +To protect your clusters, security updates are automatically applied to nodes in AKS. These updates include OS security fixes or kernel updates. Some of these updates require a node reboot to complete the process. AKS doesn't automatically reboot nodes to complete the update process. |
| 16 | + |
| 17 | +This article shows you how to use the open-source [kured (KUbernetes REboot Daemon)][kured] to watch for nodes that require a reboot, then automatically handle the rescheduling of running pods and node reboot process. |
| 18 | + |
| 19 | +> [!NOTE] |
| 20 | +> `Kured` is an open-source project by Weaveworks. Support for this project in AKS is provided on a best-effort basis. Additional support can be found in the #weave-community slack channel, |
| 21 | +
|
| 22 | +## Before you begin |
| 23 | + |
| 24 | +This article assumes that you have an existing AKS cluster. If you need an AKS cluster, see the AKS quickstart [using the Azure CLI][aks-quickstart-cli] or [using the Azure portal][aks-quickstart-portal]. |
| 25 | + |
| 26 | +You also need the Azure CLI version 2.0.49 or later installed and configured. Run `az --version` to find the version. If you need to install or upgrade, see [Install Azure CLI][install-azure-cli]. |
| 27 | + |
| 28 | +## Understand the AKS node update experience |
| 29 | + |
| 30 | +In an AKS cluster, your Kubernetes nodes run as Azure virtual machines (VMs). These Linux-based VMs use an Ubuntu image, with the OS configured to automatically check for updates every night. If security or kernel updates are available, they are automatically downloaded and installed. |
| 31 | + |
| 32 | + |
| 33 | + |
| 34 | +Some security updates, such as kernel updates, require a node reboot to finalize the process. A node that requires a reboot creates a file named */var/run/reboot-required*. This reboot process doesn't happen automatically. |
| 35 | + |
| 36 | +You can use your own workflows and processes to handle node reboots, or use `kured` to orchestrate the process. With `kured`, a [DaemonSet][DaemonSet] is deployed that runs a pod on each node in the cluster. These pods in the DaemonSet watch for existence of the */var/run/reboot-required* file, and then initiates a process to reboot the nodes. |
| 37 | + |
| 38 | +### Node upgrades |
| 39 | + |
| 40 | +There is an additional process in AKS that lets you *upgrade* a cluster. An upgrade is typically to move to a newer version of Kubernetes, not just apply node security updates. An AKS upgrade performs the following actions: |
| 41 | + |
| 42 | +* A new node is deployed with the latest security updates and Kubernetes version applied. |
| 43 | +* An old node is cordoned and drained. |
| 44 | +* Pods are scheduled on the new node. |
| 45 | +* The old node is deleted. |
| 46 | + |
| 47 | +You can't remain on the same Kubernetes version during an upgrade event. You must specify a newer version of Kubernetes. To upgrade to the latest version of Kubernetes, you can [upgrade your AKS cluster][aks-upgrade]. |
| 48 | + |
| 49 | +## Deploy kured in an AKS cluster |
| 50 | + |
| 51 | +To deploy the `kured` DaemonSet, apply the following sample YAML manifest from their GitHub project page. This manifest creates a role and cluster role, bindings, and a service account, then deploys the DaemonSet using `kured` version 1.1.0 that supports AKS clusters 1.9 or later. |
| 52 | + |
| 53 | +```console |
| 54 | +kubectl apply -f https://github.com/weaveworks/kured/releases/download/1.1.0/kured-1.1.0.yaml |
| 55 | +``` |
| 56 | + |
| 57 | +You can also configure additional parameters for `kured`, such as integration with Prometheus or Slack. For more information about additional configuration parameters, see the [kured installation docs][kured-install]. |
| 58 | + |
| 59 | +## Update cluster nodes |
| 60 | + |
| 61 | +By default, AKS nodes check for updates every evening. If you don't want to wait, you can manually perform an update to check that `kured` runs correctly. First, follow the steps to [SSH to one of your AKS nodes][aks-ssh]. Once you have an SSH connection to the node, check for updates and apply them as follows: |
| 62 | + |
| 63 | +```console |
| 64 | +sudo apt-get update && sudo apt-get upgrade -y |
| 65 | +``` |
| 66 | + |
| 67 | +If updates were applied that require a node reboot, a file is written to */var/run/reboot-required*. `Kured` checks for nodes that require a reboot every 60 minutes by default. |
| 68 | + |
| 69 | +## Monitor and review reboot process |
| 70 | + |
| 71 | +When one of the replicas in the DaemonSet has detected that a node reboot is required, a lock is placed on the node through the Kubernetes API. This lock prevents additional pods being scheduled on the node. The lock also indicates that only one node should be rebooted at a time. With the node cordoned off, running pods are drained from the node, and the node is rebooted. |
| 72 | + |
| 73 | +You can monitor the status of the nodes using the [kubectl get nodes][kubectl-get-nodes] command. The following example output shows a node with a status of *SchedulingDisabled* as the node prepares for the reboot process: |
| 74 | + |
| 75 | +``` |
| 76 | +NAME STATUS ROLES AGE VERSION |
| 77 | +aks-nodepool1-79590246-2 Ready,SchedulingDisabled agent 1h v1.9.11 |
| 78 | +``` |
| 79 | + |
| 80 | +Once the update process is complete, you can view the status of the nodes using the [kubectl get nodes][kubectl-get-nodes] command with the `--output wide` parameter. This additional output lets you see a difference in *KERNEL-VERSION* of the underlying nodes, as shown in the following example output. The *aks-nodepool1-79590246-2* was updated in a previous step and shows kernel version *4.15.0-1025-azure*. The node *aks-nodepool1-79590246-1* that hasn't been updated shows kernel version *4.15.0-1023-azure*. |
| 81 | + |
| 82 | +``` |
| 83 | +NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME |
| 84 | +aks-nodepool1-79590246-1 Ready agent 1h v1.9.11 10.240.0.6 <none> Ubuntu 16.04.5 LTS 4.15.0-1023-azure docker://1.13.1 |
| 85 | +aks-nodepool1-79590246-2 Ready agent 1h v1.9.11 10.240.0.4 <none> Ubuntu 16.04.5 LTS 4.15.0-1025-azure docker://1.13.1 |
| 86 | +``` |
| 87 | + |
| 88 | +## Next steps |
| 89 | + |
| 90 | +This article detailed how to use `kured` to reboot nodes automatically as part of the security update process. To upgrade to the latest version of Kubernetes, you can [upgrade your AKS cluster][aks-upgrade]. |
| 91 | + |
| 92 | +<!-- LINKS - external --> |
| 93 | +[kured]: https://github.com/weaveworks/kured |
| 94 | +[kured-install]: https://github.com/weaveworks/kured#installation |
| 95 | +[kubectl-get-nodes]: https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#get |
| 96 | + |
| 97 | +<!-- LINKS - internal --> |
| 98 | +[aks-quickstart-cli]: kubernetes-walkthrough.md |
| 99 | +[aks-quickstart-portal]: kubernetes-walkthrough-portal.md |
| 100 | +[install-azure-cli]: /cli/azure/install-azure-cli |
| 101 | +[DaemonSet]: concepts-clusters-workloads.md#statefulsets-and-daemonsets |
| 102 | +[aks-ssh]: ssh.md |
| 103 | +[aks-upgrade]: upgrade-cluster.md |
0 commit comments