|
| 1 | +--- |
| 2 | +title: Troubleshoot KubernetesCluster with a node in NotReady |
| 3 | +description: Learn what to do when a node in your KubernetesCluster is in the NotReady state. |
| 4 | +ms.service: azure-operator-nexus |
| 5 | +ms.custom: troubleshooting |
| 6 | +ms.topic: troubleshooting |
| 7 | +ms.date: 02/19/2025 |
| 8 | +ms.author: jessegler |
| 9 | +author: jessegler |
| 10 | +--- |
| 11 | +# Troubleshoot a KubernetesCluster with a node in NotReady state |
| 12 | + |
| 13 | +Follow this troubleshooting guide if a node in your KubernetesCluster is in the **NotReady** state. |
| 14 | + |
| 15 | +## Prerequisites |
| 16 | + |
| 17 | +- Ability to run kubectl commands against the KubernetesCluster. |
| 18 | +- Familiarity with the capabilities referenced in this article. Review [Baremetalmachine actions](howto-baremetal-functions.md). |
| 19 | + |
| 20 | +## Cause |
| 21 | + |
| 22 | +- After a Baremetalmachine restart or a Cluster runtime upgrade, a node might enter the **NotReady** status. |
| 23 | +- Tainting, cordoning, or powering off a Baremetalmachine causes nodes running on that Baremetalmachine to become **NotReady**. If possible, remove the taint, uncordon, or power on the Baremetalmachine. If that isn't possible, the following procedure might allow the node to reschedule to a different Baremetalmachine. |
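
The uncordon and power-on remediation mentioned above could be scripted roughly as follows. This is a minimal sketch, not an official tool: `bmm_restore` and the `DRY_RUN` variable are hypothetical names introduced here, and the sketch assumes the `az networkcloud baremetalmachine uncordon` and `start` actions described in the Baremetalmachine actions article. It doesn't cover removing a taint. By default it only prints the command so you can review it first.

```bash
# Hypothetical helper: build the az command that uncordons or powers on
# a Baremetalmachine. Prints the command unless DRY_RUN=0 is set.
bmm_restore() {
  action="$1"          # "uncordon" or "start" (power on)
  name="$2"            # Baremetalmachine name (placeholder)
  resource_group="$3"  # resource group of the Baremetalmachine (placeholder)

  case "$action" in
    uncordon|start) ;;
    *) echo "unsupported action: $action" >&2; return 2 ;;
  esac

  # Assemble the command as positional parameters so values with
  # spaces stay intact when the command is executed.
  set -- az networkcloud baremetalmachine "$action" \
    --name "$name" --resource-group "$resource_group"

  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"        # run the command for real
  else
    echo "$*"   # dry run: only show what would be executed
  fi
}
```

For example, `bmm_restore uncordon myBmm myResourceGroup` prints the command for review, and `DRY_RUN=0 bmm_restore start myBmm myResourceGroup` runs the power-on action. The names `myBmm` and `myResourceGroup` are placeholders.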
| 24 | + |
| 25 | +## Procedure |
| 26 | + |
| 27 | +Delete the node by following these steps. Deleting the node allows the Cluster to attempt to reschedule and restart it. |
| 28 | + |
| 29 | + |
| 30 | +1. Use kubectl to list the nodes using the wide flag. Observe the node in **NotReady** status. |
| 31 | + |
| 32 | + ~~~bash |
| 33 | + $ kubectl get nodes -owide |
| 34 | + NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME |
| 35 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-7qt2b Ready <none> 6d3h v1.27.3 10.4.74.30 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 36 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-dqmzw Ready <none> 6d3h v1.27.3 10.4.74.31 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 37 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-lkhhq NotReady <none> 6d3h v1.27.3 10.4.74.29 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 38 | + mytest-naks1-3b466a17-control-plane-6q7ns Ready control-plane 6d3h v1.27.3 10.4.74.14 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 39 | + mytest-naks1-3b466a17-control-plane-8qqvz Ready control-plane 6d3h v1.27.3 10.4.74.28 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 40 | + mytest-naks1-3b466a17-control-plane-g42mh Ready control-plane 6d3h v1.27.3 10.4.74.32 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 41 | + ~~~ |
| 42 | + |
| 43 | +1. Issue the kubectl command to delete the node. |
| 44 | + |
| 45 | + ~~~bash |
| 46 | + $ kubectl delete node mytest-naks1-3b466a17-agentpool1-md-6bg5h-lkhhq |
| 47 | + node "mytest-naks1-3b466a17-agentpool1-md-6bg5h-lkhhq" deleted |
| 48 | + ~~~ |
| 49 | + |
| 50 | +1. List the nodes again and see that the node is gone. |
| 51 | + |
| 52 | + ~~~bash |
| 53 | + $ kubectl get nodes -owide |
| 54 | + NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME |
| 55 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-7qt2b Ready <none> 6d3h v1.27.3 10.4.74.30 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 56 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-dqmzw Ready <none> 6d3h v1.27.3 10.4.74.31 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 57 | + mytest-naks1-3b466a17-control-plane-6q7ns Ready control-plane 6d3h v1.27.3 10.4.74.14 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 58 | + mytest-naks1-3b466a17-control-plane-8qqvz Ready control-plane 6d3h v1.27.3 10.4.74.28 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 59 | + mytest-naks1-3b466a17-control-plane-g42mh Ready control-plane 6d3h v1.27.3 10.4.74.32 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 60 | + ~~~ |
| 61 | + |
| 62 | +1. Wait 5-15 minutes for the node to be replaced. The replacement node appears under a new name and shows **NotReady** while it comes up. |
| 63 | + |
| 64 | + ~~~bash |
| 65 | + $ kubectl get nodes -owide |
| 66 | + NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME |
| 67 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-7qt2b Ready <none> 6d3h v1.27.3 10.4.74.30 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 68 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-dqmzw Ready <none> 6d3h v1.27.3 10.4.74.31 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 69 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-nxkks NotReady <none> 42s v1.27.3 10.4.74.12 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 70 | + mytest-naks1-3b466a17-control-plane-6q7ns Ready control-plane 6d3h v1.27.3 10.4.74.14 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 71 | + mytest-naks1-3b466a17-control-plane-8qqvz Ready control-plane 6d3h v1.27.3 10.4.74.28 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 72 | + mytest-naks1-3b466a17-control-plane-g42mh Ready control-plane 6d3h v1.27.3 10.4.74.32 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 73 | + ~~~ |
| 74 | + |
| 75 | +1. Wait a little longer until the **NotReady** node becomes **Ready**. |
| 76 | + |
| 77 | + ~~~bash |
| 78 | + $ kubectl get nodes -owide |
| 79 | + NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME |
| 80 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-7qt2b Ready <none> 6d3h v1.27.3 10.4.74.30 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 81 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-dqmzw Ready <none> 6d3h v1.27.3 10.4.74.31 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 82 | + mytest-naks1-3b466a17-agentpool1-md-6bg5h-nxkks Ready <none> 97s v1.27.3 10.4.74.12 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 83 | + mytest-naks1-3b466a17-control-plane-6q7ns Ready control-plane 6d3h v1.27.3 10.4.74.14 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 84 | + mytest-naks1-3b466a17-control-plane-8qqvz Ready control-plane 6d3h v1.27.3 10.4.74.28 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 85 | + mytest-naks1-3b466a17-control-plane-g42mh Ready control-plane 6d3h v1.27.3 10.4.74.32 <none> CBL-Mariner/Linux 5.15.153.1-2.cm2 containerd://1.6.26 |
| 86 | + ~~~ |
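
When a cluster has many nodes, scanning the listing by eye for **NotReady** entries is error prone. The following is a minimal sketch of a filter that reads `kubectl get nodes` output on standard input and prints only the names of nodes whose STATUS column contains `NotReady`. The function name `not_ready_nodes` is hypothetical, and the sketch assumes the default column layout shown in the steps above (NAME first, STATUS second).

```bash
# Hypothetical helper: print the names of nodes that are NotReady.
# Reads `kubectl get nodes` style output on stdin. The STATUS column
# can carry extra values such as "NotReady,SchedulingDisabled", so the
# match is a substring test rather than an exact comparison.
not_ready_nodes() {
  awk '$2 ~ /NotReady/ { print $1 }'
}
```

With cluster access, the filter could be used as `kubectl get nodes --no-headers | not_ready_nodes`. The names it prints are the candidates for the `kubectl delete node` step, and an empty result after the replacement node comes up suggests that every node is **Ready** again.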
| 87 | + |
| 88 | +If you still have questions, [contact support](https://portal.azure.com/?#blade/Microsoft_Azure_Support/HelpAndSupportBlade). |
| 89 | +For more information about Support plans, see [Azure Support plans](https://azure.microsoft.com/support/plans/response/). |