Commit 021d564

again
1 parent 671bdc5 commit 021d564

1 file changed

+2
-2
lines changed

articles/operator-nexus/troubleshoot-kubernetes-cluster-node-cordoned.md

Lines changed: 2 additions & 2 deletions
@@ -19,9 +19,9 @@ The purpose of this guide is to troubleshoot a Kubernetes Cluster when 1 or more
 
 ## Typical Cause
 
-After a runtime upgrade, before a Baremetal Machine is shut down for reimaging, the machine lifecycle controller will cordon and drain Virtual Machine resources scheduled to that Baremetal Machine. Once the Baremetal Machine resolves the reimaging process, the expectation is that the machine lifecycle controller reschedule Virtual Machines to that Baremetal Machine. It would then uncordon the Virtual Machine, with the Kubernetes Cluster Node it supports reflecting the appropriate state `Ready`.
+After a runtime upgrade, before a Baremetal Machine is shut down for reimaging, the machine lifecycle controller will cordon and drain Virtual Machine resources scheduled to that Baremetal Machine. Once the Baremetal Machine resolves the reimaging process, the expectation is that the machine lifecycle controller reschedules Virtual Machines to that Baremetal Machine. It would then uncordon the Virtual Machine, with the Kubernetes Cluster Node it supports reflecting the appropriate state `Ready`.
 
-However, a race condition may occur wherein the machine lifecycle controller fails to find Virtual Machines which should be scheduled to that Baremetal Machine. Each Virtual Machine is deployed using a virt-launcher pod. This race condition happens when the virt-launcher pod's image pull job isn't yet complete. Only after the image pull job is complete will the pod be schedulable to a Baremetal Machine. When the machine lifecycle controller examines these virt-launcher pods during the uncordon action execution, it can't find which Baremetal Machine the pod is tied to. This is because the pod itself has not yet been scheduled. Therefore the machine lifecycle controller skips uncordoning that Virtual Machine which that pod represents.
+However, a race condition may occur wherein the machine lifecycle controller fails to find Virtual Machines which should be scheduled to that Baremetal Machine. Each Virtual Machine is deployed using a virt-launcher pod. This race condition happens when the virt-launcher pod's image pull job isn't yet complete. Only after the image pull job is complete will the pod be schedulable to a Baremetal Machine. When the machine lifecycle controller examines these virt-launcher pods during the uncordon action execution, it can't find which Baremetal Machine the pod is tied to. This is because the pod itself hasn't been scheduled. Therefore the machine lifecycle controller skips uncordoning that Virtual Machine that that pod represents.
 
 ## Procedure

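Reviewer note: the race condition described under "Typical Cause" can be modeled with a small Python sketch. It is illustrative only and is not the machine lifecycle controller's actual code. The class `VirtLauncherPod`, the function `uncordon_pass`, and the machine names are hypothetical, and the sketch assumes an unscheduled pod shows up as a pod with no assigned Baremetal Machine.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class VirtLauncherPod:
    """Hypothetical stand-in for a virt-launcher pod backing one Virtual Machine."""
    vm_name: str
    # None models a pod whose image pull job is not yet complete,
    # so it has not been scheduled to any Baremetal Machine.
    bare_metal_machine: Optional[str]


def uncordon_pass(bmm: str, pods: List[VirtLauncherPod], cordoned: Set[str]) -> List[str]:
    """Uncordon VMs whose pod is scheduled to `bmm`; return the VMs that were skipped.

    A pod with no assigned machine cannot be matched to `bmm`, so its
    Virtual Machine is skipped and stays cordoned (the race described above).
    """
    skipped = []
    for pod in pods:
        if pod.bare_metal_machine is None:
            skipped.append(pod.vm_name)
            continue
        if pod.bare_metal_machine == bmm:
            cordoned.discard(pod.vm_name)
    return skipped


pods = [
    VirtLauncherPod("vm-a", "bmm-1"),  # scheduled: will be uncordoned
    VirtLauncherPod("vm-b", None),     # image pull still running: will be skipped
]
cordoned = {"vm-a", "vm-b"}
skipped = uncordon_pass("bmm-1", pods, cordoned)
```

In this model `vm-b` remains in `cordoned` after the pass, which corresponds to the Kubernetes Cluster Node that stays cordoned until someone intervenes.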