Description
How to categorize this issue?
/area control-plane
/area usability
/kind enhancement
/priority 3
What would you like to be added:
MCM should, at some point, write general eviction/drain errors into the Status.LastOperation.Description field.
We are mainly interested in errors that contain "Cannot evict pod as it would violate the pod's disruption budget.", for the reason explained below.
Code locations:
- after:
  machine-controller-manager/pkg/util/provider/drain/drain.go
  Lines 1057 to 1067 in ef09b5e

        klog.V(3).Infof("Pod %s/%s couldn't be evicted from node %s. This may also occur due to PDB violation. Will be retried. Error: %v", pod.Namespace, pod.Name, pod.Spec.NodeName, err)
        pdb := getPdbForPod(o.pdbLister, pod)
        if pdb != nil {
            if isMisconfiguredPdb(pdb) {
                pdbErr := fmt.Errorf("error while evicting pod %q: pod disruption budget %s/%s is misconfigured and requires zero voluntary evictions", pod.Name, pdb.Namespace, pdb.Name)
                returnCh <- pdbErr
                return
            }
        }

- after:
  machine-controller-manager/pkg/util/provider/drain/drain.go
  Lines 733 to 744 in ef09b5e

        klog.V(3).Infof("Pod %s/%s couldn't be evicted from node %s. This may also occur due to PDB violation. Will be retried. Error: %v", pod.Namespace, pod.Name, pod.Spec.NodeName, err)
        pdb := getPdbForPod(o.pdbLister, pod)
        if pdb != nil {
            if isMisconfiguredPdb(pdb) {
                pdbErr := fmt.Errorf("error while evicting pod %q: pod disruption budget %s/%s is misconfigured and requires zero voluntary evictions", pod.Name, pdb.Namespace, pdb.Name)
                returnCh <- pdbErr
                o.checkAndDeleteWorker(volumeAttachmentEventCh)
                continue
            }
        }
Why is this needed:
Currently, the MCM only writes an error into the Status.LastOperation.Description field if it detects a misconfigured PDB.
In cases where the PDB is not misconfigured but a drain would still violate it (e.g., because the workload is unhealthy), it would be very helpful to be able to easily differentiate between "a node can't be drained because of a PDB" and "a node can't be drained because something else is broken".