Azure: fix node removal race condition on VMSS deletion

When a VMSS is being deleted, its instances are removed first, and the
VMSS itself disappears once empty. That delay is generally enough for
kube-controller-manager to delete the corresponding k8s nodes, but it
might not be when the controller is busy or throttled, for instance.

If Kubernetes nodes remain after their backing VMSS has been removed,
the Azure cloud-provider will fail to list that VMSS's VMs, and
downstream callers (e.g. `InstanceExistsByProviderID`) won't interpret
those errors as a missing instance. The nodes will remain (still
considered "existing"), and controller-manager will retry listing the
VMSS VMs indefinitely, draining API call quotas and potentially causing
throttling.

In practice, a missing scale set implies that the instances attributed
to that VMSS don't exist either: `InstanceExistsByProviderID` (part of
the general cloud provider interface) should return false in that case.