k8s-spot-rescheduler doesn't handle pod disruption budgets nicely, leaving nodes underutilized and tainted

We're running kops `1.10.0` and k8s `1.10.11` with two separate instance groups (IGs), `nodes` (on-demand) and `spots` (spot), both spread across 3 availability zones. I've applied the appropriate nodeLabels and have defined the following in my k8s-spot-rescheduler deployment manifest:

```
- --on-demand-node-label=on-demand
- --spot-node-label=spot
```

The `nodes` IG has the `spot=false:PreferNoSchedule` taint so that the `spots` IG is preferred. The cluster autoscaler autodiscovers both IGs via `--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,kubernetes.io/cluster/kubernetes.metis.wtf`, and these tags exist on both IGs. I've confirmed that pods on most `nodes` nodes can be drained and moved to `spots` nodes, with one exception:

* k8s-spot-rescheduler picks a node and states

   ```
   I0117 14:01:51.155242 1 rescheduler.go:298] All pods on ip-172-20-61-39.ec2.internal can be moved. Will drain node.
   ```

   which isn't true.
* It then finds that it's unable to drain the node due to PDBs

   ```
   E0117 14:03:51.801764       1 rescheduler.go:302] Failed to drain node: Failed to drain node /ip-172-20-61-39.ec2.internal, due to following errors: [Failed to evict pod skafos-notebooks/hub-deployment-cf799d494-gp6z4 within allowed timeout (last error: Cannot evict pod as it would violate the pod's disruption budget.)]
   ```

   and aborts the drain.

Now we're left with an on-demand node that has had all of its pods evicted except those protected by PDBs, so it is underutilized and still tainted with `ToBeDeletedByClusterAutoscaler`. It seems like the rescheduler should first check whether it can drain all pods, taking PDBs into consideration, and if it can't, it should not evict any pods or apply the `ToBeDeletedByClusterAutoscaler` taint.
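To make the suggestion concrete, here is a rough sketch of the kind of pre-check I have in mind, written in Python for brevity. It is not taken from the rescheduler's code: the `Pod` and `PDB` classes, the function names, and the `app: hub` label are all hypothetical, and the PDB is reduced to an equality-based label selector plus its currently allowed disruption count.

```python
from dataclasses import dataclass


@dataclass
class Pod:
    name: str
    namespace: str
    labels: dict


@dataclass
class PDB:
    # Hypothetical, simplified stand-in for a PodDisruptionBudget:
    # an equality-based selector and the number of disruptions currently allowed.
    namespace: str
    match_labels: dict
    disruptions_allowed: int


def pdb_covers(pdb, pod):
    """True if the pod is in the PDB's namespace and carries every selector label."""
    if pdb.namespace != pod.namespace:
        return False
    return all(pod.labels.get(k) == v for k, v in pdb.match_labels.items())


def blocked_pods(pods, pdbs):
    """Names of pods on a node that cannot be evicted without violating a PDB.

    The allowance of each PDB is consumed as pods are counted, so two pods
    behind a PDB that allows one disruption leave the second pod blocked.
    """
    remaining = [pdb.disruptions_allowed for pdb in pdbs]
    blocked = []
    for pod in pods:
        covering = [i for i, pdb in enumerate(pdbs) if pdb_covers(pdb, pod)]
        if any(remaining[i] <= 0 for i in covering):
            blocked.append(pod.name)
            continue
        for i in covering:
            remaining[i] -= 1
    return blocked


def can_drain(pods, pdbs):
    """Only taint and drain the node if no pod on it is blocked."""
    return not blocked_pods(pods, pdbs)


# Pod name taken from the log above; the label and the PDB are assumed.
hub = Pod("hub-deployment-cf799d494-gp6z4", "skafos-notebooks", {"app": "hub"})
other = Pod("some-other-pod", "default", {"app": "other"})
pdbs = [PDB("skafos-notebooks", {"app": "hub"}, 0)]
print(can_drain([hub, other], pdbs))  # False
```

With a check like this run before the node is tainted, the node in the log above would have been skipped and none of its pods would have been evicted.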
