Labels: area/koord-descheduler, kind/proposal
What is your proposal:
Introduce a new descheduler plugin named CustomPriority that continuously rebalances workloads from high‑priority (expensive) node pools to lower‑priority (cheaper) pools, based on a user‑defined priority order and resource availability.
Why is this needed:
We have user stories like the following:
- When scaling down at off-peak, I want pods of the same workload on a burstable node to be drained together so that the elastic node can be released, rather than scaling down per workload by priority; only removing the whole elastic node actually saves cost.
- Given an elastic node hosting several pods from the same workload
- And static nodes have spare capacity
- When a manual scale down or an autoscaler triggers downscale
- Then the system should drain those pods together from the elastic node and reschedule them onto static nodes, enabling node release for elastic nodes
- Mixed node billing: a committed (monthly) pool plus an on‑demand elastic pool
- Given a cluster with a committed static pool and an on‑demand elastic pool used only for peak bursts
- When the cluster is in off‑peak and the static pool has room
- Then pods should be proactively evicted from elastic to static according to the configured EvictionOrder, minimizing on‑demand spend
This Plugin can realize:
- Cost efficiency: Proactively vacate premium nodes when cheaper capacity is available, aligning placement with business priorities.
- Better binpacking: Evicts smaller, easily‑reschedulable pods first to minimize churn and improve fit rate on destination nodes.
- Safer operations: NodeFit prevents pathological evictions; DrainNode mode performs atomic, capacity‑aware draining with virtual reservations; AutoCordon prevents immediate re‑scheduling back.
Why not HighNodeUtilization?
- Not cost-aware: The HighNodeUtilization plugin does something similar: it can bin-pack pods between nodes using generic utilization thresholds, but it does not understand business tiers or cost. It cannot express "drain elastic/expensive nodes first"; it moves pods off underutilized nodes broadly, not from cost-prioritized pools.
- Cannot guarantee node drain: It opportunistically evicts pods to raise utilization, but does not virtually plan full placement of all pods from a source node. Some pods on a node may be evicted while others are not, which prevents autoscaler-like components from releasing the node. We need to reserve target capacity virtually and evict only when a whole node can be emptied, enabling real scale-in.
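The virtual-reservation idea above can be sketched as an all-or-nothing planning step: before any eviction, tentatively place every pod of the source node onto destination nodes, deducting capacity as we go, and abort if any pod fails to fit. This is an illustrative sketch only; the types and function names below are hypothetical and not the actual koord-descheduler API.

```go
// All-or-nothing drain planning sketch: evict pods from a source node only
// if ALL of them fit on destination nodes, reserving capacity virtually so
// the whole node can be released afterwards.
package main

import "fmt"

// pod and node are simplified stand-ins for the real Kubernetes objects.
type pod struct {
	name     string
	cpuMilli int64 // requested CPU in millicores
}

type node struct {
	name      string
	freeMilli int64 // allocatable CPU not yet requested
}

// planDrain returns a pod->node assignment if every pod fits somewhere,
// or nil if the source node cannot be fully emptied (in which case the
// caller evicts nothing, avoiding a half-drained node).
func planDrain(pods []pod, dests []node) map[string]string {
	free := make([]int64, len(dests))
	for i, n := range dests {
		free[i] = n.freeMilli
	}
	plan := make(map[string]string)
	for _, p := range pods {
		placed := false
		for i := range dests {
			if free[i] >= p.cpuMilli {
				free[i] -= p.cpuMilli // virtual reservation
				plan[p.name] = dests[i].name
				placed = true
				break
			}
		}
		if !placed {
			return nil // one pod does not fit: abort the whole drain
		}
	}
	return plan
}

func main() {
	pods := []pod{{"a", 500}, {"b", 700}}
	dests := []node{{"static-1", 600}, {"static-2", 800}}
	fmt.Println(planDrain(pods, dests)) // map[a:static-1 b:static-2]
}
```

A real implementation would additionally run the NodeFit checks (affinity, taints/tolerations, multi-resource fit) for each candidate placement, but the all-or-nothing shape is the same.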
Is there a suggested solution, if so, please add it:
The plugin runs at the Balance extension point and supports:
- Configurable EvictionOrder of resource tiers via per‑tier NodeSelectors
- Optional global NodeSelector for narrowing the working set of nodes
- Pod selection via label‑based CustomPriorityPodSelectors and EvictableNamespaces include/exclude lists
- Safety checks via NodeFit (NodeAffinity, taints/tolerations, resource fit)
- Two modes: BestEffort and DrainNode (with optional AutoCordon)
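To make the EvictionOrder idea concrete, the sketch below groups candidate source nodes by tier, with tiers matched via per‑tier label selectors: nodes in earlier (more expensive) tiers are drained first. All type and field names here are illustrative assumptions, not the proposed plugin's actual configuration schema.

```go
// EvictionOrder sketch: group nodes by configured tiers so the plugin can
// drain expensive tiers before cheaper ones.
package main

import (
	"fmt"
	"sort"
)

// tier pairs a name with a label selector identifying its node pool.
type tier struct {
	name         string
	nodeSelector map[string]string
}

type nodeInfo struct {
	name   string
	labels map[string]string
}

// matches reports whether every selector key/value is present on the labels.
func matches(sel, labels map[string]string) bool {
	for k, v := range sel {
		if labels[k] != v {
			return false
		}
	}
	return true
}

// sourceNodesByTier returns node names grouped in EvictionOrder:
// index 0 holds nodes in the first-to-drain (most expensive) tier.
func sourceNodesByTier(order []tier, nodes []nodeInfo) [][]string {
	out := make([][]string, len(order))
	for i, t := range order {
		for _, n := range nodes {
			if matches(t.nodeSelector, n.labels) {
				out[i] = append(out[i], n.name)
			}
		}
		sort.Strings(out[i]) // deterministic order within a tier
	}
	return out
}

func main() {
	order := []tier{
		{"elastic", map[string]string{"pool": "elastic"}},
		{"static", map[string]string{"pool": "static"}},
	}
	nodes := []nodeInfo{
		{"n1", map[string]string{"pool": "static"}},
		{"n2", map[string]string{"pool": "elastic"}},
	}
	fmt.Println(sourceNodesByTier(order, nodes)) // [[n2] [n1]]
}
```

In BestEffort mode the plugin would evict from these tiers opportunistically; in DrainNode mode it would combine the tier ordering with the all-or-nothing drain planning described earlier.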