Commit de6d84f

Update the feature name to "Prefer Nominated Node"
1 parent 780ee9f commit de6d84f

2 files changed: +19 -20 lines changed
keps/sig-scheduling/1923-try-nominated-node-first/README.md renamed to keps/sig-scheduling/1923-prefer-nominated-node/README.md

Lines changed: 17 additions & 18 deletions
@@ -1,4 +1,4 @@
-# KEP-1923: Try Nominated Node First
+# KEP-1923: Prefer Nominated Node

 <!-- toc -->
 - [Release Signoff Checklist](#release-signoff-checklist)
@@ -44,36 +44,35 @@ Items marked with (R) are required *prior to targeting to a milestone / release*

 ## Summary

-If the scheduler fails to fit an incoming pod on any node, the scheduler will try to preempt lower
-priority pods running on a selected node to make room for the pod. The name of this node will be set
-in the pod's `pod.Status.NominatedNodeName`.
+This KEP proposes to change the scheduling cycle such that nominated node of a pod is evaluated first
+and schedule the pod on that node if it fits. If the nominated node doesn't fit the pod, only then the
+scheduling cycle continues with the standard logic of evaluating the rest of the nodes in the cluster.
+
+## Motivation
+
+If the scheduler fails to fit an incoming pod on any node, it will try to preempt lower priority pods
+running on a selected node to make room for the pod. The name of this node will be set in the
+pod's `.status.nominatedNodeName`.

 The Node is called *Nominated* to indicate the intent for the Pod to be scheduled on it once preemption
-of other Pods finish. However, the `Pod.status.nominatedNodeName` information is not directly used in
-the Pod's following scheduling attempts.
+of other Pods finishes. However, the Pod's `.status.nominatedNodeName` information is not fully utilized
+in the Pod's following scheduling attempts.

 Pod scheduling is split into two phases, the scheduling cycle and the binding cycle, the scheduling cycle
 primarily includes filtering and scoring.

 When preemption happens in a previous scheduling cycle, there is a high chance that the nominated node is
 the *only* node that satisfies the filters for the unscheduled Pod that triggered preemption.

-This KEP proposes to change the scheduling cycle such that nominated node of a pod is evaluated first
-and schedule the pod on that node if it fits. If the nominated node doesn't fit the pod, only then the
-scheduling cycle continues with the standard logic of evaluating the rest of the nodes in the cluster.
-
-## Motivation
-
 In real production environment, pods can have different priorites due to business needs, the preemption
 could happen to make sure higher priority pods could get scheduled.

 In cluster with large number of computing nodes, evaluating all nodes when scheduling a pod is time consuming.

 ### Goals

-In the case where `pod.Status.NominatedNodeName` is set for an incoming pod, the scheduler will evaluate the
-nominated node first; if the nominated node doesn't fit the pod, the scheduling cycle will continue to evaluate
-the rest of the nodes in the cluster just like we do today.
+Prefer scheduling a pod to its `.status.nominatedNodeName` if set, if the nominated node doesn't fit the pod,
+the scheduling cycle will continue to evaluate the rest of the nodes in the cluster just like we do today.


 ## Proposal
@@ -94,7 +93,7 @@ nominated node.
 ### Implementation Details

 1. In filtering phase, which is currently implemented in the method of `findNodesThatFitPod`, check the nominated node
-first if the incoming pod has the `pod.Status.NominatedNodeName` defined and the feature gate is enabled.
+first if the incoming pod has the `.status.nominatedNodeName` defined and the feature gate is enabled.

 2. In case the nominated node doesn't suit for the incoming pod anymore, get `err` from `findNodesThatPassFilters` where
 `NominatedNode` is firstly evaluated, the `err` will be padded with more information to tell that scheduler is evaluating
@@ -108,7 +107,7 @@ nominated node.

 Scheduler will retry until matching either of the following cases,
 - `NominatedNode` eventually released all the resource and the preemptor pod can be scheduled on that node.
-- Another node in the cluster released enough release and pod get scheduled on that node instead.
+- Another node in the cluster released enough resources and pod get scheduled on that node instead.
 [Discuss] Should scheduler clear the `NominatedNode` in this case?
 - Resource cannot be released on the `NominatedNode` and no other candidate node could be found in the cluster, this will
 be covered by [issue 95752](https://github.com/kubernetes/kubernetes/issues/95752).
@@ -144,7 +143,7 @@ _This section must be completed when targeting alpha to a release._

 * **How can this feature be enabled / disabled in a live cluster?**
 - [x] Feature gate (also fill in values in `kep.yaml`)
-- Feature gate name: TryNominatedNodeFirst
+- Feature gate name: PreferNominatedNode
 - Components depending on the feature gate: kube-scheduler

 * **Are there any tests for feature enablement/disablement?**
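
The filtering behavior described under "Implementation Details" in the README diff above can be sketched roughly as follows. This is a minimal illustration and not the kube-scheduler code: `Node`, `Pod`, `fits`, and `findFeasibleNodes` are hypothetical stand-ins, and a single CPU check is assumed in place of the real filter plugins that `findNodesThatFitPod` runs.

```go
package main

import "fmt"

// Node and Pod are simplified, hypothetical stand-ins for the Kubernetes types.
type Node struct {
	Name    string
	FreeCPU int
}

type Pod struct {
	Name              string
	CPURequest        int
	NominatedNodeName string // mirrors .status.nominatedNodeName
}

// fits is an assumed single filter; the real scheduler runs a set of filter plugins.
func fits(p Pod, n Node) bool {
	return n.FreeCPU >= p.CPURequest
}

// findFeasibleNodes checks the nominated node first when the gate is enabled.
// If that node fits, it is returned alone; otherwise every node is evaluated
// as in the standard scheduling cycle.
func findFeasibleNodes(p Pod, nodes []Node, preferNominatedNode bool) []Node {
	if preferNominatedNode && p.NominatedNodeName != "" {
		for _, n := range nodes {
			if n.Name == p.NominatedNodeName && fits(p, n) {
				return []Node{n}
			}
		}
	}
	var feasible []Node
	for _, n := range nodes {
		if fits(p, n) {
			feasible = append(feasible, n)
		}
	}
	return feasible
}

func main() {
	nodes := []Node{{Name: "node-a", FreeCPU: 4}, {Name: "node-b", FreeCPU: 2}}
	pod := Pod{Name: "preemptor", CPURequest: 2, NominatedNodeName: "node-b"}

	for _, n := range findFeasibleNodes(pod, nodes, true) {
		fmt.Println("gate on:", n.Name)
	}
	for _, n := range findFeasibleNodes(pod, nodes, false) {
		fmt.Println("gate off:", n.Name)
	}
}
```

With the gate on, only the nominated node is returned when it fits, so the remaining nodes are never evaluated; with the gate off, or when the nominated node no longer fits, every node is filtered as before.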

keps/sig-scheduling/1923-try-nominated-node-first/kep.yaml renamed to keps/sig-scheduling/1923-prefer-nominated-node/kep.yaml

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-title: Try Nominated Node First
+title: Prefer Nominated Node
 kep-number: 1923
 authors:
 - "@chendave"
@@ -21,7 +21,7 @@ milestone:
 beta: "v1.22"
 stable: "v1.24"
 feature-gates:
-- name: TryNominatedNodeFirst
+- name: PreferNominatedNode
 components:
 - kube-scheduler
 disable-supported: true