@@ -28,23 +28,67 @@ large Kubernetes clusters.

{{% capture body %}}

- ## Percentage of Nodes to Score
-
- Before Kubernetes 1.12, Kube-scheduler used to check the feasibility of all
- nodes in a cluster and then scored the feasible ones. Kubernetes 1.12 added a
- new feature that allows the scheduler to stop looking for more feasible nodes
- once it finds a certain number of them. This improves the scheduler's
- performance in large clusters. The number is specified as a percentage of the
- cluster size. The percentage can be controlled by a configuration option called
- `percentageOfNodesToScore`. The range should be between 1 and 100. Larger values
- are considered as 100%. Zero is equivalent to not providing the config option.
- Kubernetes 1.14 has logic to find the percentage of nodes to score based on the
- size of the cluster if it is not specified in the configuration. It uses a
- linear formula which yields 50% for a 100-node cluster. The formula yields 10%
- for a 5000-node cluster. The lower bound for the automatic value is 5%. In other
- words, the scheduler always scores at least 5% of the cluster no matter how
- large the cluster is, unless the user provides the config option with a value
- smaller than 5.
+ In large clusters, you can tune the scheduler's behaviour, balancing
+ scheduling outcomes between latency (new Pods are placed quickly) and
+ accuracy (the scheduler rarely makes poor placement decisions).
+
+ You configure this tuning through the kube-scheduler setting
+ `percentageOfNodesToScore`. This KubeSchedulerConfiguration setting determines
+ a threshold for scheduling nodes in your cluster.
+
+ ### Setting the threshold
+
+ The `percentageOfNodesToScore` option accepts whole numeric values between 0
+ and 100. The value 0 is a special number which indicates that the kube-scheduler
+ should use its compiled-in default.
+ If you set `percentageOfNodesToScore` above 100, kube-scheduler acts as if you
+ had set a value of 100.
+
+ To change the value, edit the kube-scheduler configuration file (this is likely
+ to be `/etc/kubernetes/config/kube-scheduler.yaml`), then restart the scheduler.
+
+ After you have made this change, you can run
+ ```bash
+ kubectl get componentstatuses
+ ```
+ to verify that the kube-scheduler component is healthy. The output is similar to:
+ ```
+ NAME                 STATUS    MESSAGE   ERROR
+ controller-manager   Healthy   ok
+ scheduler            Healthy   ok
+ ...
+ ```
+
+ ## Node scoring threshold {#percentage-of-nodes-to-score}
+
+ To improve scheduling performance, the kube-scheduler can stop looking for
+ feasible nodes once it has found enough of them. In large clusters, this saves
+ time compared to a naive approach that would consider every node.
+
+ You specify a threshold for how many nodes are enough, as a whole-number percentage
+ of all the nodes in your cluster. The kube-scheduler converts this into an
+ integer number of nodes. During scheduling, if the kube-scheduler has identified
+ enough feasible nodes to exceed the configured percentage, the kube-scheduler
+ stops searching for more feasible nodes and moves on to the
+ [scoring phase](/docs/concepts/scheduling/kube-scheduler/#kube-scheduler-implementation).
+
+ [How the scheduler iterates over Nodes](#how-the-scheduler-iterates-over-nodes)
+ describes the process in detail.
+
+ ### Default threshold
+
+ If you don't specify a threshold, Kubernetes calculates a figure using a
+ linear formula that yields 50% for a 100-node cluster and yields 10%
+ for a 5000-node cluster. The lower bound for the automatic value is 5%.
+
+ This means that the kube-scheduler always scores at least 5% of your cluster no
+ matter how large the cluster is, unless you have explicitly set
+ `percentageOfNodesToScore` to be smaller than 5.
+
+ If you want the scheduler to score all nodes in your cluster, set
+ `percentageOfNodesToScore` to 100.
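
The default threshold described in the added text can be sketched in a few lines of Python. This is only an illustration, not the scheduler's code: the helper name `default_percentage` is hypothetical, and the exact formula (50 minus one percentage point per 125 nodes, using integer division) is an assumption chosen because it reproduces the two figures quoted, 50% at 100 nodes and 10% at 5000 nodes. The text itself only says the formula is linear.

```python
def default_percentage(num_nodes: int) -> int:
    """Hypothetical helper: automatic percentage of nodes to score."""
    # Assumed formula: start at 50% and subtract one point per 125 nodes
    # (integer division). It matches the 100-node and 5000-node examples.
    percentage = 50 - num_nodes // 125
    # The text states a lower bound of 5% for the automatic value.
    return max(percentage, 5)
```

Under these assumptions, a 10000-node cluster would fall to the 5% lower bound, because the unclamped formula gives a negative number.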
+
+ ## Example

Below is an example configuration that sets `percentageOfNodesToScore` to 50%.

@@ -59,39 +103,38 @@ algorithmSource:
percentageOfNodesToScore: 50
```

- {{< note >}} In clusters with less than 50 feasible nodes, the scheduler still
- checks all the nodes, simply because there are not enough feasible nodes to stop
- the scheduler's search early. {{< /note >}}
-
- **To disable this feature**, you can set `percentageOfNodesToScore` to 100.

- ### Tuning percentageOfNodesToScore
+ ## Tuning percentageOfNodesToScore

`percentageOfNodesToScore` must be a value between 1 and 100 with the default
value being calculated based on the cluster size. There is also a hardcoded
- minimum value of 50 nodes. This means that changing
- this option to lower values in clusters with several hundred nodes will not have
- much impact on the number of feasible nodes that the scheduler tries to find.
- This is intentional as this option is unlikely to improve performance noticeably
- in smaller clusters. In large clusters with over a 1000 nodes setting this value
- to lower numbers may show a noticeable performance improvement.
-
- An important note to consider when setting this value is that when a smaller
+ minimum value of 50 nodes.
+
+ {{< note >}}In clusters with fewer than 50 feasible nodes, the scheduler still
+ checks all the nodes, simply because there are not enough feasible nodes to stop
+ the scheduler's search early.
+
+ In a small cluster, if you set a low value for `percentageOfNodesToScore`, your
+ change will have little or no effect, for a similar reason.
+
+ If your cluster has several hundred Nodes or fewer, leave this configuration option
+ at its default value. Making changes is unlikely to improve the
+ scheduler's performance significantly.
+ {{< /note >}}
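
To see why low percentages have little effect in smaller clusters, here is a minimal Python sketch of how a percentage might be turned into a node count. It is not the scheduler's implementation: the helper name `feasible_nodes_to_find` is hypothetical, the rounding and the order of the checks are assumptions, and it expects an explicit percentage of 1 or more (the special value 0 is not handled). The 50-node minimum and the treatment of values above 100 come from the text.

```python
MIN_FEASIBLE_NODES = 50  # the hardcoded minimum stated in the text


def feasible_nodes_to_find(num_nodes: int, percentage: int) -> int:
    """Hypothetical helper: feasible nodes needed before the search stops."""
    # Values above 100 are treated as if 100 had been set.
    percentage = min(percentage, 100)
    # In a small cluster, or with a setting of 100, every node is checked.
    if num_nodes < MIN_FEASIBLE_NODES or percentage >= 100:
        return num_nodes
    # Convert the percentage to a whole number of nodes (assumed to round
    # down), never going below the hardcoded minimum.
    return max(num_nodes * percentage // 100, MIN_FEASIBLE_NODES)
```

Under these assumptions, `feasible_nodes_to_find(300, 10)` returns 50 rather than 30, while `feasible_nodes_to_find(5000, 10)` returns 500, so the setting only starts to matter once the cluster is large.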
+
+ An important detail to consider when setting this value is that when a smaller
number of nodes in a cluster are checked for feasibility, some nodes are not
sent to be scored for a given Pod. As a result, a Node which could possibly
score a higher value for running the given Pod might not even be passed to the
- scoring phase. This would result in a less than ideal placement of the Pod. For
- this reason, the value should not be set to very low percentages. A general rule
- of thumb is to never set the value to anything lower than 10. Lower values
- should be used only when the scheduler's throughput is critical for your
- application and the score of nodes is not important. In other words, you prefer
- to run the Pod on any Node as long as it is feasible.
-
- If your cluster has several hundred Nodes or fewer, we do not recommend lowering
- the default value of this configuration option. It is unlikely to improve the
- scheduler's performance significantly.
+ scoring phase. This would result in a less than ideal placement of the Pod.
+
+ You should avoid setting `percentageOfNodesToScore` very low so that kube-scheduler
+ does not make frequent, poor Pod placement decisions. Avoid setting the
+ percentage to anything below 10%, unless the scheduler's throughput is critical
+ for your application and the score of nodes is not important. In other words, you
+ prefer to run the Pod on any Node as long as it is feasible.

- ### How the scheduler iterates over Nodes
+ ## How the scheduler iterates over Nodes

This section is intended for those who want to understand the internal details
of this feature.