
Commit baf8b82

Merge pull request #18543 from sftim/20200109_revise_percentage_of_nodes_to_score
Revise “Scheduler Performance Tuning”
2 parents 30f77cb + 828574c commit baf8b82

File tree: 1 file changed (+85, −42 lines)


content/en/docs/concepts/scheduling/scheduler-perf-tuning.md

Lines changed: 85 additions & 42 deletions
@@ -28,23 +28,67 @@ large Kubernetes clusters.
 
 {{% capture body %}}
 
-## Percentage of Nodes to Score
-
-Before Kubernetes 1.12, Kube-scheduler used to check the feasibility of all
-nodes in a cluster and then scored the feasible ones. Kubernetes 1.12 added a
-new feature that allows the scheduler to stop looking for more feasible nodes
-once it finds a certain number of them. This improves the scheduler's
-performance in large clusters. The number is specified as a percentage of the
-cluster size. The percentage can be controlled by a configuration option called
-`percentageOfNodesToScore`. The range should be between 1 and 100. Larger values
-are considered as 100%. Zero is equivalent to not providing the config option.
-Kubernetes 1.14 has logic to find the percentage of nodes to score based on the
-size of the cluster if it is not specified in the configuration. It uses a
-linear formula which yields 50% for a 100-node cluster. The formula yields 10%
-for a 5000-node cluster. The lower bound for the automatic value is 5%. In other
-words, the scheduler always scores at least 5% of the cluster no matter how
-large the cluster is, unless the user provides the config option with a value
-smaller than 5.
+In large clusters, you can tune the scheduler's behaviour to balance
+scheduling outcomes between latency (new Pods are placed quickly) and
+accuracy (the scheduler rarely makes poor placement decisions).
+
+You configure this tuning with the kube-scheduler setting
+`percentageOfNodesToScore`. This KubeSchedulerConfiguration setting determines
+a threshold for scheduling nodes in your cluster.
+
+### Setting the threshold
+
+The `percentageOfNodesToScore` option accepts whole numbers between 0
+and 100. The value 0 is special: it indicates that the kube-scheduler
+should use its compiled-in default.
+If you set `percentageOfNodesToScore` above 100, kube-scheduler acts as if you
+had set a value of 100.
+
+To change the value, edit the kube-scheduler configuration file (this is likely
+to be `/etc/kubernetes/config/kube-scheduler.yaml`), then restart the scheduler.
+
+After you have made this change, you can run
+```bash
+kubectl get componentstatuses
+```
+to verify that the kube-scheduler component is healthy. The output is similar to:
+```
+NAME                 STATUS    MESSAGE   ERROR
+controller-manager   Healthy   ok
+scheduler            Healthy   ok
+...
+```
+
+## Node scoring threshold {#percentage-of-nodes-to-score}
+
+To improve scheduling performance, the kube-scheduler can stop looking for
+feasible nodes once it has found enough of them. In large clusters, this saves
+time compared to a naive approach that would consider every node.
+
+You specify a threshold for how many nodes are enough, as a whole-number percentage
+of all the nodes in your cluster. The kube-scheduler converts this into an
+integer number of nodes. During scheduling, if the kube-scheduler has identified
+enough feasible nodes to exceed the configured percentage, the kube-scheduler
+stops searching for more feasible nodes and moves on to the
+[scoring phase](/docs/concepts/scheduling/kube-scheduler/#kube-scheduler-implementation).
+
+[How the scheduler iterates over Nodes](#how-the-scheduler-iterates-over-nodes)
+describes the process in detail.
+
+### Default threshold
+
+If you don't specify a threshold, Kubernetes calculates a figure using a
+linear formula that yields 50% for a 100-node cluster and yields 10%
+for a 5000-node cluster. The lower bound for the automatic value is 5%.
+
+This means that the kube-scheduler always scores at least 5% of your cluster, no
+matter how large the cluster is, unless you have explicitly set
+`percentageOfNodesToScore` to a value smaller than 5.
+
+If you want the scheduler to score all nodes in your cluster, set
+`percentageOfNodesToScore` to 100.
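
As a rough illustration of the behaviour described above, the following Go sketch shows one way the configured value could be interpreted: 0 falls back to a cluster-size-based default, values above 100 are treated as 100, and the automatic default follows a linear formula with a 5% floor. The function names and the exact constants here are illustrative; they are chosen to match the figures quoted in the documentation (50% at 100 nodes, 10% at 5000 nodes), not taken from the kube-scheduler source.

```go
package main

import "fmt"

// Illustrative only: these names and constants are not taken from the
// kube-scheduler source. They reproduce the behaviour described in the
// documentation: values above 100 act like 100, 0 means "use the default",
// and the default follows a linear formula that gives 50% for a 100-node
// cluster, 10% for a 5000-node cluster, and never drops below 5%.
const (
	maxPercentage        = 100
	minDefaultPercentage = 5
)

// defaultPercentageForClusterSize models the automatic threshold.
// 50 - numNodes/125 passes through the two data points quoted in the
// documentation; the real implementation may use different constants.
func defaultPercentageForClusterSize(numNodes int) int {
	p := 50 - numNodes/125
	if p < minDefaultPercentage {
		p = minDefaultPercentage
	}
	return p
}

// effectivePercentage applies the documented interpretation of
// percentageOfNodesToScore (0, or unset, means the compiled-in default).
func effectivePercentage(configured, numNodes int) int {
	switch {
	case configured <= 0:
		return defaultPercentageForClusterSize(numNodes)
	case configured > maxPercentage:
		return maxPercentage
	default:
		return configured
	}
}

func main() {
	for _, tc := range []struct{ configured, nodes int }{
		{0, 100},   // unset: automatic default, 50%
		{0, 5000},  // unset: automatic default, 10%
		{0, 20000}, // unset: clamped to the 5% floor
		{150, 500}, // above 100 behaves like 100
		{50, 500},  // explicit value used as-is
	} {
		fmt.Printf("configured=%d nodes=%d -> %d%%\n",
			tc.configured, tc.nodes, effectivePercentage(tc.configured, tc.nodes))
	}
}
```
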
+
+## Example
 
 Below is an example configuration that sets `percentageOfNodesToScore` to 50%.

@@ -59,39 +103,38 @@ algorithmSource:
 percentageOfNodesToScore: 50
 ```
 
-{{< note >}} In clusters with less than 50 feasible nodes, the scheduler still
-checks all the nodes, simply because there are not enough feasible nodes to stop
-the scheduler's search early. {{< /note >}}
-
-**To disable this feature**, you can set `percentageOfNodesToScore` to 100.
 
-### Tuning percentageOfNodesToScore
+## Tuning percentageOfNodesToScore
 
 `percentageOfNodesToScore` must be a value between 1 and 100 with the default
 value being calculated based on the cluster size. There is also a hardcoded
-minimum value of 50 nodes. This means that changing
-this option to lower values in clusters with several hundred nodes will not have
-much impact on the number of feasible nodes that the scheduler tries to find.
-This is intentional as this option is unlikely to improve performance noticeably
-in smaller clusters. In large clusters with over a 1000 nodes setting this value
-to lower numbers may show a noticeable performance improvement.
-
-An important note to consider when setting this value is that when a smaller
+minimum value of 50 nodes.
+
+{{< note >}}In clusters with fewer than 50 feasible nodes, the scheduler still
+checks all the nodes, simply because there are not enough feasible nodes to stop
+the scheduler's search early.
+
+In a small cluster, if you set a low value for `percentageOfNodesToScore`, your
+change will have little or no effect, for a similar reason.
+
+If your cluster has several hundred Nodes or fewer, leave this configuration option
+at its default value. Making changes is unlikely to improve the
+scheduler's performance significantly.
+{{< /note >}}
+
+An important detail to consider when setting this value is that when a smaller
 number of nodes in a cluster are checked for feasibility, some nodes are not
 sent to be scored for a given Pod. As a result, a Node which could possibly
 score a higher value for running the given Pod might not even be passed to the
-scoring phase. This would result in a less than ideal placement of the Pod. For
-this reason, the value should not be set to very low percentages. A general rule
-of thumb is to never set the value to anything lower than 10. Lower values
-should be used only when the scheduler's throughput is critical for your
-application and the score of nodes is not important. In other words, you prefer
-to run the Pod on any Node as long as it is feasible.
-
-If your cluster has several hundred Nodes or fewer, we do not recommend lowering
-the default value of this configuration option. It is unlikely to improve the
-scheduler's performance significantly.
+scoring phase. This would result in a less than ideal placement of the Pod.
+
+You should avoid setting `percentageOfNodesToScore` very low, so that kube-scheduler
+does not make frequent, poor Pod placement decisions. Avoid setting the
+percentage to anything below 10%, unless the scheduler's throughput is critical
+for your application and the score of nodes is not important. In other words, you
+prefer to run the Pod on any Node as long as it is feasible.
 
-### How the scheduler iterates over Nodes
+## How the scheduler iterates over Nodes
 
 This section is intended for those who want to understand the internal details
 of this feature.
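
Taken together, the percentage threshold and the hardcoded 50-node minimum amount to an early-stop rule for the feasibility search. The Go sketch below illustrates that rule; the helper names are made up for this example, and the constants follow the figures given in the text rather than the actual kube-scheduler implementation.

```go
package main

import "fmt"

// Illustrative sketch, not the kube-scheduler source: it shows how a
// percentage threshold can be turned into an early-stop rule when looking
// for feasible nodes, including the 50-node minimum mentioned in the
// tuning section above.
const minFeasibleNodesToFind = 50 // figure quoted in the documentation

// numFeasibleNodesToFind converts percentageOfNodesToScore into a target
// number of feasible nodes for a cluster of the given size.
func numFeasibleNodesToFind(numAllNodes, percentage int) int {
	if numAllNodes <= minFeasibleNodesToFind || percentage >= 100 {
		return numAllNodes
	}
	n := numAllNodes * percentage / 100
	if n < minFeasibleNodesToFind {
		return minFeasibleNodesToFind
	}
	return n
}

// findFeasibleNodes walks the node list and stops as soon as it has found
// enough feasible nodes; the remaining nodes are never sent to scoring.
func findFeasibleNodes(nodes []string, feasible func(string) bool, percentage int) []string {
	target := numFeasibleNodesToFind(len(nodes), percentage)
	found := make([]string, 0, target)
	for _, node := range nodes {
		if feasible(node) {
			found = append(found, node)
			if len(found) >= target {
				break // enough candidates: move on to the scoring phase
			}
		}
	}
	return found
}

func main() {
	// Hypothetical 1000-node cluster where every node happens to be feasible.
	nodes := make([]string, 1000)
	for i := range nodes {
		nodes[i] = fmt.Sprintf("node-%d", i)
	}
	alwaysFeasible := func(string) bool { return true }

	// With a threshold of 30%, only 300 nodes go on to scoring.
	fmt.Println(len(findFeasibleNodes(nodes, alwaysFeasible, 30))) // 300
	// In a 40-node cluster, every node is still checked (below the 50-node minimum).
	fmt.Println(len(findFeasibleNodes(nodes[:40], alwaysFeasible, 30))) // 40
}
```
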
