
Commit 1cccd3f

update content/zh/docs/concepts/scheduling-eviction/scheduler-perf-tuning.md
1 parent b53b8ae commit 1cccd3f

1 file changed: +148 -57 lines changed

content/zh/docs/concepts/scheduling-eviction/scheduler-perf-tuning.md

Lines changed: 148 additions & 57 deletions
@@ -18,11 +18,11 @@ weight: 70
{{< feature-state for_k8s_version="1.14" state="beta" >}}

<!--
-[kube-scheduler](/docs/concepts/scheduling/kube-scheduler/#kube-scheduler)
+[kube-scheduler](/docs/concepts/scheduling-eviction/kube-scheduler/#kube-scheduler)
is the Kubernetes default scheduler. It is responsible for placement of Pods
on Nodes in a cluster.
-->
-As the default scheduler for a Kubernetes cluster, kube-scheduler is mainly responsible for scheduling Pods onto the Nodes in the cluster.
+As the default scheduler for a Kubernetes cluster, [kube-scheduler](/docs/concepts/scheduling-eviction/kube-scheduler/#kube-scheduler) is mainly responsible for scheduling Pods onto the Nodes in the cluster.

<!--
Nodes in a cluster that meet the scheduling requirements of a Pod are
@@ -44,29 +44,118 @@ large Kubernetes clusters.

{{% capture body %}}

-<!--
-## Percentage of Nodes to Score
--->
-## Setting the percentage of the cluster's Nodes that enter the scoring phase
+<!--
+In large clusters, you can tune the scheduler's behaviour balancing
+scheduling outcomes between latency (new Pods are placed quickly) and
+accuracy (the scheduler rarely makes poor placement decisions).

-<!--
-Before Kubernetes 1.12, Kube-scheduler used to check the feasibility of all
-nodes in a cluster and then scored the feasible ones. Kubernetes 1.12 added a
-new feature that allows the scheduler to stop looking for more feasible nodes
-once it finds a certain number of them. This improves the scheduler's
-performance in large clusters. The number is specified as a percentage of the
-cluster size. The percentage can be controlled by a configuration option called
-`percentageOfNodesToScore`. The range should be between 1 and 100. Larger values
-are considered as 100%. Zero is equivalent to not providing the config option.
-Kubernetes 1.14 has logic to find the percentage of nodes to score based on the
-size of the cluster if it is not specified in the configuration. It uses a
-linear formula which yields 50% for a 100-node cluster. The formula yields 10%
-for a 5000-node cluster. The lower bound for the automatic value is 5%. In other
-words, the scheduler always scores at least 5% of the cluster no matter how
-large the cluster is, unless the user provides the config option with a value
-smaller than 5.
--->
-Before Kubernetes 1.12, kube-scheduler checked the feasibility of all nodes in the cluster and then scored the feasible ones. Kubernetes 1.12 added a feature that allows the scheduler to stop looking for more feasible nodes once it finds a certain number of them, which improves the scheduler's performance in large clusters. The number is specified as a percentage of the cluster size and is controlled by the `percentageOfNodesToScore` option. Its range is 1 to 100; larger values are treated as 100%, and 0 is equivalent to not providing the option. Since Kubernetes 1.14, when the option is not configured, the scheduler derives the percentage from the cluster size using a linear formula that yields 50% for a 100-node cluster and 10% for a 5000-node cluster. The lower bound for the automatic value is 5%; in other words, the scheduler always scores at least 5% of the cluster's nodes, no matter how large the cluster is, unless the user sets the option below 5.
+You configure this tuning setting via kube-scheduler setting
+`percentageOfNodesToScore`. This KubeSchedulerConfiguration setting determines
+a threshold for scheduling nodes in your cluster.
+-->
+In large clusters, you can tune the scheduler's behaviour to balance scheduling outcomes between latency (new Pods are placed quickly) and accuracy (the scheduler rarely makes poor placement decisions).
+
+You configure this tuning via the kube-scheduler setting `percentageOfNodesToScore`. This KubeSchedulerConfiguration setting determines a threshold for scheduling nodes in your cluster.
+
+<!--
+### Setting the threshold
+-->
+### Setting the threshold
+
+<!--
+The `percentageOfNodesToScore` option accepts whole numeric values between 0
+and 100. The value 0 is a special number which indicates that the kube-scheduler
+should use its compiled-in default.
+If you set `percentageOfNodesToScore` above 100, kube-scheduler acts as if you
+had set a value of 100.
+-->
+The `percentageOfNodesToScore` option accepts whole numeric values between 0 and 100. The value 0 is special: it indicates that kube-scheduler should use its compiled-in default.
+If you set `percentageOfNodesToScore` above 100, kube-scheduler acts as if you had set it to 100.
+
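To make the 0-means-default and above-100 rules concrete, here is a minimal Go sketch of how a configured value could be interpreted; the helper name `effectivePercentage` is hypothetical, for illustration only, and is not part of kube-scheduler:

```go
package main

import "fmt"

// effectivePercentage is a hypothetical helper illustrating the rules above:
// 0 means "use the compiled-in default", and any value above 100 behaves
// exactly like 100.
func effectivePercentage(configured, compiledInDefault int) int {
	if configured == 0 {
		return compiledInDefault // special value: fall back to the default
	}
	if configured > 100 {
		return 100 // values above 100 are treated as 100
	}
	return configured
}

func main() {
	fmt.Println(effectivePercentage(0, 50))   // 50 (uses the default)
	fmt.Println(effectivePercentage(150, 50)) // 100 (clamped)
	fmt.Println(effectivePercentage(30, 50))  // 30
}
```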
+<!--
+To change the value, edit the kube-scheduler configuration file (this is likely
+to be `/etc/kubernetes/config/kube-scheduler.yaml`), then restart the scheduler.
+-->
+To change the value, edit the kube-scheduler configuration file (this is likely to be `/etc/kubernetes/config/kube-scheduler.yaml`), then restart the scheduler.
+
+<!--
+After you have made this change, you can run
+-->
+After you have made this change, you can run
+```bash
+kubectl get componentstatuses
+```
+
+<!--
+to verify that the kube-scheduler component is healthy. The output is similar to:
+-->
+to verify that the kube-scheduler component is healthy. The output is similar to:
+```
+NAME                 STATUS    MESSAGE   ERROR
+controller-manager   Healthy   ok
+scheduler            Healthy   ok
+...
+```
+
+<!--
+## Node scoring threshold {#percentage-of-nodes-to-score}
+-->
+## Node scoring threshold {#percentage-of-nodes-to-score}
+
+<!--
+To improve scheduling performance, the kube-scheduler can stop looking for
+feasible nodes once it has found enough of them. In large clusters, this saves
+time compared to a naive approach that would consider every node.
+-->
+To improve scheduling performance, the kube-scheduler can stop looking for feasible nodes once it has found enough of them. In large clusters, this saves time compared to a naive approach that would consider every node.
+
+<!--
+You specify a threshold for how many nodes are enough, as a whole number percentage
+of all the nodes in your cluster. The kube-scheduler converts this into an
+integer number of nodes. During scheduling, if the kube-scheduler has identified
+enough feasible nodes to exceed the configured percentage, the kube-scheduler
+stops searching for more feasible nodes and moves on to the
+[scoring phase](/docs/concepts/scheduling-eviction/kube-scheduler/#kube-scheduler-implementation).
+-->
+You specify a threshold for how many nodes are enough, as a whole-number percentage of all the nodes in your cluster. The kube-scheduler converts this into an integer number of nodes. During scheduling, if
+the kube-scheduler has identified enough feasible nodes to exceed the configured percentage, it stops searching for more feasible nodes and moves on to the [scoring phase](/docs/concepts/scheduling-eviction/kube-scheduler/#kube-scheduler-implementation).
+
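As an illustration of the percentage-to-node-count conversion and the early stop, here is a minimal, self-contained Go sketch. The function `feasibleNodesToFind` and the loop are illustrative approximations under the behaviour described above, not kube-scheduler's actual implementation:

```go
package main

import "fmt"

// feasibleNodesToFind is a hypothetical sketch of how a configured whole-number
// percentage could be turned into an integer node count for one scheduling cycle.
func feasibleNodesToFind(numAllNodes, percentage int) int {
	return numAllNodes * percentage / 100 // integer division truncates
}

func main() {
	// With 1000 nodes and percentageOfNodesToScore: 30, filtering can stop
	// once 300 feasible nodes have been found.
	target := feasibleNodesToFind(1000, 30)
	fmt.Println(target) // 300

	feasible := 0
	for node := 0; node < 1000; node++ {
		// ... run the filter plugins against this node; assume it passes here ...
		feasible++
		if feasible >= target {
			break // enough feasible nodes; move on to the scoring phase
		}
	}
	fmt.Println(feasible) // 300: the remaining 700 nodes were never checked
}
```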
+<!--
+[How the scheduler iterates over Nodes](#how-the-scheduler-iterates-over-nodes)
+describes the process in detail.
+-->
+[How the scheduler iterates over Nodes](#how-the-scheduler-iterates-over-nodes) describes the process in detail.
+
+<!--
+### Default threshold
+-->
+### Default threshold
+
+<!--
+If you don't specify a threshold, Kubernetes calculates a figure using a
+linear formula that yields 50% for a 100-node cluster and yields 10%
+for a 5000-node cluster. The lower bound for the automatic value is 5%.
+-->
+If you don't specify a threshold, Kubernetes calculates a figure using a linear formula that yields 50% for a 100-node cluster and 10% for a 5000-node cluster.
+The lower bound for the automatic value is 5%.
+
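One linear formula consistent with both of the figures quoted above is `50 - numAllNodes/125`, floored at 5. The Go sketch below uses that formula for illustration; the constants 50 and 125 are an assumption inferred from the two data points, and the exact values live in kube-scheduler's source and may differ between releases:

```go
package main

import "fmt"

// defaultPercentage reconstructs the adaptive default for illustration:
// a linear formula that yields 50 for a 100-node cluster and 10 for a
// 5000-node cluster, with a lower bound of 5. The constants 50 and 125 are
// an assumption inferred from those two figures, not confirmed source code.
func defaultPercentage(numAllNodes int) int {
	p := 50 - numAllNodes/125
	if p < 5 {
		p = 5 // the scheduler always scores at least 5% of the cluster
	}
	return p
}

func main() {
	fmt.Println(defaultPercentage(100))   // 50
	fmt.Println(defaultPercentage(5000))  // 10
	fmt.Println(defaultPercentage(20000)) // 5 (floor applied)
}
```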
+<!--
+This means that the kube-scheduler always scores at least 5% of your cluster no
+matter how large the cluster is, unless you have explicitly set
+`percentageOfNodesToScore` to be smaller than 5.
+-->
+This means that the kube-scheduler scores at least 5% of the nodes in your cluster, no matter how large the cluster is, unless you have explicitly set `percentageOfNodesToScore` below 5.
+
+<!--
+If you want the scheduler to score all nodes in your cluster, set
+`percentageOfNodesToScore` to 100.
+-->
+If you want the scheduler to score all nodes in your cluster, set `percentageOfNodesToScore` to 100.
+
+<!--
+## Example
+-->
+## Example

<!--
Below is an example configuration that sets `percentageOfNodesToScore` to 50%.
@@ -84,18 +173,6 @@ algorithmSource:
percentageOfNodesToScore: 50
```

-<!--
-{{< note >}} In clusters with less than 50 feasible nodes, the scheduler still
-checks all the nodes, simply because there are not enough feasible nodes to stop
-the scheduler's search early. {{< /note >}}
--->
-{{< note >}} When there are fewer than 50 feasible nodes in the cluster, the scheduler still checks all the Nodes, because there are not enough feasible nodes to stop the scheduler's initial filtering early. {{< /note >}}
-
-<!--
-**To disable this feature**, you can set `percentageOfNodesToScore` to 100.
--->
-**To disable this feature**, you can set `percentageOfNodesToScore` to 100.
-
<!--
### Tuning percentageOfNodesToScore
-->
@@ -104,35 +181,49 @@ the scheduler's search early. {{< /note >}}
<!--
`percentageOfNodesToScore` must be a value between 1 and 100 with the default
value being calculated based on the cluster size. There is also a hardcoded
-minimum value of 50 nodes. This means that changing
-this option to lower values in clusters with several hundred nodes will not have
-much impact on the number of feasible nodes that the scheduler tries to find.
-This is intentional as this option is unlikely to improve performance noticeably
-in smaller clusters. In large clusters with over a 1000 nodes setting this value
-to lower numbers may show a noticeable performance improvement.
+minimum value of 50 nodes.
-->
-`percentageOfNodesToScore` must be a value between 1 and 100, and its default value is calculated based on the cluster size. In addition, a minimum of 50 nodes is hardcoded in the program. The effect is that in clusters of several hundred Nodes, setting this option to a low value barely changes the number of feasible nodes the scheduler tries to find. This is intentional: in smaller clusters the option is unlikely to improve performance noticeably, while in large clusters of over 1000 Nodes, setting it to a lower value can show a noticeable performance improvement.
+`percentageOfNodesToScore` must be a value between 1 and 100, and its default value is calculated based on the cluster size.
+There is also a hardcoded minimum value of 50 nodes.
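As a worked example of the 50-node minimum, with hypothetical numbers: in a 300-node cluster with `percentageOfNodesToScore: 10`, the percentage alone yields 300 * 10 / 100 = 30 nodes, but the hardcoded minimum raises the effective target to 50. A short Go sketch, again with an illustrative helper name rather than kube-scheduler's actual code:

```go
package main

import "fmt"

// applyFloor is a hypothetical helper showing how the hardcoded 50-node
// minimum interacts with the percentage-based target.
func applyFloor(numAllNodes, percentage int) int {
	numNodes := numAllNodes * percentage / 100
	if numNodes < 50 {
		numNodes = 50 // the hardcoded minimum dominates in smaller clusters
	}
	if numNodes > numAllNodes {
		numNodes = numAllNodes // never more nodes than the cluster has
	}
	return numNodes
}

func main() {
	fmt.Println(applyFloor(300, 10))  // 50, not 30: the floor applies
	fmt.Println(applyFloor(5000, 10)) // 500: the percentage dominates
	fmt.Println(applyFloor(40, 10))   // 40: fewer than 50 nodes, all are checked
}
```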

<!--
-An important note to consider when setting this value is that when a smaller
+{{< note >}} In clusters with less than 50 feasible nodes, the scheduler still
+checks all the nodes, simply because there are not enough feasible nodes to stop
+the scheduler's search early.
+
+In a small cluster, if you set a low value for `percentageOfNodesToScore`, your
+change will have no or little effect, for a similar reason.
+
+If your cluster has several hundred Nodes or fewer, leave this configuration option
+at its default value. Making changes is unlikely to improve the
+scheduler's performance significantly.
+{{< /note >}}
+-->
+{{< note >}}
+When there are fewer than 50 feasible nodes in the cluster, the scheduler still checks all the Nodes, because there are not enough feasible nodes to stop the scheduler's search early.
+
+For the same reason, in a small cluster, setting `percentageOfNodesToScore` to a low value has little or no effect.
+
+If your cluster has only a few hundred Nodes or fewer, keep this configuration option at its default value; changing it is unlikely to noticeably improve the scheduler's performance.
+
+{{< /note >}}
+
+<!--
+An important detail to consider when setting this value is that when a smaller
number of nodes in a cluster are checked for feasibility, some nodes are not
sent to be scored for a given Pod. As a result, a Node which could possibly
score a higher value for running the given Pod might not even be passed to the
-scoring phase. This would result in a less than ideal placement of the Pod. For
-this reason, the value should not be set to very low percentages. A general rule
-of thumb is to never set the value to anything lower than 10. Lower values
-should be used only when the scheduler's throughput is critical for your
-application and the score of nodes is not important. In other words, you prefer
-to run the Pod on any Node as long as it is feasible.
--->
-Note that with this parameter set, only a minority of the cluster's nodes may be selected as feasible, and many Nodes never enter the scoring phase. As a consequence, a Node that could have scored highly for a given Pod might not even be passed to the scoring phase. For this reason, the parameter should not be set to a very low value; the usual practice is to never set it below 10. Very low values are appropriate only when the scheduler's throughput is critical and the Nodes' scores are not important; in other words, only when you prefer to run the Pod on any feasible Node.
+scoring phase. This would result in a less than ideal placement of the Pod.

-<!--
-If your cluster has several hundred Nodes or fewer, we do not recommend lowering
-the default value of this configuration option. It is unlikely to improve the
-scheduler's performance significantly.
+You should avoid setting `percentageOfNodesToScore` very low so that kube-scheduler
+does not make frequent, poor Pod placement decisions. Avoid setting the
+percentage to anything below 10%, unless the scheduler's throughput is critical
+for your application and the score of nodes is not important. In other words, you
+prefer to run the Pod on any Node as long as it is feasible.
-->
-If your cluster has several hundred Nodes or fewer, we do not recommend lowering this configuration option below its default value; it is unlikely to improve the scheduler's performance significantly.
+Note that with this parameter set, only a minority of the cluster's nodes may be selected as feasible, and many Nodes never enter the scoring phase. As a consequence, a Node that could have scored highly for a given Pod might not even be passed to the scoring phase.
+
+For this reason, this parameter should not be set to a very low value; the usual practice is to never set it below 10. Very low values are appropriate only when the scheduler's throughput is critical for your application and the score of Nodes is not important. In other words, only when you prefer to run the Pod on any Node as long as it is feasible.

<!--
### How the scheduler iterates over Nodes
