
Commit 04ea603

Merge pull request #28176 from chenrui333/zh/resync-scheduling-eviction-files
zh: resync scheduling files
2 parents 1f70fdb + 6e1ac28 commit 04ea603

File tree: 7 files changed (+168, -94 lines changed)
Lines changed: 50 additions & 2 deletions
@@ -1,5 +1,53 @@
 ---
-title: 调度和驱逐 (Scheduling and Eviction)
+title: 调度,抢占和驱逐
 weight: 90
-description: 在Kubernetes中,调度 (Scheduling) 指的是确保 Pods 匹配到合适的节点,以便 kubelet 能够运行它们。驱逐 (Eviction) 是在资源匮乏的节点上,主动让一个或多个 Pods 失效的过程。
+content_type: concept
+description: >
+  在Kubernetes中,调度 (scheduling) 指的是确保 Pods 匹配到合适的节点,
+  以便 kubelet 能够运行它们。抢占 (Preemption) 指的是终止低优先级的 Pods 以便高优先级的 Pods 可以
+  调度运行的过程。驱逐 (Eviction) 是在资源匮乏的节点上,主动让一个或多个 Pods 失效的过程。
 ---
+
+<!--
+---
+title: "Scheduling, Preemption and Eviction"
+weight: 90
+content_type: concept
+description: >
+  In Kubernetes, scheduling refers to making sure that Pods are matched to Nodes
+  so that the kubelet can run them. Preemption is the process of terminating
+  Pods with lower Priority so that Pods with higher Priority can schedule on
+  Nodes. Eviction is the process of proactively terminating one or more Pods on
+  resource-starved Nodes.
+no_list: true
+---
+-->
+
+<!--
+In Kubernetes, scheduling refers to making sure that {{<glossary_tooltip text="Pods" term_id="pod">}}
+are matched to {{<glossary_tooltip text="Nodes" term_id="node">}} so that the
+{{<glossary_tooltip text="kubelet" term_id="kubelet">}} can run them. Preemption
+is the process of terminating Pods with lower {{<glossary_tooltip text="Priority" term_id="pod-priority">}}
+so that Pods with higher Priority can schedule on Nodes. Eviction is the process
+of terminating one or more Pods on Nodes.
+-->
+
+<!-- ## Scheduling -->
+
+## 调度
+
+* [Kubernetes 调度器](/zh/docs/concepts/scheduling-eviction/kube-scheduler/)
+* [将 Pods 指派到节点](/zh/docs/concepts/scheduling-eviction/assign-pod-node/)
+* [Pod 开销](/zh/docs/concepts/scheduling-eviction/pod-overhead/)
+* [污点和容忍](/zh/docs/concepts/scheduling-eviction/taint-and-toleration/)
+* [调度框架](/zh/docs/concepts/scheduling-eviction/scheduling-framework)
+* [调度器的性能调试](/zh/docs/concepts/scheduling-eviction/scheduler-perf-tuning/)
+* [扩展资源的资源装箱](/zh/docs/concepts/scheduling-eviction/resource-bin-packing/)
+
+<!-- ## Pod Disruption -->
+
+## Pod 干扰
+
+* [Pod 优先级和抢占](/zh/docs/concepts/scheduling-eviction/pod-priority-preemption/)
+* [节点压力驱逐](/zh/docs/concepts/scheduling-eviction/node-pressure-eviction/)
+* [API发起的驱逐](/zh/docs/concepts/scheduling-eviction/api-eviction/)

content/zh/docs/concepts/scheduling-eviction/assign-pod-node.md

Lines changed: 16 additions & 7 deletions
@@ -75,9 +75,9 @@ Run `kubectl get nodes` to get the names of your cluster's nodes. Pick out the o
 
 执行 `kubectl get nodes` 命令获取集群的节点名称。
 选择一个你要增加标签的节点,然后执行
-`kubectl label nodes <node-name> <label-key>=<label-value>` 
+`kubectl label nodes <node-name> <label-key>=<label-value>`
 命令将标签添加到你所选择的节点上。
-例如,如果你的节点名称为 'kubernetes-foo-node-1.c.a-robinson.internal' 
+例如,如果你的节点名称为 'kubernetes-foo-node-1.c.a-robinson.internal'
 并且想要的标签是 'disktype=ssd',则可以执行
 `kubectl label nodes kubernetes-foo-node-1.c.a-robinson.internal disktype=ssd` 命令。
 
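
For reference, a runnable version of the labeling step this hunk documents, using the example node name and label from the text above (a sketch, not part of the commit):

```bash
# Attach the example label to the example node, then confirm it is present.
kubectl label nodes kubernetes-foo-node-1.c.a-robinson.internal disktype=ssd
kubectl get nodes --show-labels | grep disktype
```
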
@@ -136,8 +136,18 @@ with a standard set of labels. See [Well-Known Labels, Annotations and Taints](/
 -->
 ## 插曲:内置的节点标签 {#built-in-node-labels}
 
-除了你[添加](#attach-labels-to-node)的标签外,节点还预先填充了一组标准标签。
-参见[常用标签、注解和污点](/zh/docs/reference/labels-annotations-taints/)。
+除了你[添加](#step-one-attach-label-to-the-node)的标签外,节点还预制了一组标准标签。
+参见这些[常用的标签,注解以及污点](/zh/docs/reference/labels-annotations-taints/):
+
+* [`kubernetes.io/hostname`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-hostname)
+* [`failure-domain.beta.kubernetes.io/zone`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#failure-domainbetakubernetesiozone)
+* [`failure-domain.beta.kubernetes.io/region`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#failure-domainbetakubernetesioregion)
+* [`topology.kubernetes.io/zone`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#topologykubernetesiozone)
+* [`topology.kubernetes.io/region`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#topologykubernetesiozone)
+* [`beta.kubernetes.io/instance-type`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#beta-kubernetes-io-instance-type)
+* [`node.kubernetes.io/instance-type`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#nodekubernetesioinstance-type)
+* [`kubernetes.io/os`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-os)
+* [`kubernetes.io/arch`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-arch)
 
 {{< note >}}
 <!--
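
To see which of the standard labels listed above a node already carries, one option (not part of the commit) is:

```bash
# Print every node with all of its labels; built-in labels such as
# kubernetes.io/hostname, kubernetes.io/os and kubernetes.io/arch
# appear alongside any labels you added yourself.
kubectl get nodes --show-labels
```
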
@@ -181,7 +191,7 @@ To make use of that label prefix for node isolation:
 For example, `example.com.node-restriction.kubernetes.io/fips=true` or `example.com.node-restriction.kubernetes.io/pci-dss=true`.
 -->
 1. 检查是否在使用 Kubernetes v1.11+,以便 NodeRestriction 功能可用。
-2. 确保你在使用[节点授权](/zh/docs/reference/access-authn-authz/node/)并且已经_启用_ 
+2. 确保你在使用[节点授权](/zh/docs/reference/access-authn-authz/node/)并且已经_启用_
 [NodeRestriction 准入插件](/zh/docs/reference/access-authn-authz/admission-controllers/#noderestriction)。
 3. 将 `node-restriction.kubernetes.io/` 前缀下的标签添加到 Node 对象,
 然后在节点选择器中使用这些标签。
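
A hypothetical sketch of step 3 above, using the `example.com.node-restriction.kubernetes.io/fips=true` label quoted in this hunk; the Pod name and image are illustrative and not part of the commit:

```yaml
# Assumes the node was labeled first, e.g.:
#   kubectl label nodes <node-name> example.com.node-restriction.kubernetes.io/fips=true
apiVersion: v1
kind: Pod
metadata:
  name: fips-only-workload
spec:
  # Select only nodes that carry the restricted-prefix label.
  nodeSelector:
    example.com.node-restriction.kubernetes.io/fips: "true"
  containers:
  - name: app
    image: nginx
```
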
@@ -574,7 +584,7 @@ must be satisfied for the pod to be scheduled onto a node.
 <!--
 Users can also select matching namespaces using `namespaceSelector`, which is a label query over the set of namespaces.
 The affinity term is applied to the union of the namespaces selected by `namespaceSelector` and the ones listed in the `namespaces` field.
-Note that an empty `namespaceSelector` ({}) matches all namespaces, while a null or empty `namespaces` list and 
+Note that an empty `namespaceSelector` ({}) matches all namespaces, while a null or empty `namespaces` list and
 null `namespaceSelector` means "this pod's namespace".
 -->
 用户也可以使用 `namespaceSelector` 选择匹配的名字空间,`namespaceSelector`
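
A minimal sketch of an affinity term using `namespaceSelector` as described in the hunk above; the label keys and values are illustrative, not from the commit:

```yaml
affinity:
  podAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: web
      # An empty selector ({}) would match all namespaces; leaving out both
      # `namespaces` and `namespaceSelector` means "this pod's namespace".
      namespaceSelector:
        matchLabels:
          team: frontend
      topologyKey: kubernetes.io/hostname
```
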
@@ -828,4 +838,3 @@ resource allocation decisions.
 一旦 Pod 分配给 节点,kubelet 应用将运行该 pod 并且分配节点本地资源。
 [拓扑管理器](/zh/docs/tasks/administer-cluster/topology-manager/)
 可以参与到节点级别的资源分配决定中。
-

content/zh/docs/concepts/scheduling-eviction/pod-overhead.md

Lines changed: 39 additions & 24 deletions
@@ -1,9 +1,21 @@
 ---
 title: Pod 开销
 content_type: concept
-weight: 20
+weight: 30
 ---
 
+<!--
+---
+reviewers:
+- dchen1107
+- egernst
+- tallclair
+title: Pod Overhead
+content_type: concept
+weight: 30
+---
+-->
+
 <!-- overview -->
 
 {{< feature-state for_k8s_version="v1.18" state="beta" >}}
@@ -58,7 +70,7 @@ across your cluster, and a `RuntimeClass` is utilized which defines the `overhea
 您需要确保在集群中启用了 `PodOverhead` [特性门控](/zh/docs/reference/command-line-tools-reference/feature-gates/)
 (在 1.18 默认是开启的),以及一个用于定义 `overhead` 字段的 `RuntimeClass`
 
-<!-- 
+<!--
 ## Usage example
 -->
 ## 使用示例
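
For context, the usage example this heading introduces defines a RuntimeClass whose `overhead` matches the values quoted later in this file (`kata-fc` handler, 120Mi memory, 250m CPU); a sketch, with the apiVersion as an assumption for the 1.18-era docs:

```yaml
apiVersion: node.k8s.io/v1beta1
kind: RuntimeClass
metadata:
  name: kata-fc
handler: kata-fc
overhead:
  # Fixed per-Pod overhead charged in addition to the container requests.
  podFixed:
    memory: "120Mi"
    cpu: "250m"
```
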
@@ -85,7 +97,7 @@ overhead:
 cpu: "250m"
 ```
 
-<!-- 
+<!--
 Workloads which are created which specify the `kata-fc` RuntimeClass handler will take the memory and
 cpu overheads into account for resource quota calculations, node scheduling, as well as Pod cgroup sizing.
 
@@ -119,7 +131,7 @@ spec:
 memory: 100Mi
 ```
 
-<!-- 
+<!--
 At admission time the RuntimeClass [admission controller](/docs/reference/access-authn-authz/admission-controllers/)
 updates the workload's PodSpec to include the `overhead` as described in the RuntimeClass. If the PodSpec already has this field defined,
 the Pod will be rejected. In the given example, since only the RuntimeClass name is specified, the admission controller mutates the Pod
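
For reference, the `test-pod` workload these hunks verify looks roughly like this, reassembled from the container limits (500m/100Mi and 1500m/100Mi) and RuntimeClass name quoted elsewhere in this file; container names and images are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  runtimeClassName: kata-fc
  containers:
  - name: busybox
    image: busybox:1.28
    command: ["sleep", "86400"]
    resources:
      limits:
        cpu: 500m
        memory: 100Mi
  - name: nginx
    image: nginx:1.14.2
    resources:
      limits:
        cpu: 1500m
        memory: 100Mi
```
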
@@ -129,7 +141,7 @@ to include an `overhead`.
 RuntimeClass 中定义的 `overhead`. 如果 PodSpec 中该字段已定义,该 Pod 将会被拒绝。
 在这个例子中,由于只指定了 RuntimeClass 名称,所以准入控制器更新了 Pod, 包含了一个 `overhead`.
 
-<!-- 
+<!--
 After the RuntimeClass admission controller, you can check the updated PodSpec:
 -->
 在 RuntimeClass 准入控制器之后,可以检验一下已更新的 PodSpec:
@@ -138,33 +150,33 @@ After the RuntimeClass admission controller, you can check the updated PodSpec:
 kubectl get pod test-pod -o jsonpath='{.spec.overhead}'
 ```
 
-<!-- 
+<!--
 The output is:
 -->
 输出:
 ```
 map[cpu:250m memory:120Mi]
 ```
 
-<!-- 
+<!--
 If a ResourceQuota is defined, the sum of container requests as well as the
 `overhead` field are counted.
 -->
 如果定义了 ResourceQuata, 则容器请求的总量以及 `overhead` 字段都将计算在内。
 
-<!-- 
+<!--
 When the kube-scheduler is deciding which node should run a new Pod, the scheduler considers that Pod's
 `overhead` as well as the sum of container requests for that Pod. For this example, the scheduler adds the
 requests and the overhead, then looks for a node that has 2.25 CPU and 320 MiB of memory available.
 -->
 当 kube-scheduler 决定在哪一个节点调度运行新的 Pod 时,调度器会兼顾该 Pod 的 `overhead` 以及该 Pod 的容器请求总量。在这个示例中,调度器将资源请求和开销相加,然后寻找具备 2.25 CPU 和 320 MiB 内存可用的节点。
 
-<!-- 
+<!--
 Once a Pod is scheduled to a node, the kubelet on that node creates a new {{< glossary_tooltip text="cgroup" term_id="cgroup" >}}
 for the Pod. It is within this pod that the underlying container runtime will create containers. -->
 一旦 Pod 调度到了某个节点, 该节点上的 kubelet 将为该 Pod 新建一个 {{< glossary_tooltip text="cgroup" term_id="cgroup" >}}. 底层容器运行时将在这个 pod 中创建容器。
 
-<!-- 
+<!--
 If the resource has a limit defined for each container (Guaranteed QoS or Bustrable QoS with limits defined),
 the kubelet will set an upper limit for the pod cgroup associated with that resource (cpu.cfs_quota_us for CPU
 and memory.limit_in_bytes memory). This upper limit is based on the sum of the container limits plus the `overhead`
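
The figures verified in the next hunk follow directly from the example above: container limits plus the RuntimeClass `overhead`.

```
CPU:    500m  + 1500m +  250m = 2250m
memory: 100Mi + 100Mi + 120Mi = 320Mi
```
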
@@ -179,31 +191,31 @@ requests plus the `overhead` defined in the PodSpec.
 -->
 对于 CPU, 如果 Pod 的 QoS 是 Guaranteed 或者 Burstable, kubelet 会基于容器请求总量与 PodSpec 中定义的 `overhead` 之和设置 `cpu.shares`.
 
-<!-- 
+<!--
 Looking at our example, verify the container requests for the workload:
 -->
 请看这个例子,验证工作负载的容器请求:
 ```bash
 kubectl get pod test-pod -o jsonpath='{.spec.containers[*].resources.limits}'
 ```
 
-<!-- 
+<!--
 The total container requests are 2000m CPU and 200MiB of memory:
 -->
 容器请求总计 2000m CPU 和 200MiB 内存:
 ```
 map[cpu: 500m memory:100Mi] map[cpu:1500m memory:100Mi]
 ```
 
-<!-- 
+<!--
 Check this against what is observed by the node:
 -->
 对照从节点观察到的情况来检查一下:
 ```bash
 kubectl describe node | grep test-pod -B2
 ```
 
-<!-- 
+<!--
 The output shows 2250m CPU and 320MiB of memory are requested, which includes PodOverhead:
 -->
 该输出显示请求了 2250m CPU 以及 320MiB 内存,包含了 PodOverhead 在内:
@@ -226,8 +238,9 @@ cgroups directly on the node.
 
 First, on the particular node, determine the Pod identifier:
 -->
-在工作负载所运行的节点上检查 Pod 的内存 cgroups. 在接下来的例子中,将在该节点上使用具备 CRI 兼容的容器运行时命令行工具 [`crictl`](https://github.com/kubernetes-sigs/cri-tools/blob/master/docs/crictl.md).
-这是一个展示 PodOverhead 行为的进阶示例,用户并不需要直接在该节点上检查 cgroups.
+在工作负载所运行的节点上检查 Pod 的内存 cgroups. 在接下来的例子中,
+将在该节点上使用具备 CRI 兼容的容器运行时命令行工具
+[`crictl`](https://github.com/kubernetes-sigs/cri-tools/blob/master/docs/crictl.md)
 
 首先在特定的节点上确定该 Pod 的标识符:
 
@@ -240,7 +253,7 @@ First, on the particular node, determine the Pod identifier:
 POD_ID="$(sudo crictl pods --name test-pod -q)"
 ```
 
-<!-- 
+<!--
 From this, you can determine the cgroup path for the Pod:
 -->
 可以依此判断该 Pod 的 cgroup 路径:
@@ -254,15 +267,15 @@ From this, you can determine the cgroup path for the Pod:
 sudo crictl inspectp -o=json $POD_ID | grep cgroupsPath
 ```
 
-<!-- 
+<!--
 The resulting cgroup path includes the Pod's `pause` container. The Pod level cgroup is one directory above.
 -->
 执行结果的 cgroup 路径中包含了该 Pod 的 `pause` 容器。Pod 级别的 cgroup 即上面的一个目录。
 ```
 "cgroupsPath": "/kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2/7ccf55aee35dd16aca4189c952d83487297f3cd760f1bbf09620e206e7d0c27a"
 ```
 
-<!-- 
+<!--
 In this specific case, the pod cgroup path is `kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2`. Verify the Pod level cgroup setting for memory:
 -->
 在这个例子中,该 pod 的 cgroup 路径是 `kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2`。验证内存的 Pod 级别 cgroup 设置:
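
For reference, the byte value checked in the next hunk is just 320 MiB expressed in bytes:

```bash
echo $((320 * 1024 * 1024))   # prints 335544320
```
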
@@ -278,15 +291,15 @@ In this specific case, the pod cgroup path is `kubepods/podd7f4b509-cf94-4951-94
 cat /sys/fs/cgroup/memory/kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2/memory.limit_in_bytes
 ```
 
-<!-- 
+<!--
 This is 320 MiB, as expected:
 -->
 和预期的一样是 320 MiB
 ```
 335544320
 ```
 
-<!-- 
+<!--
 ### Observability
 -->
 ### 可观察性
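
A hypothetical way to look for the `kube_pod_overhead` metric discussed in the next hunk, assuming a kube-state-metrics build that already exposes it is reachable as a Service named `kube-state-metrics` in `kube-system` (the service name and namespace are assumptions, not from the commit):

```bash
kubectl -n kube-system port-forward svc/kube-state-metrics 8080:8080 &
curl -s localhost:8080/metrics | grep kube_pod_overhead
```
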
@@ -298,8 +311,11 @@ running with a defined Overhead. This functionality is not available in the 1.9
 kube-state-metrics, but is expected in a following release. Users will need to build kube-state-metrics
 from source in the meantime.
 -->
-[kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) 中可以通过 `kube_pod_overhead` 指标来协助确定何时使用 PodOverhead 以及协助观察以一个既定开销运行的工作负载的稳定性。
-该特性在 kube-state-metrics 的 1.9 发行版本中不可用,不过预计将在后续版本中发布。在此之前,用户需要从源代码构建 kube-state-metrics.
+[kube-state-metrics](https://github.com/kubernetes/kube-state-metrics) 中可以通过
+`kube_pod_overhead` 指标来协助确定何时使用 PodOverhead 以及协助观察以一个既定
+开销运行的工作负载的稳定性。
+该特性在 kube-state-metrics 的 1.9 发行版本中不可用,不过预计将在后续版本中发布。
+在此之前,用户需要从源代码构建 kube-state-metrics。
 
 ## {{% heading "whatsnext" %}}
 
@@ -310,4 +326,3 @@
 
 * [RuntimeClass](/zh/docs/concepts/containers/runtime-class/)
 * [PodOverhead 设计](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/688-pod-overhead)
-

content/zh/docs/concepts/scheduling-eviction/resource-bin-packing.md

Lines changed: 7 additions & 6 deletions
@@ -1,24 +1,26 @@
 ---
 title: 扩展资源的资源装箱
 content_type: concept
-weight: 30
+weight: 80
 ---
 <!--
+---
 reviewers:
 - bsalamat
 - k82cn
 - ahg-g
 title: Resource Bin Packing for Extended Resources
 content_type: concept
-weight: 30
+weight: 80
+---
 -->
 
 <!-- overview -->
 
 {{< feature-state for_k8s_version="1.16" state="alpha" >}}
 
 <!--
-The kube-scheduler can be configured to enable bin packing of resources along with extended resources using `RequestedToCapacityRatioResourceAllocation` priority function. Priority functions can be used to fine-tune the kube-scheduler as per custom needs. 
+The kube-scheduler can be configured to enable bin packing of resources along with extended resources using `RequestedToCapacityRatioResourceAllocation` priority function. Priority functions can be used to fine-tune the kube-scheduler as per custom needs.
 -->
 
 使用 `RequestedToCapacityRatioResourceAllocation` 优先级函数,可以将 kube-scheduler
@@ -48,7 +50,7 @@ Kubernetes 1.16 在优先级函数中添加了一个新参数,该参数允许
 (least requested)或
 最多请求(most requested)计算。
 `resources` 包含由 `name` 和 `weight` 组成,`name` 指定评分时要考虑的资源,
-`weight` 指定每种资源的权重。 
+`weight` 指定每种资源的权重。
 
 <!--
 Below is an example configuration that sets `requestedToCapacityRatioArguments` to bin packing behavior for extended resources `intel.com/foo` and `intel.com/bar`
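
The example configuration referred to above sits outside this hunk's changed lines; a sketch of the arguments it sets, using the utilization/score points quoted in the next hunk (resource names from the text, weights illustrative):

```yaml
requestedToCapacityRatioArguments:
  # Score 0 at 0% utilization and 10 at 100% utilization favors bin packing.
  shape:
  - utilization: 0
    score: 0
  - utilization: 100
    score: 10
  # Extended resources to consider, each with its own weight.
  resources:
  - name: intel.com/foo
    weight: 3
  - name: intel.com/bar
    weight: 5
```
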
@@ -130,7 +132,7 @@ The above arguments give the node a score of 0 if utilization is 0% and 10 for u
 ```
 
 <!--
-It can be used to add extended resources as follows: 
+It can be used to add extended resources as follows:
 -->
 它可以用来添加扩展资源,如下所示:
 
@@ -249,4 +251,3 @@ CPU = resourceScoringFunction((2+6),8)
 NodeScore = (5 * 5) + (7 * 1) + (10 * 3) / (5 + 1 + 3)
 = 7
 ```
-

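
Worked out, the node score above reads as the weighted sum of the per-resource scores divided by the total weight:

```
NodeScore = (5*5 + 7*1 + 10*3) / (5 + 1 + 3)
          = (25 + 7 + 30) / 9
          = 62 / 9
          ≈ 7
```
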
0 commit comments
