Commit 44530d1

Merge pull request #49305 from windsonsea/nodewn
[zh] Sync cluster-administration/node-shutdown.md
2 parents 3bb08d3 + c991586 commit 44530d1

1 file changed: 78 additions and 70 deletions

content/zh-cn/docs/concepts/cluster-administration/node-shutdown.md

Lines changed: 78 additions & 70 deletions
@@ -12,7 +12,7 @@ weight: 10
 <!-- overview -->
 <!--
 In a Kubernetes cluster, a {{< glossary_tooltip text="node" term_id="node" >}}
-can be shutdown in a planned graceful way or unexpectedly because of reasons such
+can be shut down in a planned graceful way or unexpectedly because of reasons such
 as a power outage or something else external. A node shutdown could lead to workload
 failure if the node is not drained before the shutdown. A node shutdown can be
 either **graceful** or **non-graceful**.
@@ -33,7 +33,7 @@ either **graceful** or **non-graceful**.
 <!--
 The kubelet attempts to detect node system shutdown and terminates pods running on the node.

-Kubelet ensures that pods follow the normal
+kubelet ensures that pods follow the normal
 [pod termination process](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
 during the node shutdown. During node shutdown, the kubelet does not accept new
 Pods (even if those Pods are already bound to the node).
@@ -45,7 +45,7 @@ kubelet 会尝试检测节点系统关闭事件并终止在节点上运行的所
 且不接受新的 Pod(即使这些 Pod 已经绑定到该节点)。

 <!--
-The Graceful node shutdown feature depends on systemd since it takes advantage of
+The graceful node shutdown feature depends on systemd since it takes advantage of
 [systemd inhibitor locks](https://www.freedesktop.org/wiki/Software/systemd/inhibit/) to
 delay the node shutdown with a given duration.
 -->
@@ -64,23 +64,23 @@ enabled by default in 1.21.

 <!--
 Note that by default, both configuration options described below,
-`shutdownGracePeriod` and `shutdownGracePeriodCriticalPods` are set to zero,
+`shutdownGracePeriod` and `shutdownGracePeriodCriticalPods`, are set to zero,
 thus not activating the graceful node shutdown functionality.
-To activate the feature, the two kubelet config settings should be configured appropriately and
+To activate the feature, both options should be configured appropriately and
 set to non-zero values.
 -->
 注意,默认情况下,下面描述的两个配置选项,`shutdownGracePeriod`
 `shutdownGracePeriodCriticalPods` 都是被设置为 0 的,因此不会激活节点体面关闭特性。
-要激活此功能特性,这两个 kubelet 配置选项要适当配置,并设置为非零值。
+要激活此功能特性,这两个选项要适当配置,并设置为非零值。

 <!--
-Once systemd detects or notifies node shutdown, the kubelet sets a `NotReady` condition on
+Once systemd detects or is notified of a node shutdown, the kubelet sets a `NotReady` condition on
 the Node, with the `reason` set to `"node is shutting down"`. The kube-scheduler honors this condition
 and does not schedule any Pods onto the affected node; other third-party schedulers are
 expected to follow the same logic. This means that new Pods won't be scheduled onto that node
 and therefore none will start.
 -->
-一旦 systemd 检测到或通知节点关闭,kubelet 就会在节点上设置一个
+一旦 systemd 检测到或收到节点关闭的通知,kubelet 就会在节点上设置一个
 `NotReady` 状况,并将 `reason` 设置为 `"node is shutting down"`
 kube-scheduler 会重视此状况,不将 Pod 调度到受影响的节点上;
 其他第三方调度程序也应当遵循相同的逻辑。这意味着新的 Pod 不会被调度到该节点上,
@@ -97,17 +97,17 @@ node shutdown has been detected, so that even Pods with a
 的{{< glossary_tooltip text="容忍度" term_id="toleration" >}},也不会在此节点上启动。

 <!--
-At the same time when kubelet is setting that condition on its Node via the API, the kubelet also begins
-terminating any Pods that are running locally.
+When kubelet is setting that condition on its Node via the API,
+the kubelet also begins terminating any Pods that are running locally.
 -->
-同时,当 kubelet 通过 API 在其 Node 上设置该状况时,kubelet
+当 kubelet 通过 API 在其 Node 上设置该状况时,kubelet
 也开始终止在本地运行的所有 Pod。

 <!--
 During a graceful shutdown, kubelet terminates pods in two phases:

 1. Terminate regular pods running on the node.
-2. Terminate [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
+1. Terminate [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
 running on the node.
 -->
 在体面关闭过程中,kubelet 分两个阶段来终止 Pod:
@@ -116,34 +116,42 @@ During a graceful shutdown, kubelet terminates pods in two phases:
 2. 终止在节点上运行的[关键 Pod](/zh-cn/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)

 <!--
-Graceful node shutdown feature is configured with two
+The graceful node shutdown feature is configured with two
 [`KubeletConfiguration`](/docs/tasks/administer-cluster/kubelet-config-file/) options:
-* `shutdownGracePeriod`:
-* Specifies the total duration that the node should delay the shutdown by. This is the total
-grace period for pod termination for both regular and
-[critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical).
-* `shutdownGracePeriodCriticalPods`:
-* Specifies the duration used to terminate
-[critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
-during a node shutdown. This value should be less than `shutdownGracePeriod`.
 -->
 节点体面关闭的特性对应两个
 [`KubeletConfiguration`](/zh-cn/docs/tasks/administer-cluster/kubelet-config-file/) 选项:

-* `shutdownGracePeriod`
-* 指定节点应延迟关闭的总持续时间。这是 Pod 体面终止的时间总和,不区分常规 Pod
-还是[关键 Pod](/zh-cn/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
-* `shutdownGracePeriodCriticalPods`
-* 在节点关闭期间指定用于终止[关键 Pod](/zh-cn/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
-的持续时间。该值应小于 `shutdownGracePeriod`
+<!--
+- `shutdownGracePeriod`:
+
+Specifies the total duration that the node should delay the shutdown by. This is the total
+grace period for pod termination for both regular and
+[critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical).
+-->
+- `shutdownGracePeriod`
+
+指定节点应延迟关闭的总持续时间。这是 Pod 体面终止的时间总和,不区分常规 Pod
+还是[关键 Pod](/zh-cn/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
+
+<!--
+- `shutdownGracePeriodCriticalPods`:
+
+Specifies the duration used to terminate
+[critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
+during a node shutdown. This value should be less than `shutdownGracePeriod`.
+-->
+- `shutdownGracePeriodCriticalPods`
+
+在节点关闭期间指定用于终止[关键 Pod](/zh-cn/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
+的持续时间。该值应小于 `shutdownGracePeriod`
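
Reviewer note: the relationship between the two options documented in this hunk can be sketched as below. This is only an illustration, not kubelet code. The function name `split_shutdown_window` is hypothetical, and the subtraction (regular-pod window = `shutdownGracePeriod` minus `shutdownGracePeriodCriticalPods`) is an assumption drawn from the statement that the first value is the total for both regular and critical pods. Treating a zero value as "feature inactive" follows the note that both options default to zero, and raising an error when the critical window is not smaller than the total is a choice of this sketch.

```python
from datetime import timedelta
from typing import Optional, Tuple


def split_shutdown_window(
    shutdown_grace_period: timedelta,
    critical_pods_grace_period: timedelta,
) -> Optional[Tuple[timedelta, timedelta]]:
    """Return (regular_window, critical_window), or None if the feature is inactive.

    Hypothetical helper: it mirrors the documented meaning of the two
    KubeletConfiguration options, not the kubelet's actual implementation.
    """
    zero = timedelta(0)
    if shutdown_grace_period < zero or critical_pods_grace_period < zero:
        raise ValueError("grace periods must not be negative")
    # Both options default to zero, which leaves graceful shutdown inactive.
    if shutdown_grace_period == zero or critical_pods_grace_period == zero:
        return None
    # The docs say the critical-pod value should be less than the total.
    if critical_pods_grace_period >= shutdown_grace_period:
        raise ValueError(
            "shutdownGracePeriodCriticalPods should be less than shutdownGracePeriod"
        )
    # Assumption: regular pods get whatever remains of the total delay.
    regular_window = shutdown_grace_period - critical_pods_grace_period
    return regular_window, critical_pods_grace_period


if __name__ == "__main__":
    windows = split_shutdown_window(timedelta(seconds=30), timedelta(seconds=10))
    print(windows)
```

With a total of 30 seconds and a critical-pod window of 10 seconds, the sketch leaves 20 seconds for regular pods.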

 {{< note >}}
 <!--
 There are cases when Node termination was cancelled by the system (or perhaps manually
-by an administrator). In either of those situations the
-Node will return to the `Ready` state. However Pods which already started the process
-of termination
-will not be restored by kubelet and will need to be re-scheduled.
+by an administrator). In either of those situations the Node will return to the `Ready` state.
+However, Pods which already started the process of termination will not be restored by kubelet
+and will need to be re-scheduled.
 -->
 在某些情况下,节点终止过程会被系统取消(或者可能由管理员手动取消)。
 无论哪种情况下,节点都将返回到 `Ready` 状态。然而,已经开始终止进程的
@@ -229,12 +237,12 @@ in a cluster,
 [优先级类](/zh-cn/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass)

 <!--
-|Pod priority class name|Pod priority class value|
-|-------------------------|------------------------|
-|`custom-class-a` | 100000 |
-|`custom-class-b` | 10000 |
-|`custom-class-c` | 1000 |
-|`regular/unset` | 0 |
+| Pod priority class name | Pod priority class value |
+| ----------------------- | ------------------------ |
+| `custom-class-a` | 100000 |
+| `custom-class-b` | 10000 |
+| `custom-class-c` | 1000 |
+| `regular/unset` | 0 |
 -->
 | Pod 优先级类名称 | Pod 优先级类数值 |
 |-------------------------|------------------------|
@@ -251,12 +259,12 @@ the settings for `shutdownGracePeriodByPodPriority` could look like:
 `shutdownGracePeriodByPodPriority` 看起来可能是这样:

 <!--
-|Pod priority class value|Shutdown period|
-|------------------------|---------------|
-| 100000 |10 seconds |
-| 10000 |180 seconds |
-| 1000 |120 seconds |
-| 0 |60 seconds |
+| Pod priority class value | Shutdown period |
+| ------------------------ | --------------- |
+| 100000 | 10 seconds |
+| 10000 | 180 seconds |
+| 1000 | 120 seconds |
+| 0 | 60 seconds |
 -->
 | Pod 优先级类数值 | 关闭期限 |
 |------------------------|-----------|
@@ -284,26 +292,26 @@ shutdownGracePeriodByPodPriority:

 <!--
 The above table implies that any pod with `priority` value >= 100000 will get
-just 10 seconds to stop, any pod with value >= 10000 and < 100000 will get 180
-seconds to stop, any pod with value >= 1000 and < 10000 will get 120 seconds to stop.
-Finally, all other pods will get 60 seconds to stop.
+just 10 seconds to shut down, any pod with value >= 10000 and < 100000 will get 180
+seconds to shut down, any pod with value >= 1000 and < 10000 will get 120 seconds to shut down.
+Finally, all other pods will get 60 seconds to shut down.

 One doesn't have to specify values corresponding to all of the classes. For
 example, you could instead use these settings:
 -->
-上面的表格表明,所有 `priority` 值大于等于 100000 的 Pod 停止期限只有 10 秒,
-所有 `priority` 值介于 10000 和 100000 之间的 Pod 停止期限是 180 秒,
-所有 `priority` 值介于 1000 和 10000 之间的 Pod 停止期限是 120 秒,
-其他所有 Pod 停止期限是 60 秒。
+上面的表格表明,所有 `priority` 值大于等于 100000 的 Pod 关闭期限只有 10 秒,
+所有 `priority` 值介于 10000 和 100000 之间的 Pod 关闭期限是 180 秒,
+所有 `priority` 值介于 1000 和 10000 之间的 Pod 关闭期限是 120 秒,
+其他所有 Pod 关闭期限是 60 秒。
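
Reviewer note: the bucket rule this paragraph describes (each pod gets the shutdown period of the highest table entry whose priority value does not exceed the pod's own `priority`) can be sketched as a small lookup. This is an illustrative sketch, not the kubelet's implementation. The function name `shutdown_period_seconds` and the dictionary representation of the table are hypothetical, and the fallback to the lowest entry for priorities below every entry is an assumption of this sketch.

```python
import bisect
from typing import Dict


def shutdown_period_seconds(priority: int, table: Dict[int, int]) -> int:
    """Look up the shutdown period (in seconds) for a pod priority.

    `table` maps a priority class value to a shutdown period, as in the
    documented tables. Hypothetical helper for illustration only.
    """
    if not table:
        raise ValueError("table must contain at least one entry")
    # Sort entries by priority value so the thresholds are ascending.
    entries = sorted(table.items())
    thresholds = [value for value, _ in entries]
    # Index of the highest threshold that is <= priority.
    index = bisect.bisect_right(thresholds, priority) - 1
    # Assumption: a priority below every entry uses the lowest entry.
    index = max(index, 0)
    return entries[index][1]


# The four-row table from the documentation above.
FULL_TABLE = {100000: 10, 10000: 180, 1000: 120, 0: 60}

if __name__ == "__main__":
    for pod_priority in (2000000000, 100000, 99999, 5000, 500):
        print(pod_priority, shutdown_period_seconds(pod_priority, FULL_TABLE))
```

The same lookup works when only some classes are listed: with entries for 100000, 1000 and 0, a pod with priority 10000 lands in the 1000 bucket.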

 用户不需要为所有的优先级类都设置数值。例如,你也可以使用下面这种配置:

 <!--
-|Pod priority class value|Shutdown period|
-|------------------------|---------------|
-| 100000 |300 seconds |
-| 1000 |120 seconds |
-| 0 |60 seconds |
+| Pod priority class value | Shutdown period |
+| ------------------------ | --------------- |
+| 100000 | 300 seconds |
+| 1000 | 120 seconds |
+| 0 | 60 seconds |
 -->
 | Pod 优先级类数值 | 关闭期限 |
 |------------------------|-----------|
@@ -422,7 +430,7 @@ on a different node.
 During a non-graceful shutdown, Pods are terminated in the two phases:

 1. Force delete the Pods that do not have matching `out-of-service` tolerations.
-2. Immediately perform detach volume operation for such pods.
+1. Immediately perform detach volume operation for such pods.
 -->
 在非体面关闭期间,Pod 分两个阶段终止:

@@ -431,9 +439,8 @@ During a non-graceful shutdown, Pods are terminated in the two phases:

 {{< note >}}
 <!--
-- Before adding the taint `node.kubernetes.io/out-of-service` , it should be verified
-that the node is already in shutdown or power off state (not in the middle of
-restarting).
+- Before adding the taint `node.kubernetes.io/out-of-service`, it should be verified
+that the node is already in shutdown or power off state (not in the middle of restarting).
 - The user is required to manually remove the out-of-service taint after the pods are
 moved to a new node and the user has checked that the shutdown node has been
 recovered since the user was the one who originally added the taint.
@@ -486,7 +493,7 @@ deleted.
 [VolumeAttachment](/zh-cn/docs/reference/kubernetes-api/config-and-storage-resources/volume-attachment-v1/)。

 <!--
-After this setting has been applied, unhealthy pods still attached to a volumes must be recovered
+After this setting has been applied, unhealthy pods still attached to volumes must be recovered
 via the [Non-Graceful Node Shutdown](#non-graceful-node-shutdown) procedure mentioned above.
 -->
 应用此设置后,仍然关联卷到不健康 Pod 必须通过上述[非体面节点关闭](#non-graceful-node-shutdown)过程进行恢复。
@@ -508,16 +515,16 @@ via the [Non-Graceful Node Shutdown](#non-graceful-node-shutdown) procedure ment
 {{< feature-state feature_gate_name="WindowsGracefulNodeShutdown" >}}

 <!--
-The Windows graceful node shutdown feature depends on kubelet running as a Windows service,
-it will then have a registered [service control handler](https://learn.microsoft.com/en-us/windows/win32/services/service-control-handler-function)
-to delay the presshutdown event with a given duration.
+The Windows graceful node shutdown feature depends on kubelet running as a Windows service,
+it will then have a registered [service control handler](https://learn.microsoft.com/en-us/windows/win32/services/service-control-handler-function)
+to delay the preshutdown event with a given duration.
 -->
 此服务会使用一个注册的[服务控制处理程序函数](https://learn.microsoft.com/zh-cn/windows/win32/services/service-control-handler-function)将
 preshutdown 事件延迟一段时间。

 <!--
-Windows graceful node shutdown is controlled with the `WindowsGracefulNodeShutdown`
-[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
+Windows graceful node shutdown is controlled with the `WindowsGracefulNodeShutdown`
+[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
 which is introduced in 1.32 as an alpha feature.

 Windows graceful node shutdown can not be cancelled.
@@ -528,7 +535,7 @@ Windows 体面节点关闭是通过 1.32 中作为 Alpha 特性所引入的 `Win
 Windows 体面节点关闭无法被取消。

 <!--
-If Kubelet is not running as a Windows service, it will not be able to set and monitor
+If kubelet is not running as a Windows service, it will not be able to set and monitor
 the [Preshutdown](https://learn.microsoft.com/en-us/windows/win32/api/winsvc/ns-winsvc-service_preshutdown_info) event,
 the node will have to go through the [Non-Graceful Node Shutdown](#non-graceful-node-shutdown) procedure mentioned above.
 -->
@@ -537,8 +544,8 @@ the node will have to go through the [Non-Graceful Node Shutdown](#non-graceful-
 事件,对应节点将不得不跑完上述[非体面节点关闭](#non-graceful-node-shutdown)的流程。

 <!--
-In the case where the Windows graceful node shutdown feature is enabled, but the kubelet is not
-running as a Windows service, the kubelet will continue running instead of failing. However,
+In the case where the Windows graceful node shutdown feature is enabled, but the kubelet is not
+running as a Windows service, the kubelet will continue running instead of failing. However,
 it will log an error indicating that it needs to be run as a Windows service.
 -->
 在启用 Windows 体面节点关闭特性但 kubelet 未作为 Windows 服务运行的情况下,kubelet 将继续运行而不会失败。
@@ -548,8 +555,9 @@ it will log an error indicating that it needs to be run as a Windows service.

 <!--
 Learn more about the following:
-* Blog: [Non-Graceful Node Shutdown](/blog/2023/08/16/kubernetes-1-28-non-graceful-node-shutdown-ga/).
-* Cluster Architecture: [Nodes](/docs/concepts/architecture/nodes/).
+
+- Blog: [Non-Graceful Node Shutdown](/blog/2023/08/16/kubernetes-1-28-non-graceful-node-shutdown-ga/).
+- Cluster Architecture: [Nodes](/docs/concepts/architecture/nodes/).
 -->
 了解更多以下信息:
