@@ -12,7 +12,7 @@ weight: 10
  <!-- overview -->
  <!--
  In a Kubernetes cluster, a {{< glossary_tooltip text="node" term_id="node" >}}
- can be shutdown in a planned graceful way or unexpectedly because of reasons such
+ can be shut down in a planned graceful way or unexpectedly because of reasons such
  as a power outage or something else external. A node shutdown could lead to workload
  failure if the node is not drained before the shutdown. A node shutdown can be
  either **graceful** or **non-graceful**.
@@ -33,7 +33,7 @@ either **graceful** or **non-graceful**.
  <!--
  The kubelet attempts to detect node system shutdown and terminates pods running on the node.

- Kubelet ensures that pods follow the normal
+ kubelet ensures that pods follow the normal
  [pod termination process](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
  during the node shutdown. During node shutdown, the kubelet does not accept new
  Pods (even if those Pods are already bound to the node).
@@ -45,7 +45,7 @@ kubelet 会尝试检测节点系统关闭事件并终止在节点上运行的所
  且不接受新的 Pod(即使这些 Pod 已经绑定到该节点)。

  <!--
- The Graceful node shutdown feature depends on systemd since it takes advantage of
+ The graceful node shutdown feature depends on systemd since it takes advantage of
  [systemd inhibitor locks](https://www.freedesktop.org/wiki/Software/systemd/inhibit/) to
  delay the node shutdown with a given duration.
  -->
@@ -64,23 +64,23 @@ enabled by default in 1.21.

  <!--
  Note that by default, both configuration options described below,
- `shutdownGracePeriod` and `shutdownGracePeriodCriticalPods` are set to zero,
+ `shutdownGracePeriod` and `shutdownGracePeriodCriticalPods`, are set to zero,
  thus not activating the graceful node shutdown functionality.
- To activate the feature, the two kubelet config settings should be configured appropriately and
+ To activate the feature, both options should be configured appropriately and
  set to non-zero values.
  -->
  注意,默认情况下,下面描述的两个配置选项,`shutdownGracePeriod` 和
  `shutdownGracePeriodCriticalPods` 都是被设置为 0 的,因此不会激活节点体面关闭特性。
- 要激活此功能特性,这两个 kubelet 配置选项要适当配置,并设置为非零值。
+ 要激活此功能特性,这两个选项要适当配置,并设置为非零值。

  <!--
- Once systemd detects or notifies node shutdown, the kubelet sets a `NotReady` condition on
+ Once systemd detects or is notified of a node shutdown, the kubelet sets a `NotReady` condition on
  the Node, with the `reason` set to `"node is shutting down"`. The kube-scheduler honors this condition
  and does not schedule any Pods onto the affected node; other third-party schedulers are
  expected to follow the same logic. This means that new Pods won't be scheduled onto that node
  and therefore none will start.
  -->
- 一旦 systemd 检测到或通知节点关闭,kubelet 就会在节点上设置一个
+ 一旦 systemd 检测到或收到节点关闭的通知,kubelet 就会在节点上设置一个
  `NotReady` 状况,并将 `reason` 设置为 `"node is shutting down"`。
  kube-scheduler 会重视此状况,不将 Pod 调度到受影响的节点上;
  其他第三方调度程序也应当遵循相同的逻辑。这意味着新的 Pod 不会被调度到该节点上,
@@ -97,17 +97,17 @@ node shutdown has been detected, so that even Pods with a
  的{{< glossary_tooltip text="容忍度" term_id="toleration" >}},也不会在此节点上启动。

  <!--
- At the same time when kubelet is setting that condition on its Node via the API, the kubelet also begins
- terminating any Pods that are running locally.
+ When kubelet is setting that condition on its Node via the API,
+ the kubelet also begins terminating any Pods that are running locally.
  -->
- 同时, 当 kubelet 通过 API 在其 Node 上设置该状况时,kubelet
+ 当 kubelet 通过 API 在其 Node 上设置该状况时,kubelet
  也开始终止在本地运行的所有 Pod。

  <!--
  During a graceful shutdown, kubelet terminates pods in two phases:

  1. Terminate regular pods running on the node.
- 2. Terminate [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
+ 1. Terminate [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
     running on the node.
  -->
  在体面关闭过程中,kubelet 分两个阶段来终止 Pod:
@@ -116,34 +116,42 @@ During a graceful shutdown, kubelet terminates pods in two phases:
  2. 终止在节点上运行的[关键 Pod](/zh-cn/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)。

  <!--
- Graceful node shutdown feature is configured with two
+ The graceful node shutdown feature is configured with two
  [`KubeletConfiguration`](/docs/tasks/administer-cluster/kubelet-config-file/) options:
- * `shutdownGracePeriod`:
-   * Specifies the total duration that the node should delay the shutdown by. This is the total
-     grace period for pod termination for both regular and
-     [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical).
- * `shutdownGracePeriodCriticalPods`:
-   * Specifies the duration used to terminate
-     [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
-     during a node shutdown. This value should be less than `shutdownGracePeriod`.
  -->
  节点体面关闭的特性对应两个
  [`KubeletConfiguration`](/zh-cn/docs/tasks/administer-cluster/kubelet-config-file/) 选项:

- * `shutdownGracePeriod`:
-   * 指定节点应延迟关闭的总持续时间。这是 Pod 体面终止的时间总和,不区分常规 Pod
-     还是[关键 Pod](/zh-cn/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)。
- * `shutdownGracePeriodCriticalPods`:
-   * 在节点关闭期间指定用于终止[关键 Pod](/zh-cn/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
-     的持续时间。该值应小于 `shutdownGracePeriod`。
+ <!--
+ - `shutdownGracePeriod`:
+
+   Specifies the total duration that the node should delay the shutdown by. This is the total
+   grace period for pod termination for both regular and
+   [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical).
+ -->
+ - `shutdownGracePeriod`:
+
+   指定节点应延迟关闭的总持续时间。这是 Pod 体面终止的时间总和,不区分常规 Pod
+   还是[关键 Pod](/zh-cn/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)。
+
+ <!--
+ - `shutdownGracePeriodCriticalPods`:
+
+   Specifies the duration used to terminate
+   [critical pods](/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
+   during a node shutdown. This value should be less than `shutdownGracePeriod`.
+ -->
+ - `shutdownGracePeriodCriticalPods`:
+
+   在节点关闭期间指定用于终止[关键 Pod](/zh-cn/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/#marking-pod-as-critical)
+   的持续时间。该值应小于 `shutdownGracePeriod`。

  {{< note >}}
  <!--
  There are cases when Node termination was cancelled by the system (or perhaps manually
- by an administrator). In either of those situations the
- Node will return to the `Ready` state. However Pods which already started the process
- of termination
- will not be restored by kubelet and will need to be re-scheduled.
+ by an administrator). In either of those situations the Node will return to the `Ready` state.
+ However, Pods which already started the process of termination will not be restored by kubelet
+ and will need to be re-scheduled.
  -->
  在某些情况下,节点终止过程会被系统取消(或者可能由管理员手动取消)。
  无论哪种情况下,节点都将返回到 `Ready` 状态。然而,已经开始终止进程的
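
The two options documented in this hunk can be read as one shutdown window split into two parts. The following is a minimal Python sketch of that reading, assuming regular pods receive whatever remains of `shutdownGracePeriod` once `shutdownGracePeriodCriticalPods` has been reserved. `split_grace_period` is a hypothetical helper written for illustration; it is not kubelet code.

```python
from datetime import timedelta


def split_grace_period(shutdown_grace_period, critical_pods_grace_period):
    """Split the total shutdown delay into (regular, critical) windows.

    Hypothetical helper: it mirrors the documented meaning of the two
    KubeletConfiguration options, not the kubelet's own implementation.
    """
    zero = timedelta(0)
    # Both options default to zero, which leaves the feature inactive.
    if shutdown_grace_period == zero or critical_pods_grace_period == zero:
        return None
    # The critical-pod window is carved out of the total, so it must be smaller.
    if critical_pods_grace_period >= shutdown_grace_period:
        raise ValueError(
            "shutdownGracePeriodCriticalPods should be less than shutdownGracePeriod"
        )
    regular = shutdown_grace_period - critical_pods_grace_period
    return regular, critical_pods_grace_period


# Example values chosen for illustration: 30s in total, 10s reserved for critical pods.
regular, critical = split_grace_period(timedelta(seconds=30), timedelta(seconds=10))
print(regular.total_seconds(), critical.total_seconds())
```

With these example values, regular pods would get the first 20 seconds and critical pods the final 10.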
@@ -229,12 +237,12 @@ in a cluster,
  [优先级类](/zh-cn/docs/concepts/scheduling-eviction/pod-priority-preemption/#priorityclass)。

  <!--
- |Pod priority class name| Pod priority class value|
- |-------------------------| ------------------------|
- |`custom-class-a` | 100000 |
- |`custom-class-b` | 10000 |
- |`custom-class-c` | 1000 |
- |`regular/unset` | 0 |
+ | Pod priority class name | Pod priority class value |
+ | ----------------------- | ------------------------ |
+ | `custom-class-a`        | 100000                   |
+ | `custom-class-b`        | 10000                    |
+ | `custom-class-c`        | 1000                     |
+ | `regular/unset`         | 0                        |
  -->
  | Pod 优先级类名称 | Pod 优先级类数值 |
  | -------------------------| ------------------------|
@@ -251,12 +259,12 @@ the settings for `shutdownGracePeriodByPodPriority` could look like:
  `shutdownGracePeriodByPodPriority` 看起来可能是这样:

  <!--
- |Pod priority class value| Shutdown period|
- |------------------------| ---------------|
- | 100000 | 10 seconds |
- | 10000 | 180 seconds |
- | 1000 | 120 seconds |
- | 0 | 60 seconds |
+ | Pod priority class value | Shutdown period |
+ | ------------------------ | --------------- |
+ | 100000                   | 10 seconds      |
+ | 10000                    | 180 seconds     |
+ | 1000                     | 120 seconds     |
+ | 0                        | 60 seconds      |
  -->
  | Pod 优先级类数值 | 关闭期限 |
  | ------------------------| -----------|
@@ -284,26 +292,26 @@ shutdownGracePeriodByPodPriority:

  <!--
  The above table implies that any pod with `priority` value >= 100000 will get
- just 10 seconds to stop, any pod with value >= 10000 and < 100000 will get 180
- seconds to stop, any pod with value >= 1000 and < 10000 will get 120 seconds to stop.
- Finally, all other pods will get 60 seconds to stop.
+ just 10 seconds to shut down, any pod with value >= 10000 and < 100000 will get 180
+ seconds to shut down, any pod with value >= 1000 and < 10000 will get 120 seconds to shut down.
+ Finally, all other pods will get 60 seconds to shut down.

  One doesn't have to specify values corresponding to all of the classes. For
  example, you could instead use these settings:
  -->
- 上面的表格表明,所有 `priority` 值大于等于 100000 的 Pod 停止期限只有 10 秒,
- 所有 `priority` 值介于 10000 和 100000 之间的 Pod 停止期限是 180 秒,
- 所有 `priority` 值介于 1000 和 10000 之间的 Pod 停止期限是 120 秒,
- 其他所有 Pod 停止期限是 60 秒。
+ 上面的表格表明,所有 `priority` 值大于等于 100000 的 Pod 关闭期限只有 10 秒,
+ 所有 `priority` 值介于 10000 和 100000 之间的 Pod 关闭期限是 180 秒,
+ 所有 `priority` 值介于 1000 和 10000 之间的 Pod 关闭期限是 120 秒,
+ 其他所有 Pod 关闭期限是 60 秒。

  用户不需要为所有的优先级类都设置数值。例如,你也可以使用下面这种配置:

  <!--
- |Pod priority class value| Shutdown period|
- |------------------------| ---------------|
- | 100000 | 300 seconds |
- | 1000 | 120 seconds |
- | 0 | 60 seconds |
+ | Pod priority class value | Shutdown period |
+ | ------------------------ | --------------- |
+ | 100000                   | 300 seconds     |
+ | 1000                     | 120 seconds     |
+ | 0                        | 60 seconds      |
  -->
  | Pod 优先级类数值 | 关闭期限 |
  |------------------------|-----------|
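
The bucketing rule spelled out in this hunk (a pod gets the period of the highest configured priority value that does not exceed its own `priority`) can be sketched in a few lines of Python. This is an illustrative reading of the tables, not the kubelet's implementation: `shutdown_period_for` is a hypothetical name, and returning `None` for a priority below every configured value is an assumption, since the text does not cover that case.

```python
from bisect import bisect_right


def shutdown_period_for(priority, periods_by_priority):
    """Return the shutdown period (seconds) for a pod with the given priority.

    periods_by_priority maps a configured priority class value to its
    shutdown period in seconds, as in the tables above.
    """
    thresholds = sorted(periods_by_priority)
    # Index of the first configured value strictly greater than the pod's priority.
    idx = bisect_right(thresholds, priority)
    if idx == 0:
        # Assumption: the text does not say what happens below the lowest entry.
        return None
    return periods_by_priority[thresholds[idx - 1]]


# The two tables from the diff, keyed by priority class value.
full_table = {100000: 10, 10000: 180, 1000: 120, 0: 60}
sparse_table = {100000: 300, 1000: 120, 0: 60}

for pod_priority in (100000, 10000, 1000, 0):
    print(pod_priority,
          shutdown_period_for(pod_priority, full_table),
          shutdown_period_for(pod_priority, sparse_table))
```

Under the sparse table, a pod with priority 10000 has no entry of its own and lands in the bucket of the next lower configured value, 1000.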
@@ -422,7 +430,7 @@ on a different node.
  During a non-graceful shutdown, Pods are terminated in the two phases:

  1. Force delete the Pods that do not have matching `out-of-service` tolerations.
- 2. Immediately perform detach volume operation for such pods.
+ 1. Immediately perform detach volume operation for such pods.
  -->
  在非体面关闭期间,Pod 分两个阶段终止:

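
The first phase in this hunk selects Pods by their tolerations. A rough Python sketch of that selection follows, assuming Pods are plain dictionaries shaped like their manifests. The matching is deliberately simplified (it looks only at the toleration key and the `Exists` operator and ignores value and effect), and both function names are hypothetical.

```python
OUT_OF_SERVICE_TAINT_KEY = "node.kubernetes.io/out-of-service"


def tolerates_out_of_service(pod):
    """Simplified check: does the pod carry a toleration for the out-of-service taint?"""
    for toleration in pod.get("spec", {}).get("tolerations", []):
        key = toleration.get("key")
        if key == OUT_OF_SERVICE_TAINT_KEY:
            return True
        # A toleration with an empty key and operator Exists matches every taint.
        if not key and toleration.get("operator") == "Exists":
            return True
    return False


def pods_to_force_delete(pods):
    """Names of the pods that phase 1 would force delete under this simplified model."""
    return [pod["metadata"]["name"] for pod in pods if not tolerates_out_of_service(pod)]


# Illustrative pods; the names are made up for this example.
sample_pods = [
    {"metadata": {"name": "web"}, "spec": {}},
    {"metadata": {"name": "agent"},
     "spec": {"tolerations": [{"key": OUT_OF_SERVICE_TAINT_KEY, "operator": "Exists"}]}},
]
print(pods_to_force_delete(sample_pods))
```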
@@ -431,9 +439,8 @@ During a non-graceful shutdown, Pods are terminated in the two phases:

  {{< note >}}
  <!--
- - Before adding the taint `node.kubernetes.io/out-of-service` , it should be verified
-   that the node is already in shutdown or power off state (not in the middle of
-   restarting).
+ - Before adding the taint `node.kubernetes.io/out-of-service`, it should be verified
+   that the node is already in shutdown or power off state (not in the middle of restarting).
  - The user is required to manually remove the out-of-service taint after the pods are
    moved to a new node and the user has checked that the shutdown node has been
    recovered since the user was the one who originally added the taint.
@@ -486,7 +493,7 @@ deleted.
  [VolumeAttachment](/zh-cn/docs/reference/kubernetes-api/config-and-storage-resources/volume-attachment-v1/)。

  <!--
- After this setting has been applied, unhealthy pods still attached to a volumes must be recovered
+ After this setting has been applied, unhealthy pods still attached to volumes must be recovered
  via the [Non-Graceful Node Shutdown](#non-graceful-node-shutdown) procedure mentioned above.
  -->
  应用此设置后,仍然关联卷到不健康 Pod 必须通过上述[非体面节点关闭](#non-graceful-node-shutdown)过程进行恢复。
@@ -508,16 +515,16 @@ via the [Non-Graceful Node Shutdown](#non-graceful-node-shutdown) procedure ment
  {{< feature-state feature_gate_name="WindowsGracefulNodeShutdown" >}}

  <!--
- The Windows graceful node shutdown feature depends on kubelet running as a Windows service,
- it will then have a registered [service control handler](https://learn.microsoft.com/en-us/windows/win32/services/service-control-handler-function)
- to delay the presshutdown event with a given duration.
+ The Windows graceful node shutdown feature depends on kubelet running as a Windows service,
+ it will then have a registered [service control handler](https://learn.microsoft.com/en-us/windows/win32/services/service-control-handler-function)
+ to delay the preshutdown event with a given duration.
  -->
  此服务会使用一个注册的[服务控制处理程序函数](https://learn.microsoft.com/zh-cn/windows/win32/services/service-control-handler-function)将
  preshutdown 事件延迟一段时间。

  <!--
- Windows graceful node shutdown is controlled with the `WindowsGracefulNodeShutdown`
- [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
+ Windows graceful node shutdown is controlled with the `WindowsGracefulNodeShutdown`
+ [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
  which is introduced in 1.32 as an alpha feature.

  Windows graceful node shutdown can not be cancelled.
@@ -528,7 +535,7 @@ Windows 体面节点关闭是通过 1.32 中作为 Alpha 特性所引入的 `Win
  Windows 体面节点关闭无法被取消。

  <!--
- If Kubelet is not running as a Windows service, it will not be able to set and monitor
+ If kubelet is not running as a Windows service, it will not be able to set and monitor
  the [Preshutdown](https://learn.microsoft.com/en-us/windows/win32/api/winsvc/ns-winsvc-service_preshutdown_info) event,
  the node will have to go through the [Non-Graceful Node Shutdown](#non-graceful-node-shutdown) procedure mentioned above.
  -->
@@ -537,8 +544,8 @@ the node will have to go through the [Non-Graceful Node Shutdown](#non-graceful-
  事件,对应节点将不得不跑完上述[非体面节点关闭](#non-graceful-node-shutdown)的流程。

  <!--
- In the case where the Windows graceful node shutdown feature is enabled, but the kubelet is not
- running as a Windows service, the kubelet will continue running instead of failing. However,
+ In the case where the Windows graceful node shutdown feature is enabled, but the kubelet is not
+ running as a Windows service, the kubelet will continue running instead of failing. However,
  it will log an error indicating that it needs to be run as a Windows service.
  -->
  在启用 Windows 体面节点关闭特性但 kubelet 未作为 Windows 服务运行的情况下,kubelet 将继续运行而不会失败。
@@ -548,8 +555,9 @@ it will log an error indicating that it needs to be run as a Windows service.

  <!--
  Learn more about the following:
- * Blog: [Non-Graceful Node Shutdown](/blog/2023/08/16/kubernetes-1-28-non-graceful-node-shutdown-ga/).
- * Cluster Architecture: [Nodes](/docs/concepts/architecture/nodes/).
+
+ - Blog: [Non-Graceful Node Shutdown](/blog/2023/08/16/kubernetes-1-28-non-graceful-node-shutdown-ga/).
+ - Cluster Architecture: [Nodes](/docs/concepts/architecture/nodes/).
  -->
  了解更多以下信息:
