@@ -17,13 +17,13 @@ weight: 10
<!--
Kubernetes runs your workload by placing containers into Pods to run on _Nodes_.
A node may be a virtual or physical machine, depending on the cluster. Each node
- is managed by the
+ is managed by the
{{< glossary_tooltip text="control plane" term_id="control-plane" >}}
and contains the services necessary to run
{{< glossary_tooltip text="Pods" term_id="pod" >}}.

Typically you have several nodes in a cluster; in a learning or resource-limited
- environment, you might have just one.
+ environment, you might have only one node.

The [components](/docs/concepts/overview/components/#node-components) on a node include the
{{< glossary_tooltip text="kubelet" term_id="kubelet" >}}, a
@@ -32,11 +32,10 @@ The [components](/docs/concepts/overview/components/#node-components) on a node
-->
Kubernetes 通过将容器放入在节点(Node)上运行的 Pod 中来执行你的工作负载。
节点可以是一个虚拟机或者物理机器,取决于所在的集群配置。
- 每个节点包含运行 {{< glossary_tooltip text="Pods" term_id="pod" >}} 所需的服务;
+ 每个节点包含运行 {{< glossary_tooltip text="Pod" term_id="pod" >}} 所需的服务;
这些节点由 {{< glossary_tooltip text="控制面" term_id="control-plane" >}} 负责管理。

- 通常集群中会有若干个节点;而在一个学习用或者资源受限的环境中,你的集群中也可能
- 只有一个节点。
+ 通常集群中会有若干个节点;而在一个学习所用或者资源受限的环境中,你的集群中也可能只有一个节点。

节点上的[组件](/zh-cn/docs/concepts/overview/components/#node-components)包括
{{< glossary_tooltip text="kubelet" term_id="kubelet" >}}、
@@ -50,7 +49,7 @@ Kubernetes 通过将容器放入在节点(Node)上运行的 Pod 中来执行
There are two main ways to have Nodes added to the {{< glossary_tooltip text="API server" term_id="kube-apiserver" >}}:

1. The kubelet on a node self-registers to the control plane
- 2. You, or another human user, manually add a Node object
+ 2. You (or another human user) manually add a Node object
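As a rough sketch of the second approach, you could register a Node object by hand; the node name and label here are hypothetical:

```shell
# Illustrative only: manually register a Node object
kubectl apply -f - <<EOF
{
  "kind": "Node",
  "apiVersion": "v1",
  "metadata": {
    "name": "node-example",
    "labels": { "kubernetes.io/hostname": "node-example" }
  }
}
EOF
```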

After you create a Node {{< glossary_tooltip text="object" term_id="object" >}},
or the kubelet on a node self-registers, the control plane checks whether the new Node object is
@@ -61,7 +60,7 @@ valid. For example, if you try to create a Node from the following JSON manifest
向 {{< glossary_tooltip text="API 服务器" term_id="kube-apiserver" >}}添加节点的方式主要有两种:

1. 节点上的 `kubelet` 向控制面执行自注册;
- 2. 你,或者别的什么人,手动添加一个 Node 对象。
+ 2. 你(或者别的什么人)手动添加一个 Node 对象。

在你创建了 Node {{< glossary_tooltip text="对象" term_id="object" >}}或者节点上的
`kubelet` 执行了自注册操作之后,控制面会检查新的 Node 对象是否合法。
@@ -83,14 +82,14 @@ valid. For example, if you try to create a Node from the following JSON manifest
<!--
Kubernetes creates a Node object internally (the representation). Kubernetes checks
that a kubelet has registered to the API server that matches the `metadata.name`
- field of the Node. If the node is healthy (if all necessary services are running),
- it is eligible to run a Pod. Otherwise, that node is ignored for any cluster activity
+ field of the Node. If the node is healthy (i.e., all necessary services are running),
+ then it is eligible to run a Pod. Otherwise, that node is ignored for any cluster activity
until it becomes healthy.
-->
Kubernetes 会在内部创建一个 Node 对象作为节点的表示。Kubernetes 检查 `kubelet`
向 API 服务器注册节点时使用的 `metadata.name` 字段是否匹配。
如果节点是健康的(即所有必要的服务都在运行中),则该节点可以用来运行 Pod。
- 否则,直到该节点变为健康之前,所有的集群活动都会忽略该节点。
+ 否则,直到该节点变为健康之前,所有的集群活动都会忽略该节点。

{{< note >}}
<!--
@@ -136,7 +135,7 @@ first and re-added after the update.
<!--
### Self-registration of Nodes

- When the kubelet flag `-register-node` is true (the default), the kubelet will attempt to
+ When the kubelet flag `--register-node` is true (the default), the kubelet will attempt to
register itself with the API server. This is the preferred pattern, used by most distros.

For self-registration, the kubelet is started with the following options:
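For example, a self-registering kubelet launch might look like the following sketch (only a subset of flags is shown, and all values are hypothetical):

```shell
# Illustrative only: a kubelet started so that it self-registers
kubelet --kubeconfig=/etc/kubernetes/kubelet.conf \
  --register-node=true \
  --node-ip=10.0.0.5 \
  --node-labels="topology.kubernetes.io/zone=zone-a"
```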
@@ -175,7 +174,7 @@ When the [Node authorization mode](/docs/reference/access-authn-authz/node/) and
[NodeRestriction admission plugin](/docs/reference/access-authn-authz/admission-controllers/#noderestriction) are enabled,
kubelets are only authorized to create/modify their own Node resource.
-->
- 启用[Node 鉴权模式](/zh-cn/docs/reference/access-authn-authz/node/)和
+ 启用 [Node 鉴权模式](/zh-cn/docs/reference/access-authn-authz/node/)和
[NodeRestriction 准入插件](/zh-cn/docs/reference/access-authn-authz/admission-controllers/#noderestriction)时,
仅授权 `kubelet` 创建或修改其自己的节点资源。
@@ -216,7 +215,7 @@ You can create and modify Node objects using
When you want to create Node objects manually, set the kubelet flag `--register-node=false`.

You can modify Node objects regardless of the setting of `--register-node`.
- For example, you can set labels on an existing Node, or mark it unschedulable.
+ For example, you can set labels on an existing Node or mark it unschedulable.
-->
### 手动节点管理 {#manual-node-administration}
@@ -226,15 +225,15 @@ For example, you can set labels on an existing Node, or mark it unschedulable.
如果你希望手动创建节点对象,请设置 kubelet 标志 `--register-node=false`。

你可以修改 Node 对象(忽略 `--register-node` 设置)。
- 例如,修改节点上的标签或标记其为不可调度。
+ 例如,你可以修改节点上的标签或标记其为不可调度。
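For instance, attaching a label to an existing Node is a one-liner; the node name and label below are hypothetical:

```shell
# Hypothetical node name and label
kubectl label nodes node-example disktype=ssd
```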

<!--
You can use labels on Nodes in conjunction with node selectors on Pods to control
- scheduling. For example, you can to constrain a Pod to only be eligible to run on
+ scheduling. For example, you can constrain a Pod to only be eligible to run on
a subset of the available nodes.
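A minimal sketch of that constraint, assuming a hypothetical `disktype=ssd` label on some nodes:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-on-ssd
spec:
  nodeSelector:
    disktype: ssd   # Pod is only eligible for nodes carrying this label
  containers:
  - name: nginx
    image: nginx:1.25
```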

Marking a node as unschedulable prevents the scheduler from placing new pods onto
- that Node, but does not affect existing Pods on the Node. This is useful as a
+ that Node but does not affect existing Pods on the Node. This is useful as a
preparatory step before a node reboot or other maintenance.

To mark a Node unschedulable, run:
@@ -271,9 +270,9 @@ DaemonSet 通常提供节点本地的服务,即使节点上的负载应用已
{{< /note >}}

<!--
- ## Node Status
+ ## Node status

- A node's status contains the following information:
+ A Node's status contains the following information:

* [Addresses](#addresses)
* [Conditions](#condition)
@@ -330,13 +329,13 @@ The `conditions` field describes the status of all `Running` nodes. Examples of

<!--
{{< table caption = "Node conditions, and a description of when each condition applies." >}}
- | Node Condition | Description |
- |----------------|-------------|
+ | Node Condition       | Description |
+ |----------------------|-------------|
| `Ready` | `True` if the node is healthy and ready to accept pods, `False` if the node is not healthy and is not accepting pods, and `Unknown` if the node controller has not heard from the node in the last `node-monitor-grace-period` (default is 40 seconds) |
| `DiskPressure` | `True` if pressure exists on the disk size—that is, if the disk capacity is low; otherwise `False` |
| `MemoryPressure` | `True` if pressure exists on the node memory—that is, if the node memory is low; otherwise `False` |
- | `PIDPressure` | `True` if pressure exists on the processes - that is, if there are too many processes on the node; otherwise `False` |
- | `NetworkUnavailable` | `True` if the network for the node is not correctly configured, otherwise `False` |
+ | `PIDPressure` | `True` if pressure exists on the processes—that is, if there are too many processes on the node; otherwise `False` |
+ | `NetworkUnavailable` | `True` if the network for the node is not correctly configured, otherwise `False` |
{{< /table >}}
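One way to inspect a single condition from the table above (node name hypothetical):

```shell
# Print the status of the Ready condition for one node
kubectl get node node-example \
  -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}'
```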
-->
{{< table caption = "节点状况及每种状况适用场景的描述" >}}
@@ -364,7 +363,7 @@ Condition,被保护起来的节点在其规约中被标记为不可调度(Un
In the Kubernetes API, a node's condition is represented as part of the `.status`
of the Node resource. For example, the following JSON structure describes a healthy node:
-->
- 在 Kubernetes API 中,节点的状况表示节点资源中`.status` 的一部分。
+ 在 Kubernetes API 中,节点的状况表示节点资源中 `.status` 的一部分。
例如,以下 JSON 结构描述了一个健康节点:

```json
@@ -393,7 +392,7 @@ for all Pods assigned to that node. The default eviction timeout duration is
`pod-eviction-timeout` 值(一个传递给
{{< glossary_tooltip text="kube-controller-manager" term_id="kube-controller-manager" >}}
的参数),[节点控制器](#node-controller)会对节点上的所有 Pod 触发
- {{< glossary_tooltip text="API-发起的驱逐" term_id="api-eviction" >}}。
+ {{< glossary_tooltip text="API 发起的驱逐" term_id="api-eviction" >}}。
默认的逐出超时时长为 **5 分钟**。

<!--
@@ -411,8 +410,8 @@ The node controller does not force delete pods until it is confirmed that they h
running in the cluster. You can see the pods that might be running on an unreachable node as
being in the `Terminating` or `Unknown` state. In cases where Kubernetes cannot deduce from the
underlying infrastructure if a node has permanently left a cluster, the cluster administrator
- may need to delete the node object by hand. Deleting the node object from Kubernetes causes
- all the Pod objects running on the node to be deleted from the API server, and frees up their
+ may need to delete the node object by hand. Deleting the node object from Kubernetes causes
+ all the Pod objects running on the node to be deleted from the API server and frees up their
names.
-->
节点控制器在确认 Pod 在集群中已经停止运行前,不会强制删除它们。
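The manual cleanup itself is one command (node name hypothetical):

```shell
# Deleting the Node object also removes its Pod objects from the API server
kubectl delete node node-example
```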
@@ -461,7 +460,8 @@ Node that is available to be consumed by normal Pods.

<!--
You may read more about capacity and allocatable resources while learning how
- to [reserve compute resources](/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable) on a Node.
+ to [reserve compute resources](/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable)
+ on a Node.
-->
可以在学习如何在节点上[预留计算资源](/zh-cn/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable)
的时候了解有关容量和可分配资源的更多信息。
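To compare the two figures on a live node (name hypothetical):

```shell
# Capacity and Allocatable appear in the node description
kubectl describe node node-example | grep -A 6 -E "^(Capacity|Allocatable)"
```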
@@ -505,7 +505,7 @@ Kubernetes 节点发送的心跳帮助你的集群确定每个节点的可用性
-->
* 更新节点的 `.status`
* `kube-node-lease` {{<glossary_tooltip term_id="namespace" text="名字空间">}}中的
- [Lease(租约)](/docs/reference/kubernetes-api/cluster-resources/lease-v1/)对象。
+ [Lease(租约)](/zh-cn/docs/reference/kubernetes-api/cluster-resources/lease-v1/)对象。
每个节点都有一个关联的 Lease 对象。
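These heartbeat objects can be listed directly; the output varies by cluster:

```shell
# One Lease per node lives in the kube-node-lease namespace
kubectl get leases --namespace kube-node-lease
```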

<!--
@@ -539,7 +539,7 @@ kubelet 负责创建和更新节点的 `.status`,以及更新它们对应的 L
最长重试间隔为 7 秒钟。

<!--
- ## Node Controller
+ ## Node controller

The node {{< glossary_tooltip text="controller" term_id="controller" >}} is a
Kubernetes control plane component that manages various aspects of nodes.
@@ -558,7 +558,7 @@ CIDR block to the node when it is registered (if CIDR assignment is turned on).
<!--
The second is keeping the node controller's internal list of nodes up to date with
the cloud provider's list of available machines. When running in a cloud
- environment, whenever a node is unhealthy, the node controller asks the cloud
+ environment and whenever a node is unhealthy, the node controller asks the cloud
provider if the VM for that node is still available. If not, the node
controller deletes the node from its list of nodes.
-->
@@ -586,7 +586,7 @@ This period can be configured using the `--node-monitor-period` flag on the
第三个是监控节点的健康状况。节点控制器负责:

- 在节点不可达的情况下,在 Node 的 `.status` 中更新 `Ready` 状况。
- 在这种情况下,节点控制器将 NodeReady 状况更新为 `Unknown`。
+ 在这种情况下,节点控制器将 NodeReady 状况更新为 `Unknown`。
- 如果节点仍然无法访问:对于不可达节点上的所有 Pod 触发
  [API 发起的逐出](/zh-cn/docs/concepts/scheduling-eviction/api-eviction/)操作。
默认情况下,节点控制器在将节点标记为 `Unknown` 后等待 5 分钟提交第一个驱逐请求。
@@ -598,7 +598,7 @@ This period can be configured using the `--node-monitor-period` flag on the
### Rate limits on eviction

In most cases, the node controller limits the eviction rate to
- `-node-eviction-rate` (default 0.1) per second, meaning it won't evict pods
+ `--node-eviction-rate` (default 0.1) per second, meaning it won't evict pods
from more than 1 node per 10 seconds.
-->
### 逐出速率限制 {#rate-limits-on-eviction}
@@ -627,7 +627,7 @@ the same time:
- 如果不健康节点的比例超过 `--unhealthy-zone-threshold`(默认为 0.55),
  驱逐速率将会降低。
- 如果集群较小(意即小于等于 `--large-cluster-size-threshold` 个节点 - 默认为 50),
- 驱逐操作将会停止。
+ 驱逐操作将会停止。
- 否则驱逐速率将降为每秒 `--secondary-node-eviction-rate` 个(默认为 0.01)。
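These thresholds map to kube-controller-manager flags; an illustrative invocation that simply restates the documented defaults:

```shell
# Sketch only: eviction tuning flags shown with their default values
kube-controller-manager \
  --node-eviction-rate=0.1 \
  --secondary-node-eviction-rate=0.01 \
  --unhealthy-zone-threshold=0.55 \
  --large-cluster-size-threshold=50
```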

<!--
@@ -643,8 +643,8 @@ then the eviction mechanism does not take per-zone unavailability into account.
<!--
A key reason for spreading your nodes across availability zones is so that the
workload can be shifted to healthy zones when one entire zone goes down.
- Therefore, if all nodes in a zone are unhealthy then node controller evicts at
- the normal rate `-node-eviction-rate`. The corner case is when all zones are
+ Therefore, if all nodes in a zone are unhealthy, then the node controller evicts at
+ the normal rate of `--node-eviction-rate`. The corner case is when all zones are
completely unhealthy (none of the nodes in the cluster are healthy). In such a
case, the node controller assumes that there is some problem with connectivity
between the control plane and the nodes, and doesn't perform any evictions.
@@ -660,9 +660,9 @@ evict pods from the remaining nodes that are unhealthy or unreachable).
(如果故障后部分节点重新连接,节点控制器会从剩下不健康或者不可达节点中驱逐 Pod)。

<!--
- The Node Controller is also responsible for evicting pods running on nodes with
- `NoExecute` taints, unless the pods do not tolerate the taints.
- The Node Controller also adds {{< glossary_tooltip text="taints" term_id="taint" >}}
+ The node controller is also responsible for evicting pods running on nodes with
+ `NoExecute` taints, unless those pods tolerate that taint.
+ The node controller also adds {{< glossary_tooltip text="taints" term_id="taint" >}}
corresponding to node problems like node unreachable or not ready. This means
that the scheduler won't place Pods onto unhealthy nodes.
-->
@@ -743,7 +743,7 @@ Kubelet ensures that pods follow the normal
[pod termination process](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
during the node shutdown.
-->
- kubelet 会尝试检测节点系统关闭事件并终止在节点上运行的 Pods。
+ kubelet 会尝试检测节点系统关闭事件并终止在节点上运行的所有 Pod。

在节点终止期间,kubelet 保证 Pod 遵从常规的
[Pod 终止流程](/zh-cn/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)。
@@ -763,7 +763,7 @@ Graceful node shutdown is controlled with the `GracefulNodeShutdown`
enabled by default in 1.21.
-->
节点体面关闭特性受 `GracefulNodeShutdown`
- [特性门控](/docs/reference/command-line-tools-reference/feature-gates/)控制,
+ [特性门控](/zh-cn/docs/reference/command-line-tools-reference/feature-gates/)控制,
在 1.21 版本中是默认启用的。
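A hedged sketch of enabling this through the kubelet configuration file; the durations are illustrative:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
shutdownGracePeriod: "30s"              # total delay the node requests before shutting down
shutdownGracePeriodCriticalPods: "10s"  # tail end of that window reserved for critical Pods
```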

<!--
@@ -840,7 +840,7 @@ Message: Pod was terminated in response to imminent node shutdown.
{{< feature-state state="alpha" for_k8s_version="v1.24" >}}

<!--
- A node shutdown action may not be detected by kubelet's Node Shutdown Mananger,
+ A node shutdown action may not be detected by kubelet's Node Shutdown Manager,
either because the command does not trigger the inhibitor locks mechanism used by
kubelet or because of a user error, i.e., the ShutdownGracePeriod and
ShutdownGracePeriodCriticalPods are not configured properly. Please refer to above
@@ -888,7 +888,7 @@ different node.
-->
为了缓解上述情况,用户可以手动将具有 `NoExecute` 或 `NoSchedule` 效果的
`node.kubernetes.io/out-of-service` 污点添加到节点上,标记其无法提供服务。
- 如果在 `kube-controller-manager` 上启用了 `NodeOutOfServiceVolumeDetach`
+ 如果在 `kube-controller-manager` 上启用了 `NodeOutOfServiceVolumeDetach`
[特性门控](/zh-cn/docs/reference/command-line-tools-reference/feature-gates/),
并且节点被通过污点标记为无法提供服务,如果节点上的 Pod 没有设置对应的容忍度,
那么这样的 Pod 将被强制删除,并且在节点上被终止的 Pod 将立即进行卷分离操作。
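Applying that taint by hand looks like this (node name hypothetical):

```shell
# Mark the node out-of-service so its Pods can be force-deleted and volumes detached
kubectl taint nodes node-example node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
```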
@@ -1058,7 +1058,7 @@ their respective shutdown periods.
-->
如果此功能特性被启用,但没有提供配置数据,则不会出现排序操作。

- 使用此功能特性需要启用 `GracefulNodeShutdownBasedOnPodPriority`
+ 使用此功能特性需要启用 `GracefulNodeShutdownBasedOnPodPriority`
[特性门控](/zh-cn/docs/reference/command-line-tools-reference/feature-gates/),
并将 [kubelet 配置](/zh-cn/docs/reference/config-api/kubelet-config.v1beta1/)
中的 `shutdownGracePeriodByPodPriority` 设置为期望的配置,
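A hedged sketch of such a configuration; the priority bands and durations are illustrative:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
shutdownGracePeriodByPodPriority:
  - priority: 100000              # highest band, e.g. custom critical classes
    shutdownGracePeriodSeconds: 10
  - priority: 10000
    shutdownGracePeriodSeconds: 180
  - priority: 0                   # everything else
    shutdownGracePeriodSeconds: 60
```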
@@ -1074,7 +1074,7 @@ the feature is Beta and is enabled by default.
{{< note >}}
在节点体面关闭期间考虑 Pod 优先级的能力是作为 Kubernetes v1.23 中的 Alpha 功能引入的。
在 Kubernetes {{< skew currentVersion >}} 中该功能是 Beta 版,默认启用。
- {{< /note >}}
+ {{< /note >}}

<!--
Metrics `graceful_shutdown_start_time_seconds` and `graceful_shutdown_end_time_seconds`