Commit a1c49ec

Merge pull request #35949 from yanrongshi/zh-cn]Sync-disruptions.md
[zh-cn]Sync disruptions.md
2 parents fe1cb0e + f257e51 commit a1c49ec

1 file changed: +56 -54 lines changed

content/zh-cn/docs/concepts/workloads/pods/disruptions.md

Lines changed: 56 additions & 54 deletions
@@ -18,20 +18,20 @@ weight: 60
 <!--
 This guide is for application owners who want to build
 highly available applications, and thus need to understand
-what types of Disruptions can happen to Pods.
+what types of disruptions can happen to Pods.
 -->
-本指南针对的是希望构建高可用性应用程序的应用所有者,他们有必要了解可能发生在 Pod 上的干扰类型。
+本指南针对的是希望构建高可用性应用的应用所有者,他们有必要了解可能发生在 Pod 上的干扰类型。

 <!--
-It is also for Cluster Administrators who want to perform automated
+It is also for cluster administrators who want to perform automated
 cluster actions, like upgrading and autoscaling clusters.
 -->
 文档同样适用于想要执行自动化集群操作(例如升级和自动扩展集群)的集群管理员。

 <!-- body -->

 <!--
-## Voluntary and Involuntary Disruptions
+## Voluntary and involuntary disruptions

 Pods do not disappear until someone (a person or a controller) destroys them, or
 there is an unavoidable hardware or system software error.
@@ -44,7 +44,7 @@ Pod 不会消失,除非有人(用户或控制器)将其销毁,或者出
 We call these unavoidable cases *involuntary disruptions* to
 an application. Examples are:
 -->
-我们把这些不可避免的情况称为应用的*非自愿干扰(Involuntary Disruptions)*。例如:
+我们把这些不可避免的情况称为应用的**非自愿干扰(Involuntary Disruptions)**。例如:

 <!--
 - a hardware failure of the physical machine backing the node
@@ -74,9 +74,9 @@ We call other cases *voluntary disruptions*. These include both
 actions initiated by the application owner and those initiated by a Cluster
 Administrator. Typical application owner actions include:
 -->
-我们称其他情况为*自愿干扰(Voluntary Disruptions)*
-包括由应用程序所有者发起的操作和由集群管理员发起的操作。典型的应用程序所有者的操
-作包括
+我们称其他情况为**自愿干扰(Voluntary Disruptions)**
+包括由应用所有者发起的操作和由集群管理员发起的操作。
+典型的应用所有者的操作包括

 <!--
 - deleting the deployment or other controller that manages the pod
@@ -88,7 +88,7 @@ Administrator. Typical application owner actions include:
 - 直接删除 Pod(例如,因为误操作)

 <!--
-Cluster Administrator actions include:
+Cluster administrator actions include:

 - [Draining a node](/docs/tasks/administer-cluster/safely-drain-node/) for repair or upgrade.
 - Draining a node from a cluster to scale the cluster down (learn about
@@ -126,7 +126,7 @@ deleting deployments or pods bypasses Pod Disruption Budgets.
 {{< /caution >}}

 <!--
-## Dealing with Disruptions
+## Dealing with disruptions

 Here are some ways to mitigate involuntary disruptions:
 -->
@@ -135,7 +135,7 @@ Here are some ways to mitigate involuntary disruptions:
 以下是减轻非自愿干扰的一些方法:

 <!--
-- Ensure your pod [requests the resources](/docs/tasks/configure-pod-container/assign-cpu-ram-container) it needs.
+- Ensure your pod [requests the resources](/docs/tasks/configure-pod-container/assign-memory-resource) it needs.
 - Replicate your application if you need higher availability. (Learn about running replicated
 [stateless](/docs/tasks/run-application/run-stateless-application-deployment/)
 and [stateful](/docs/tasks/run-application/run-replicated-stateful-application/) applications.)
@@ -146,12 +146,12 @@ and [stateful](/docs/tasks/run-application/run-replicated-stateful-application/)
 [multi-zone cluster](/docs/setup/multiple-zones).)
 -->
 - 确保 Pod 在请求中给出[所需资源](/zh-cn/docs/tasks/configure-pod-container/assign-memory-resource/)
-- 如果需要更高的可用性,请复制应用程序
+- 如果需要更高的可用性,请复制应用
 (了解有关运行多副本的[无状态](/zh-cn/docs/tasks/run-application/run-stateless-application-deployment/)
-[有状态](/zh-cn/docs/tasks/run-application/run-replicated-stateful-application/)应用程序的信息。)
-- 为了在运行复制应用程序时获得更高的可用性,请跨机架(使用
+[有状态](/zh-cn/docs/tasks/run-application/run-replicated-stateful-application/)应用的信息。)
+- 为了在运行复制应用时获得更高的可用性,请跨机架(使用
 [反亲和性](/zh-cn/docs/concepts/scheduling-eviction/assign-pod-node/#affinity-and-anti-affinity)
-或跨区域(如果使用[多区域集群](/zh-cn/docs/setup/best-practices/multiple-zones/)扩展应用程序
+或跨区域(如果使用[多区域集群](/zh-cn/docs/setup/best-practices/multiple-zones/)扩展应用

 <!--
 The frequency of voluntary disruptions varies. On a basic Kubernetes cluster, there are
@@ -178,18 +178,18 @@ Kubernetes offers features to help run highly available applications at the same
 time as frequent voluntary disruptions. We call this set of features
 *Disruption Budgets*.
 -->
-Kubernetes 提供特性来满足在出现频繁自愿干扰的同时运行高可用的应用程序。我们称这些特性为
-*干扰预算(Disruption Budget)*
+Kubernetes 提供特性来满足在出现频繁自愿干扰的同时运行高可用的应用。我们称这些特性为
+**干扰预算(Disruption Budget)**

 <!--
 ## Pod disruption budgets

 Kubernetes offers features to help you run highly available applications even when you
 introduce frequent voluntary disruptions.

-An Application Owner can create a `PodDisruptionBudget` object (PDB) for each application.
-A PDB limits the number of pods of a replicated application that are down simultaneously from
-voluntary disruptions. For example, a quorum-based application would
+As an application owner, you can create a PodDisruptionBudget (PDB) for each application.
+A PDB limits the number of Pods of a replicated application that are down simultaneously from
+voluntary disruptions. For example, a quorum-based application would
 like to ensure that the number of replicas running is never brought below the
 number needed for a quorum. A web front end might want to
 ensure that the number of replicas serving load never falls below a certain
@@ -199,18 +199,17 @@ percentage of the total.

 {{< feature-state for_k8s_version="v1.21" state="stable" >}}

-即使你会经常引入自愿性干扰,Kubernetes 也能够支持你运行高度可用的应用
+即使你会经常引入自愿性干扰,Kubernetes 提供的功能也能够支持你运行高度可用的应用

-应用程序所有者可以为每个应用程序创建 `PodDisruptionBudget` 对象(PDB)。
-PDB 将限制在同一时间因自愿干扰导致的复制应用程序中宕机的 pod 数量。
-例如,基于票选机制的应用程序希望确保运行的副本数永远不会低于仲裁所需的数量
+作为一个应用的所有者,你可以为每个应用创建一个 `PodDisruptionBudget`(PDB)。
+PDB 将限制在同一时间因自愿干扰导致的多副本应用中发生宕机的 Pod 数量。
+例如,基于票选机制的应用希望确保运行中的副本数永远不会低于票选所需的数量
 Web 前端可能希望确保提供负载的副本数量永远不会低于总数的某个百分比。

 <!--
 Cluster managers and hosting providers should use tools which
 respect PodDisruptionBudgets by calling the [Eviction API](/docs/tasks/administer-cluster/safely-drain-node/#eviction-api)
-instead of directly deleting pods or deployments. Examples are the `kubectl drain` command
-and the Kubernetes-on-GCE cluster upgrade script (`cluster/gce/upgrade.sh`).
+instead of directly deleting pods or deployments.
 -->
 集群管理员和托管提供商应该使用遵循 PodDisruptionBudgets 的接口
 (通过调用[Eviction API](/zh-cn/docs/tasks/administer-cluster/safely-drain-node/#the-eviction-api)),
@@ -219,38 +218,41 @@ and the Kubernetes-on-GCE cluster upgrade script (`cluster/gce/upgrade.sh`).
 <!--
 For example, the `kubectl drain` subcommand lets you mark a node as going out of
 service. When you run `kubectl drain`, the tool tries to evict all of the Pods on
-the Node you'are taking out of service. The eviction request may be temporarily rejected,
-and the tool periodically retries all failed requests until all pods
-are terminated, or until a configurable timeout is reached.
+the Node you're taking out of service. The eviction request that `kubectl` submits on
+your behalf may be temporarily rejected, so the tool periodically retries all failed
+requests until all Pods on the target node are terminated, or until a configurable timeout is reached.
 -->
 例如,`kubectl drain` 命令可以用来标记某个节点即将停止服务。
-运行 `kubectl drain` 命令时,工具会尝试驱逐机器上的所有 Pod。
-`kubectl` 所提交的驱逐请求可能会暂时被拒绝,所以该工具会定时重试失败的请求,
-直到所有的 Pod 都被终止,或者达到配置的超时时间。
+运行 `kubectl drain` 命令时,工具会尝试驱逐你所停服的节点上的所有 Pod。
+`kubectl` 代表你所提交的驱逐请求可能会暂时被拒绝,
+所以该工具会周期性地重试所有失败的请求,
+直到目标节点上的所有的 Pod 都被终止,或者达到配置的超时时间。

 <!--
 A PDB specifies the number of replicas that an application can tolerate having, relative to how
 many it is intended to have. For example, a Deployment which has a `.spec.replicas: 5` is
 supposed to have 5 pods at any given time. If its PDB allows for there to be 4 at a time,
-then the Eviction API will allow voluntary disruption of one, but not two pods, at a time.
+then the Eviction API will allow voluntary disruption of one (but not two) pods at a time.
 -->
-PDB 指定应用程序可以容忍的副本数量(相当于应该有多少副本)。
+PDB 指定应用可以容忍的副本数量(相当于应该有多少副本)。
 例如,具有 `.spec.replicas: 5` 的 Deployment 在任何时间都应该有 5 个 Pod。
-如果 PDB 允许其在某一时刻有 4 个副本,那么驱逐 API 将允许同一时刻仅有一个而不是两个 Pod 自愿干扰。
+如果 PDB 允许其在某一时刻有 4 个副本,那么驱逐 API 将允许同一时刻仅有一个(而不是两个)Pod 自愿干扰。

 <!--
 The group of pods that comprise the application is specified using a label selector, the same
 as the one used by the application's controller (deployment, stateful-set, etc).
 -->
-使用标签选择器来指定构成应用程序的一组 Pod,这与应用程序的控制器(Deployment,StatefulSet 等)
+使用标签选择器来指定构成应用的一组 Pod,这与应用的控制器(Deployment,StatefulSet 等)
 选择 Pod 的逻辑一样。

 <!--
-The "intended" number of pods is computed from the `.spec.replicas` of the pods controller.
-The controller is discovered from the pods using the `.metadata.ownerReferences` of the object.
+The "intended" number of pods is computed from the `.spec.replicas` of the workload resource
+that is managing those pods. The control plane discovers the owning workload resource by
+examining the `.metadata.ownerReferences` of the Pod.
 -->
-Pod 控制器的 `.spec.replicas` 计算“预期的” Pod 数量。
-根据 Pod 对象的 `.metadata.ownerReferences` 字段来发现控制器。
+Pod 的“预期”数量由管理这些 Pod 的工作负载资源的 `.spec.replicas` 参数计算出来的。
+控制平面通过检查 Pod 的
+`.metadata.ownerReferences` 来发现关联的工作负载资源。

 <!--
 [Involuntary disruptions](#voluntary-and-involuntary-disruptions) cannot be prevented by PDBs; however they
@@ -262,13 +264,14 @@ PDB 无法防止[非自愿干扰](#voluntary-and-involuntary-disruptions);

 <!--
 Pods which are deleted or unavailable due to a rolling upgrade to an application do count
-against the disruption budget, but controllers (like deployment and stateful-set)
-are not limited by PDBs when doing rolling upgrades - the handling of failures
-during application updates is configured in spec for the specific workload resource.
+against the disruption budget, but workload resources (such as Deployment and StatefulSet)
+are not limited by PDBs when doing rolling upgrades. Instead, the handling of failures
+during application updates is configured in the spec for the specific workload resource.
 -->
-由于应用程序的滚动升级而被删除或不可用的 Pod 确实会计入干扰预算,
-但是控制器(如 Deployment 和 StatefulSet)在进行滚动升级时不受 PDB
-的限制。应用程序更新期间的故障处理方式是在对应的工作负载资源的 `spec` 中配置的。
+由于应用的滚动升级而被删除或不可用的 Pod 确实会计入干扰预算,
+但是工作负载资源(如 Deployment 和 StatefulSet)
+在进行滚动升级时不受 PDB 的限制。
+应用更新期间的故障处理方式是在对应的工作负载资源的 `spec` 中配置的。

 <!--
 When a pod is evicted using the eviction API, it is gracefully
@@ -282,14 +285,13 @@ hornoring the
 中的 `terminationGracePeriodSeconds` 配置值。

 <!--
-## PDB Example
-
+## PodDisruptionBudget example {#pdb-example}
 Consider a cluster with 3 nodes, `node-1` through `node-3`.
 The cluster is running several applications. One of them has 3 replicas initially called
 `pod-a`, `pod-b`, and `pod-c`. Another, unrelated pod without a PDB, called `pod-x`, is also shown.
 Initially, the pods are laid out as follows:
 -->
-## PDB 例子 {#pdb-example}
+## PodDisruptionBudget 例子 {#pdb-example}

 假设集群有 3 个节点,`node-1``node-3`。集群上运行了一些应用。
 其中一个应用有 3 个副本,分别是 `pod-a``pod-b``pod-c`
@@ -316,7 +318,7 @@ This puts the cluster in this state:
 -->

 例如,假设集群管理员想要重启系统,升级内核版本来修复内核中的缺陷。
-集群管理员首先使用 `kubectl drain` 命令尝试排空 `node-1` 节点。
+集群管理员首先使用 `kubectl drain` 命令尝试腾空 `node-1` 节点。
 命令尝试驱逐 `pod-a``pod-x`。操作立即就成功了。
 两个 Pod 同时进入 `terminating` 状态。这时的集群处于下面的状态:

@@ -426,7 +428,7 @@ can happen, according to:
 - the type of controller
 - the cluster's resource capacity
 -->
-- 应用程序需要多少个副本
+- 应用需要多少个副本
 - 优雅关闭应用实例需要多长时间
 - 启动应用新实例需要多长时间
 - 控制器的类型
@@ -531,7 +533,7 @@ may make sense in these scenarios:
 there is natural specialization of roles
 - when third-party tools or services are used to automate cluster management
 -->
-- 当有许多应用程序团队共用一个 Kubernetes 集群,并且有自然的专业角色
+- 当有许多应用团队共用一个 Kubernetes 集群,并且有自然的专业角色
 - 当第三方工具或服务用于集群自动化管理

 <!--
@@ -573,11 +575,11 @@ the nodes in your cluster, such as a node or system software upgrade, here are s
 - 接受升级期间的停机时间。
 - 故障转移到另一个完整的副本集群。
 - 没有停机时间,但是对于重复的节点和人工协调成本可能是昂贵的。
-- 编写可容忍干扰的应用程序和使用 PDB。
+- 编写可容忍干扰的应用和使用 PDB。
 - 不停机。
 - 最小的资源重复。
 - 允许更多的集群管理自动化。
-- 编写可容忍干扰的应用程序是棘手的,但对于支持容忍自愿干扰所做的工作,和支持自动扩缩和容忍非
+- 编写可容忍干扰的应用是棘手的,但对于支持容忍自愿干扰所做的工作,和支持自动扩缩和容忍非
 自愿干扰所做工作相比,有大量的重叠

 ## {{% heading "whatsnext" %}}
