---
reviewers:
- derekwaynecarr
title: Process ID Limits And Reservations
content_type: concept
weight: 40
---

<!-- overview -->

{{< feature-state for_k8s_version="v1.20" state="stable" >}}

Kubernetes allows you to limit the number of process IDs (PIDs) that a
{{< glossary_tooltip term_id="Pod" text="Pod" >}} can use.
You can also reserve a number of allocatable PIDs for each
{{< glossary_tooltip term_id="node" text="node" >}} for use by the operating
system and daemons (rather than by Pods).

<!-- body -->

Process IDs (PIDs) are a fundamental resource on nodes. It is trivial to hit the
task limit without hitting any other resource limits, which can then cause
instability on a host machine.

Cluster administrators require mechanisms to ensure that Pods running in the
cluster cannot induce PID exhaustion that prevents host daemons (such as the
{{< glossary_tooltip text="kubelet" term_id="kubelet" >}} or
{{< glossary_tooltip text="kube-proxy" term_id="kube-proxy" >}},
and potentially also the container runtime) from running.
In addition, it is important to ensure that PIDs are limited among Pods in order
to ensure they have limited impact on other workloads on the same node.

{{< note >}}
On certain Linux installations, the operating system sets the PIDs limit to a
low default, such as `32768`. Consider raising the value of
`/proc/sys/kernel/pid_max`.
{{< /note >}}
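As a quick check, you can inspect this kernel setting on a Linux node and, with root privileges, raise it. The value `4194304` below is purely illustrative, not a recommendation:

```shell
# Read the node's current kernel-wide PID limit
pid_max="$(cat /proc/sys/kernel/pid_max)"
echo "kernel.pid_max is ${pid_max}"

# To raise it for the running kernel (requires root), you could run:
#   sysctl -w kernel.pid_max=4194304
# To persist the change across reboots, you could add a drop-in file:
#   echo "kernel.pid_max=4194304" > /etc/sysctl.d/90-pid-max.conf
```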

You can configure a kubelet to limit the number of PIDs a given Pod can consume.
For example, if your node's host OS is set to use a maximum of `262144` PIDs and
you expect the node to host fewer than `250` Pods, you can give each Pod a
budget of `1000` PIDs to avoid exhausting the node's overall supply of
available PIDs. If an administrator wants to overcommit PIDs, similar to CPU or
memory, they may do so, with some additional risk. Either way, no single Pod
will be able to bring the whole machine down. This kind of resource limiting
helps to prevent simple fork bombs from affecting the operation of an entire
cluster.

Per-Pod PID limiting allows administrators to protect one Pod from another, but
does not ensure that all Pods scheduled onto a host are unable to impact the
node overall. Per-Pod limiting also does not protect the node agents themselves
from PID exhaustion.

You can also reserve an amount of PIDs for node overhead, separate from the
allocation to Pods. This is similar to how you can reserve CPU, memory, or
other resources for use by the operating system and other facilities outside of
Pods and their containers.

PID limiting is an important sibling to
[compute resource](/docs/concepts/configuration/manage-resources-containers/)
requests and limits. However, you specify it in a different way: rather than
defining a Pod's resource limit in the `.spec` for a Pod, you configure the
limit as a setting on the kubelet. Pod-defined PID limits are not currently
supported.

{{< caution >}}
This means that the limit that applies to a Pod may be different depending on
where the Pod is scheduled. To make things simple, it is easiest if all Nodes
use the same PID resource limits and reservations.
{{< /caution >}}

## Node PID limits {#node-pid-limits}

Kubernetes allows you to reserve a number of process IDs for system use. To
configure the reservation, use the parameter `pid=<number>` in the
`--system-reserved` and `--kube-reserved` command line options to the kubelet.
The value you specify declares that the specified number of process IDs will be
reserved for the system as a whole and for Kubernetes system daemons
respectively.
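As a sketch, such reservations can be expressed in the kubelet
[configuration file](/docs/tasks/administer-cluster/kubelet-config-file/); the
numbers here are illustrative assumptions, not recommendations:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
systemReserved:
  pid: "1000"  # PIDs reserved for operating system daemons (illustrative)
kubeReserved:
  pid: "500"   # PIDs reserved for Kubernetes system daemons (illustrative)
```

The same reservations can be passed on the command line as
`--system-reserved=pid=1000 --kube-reserved=pid=500`.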

{{< note >}}
Before Kubernetes version 1.20, PID resource limiting with Node-level
reservations required enabling the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`SupportNodePidsLimit` to work.
{{< /note >}}

## Pod PID limits {#pod-pid-limits}

Kubernetes allows you to limit the number of processes running in a Pod. You
specify this limit at the node level, rather than configuring it as a resource
limit for a particular Pod. Each Node can have a different PID limit.
To configure the limit, you can specify the command line parameter
`--pod-max-pids` to the kubelet, or set `PodPidsLimit` in the kubelet
[configuration file](/docs/tasks/administer-cluster/kubelet-config-file/).
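For example, a kubelet configuration file fragment that caps every Pod on the
node might look like this; the budget of `1024` PIDs is an illustrative choice:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Maximum number of PIDs any single Pod on this node may use (illustrative)
podPidsLimit: 1024
```

The equivalent command line form is `--pod-max-pids=1024`.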

{{< note >}}
Before Kubernetes version 1.20, PID resource limiting for Pods required
enabling the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`SupportPodPidsLimit` to work.
{{< /note >}}

## PID based eviction {#pid-based-eviction}

You can configure the kubelet to start terminating a Pod when it is misbehaving
and consuming an abnormal amount of resources. This feature is called eviction.
You can
[Configure Out of Resource Handling](/docs/tasks/administer-cluster/out-of-resource)
for various eviction signals.
Use the `pid.available` eviction signal to configure the threshold for the
number of PIDs used by a Pod. You can set soft and hard eviction policies.
However, even with the hard eviction policy, if the number of PIDs is growing
very fast, the node can still get into an unstable state by hitting the node
PIDs limit.
The eviction signal value is calculated periodically and does NOT enforce the
limit.
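As a sketch, soft and hard thresholds for the `pid.available` signal can be set
in the kubelet configuration file; the percentages and grace period below are
illustrative assumptions:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionSoft:
  pid.available: "10%"   # start soft eviction when fewer than 10% of PIDs remain
evictionSoftGracePeriod:
  pid.available: "2m"    # how long the soft threshold may be exceeded first
evictionHard:
  pid.available: "5%"    # evict immediately below this threshold
```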

Per-Pod and per-Node PID limiting sets a hard limit.
Once the limit is hit, the workload will start experiencing failures when
trying to get a new PID. It may or may not lead to rescheduling of a Pod,
depending on how the workload reacts to these failures and how the liveness and
readiness probes are configured for the Pod. However, if the limits were set
correctly, you can guarantee that other Pods' workloads and system processes
will not run out of PIDs when one Pod is misbehaving.

## {{% heading "whatsnext" %}}

- Refer to the [PID Limiting enhancement document](https://github.com/kubernetes/enhancements/blob/097b4d8276bc9564e56adf72505d43ce9bc5e9e8/keps/sig-node/20190129-pid-limiting.md)
  for more information.
- For historical context, read
  [Process ID Limiting for Stability Improvements in Kubernetes 1.14](/blog/2019/04/15/process-id-limiting-for-stability-improvements-in-kubernetes-1.14/).
- Read [Managing Resources for Containers](/docs/concepts/configuration/manage-resources-containers/).
- Learn how to [Configure Out of Resource Handling](/docs/tasks/administer-cluster/out-of-resource).