| 1 | +--- |
| 2 | +title: Pod Scheduling Readiness |
| 3 | +content_type: concept |
| 4 | +weight: 40 |
| 5 | +--- |
| 6 | + |
| 7 | +<!-- |
| 8 | +title: Pod Scheduling Readiness |
| 9 | +content_type: concept |
| 10 | +weight: 40 |
| 11 | +--> |
| 12 | + |
| 13 | +<!-- overview --> |
| 14 | + |
| 15 | +{{< feature-state for_k8s_version="v1.26" state="alpha" >}} |
| 16 | + |
| 17 | +<!-- |
| 18 | +Pods were considered ready for scheduling once created. Kubernetes scheduler |
| 19 | +does its due diligence to find nodes to place all pending Pods. However, in a |
| 20 | +real-world case, some Pods may stay in a "miss-essential-resources" state for a long period. |
| 21 | +These Pods actually churn the scheduler (and downstream integrators like Cluster AutoScaler) |
| 22 | +in an unnecessary manner. |
| 23 | +
|
| 24 | +By specifying/removing a Pod's `.spec.schedulingGates`, you can control when a Pod is ready |
| 25 | +to be considered for scheduling. |
| 26 | +--> |
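| 27 | +Pods are considered ready for scheduling as soon as they are created. |
| 28 | +The Kubernetes scheduler does its due diligence to find nodes on which to place every pending Pod. |
| 29 | +In practice, however, some Pods can stay in a "missing essential resources" state for a long time. |
| 30 | +Such Pods needlessly churn the scheduler (and downstream integrators such as Cluster AutoScaler). |
| 31 | + |
| 32 | +By specifying or removing a Pod's `.spec.schedulingGates`, you can control when the Pod is ready to be considered for scheduling. |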
| 33 | + |
| 34 | +<!-- body --> |
| 35 | + |
| 36 | +<!-- |
| 37 | +## Configuring Pod schedulingGates |
| 38 | +
|
| 39 | +The `schedulingGates` field contains a list of strings, and each string literal is perceived as a |
| 40 | +criteria that Pod should be satisfied before considered schedulable. This field can be initialized |
| 41 | +only when a Pod is created (either by the client, or mutated during admission). After creation, |
| 42 | +each schedulingGate can be removed in arbitrary order, but addition of a new scheduling gate is disallowed. |
| 43 | +--> |
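| 44 | +## Configuring Pod schedulingGates |
| 45 | + |
| 46 | +The `schedulingGates` field contains a list of strings, each of which is treated as a criterion that the Pod must satisfy before it is considered schedulable. |
| 47 | +This field can be initialized only when the Pod is created (either by the client, or by mutation during admission). |
| 48 | +After creation, each schedulingGate can be removed in any order, but adding a new scheduling gate is not allowed. |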
| 49 | + |
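The removal-only rule described above can be illustrated with a small model. This is a minimal sketch, not the API server's validation code: the function names `validate_gate_update` and `is_ready_for_scheduling` are hypothetical, and gates are modeled as plain strings.

```python
def validate_gate_update(old_gates, new_gates):
    """Return True if new_gates could be reached from old_gates by removals only.

    Hypothetical helper: removing gates in any order is allowed,
    introducing a gate that was not present before is not.
    """
    old = set(old_gates)
    return all(gate in old for gate in new_gates)


def is_ready_for_scheduling(gates):
    """A Pod is considered for scheduling only once its gate list is empty."""
    return len(gates) == 0


gates = ["foo", "bar"]
print(validate_gate_update(gates, ["bar"]))         # removing "foo" first is allowed
print(validate_gate_update(gates, ["foo", "baz"]))  # adding "baz" is rejected
print(is_ready_for_scheduling([]))                  # no gates left, so the Pod is ready
```

The model ignores duplicate gate names and everything else in the Pod spec; it only captures the two rules stated in the paragraph.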
| 50 | +<!-- |
| 51 | +{{<mermaid>}} |
| 52 | +stateDiagram-v2 |
| 53 | + s1: pod created |
| 54 | + s2: pod scheduling gated |
| 55 | + s3: pod scheduling ready |
| 56 | + s4: pod running |
| 57 | + if: empty scheduling gates? |
| 58 | + [*] --> s1 |
| 59 | + s1 --> if |
| 60 | + s2 --> if: scheduling gate removed |
| 61 | + if --> s2: no |
| 62 | + if --> s3: yes |
| 63 | + s3 --> s4 |
| 64 | + s4 --> [*] |
| 65 | +{{< /mermaid >}} |
| 66 | +--> |
| 67 | +{{<mermaid>}} |
| 68 | +stateDiagram-v2 |
| 69 | + s1: Pod created |
| 70 | + s2: Pod scheduling gated |
| 71 | + s3: Pod scheduling ready |
| 72 | + s4: Pod running |
| 73 | + if: empty scheduling gates? |
| 74 | + [*] --> s1 |
| 75 | + s1 --> if |
| 76 | + s2 --> if: scheduling gate removed |
| 77 | + if --> s2: no |
| 78 | + if --> s3: yes |
| 79 | + s3 --> s4 |
| 80 | + s4 --> [*] |
| 81 | +{{< /mermaid >}} |
| 82 | + |
| 83 | +<!-- |
| 84 | +## Usage example |
| 85 | +
|
| 86 | +To mark a Pod not-ready for scheduling, you can create it with one or more scheduling gates like this: |
| 87 | +
|
| 88 | +{{< codenew file="pods/pod-with-scheduling-gates.yaml" >}} |
| 89 | +
|
| 90 | +After the Pod's creation, you can check its state using: |
| 91 | +--> |
| 92 | +## Usage example |
| 93 | + |
| 94 | +To mark a Pod as not ready for scheduling, create it with one or more scheduling gates, like this: |
| 95 | + |
| 96 | +{{< codenew file="pods/pod-with-scheduling-gates.yaml" >}} |
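The referenced manifest file is not reproduced on this page. As a rough sketch of what a gated Pod might look like, the snippet below builds one as a Python dictionary and prints it as JSON. The Pod name `test-pod` and the gates `foo` and `bar` match the command output shown in this section; the helper `build_gated_pod`, the container name `pause`, and the image `registry.k8s.io/pause:3.6` are assumptions made for illustration.

```python
import json


def build_gated_pod(name, gate_names, image):
    """Build a Pod manifest whose spec carries one scheduling gate per name."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Each gate is an object with a single "name" field.
            "schedulingGates": [{"name": gate} for gate in gate_names],
            # Container name and image are placeholders, not taken from the page.
            "containers": [{"name": "pause", "image": image}],
        },
    }


manifest = build_gated_pod("test-pod", ["foo", "bar"], "registry.k8s.io/pause:3.6")
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON manifests as well as YAML, the printed output could be piped to `kubectl apply -f -` on a cluster where the feature is enabled.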
| 97 | + |
| 98 | +After the Pod is created, you can check its state with: |
| 99 | + |
| 100 | +```bash |
| 101 | +kubectl get pod test-pod |
| 102 | +``` |
| 103 | + |
| 104 | +<!-- |
| 105 | +The output reveals it's in `SchedulingGated` state: |
| 106 | +--> |
| 107 | +The output shows that the Pod is in the `SchedulingGated` state: |
| 108 | + |
| 109 | +```none |
| 110 | +NAME READY STATUS RESTARTS AGE |
| 111 | +test-pod 0/1 SchedulingGated 0 7s |
| 112 | +``` |
| 113 | + |
| 114 | +<!-- |
| 115 | +You can also check its `schedulingGates` field by running: |
| 116 | +--> |
| 117 | +You can also inspect its `schedulingGates` field by running: |
| 118 | + |
| 119 | +```bash |
| 120 | +kubectl get pod test-pod -o jsonpath='{.spec.schedulingGates}' |
| 121 | +``` |
| 122 | + |
| 123 | +<!-- |
| 124 | +The output is: |
| 125 | +--> |
| 126 | +The output is: |
| 127 | + |
| 128 | +```none |
| 129 | +[{"name":"foo"},{"name":"bar"}] |
| 130 | +``` |
| 131 | + |
| 132 | +<!-- |
| 133 | +To inform scheduler this Pod is ready for scheduling, you can remove its `schedulingGates` entirely |
| 134 | +by re-applying a modified manifest: |
| 135 | +
|
| 136 | +{{< codenew file="pods/pod-without-scheduling-gates.yaml" >}} |
| 137 | +
|
| 138 | +You can check if the `schedulingGates` is cleared by running: |
| 139 | +--> |
| 140 | +To tell the scheduler that this Pod is ready for scheduling, remove its `schedulingGates` entirely by re-applying a modified manifest: |
| 141 | + |
| 142 | +{{< codenew file="pods/pod-without-scheduling-gates.yaml" >}} |
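The modified manifest is the original one with the `schedulingGates` field dropped. A minimal sketch of that transformation is shown below; the helper `without_scheduling_gates` is hypothetical, and the container name and image in the sample manifest are placeholders rather than values taken from the referenced file.

```python
import copy


def without_scheduling_gates(manifest):
    """Return a copy of the manifest with spec.schedulingGates removed."""
    cleared = copy.deepcopy(manifest)  # leave the caller's manifest untouched
    cleared.get("spec", {}).pop("schedulingGates", None)
    return cleared


gated = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "test-pod"},
    "spec": {
        "schedulingGates": [{"name": "foo"}, {"name": "bar"}],
        # Placeholder container, assumed for illustration only.
        "containers": [{"name": "pause", "image": "registry.k8s.io/pause:3.6"}],
    },
}

ready = without_scheduling_gates(gated)
print("schedulingGates" in ready["spec"])
```

Everything other than the gate list is left unchanged, which matches the rule that gates can only be removed after creation.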
| 143 | + |
| 144 | +You can check whether `schedulingGates` has been cleared by running: |
| 145 | + |
| 146 | +```bash |
| 147 | +kubectl get pod test-pod -o jsonpath='{.spec.schedulingGates}' |
| 148 | +``` |
| 149 | + |
| 150 | +<!-- |
| 151 | +The output is expected to be empty. And you can check its latest status by running: |
| 152 | +--> |
| 153 | +The output is expected to be empty. You can then check the Pod's latest status by running: |
| 154 | + |
| 155 | +```bash |
| 156 | +kubectl get pod test-pod -o wide |
| 157 | +``` |
| 158 | + |
| 159 | +<!-- |
| 160 | +Given the test-pod doesn't request any CPU/memory resources, it's expected that this Pod's state get |
| 161 | +transited from previous `SchedulingGated` to `Running`: |
| 162 | +--> |
| 163 | +Because test-pod does not request any CPU or memory resources, its state is expected to transition from `SchedulingGated` to `Running`: |
| 164 | + |
| 165 | +```none |
| 166 | +NAME READY STATUS RESTARTS AGE IP NODE |
| 167 | +test-pod 1/1 Running 0 15s 10.0.0.4 node-2 |
| 168 | +``` |
| 169 | + |
| 170 | +<!-- |
| 171 | +## Observability |
| 172 | +
|
| 173 | +The metric `scheduler_pending_pods` comes with a new label `"gated"` to distinguish whether a Pod |
| 174 | +has been tried scheduling but claimed as unschedulable, or explicitly marked as not ready for |
| 175 | +scheduling. You can use `scheduler_pending_pods{queue="gated"}` to check the metric result. |
| 176 | +--> |
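| 177 | +## Observability |
| 178 | + |
| 179 | +The `scheduler_pending_pods` metric has a new value, `"gated"`, for its `queue` label. |
| 180 | +It distinguishes Pods that were tried for scheduling and found unschedulable from Pods that are explicitly marked as not ready for scheduling. |
| 181 | +You can use `scheduler_pending_pods{queue="gated"}` to check the metric. |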
| 182 | + |
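To make the `queue="gated"` selector concrete, the sketch below reads a sample of Prometheus text-format output and picks out the per-queue values. The sample text, its numbers, and the queue names other than `gated` are invented for illustration; the helper `pending_pods_by_queue` is hypothetical and handles only this one metric with a single `queue` label.

```python
import re

# Invented sample in Prometheus text exposition format; values are not real data.
SAMPLE_METRICS = """\
# TYPE scheduler_pending_pods gauge
scheduler_pending_pods{queue="active"} 0
scheduler_pending_pods{queue="backoff"} 1
scheduler_pending_pods{queue="gated"} 2
scheduler_pending_pods{queue="unschedulable"} 3
"""

LINE = re.compile(r'^scheduler_pending_pods\{queue="([^"]+)"\}\s+(\S+)$')


def pending_pods_by_queue(text):
    """Map each queue label value to its pending-Pod count."""
    counts = {}
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if match:
            counts[match.group(1)] = float(match.group(2))
    return counts


counts = pending_pods_by_queue(SAMPLE_METRICS)
print(counts["gated"])
```

In a real setup the same selector would normally be evaluated by the monitoring system rather than by hand-parsing the scrape output.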
| 183 | +## {{% heading "whatsnext" %}} |
| 184 | + |
| 185 | +<!-- |
| 186 | +* Read the [PodSchedulingReadiness KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/3521-pod-scheduling-readiness) for more details |
| 187 | +--> |
| 188 | + |
| 189 | +* Read the [PodSchedulingReadiness KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/3521-pod-scheduling-readiness) for more details |