---
title: Pod Overhead
content_type: concept
- weight: 20
+ weight: 30
---
+ <!--
+ ---
+ reviewers:
+ - dchen1107
+ - egernst
+ - tallclair
+ title: Pod Overhead
+ content_type: concept
+ weight: 30
+ ---
+ -->
+
<!-- overview -->
{{< feature-state for_k8s_version="v1.18" state="beta" >}}
@@ -58,7 +70,7 @@ across your cluster, and a `RuntimeClass` is utilized which defines the `overhea
You need to make sure the `PodOverhead` [feature gate](/zh/docs/reference/command-line-tools-reference/feature-gates/)
is enabled across your cluster (it is on by default as of 1.18), and that a `RuntimeClass` is used which defines the `overhead` field.
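
For example, a minimal sketch of turning the gate on for a kubelet through its configuration file; how this file is wired up depends on how the node is provisioned, and the gate also has to be enabled on the API server and scheduler via their `--feature-gates=PodOverhead=true` flag:

```yaml
# Sketch of a kubelet configuration fragment (an assumption, not taken from this page);
# only the featureGates stanza is shown.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  PodOverhead: true
```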

- <!--
+ <!--
## Usage example
-->
## Usage example
@@ -85,7 +97,7 @@ overhead:
cpu: "250m"
```

- <!--
+ <!--
Workloads that are created specifying the `kata-fc` RuntimeClass handler will take the memory and
CPU overheads into account for resource quota calculations, node scheduling, and Pod cgroup sizing.
@@ -119,7 +131,7 @@ spec:
memory: 100Mi
```

- <!--
+ <!--
At admission time the RuntimeClass [admission controller](/docs/reference/access-authn-authz/admission-controllers/)
updates the workload's PodSpec to include the `overhead` as described in the RuntimeClass. If the PodSpec already has this field defined,
the Pod will be rejected. In the given example, since only the RuntimeClass name is specified, the admission controller mutates the Pod
@@ -129,7 +141,7 @@ to include an `overhead`.
the `overhead` defined in the RuntimeClass. If this field is already defined in the PodSpec, the Pod will be rejected.
In this example, since only the RuntimeClass name is specified, the admission controller mutates the Pod to include an `overhead`.
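
For the `kata-fc` RuntimeClass used on this page, the admitted Pod ends up carrying an `overhead` stanza roughly like the following sketch (only the relevant fragment of the spec is shown):

```yaml
# Fragment of the Pod spec after admission; the values mirror the kata-fc
# RuntimeClass overhead used in this example.
spec:
  runtimeClassName: kata-fc
  overhead:
    cpu: 250m
    memory: 120Mi
```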

- <!--
+ <!--
After the RuntimeClass admission controller, you can check the updated PodSpec:
-->
After the RuntimeClass admission controller has run, you can check the updated PodSpec:
@@ -138,33 +150,33 @@ After the RuntimeClass admission controller, you can check the updated PodSpec:
kubectl get pod test-pod -o jsonpath='{.spec.overhead}'
```

- <!--
+ <!--
The output is:
-->
The output is:
```
map[cpu:250m memory:120Mi]
```

- <!--
+ <!--
If a ResourceQuota is defined, the sum of container requests as well as the
`overhead` field are counted.
-->
If a ResourceQuota is defined, the sum of the container requests and the `overhead` field is counted against it.
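
For instance, with a quota like the sketch below (the quota name and namespace are illustrative assumptions), the `test-pod` above would be charged 2250m of `requests.cpu` and 320Mi of `requests.memory`: its container requests plus the overhead.

```yaml
# Illustrative ResourceQuota; the name and namespace are assumptions.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: pod-overhead-quota
  namespace: quota-example
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 4Gi
```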

- <!--
+ <!--
When the kube-scheduler is deciding which node should run a new Pod, the scheduler considers that Pod's
`overhead` as well as the sum of container requests for that Pod. For this example, the scheduler adds the
requests and the overhead, then looks for a node that has 2.25 CPU and 320 MiB of memory available.
-->
When the kube-scheduler decides which node should run a new Pod, it considers that Pod's `overhead` as well as the sum of the Pod's container requests. For this example, the scheduler adds the requests and the overhead, then looks for a node that has 2.25 CPU and 320 MiB of memory available.
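
Spelled out with the numbers used on this page, the amount the scheduler accounts for is:

```
container requests:  2000m CPU + 200Mi memory
overhead:           + 250m CPU + 120Mi memory
total considered:    2250m CPU + 320Mi memory   (2.25 CPU, 320 MiB)
```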

- <!--
+ <!--
Once a Pod is scheduled to a node, the kubelet on that node creates a new {{< glossary_tooltip text="cgroup" term_id="cgroup" >}}
for the Pod. It is within this pod that the underlying container runtime will create containers. -->
Once a Pod is scheduled to a node, the kubelet on that node creates a new {{< glossary_tooltip text="cgroup" term_id="cgroup" >}} for the Pod. It is within this Pod's cgroup that the underlying container runtime will create containers.

- <!--
+ <!--
If the resource has a limit defined for each container (Guaranteed QoS or Burstable QoS with limits defined),
the kubelet will set an upper limit for the pod cgroup associated with that resource (cpu.cfs_quota_us for CPU
and memory.limit_in_bytes for memory). This upper limit is based on the sum of the container limits plus the `overhead`
@@ -179,31 +191,31 @@ requests plus the `overhead` defined in the PodSpec.
-->
For CPU, if the Pod's QoS class is Guaranteed or Burstable, the kubelet sets `cpu.shares` based on the sum of the container requests plus the `overhead` defined in the PodSpec.
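
With the container limits and overhead from this example, the Pod-level caps work out to:

```
memory: 100Mi + 100Mi + 120Mi = 320Mi   (container limits + overhead; matches the
                                         memory.limit_in_bytes value verified below)
cpu:    500m + 1500m + 250m   = 2250m   (container limits + overhead)
```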

- <!--
+ <!--
Looking at our example, verify the container requests for the workload:
-->
Looking at this example, verify the container requests for the workload:
```bash
kubectl get pod test-pod -o jsonpath='{.spec.containers[*].resources.limits}'
```

- <!--
+ <!--
The total container requests are 2000m CPU and 200MiB of memory:
-->
The total container requests are 2000m CPU and 200MiB of memory:
```
map[cpu:500m memory:100Mi] map[cpu:1500m memory:100Mi]
```

- <!--
+ <!--
Check this against what is observed by the node:
-->
Check this against what is observed by the node:
```bash
kubectl describe node | grep test-pod -B2
```

- <!--
+ <!--
The output shows 2250m CPU and 320MiB of memory are requested, which includes PodOverhead:
-->
The output shows that 2250m CPU and 320MiB of memory are requested, which includes PodOverhead:
@@ -226,8 +238,9 @@ cgroups directly on the node.

First, on the particular node, determine the Pod identifier:
-->
- Check the Pod's memory cgroups on the node where the workload runs. In the following example, [`crictl`](https://github.com/kubernetes-sigs/cri-tools/blob/master/docs/crictl.md), a CRI-compatible container runtime command-line tool, is used on the node. This is an advanced example to show PodOverhead behavior, and users are not expected to need to check cgroups directly on the node.
+ Check the Pod's memory cgroups on the node where the workload runs. In the following example,
+ [`crictl`](https://github.com/kubernetes-sigs/cri-tools/blob/master/docs/crictl.md),
+ a CRI-compatible container runtime command-line tool, is used on the node.

First, on the particular node, determine the Pod identifier:
@@ -240,7 +253,7 @@ First, on the particular node, determine the Pod identifier:
POD_ID="$(sudo crictl pods --name test-pod -q)"
```

- <!--
+ <!--
From this, you can determine the cgroup path for the Pod:
-->
From this, you can determine the cgroup path for the Pod:
@@ -254,15 +267,15 @@ From this, you can determine the cgroup path for the Pod:
sudo crictl inspectp -o=json $POD_ID | grep cgroupsPath
```

- <!--
+ <!--
The resulting cgroup path includes the Pod's `pause` container. The Pod level cgroup is one directory above.
-->
The resulting cgroup path includes the Pod's `pause` container. The Pod-level cgroup is one directory above it.
```
"cgroupsPath": "/kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2/7ccf55aee35dd16aca4189c952d83487297f3cd760f1bbf09620e206e7d0c27a"
```
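
To see the "one directory above" relationship concretely, you can strip the final (pause container) component from the path reported above with `dirname`:

```bash
# Drop the pause-container component to obtain the Pod-level cgroup directory.
dirname /kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2/7ccf55aee35dd16aca4189c952d83487297f3cd760f1bbf09620e206e7d0c27a
# -> /kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2
```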

- <!--
+ <!--
In this specific case, the pod cgroup path is `kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2`. Verify the Pod level cgroup setting for memory:
-->
In this specific case, the Pod's cgroup path is `kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2`. Verify the Pod-level cgroup setting for memory:
@@ -278,15 +291,15 @@ In this specific case, the pod cgroup path is `kubepods/podd7f4b509-cf94-4951-94
cat /sys/fs/cgroup/memory/kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2/memory.limit_in_bytes
```

- <!--
+ <!--
This is 320 MiB, as expected:
-->
This is 320 MiB, as expected (320 × 1024 × 1024 = 335544320 bytes):
```
335544320
```

- <!--
+ <!--
### Observability
-->
### Observability
@@ -298,8 +311,11 @@ running with a defined Overhead. This functionality is not available in the 1.9
kube-state-metrics, but is expected in a following release. Users will need to build kube-state-metrics
from source in the meantime.
-->
- In [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics), the `kube_pod_overhead` metric helps identify when PodOverhead is being utilized and helps observe the stability of workloads running with a defined overhead. This functionality is not available in the 1.9 release of kube-state-metrics, but is expected in a following release. Until then, users need to build kube-state-metrics from source.
+ In [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics), the `kube_pod_overhead`
+ metric helps identify when PodOverhead is being utilized and helps observe the stability
+ of workloads running with a defined overhead.
+ This functionality is not available in the 1.9 release of kube-state-metrics, but is expected in a following release.
+ Until then, users need to build kube-state-metrics from source.
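
Once a build that exposes the metric is running, one quick way to confirm it is being reported is to scrape the kube-state-metrics endpoint directly; the namespace, Service name, and port below are common defaults and are assumptions about your deployment:

```bash
# Forward the kube-state-metrics HTTP port locally (Service name, namespace,
# and port are assumptions; adjust them to match your deployment).
kubectl -n kube-system port-forward svc/kube-state-metrics 8080:8080 &
# Look for the overhead metric in the scrape output.
curl -s http://localhost:8080/metrics | grep kube_pod_overhead
```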
## {{% heading "whatsnext" %}}
@@ -310,4 +326,3 @@ from source in the meantime.

* [RuntimeClass](/zh/docs/concepts/containers/runtime-class/)
* [PodOverhead Design](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/688-pod-overhead)
-