@@ -52,13 +52,17 @@ back-off, and other clients that also work this way.
{{< caution >}}
<!--
- Requests classified as "long-running" — primarily watches — are not
- subject to the API Priority and Fairness filter. This is also true for
- the `--max-requests-inflight` flag without the API Priority and
- Fairness feature enabled.
- -->
- Requests of the "long-running" type (primarily `watch`) are not subject to the API Priority and Fairness filter.
+ Some requests classified as "long-running"—such as remote
+ command execution or log tailing—are not subject to the API
+ Priority and Fairness filter. This is also true for the
+ `--max-requests-inflight` flag without the API Priority and Fairness
+ feature enabled. API Priority and Fairness _does_ apply to **watch**
+ requests. When API Priority and Fairness is disabled, **watch** requests
+ are not subject to the `--max-requests-inflight` limit.
+ -->
+ Some requests of the "long-running" type (such as remote command execution or log tailing) are not subject to the API Priority and Fairness filter.
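When the APF feature is not enabled, these requests are likewise not constrained, even if the `--max-requests-inflight` flag is set.
+ APF **does** apply to **watch** requests. When APF is disabled, **watch** requests are not subject to the `--max-requests-inflight` limit.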
{{< /caution >}}

<!-- body -->
@@ -158,6 +162,68 @@ from succeeding.
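For example, the default configuration includes separate priority levels for leader-election requests, requests from built-in controllers, and requests from Pods.
This means that a misbehaving Pod that floods the API server with requests cannot prevent leader election or actions by the built-in controllers from succeeding.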
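
The isolation described above can be pictured with a toy model. This is a minimal sketch, not the API server's implementation: the `PriorityLevel` class, its `try_admit` method, and the limits of 2 and 3 are assumptions chosen for illustration, and queuing is ignored entirely.

```python
from dataclasses import dataclass


@dataclass
class PriorityLevel:
    """Hypothetical model of a priority level with its own concurrency limit."""
    name: str
    limit: int          # assumed maximum number of concurrently executing requests
    executing: int = 0  # requests currently executing at this level

    def try_admit(self) -> bool:
        """Admit a request only if this level still has spare concurrency."""
        if self.executing < self.limit:
            self.executing += 1
            return True
        return False


def simulate_flood():
    # Illustrative limits only; real limits depend on cluster configuration.
    levels = {
        "leader-election": PriorityLevel("leader-election", limit=2),
        "workload-low": PriorityLevel("workload-low", limit=3),
    }
    # A misbehaving Pod sends 100 requests that all land in one level.
    admitted = sum(levels["workload-low"].try_admit() for _ in range(100))
    # A leader-election request is judged against its own level's limit.
    leader_ok = levels["leader-election"].try_admit()
    return admitted, leader_ok
```

In this model the flood can exhaust only its own level's limit, so the leader-election request is still admitted.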
+ <!--
+ ### Seats Occupied by a Request
+
+ The above description of concurrency management is the baseline story.
+ In it, requests have different durations but are counted equally at
+ any given moment when comparing against a priority level's concurrency
+ limit. In the baseline story, each request occupies one unit of
+ concurrency. The word "seat" is used to mean one unit of concurrency,
+ inspired by the way each passenger on a train or aircraft takes up one
+ of the fixed supply of seats.
+
+ But some requests take up more than one seat. Some of these are **list**
+ requests that the server estimates will return a large number of
+ objects. These have been found to put an exceptionally heavy burden
+ on the server, among requests that take a similar amount of time to
+ run. For this reason, the server estimates the number of objects that
+ will be returned and considers the request to take a number of seats
+ that is proportional to that estimated number.
+ -->
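+ ### Seats occupied by a request {#seats-occupied-by-a-request}
+
+ The description of concurrency management above is the baseline story. In it, requests have different durations
+ but are counted equally at any given moment when compared against a priority level's concurrency limit.
+ In the baseline story, each request occupies one unit of concurrency.
+ The word "seat" means one unit of concurrency, inspired by the way each passenger on a train or aircraft takes up one of a fixed supply of seats.
+
+ But some requests take up more than one seat. Some of these are **list** requests that the server estimates will return a large number of objects.
+ Compared with other requests that take a similar amount of time to run, these have been found to put an exceptionally heavy burden on the server.
+ For this reason, the server estimates the number of objects that will be returned and considers the request to take a number of seats proportional to that estimate.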
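
The proportionality rule for **list** requests can be sketched as follows. This is an illustration under stated assumptions, not the server's actual estimator: `estimate_seats` and `objects_per_seat` are hypothetical names, and the value 100 is an arbitrary scaling factor picked for the example.

```python
import math


def estimate_seats(estimated_objects: int, objects_per_seat: int = 100) -> int:
    """Return a seat count proportional to the estimated number of objects.

    objects_per_seat is an assumed scaling factor, not a documented value.
    Every request occupies at least one seat, matching the baseline story.
    """
    seats = math.ceil(estimated_objects / objects_per_seat)
    return max(1, seats)
```

Under these assumptions, a **list** expected to return 250 objects would be charged 3 seats, while a small **list** would still occupy the baseline single seat.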
+
+ <!--
+ ### Execution time tweaks for watch requests
+
+ API Priority and Fairness manages **watch** requests, but this involves a
+ couple more excursions from the baseline behavior. The first concerns
+ how long a **watch** request is considered to occupy its seat. Depending
+ on request parameters, the response to a **watch** request may or may not
+ begin with **create** notifications for all the relevant pre-existing
+ objects. API Priority and Fairness considers a **watch** request to be
+ done with its seat once that initial burst of notifications, if any,
+ is over.
+
+ The normal notifications are sent in a concurrent burst to all
+ relevant **watch** response streams whenever the server is notified of an
+ object create/update/delete. To account for this work, API Priority
+ and Fairness considers every write request to spend some additional
+ time occupying seats after the actual writing is done. The server
+ estimates the number of notifications to be sent and adjusts the write
+ request's number of seats and seat occupancy time to include this
+ extra work.
+ -->
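+ ### Execution time tweaks for watch requests {#execution-time-tweak-for-watch-requests}
+
+ APF manages **watch** requests, but this involves a couple more excursions from the baseline behavior.
+ The first concerns how long a **watch** request is considered to occupy its seat.
+ Depending on request parameters, the response to a **watch** request may or may not begin with **create** notifications for all the relevant pre-existing objects.
+ APF considers a **watch** request to be done with its seat once that initial burst of notifications, if any, is over.
+
+ The normal notifications are sent in a concurrent burst to all relevant **watch** response streams whenever the server is notified of an object create/update/delete.
+ To account for this work, APF considers every write request to spend some additional time occupying seats after the actual writing is done.
+ The server estimates the number of notifications to be sent and adjusts the write request's number of seats and seat occupancy time to include this extra work.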
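
The adjustment for write requests can be pictured with a small model. This is a sketch under assumptions, not the server's algorithm: `WorkEstimate`, `estimate_write`, `notifications_per_seat`, and `seconds_per_burst` are hypothetical names, and their default values are arbitrary.

```python
import math
from dataclasses import dataclass


@dataclass
class WorkEstimate:
    """Hypothetical summary of how a write request occupies seats."""
    initial_seats: int    # seats held while the write itself executes
    final_seats: int      # seats held afterwards to cover watch notifications
    extra_seconds: float  # additional occupancy time after the write is done


def estimate_write(watchers: int,
                   notifications_per_seat: int = 50,
                   seconds_per_burst: float = 0.005) -> WorkEstimate:
    """Scale the post-write seat usage with the estimated notification count.

    Both keyword parameters are assumed tuning values for illustration only.
    """
    if watchers == 0:
        # No watch streams to notify, so no extra work is charged.
        return WorkEstimate(initial_seats=1, final_seats=0, extra_seconds=0.0)
    final_seats = math.ceil(watchers / notifications_per_seat)
    return WorkEstimate(initial_seats=1,
                        final_seats=final_seats,
                        extra_seconds=seconds_per_burst)
```

The point of the model is only that a write observed by many **watch** streams is charged more seats, for longer, than the same write with no watchers.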
+
<!--
### Queuing
@@ -386,7 +452,7 @@ FlowSchema in turn, starting with those with numerically lowest ---
which we take to be the logically highest --- `matchingPrecedence` and
working onward. The first match wins.
-->
- ### FlowSchema
+ ### FlowSchema {#flowschema}

A FlowSchema matches some inbound requests and assigns them to a priority level.
Every inbound request is tested against each FlowSchema in turn,
@@ -918,7 +984,7 @@ poorly-behaved workloads that may be harming system health.
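* `apiserver_flowcontrol_priority_level_request_count_watermarks` is a histogram vector
of high or low water marks of the number of requests, broken down by the labels `phase` (which takes the values `waiting` and `executing`) and
`priority_level`;
- the label `mark` takes the values `high` and `low`.
+ the label `mark` takes the values `high` and `low`.
The water marks are accumulated over windows bounded by the times when an observation was added to `apiserver_flowcontrol_priority_level_request_count_samples`.
These water marks show the range of values that occurred between samples.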
@@ -971,7 +1037,7 @@ poorly-behaved workloads that may be harming system health.
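-->
* `apiserver_flowcontrol_request_wait_duration_seconds` is a histogram vector
of how long requests spent queued,
- broken down by the labels `flow_schema` (indicating which FlowSchema matched the request ),
+ broken down by the labels `flow_schema` (indicating which FlowSchema matched the request),
`priority_level` (indicating the priority level to which the request was assigned),
and `execute` (indicating whether the request started executing).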
@@ -995,7 +1061,7 @@ poorly-behaved workloads that may be harming system health.
-->
* `apiserver_flowcontrol_request_execution_seconds` is a histogram vector
of how long requests took to actually execute,
- broken down by the labels `flow_schema` (indicating which FlowSchema matched the request ) and
+ broken down by the labels `flow_schema` (indicating which FlowSchema matched the request) and
`priority_level` (indicating the priority level to which the request was assigned).

<!--
@@ -1120,6 +1186,5 @@ or the feature's [slack channel](https://kubernetes.slack.com/messages/api-prior
For background information on the design details of API Priority and Fairness,
see the [enhancement proposal](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1040-priority-and-fairness).
Through [SIG API Machinery](https://github.com/kubernetes/community/tree/master/sig-api-machinery/)
- or the feature's [Slack channel](https://kubernetes.slack.com/messages/api-priority-and-fairness/),
- you can make suggestions and feature requests.
+ or the feature's [Slack channel](https://kubernetes.slack.com/messages/api-priority-and-fairness/), you can make suggestions and feature requests.