content/en/docs/concepts/architecture/controller.md (4 additions & 3 deletions)
@@ -159,11 +159,12 @@ You can run your own controller as a set of Pods,
or externally to Kubernetes. What fits best will depend on what that particular
controller does.

## {{% heading "whatsnext" %}}

* Read about the [Kubernetes control plane](/docs/concepts/overview/components/#control-plane-components)
* Discover some of the basic [Kubernetes objects](/docs/concepts/overview/working-with-objects/kubernetes-objects/)
* Learn more about the [Kubernetes API](/docs/concepts/overview/kubernetes-api/)
* If you want to write your own controller, see [Extension Patterns](/docs/concepts/extend-kubernetes/extend-cluster/#extension-patterns) in Extending Kubernetes.
define the available isolation classes, the share of the available concurrency
@@ -204,6 +207,7 @@ of the same API group, and it has the same Kinds with the same syntax and
semantics.

### PriorityLevelConfiguration

A PriorityLevelConfiguration represents a single isolation class. Each
PriorityLevelConfiguration has an independent limit on the number of outstanding
requests, and limitations on the number of queued requests.
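As a rough illustration, a PriorityLevelConfiguration manifest might look like the following sketch. The object name and all numeric values here are made up, and the exact API group version (and some field names) vary between Kubernetes releases, so treat this as a shape to check against your cluster's API reference rather than a drop-in example:

```yaml
apiVersion: flowcontrol.apiserver.k8s.io/v1beta1
kind: PriorityLevelConfiguration
metadata:
  name: example-priority-level   # hypothetical name
spec:
  type: Limited
  limited:
    # Share of the server-wide concurrency budget for this level.
    assuredConcurrencyShares: 30
    limitResponse:
      type: Queue
      queuing:
        queues: 64
        handSize: 6
        queueLengthLimit: 50
```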
@@ -217,6 +221,7 @@ server by restarting `kube-apiserver` with a different value for
`--max-requests-inflight` (or `--max-mutating-requests-inflight`), and all
PriorityLevelConfigurations will see their maximum allowed concurrency go up (or
down) by the same fraction.
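The "same fraction" behavior follows from each priority level receiving a portion of the server-wide limit proportional to its concurrency shares. A minimal numeric sketch, assuming ceiling rounding and hypothetical level names and share values:

```python
import math

def concurrency_limits(server_limit, shares):
    """Split a server-wide concurrency limit across priority levels
    in proportion to each level's assured concurrency shares."""
    total = sum(shares.values())
    return {name: math.ceil(server_limit * s / total)
            for name, s in shares.items()}

# Hypothetical shares; raising the server-wide limit scales every
# level's limit by (roughly) the same fraction.
shares = {"workload-high": 40, "workload-low": 100, "global-default": 20}
print(concurrency_limits(400, shares))
# {'workload-high': 100, 'workload-low': 250, 'global-default': 50}
print(concurrency_limits(800, shares))
# every limit doubles: {'workload-high': 200, 'workload-low': 500, 'global-default': 100}
```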

{{< caution >}}
With the Priority and Fairness feature enabled, the total concurrency limit for
the server is set to the sum of `--max-requests-inflight` and
@@ -235,8 +240,8 @@ above the threshold will be queued, with the shuffle sharding and fair queuing t
to balance progress between request flows.

The queuing configuration allows tuning the fair queuing algorithm for a
priority level. Details of the algorithm can be read in the
[enhancement proposal](#whats-next), but in short:

* Increasing `queues` reduces the rate of collisions between different flows, at
the cost of increased memory usage. A value of 1 here effectively disables the
@@ -249,15 +254,15 @@ proposal](#whats-next), but in short:
* Changing `handSize` allows you to adjust the probability of collisions between
different flows and the overall concurrency available to a single flow in an
overload situation.

{{< note >}}
A larger `handSize` makes it less likely for two individual flows to collide
(and therefore for one to be able to starve the other), but more likely that
a small number of flows can dominate the apiserver. A larger `handSize` also
potentially increases the amount of latency that a single high-traffic flow
can cause. The maximum number of queued requests possible from a
single flow is `handSize * queueLengthLimit`.
{{< /note >}}
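To build intuition for how `queues` and `handSize` interact, here is a small Monte Carlo sketch of shuffle sharding. The queue counts, flow counts, and trial counts are illustrative, not taken from any real configuration; it estimates the probability that a low-intensity flow (a "mouse") finds every queue in its hand already occupied by high-intensity flows ("elephants"):

```python
import random

def squish_probability(queues, hand_size, elephants, trials=20000, seed=0):
    """Estimate the probability that a mouse flow's entire hand of
    queues is also covered by the hands of `elephants` heavy flows."""
    rng = random.Random(seed)
    if elephants == 0:
        return 0.0
    squished = 0
    for _ in range(trials):
        covered = set()
        for _ in range(elephants):
            covered.update(rng.sample(range(queues), hand_size))
        mouse = set(rng.sample(range(queues), hand_size))
        if mouse <= covered:   # every queue in the mouse's hand is busy
            squished += 1
    return squished / trials

# With a hand as big as the whole queue set, collision is certain.
assert squish_probability(8, 8, elephants=1, trials=100) == 1.0
# More queues with the same hand size makes squishing rarer.
print(squish_probability(64, 6, elephants=4))
print(squish_probability(16, 6, elephants=4))
```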

Following is a table showing an interesting collection of shuffle
sharding configurations, showing for each the probability that a
@@ -319,6 +324,7 @@ considered part of a single flow. The correct choice for a given FlowSchema
depends on the resource and your particular environment.

## Diagnostics

Every HTTP response from an API server with the priority and fairness feature
enabled has two extra headers: `X-Kubernetes-PF-FlowSchema-UID` and
`X-Kubernetes-PF-PriorityLevel-UID`, noting the flow schema that matched the request
@@ -356,13 +362,14 @@ poorly-behaved workloads that may be harming system health.
matched the request), `priority_level` (indicating the one to which
the request was assigned), and `reason`. The `reason` label will
have one of the following values:

* `queue-full`, indicating that too many requests were already
  queued,
* `concurrency-limit`, indicating that the
  PriorityLevelConfiguration is configured to reject rather than
  queue excess requests, or
* `time-out`, indicating that the request was still in the queue
  when its queuing time limit expired.
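As a concrete illustration of the label set described above, the sketch below parses one hypothetical sample of this counter in the Prometheus text exposition format. The flow schema name, priority level name, and count are made up:

```python
import re

# One hypothetical sample line, as the /metrics endpoint would render it.
sample = ('apiserver_flowcontrol_rejected_requests_total'
          '{flow_schema="service-accounts",priority_level="workload-low",'
          'reason="queue-full"} 103')

def parse_sample(line):
    """Split a Prometheus sample line into (metric, labels, value)."""
    m = re.match(r'(\w+)\{(.*)\}\s+(\S+)$', line)
    metric, raw_labels, value = m.group(1), m.group(2), float(m.group(3))
    labels = dict(re.findall(r'(\w+)="([^"]*)"', raw_labels))
    return metric, labels, value

metric, labels, value = parse_sample(sample)
print(metric, labels["reason"], value)  # rejections for one flow schema and reason
```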

* `apiserver_flowcontrol_dispatched_requests_total` is a counter
vector (cumulative since server start) of requests that began
@@ -430,14 +437,15 @@ poorly-behaved workloads that may be harming system health.
sample to its histogram, reporting the length of the queue immediately
after the request was added. Note that this produces different
statistics than an unbiased survey would.

{{< note >}}
An outlier value in a histogram here means it is likely that a single flow
(i.e., requests by one user or for one namespace, depending on
configuration) is flooding the API server, and being throttled. By contrast,
if one priority level's histogram shows that all queues for that priority
level are longer than those for other priority levels, it may be appropriate
to increase that PriorityLevelConfiguration's concurrency shares.
{{< /note >}}

* `apiserver_flowcontrol_request_concurrency_limit` is a gauge vector
holding the computed concurrency limit (based on the API server's
@@ -450,12 +458,13 @@ poorly-behaved workloads that may be harming system health.
`priority_level` (indicating the one to which the request was
assigned), and `execute` (indicating whether the request started
executing).

{{< note >}}
Since each FlowSchema always assigns requests to a single
PriorityLevelConfiguration, you can add the histograms for all the
FlowSchemas for one priority level to get the effective histogram for
requests assigned to that priority level.
{{< /note >}}
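The aggregation that note describes amounts to summing per-bucket counts across FlowSchemas. A minimal sketch, with made-up bucket boundaries and FlowSchema names:

```python
from collections import Counter

# Hypothetical per-FlowSchema histograms for one priority level:
# cumulative bucket counts keyed by upper bound (le), as Prometheus
# reports them.
by_flow_schema = {
    "service-accounts": {0.05: 10, 0.1: 25, 0.5: 30},
    "kube-controller-manager": {0.05: 2, 0.1: 4, 0.5: 9},
}

def priority_level_histogram(histograms):
    """Sum bucket counts across all FlowSchemas of one priority level."""
    total = Counter()
    for buckets in histograms.values():
        total.update(buckets)
    return dict(total)

print(priority_level_histogram(by_flow_schema))
# {0.05: 12, 0.1: 29, 0.5: 39}
```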

* `apiserver_flowcontrol_request_execution_seconds` is a histogram
vector of how long requests took to actually execute, broken down by
@@ -465,14 +474,19 @@ poorly-behaved workloads that may be harming system health.

### Debug endpoints

When you enable the API Priority and Fairness feature, the `kube-apiserver`
serves the following additional paths at its HTTP[S] ports.

- `/debug/api_priority_and_fairness/dump_priority_levels` - a listing of
  all the priority levels and the current state of each. You can fetch like this:

```shell
kubectl get --raw /debug/api_priority_and_fairness/dump_priority_levels
```