@@ -107,19 +107,18 @@ objects mentioned below.
  ### Seats Occupied by a Request

  The above description of concurrency management is the baseline story.
- In it, requests have different durations but are counted equally at
- any given moment when comparing against a priority level's concurrency
- limit. In the baseline story, each request occupies one unit of
- concurrency. The word "seat" is used to mean one unit of concurrency,
- inspired by the way each passenger on a train or aircraft takes up one
- of the fixed supply of seats.
+ Requests have different durations but are counted equally at any given
+ moment when comparing against a priority level's concurrency limit. In
+ the baseline story, each request occupies one unit of concurrency. The
+ word "seat" is used to mean one unit of concurrency, inspired by the
+ way each passenger on a train or aircraft takes up one of the fixed
+ supply of seats.

  But some requests take up more than one seat. Some of these are **list**
  requests that the server estimates will return a large number of
  objects. These have been found to put an exceptionally heavy burden
- on the server, among requests that take a similar amount of time to
- run. For this reason, the server estimates the number of objects that
- will be returned and considers the request to take a number of seats
+ on the server. For this reason, the server estimates the number of objects
+ that will be returned and considers the request to take a number of seats
  that is proportional to that estimated number.
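
As a rough illustration of the proportional seat count described above, here is a minimal sketch. It is not the API server's implementation: the function name `estimate_seats`, the `objects_per_seat` ratio, and the `max_seats` cap are all assumptions made up for this example.

```python
import math


def estimate_seats(estimated_objects: int,
                   objects_per_seat: int = 100,  # assumed ratio, illustration only
                   max_seats: int = 10) -> int:  # assumed cap, illustration only
    """Return a hypothetical seat count for a list request.

    The count grows in proportion to the estimated number of returned
    objects, never drops below one seat, and is capped at max_seats.
    """
    if estimated_objects <= 0:
        return 1
    seats = math.ceil(estimated_objects / objects_per_seat)
    return max(1, min(seats, max_seats))
```

With these assumed values, a list request estimated at 250 objects would occupy 3 seats, while a small request still occupies the baseline single seat.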

  ### Execution time tweaks for watch requests
@@ -294,10 +293,9 @@ HandSize | Queues | 1 elephant | 4 elephants | 16 elephants
  ### FlowSchema

  A FlowSchema matches some inbound requests and assigns them to a
- priority level. Every inbound request is tested against every
- FlowSchema in turn, starting with those with numerically lowest ---
- which we take to be the logically highest --- `matchingPrecedence` and
- working onward. The first match wins.
+ priority level. Every inbound request is tested against FlowSchemas,
+ starting with those with the numerically lowest `matchingPrecedence` and
+ working upward. The first match wins.

  {{< caution >}}
  Only the first matching FlowSchema for a given request matters. If multiple
@@ -311,20 +309,19 @@ ensure that no two FlowSchemas have the same `matchingPrecedence`.
  A FlowSchema matches a given request if at least one of its `rules`
  matches. A rule matches if at least one of its `subjects` *and* at least
  one of its `resourceRules` or `nonResourceRules` (depending on whether the
- incoming request is for a resource or non-resource URL) matches the request.
+ incoming request is for a resource or non-resource URL) match the request.

  For the `name` field in subjects, and the `verbs`, `apiGroups`, `resources`,
  `namespaces`, and `nonResourceURLs` fields of resource and non-resource rules,
  the wildcard `*` may be specified to match all values for the given field,
  effectively removing it from consideration.
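
The matching logic above (lowest `matchingPrecedence` first, first match wins, `*` as a wildcard) can be sketched as follows. This is a simplified illustration, not the API server's code: the `Rule` and `FlowSchema` classes, their flattened fields, and the `classify` function are hypothetical, and only resource requests with user-name subjects are modeled.

```python
from dataclasses import dataclass

WILDCARD = "*"


def field_matches(allowed: list, value: str) -> bool:
    """A field matches if it lists the value or contains the wildcard."""
    return WILDCARD in allowed or value in allowed


@dataclass
class Rule:
    # Hypothetical, flattened stand-in for `subjects` plus one resource rule.
    subjects: list
    verbs: list
    resources: list

    def matches(self, user: str, verb: str, resource: str) -> bool:
        subject_ok = field_matches(self.subjects, user)
        resource_ok = (field_matches(self.verbs, verb)
                       and field_matches(self.resources, resource))
        # Both the subject side and the resource side must match.
        return subject_ok and resource_ok


@dataclass
class FlowSchema:
    name: str
    matching_precedence: int
    priority_level: str
    rules: list


def classify(schemas: list, user: str, verb: str, resource: str):
    """Return the first matching FlowSchema, or None if nothing matches."""
    for schema in sorted(schemas, key=lambda s: s.matching_precedence):
        if any(rule.matches(user, verb, resource) for rule in schema.rules):
            return schema
    return None
```

Under this sketch, a catch-all schema with a high `matching_precedence` value only receives requests that no numerically lower schema has already claimed.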

  A FlowSchema's `distinguisherMethod.type` determines how requests matching that
- schema will be separated into flows. It may be
- either `ByUser`, in which case one requesting user will not be able to starve
- other users of capacity, or `ByNamespace`, in which case requests for resources
- in one namespace will not be able to starve requests for resources in other
- namespaces of capacity, or it may be blank (or `distinguisherMethod` may be
- omitted entirely), in which case all requests matched by this FlowSchema will be
+ schema will be separated into flows. It may be `ByUser`, in which one requesting
+ user will not be able to starve other users of capacity; `ByNamespace`, in which
+ requests for resources in one namespace will not be able to starve requests for
+ resources in other namespaces of capacity; or blank (or `distinguisherMethod` may be
+ omitted entirely), in which all requests matched by this FlowSchema will be
  considered part of a single flow. The correct choice for a given FlowSchema
  depends on the resource and your particular environment.
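
To make the three distinguisher choices concrete, the sketch below derives a flow identifier from a matched request. It is an illustration under assumptions: the function `flow_key` and the tuple it returns are hypothetical and do not reflect how the API server represents flows internally.

```python
def flow_key(schema_name: str, distinguisher_type: str,
             user: str, namespace: str) -> tuple:
    """Return a hypothetical flow identifier for a matched request.

    Requests that share a key belong to the same flow.
    """
    if distinguisher_type == "ByUser":
        distinguisher = user
    elif distinguisher_type == "ByNamespace":
        distinguisher = namespace
    else:
        # Blank or omitted distinguisherMethod: a single flow per FlowSchema.
        distinguisher = ""
    return (schema_name, distinguisher)
```

With `ByUser`, two users matched by the same FlowSchema land in different flows, so a burst from one does not crowd out the other; with a blank type they share one flow.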
@@ -434,7 +431,7 @@ for an annotation with key `apf.kubernetes.io/autoupdate-spec`. If
  there is such an annotation and its value is `true` then the
  kube-apiservers control the object. If there is such an annotation
  and its value is `false` then the users control the object. If
- neither of those condtions holds then the `metadata.generation` of the
+ neither of those conditions holds then the `metadata.generation` of the
  object is consulted. If that is 1 then the kube-apiservers control
  the object. Otherwise the users control the object. These rules were
  introduced in release 1.22 and their consideration of
@@ -513,23 +510,21 @@ poorly-behaved workloads that may be harming system health.
  broken down by the labels `flow_schema` (indicating the one that
  matched the request), `priority_level` (indicating the one to which
  the request was assigned), and `reason`. The `reason` label will be
- have one of the following values:
+ one of the following values:

  * `queue-full`, indicating that too many requests were already
- queued,
+ queued.
  * `concurrency-limit`, indicating that the
  PriorityLevelConfiguration is configured to reject rather than
- queue excess requests, or
+ queue excess requests.
  * `time-out`, indicating that the request was still in the queue
  when its queuing time limit expired.
  * `cancelled`, indicating that the request is not purge locked
  and has been ejected from the queue.

  * `apiserver_flowcontrol_dispatched_requests_total` is a counter
  vector (cumulative since server start) of requests that began
- executing, broken down by the labels `flow_schema` (indicating the
- one that matched the request) and `priority_level` (indicating the
- one to which the request was assigned).
+ executing, broken down by `flow_schema` and `priority_level`.

  * `apiserver_current_inqueue_requests` is a gauge vector of recent
  high water marks of the number of queued requests, grouped by a
@@ -545,23 +540,22 @@ poorly-behaved workloads that may be harming system health.
  nanosecond, of the number of requests broken down by the labels
  `phase` (which takes on the values `waiting` and `executing`) and
  `request_kind` (which takes on the values `mutating` and
- `readOnly`). Each observed value is a ratio, between 0 and 1, of a
- number of requests divided by the corresponding limit on the number
- of requests (queue volume limit for waiting and concurrency limit
- for executing).
+ `readOnly`). Each observed value is a ratio, between 0 and 1, of
+ the number of requests divided by the corresponding limit on the
+ number of requests (queue volume limit for waiting and concurrency
+ limit for executing).

  * `apiserver_flowcontrol_current_inqueue_requests` is a gauge vector
  holding the instantaneous number of queued (not executing) requests,
- broken down by the labels `priority_level` and `flow_schema`.
+ broken down by `priority_level` and `flow_schema`.

  * `apiserver_flowcontrol_current_executing_requests` is a gauge vector
  holding the instantaneous number of executing (not waiting in a
- queue) requests, broken down by the labels `priority_level` and
- `flow_schema`.
+ queue) requests, broken down by `priority_level` and `flow_schema`.

  * `apiserver_flowcontrol_request_concurrency_in_use` is a gauge vector
  holding the instantaneous number of occupied seats, broken down by
- the labels `priority_level` and `flow_schema`.
+ `priority_level` and `flow_schema`.

  * `apiserver_flowcontrol_priority_level_request_utilization` is a
  histogram vector of observations, made at the end of each
@@ -587,11 +581,10 @@ poorly-behaved workloads that may be harming system health.

  * `apiserver_flowcontrol_request_queue_length_after_enqueue` is a
  histogram vector of queue lengths for the queues, broken down by
- the labels `priority_level` and `flow_schema`, as sampled by the
- enqueued requests. Each request that gets queued contributes one
- sample to its histogram, reporting the length of the queue immediately
- after the request was added. Note that this produces different
- statistics than an unbiased survey would.
+ `priority_level` and `flow_schema`, as sampled by the enqueued requests.
+ Each request that gets queued contributes one sample to its histogram,
+ reporting the length of the queue immediately after the request was added.
+ Note that this produces different statistics than an unbiased survey would.

  {{< note >}}
  An outlier value in a histogram here means it is likely that a single flow
@@ -655,13 +648,10 @@ poorly-behaved workloads that may be harming system health.
  holding, for each priority level, the dynamic concurrency limit
  derived in the last adjustment.

-
  * `apiserver_flowcontrol_request_wait_duration_seconds` is a histogram
  vector of how long requests spent queued, broken down by the labels
- `flow_schema` (indicating which one matched the request),
- `priority_level` (indicating the one to which the request was
- assigned), and `execute` (indicating whether the request started
- executing).
+ `flow_schema`, `priority_level`, and `execute`. The `execute` label
+ indicates whether the request has started executing.

  {{< note >}}
  Since each FlowSchema always assigns requests to a single
@@ -672,9 +662,7 @@ poorly-behaved workloads that may be harming system health.

  * `apiserver_flowcontrol_request_execution_seconds` is a histogram
  vector of how long requests took to actually execute, broken down by
- the labels `flow_schema` (indicating which one matched the request)
- and `priority_level` (indicating the one to which the request was
- assigned).
+ `flow_schema` and `priority_level`.

  * `apiserver_flowcontrol_watch_count_samples` is a histogram vector of
  the number of active WATCH requests relevant to a given write,
@@ -686,16 +674,14 @@ poorly-behaved workloads that may be harming system health.
  and `priority_level`.

  * `apiserver_flowcontrol_request_dispatch_no_accommodation_total` is a
- counter vec of the number of events that in principle could have led
+ counter vector of the number of events that in principle could have led
  to a request being dispatched but did not, due to lack of available
- concurrency, broken down by `flow_schema` and `priority_level`. The
- relevant sorts of events are arrival of a request and completion of
- a request.
+ concurrency, broken down by `flow_schema` and `priority_level`.

  ### Debug endpoints

  When you enable the API Priority and Fairness feature, the `kube-apiserver`
- serves the following additional paths at its HTTP[S] ports.
+ serves the following additional paths at its HTTP(S) ports.

  - `/debug/api_priority_and_fairness/dump_priority_levels` - a listing of
  all the priority levels and the current state of each. You can fetch like this:
@@ -785,7 +771,7 @@ request, and it includes the following attributes.
  execution of the request.

  At higher levels of verbosity there will be log lines exposing details
- of how APF handled the request, primarily for debug purposes.
+ of how APF handled the request, primarily for debugging purposes.

  ### Response headers