Commit c77dcc0

Merge pull request #46403 from my-git9/pt-26422
improve cluster-administration/flow-control format
2 parents d3ab795 + 048cb54

1 file changed: 53 additions, 53 deletions

content/en/docs/concepts/cluster-administration/flow-control.md

Lines changed: 53 additions & 53 deletions
@@ -22,7 +22,7 @@ The API Priority and Fairness feature (APF) is an alternative that improves upon
 aforementioned max-inflight limitations. APF classifies
 and isolates requests in a more fine-grained way. It also introduces
 a limited amount of queuing, so that no requests are rejected in cases
-of very brief bursts.  Requests are dispatched from queues using a
+of very brief bursts. Requests are dispatched from queues using a
 fair queuing technique so that, for example, a poorly-behaved
 {{< glossary_tooltip text="controller" term_id="controller" >}} need not
 starve others (even at the same priority level).

@@ -46,15 +46,15 @@ are not subject to the `--max-requests-inflight` limit.
 ## Enabling/Disabling API Priority and Fairness
 
 The API Priority and Fairness feature is controlled by a command-line flag
-and is enabled by default.  See
+and is enabled by default. See
 [Options](/docs/reference/command-line-tools-reference/kube-apiserver/#options)
 for a general explanation of the available kube-apiserver command-line
-options and how to enable and disable them.  The name of the
-command-line option for APF is "--enable-priority-and-fairness".  This feature
+options and how to enable and disable them. The name of the
+command-line option for APF is "--enable-priority-and-fairness". This feature
 also involves an {{<glossary_tooltip term_id="api-group" text="API Group" >}}
 with: (a) a stable `v1` version, introduced in 1.29, and
 enabled by default (b) a `v1beta3` version, enabled by default, and
-deprecated in v1.29.  You can
+deprecated in v1.29. You can
 disable the API group beta version `v1beta3` by adding the
 following command-line flags to your `kube-apiserver` invocation:
 
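
For context, a minimal sketch of how such an invocation might look as a fragment of a kube-apiserver static Pod manifest (the Pod wrapper, namespace, and image tag are illustrative; `--enable-priority-and-fairness` is shown explicitly even though it defaults to true, and the group-version disable uses the standard `--runtime-config` group/version=false convention):

```yaml
# Sketch only: fragment of a kube-apiserver static Pod manifest.
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
    - name: kube-apiserver
      image: registry.k8s.io/kube-apiserver:v1.29.0   # illustrative tag
      command:
        - kube-apiserver
        - --enable-priority-and-fairness=true
        # Disable the deprecated v1beta3 group version:
        - --runtime-config=flowcontrol.apiserver.k8s.io/v1beta3=false
        # ...other flags as usual
```
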
@@ -96,7 +96,7 @@ from succeeding.
 
 The concurrency limits of the priority levels are periodically
 adjusted, allowing under-utilized priority levels to temporarily lend
-concurrency to heavily-utilized levels.  These limits are based on
+concurrency to heavily-utilized levels. These limits are based on
 nominal limits and bounds on how much concurrency a priority level may
 lend and how much it may borrow, all derived from the configuration
 objects mentioned below.

@@ -111,29 +111,29 @@ word "seat" is used to mean one unit of concurrency, inspired by the
 way each passenger on a train or aircraft takes up one of the fixed
 supply of seats.
 
-But some requests take up more than one seat.  Some of these are **list**
+But some requests take up more than one seat. Some of these are **list**
 requests that the server estimates will return a large number of
-objects.  These have been found to put an exceptionally heavy burden
-on the server.  For this reason, the server estimates the number of objects
+objects. These have been found to put an exceptionally heavy burden
+on the server. For this reason, the server estimates the number of objects
 that will be returned and considers the request to take a number of seats
 that is proportional to that estimated number.
 
 ### Execution time tweaks for watch requests
 
 API Priority and Fairness manages **watch** requests, but this involves a
-couple more excursions from the baseline behavior.  The first concerns
-how long a **watch** request is considered to occupy its seat.  Depending
-on request parameters, the response to a **watch** request may or may not
-begin with **create** notifications for all the relevant pre-existing
-objects.  API Priority and Fairness considers a **watch** request to be
+couple more excursions from the baseline behavior. The first concerns
+how long a **watch** request is considered to occupy its seat. Depending
+on request parameters, the response to a **watch** request may or may not
+begin with **create** notifications for all the relevant pre-existing
+objects. API Priority and Fairness considers a **watch** request to be
 done with its seat once that initial burst of notifications, if any,
 is over.
 
 The normal notifications are sent in a concurrent burst to all
-relevant **watch** response streams whenever the server is notified of an
-object create/update/delete.  To account for this work, API Priority
+relevant **watch** response streams whenever the server is notified of an
+object create/update/delete. To account for this work, API Priority
 and Fairness considers every write request to spend some additional
-time occupying seats after the actual writing is done.  The server
+time occupying seats after the actual writing is done. The server
 estimates the number of notifications to be sent and adjusts the write
 request's number of seats and seat occupancy time to include this
 extra work.

@@ -155,7 +155,7 @@ To enable distinct handling of distinct instances, controllers that have
 many instances should authenticate with distinct usernames
 
 After classifying a request into a flow, the API Priority and Fairness
-feature then may assign the request to a queue.  This assignment uses
+feature then may assign the request to a queue. This assignment uses
 a technique known as {{< glossary_tooltip term_id="shuffle-sharding"
 text="shuffle sharding" >}}, which makes relatively efficient use of
 queues to insulate low-intensity flows from high-intensity flows.

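
The queuing behavior behind a priority level (how many queues there are, how large a flow's shuffle-sharding "hand" is, and how long each queue may grow) is configured on the PriorityLevelConfiguration. A minimal sketch, assuming the `flowcontrol.apiserver.k8s.io/v1` API; the numbers are examples, not recommendations:

```yaml
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: PriorityLevelConfiguration
metadata:
  name: example-queued-level   # illustrative name
spec:
  type: Limited
  limited:
    nominalConcurrencyShares: 30
    limitResponse:
      type: Queue
      queuing:
        queues: 64            # number of queues behind this priority level
        handSize: 6           # size of the shuffle-sharding hand dealt to each flow
        queueLengthLimit: 50  # maximum number of requests waiting in one queue
```
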
@@ -203,19 +203,19 @@ go up (or down) by the same fraction.
 {{< caution >}}
 In the versions before `v1beta3` the relevant
 PriorityLevelConfiguration field is named "assured concurrency shares"
-rather than "nominal concurrency shares".  Also, in Kubernetes release
+rather than "nominal concurrency shares". Also, in Kubernetes release
 1.25 and earlier there were no periodic adjustments: the
 nominal/assured limits were always applied without adjustment.
 {{< /caution >}}
 
 The bounds on how much concurrency a priority level may lend and how
 much it may borrow are expressed in the PriorityLevelConfiguration as
-percentages of the level's nominal limit.  These are resolved to
+percentages of the level's nominal limit. These are resolved to
 absolute numbers of seats by multiplying with the nominal limit /
-100.0 and rounding.  The dynamically adjusted concurrency limit of a
+100.0 and rounding. The dynamically adjusted concurrency limit of a
 priority level is constrained to lie between (a) a lower bound of its
 nominal limit minus its lendable seats and (b) an upper bound of its
-nominal limit plus the seats it may borrow.  At each adjustment the
+nominal limit plus the seats it may borrow. At each adjustment the
 dynamic limits are derived by each priority level reclaiming any lent
 seats for which demand recently appeared and then jointly fairly
 responding to the recent seat demand on the priority levels, within

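
As a worked example of that arithmetic (numbers made up): a priority level whose nominal limit resolves to 30 seats, with a lendable percentage of 50 and a borrowing limit percentage of 100, may shrink to 30 - round(30 * 50/100.0) = 15 seats and grow to at most 30 + round(30 * 100/100.0) = 60 seats. A sketch of the corresponding `flowcontrol.apiserver.k8s.io/v1` fields (values illustrative):

```yaml
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: PriorityLevelConfiguration
metadata:
  name: example-borrowing-level   # illustrative name
spec:
  type: Limited
  limited:
    nominalConcurrencyShares: 30   # suppose this resolves to a nominal limit of 30 seats
    lendablePercent: 50            # may lend up to round(30 * 50/100.0)  = 15 seats
    borrowingLimitPercent: 100     # may borrow up to round(30 * 100/100.0) = 30 seats
    limitResponse:
      type: Reject
```
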
@@ -328,9 +328,9 @@ mandatory and suggested.
 ### Mandatory Configuration Objects
 
 The four mandatory configuration objects reflect fixed built-in
-guardrail behavior.  This is behavior that the servers have before
+guardrail behavior. This is behavior that the servers have before
 those objects exist, and when those objects exist their specs reflect
-this behavior.  The four mandatory objects are as follows.
+this behavior. The four mandatory objects are as follows.
 
 * The mandatory `exempt` priority level is used for requests that are
 not subject to flow control at all: they will always be dispatched

@@ -352,8 +352,8 @@ this behavior. The four mandatory objects are as follows.
 ### Suggested Configuration Objects
 
 The suggested FlowSchemas and PriorityLevelConfigurations constitute a
-reasonable default configuration.  You can modify these and/or create
-additional configuration objects if you want.  If your cluster is
+reasonable default configuration. You can modify these and/or create
+additional configuration objects if you want. If your cluster is
 likely to experience heavy load then you should consider what
 configuration will work best.
 
@@ -405,33 +405,33 @@ The server refuses to allow a creation or update with a spec that is
 inconsistent with the server's guardrail behavior.
 
 Maintenance of suggested configuration objects is designed to allow
-their specs to be overridden.  Deletion, on the other hand, is not
-respected: maintenance will restore the object.  If you do not want a
+their specs to be overridden. Deletion, on the other hand, is not
+respected: maintenance will restore the object. If you do not want a
 suggested configuration object then you need to keep it around but set
-its spec to have minimal consequences.  Maintenance of suggested
+its spec to have minimal consequences. Maintenance of suggested
 objects is also designed to support automatic migration when a new
 version of the `kube-apiserver` is rolled out, albeit potentially with
 thrashing while there is a mixed population of servers.
 
 Maintenance of a suggested configuration object consists of creating
 it --- with the server's suggested spec --- if the object does not
-exist.  OTOH, if the object already exists, maintenance behavior
+exist. OTOH, if the object already exists, maintenance behavior
 depends on whether the `kube-apiservers` or the users control the
-object.  In the former case, the server ensures that the object's spec
+object. In the former case, the server ensures that the object's spec
 is what the server suggests; in the latter case, the spec is left
 alone.
 
 The question of who controls the object is answered by first looking
-for an annotation with key `apf.kubernetes.io/autoupdate-spec`.  If
+for an annotation with key `apf.kubernetes.io/autoupdate-spec`. If
 there is such an annotation and its value is `true` then the
-kube-apiservers control the object.  If there is such an annotation
-and its value is `false` then the users control the object.  If
+kube-apiservers control the object. If there is such an annotation
+and its value is `false` then the users control the object. If
 neither of those conditions holds then the `metadata.generation` of the
-object is consulted.  If that is 1 then the kube-apiservers control
-the object.  Otherwise the users control the object.  These rules were
+object is consulted. If that is 1 then the kube-apiservers control
+the object. Otherwise the users control the object. These rules were
 introduced in release 1.22 and their consideration of
 `metadata.generation` is for the sake of migration from the simpler
-earlier behavior.  Users who wish to control a suggested configuration
+earlier behavior. Users who wish to control a suggested configuration
 object should set its `apf.kubernetes.io/autoupdate-spec` annotation
 to `false`.
 
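
Taking control of a suggested object therefore amounts to setting that annotation on it. A sketch, using `service-accounts` purely as an example of a suggested FlowSchema name:

```yaml
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: service-accounts            # example of a suggested FlowSchema name
  annotations:
    apf.kubernetes.io/autoupdate-spec: "false"   # users now control this object
# spec: retain or modify the object's existing spec as desired
```
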
@@ -448,7 +448,7 @@ nor suggested but are annotated
 
 The suggested configuration gives no special treatment to the health
 check requests on kube-apiservers from their local kubelets --- which
-tend to use the secured port but supply no credentials.  With the
+tend to use the secured port but supply no credentials. With the
 suggested config, these requests get assigned to the `global-default`
 FlowSchema and the corresponding `global-default` priority level,
 where other traffic can crowd them out.

@@ -459,7 +459,7 @@ requests from rate limiting.
 {{< caution >}}
 Making this change also allows any hostile party to then send
 health-check requests that match this FlowSchema, at any volume they
-like.  If you have a web traffic filter or similar external security
+like. If you have a web traffic filter or similar external security
 mechanism to protect your cluster's API server from general internet
 traffic, you can configure rules to block any health check requests
 that originate from outside your cluster.

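
The FlowSchema that the caution refers to routes unauthenticated kubelet health checks to the `exempt` priority level; a rough sketch of such an object (the name and exact values are illustrative):

```yaml
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: health-for-strangers   # illustrative name
spec:
  matchingPrecedence: 1000
  priorityLevelConfiguration:
    name: exempt
  rules:
    - nonResourceRules:
        - nonResourceURLs:
            - "/healthz"
            - "/livez"
            - "/readyz"
          verbs:
            - "*"
      subjects:
        - kind: Group
          group:
            name: "system:unauthenticated"
```
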
@@ -489,7 +489,7 @@ poorly-behaved workloads that may be harming system health.
 (cumulative since server start) of requests that were rejected,
 broken down by the labels `flow_schema` (indicating the one that
 matched the request), `priority_level` (indicating the one to which
-the request was assigned), and `reason`.  The `reason` label will be
+the request was assigned), and `reason`. The `reason` label will be
 one of the following values:
 
 * `queue-full`, indicating that too many requests were already

@@ -541,7 +541,7 @@ poorly-behaved workloads that may be harming system health.
 high water marks of the number of queued requests, grouped by a
 label named `request_kind` whose value is `mutating` or `readOnly`.
 These high water marks describe the largest number seen in the one
-second window most recently completed.  These complement the older
+second window most recently completed. These complement the older
 `apiserver_current_inflight_requests` gauge vector that holds the
 last window's high water mark of number of requests actively being
 served.

@@ -555,7 +555,7 @@ poorly-behaved workloads that may be harming system health.
 nanosecond, of the number of requests broken down by the labels
 `phase` (which takes on the values `waiting` and `executing`) and
 `request_kind` (which takes on the values `mutating` and
-`readOnly`).  Each observed value is a ratio, between 0 and 1, of
+`readOnly`). Each observed value is a ratio, between 0 and 1, of
 the number of requests divided by the corresponding limit on the
 number of requests (queue volume limit for waiting and concurrency
 limit for executing).

@@ -568,21 +568,21 @@ poorly-behaved workloads that may be harming system health.
 histogram vector of observations, made at the end of each
 nanosecond, of the number of requests broken down by the labels
 `phase` (which takes on the values `waiting` and `executing`) and
-`priority_level`.  Each observed value is a ratio, between 0 and 1,
+`priority_level`. Each observed value is a ratio, between 0 and 1,
 of a number of requests divided by the corresponding limit on the
 number of requests (queue volume limit for waiting and concurrency
 limit for executing).
 
 * `apiserver_flowcontrol_priority_level_seat_utilization` is a
 histogram vector of observations, made at the end of each
 nanosecond, of the utilization of a priority level's concurrency
-limit, broken down by `priority_level`.  This utilization is the
-fraction (number of seats occupied) / (concurrency limit).  This
+limit, broken down by `priority_level`. This utilization is the
+fraction (number of seats occupied) / (concurrency limit). This
 metric considers all stages of execution (both normal and the extra
 delay at the end of a write to cover for the corresponding
 notification work) of all requests except WATCHes; for those it
 considers only the initial stage that delivers notifications of
-pre-existing objects.  Each histogram in the vector is also labeled
+pre-existing objects. Each histogram in the vector is also labeled
 with `phase: executing` (there is no seat limit for the waiting
 phase).
 
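
If these metrics are scraped by Prometheus, per-level averages can be derived from the histogram's standard `_sum` and `_count` series. A sketch of a recording rule (group and rule names are illustrative, and the standard histogram series suffixes are assumed):

```yaml
groups:
  - name: apf-utilization   # illustrative group name
    rules:
      # Average seat utilization per priority level over the last five minutes.
      - record: apf:priority_level_seat_utilization:avg5m
        expr: |
          sum by (priority_level) (rate(apiserver_flowcontrol_priority_level_seat_utilization_sum[5m]))
          /
          sum by (priority_level) (rate(apiserver_flowcontrol_priority_level_seat_utilization_count[5m]))
```
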
@@ -603,9 +603,9 @@ poorly-behaved workloads that may be harming system health.
 {{< /note >}}
 
 * `apiserver_flowcontrol_request_concurrency_limit` is the same as
-`apiserver_flowcontrol_nominal_limit_seats`.  Before the
-introduction of concurrency borrowing between priority levels, this
-was always equal to `apiserver_flowcontrol_current_limit_seats`
+`apiserver_flowcontrol_nominal_limit_seats`. Before the
+introduction of concurrency borrowing between priority levels,
+this was always equal to `apiserver_flowcontrol_current_limit_seats`
 (which did not exist as a distinct metric).
 
 * `apiserver_flowcontrol_lower_limit_seats` is a gauge vector holding

@@ -616,8 +616,8 @@ poorly-behaved workloads that may be harming system health.
 
 * `apiserver_flowcontrol_demand_seats` is a histogram vector counting
 observations, at the end of every nanosecond, of each priority
-level's ratio of (seat demand) / (nominal concurrency limit).  A
-priority level's seat demand is the sum, over both queued requests
+level's ratio of (seat demand) / (nominal concurrency limit).
+A priority level's seat demand is the sum, over both queued requests
 and those in the initial phase of execution, of the maximum of the
 number of seats occupied in the request's initial and final
 execution phases.

@@ -791,6 +791,6 @@ Example FlowSchema object to isolate list event requests:
 
 - You can visit flow control [reference doc](/docs/reference/debug-cluster/flow-control/) to learn more about troubleshooting.
 - For background information on design details for API priority and fairness, see
-the [enhancement proposal](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1040-priority-and-fairness).
+the [enhancement proposal](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1040-priority-and-fairness).
 - You can make suggestions and feature requests via [SIG API Machinery](https://github.com/kubernetes/community/tree/master/sig-api-machinery)
-or the feature's [slack channel](https://kubernetes.slack.com/messages/api-priority-and-fairness).
+or the feature's [slack channel](https://kubernetes.slack.com/messages/api-priority-and-fairness).
