Commit 778a848

Merge pull request #35791 from MikeSpreitzer/fix-31633
Describe the APF tweaks for LIST and WATCH
2 parents aee5f66 + 4834c69 commit 778a848

1 file changed: +45 -4 lines changed

content/en/docs/concepts/cluster-administration/flow-control.md

Lines changed: 45 additions & 4 deletions
@@ -31,10 +31,13 @@ use informers and react to failures of API requests with exponential
 back-off, and other clients that also work this way.
 
 {{< caution >}}
-Requests classified as "long-running" — primarily watches — are not
-subject to the API Priority and Fairness filter. This is also true for
-the `--max-requests-inflight` flag without the API Priority and
-Fairness feature enabled.
+Some requests classified as "long-running"&mdash;such as remote
+command execution or log tailing&mdash;are not subject to the API
+Priority and Fairness filter. This is also true for the
+`--max-requests-inflight` flag without the API Priority and Fairness
+feature enabled. API Priority and Fairness _does_ apply to **watch**
+requests. When API Priority and Fairness is disabled, **watch** requests
+are not subject to the `--max-requests-inflight` limit.
 {{< /caution >}}
 
 <!-- body -->
@@ -93,6 +96,44 @@ Pods. This means that an ill-behaved Pod that floods the API server with
 requests cannot prevent leader election or actions by the built-in controllers
 from succeeding.
 
+### Seats Occupied by a Request
+
+The above description of concurrency management is the baseline story.
+In it, requests have different durations but are counted equally at
+any given moment when comparing against a priority level's concurrency
+limit. In the baseline story, each request occupies one unit of
+concurrency. The word "seat" is used to mean one unit of concurrency,
+inspired by the way each passenger on a train or aircraft takes up one
+of the fixed supply of seats.
+
+But some requests take up more than one seat. Some of these are **list**
+requests that the server estimates will return a large number of
+objects. These have been found to put an exceptionally heavy burden
+on the server, among requests that take a similar amount of time to
+run. For this reason, the server estimates the number of objects that
+will be returned and considers the request to take a number of seats
+that is proportional to that estimated number.
+
+### Execution time tweaks for watch requests
+
+API Priority and Fairness manages **watch** requests, but this involves a
+couple more excursions from the baseline behavior. The first concerns
+how long a **watch** request is considered to occupy its seat. Depending
+on request parameters, the response to a **watch** request may or may not
+begin with **create** notifications for all the relevant pre-existing
+objects. API Priority and Fairness considers a **watch** request to be
+done with its seat once that initial burst of notifications, if any,
+is over.
+
+The normal notifications are sent in a concurrent burst to all
+relevant **watch** response streams whenever the server is notified of an
+object create/update/delete. To account for this work, API Priority
+and Fairness considers every write request to spend some additional
+time occupying seats after the actual writing is done. The server
+estimates the number of notifications to be sent and adjusts the write
+request's number of seats and seat occupancy time to include this
+extra work.
+
 ### Queuing
 
 Even within a priority level there may be a large number of distinct sources of