
Commit 053dc48

Describe the APF tweaks for LIST and WATCH
1 parent 17c3350 commit 053dc48

1 file changed

+40
-4
lines changed

content/en/docs/concepts/cluster-administration/flow-control.md

Lines changed: 40 additions & 4 deletions
@@ -31,10 +31,12 @@ use informers and react to failures of API requests with exponential
 back-off, and other clients that also work this way.
 
 {{< caution >}}
-Requests classified as "long-running" — primarily watches — are not
-subject to the API Priority and Fairness filter. This is also true for
-the `--max-requests-inflight` flag without the API Priority and
-Fairness feature enabled.
+Some requests classified as "long-running" — such as remote command
+execution or log tailing — are not subject to the API Priority and
+Fairness filter. This is also true for the `--max-requests-inflight`
+flag without the API Priority and Fairness feature enabled. WATCH
+requests are considered long-running if API Priority and Fairness is
+disabled, NOT long-running if it is enabled.
 {{< /caution >}}
 
 <!-- body -->
@@ -93,6 +95,40 @@ Pods. This means that an ill-behaved Pod that floods the API server with
 requests cannot prevent leader election or actions by the built-in controllers
 from succeeding.
 
+### Request Width
+
+The above description of concurrency management is the baseline story.
+In it, all requests have equal "width": each takes up one "seat", one
+unit of concurrency.
+
+But some requests take up more than one seat. Some of these are LIST
+requests that the server estimates will return a large number of
+objects. These have been found to put an exceptionally heavy burden
+on the server, among requests that take a similar amount of time to
+run. For this reason, the server estimates the number of objects that
+will be returned and considers the request to take a number of seats
+that is proportional to that estimated number.
+
+### Execution Time Tweaks for WATCH
+
+API Priority and Fairness manages WATCH requests but this involves a
+couple more excursions from the baseline behavior. The first concerns
+how long a WATCH request is considered to occupy its seat. Depending
+on request parameters, the response to a WATCH request may or may not
+begin with CREATE notifications for all the relevant pre-existing
+objects. API Priority and Fairness considers a WATCH request to be
+done with its seat once that initial burst of notifications, if any,
+is over.
+
+The normal notifications are sent in a concurrent burst to all
+relevant WATCH response streams whenever the server is notified of an
+object create/update/delete. To account for this work, API Priority
+and Fairness considers every write request to spend some additional
+time occupying seats after the actual writing is done. The server
+estimates the number of notifications to be sent and adjusts the write
+request's number of seats and seat occupancy time to include this
+extra work.
+
 ### Queuing
 
 Even within a priority level there may be a large number of distinct sources of
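
The "Request Width" text added in this commit says only that a LIST request takes a number of seats proportional to the estimated number of returned objects. The Go sketch below illustrates that idea under stated assumptions; it is not the API server's actual code. The function name `estimateListSeats` and the parameters `objectsPerSeat` and `maxSeats` are hypothetical. The minimum of one seat follows the baseline behavior described in the diff, while the upper cap is an added assumption.

```go
package main

import "fmt"

// estimateListSeats returns an illustrative seat count for a LIST request.
// The count grows in proportion to estimatedObjects, as the text describes.
// objectsPerSeat (the proportionality constant) and maxSeats (an upper cap)
// are assumptions made for this sketch, not values taken from Kubernetes.
func estimateListSeats(estimatedObjects, objectsPerSeat, maxSeats int) int {
	if objectsPerSeat <= 0 {
		objectsPerSeat = 1 // guard against division by zero
	}
	// Ceiling division: any partial batch of objects still costs a full seat.
	seats := (estimatedObjects + objectsPerSeat - 1) / objectsPerSeat
	if seats < 1 {
		seats = 1 // every request occupies at least one seat (the baseline)
	}
	if maxSeats > 0 && seats > maxSeats {
		seats = maxSeats // assumed cap so one request cannot take every seat
	}
	return seats
}

func main() {
	for _, n := range []int{0, 50, 100, 250, 5000} {
		fmt.Printf("estimated objects=%d -> seats=%d\n", n, estimateListSeats(n, 100, 10))
	}
}
```

With the assumed constants (100 objects per seat, at most 10 seats), a LIST estimated at 250 objects would occupy 3 seats, and one estimated at 5000 objects would be capped at 10.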
