
Commit b8cc58a

Clean up flow-control.md
1 parent 64b2336 commit b8cc58a

File tree

1 file changed: +75 -74 lines

content/en/docs/concepts/cluster-administration/flow-control.md

Lines changed: 75 additions & 74 deletions
@@ -784,120 +784,121 @@ APF adds the following two headers to each HTTP response message.
## Good practices for using API Priority and Fairness

When a given priority level exceeds its permitted concurrency, requests can
experience increased latency or be dropped with an HTTP 429 (Too Many Requests)
error. To prevent these side effects of APF, you can modify your workload or
tweak your APF settings to ensure there are sufficient seats available to serve
your requests.

To detect whether requests are being rejected due to APF, check the following
metrics:

- apiserver_flowcontrol_rejected_requests_total: the total number of requests
  rejected per FlowSchema and PriorityLevelConfiguration.
- apiserver_flowcontrol_current_inqueue_requests: the current number of requests
  queued per FlowSchema and PriorityLevelConfiguration.
- apiserver_flowcontrol_request_wait_duration_seconds: the latency added to
  requests waiting in queues.
- apiserver_flowcontrol_priority_level_seat_utilization: the seat utilization
  per PriorityLevelConfiguration.
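
One way to spot-check these metrics is to read the API server's `/metrics`
endpoint directly, assuming your credentials allow access to that path:

```shell
# Total requests rejected by APF, per FlowSchema and priority level
kubectl get --raw /metrics | grep apiserver_flowcontrol_rejected_requests_total
```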

### Workload modifications {#good-practice-workload-modifications}

To prevent requests from queuing and adding latency or being dropped due to APF,
you can optimize your requests by:

- Reducing the rate at which requests are executed. Fewer requests over a fixed
  period will result in fewer seats being needed at a given time.
- Avoiding issuing a large number of expensive requests concurrently. Requests
  can be optimized to use fewer seats or have lower latency so that these
  requests hold those seats for a shorter duration. List requests can occupy
  more than 1 seat depending on the number of objects fetched during the
  request. Restricting the number of objects retrieved in a list request, for
  example by using pagination, will use fewer total seats over a shorter period
  (see the example after this list). Furthermore, replacing list requests with
  watch requests will require lower total concurrency shares, as watch requests
  only occupy 1 seat during their initial burst of notifications. If you use
  streaming lists (versions 1.27 and later), watch requests will occupy the same
  number of seats as a list request for their initial burst of notifications,
  because the entire state of the collection has to be streamed. Note that in
  both cases, a watch request will not hold any seats after this initial phase.
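
As a minimal sketch of pagination, kubectl can split one large list into
smaller chunked requests; the resource and chunk size here are illustrative:

```shell
# Retrieve events in pages of 500 objects instead of a single large list
kubectl get events --namespace default --chunk-size=500
```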

Keep in mind that queued or rejected requests from APF could be induced by
either an increase in the number of requests or an increase in latency for
existing requests. For example, if requests that normally take 1s to execute
start taking 60s, it is possible that APF will start rejecting requests because
requests are occupying seats for a longer duration than normal due to this
increase in latency. If APF starts rejecting requests across multiple priority
levels without a significant change in workload, it is possible there is an
underlying issue with control plane performance rather than the workload or APF
settings.

### Priority and fairness settings {#good-practice-apf-settings}

You can also modify the default FlowSchema and PriorityLevelConfiguration
objects or create new objects of these types to better accommodate your
workload.

APF settings can be modified to:

- Give more seats to high priority requests.
- Isolate non-essential or expensive requests that would starve a concurrency
  level if it were shared with other flows.

#### Give more seats to high priority requests

1. If possible, the number of seats available across all priority levels for a
   particular `kube-apiserver` can be increased by raising the values of the
   `max-requests-inflight` and `max-mutating-requests-inflight` flags.
   Alternatively, horizontally scaling the number of `kube-apiserver` instances
   will increase the total concurrency per priority level across the cluster,
   assuming there is sufficient load balancing of requests.
1. You can create a new FlowSchema which references a PriorityLevelConfiguration
   with a larger concurrency level (see the sketch after this list). This new
   PriorityLevelConfiguration could be an existing level or a new level with its
   own set of nominal concurrency shares. For example, a new FlowSchema could be
   introduced to change the PriorityLevelConfiguration for your requests from
   global-default to workload-low to increase the number of seats available to
   your user. Creating a new PriorityLevelConfiguration will reduce the number
   of seats designated for existing levels. Recall that editing a default
   FlowSchema or PriorityLevelConfiguration requires setting the
   `apf.kubernetes.io/autoupdate-spec` annotation to false.
1. You can also increase the NominalConcurrencyShares for the
   PriorityLevelConfiguration which is serving your high priority requests.
   Alternatively, for versions 1.26 and later, you can increase the
   LendablePercent for competing priority levels so that the given priority
   level has a higher pool of seats it can borrow (also sketched below).
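
As a sketch of the second item above, a FlowSchema along the following lines
would route one user's requests to the existing workload-low priority level.
The object name, user name, and matching rules are hypothetical, and the API
version is shown as v1beta3 here; use whichever version your cluster serves:

```shell
kubectl apply -f - <<EOF
apiVersion: flowcontrol.apiserver.k8s.io/v1beta3
kind: FlowSchema
metadata:
  name: example-user-workload-low  # hypothetical name
spec:
  # A lower value than the FlowSchema currently matching these requests
  # (for example global-default) means this schema is evaluated first.
  matchingPrecedence: 8000
  priorityLevelConfiguration:
    name: workload-low             # existing level with more available seats
  distinguisherMethod:
    type: ByUser
  rules:
    - subjects:
        - kind: User
          user:
            name: example-user     # hypothetical user issuing the requests
      resourceRules:
        - apiGroups: ["*"]
          namespaces: ["*"]
          resources: ["*"]
          verbs: ["*"]
EOF
```

For the third item, one option is to raise the LendablePercent of a competing
priority level so that your level has a larger pool of seats to borrow; the
level name and value below are illustrative. Remember that a default object
must have `apf.kubernetes.io/autoupdate-spec` set to false before manual edits
will persist:

```shell
# Let other priority levels borrow up to 75% of workload-high's seats
kubectl patch prioritylevelconfiguration workload-high --type merge \
  -p '{"spec":{"limited":{"lendablePercent":75}}}'
```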

#### Isolate non-essential requests from starving other flows

For request isolation, you can create a FlowSchema whose subject matches the
user making these requests, or create a FlowSchema that matches what the
request is (corresponding to the resourceRules). Next, you can map this
FlowSchema to a PriorityLevelConfiguration with a low share of seats.

For example, suppose list event requests from Pods running in the default
namespace are using 10 seats each and execute for 1 minute. To prevent these
expensive requests from impacting requests from other Pods using the existing
service-accounts FlowSchema, you can apply the following FlowSchema to isolate
these list calls from other requests.

Example FlowSchema object to isolate list event requests:

{{% code file="priority-and-fairness/list-events-default-service-account.yaml" %}}

- This FlowSchema captures all list event calls made by the default service
  account in the default namespace. The matching precedence 8000 is lower than
  the value of 9000 used by the existing service-accounts FlowSchema, so these
  list event calls will match list-events-default-service-account rather than
  service-accounts.
- The catch-all PriorityLevelConfiguration is used to isolate these requests.
  The catch-all priority level has a very small concurrency share and does not
  queue requests.
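
To verify that the isolation is taking effect, you can check the APF metrics
for the new FlowSchema, assuming your credentials allow reading the `/metrics`
endpoint:

```shell
# Requests handled under the new FlowSchema, labeled with their priority level
kubectl get --raw /metrics | grep 'flow_schema="list-events-default-service-account"'
```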

## {{% heading "whatsnext" %}}

For background information on design details for API priority and fairness, see
the [enhancement proposal](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1040-priority-and-fairness).
You can make suggestions and feature requests via [SIG API Machinery](https://github.com/kubernetes/community/tree/master/sig-api-machinery)
or the feature's [slack channel](https://kubernetes.slack.com/messages/api-priority-and-fairness).
