Introduce limit on borrowing as well as lending

MikeSpreitzer · MikeSpreitzer · commit ccfcab294576 · 2022-08-26T16:51:02.000-04:00
diff --git a/keps/sig-api-machinery/1040-priority-and-fairness/README.md b/keps/sig-api-machinery/1040-priority-and-fairness/README.md
@@ -643,8 +643,8 @@ The concurrency limit of an apiserver is divided among the non-exempt
 priority levels, and they can do a limited amount of borrowing from
 each other.
 
-One field of `LimitedPriorityLevelConfiguration`, introduced in the
-midst of the `v1beta2` lifetime, limits the borrowing.  The field is
+Two fields of `LimitedPriorityLevelConfiguration`, introduced in the
+midst of the `v1beta2` lifetime, limit the borrowing.  The fields are
 added in all the versions (`v1alpha1`, `v1beta1`, and `v1beta2`).  The
 following display shows the new fields along with the updated
 description for the `AssuredConcurrencyShares` field, in `v1beta2`.
@@ -657,9 +657,8 @@ type LimitedPriorityLevelConfiguration struct {
   // This is the number of execution seats available at this priority level.
   // This is used both for requests dispatched from
   // this priority level as well as requests dispatched from other priority
-  // levels borrowing seats from this level.  This does not limit dispatching from
-  // this priority level that borrows seats from other priority levels (those other
-  // levels do that).  The server's concurrency limit (ServerCL) is divided among the
+  // levels borrowing seats from this level.
+  // The server's concurrency limit (ServerCL) is divided among the
   // Limited priority levels in proportion to their ACS values:
   //
   // NominalCL(i)  = ceil( ServerCL * ACS(i) / sum_acs )
@@ -671,22 +670,35 @@ type LimitedPriorityLevelConfiguration struct {
   // +optional
   AssuredConcurrencyShares int32
 
-  // `borrowablePercent` prescribes the fraction of the level's NominalCL that
+  // `lendablePercent` prescribes the fraction of the level's NominalCL that
   // can be borrowed by other priority levels.  This value of this
   // field must be between 0 and 100, inclusive, and it defaults to 0.
   // The number of seats that other levels can borrow from this level, known
-  // as this level's BorrowableConcurrencyLimit (BorrowableCL), is defined as follows.
+  // as this level's LendableConcurrencyLimit (LendableCL), is defined as follows.
   //
-  // BorrowableCL(i) = round( NominalCL(i) * borrowablePercent(i)/100.0 )
+  // LendableCL(i) = round( NominalCL(i) * lendablePercent(i)/100.0 )
   //
   // +optional
-  BorrowablePercent int32
+  LendablePercent int32
+  
+  // `borrowingLimitPercent`, if present, specifies a limit on how many seats
+  // this priority level can borrow from other priority levels.  The limit
+  // is known as this level's BorrowingConcurrencyLimit (BorrowingCL) and
+  // is a limit on the total number of seats that this level may borrow
+  // at any one time.  When this field is non-nil, it must hold a non-negative
+  // integer and the limit is calculated as follows.
+  //
+  // BorrowingCL(i) = round( NominalCL(i) * borrowingLimitPercent(i)/100.0 )
+  //
+  // When this field is left `nil`, the limit is effetively infinite.
+  // +optional
+  BorrowingLimitPercent *int32
 }
 ```
 
 Prior to the introduction of borrowing, the `assuredConcurrencyShares`
 field had two meanings that amounted to the same thing: the total
-shares of the level, and the non-borrowable shares of the level.
+shares of the level, and the non-lendable shares of the level.
 While it is somewhat unnatural to keep the meaning of "total shares"
 for a field named "assured" shares, rolling out the new behavior into
 existing systems will be more continuous if we keep the meaning of
@@ -696,28 +708,34 @@ rename the `AssuredConcurrencyShares` to `NominalConcurrencyShares`.
 The following table shows the current default non-exempt priority
 levels and a proposal for their new configuration.
 
-| Name | Assured Shares | Proposed Borrowable Percent |
-| ---- | -------------: | --------------------------: |
-| leader-election |  10 |   0 |
-| node-high       |  40 |  25 |
-| system          |  30 |  33 |
-| workload-high   |  40 |  50 |
-| workload-low    | 100 |  90 |
-| global-default  |  20 |  50 |
-| catch-all       |   5 |   0 |
+| Name | Assured Shares | Proposed Lendable | Proposed Borrowing Limit |
+| ---- | -------------: | ----------------: | -----------------------: |
+| leader-election |  10 |   0% | none |
+| node-high       |  40 |  25% | none |
+| system          |  30 |  33% | none |
+| workload-high   |  40 |  50% | none |
+| workload-low    | 100 |  90% | none |
+| global-default  |  20 |  50% | none |
+| catch-all       |   5 |   0% | none |
 
 Each non-exempt priority level `i` has two concurrency limits: its
 NominalConcurrencyLimit (`NominalCL(i)`) as defined above by
 configuration, and a CurrentConcurrencyLimit (`CurrentCL(i)`) that is
 used in dispatching requests.  The CurrentCLs are adjusted
 periodically, based on configuration, the current situation at
 adjustment time, and recent observations.  The "borrowing" resides in
-the differences between CurrentCL and NominalCL.  There is a lower
-bound on each non-exempt priority level's CurrentCL: `MinCL(i) =
-NominalCL(i) - BorrowableCL(i)`; the upper limit is imposed only by
-how many seats are available for borrowing from other priority levels.
-The sum of the CurrentCLs is always equal to the server's concurrency
-limit (ServerCL) plus or minus a little for rounding in the adjustment
+the differences between CurrentCL and NominalCL.  There are upper and lower
+bound on each non-exempt priority level's CurrentCL, as follows.
+
+```
+MaxCL(i) = NominalCL(i) + BorrowingCL(i)
+MinCL(i) = NominalCL(i) - LendableCL(i)
+```
+
+Naturally the CurrentCL values are also limited by how many seats are
+available for borrowing from other priority levels.  The sum of the
+CurrentCLs is always equal to the server's concurrency limit
+(ServerCL) plus or minus a little for rounding in the adjustment
 algorithm below.
 
 Dispatching is done independently for each priority level.  Whenever
@@ -751,12 +769,13 @@ line flag `--seat-demand-history-fraction` with a default value of 0.9
 configures A.
 
 Adjustment is also done on configuration change, when a priority level
-is introduced or removed or its NominalCL or BorrowableCL changes.  At
-such a time, the current adjustment period comes to an early end and
-the regular adjustment logic runs; the adjustment timer is reset to
-next fire 10 seconds later.  For a newly introduced priority level, we
-set HighSeatDemand, AvgSeatDemand, and SmoothSeatDemand to
-NominalCL-BorrowableSD/2 and StDevSeatDemand to zero.
+is introduced or removed or its NominalCL, LendableCL, or BorrowingCL
+changes.  At such a time, the current adjustment period comes to an
+early end and the regular adjustment logic runs; the adjustment timer
+is reset to next fire 10 seconds later.  For a newly introduced
+priority level, we set HighSeatDemand, AvgSeatDemand, and
+SmoothSeatDemand to NominalCL-LendableSD/2 and StDevSeatDemand to
+zero.
 
 For adjusting the CurrentCL values, each non-exempt priority level `i`
 has a lower bound (`MinCurrentCL(i)`) for the new value.  It is simply
@@ -773,36 +792,43 @@ the CurrentCL values be floating-point numbers, not necessarily
 integers.
 
 The priority levels would all be fairly happy if we set CurrentCL =
-SmoothSeatDemand for each.  We clip that by the lower bound just
-shown, taking `Target(i) = max(SmoothSeatDemand(i), MinCurrentCL(i))`
-as a first-order target for each non-exempt priority level `i`.
+SmoothSeatDemand for each.  We clip that by the lower bound just shown
+and the configured upper bound.  We define `Target(i)` as follows and
+take it as a first-order target for each non-exempt priority level
+`i`.
+
+```
+Target(i) = min( MaxCL(i), max( MinCurrentCL(i), SmoothSeatDemand(i) ))
+```
 
 Sadly, the sum of the Target values --- let's name that TargetSum ---
-is not necessarily equal to ServerCL.  However, if `TargetSum <=
-ServerCL` then all the Targets can be scaled up in the same proportion
-`FairProp = ServerCL / TargetSum` to get the new concurrency limits.
-That is, `CurrentCL(i) := FairProp * Target(i)` for each non-exempt
-priority level `i`.  This shares the wealth proportionally among the
-priority levels.  Also note, the following computation produces the
-same result.
-
-If `TargetSum > ServerCL` then we can not necessarily scale all the
-Targets down by the same factor --- because that might violate some
-lower bounds.  The problem is to find a proportion `FairProp`, which
-we know must lie somewhere in the range (0,1) when `TargetSum >
-ServerCL`, that can be shared by all the priority levels except those
-whose lower bound forbids that.  This means to find the one value of
-`FairProp` that solves the following conditions, for all the
-non-exempt priority levels `i`, and also makes the CurrentCL values
-sum to ServerCL.
-
-```
-CurrentCL(i) = FairProp * Target(i)  if FairProp * Target(i) >= MinCurrentCL(i)
-CurrentCL(i) = MinCurrentCL(i)       if FairProp * Target(i) <= MinCurrentCL(i)
-```
-
-This is the mirror image of the max-min fairness problem and can be
-solved with the same sort of algorithm, taking O(N log N) time and
+is not necessarily equal to ServerCL.  However, if `TargetSum <
+ServerCL` then all the Targets could be scaled up in the same
+proportion `FairProp = ServerCL / TargetSum` (if that did not violate
+any upper bound) to get the new concurrency limits `CurrentCL(i) :=
+FairProp * Target(i)` for each non-exempt priority level `i`.
+Similarly, if `TargetSum > ServerCL` then all the Targets could be
+scaled down in the same proportion (if that did not violate any lower
+bound) to get the new concurrency limits.  This shares the wealth or
+the pain proportionally among the priority levels.  The following
+computation generalizes this idea to respect the relevant bounds.
+
+We can not necessarily scale all the Targets by the same factor ---
+because that might violate some upper or lower bounds.  The problem is
+to find a proportion `FairProp` that can be shared by all the priority
+levels except those with a bound that forbids it.  This means to find
+a value of `FairProp` that simultaneously solves all the following
+conditions, for the non-exempt priority levels `i`, and also makes the
+CurrentCL values sum to ServerCL.  In some cases there are many
+satisfactory values of `FairProp` --- and that is OK, because they all
+produce the same CurrentCL values.
+
+```
+CurrentCL(i) = min( MaxCL(i), max( MinCurrentCL(i), FairProp * Target(i) ))
+```
+
+This is similar to the max-min fairness problem and can be solved
+using sorting and then a greedy algorithm, taking O(N log N) time and
 O(N) space.
 
 After finding the floating point CurrentCL solutions, each one is