@@ -643,8 +643,8 @@ The concurrency limit of an apiserver is divided among the non-exempt
 priority levels, and they can do a limited amount of borrowing from
 each other.

-One field of `LimitedPriorityLevelConfiguration`, introduced in the
-midst of the `v1beta2` lifetime, limits the borrowing. The field is
+Two fields of `LimitedPriorityLevelConfiguration`, introduced in the
+midst of the `v1beta2` lifetime, limit the borrowing. The fields are
 added in all the versions (`v1alpha1`, `v1beta1`, and `v1beta2`). The
 following display shows the new fields along with the updated
 description for the `AssuredConcurrencyShares` field, in `v1beta2`.
@@ -657,9 +657,8 @@ type LimitedPriorityLevelConfiguration struct {
 	// This is the number of execution seats available at this priority level.
 	// This is used both for requests dispatched from
 	// this priority level as well as requests dispatched from other priority
-	// levels borrowing seats from this level. This does not limit dispatching from
-	// this priority level that borrows seats from other priority levels (those other
-	// levels do that). The server's concurrency limit (ServerCL) is divided among the
+	// levels borrowing seats from this level.
+	// The server's concurrency limit (ServerCL) is divided among the
 	// Limited priority levels in proportion to their ACS values:
 	//
 	// NominalCL(i) = ceil( ServerCL * ACS(i) / sum_acs )
@@ -671,22 +670,35 @@ type LimitedPriorityLevelConfiguration struct {
 	// +optional
 	AssuredConcurrencyShares int32

-	// `borrowablePercent` prescribes the fraction of the level's NominalCL that
+	// `lendablePercent` prescribes the fraction of the level's NominalCL that
 	// can be borrowed by other priority levels. This value of this
 	// field must be between 0 and 100, inclusive, and it defaults to 0.
 	// The number of seats that other levels can borrow from this level, known
-	// as this level's BorrowableConcurrencyLimit (BorrowableCL), is defined as follows.
+	// as this level's LendableConcurrencyLimit (LendableCL), is defined as follows.
 	//
-	// BorrowableCL(i) = round( NominalCL(i) * borrowablePercent(i)/100.0 )
+	// LendableCL(i) = round( NominalCL(i) * lendablePercent(i)/100.0 )
 	//
 	// +optional
-	BorrowablePercent int32
+	LendablePercent int32
+
+	// `borrowingLimitPercent`, if present, specifies a limit on how many seats
+	// this priority level can borrow from other priority levels. The limit
+	// is known as this level's BorrowingConcurrencyLimit (BorrowingCL) and
+	// is a limit on the total number of seats that this level may borrow
+	// at any one time. When this field is non-nil, it must hold a non-negative
+	// integer and the limit is calculated as follows.
+	//
+	// BorrowingCL(i) = round( NominalCL(i) * borrowingLimitPercent(i)/100.0 )
+	//
+	// When this field is left `nil`, the limit is effectively infinite.
+	// +optional
+	BorrowingLimitPercent *int32
 }
 ```

 Prior to the introduction of borrowing, the `assuredConcurrencyShares`
 field had two meanings that amounted to the same thing: the total
-shares of the level, and the non-borrowable shares of the level.
+shares of the level, and the non-lendable shares of the level.
 While it is somewhat unnatural to keep the meaning of "total shares"
 for a field named "assured" shares, rolling out the new behavior into
 existing systems will be more continuous if we keep the meaning of
@@ -696,28 +708,34 @@ rename the `AssuredConcurrencyShares` to `NominalConcurrencyShares`.
 The following table shows the current default non-exempt priority
 levels and a proposal for their new configuration.

-| Name | Assured Shares | Proposed Borrowable Percent |
-| ---- | -------------: | --------------------------: |
-| leader-election | 10 | 0 |
-| node-high | 40 | 25 |
-| system | 30 | 33 |
-| workload-high | 40 | 50 |
-| workload-low | 100 | 90 |
-| global-default | 20 | 50 |
-| catch-all | 5 | 0 |
+| Name | Assured Shares | Proposed Lendable | Proposed Borrowing Limit |
+| ---- | -------------: | ----------------: | -----------------------: |
+| leader-election | 10 | 0% | none |
+| node-high | 40 | 25% | none |
+| system | 30 | 33% | none |
+| workload-high | 40 | 50% | none |
+| workload-low | 100 | 90% | none |
+| global-default | 20 | 50% | none |
+| catch-all | 5 | 0% | none |

 Each non-exempt priority level `i` has two concurrency limits: its
 NominalConcurrencyLimit (`NominalCL(i)`) as defined above by
 configuration, and a CurrentConcurrencyLimit (`CurrentCL(i)`) that is
 used in dispatching requests. The CurrentCLs are adjusted
 periodically, based on configuration, the current situation at
 adjustment time, and recent observations. The "borrowing" resides in
-the differences between CurrentCL and NominalCL. There is a lower
-bound on each non-exempt priority level's CurrentCL: `MinCL(i) =
-NominalCL(i) - BorrowableCL(i)`; the upper limit is imposed only by
-how many seats are available for borrowing from other priority levels.
-The sum of the CurrentCLs is always equal to the server's concurrency
-limit (ServerCL) plus or minus a little for rounding in the adjustment
+the differences between CurrentCL and NominalCL. There are upper and lower
+bounds on each non-exempt priority level's CurrentCL, as follows.
+
+```
+MaxCL(i) = NominalCL(i) + BorrowingCL(i)
+MinCL(i) = NominalCL(i) - LendableCL(i)
+```
+
+Naturally the CurrentCL values are also limited by how many seats are
+available for borrowing from other priority levels. The sum of the
+CurrentCLs is always equal to the server's concurrency limit
+(ServerCL) plus or minus a little for rounding in the adjustment
 algorithm below.

 Dispatching is done independently for each priority level. Whenever
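
Note on the hunks above: the formulas for NominalCL, LendableCL, BorrowingCL, MinCL, and MaxCL can be checked with a small calculation. The following is a minimal sketch, not the apiserver's implementation. The `Level` and `Limits` types and the `DeriveLimits` function are hypothetical names introduced only for illustration, and using `math.Round` (which rounds halves away from zero) for "round" is an assumption, since the text does not say which rounding mode is meant. A `nil` borrowing limit is modeled as positive infinity.

```go
package main

import "math"

// Level is a hypothetical stand-in for one priority level's settings;
// only the formulas below are taken from the text.
type Level struct {
	Name                  string
	Shares                int32  // AssuredConcurrencyShares (ACS)
	LendablePercent       int32  // 0 to 100
	BorrowingLimitPercent *int32 // nil means no configured borrowing limit
}

// Limits holds the derived seat counts for one priority level.
type Limits struct {
	NominalCL   float64
	LendableCL  float64
	BorrowingCL float64 // +Inf when BorrowingLimitPercent is nil
	MinCL       float64
	MaxCL       float64 // +Inf when BorrowingLimitPercent is nil
}

// DeriveLimits applies the NominalCL, LendableCL, BorrowingCL, MinCL, and
// MaxCL formulas to every level. It returns zero-valued Limits when the
// shares sum to zero, to avoid dividing by zero.
func DeriveLimits(serverCL int, levels []Level) []Limits {
	out := make([]Limits, len(levels))
	sumShares := 0.0
	for _, l := range levels {
		sumShares += float64(l.Shares)
	}
	if sumShares == 0 {
		return out
	}
	for i, l := range levels {
		// NominalCL(i) = ceil( ServerCL * ACS(i) / sum_acs )
		nominal := math.Ceil(float64(serverCL) * float64(l.Shares) / sumShares)
		// LendableCL(i) = round( NominalCL(i) * lendablePercent(i)/100.0 )
		lendable := math.Round(nominal * float64(l.LendablePercent) / 100.0)
		// BorrowingCL(i) is effectively infinite when the field is nil.
		borrowing := math.Inf(1)
		if l.BorrowingLimitPercent != nil {
			borrowing = math.Round(nominal * float64(*l.BorrowingLimitPercent) / 100.0)
		}
		out[i] = Limits{
			NominalCL:   nominal,
			LendableCL:  lendable,
			BorrowingCL: borrowing,
			MinCL:       nominal - lendable,
			MaxCL:       nominal + borrowing,
		}
	}
	return out
}
```

For instance, with an assumed ServerCL of 600 and the proposed defaults in the table (shares summing to 245), `workload-high` would get NominalCL 98, LendableCL 49, and MinCL 49, with no upper bound because its borrowing limit is "none".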
@@ -751,12 +769,13 @@ line flag `--seat-demand-history-fraction` with a default value of 0.9
 configures A.

 Adjustment is also done on configuration change, when a priority level
-is introduced or removed or its NominalCL or BorrowableCL changes. At
-such a time, the current adjustment period comes to an early end and
-the regular adjustment logic runs; the adjustment timer is reset to
-next fire 10 seconds later. For a newly introduced priority level, we
-set HighSeatDemand, AvgSeatDemand, and SmoothSeatDemand to
-NominalCL-BorrowableSD/2 and StDevSeatDemand to zero.
+is introduced or removed or its NominalCL, LendableCL, or BorrowingCL
+changes. At such a time, the current adjustment period comes to an
+early end and the regular adjustment logic runs; the adjustment timer
+is reset to next fire 10 seconds later. For a newly introduced
+priority level, we set HighSeatDemand, AvgSeatDemand, and
+SmoothSeatDemand to NominalCL-LendableSD/2 and StDevSeatDemand to
+zero.

 For adjusting the CurrentCL values, each non-exempt priority level `i`
 has a lower bound (`MinCurrentCL(i)`) for the new value. It is simply
@@ -773,36 +792,43 @@ the CurrentCL values be floating-point numbers, not necessarily
 integers.

 The priority levels would all be fairly happy if we set CurrentCL =
-SmoothSeatDemand for each. We clip that by the lower bound just
-shown, taking `Target(i) = max(SmoothSeatDemand(i), MinCurrentCL(i))`
-as a first-order target for each non-exempt priority level `i`.
+SmoothSeatDemand for each. We clip that by the lower bound just shown
+and the configured upper bound. We define `Target(i)` as follows and
+take it as a first-order target for each non-exempt priority level
+`i`.
+
+```
+Target(i) = min( MaxCL(i), max( MinCurrentCL(i), SmoothSeatDemand(i) ))
+```

 Sadly, the sum of the Target values --- let's name that TargetSum ---
-is not necessarily equal to ServerCL. However, if `TargetSum <=
-ServerCL` then all the Targets can be scaled up in the same proportion
-`FairProp = ServerCL / TargetSum` to get the new concurrency limits.
-That is, `CurrentCL(i) := FairProp * Target(i)` for each non-exempt
-priority level `i`. This shares the wealth proportionally among the
-priority levels. Also note, the following computation produces the
-same result.
-
-If `TargetSum > ServerCL` then we can not necessarily scale all the
-Targets down by the same factor --- because that might violate some
-lower bounds. The problem is to find a proportion `FairProp`, which
-we know must lie somewhere in the range (0,1) when `TargetSum >
-ServerCL`, that can be shared by all the priority levels except those
-whose lower bound forbids that. This means to find the one value of
-`FairProp` that solves the following conditions, for all the
-non-exempt priority levels `i`, and also makes the CurrentCL values
-sum to ServerCL.
-
-```
-CurrentCL(i) = FairProp * Target(i) if FairProp * Target(i) >= MinCurrentCL(i)
-CurrentCL(i) = MinCurrentCL(i) if FairProp * Target(i) <= MinCurrentCL(i)
-```
-
-This is the mirror image of the max-min fairness problem and can be
-solved with the same sort of algorithm, taking O(N log N) time and
+is not necessarily equal to ServerCL. However, if `TargetSum <
+ServerCL` then all the Targets could be scaled up in the same
+proportion `FairProp = ServerCL / TargetSum` (if that did not violate
+any upper bound) to get the new concurrency limits `CurrentCL(i) :=
+FairProp * Target(i)` for each non-exempt priority level `i`.
+Similarly, if `TargetSum > ServerCL` then all the Targets could be
+scaled down in the same proportion (if that did not violate any lower
+bound) to get the new concurrency limits. This shares the wealth or
+the pain proportionally among the priority levels. The following
+computation generalizes this idea to respect the relevant bounds.
+
+We can not necessarily scale all the Targets by the same factor ---
+because that might violate some upper or lower bounds. The problem is
+to find a proportion `FairProp` that can be shared by all the priority
+levels except those with a bound that forbids it. This means to find
+a value of `FairProp` that simultaneously solves all the following
+conditions, for the non-exempt priority levels `i`, and also makes the
+CurrentCL values sum to ServerCL. In some cases there are many
+satisfactory values of `FairProp` --- and that is OK, because they all
+produce the same CurrentCL values.
+
+```
+CurrentCL(i) = min( MaxCL(i), max( MinCurrentCL(i), FairProp * Target(i) ))
+```
+
+This is similar to the max-min fairness problem and can be solved
+using sorting and then a greedy algorithm, taking O(N log N) time and
 O(N) space.

 After finding the floating point CurrentCL solutions, each one is
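
Note on the hunk above: the `FairProp` conditions can be illustrated with a short solver. This is a minimal sketch under stated assumptions, not the algorithm the proposal describes. It uses plain bisection on `FairProp` instead of the sort-then-greedy O(N log N) method, which works because the sum of the clamped values never decreases as `FairProp` grows. The `LevelState` type and the `Allocate` and `clamp` functions are hypothetical names, `MinCurrentCL` is taken as a given input, and the sketch assumes that the sum of the lower bounds is at most ServerCL and the sum of the upper bounds is at least ServerCL.

```go
package main

import "math"

// LevelState is a hypothetical per-level input for this sketch.
type LevelState struct {
	MinCurrentCL     float64
	MaxCL            float64 // may be +Inf when no borrowing limit is set
	SmoothSeatDemand float64
}

// clamp limits x to the range [lo, hi].
func clamp(x, lo, hi float64) float64 {
	return math.Min(hi, math.Max(lo, x))
}

// Allocate returns floating-point CurrentCL values and the FairProp used.
// It assumes sum(MinCurrentCL) <= serverCL <= sum(MaxCL); outside that
// range no FairProp exists and the returned values will not sum to serverCL.
func Allocate(serverCL float64, levels []LevelState) ([]float64, float64) {
	// Target(i) = min( MaxCL(i), max( MinCurrentCL(i), SmoothSeatDemand(i) ))
	targets := make([]float64, len(levels))
	for i, l := range levels {
		targets[i] = clamp(l.SmoothSeatDemand, l.MinCurrentCL, l.MaxCL)
	}
	// sumAt is the total of the clamped, scaled targets. It never
	// decreases as p grows, which is what makes bisection valid.
	sumAt := func(p float64) float64 {
		total := 0.0
		for i, l := range levels {
			total += clamp(p*targets[i], l.MinCurrentCL, l.MaxCL)
		}
		return total
	}
	lo, hi := 0.0, 1.0
	for n := 0; n < 64 && sumAt(hi) < serverCL; n++ {
		hi *= 2 // widen the bracket until the sum reaches serverCL
	}
	// Narrow [lo, hi] around the smallest p whose sum reaches serverCL.
	for n := 0; n < 100; n++ {
		mid := (lo + hi) / 2
		if sumAt(mid) < serverCL {
			lo = mid
		} else {
			hi = mid
		}
	}
	current := make([]float64, len(levels))
	for i, l := range levels {
		current[i] = clamp(hi*targets[i], l.MinCurrentCL, l.MaxCL)
	}
	return current, hi
}
```

As an example, two levels with bounds [40, 100] and [10, 100], smoothed demands 50 and 150, and a ServerCL of 90 have Targets 50 and 100. Scaling both by 0.5 would push the first level below its lower bound, so it is held at 40 while the second receives 50, and the two values sum to 90.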