@@ -74,6 +74,7 @@ SIG Architecture for cross-cutting KEPs).
- [Design Details](#design-details)
- [API](#api)
- [Implementation](#implementation)
+ - [Limits](#limits)
- [Test Plan](#test-plan)
- [Prerequisite testing updates](#prerequisite-testing-updates)
- [Unit tests](#unit-tests)
@@ -485,7 +486,33 @@ might pick a different device than expected or fails to allocate any device at a
think this failure mode is preferable to allowing the scheduler to make allocation decisions
based on incomplete data.

- ###
+ ### Limits
+
+ For DRA, we have gradually moved away from limits on individual lists and maps towards
+ aggregate limits at a higher level. This gives users maximum flexibility to define either
+ a small number of complex devices or a large number of simple devices in a single
+ ResourceSlice without exceeding the limit on the size of Kubernetes objects.
+
+ For this KEP, we propose taking this approach to its logical conclusion: most limits
+ are enforced across all devices, mixins and counter sets in a ResourceSlice,
+ rather than separately for each of them.
+
+ The ResourceSlice-wide limits will be:
+ * Total number of devices is 128.
+ * Total combined number of attributes and capacities in a ResourceSlice is 4096 (so with the maximum number of devices, there can be 32 per device).
+ * Total number of counters is 256.
+ * Total number of consumed counters is 2048 (so with the maximum number of devices, there can be 16 per device).
+
+ We will still enforce some limits on individual lists:
+ * The number of mixins that can be referenced from each device, counter set or device counter consumption is 8.
+ * The number of taints per device is 4.
+
+ The limits introduced in 1.33 for the Partitionable Devices KEP on the number of counters across counter sets,
+ mixins and device counter consumption will be removed, as those are still in alpha.
+ The limit of 32 on the number of attributes and capacities per device will be removed over the next two releases (1.34 and 1.35) to
+ preserve safe rollbacks.
+
+ With these limits, the worst-case size for a ResourceSlice increases from 1,107,864 bytes to 1,288,825 bytes.

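The limits listed above can be sketched as a validation pass. This is only an illustration, not the validation code proposed by the KEP: the `Device` and `Slice` types, their count-only fields and the `Validate` function are hypothetical stand-ins for the real ResourceSlice API, and the way mixin entries are counted is an assumption. Only the numeric limits come from the text.

```go
package main

import (
	"errors"
	"fmt"
)

// Limits taken from the text above.
const (
	maxDevices                 = 128
	maxAttributesAndCapacities = 4096
	maxCounters                = 256
	maxConsumedCounters        = 2048
	maxMixinRefs               = 8
	maxTaintsPerDevice         = 4
)

// Device is a hypothetical, simplified stand-in that only carries counts.
type Device struct {
	Attributes, Capacities, ConsumedCounters, MixinRefs, Taints int
}

// Slice is a hypothetical stand-in for a ResourceSlice.
type Slice struct {
	Devices                      []Device
	MixinAttributesAndCapacities int // entries defined in mixins (assumed to count too)
	Counters                     int // counters across counter sets and mixins
}

// Validate checks the aggregate limits first, then the per-list limits.
func Validate(s Slice) error {
	var errs []error
	if n := len(s.Devices); n > maxDevices {
		errs = append(errs, fmt.Errorf("%d devices, limit is %d", n, maxDevices))
	}
	entries, consumed := s.MixinAttributesAndCapacities, 0
	for i, d := range s.Devices {
		entries += d.Attributes + d.Capacities
		consumed += d.ConsumedCounters
		if d.MixinRefs > maxMixinRefs {
			errs = append(errs, fmt.Errorf("device %d: %d mixin references, limit is %d", i, d.MixinRefs, maxMixinRefs))
		}
		if d.Taints > maxTaintsPerDevice {
			errs = append(errs, fmt.Errorf("device %d: %d taints, limit is %d", i, d.Taints, maxTaintsPerDevice))
		}
	}
	if entries > maxAttributesAndCapacities {
		errs = append(errs, fmt.Errorf("%d attributes and capacities, limit is %d", entries, maxAttributesAndCapacities))
	}
	if s.Counters > maxCounters {
		errs = append(errs, fmt.Errorf("%d counters, limit is %d", s.Counters, maxCounters))
	}
	if consumed > maxConsumedCounters {
		errs = append(errs, fmt.Errorf("%d consumed counters, limit is %d", consumed, maxConsumedCounters))
	}
	return errors.Join(errs...) // nil when no limit was exceeded
}

func main() {
	// A few complex devices may use far more than 32 entries each,
	// as long as the slice-wide total stays within 4096.
	s := Slice{Devices: []Device{{Attributes: 1000, Capacities: 1000}, {Attributes: 2000}}}
	fmt.Println("within limits:", Validate(s) == nil)
}
```

Because the attribute and capacity budget is shared, a slice with only a handful of devices can spend it unevenly, which is the flexibility the aggregate limits are meant to provide.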
### Test Plan
@@ -808,15 +835,6 @@ size of the ResourceSlice object. However, it also provides features that
allow drivers to represent devices and counter sets in a more compact way,
thereby potentially reducing the size of the ResourceSlice object.

- To manage the worst-case size of the `ResourceSlice` object, the following limits
- are introduced in addition to the ones already described in the
- [Partitionable Devices KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/4815-dra-partitionable-devices#will-enabling--using-this-feature-result-in-increasing-size-or-count-of-the-existing-api-objects):
- * The total number of attributes, capacities and counters defined in mixins in a
- `ResourceSlice` is limited to 256.
- * The total number of mixins allowed in `Device`, `CounterSet` and
- `DeviceCounterConsumption` is limited to 8.
-
-
###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
Flattening the devices and counter sets will require slightly more work, but