@@ -505,16 +505,24 @@ the two quota mechanisms above should keep track of the usages of the same
505505class of devices the same way.
506506
507507But currently, the extended resource quota keeps track of the devices provided
508- from device plugin, and DRA resource slice. The resource claim quota currently
509- only keeps track of the devices provided from DRA resource slice. This must be
510- enhanced to have it keep track of the devices from device plugin too.
511-
512- As a device can be requested by resource claim, or by extended resource, the
513- cluster admin MUST create two quotas with the same limit on one class of devices
514- to effectively quota the usage of that device class.
515-
516- For example, a cluster admin plans to allow 10 example.com/gpu devices in a
517- given namespace, they MUST create the following :
508+ from device plugin, and DRA resource slice requested from pod's extended resource
509+ requests. The resource claim quota currently keeps track of the devices provided
510+ from DRA resource slice requested from resource claims.
511+
512+ The extended resource quota usage needs to be adjusted to account for the device
513+ requests from resource claims. On the other side, resource claim quota has
514+ already accounted for the device requests from pod's extended resources, as
515+ the scheduler creates a special resource claim for the extended resource requests.
516+
517+ For example, before the adjustment, the quota is as below. The explicit extended
518+ resource quota `requests.example.com/gpu` counts 1 device (e.g. gpu-0) from
519+ device plugin, and 1 device (e.g. gpu-1) from DRA resource slice. The implicit
520+ extended resource quota `request.deviceclass.resource.kubernetes.io/mygpuclass`
521+ counts 1 device (e.g. gpu-2) from DRA resource slice. The resource claim quota
522+ `gpu.example.com.deviceclass.resource.k8s.io/devices` counts 1 device (e.g. gpu-3)
523+ from a pod resource claim, and 1 device (e.g. gpu-4) from a resource claim template.
524+ In addition, it also counts gpu-1 and gpu-2, as the scheduler generates extended
525+ resource claims for them.
518526
519527```yaml
520528apiVersion: v1
@@ -524,25 +532,49 @@ metadata:
524532spec:
525533 hard:
526534 requests.example.com/gpu: 10
535+ request.deviceclass.resource.kubernetes.io/mygpuclass: 10
527536 gpu.example.com.deviceclass.resource.k8s.io/devices: 10
537+ used:
538+ requests.example.com/gpu: 2
539+ request.deviceclass.resource.kubernetes.io/mygpuclass: 1
540+ gpu.example.com.deviceclass.resource.k8s.io/devices: 4
528541```
529542
530- Provided that the device class gpu.example.com is mapped to the extended
543+ Provided that the device class mygpuclass is mapped to the extended
531544resource example.com/gpu.
532545```yaml
533546apiVersion: resource.k8s.io/v1
534547kind: DeviceClass
535548metadata:
536- name: gpu.example.com
549+ name: mygpuclass
537550spec:
538551 extendedResourceName: example.com/gpu
539552```
540553
541- Resource Quota controller reconciles away the differences if any between the
542- usage of the two quota, and ensures their usage are always kept the same. For
543- that, the controller needs to have the permission to list the device classes
544- in the cluster to establish the mapping between device class and extended
545- resource.
554+ For the same example, the explicit extended resource quota `requests.example.com/gpu`
555+ needs to be adjusted to count in the devices requested from implicit extended resource
556+ (e.g. gpu-2) and from resource claims (e.g. gpu-3 and gpu-4). The implicit extended
557+ resource quota `request.deviceclass.resource.kubernetes.io/mygpuclass` needs to be
558+ adjusted to count in the devices requested from resource claims (e.g. gpu-3 and gpu-4),
559+ and the DRA devices requested from explicit extended resources (e.g. gpu-1), but
560+ not the device plugin devices (e.g. gpu-0). The adjusted quota is as below.
561+
562+
563+ ```yaml
564+ apiVersion: v1
565+ kind: ResourceQuota
566+ metadata:
567+ name: gpu
568+ spec:
569+ hard:
570+ requests.example.com/gpu: 10
571+ request.deviceclass.resource.kubernetes.io/mygpuclass: 10
572+ gpu.example.com.deviceclass.resource.k8s.io/devices: 10
573+ used:
574+ requests.example.com/gpu: 5
575+ request.deviceclass.resource.kubernetes.io/mygpuclass: 4
576+ gpu.example.com.deviceclass.resource.k8s.io/devices: 4
577+ ```
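
The accounting in the example can be checked with a small model. The following is only an illustrative sketch, not the quota controller's implementation: the `Device` type, the `source` and `requested_via` labels, and both helper functions are hypothetical names introduced here. It assumes the five devices gpu-0 to gpu-4 described in the example and reproduces the `used` values before and after the adjustment.

```python
from dataclasses import dataclass

# Quota keys exactly as they appear in the example ResourceQuota.
EXPLICIT = "requests.example.com/gpu"
IMPLICIT = "request.deviceclass.resource.kubernetes.io/mygpuclass"
CLAIM = "gpu.example.com.deviceclass.resource.k8s.io/devices"


@dataclass(frozen=True)
class Device:
    """Hypothetical record of one allocated device (illustration only)."""
    name: str
    source: str         # "device-plugin" or "dra"
    requested_via: str  # "explicit-extended", "implicit-extended", or "claim"


# The five devices from the example text.
DEVICES = [
    Device("gpu-0", "device-plugin", "explicit-extended"),
    Device("gpu-1", "dra", "explicit-extended"),
    Device("gpu-2", "dra", "implicit-extended"),
    Device("gpu-3", "dra", "claim"),  # pod resource claim
    Device("gpu-4", "dra", "claim"),  # resource claim template
]


def usage_before(devices):
    """Usage before the adjustment: each extended resource quota only
    sees requests made through its own resource name."""
    return {
        EXPLICIT: sum(d.requested_via == "explicit-extended" for d in devices),
        IMPLICIT: sum(d.requested_via == "implicit-extended" for d in devices),
        # The scheduler generates claims for DRA-backed extended resource
        # requests, so every DRA device is already counted here.
        CLAIM: sum(d.source == "dra" for d in devices),
    }


def usage_after(devices):
    """Usage after the adjustment described in the text."""
    return {
        # Explicit quota counts every device of the class.
        EXPLICIT: len(devices),
        # Implicit quota counts all DRA devices, but not device plugin ones.
        IMPLICIT: sum(d.source == "dra" for d in devices),
        # Resource claim quota is unchanged.
        CLAIM: sum(d.source == "dra" for d in devices),
    }


if __name__ == "__main__":
    print(usage_before(DEVICES))
    print(usage_after(DEVICES))
```

In this model the only device that separates the explicit quota from the other two after the adjustment is gpu-0, the device plugin device, which is why the explicit quota reads 5 while the implicit and resource claim quotas both read 4.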
546578
547579### Scheduling for Extended Resource backed by DRA
548580