Commit 670de3e

Update README.md
1 parent 519667a commit 670de3e

1 file changed

+13
-5
lines changed

keps/sig-scheduling/5007-device-attach-before-pod-scheduled/README.md

Lines changed: 13 additions & 5 deletions
@@ -558,21 +558,27 @@ Composable DRA controller also adds a model name and so on into the attributes of
558558
![composable-resourceslice](composable-resourceslice.png)
559559

560560
### Alternative approach
561-
Instead of implementing the solution within the scheduler, we propose using the Cluster Autoscaler to manage the attachment and detachment of fabric devices.
562-
561+
Instead of implementing the solution within the scheduler, we can use a "device autoscaler", a device-level counterpart of the Cluster Autoscaler (CA).
563562
The key points and main process flow of this alternative proposal are as follows:
564563

565564
The scheduler references only node-local ResourceSlices.
566565
If there are no available resources in the node-local ResourceSlices, the scheduler marks the Pod as unschedulable without waiting in the PreBind phase of the ResourceClaim.
566+
The device autoscaler then tries to attach new devices.
567+
It also tries to detach devices that have not been used for a period of time.
568+
This is similar to the concept of CA.
569+
570+
However, if CA and the device autoscaler run independently, CA may add a node with a device at the same time as the device autoscaler attaches a device.
571+
This results in a redundant resource addition.
572+
Therefore, the idea is to put the device-scaling functionality into CA.
567573

568-
To handle fabric resources, we implement the Processor for composable system within CA.
574+
To handle fabric resources in CA, we implement a Processor for composable systems within CA.
569575
This Processor identifies unschedulable Pods and determines if attaching a fabric ResourceSlice device to an existing node would make scheduling possible.
570576
If so, the Processor instructs the attachment of the resource, using the composable Operator for the actual attachment process.
571577
If attaching the fabric ResourceSlice does not make scheduling possible, the Processor determines whether to add a new node as usual.
572578

573579
After the device is attached, the vendor DRA updates the node-local ResourceSlices.
574-
The vendor DRA needs a rescan function to update the Pool/ResourceSlice. The scheduler can then assign the node-local ResourceSlice devices to the unschedulable Pod, operating the same as the usual DRA from this point.
575-
580+
The vendor DRA needs a rescan function to update the Pool/ResourceSlice.
581+
The scheduler can then assign the node-local ResourceSlice devices to the unschedulable Pod, operating the same as the usual DRA from this point.
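
The Processor's decision flow described above can be illustrated with a minimal sketch. It is not the Cluster Autoscaler's actual API (CA is written in Go): `Node`, `Decision`, `decide`, `free_device_slots`, and `fabric_free` are hypothetical names, and the check for "attachment would make scheduling possible" is reduced here to a simple capacity comparison, which is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Action(Enum):
    ATTACH_DEVICE = "attach_device"  # attach a fabric device to an existing node
    ADD_NODE = "add_node"            # fall back to the usual CA scale-up


@dataclass
class Node:
    name: str
    free_device_slots: int  # hypothetical: how many more fabric devices fit on the node


@dataclass
class Decision:
    action: Action
    node: Optional[str] = None  # target node when attaching, otherwise None


def decide(requested_devices: int, nodes: List[Node], fabric_free: int) -> Decision:
    """Choose an action for one unschedulable Pod requesting `requested_devices`."""
    # Assumption: the fabric pool must hold enough unattached devices.
    if requested_devices <= fabric_free:
        for node in nodes:
            # Assumption: a node qualifies if it has enough free device slots.
            if node.free_device_slots >= requested_devices:
                return Decision(Action.ATTACH_DEVICE, node.name)
    # Attaching would not make the Pod schedulable: add a node as usual.
    return Decision(Action.ADD_NODE)
```

With nodes `node-a` (no free slots) and `node-b` (two free slots), a Pod requesting one device is answered with an attach on `node-b`, while a request for three devices falls back to adding a node. Keeping both outcomes in one decision function reflects the reason for placing the logic in CA: a single component chooses between attaching a device and adding a node, so the two cannot happen redundantly.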
576582

577583
### Test Plan
578584

@@ -665,6 +671,8 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
665671
- Gather feedback from developers and surveys
666672
- Resolve the following issues
667673
- Scheduler does not guarantee to pick up the same node for the Pod after the restart
674+
- If the Scheduler picks another node for the Pod after the restart, devices are unnecessarily left attached to the original node
675+
(The Composable DRA controller needs a function that automatically detaches a device when no Pod has used it for a certain period of time)
668676
- Pods which are not bound yet (in api-server) and not unschedulable (in api-server) are not visible to the cluster autoscaler, so there is a risk that the node will be turned down
669677
- The in-flight events cache may grow too large when waiting in PreBind
670678
- Additional tests are in Testgrid and linked in KEP
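
The automatic detach function mentioned in the list above (detaching a device that no Pod has used for a certain period) could look roughly like the following sketch. `AttachedDevice`, `devices_to_detach`, and `idle_timeout` are hypothetical names, and tracking a `last_used` timestamp per device is an assumption; the KEP does not specify how idleness is measured.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AttachedDevice:
    name: str
    node: str
    in_use_by: Optional[str]  # name of the Pod using the device, or None
    last_used: float          # hypothetical: time of last use, in seconds


def devices_to_detach(devices: List[AttachedDevice], now: float,
                      idle_timeout: float) -> List[str]:
    """Return the names of devices that are unused and idle for at least `idle_timeout`."""
    return [
        d.name
        for d in devices
        # Never detach a device that a Pod currently uses.
        if d.in_use_by is None and now - d.last_used >= idle_timeout
    ]
```

A controller loop could call `devices_to_detach` periodically and pass the result to the component that performs the actual detachment. This would also cover the case where the Scheduler picks a different node after a restart: the device left on the original node is never used, so it eventually exceeds the timeout.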
