Commit ff1ac0a

Update the Proposal description
1 parent 2d6e1d4 commit ff1ac0a

2 files changed

+21
-18
lines changed

keps/sig-scheduling/5007-device-attach-before-pod-scheduled/README.md

Lines changed: 21 additions & 18 deletions
Original file line number | Diff line number | Diff line change
@@ -92,6 +92,7 @@ tags, and then generate with `hack/update-toc.sh`.
9292
- [DRA Scheduler Plugin Design Overview](#dra-scheduler-plugin-design-overview)
9393
- [Device Attribute Additions](#device-attribute-additions)
9494
- [<code>AllocatedDeviceStatus</code> Additions](#allocateddevicestatus-additions)
95+
- [Scheduler DRA plugin Additions](#scheduler-dra-plugin-additions)
9596
- [PreBind Phase Timeout](#prebind-phase-timeout)
9697
- [Handling ResourceSlices Upon Failure of Attachment](#handling-resourceslices-upon-failure-of-attachment)
9798
- [Composable Controller Design Overview](#composable-controller-design-overview)
@@ -163,7 +164,7 @@ Items marked with (R) are required *prior to targeting to a milestone / release*
163164
To achieve efficient management of fabric devices, we propose adding the following features to the Kubernetes scheduler's DRA plugin.
164165
Fabric devices are those that are not directly connected to the server and require attachment to the server for use.
165166

166-
In the DRA current implementation, fabric devices are attached after the scheduling decision, which leads to the following issue:
167+
In the current DRA implementation, fabric devices are attached after the scheduling decision, which leads to the following issue:
167168

168169
Fabric devices may be contested by other clusters.
169170
In scenarios where attachment occurs after scheduling, there is a risk that the resource cannot be attached at the time of attachment, causing the container to remain in the "Container Creating" state.
@@ -180,7 +181,7 @@ Recently, a new server architecture called Composable Disaggregated Infrastructu
180181

181182
In a traditional server, hardware resources such as CPUs, memory, and GPUs reside within the server.
182183
Composable Disaggregated Infrastructure decomposes these hardware resources and makes them available as resource pools.
183-
We can combine these resource by software definition so that we can create custom-made servers.
184+
We can combine these resources by software definition so that we can create custom-made servers.
184185

185186
Composable system is composed of resource pool and Composable Manager software.
186187
In Resource pool all components are connected to PCIe or CXL switches.
@@ -224,25 +225,26 @@ and make progress.
224225

225226
The basic idea is the following:
226227

227-
1. **Add a Ready Flag to ResourceClaim**:
228-
- Add a flag to `ResourceClaim` that indicates the readiness state of the device.
229-
The `PreBind` phase will be held until this flag is set to "Ready".
228+
1. **Adding Attributes to ResourceSlice**:
229+
- Add an attribute to `ResourceSlice` to indicate fabric devices. This key is predefined as part of the attributes.
230230

231-
2. **Wait for Device Attachment Completion in the PreBind() Process**:
232-
The overall flow of the PreBind() process is as follows:
231+
2. **Waiting for Device Attachment in PreBind**:
232+
- For fabric devices, the scheduler waits for the device attachment to complete during the `PreBind` phase.
233233

234-
- **Update ResourceClaim**:
235-
- The scheduler updates the `ResourceClaim` to notify the vendor's driver that the device needs to be prepared.
236-
This process is the same as the existing `PreBind`.
237-
- After updating the `ResourceClaim`, if the flag is set to "Preparing", the completion of the `PreBind` phase will be held until the flag is set to "Ready".
234+
3. **PreBind Process**:
235+
The overall flow of the `PreBind` process is as follows:
236+
237+
- **Updating ResourceClaim**:
238+
- The scheduler DRA plugin updates the `ResourceClaim` to notify the Composable DRA Controllers that device attachment is needed.
239+
This is the same as the existing `PreBind` process.
240+
- In addition to the existing operations, the update to the `ResourceClaim` includes setting the necessary values in the `AllocatedDeviceStatus` conditions.
238241

239242
- **Monitoring and Preparation by Composable DRA Controllers**:
240-
- Composable DRA Controllers monitor the `ResourceClaim`.
241-
If a device that requires preparation is associated with the `ResourceClaim`, they perform the necessary preparations.
242-
Once the preparation is complete, they set the flag to "Ready".
243+
- Composable DRA Controllers monitor the `ResourceClaim`. If a device that requires preparation is associated with the `ResourceClaim`, they perform the necessary preparations.
244+
- Once the preparation is complete, they set the conditions to `true`.
243245

244246
- **Completion of the PreBind Phase**:
245-
- Once the flag is set to "Ready", the `PreBind` phase is completed, and the scheduler proceeds to the next step.
247+
- Once all conditions are met, the `PreBind` phase is completed, and the scheduler proceeds to the next step.
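
The completion check described in the steps above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: `Condition` is a simplified stand-in for `metav1.Condition`, and `AttachState`, `evaluate`, and `preBindReady` are hypothetical names introduced only for this example. The condition type strings are the ones named in this KEP.

```go
package main

import "fmt"

// Condition is a simplified stand-in for metav1.Condition (assumption:
// only Type and Status matter for this sketch).
type Condition struct {
	Type   string
	Status string // "True", "False", or "Unknown"
}

// Condition types as named in the KEP.
const (
	DRADeviceNeedAttachType = "kubernetes.io/needs-attaching"
	DRADeviceIsAttachType   = "kubernetes.io/is-attached"
	DRADeviceAttachFailType = "kubernetes.io/attach-failed"
)

// AttachState is a hypothetical summary of one device's conditions.
type AttachState int

const (
	AttachPending AttachState = iota
	AttachDone
	AttachFailed
)

// evaluate maps a device's conditions to an AttachState.
// A true attach-failed condition takes precedence over is-attached.
func evaluate(conds []Condition) AttachState {
	state := AttachPending
	for _, c := range conds {
		if c.Status != "True" {
			continue
		}
		switch c.Type {
		case DRADeviceAttachFailType:
			return AttachFailed
		case DRADeviceIsAttachType:
			state = AttachDone
		}
	}
	return state
}

// preBindReady reports whether every device is attached (ready) or whether
// any device has reported an attachment failure (failed).
func preBindReady(devices [][]Condition) (ready, failed bool) {
	ready = true
	for _, conds := range devices {
		switch evaluate(conds) {
		case AttachFailed:
			return false, true
		case AttachPending:
			ready = false
		}
	}
	return ready, false
}

func main() {
	devices := [][]Condition{
		{
			{Type: DRADeviceNeedAttachType, Status: "True"},
			{Type: DRADeviceIsAttachType, Status: "True"},
		},
		{
			{Type: DRADeviceNeedAttachType, Status: "True"}, // still waiting
		},
	}
	ready, failed := preBindReady(devices)
	fmt.Println(ready, failed)
}
```

In this sketch, a single failed device short-circuits the check so the scheduler could stop waiting immediately, while a pending device only keeps `PreBind` from completing.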
246248

247249
### User Stories (Optional)
248250

@@ -288,8 +290,7 @@ This document outlines the design of the DRA Scheduler Plugin, focusing on the h
288290
Key additions include new attributes for device identification, enhancements to `AllocatedDeviceStatus`, and the process for handling `ResourceSlices` upon attachment failure.
289291
The composable controller design is also discussed, emphasizing efficient utilization of fabric devices.
290292

291-
//TODO update figure
292-
![proposal](proposal2.JPG)
293+
![proposal](proposal.jpg)
293294

294295
#### Device Attribute Additions
295296

@@ -348,13 +349,15 @@ type AllocatedDeviceStatus struct {
348349
// +optional
349350
NodeName string
350351
}
352+
351353
const(
352354
DRADeviceNeedAttachType = "kubernetes.io/needs-attaching"
353355
DRADeviceIsAttachType = "kubernetes.io/is-attached"
354356
DRADeviceAttachFailType = "kubernetes.io/attach-failed"
355357
)
356358
```
357359

360+
#### Scheduler DRA plugin Additions
358361
When `kubernetes.io/needs-attaching: true` is set, the scheduler DRA plugin is expected to do the following:
359362

360363
1. Set `AllocatedDeviceStatus.NodeName`.
@@ -368,7 +371,7 @@ This issue will be addressed separately as outlined in kubernetes/kubernetes#129
368371
#### PreBind Phase Timeout
369372

370373
If the device attachment is successful, we expect it to take no longer than 5 minutes.
371-
Therefore, if we set a fixed timeout for the scheduler, we would like to set it to 10 minutes.
374+
However, to account for potential update lags, we would like to set a fixed timeout for the scheduler to 10 minutes.
372375

373376
Even if the conditions `Type: kubernetes.io/is-attached` or `Type: kubernetes.io/attach-failed` are not updated, setting a timeout will prevent the scheduler from waiting indefinitely in the PreBind phase.
374377
