Commit d3fdfbd

Update README.md
Update some description about BindingConditions
1 parent 734244a commit d3fdfbd

File tree

1 file changed

+38
-35
lines changed
  • keps/sig-scheduling/5007-device-attach-before-pod-scheduled

keps/sig-scheduling/5007-device-attach-before-pod-scheduled/README.md

Lines changed: 38 additions & 35 deletions
@@ -231,25 +231,26 @@ and make progress.
231231

232232
The basic idea is the following:
233233

234-
1. **Adding Attributes to ResourceSlice**:
235-
- Add an attribute to `ResourceSlice` to indicate fabric devices. This key is predefined as part of the attributes.
234+
1. **Adding BindingConditions and BindingFailureConditions to ResourceSlice**:
235+
- Add conditions to `ResourceSlice` to indicate the device needs some preparation before the scheduler proceeds `Bind` Phase.
236+
- For example, in a composable system, it is necessary to attach devices to nodes.
237+
- DRA driver can set any condition to BindingConditions or BindingFailureConditions depending on the characteristics of the device it manages.
236238

237239
2. **Waiting for Device Attachment in PreBind**:
238-
- For fabric devices, the scheduler waits for the device attachment to complete during the `PreBind` phase.
240+
- The scheduler waits until all Conditions in BindingConditions are True.
241+
- For fabric devices, this means that the scheduler waits for the device attachment to complete during the `PreBind` phase.
239242

240243
3. **PreBind Process**:
241244
The overall flow of the `PreBind` process is as follows:
242245

243246
- **Updating ResourceClaim**:
244-
- The scheduler DRA plugin updates the `ResourceClaim` to notify the Composable DRA Controllers that device attachment is needed.
245-
This is the same as the existing `PreBind` process.
246-
- In addition to the existing operations, the update to the `ResourceClaim` includes setting the necessary values in the `AllocatedDeviceStatus` conditions.
247+
- The scheduler DRA plugin copies `BindingConditions` and `BindingFailureConditions` from `ResourceSlice.Device.Basic` to `AllocatedDeviceStatus.Conditions`.
247248

248249
- **Monitoring and Preparation by Composable DRA Controllers**:
249250
- Composable DRA Controllers monitor the `ResourceClaim`. If a device that requires preparation is associated with the `ResourceClaim`, they perform the necessary preparations.
250251
- Once the preparation is complete, they set the conditions to `true`.
251252
- Please note that the scheduler need to abandon binding after the attach is complete in the case of a composable system.
252-
Therefore, Composable DRA Controller sets the condition in BindingFailureGates to true after the attach is complete.
253+
Therefore, Composable DRA Controller sets the condition in BindingFailureConditions to true after the attach is complete.
253254

254255
- **Completion of the PreBind Phase**:
255256
- Once all conditions are met, the `PreBind` phase is completed, and the scheduler proceeds to the next step.
@@ -366,51 +367,56 @@ type BasicDevice struct {
366367
// +optional
367368
Attributes map[QualifiedName]DeviceAttribute
368369

369-
// BindingGates defines the gates for binding.
370+
// BindingConditions defines the conditions for binding.
370371
//
371372
// +optional
372-
BindingGates []string
373+
BindingConditions []string
373374

374-
// BindingFailureGates defines the gates for binding failure.
375+
// BindingFailureConditions defines the conditions for binding failure.
375376
//
376377
// +optional
377-
BindingFailureGates []string
378+
BindingFailureConditions []string
378379

379380
// UsageRestrictedToNode indicates if the usage of an allocation involving this device
380381
// has to be limited to exactly the node that was chosen when allocating the claim.
381382
//
382383
// +optional
383384
UsageRestrictedToNode bool
384385

385-
// BindingTimeout indicates the prepare timeout period(minute).
386+
// BindingTimeoutSeconds indicates the prepare timeout period.
386387
// If the timeout period is exceeded, the scheduler clears the allocation in the ResourceClaim and reschedules the Pod.
387388
//
388389
// +optional
389-
BindingTimeout int
390+
BindingTimeoutSeconds *metav1.Duration
390391
}
392+
393+
const (
394+
BindingConditionsMaxSize = 4
395+
BindingFailureConditionsMaxSize = 4
396+
)
391397
```
392398

393-
#### AllocatedDeviceStatus Enhancements
399+
#### AllocatedDeviceStatus.Conditions Enhancements
394400

395-
The `BindingGates` and `BindingFailureGates` fields within `AllocatedDeviceStatus` are used to indicate the status of the device attachment.
401+
The `BindingConditions` and `BindingFailureConditions` fields within `AllocatedDeviceStatus.Conditions` are used to indicate the status of the device attachment.
396402
These fields will contain a list of conditions, each representing a specific state or event related to the device.
397403

398404
For this feature, following fields are added:
399405

400406
```go
401-
// AllocatedDeviceStatus contains the status of an allocated device, if the
407+
// AllocatedDeviceStatus.Conditions contains the status of an allocated device, if the
402408
// driver chooses to report it. This may include driver-specific information.
403-
type AllocatedDeviceStatus struct {
409+
type AllocatedDeviceStatus.Conditions struct {
404410
...
405-
// BindingGates defines the gates for binding.
411+
// BindingConditions defines the conditions for binding.
406412
//
407413
// +optional
408-
BindingGates map[string]bool
414+
BindingConditions map[string]bool
409415

410-
// BindingFailureGates defines the gates for binding failure.
416+
// BindingFailureConditions defines the conditions for binding failure.
411417
//
412418
// +optional
413-
BindingFailureGates map[string]bool
419+
BindingFailureConditions map[string]bool
414420
}
415421
```
416422

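Editor's sketch (not part of the commit): `BasicDevice` lists condition names as `[]string`, while the status side stores them as `map[string]bool`. The copy between the two can be pictured roughly as below. The helper name `initConditions` is hypothetical, and two things are assumptions rather than statements from the KEP: that the `MaxSize` constants bound the length of each list, and that copied conditions start as `false`.

```go
package main

import "fmt"

// Limits taken from the constants added next to BasicDevice in this commit.
const (
	BindingConditionsMaxSize        = 4
	BindingFailureConditionsMaxSize = 4
)

// initConditions is a hypothetical helper. It turns the condition names
// listed on a device into a status map with every entry set to false.
// Rejecting lists longer than maxSize is an assumption about how the
// MaxSize constants are meant to be used.
func initConditions(names []string, maxSize int) (map[string]bool, error) {
	if len(names) > maxSize {
		return nil, fmt.Errorf("got %d conditions, at most %d allowed", len(names), maxSize)
	}
	out := make(map[string]bool, len(names))
	for _, name := range names {
		out[name] = false
	}
	return out, nil
}

func main() {
	conds, err := initConditions(
		[]string{"dra.example.com/is-prepared"},
		BindingConditionsMaxSize,
	)
	fmt.Println(conds, err) // map[dra.example.com/is-prepared:false] <nil>
}
```
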
@@ -420,28 +426,25 @@ When `UsageRestrictedToNode: true` is set, the scheduler DRA plugin will perform
420426

421427
1. **Set NodeSelector**: Before the `PreBind` phase, add the `NodeName` to the `ResourceClaim`'s `NodeSelector`.
422428

423-
If Gates are present, the scheduler DRA plugin will perform the following steps during the `PreBind` phase:
429+
If Conditions are present, the scheduler DRA plugin will perform the following steps during the `PreBind` phase:
424430

425-
2. **Copy Gates**: Copy `BindingGates` and `BindingFailureGates` from `ResourceSlice.Device.Basic` to `AllocatedDeviceStatus`.
431+
2. **Copy Conditions**: Copy `BindingConditions` and `BindingFailureConditions` from `ResourceSlice.Device.Basic` to `AllocatedDeviceStatus`.
426432
3. **Wait for Conditions**: Wait for the following conditions:
427-
- Wait until all conditions in the BindingGates are `True` before proceeding to Bind.
428-
- If any one of the conditions in the BindingFailureGates becomes `True`, clear the allocation in the `ResourceClaim` and reschedule the Pod.
429-
- If the preparation of a device takes longer than the `BindingTimeout` period, clear the allocation in the `ResourceClaim` and reschedule the Pod.
433+
- Wait until all conditions in the BindingConditions are `True` before proceeding to Bind.
434+
- If any one of the conditions in the BindingFailureConditions becomes `True`, clear the allocation in the `ResourceClaim` and reschedule the Pod.
435+
- If the preparation of a device takes longer than the `BindingTimeoutSeconds` period, clear the allocation in the `ResourceClaim` and reschedule the Pod.
430436

431-
To support these steps, for example, a DRA driver can include the following definitions in BindingGates or BindingFailureGates within a ResourceSlice:
437+
To support these steps, for example, a DRA driver can include the following definitions in BindingConditions or BindingFailureConditions within a ResourceSlice:
432438

433439
```go
434440
const (
435-
// NeedToPreparing indicates that this device needs some preparation.
436-
NeedToPreparing = "kubernetes.io/need-to-preparing"
437-
438441
// IsPrepared indicates the device ready state.
439442
// If NeedToPreparing is True and IsPrepared is True, the scheduler proceeds to Bind.
440-
IsPrepared = "kubernetes.io/is-prepared"
443+
IsPrepared = "dra.example.com/is-prepared"
441444

442445
// PreparingFailed indicates the device preparation failed state.
443446
// If PreparingFailed is True, the scheduler will clear the allocation in the ResourceClaim and reschedule the Pod.
444-
PreparingFailed = "kubernetes.io/preparing-failed"
447+
PreparingFailed = "dra.example.com/preparing-failed"
445448
)
446449
```
447450

@@ -464,10 +467,10 @@ During the scheduling cycle, the DRA plugin reserves a `ResourceSlice` for the `
464467
In the binding cycle, the reserved `ResourceSlice` is assigned during `PreBind`.
465468

466469
If a fabric device is selected, the scheduler waits for the device attachment during `PreBind`.
467-
The composable controller performs the attachment operation by checking the flag of the `ResourceClaim`.
470+
The composable controller performs the attachment operation by checking the flag of BindingConditions in the `ResourceClaim`.
468471
If the attachment fails, the following steps are taken:
469472

470-
1. **Update ResourceClaim**: The composable controller updates the `AllocatedDeviceStatus` to indicate the failure of the attachment by setting a condition with `Type: kubernetes.io/attach-failed` and `Status: True`.
473+
1. **Update ResourceClaim**: The composable controller updates the `AllocatedDeviceStatus.Conditions` to indicate the failure of the attachment by setting a condition in BindingFailureConditions to `True`.
471474
2. **Fail the Binding Cycle**: The scheduler detects the failed attachment condition and fails the binding cycle. This prevents the pod from proceeding with an unattached device.
472475
3. **Unbind ResourceClaim and ResourceSlice**: The scheduler DRA plugin unbinds the `ResourceClaim` and `ResourceSlice` in `Unreserve`, clearing the allocation to prevent the fabric device from being used in the `ResourceClaim`.
473476
4. **Retry Scheduling**: In the next scheduling cycle, the scheduler attempts to bind the `ResourceClaim` again.
@@ -767,7 +770,7 @@ well as the [existing list] of feature gates.
767770
-->
768771

769772
- [x] Feature gate (also fill in values in `kep.yaml`)
770-
- Feature gate name: DRAPrebindingGates
773+
- Feature gate name: DRAPrebindingConditions
771774
- Components depending on the feature gate: kube-scheduler
772775
- [ ] Other
773776
- Describe the mechanism:
