File: keps/sig-scheduling/5007-device-attach-before-pod-scheduled/README.md
Lines changed: 15 additions & 15 deletions
@@ -330,7 +330,7 @@ This issue needs to be resolved before the beta is released.
 ### DRA Scheduler Plugin Design Overview
 
 This document outlines the design of the DRA Scheduler Plugin, focusing on the handling of fabric devices.
-Key additions include `BindingConditions` and `BindingFailureConditions` for device identification and preparetion, enhancements to `AllocatedDeviceStatus`, and the process for handling `ResourceSlices` upon attachment failure.
+Key additions include `BindingConditions` and `BindingFailureConditions` for device identification and preparation, enhancements to `AllocatedDeviceStatus`, and the process for handling `ResourceSlices` upon attachment failure.
 The composable controller design is also discussed, emphasizing efficient utilization of fabric devices.
 
 
@@ -343,7 +343,7 @@ These fields will be used by the controller that exposes the `ResourceSlice` to
 ```go
 // BasicDevice represents a basic device instance.
 type BasicDevice struct {
-    // BindingConditions defines the conditions for binding.
+    // BindingConditions defines the conditions for proceeding with binding. All listed conditions must be True to proceed with binding.
     //
     // +optional
     BindingConditions []string
@@ -360,11 +360,11 @@ type BasicDevice struct {
     // +optional
     UsageRestrictedToNode bool
 
-    // BindingTimeoutSeconds indicates the prepare timeout period.
-    // If the timeout period is exceeded, the scheduler clears the allocation in the ResourceClaim and reschedules the Pod.
+    // BindingTimeout indicates the prepare timeout period.
+    // If the timeout period is exceeded before all BindingConditions reach a True state, the scheduler clears the allocation in the ResourceClaim and reschedules the Pod.
     //
     // +optional
-    BindingTimeoutSeconds *metav1.Duration
+    BindingTimeout *metav1.Duration
 }
 
 const (
@@ -385,28 +385,28 @@ For this feature, following fields are added:
 // driver chooses to report it. This may include driver-specific information.
 type AllocatedDeviceStatus struct {
     ...
-    // BindingConditions defines the conditions for binding.
+    // BindingConditions defines the conditions for proceeding with binding. All listed conditions must be True to proceed with binding.
     //
     // +optional
     BindingConditions []string
 
     // BindingFailureConditions defines the conditions for binding failure.
-    // If true, a binding failure occurred.
+    // If any is True, a binding failure occurred.
     //
     // +optional
-    BindingFailureConditions []stirng
+    BindingFailureConditions []string
 
     // UsageRestrictedToNode indicates if the usage of an allocation involving this device
     // has to be limited to exactly the node that was chosen when allocating the claim.
     //
     // +optional
     UsageRestrictedToNode bool
 
-    // BindingTimeoutSeconds indicates the prepare timeout period.
-    // If the timeout period is exceeded, the scheduler clears the allocation in the ResourceClaim and reschedules the Pod.
+    // BindingTimeout indicates the prepare timeout period.
+    // If the timeout period is exceeded before all BindingConditions reach a True state, the scheduler clears the allocation in the ResourceClaim and reschedules the Pod.
     //
     // +optional
-    BindingTimeoutSeconds *metav1.Duration
+    BindingTimeout *metav1.Duration
 }
 ```
 
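Reviewer note: the renamed `BindingTimeout` field appears in both `BasicDevice` and `AllocatedDeviceStatus` because the KEP's `PreBind` flow copies the four binding-related fields from one to the other. The following is a minimal sketch of that copy, not the scheduler's actual code. It assumes simplified local struct types, uses `*time.Duration` in place of `*metav1.Duration`, and introduces a hypothetical helper `copyBindingFields` and hypothetical condition names under `example.com/`.

```go
package main

import (
	"fmt"
	"time"
)

// BasicDevice is a simplified stand-in for the KEP's BasicDevice (assumption:
// only the four binding-related fields are modeled here).
type BasicDevice struct {
	BindingConditions        []string
	BindingFailureConditions []string
	UsageRestrictedToNode    bool
	BindingTimeout           *time.Duration // the KEP uses *metav1.Duration
}

// AllocatedDeviceStatus is a simplified stand-in for the KEP's
// AllocatedDeviceStatus, again limited to the binding-related fields.
type AllocatedDeviceStatus struct {
	BindingConditions        []string
	BindingFailureConditions []string
	UsageRestrictedToNode    bool
	BindingTimeout           *time.Duration
}

// copyBindingFields is a hypothetical helper that copies the binding-related
// fields from the device to its allocation status. Slices and the timeout
// pointer are duplicated so later changes to the source do not leak through.
func copyBindingFields(src BasicDevice) AllocatedDeviceStatus {
	dst := AllocatedDeviceStatus{
		BindingConditions:        append([]string(nil), src.BindingConditions...),
		BindingFailureConditions: append([]string(nil), src.BindingFailureConditions...),
		UsageRestrictedToNode:    src.UsageRestrictedToNode,
	}
	if src.BindingTimeout != nil {
		t := *src.BindingTimeout
		dst.BindingTimeout = &t
	}
	return dst
}

func main() {
	timeout := 10 * time.Minute
	dev := BasicDevice{
		// Condition names below are illustrative only.
		BindingConditions:        []string{"example.com/is-prepared"},
		BindingFailureConditions: []string{"example.com/preparing-failed"},
		UsageRestrictedToNode:    true,
		BindingTimeout:           &timeout,
	}
	status := copyBindingFields(dev)
	fmt.Println(status.BindingConditions, status.BindingFailureConditions,
		status.UsageRestrictedToNode, *status.BindingTimeout)
}
```

Copying by value rather than sharing slices is a design choice of this sketch; the KEP text only states that the fields are copied.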
@@ -418,11 +418,11 @@ When `UsageRestrictedToNode: true` is set, the scheduler DRA plugin will perform
 
 If Conditions are present, the scheduler DRA plugin will perform the following steps during the `PreBind` phase:
 
-2. **Copy Conditions**: Copy `UsageRestrictedToNode`, `BindingTimeoutSeconds`, `BindingConditions` and `BindingFailureConditions` from `ResourceSlice.Device.Basic` to `AllocatedDeviceStatus`.
+2. **Copy Conditions**: Copy `UsageRestrictedToNode`, `BindingTimeout`, `BindingConditions` and `BindingFailureConditions` from `ResourceSlice.Device.Basic` to `AllocatedDeviceStatus`.
 3. **Wait for Conditions**: Wait for the following conditions:
    - Wait until all conditions in the BindingConditions are `True` before proceeding to Bind.
    - If any one of the conditions in the BindingFailureConditions becomes `True`, clear the allocation in the `ResourceClaim` and reschedule the Pod.
-   - If the preparation of a device takes longer than the `BindingTimeoutSeconds` period, clear the allocation in the `ResourceClaim` and reschedule the Pod.
+   - If the preparation of a device takes longer than the `BindingTimeout` period, clear the allocation in the `ResourceClaim` and reschedule the Pod.
 
 To support these steps, for example, a DRA driver can include the following definitions in BindingConditions or BindingFailureConditions within a ResourceSlice:
 
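Reviewer note: the driver-side definitions referenced by the last context line fall outside this hunk. Separately, the three "Wait for Conditions" rules in step 3 can be read as a single decision function. The sketch below is one possible reading, not the plugin's implementation. The `Outcome` type, the `decide` function, and the map-based condition status are all hypothetical, and checking failure conditions before success conditions is an assumption, since the KEP text shown here does not state a precedence.

```go
package main

import (
	"fmt"
	"time"
)

// Outcome is a hypothetical result type for one evaluation of the wait step.
type Outcome string

const (
	Wait       Outcome = "wait"       // keep waiting in PreBind
	Bind       Outcome = "bind"       // all BindingConditions are True
	Reschedule Outcome = "reschedule" // clear the allocation and reschedule the Pod
)

// decide evaluates the step-3 rules once. status maps a condition type to
// whether it is currently True (a simplification; a real status would carry
// full condition objects). Missing entries count as not True.
func decide(status map[string]bool, bindingConditions, failureConditions []string,
	elapsed, timeout time.Duration) Outcome {
	// Any failure condition being True triggers a reschedule (assumed to take
	// precedence over success).
	for _, c := range failureConditions {
		if status[c] {
			return Reschedule
		}
	}
	// Binding proceeds only when every listed binding condition is True.
	allTrue := true
	for _, c := range bindingConditions {
		if !status[c] {
			allTrue = false
			break
		}
	}
	if allTrue {
		return Bind
	}
	// Timeout exceeded before all binding conditions became True.
	if elapsed > timeout {
		return Reschedule
	}
	return Wait
}

func main() {
	binding := []string{"example.com/is-prepared"}      // illustrative name
	failure := []string{"example.com/preparing-failed"} // illustrative name
	timeout := 10 * time.Minute

	fmt.Println(decide(map[string]bool{}, binding, failure, time.Minute, timeout))
	fmt.Println(decide(map[string]bool{"example.com/is-prepared": true}, binding, failure, time.Minute, timeout))
	fmt.Println(decide(map[string]bool{}, binding, failure, 11*time.Minute, timeout))
}
```

A caller would invoke `decide` each time the claim's device status changes or a timer fires, and act on the returned outcome.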
@@ -669,7 +669,7 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
 #### Beta
 
 - Gather feedback from developers and surveys
-- Resolove the following issues
+- Resolve the following issues
   - Scheduler does not guarantee to pick up the same node for the Pod after the restart
   - If Scheduler picks up another node for the Pod after the restart, devices are unnecessarily left on the original nodes
     (Composable DRA controller needs to have the function to detach a device automatically if it is not used by a Pod for a certain period of time)
@@ -761,7 +761,7 @@ well as the [existing list] of feature gates.
 -->
 
 - [x] Feature gate (also fill in values in `kep.yaml`)
-  - Feature gate name: DRAPrebindingConditions
+  - Feature gate name: DRADeviceBindingConditions
   - Components depending on the feature gate: kube-scheduler