Commit 46621da

Update: Retitle the KEP, and apply the feedback
1 parent 438e71e commit 46621da

2 files changed: 27 additions, 13 deletions
keps/sig-scheduling/5007-device-attach-before-pod-scheduled/README.md

Lines changed: 25 additions & 11 deletions
@@ -58,7 +58,7 @@ If none of those approvers are still appropriate, then changes to that list
 should be approved by the remaining approvers and/or the owning SIG (or
 SIG Architecture for cross-cutting KEPs).
 -->
-# [KEP-5007](https://github.com/kubernetes/enhancements/issues/5007): DRA Device Attach Before Pod Scheduled
+# [KEP-5007](https://github.com/kubernetes/enhancements/issues/5007): DRA: Device Binding Conditions

 <!--
 This is the title of your KEP. Keep it short, simple, and descriptive. A good
@@ -91,7 +91,7 @@ tags, and then generate with `hack/update-toc.sh`.
 - [Design Details](#design-details)
 - [DRA Scheduler Plugin Design Overview](#dra-scheduler-plugin-design-overview)
 - [BasicDevice Enhancements](#basicdevice-enhancements)
-- [AllocatedDeviceStatus.Conditions Enhancements](#allocateddevicestatusconditions-enhancements)
+- [AllocatedDeviceStatus Enhancements](#allocateddevicestatus-enhancements)
 - [Scheduler DRA Plugin Modifications](#scheduler-dra-plugin-modifications)
 - [PreBind Phase Timeout](#prebind-phase-timeout)
 - [Handling ResourceSlices Upon Failure of Attachment](#handling-resourceslices-upon-failure-of-attachment)
@@ -244,7 +244,7 @@ The basic idea is the following:
 The overall flow of the `PreBind` process is as follows:

 - **Updating ResourceClaim**:
-  - The scheduler DRA plugin copies `BindingConditions` and `BindingFailureConditions` from `ResourceSlice.Device.Basic` to `AllocatedDeviceStatus.Conditions`.
+  - The scheduler DRA plugin copies `BindingConditions` and `BindingFailureConditions` from `ResourceSlice.Device.Basic` to `AllocatedDeviceStatus`.

 - **Monitoring and Preparation by Composable DRA Controllers**:
   - Composable DRA Controllers monitor the `ResourceClaim`. If a device that requires preparation is associated with the `ResourceClaim`, they perform the necessary preparations.
@@ -330,7 +330,7 @@ This issue needs to be resolved before the beta is released.
 ### DRA Scheduler Plugin Design Overview

 This document outlines the design of the DRA Scheduler Plugin, focusing on the handling of fabric devices.
-Key additions include `BindingConditions` and `BindingFailureConditions for device identification and preparation, enhancements to `AllocatedDeviceStatus.Conditions`, and the process for handling `ResourceSlices` upon attachment failure.
+Key additions include `BindingConditions` and `BindingFailureConditions` for device identification and preparation, enhancements to `AllocatedDeviceStatus`, and the process for handling `ResourceSlices` upon attachment failure.
 The composable controller design is also discussed, emphasizing efficient utilization of fabric devices.

 ![proposal](proposal.jpg)
@@ -349,6 +349,7 @@ type BasicDevice struct {
     BindingConditions []string

     // BindingFailureConditions defines the conditions for binding failure.
+    // If true, a binding failure occurred.
     //
     // +optional
     BindingFailureConditions []string
@@ -372,27 +373,40 @@ const (
 )
 ```

-#### AllocatedDeviceStatus.Conditions Enhancements
+#### AllocatedDeviceStatus Enhancements

-The `BindingConditions` and `BindingFailureConditions` fields within `AllocatedDeviceStatus.Conditions` are used to indicate the status of the device attachment.
+The `BindingConditions` and `BindingFailureConditions` fields within `AllocatedDeviceStatus` are used to indicate the status of the device attachment.
 These fields will contain a list of conditions, each representing a specific state or event related to the device.

 For this feature, following fields are added:

 ```go
-// AllocatedDeviceStatus.Conditions contains the status of an allocated device, if the
+// AllocatedDeviceStatus contains the status of an allocated device, if the
 // driver chooses to report it. This may include driver-specific information.
-type AllocatedDeviceStatus.Conditions struct {
+type AllocatedDeviceStatus struct {
     ...
     // BindingConditions defines the conditions for binding.
     //
     // +optional
-    BindingConditions map[string]bool
+    BindingConditions []string

     // BindingFailureConditions defines the conditions for binding failure.
+    // If true, a binding failure occurred.
     //
     // +optional
-    BindingFailureConditions map[string]bool
+    BindingFailureConditions []string
+
+    // UsageRestrictedToNode indicates if the usage of an allocation involving this device
+    // has to be limited to exactly the node that was chosen when allocating the claim.
+    //
+    // +optional
+    UsageRestrictedToNode bool
+
+    // BindingTimeoutSeconds indicates the prepare timeout period.
+    // If the timeout period is exceeded, the scheduler clears the allocation in the ResourceClaim and reschedules the Pod.
+    //
+    // +optional
+    BindingTimeoutSeconds *metav1.Duration
 }
 ```

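To illustrate the copy step that these fields support, here is a minimal Go sketch. It is not the scheduler plugin's code: both structs are trimmed stand-ins that model only the four binding-related fields, `copyBindingFields` is a hypothetical helper, the condition names are made up, and the timeout is held as plain seconds (`*int64`) instead of `*metav1.Duration` so the example has no Kubernetes dependencies.

```go
package main

import "fmt"

// BasicDevice is a trimmed stand-in for the KEP type; only the
// binding-related fields are modeled.
type BasicDevice struct {
	BindingConditions        []string
	BindingFailureConditions []string
	UsageRestrictedToNode    bool
	BindingTimeoutSeconds    *int64 // assumption: plain seconds, not *metav1.Duration
}

// AllocatedDeviceStatus is a trimmed stand-in for the status type in the diff.
type AllocatedDeviceStatus struct {
	BindingConditions        []string
	BindingFailureConditions []string
	UsageRestrictedToNode    bool
	BindingTimeoutSeconds    *int64
}

// copyBindingFields (hypothetical helper) copies the binding-related fields
// from the device definition into a fresh status value. Slices and the
// timeout pointer are duplicated so later changes to the device do not
// leak into the status.
func copyBindingFields(dev BasicDevice) AllocatedDeviceStatus {
	status := AllocatedDeviceStatus{
		BindingConditions:        append([]string(nil), dev.BindingConditions...),
		BindingFailureConditions: append([]string(nil), dev.BindingFailureConditions...),
		UsageRestrictedToNode:    dev.UsageRestrictedToNode,
	}
	if dev.BindingTimeoutSeconds != nil {
		seconds := *dev.BindingTimeoutSeconds
		status.BindingTimeoutSeconds = &seconds
	}
	return status
}

func main() {
	timeout := int64(600)
	dev := BasicDevice{
		BindingConditions:        []string{"example.com/attached"},      // made-up condition name
		BindingFailureConditions: []string{"example.com/attach-failed"}, // made-up condition name
		UsageRestrictedToNode:    true,
		BindingTimeoutSeconds:    &timeout,
	}
	status := copyBindingFields(dev)
	fmt.Println(status.BindingConditions, status.BindingFailureConditions,
		status.UsageRestrictedToNode, *status.BindingTimeoutSeconds)
}
```

Copying the slices rather than sharing them is a choice made for this sketch; it keeps the status independent of the `ResourceSlice` data it was derived from.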
@@ -404,7 +418,7 @@ When `UsageRestrictedToNode: true` is set, the scheduler DRA plugin will perform

 If Conditions are present, the scheduler DRA plugin will perform the following steps during the `PreBind` phase:

-2. **Copy Conditions**: Copy `BindingConditions` and `BindingFailureConditions` from `ResourceSlice.Device.Basic` to `AllocatedDeviceStatus`.
+2. **Copy Conditions**: Copy `UsageRestrictedToNode`, `BindingTimeoutSeconds`, `BindingConditions` and `BindingFailureConditions` from `ResourceSlice.Device.Basic` to `AllocatedDeviceStatus`.
 3. **Wait for Conditions**: Wait for the following conditions:
    - Wait until all conditions in the BindingConditions are `True` before proceeding to Bind.
    - If any one of the conditions in the BindingFailureConditions becomes `True`, clear the allocation in the `ResourceClaim` and reschedule the Pod.
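
The wait step in this hunk can be sketched as a small decision function. This is an illustration under stated assumptions, not the plugin's implementation: `Outcome`, `evaluate`, and the condition names are hypothetical, and the reported condition states are modeled as a plain `map[string]bool` rather than Kubernetes condition objects. Timeout handling is left out.

```go
package main

import "fmt"

// Outcome is a hypothetical result type for this sketch.
type Outcome int

const (
	Pending Outcome = iota // keep waiting in PreBind
	Ready                  // every binding condition is true; proceed to Bind
	Failed                 // a failure condition is true; clear the allocation and reschedule
)

// evaluate inspects the reported condition states (condition name -> true/false).
// Failure conditions are checked first, so a reported failure wins even when
// all binding conditions are already true.
func evaluate(bindingConditions, failureConditions []string, reported map[string]bool) Outcome {
	for _, name := range failureConditions {
		if reported[name] {
			return Failed
		}
	}
	for _, name := range bindingConditions {
		if !reported[name] {
			return Pending
		}
	}
	return Ready
}

func main() {
	binding := []string{"example.com/attached"}      // made-up condition name
	failure := []string{"example.com/attach-failed"} // made-up condition name

	nothingReported := map[string]bool{}
	attached := map[string]bool{"example.com/attached": true}
	attachFailed := map[string]bool{"example.com/attach-failed": true}

	fmt.Println(evaluate(binding, failure, nothingReported) == Pending)
	fmt.Println(evaluate(binding, failure, attached) == Ready)
	fmt.Println(evaluate(binding, failure, attachFailed) == Failed)
}
```

A condition that has not been reported yet is treated as not true, so the Pod stays in `PreBind` until the driver or controller sets it. Checking failures before successes is a choice made for this sketch; the text above does not specify an ordering.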

keps/sig-scheduling/5007-device-attach-before-pod-scheduled/kep.yaml

Lines changed: 2 additions & 2 deletions
@@ -1,4 +1,4 @@
-title: DRA Device Attach Before Pod Scheduled
+title: DRA: Device Binding Conditions
 kep-number: 5007
 authors:
 - "@KobayashiD27"
@@ -39,7 +39,7 @@ milestone:
 # The following PRR answers are required at alpha release
 # List the feature gate name and the components for which it must be enabled
 feature-gates:
-  - name: DRADeviceAttachDuringScheduling
+  - name: DRAPrebindingConditions
     components:
       - kube-scheduler
 disable-supported: true
