Document DRA Device Binding Conditions in v1.36 #54541

Merged
k8s-ci-robot merged 1 commit into kubernetes:dev-1.36 from ttsuuubasa:dev-1.36-dra-device-binding-conditions
Apr 8, 2026

Conversation

@ttsuuubasa
Contributor

@ttsuuubasa ttsuuubasa commented Feb 19, 2026

Description

k/k development PR: kubernetes/kubernetes#137795

Summary

Promotes Device Binding Conditions from alpha to beta status in Kubernetes v1.36.

Changes Made

  1. Documentation Structure Update (dynamic-resource-allocation.md)

    • Moved Device Binding Conditions section from "DRA alpha features" to "DRA beta features"
  2. Feature Gate Lifecycle Update (DRADeviceBindingConditions.md)

    • Updated feature gate stages:
      • Alpha: v1.34 - v1.35 (default: false)
      • Beta: v1.36+ (default: true)
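
For reference, a lifecycle like the one listed above is usually recorded in the front matter of the feature gate page. The following is a sketch that assumes the usual layout of feature gate files in the kubernetes/website repository. The `stages` values come from the list above; the remaining keys are assumptions about that layout, not a copy of the merged file.

```yaml
---
# Sketch of the front matter for DRADeviceBindingConditions.md (layout assumed)
title: DRADeviceBindingConditions
content_type: feature_gate
_build:
  list: never
  render: false

stages:
  # Alpha in v1.34 through v1.35, disabled by default
  - stage: alpha
    defaultValue: false
    fromVersion: "1.34"
    toVersion: "1.35"
  # Beta from v1.36 onward, enabled by default
  - stage: beta
    defaultValue: true
    fromVersion: "1.36"
---
```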

Technical Context

Device Binding Conditions enable the Kubernetes scheduler to delay Pod binding until external resources (such as fabric-attached GPUs or reprogrammable FPGAs) are confirmed ready. This feature:

  • Improves scheduling reliability by avoiding premature binding
  • Enables coordination with external device controllers
  • Implements waiting behavior in the PreBind phase of the scheduling framework
  • Supports configurable timeout (default: 600 seconds)
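
To illustrate the configurable timeout, a scheduler configuration along these lines would shorten the wait. This is a sketch that assumes the timeout is exposed as the `bindingTimeout` argument of the `DynamicResources` plugin; the `60s` value is an arbitrary example, not a recommendation.

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler
  pluginConfig:
  - name: DynamicResources
    args:
      apiVersion: kubescheduler.config.k8s.io/v1
      kind: DynamicResourcesArgs
      # Assumed field name; overrides the 600-second default mentioned above
      bindingTimeout: 60s
```

A shorter timeout returns Pods to the scheduling queue sooner when a device never becomes ready. The cost is a higher chance of giving up on devices that are merely slow to attach.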

Impact

  • Users on v1.36+: Device Binding Conditions will be enabled by default
  • Feature stability: Reflects increased production readiness and API stability
  • Documentation accuracy: Ensures docs correctly categorize the feature's maturity level

Issue

k/enhancement issue: kubernetes/enhancements#5007

Closes: #

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 19, 2026
@k8s-ci-robot k8s-ci-robot added this to the 1.36 milestone Feb 19, 2026
@k8s-ci-robot k8s-ci-robot added the language/en Issues or PRs related to English language label Feb 19, 2026
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Feb 19, 2026
@netlify

netlify bot commented Feb 19, 2026

Pull request preview available for checking

Built without sensitive environment variables

| Name | Link |
|---|---|
| 🔨 Latest commit | 9c2122b |
| 🔍 Latest deploy log | https://app.netlify.com/projects/kubernetes-io-main-staging/deploys/69d5e69d306f0a0007230295 |
| 😎 Deploy Preview | https://deploy-preview-54541--kubernetes-io-main-staging.netlify.app |

@chadmcrowell
Contributor

Hi @ttsuuubasa 👋 v1.36 Communications team here,

@mariafromano-25 as author of #54709, I'd like you to be a writing buddy for @ttsuuubasa on this PR.

Please:

  • Review this PR, paying attention to the guidelines and review hints
  • Update your own PR based on any best practices you identify that should be applied
  • Remember to be compassionate with your fellow article author

@kernel-kun
Contributor

Hello @ttsuuubasa 👋, v1.36 Docs Team here again!

Please take a look at Documenting for a release - PR Ready for Review to get your PR ready for review before Tuesday 31st March 2026.

Please let us know once your PR is fully Ready for Review -- meaning all documentation updates are complete and it's awaiting reviewer feedback -- so we can update our tracking.

Thank you!

@ttsuuubasa
Contributor Author

/wg device-management

@k8s-ci-robot k8s-ci-robot added the wg/device-management Categorizes an issue or PR as relevant to WG Device Management. label Mar 24, 2026
@ttsuuubasa ttsuuubasa force-pushed the dev-1.36-dra-device-binding-conditions branch from cf3fb15 to 4400fe0 Compare March 24, 2026 08:39
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 24, 2026
@ttsuuubasa ttsuuubasa changed the title Placeholder PR for KEP-5007: DRA Device Binding Conditions in v1.36 Document DRA Device Binding Conditions in v1.36 Mar 24, 2026
@ttsuuubasa ttsuuubasa marked this pull request as ready for review March 24, 2026 09:04
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 24, 2026
@k8s-ci-robot k8s-ci-robot requested a review from lmktfy March 24, 2026 09:05
@ttsuuubasa
Contributor Author

@pohly
I’ve pushed a documentation update as part of the beta promotion of Device Binding Conditions, and I’d appreciate your review. I’d like to start with a technical review.
The change simply moves the Device Binding Conditions content from the alpha section to the beta section.
Please let me know if there are any other changes needed or additional information that should be added.

Member

@lmktfy lmktfy left a comment

At beta, and especially for features that are enabled by default, we ask for docs that are GA quality.

The binding conditions explanation seems, to me, that it mostly belongs in a page that driver authors would read (we don't yet have that page - we should aim to have one).

Please look at the following feedback in that light.

This ensures that non-admin users cannot misuse the feature.
Starting with Kubernetes v1.34, this label has been updated to `resource.kubernetes.io/admin-access: "true"`.

### Device Binding Conditions {#device-binding-conditions}
Member

Suggested change
### Device Binding Conditions {#device-binding-conditions}
### Device binding conditions

Member

nit: when we document (sub)features for DRA, we should place them where they would belong if they were stable.

If we do that, then when features graduate, the docs remain easy to find and use.

Contributor Author

If we were to pursue this, I feel that the current sections such as “DRA beta features” and “DRA alpha features” would no longer be appropriate, and that we would need to reconsider the overall structure of this chapter.

Contributor

I agree with @lmktfy; we merged the removal of those artificial sections for exactly that reason in #54648.

Member

Reorganizing the DRA (sub)features may need to be done in a follow-up to this PR.
I'll create an issue.


{{< feature-state feature_gate_name="DRADeviceBindingConditions" >}}

Device Binding Conditions allow the Kubernetes scheduler to delay Pod binding until
Member

Suggested change
Device Binding Conditions allow the Kubernetes scheduler to delay Pod binding until
As the author of a DRA driver, you can use
_device binding conditions_ to defer Pod binding
until

Contributor Author

This sentence is now rewritten for DRA driver developers, and I’d like to discuss whether we should proceed this way.

This improves scheduling reliability by avoiding premature binding and enables coordination
with external device controllers.

To use this feature, device drivers (typically managed by driver owners) must publish the
Member

Suggested change
To use this feature, device drivers (typically managed by driver owners) must publish the
To use this ability to delay binding, the DRA driver that
you are writing needs to publish all of the

Contributor Author

In this sentence, “you” refers to DRA driver developers, which means this is also written with driver authors in mind.

with external device controllers.

To use this feature, device drivers (typically managed by driver owners) must publish the
following fields in the `Device` section of a `ResourceSlice`. Cluster administrators
Member

@lmktfy lmktfy Mar 24, 2026

Suggested change
following fields in the `Device` section of a `ResourceSlice`. Cluster administrators
following fields in the `device` section of a ResourceSlice. Because this relies on a beta feature, you should also clearly document that cluster administrators

Contributor Author

In this sentence, “you” refers to DRA driver developers, which means this is also written with driver authors in mind.

inside the ResourceClaim, which external controllers can use to perform node-specific
operations such as device attachment or preparation.

All condition types listed in bindingConditions and bindingFailureConditions are evaluated
Member

Suggested change
All condition types listed in bindingConditions and bindingFailureConditions are evaluated
The control plane discovers all the binding conditions (from `bindingConditions` and `bindingFailureConditions`) and evaluates those against the list of observed conditions, taken

Contributor Author

Are you assuming that binding conditions in the ResourceSlice are compared against and evaluated together with the binding conditions in the ResourceClaim? In practice, I believe an external controller would evaluate only the binding conditions in the ResourceClaim.

In addition, based on our experience, the controller that sets binding conditions is not necessarily limited to the control plane. There are also designs where such controllers are distributed and run on each node. For this reason, rather than explicitly referring to the control plane, I thought it might be better to use a more general term such as an external controller.

The scheduler waits up to **600 seconds** (default) for all `bindingConditions` to become `True`.
If the timeout is reached or any `bindingFailureConditions` are `True`, the scheduler
clears the allocation and reschedules the Pod.
This timeout duration is configurable by the user through `KubeSchedulerConfiguration`.
Member

(nit)

Suggested change
This timeout duration is configurable by the user through `KubeSchedulerConfiguration`.
A cluster administrator can configure this timeout duration by editing the kube-scheduler configuration file.
#### Example {#device-binding-conditions-example}
Here is an example of a ResourceSlice that you might see in a cluster where there's a DRA driver in use, and that driver supports binding conditions:

(if making this change, check if you need other headings as well so that the new content makes sense)

apiVersion: resource.k8s.io/v1
kind: ResourceSlice
metadata:
  name: gpu-slice
Member

(nit)

Suggested change
name: gpu-slice
name: gpu-slice-1
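
For readers without the full diff, a complete ResourceSlice of the kind under review might look like the sketch below. It assumes the `resource.k8s.io/v1` device layout with the `bindsToNode`, `bindingConditions`, and `bindingFailureConditions` fields. The driver name, pool name, node label, attributes, and condition types are hypothetical placeholders.

```yaml
apiVersion: resource.k8s.io/v1
kind: ResourceSlice
metadata:
  name: gpu-slice-1
spec:
  driver: dra.example.com          # hypothetical driver name
  pool:
    name: gpu-pool                 # hypothetical pool name
    generation: 1
    resourceSliceCount: 1
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: accelerator-type      # hypothetical node label
        operator: In
        values:
        - "high-performance"
  devices:
  - name: gpu-1
    attributes:
      vendor:
        string: "example"
      model:
        string: "example-gpu"
    # Ask for the selected node to be recorded in the ResourceClaim
    bindsToNode: true
    # All of these must become True before the Pod is bound
    bindingConditions:
    - dra.example.com/is-prepared
    # Any of these becoming True aborts the binding attempt
    bindingFailureConditions:
    - dra.example.com/preparing-failed
```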

- External controllers can use the node selector in the ResourceClaim to perform
node-specific setup on the selected node.

An example of configuring this timeout in `KubeSchedulerConfiguration` is given below:
Member

Consider moving this just after the place where we mention that cluster administrators can configure this.

All condition types listed in bindingConditions and bindingFailureConditions are evaluated
from the `status.conditions` field of the ResourceClaim.
External controllers are responsible for updating these conditions using standard Kubernetes
condition semantics (`type`, `status`, `reason`, `message`, `lastTransitionTime`).
Member

Suggested change
condition semantics (`type`, `status`, `reason`, `message`, `lastTransitionTime`).
condition semantics (`type`, `status`, `reason`, `message`, `lastTransitionTime`).
If you are the driver author, you may prefer to
provide your own controller, that is custom to the
hardware or other dynamic resource that the driver works with.
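
A condition written by such a controller might look like the entry below. This is a sketch only: the condition type, reason, message, and timestamp are hypothetical. The entry is shown on its own, without the surrounding ResourceClaim status, so it makes no claim about where the list sits in the object.

```yaml
# One entry in a conditions list, using standard Kubernetes condition fields
- type: dra.example.com/is-prepared      # must match an entry in bindingConditions
  status: "True"
  reason: DeviceAttached                 # hypothetical CamelCase reason
  message: Device attached to the selected node.
  lastTransitionTime: "2026-04-08T05:24:00Z"
```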

Contributor Author

Driver-author–focused text has also been added here, and I would like to discuss whether this is necessary.

@ttsuuubasa
Contributor Author

@lmktfy
Thank you for the prompt review comments.

The binding conditions explanation seems, to me, that it mostly belongs in a page that driver authors would read (we don't yet have that page - we should aim to have one).

I agree that we should have documentation targeted at DRA driver developers.
However, I would like to discuss whether developer‑focused content should be included in this document.

I made a similar suggestion before, and at that time there was a proposal that updating the DRA example driver could be useful. My understanding is that this would mean implementing the DRA example driver so that it publishes BindingConditions, and then letting developers try it out. In that case, the usage and guidance for developers would be explained in places like the README.
kubernetes/enhancements#5007 (comment)

I agree with most of your suggestions, but regarding the text aimed at DRA driver developers, I would like to respond with comments and discuss it further.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 29, 2026
@ttsuuubasa ttsuuubasa force-pushed the dev-1.36-dra-device-binding-conditions branch from 4400fe0 to 050e88b Compare March 31, 2026 07:41
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 31, 2026
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 7, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 2cb35b36cbbcd09c81a9051a102a4b811a0ac8f8

@dom4ha
Member

dom4ha commented Apr 7, 2026

/sig scheduling

@k8s-ci-robot k8s-ci-robot added the sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. label Apr 7, 2026
@github-project-automation github-project-automation bot moved this to Needs Triage in SIG Scheduling Apr 7, 2026
@dom4ha
Member

dom4ha commented Apr 7, 2026

/lgtm

Member

@dom4ha dom4ha left a comment

One nit.

Device binding conditions is an *alpha feature* and only enabled when the
Device binding conditions is an *beta feature* and is enabled by default with the
Member

Suggested change
Device binding conditions is an *beta feature* and is enabled by default with the
Device binding conditions is an *beta feature* and is enabled by default, controlled by the

Contributor

Now that you called it out: "is a beta feature"... 😉

Contributor Author

@pohly @dom4ha
Thank you for the review. I’ve addressed the points you raised. Could you please take another look and re-approve?

Now that you called it out: “is a beta feature”… 😉

I keep missing this every time—sorry about that, and thanks for pointing it out.

@mariafromano-25

Yup, besides the grammar nit @pohly pointed out above, all else /lgtm !! Good job :)

Signed-off-by: Tsubasa Watanabe <w.tsubasa@fujitsu.com>
@ttsuuubasa ttsuuubasa force-pushed the dev-1.36-dra-device-binding-conditions branch from 2596f2d to 9c2122b Compare April 8, 2026 05:24
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 8, 2026
@k8s-ci-robot k8s-ci-robot requested review from dom4ha and pohly April 8, 2026 05:24
@kernel-kun
Contributor

Hey team, due to the force-push, the lgtm label has been removed.
Could someone from Technical Review please review the latest state and leave a /lgtm if everything looks good?
Thanks!

@pohly
Contributor

pohly commented Apr 8, 2026

@ttsuuubasa: when you squash commits, please don't rebase: the compare view for your last force-push, https://github.com/kubernetes/website/compare/2596f2d0151336d6ef889a84a2b50075f0de60ed..9c2122bed593e4aa78c98123acd74176fb3f323b, is useless for an incremental review.

Use `git rebase -i --keep-base`.

Contributor

@pohly pohly left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 8, 2026
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 10957afe57844ae7143f1e44499c767e4fa7fb86

@ttsuuubasa
Contributor Author

when you squash commits, please don't rebase:

Sorry about that, I understand now.

@ttsuuubasa
Contributor Author

@kernel-kun
The technical review is complete. Could you please ask a Docs approver to take a look and approve it?

@kernel-kun
Contributor

cc. @tengqm @CodesbyUnnati (this week's PR Wranglers)

@reylejano
Member

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pohly, reylejano

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 8, 2026
@k8s-ci-robot k8s-ci-robot merged commit e735d58 into kubernetes:dev-1.36 Apr 8, 2026
6 checks passed
@github-project-automation github-project-automation bot moved this from Needs Triage to Done in SIG Scheduling Apr 8, 2026
@pohly pohly moved this from 👀 In review to ✅ Done in Dynamic Resource Allocation Apr 8, 2026

Labels

  • approved Indicates a PR has been approved by an approver from all required OWNERS files.
  • cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
  • language/en Issues or PRs related to English language
  • lgtm "Looks good to me", indicates that a PR is ready to be merged.
  • sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling.
  • size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
  • wg/device-management Categorizes an issue or PR as relevant to WG Device Management.

Projects

Status: ✅ Done
Status: Done

9 participants