MenD32
Contributor

@MenD32 MenD32 commented Sep 22, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

Adds support for partitionable devices when calculating DRA utilization
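
For readers unfamiliar with partitionable devices, here is a minimal sketch of how utilization over shared counters could be computed. It is illustrative only and is not the code in this PR: `CounterSet`, `Device`, and `counterUtilization` are hypothetical, simplified stand-ins for the DRA API types, and taking the most-consumed counter as the utilization figure is an assumption.

```go
package main

import "fmt"

// CounterSet is a simplified, hypothetical stand-in for a shared counter set:
// counter name -> total amount available on the physical device.
type CounterSet map[string]int64

// Device is a simplified, hypothetical stand-in for one partition of a device.
// Consumes lists how much of each shared counter the partition uses when allocated.
type Device struct {
	Name     string
	Consumes map[string]int64
}

// counterUtilization sums the counters consumed by the allocated partitions and
// returns the highest consumed/total ratio across all counters.
// Assumption: the most constrained counter determines utilization.
func counterUtilization(total CounterSet, allocated []Device) float64 {
	used := map[string]int64{}
	for _, d := range allocated {
		for name, amount := range d.Consumes {
			used[name] += amount
		}
	}
	highest := 0.0
	for name, t := range total {
		if t <= 0 {
			continue // skip counters with no capacity to avoid dividing by zero
		}
		if ratio := float64(used[name]) / float64(t); ratio > highest {
			highest = ratio
		}
	}
	return highest
}

func main() {
	total := CounterSet{"memory": 80, "compute": 8}
	allocated := []Device{
		{Name: "gpu-0-part-a", Consumes: map[string]int64{"memory": 20, "compute": 2}},
		{Name: "gpu-0-part-b", Consumes: map[string]int64{"memory": 40, "compute": 2}},
	}
	fmt.Printf("utilization: %.2f\n", counterUtilization(total, allocated))
}
```

The point of the sketch is that counting allocated devices alone would misreport usage, since partitions of the same physical device draw from one shared pool.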

Which issue(s) this PR fixes:

Fixes #8053

Special notes for your reviewer:

Does this PR introduce a user-facing change?

DRA: partitionable devices support

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. area/cluster-autoscaler labels Sep 22, 2025
@k8s-ci-robot k8s-ci-robot added the area/provider/kwok Issues or PRs related to the kwok cloud provider for Cluster Autoscaler label Sep 22, 2025
@k8s-ci-robot k8s-ci-robot requested a review from kgolab September 22, 2025 11:45
@k8s-ci-robot k8s-ci-robot added area/vertical-pod-autoscaler needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 22, 2025
@k8s-ci-robot
Contributor

Hi @MenD32. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Sep 22, 2025
@MenD32 MenD32 force-pushed the feat/partitionable-devices-support branch from ba8303d to 4fa5202 Compare September 22, 2025 11:56
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 22, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: MenD32
Once this PR has been reviewed and has the lgtm label, please assign bigdarkclown for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/provider/cluster-api Issues or PRs related to Cluster API provider area/provider/rancher size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 22, 2025
@MenD32
Contributor Author

MenD32 commented Sep 22, 2025

@towca I had to revert #8539 in this PR because the partitionable devices feature is only available in k8s.io/api/resource/v1beta1 and has not yet been released into k8s.io/api/resource/v1

@MenD32 MenD32 changed the title Feat/partitionable devices support Feat: partitionable devices support Sep 22, 2025
@MenD32
Contributor Author

MenD32 commented Sep 22, 2025

@towca this is the same PR as #8160, which I had to close because I had some issues reverting #8539.

I'm not sure if reverting was in fact the right move, but in order to add this feature I think there is no way around it...

@jackfrancis
Contributor

> @towca I had to revert #8539 in this PR because the partitionable devices feature is only available in k8s.io/api/resource/v1beta1 and has not yet been released into k8s.io/api/resource/v1

@nojnhuh can you remind me how this flywheel works?

@nojnhuh
Contributor

nojnhuh commented Oct 2, 2025

> @towca I had to revert #8539 in this PR because the partitionable devices feature is only available in k8s.io/api/resource/v1beta1 and has not yet been released into k8s.io/api/resource/v1
>
> @nojnhuh can you remind me how this flywheel works?

Doesn't v1 already include everything necessary for partitionable devices? https://github.com/kubernetes/kubernetes/blob/v1.34.1/staging/src/k8s.io/api/resource/v1/types.go#L157-L179

@MenD32 What was the exact issue you were running into that prompted going back to v1beta1?

@MenD32
Contributor Author

MenD32 commented Oct 2, 2025

When I tried to merge with master, I had an issue with Device.Basic.ConsumesCounters, so I wrongly assumed it hadn't been merged into v1 and was still kept under v1beta1. Now I see that they changed the Device struct to put ConsumesCounters somewhere else... I'll revert the version rollback.
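
A rough sketch of the shape change mentioned above, for anyone hitting the same compile error. The types are simplified local stand-ins, not the real k8s.io/api/resource types, and the layout shown for v1 (fields sitting directly on `Device` rather than under `Basic`) is my reading of the linked types.go, so treat it as an assumption and check the actual package.

```go
package main

import "fmt"

// CounterConsumption is a simplified stand-in for the per-device counter
// consumption entry (the real type carries more structure).
type CounterConsumption struct {
	CounterSet string
	Counters   map[string]int64
}

// v1beta1-style shape: device details are nested under a Basic pointer.
type BasicDeviceV1beta1 struct {
	ConsumesCounters []CounterConsumption
}

type DeviceV1beta1 struct {
	Name  string
	Basic *BasicDeviceV1beta1
}

// v1-style shape (assumed): the same field sits directly on the device.
type DeviceV1 struct {
	Name             string
	ConsumesCounters []CounterConsumption
}

// consumedV1beta1 needs a nil check because Basic is a pointer.
func consumedV1beta1(d DeviceV1beta1) []CounterConsumption {
	if d.Basic == nil {
		return nil
	}
	return d.Basic.ConsumesCounters
}

// consumedV1 reads the field directly; there is no Basic wrapper to traverse.
func consumedV1(d DeviceV1) []CounterConsumption {
	return d.ConsumesCounters
}

func main() {
	c := []CounterConsumption{{CounterSet: "gpu-0", Counters: map[string]int64{"memory": 20}}}
	older := DeviceV1beta1{Name: "part-a", Basic: &BasicDeviceV1beta1{ConsumesCounters: c}}
	newer := DeviceV1{Name: "part-a", ConsumesCounters: c}
	fmt.Println(len(consumedV1beta1(older)), len(consumedV1(newer)))
}
```

So code written against `Device.Basic.ConsumesCounters` stops compiling against the newer shape even though the feature itself is still present.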

@MenD32 MenD32 force-pushed the feat/partitionable-devices-support branch from 1b8c0c8 to 3956443 Compare October 2, 2025 09:29
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 2, 2025
@towca
Collaborator

towca commented Oct 2, 2025

@MenD32 thanks a lot for flagging this, haven't bumped into this particular issue before 😅

@jackfrancis @nojnhuh I get that we might be ok for partitionable devices specifically, but will that hold for other features? I.e. can we keep iterating on DRA KEPs in CA while only importing the v1 version of the DRA API? What if there's a KEP that requires some API changes, wouldn't that start in the next beta version? This might be especially painful because only GA APIs get enabled by default, so v1 is the only one we can "rely" on being served for 1.34+.

@nojnhuh
Contributor

nojnhuh commented Oct 2, 2025

My understanding is that alpha/beta features that intersect with the existing v1 APIs will be added to v1 and still feature gated, e.g. changes to DeviceClass, ResourceSlice, ResourceClaim(Template).

When a brand new API is introduced, then it will likely land in an alpha/beta API version first, e.g. DeviceTaintRules initially landing in v1alpha3 in 1.33: https://github.com/kubernetes/enhancements/tree/master/keps/sig-scheduling/5055-dra-device-taints-and-tolerations#:~:text=Describe%20the%20mechanism%3A%20resource.k8s.io/v1alpha3%20API%20group

1.35 is the first release cycle where we're adding features since v1 was added, so I'll keep an eye on KEP implementations for this cycle and let you know if that's not actually what happens.

@towca
Collaborator

towca commented Oct 3, 2025

Thanks @nojnhuh, that makes sense, I didn't consider feature-gated fields in v1. And in the brand new API case we could import it and have it behind a separate flag.

@jackfrancis
Contributor

/release-note-edit

DRA: partitionable devices support

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 3, 2025
@jackfrancis
Contributor

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Oct 3, 2025
@jackfrancis jackfrancis removed area/provider/cluster-api Issues or PRs related to Cluster API provider area/provider/rancher area/provider/kwok Issues or PRs related to the kwok cloud provider for Cluster Autoscaler area/vertical-pod-autoscaler labels Oct 3, 2025
Labels
area/cluster-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Development

Successfully merging this pull request may close these issues.

CA DRA: handle partitionable devices (KEP-4815)
8 participants