Commit db5d52d

feat: Migrate DRA integration from v1alpha2 to v1 API
Migrate Dynamic Resource Allocation (DRA) integration from the alpha v1alpha2 API to the stable v1 API introduced in Kubernetes 1.34.

Major changes:
- Update RBAC permissions to access resource.k8s.io API resources (resourceclaims, resourceslices) instead of using kubelet API
- Replace kubelet-based DRA resource discovery with direct API queries using new draclient package
- Update documentation from ResourceClass to DeviceClass terminology
- Change resourceName annotation format to <claim-name>/<request-name>
- Update examples from NVIDIA-specific to generic SR-IOV usage
- Add comprehensive test coverage for DRA integration
- Remove CDI-based device handling in favor of k8s.cni.cncf.io/deviceID attributes

Technical details:
- Add draclient.GetPodResourceMap() call in k8sclient
- Remove getDRAResources() from kubeletclient (now queries API directly)
- Update to use ResourceClaimTemplate instead of ResourceClaim
- Fix protobuf field naming (CDIDevices -> CdiDevices)
- Add 6 new test cases for DRA scenarios in k8sclient_test.go

This migration enables Multus to work with the stable DRA API and removes dependency on kubelet's PodResources API for DRA resources.
1 parent b96ffd5 commit db5d52d

12 files changed, +808 -170 lines changed

deployments/multus-daemonset-thick.yml

Lines changed: 9 additions & 0 deletions
@@ -72,6 +72,15 @@ rules:
       - list
       - update
       - watch
+  - apiGroups:
+      - "resource.k8s.io"
+    resources:
+      - resourceclaims
+      - resourceclaims/status
+      - resourceslices
+    verbs:
+      - get
+      - list
   - apiGroups:
       - ""
       - events.k8s.io

deployments/multus-daemonset.yml

Lines changed: 38 additions & 0 deletions
@@ -79,6 +79,44 @@ rules:
       - create
       - patch
       - update
+kind: ClusterRole
+apiVersion: rbac.authorization.k8s.io/v1
+metadata:
+  name: multus
+rules:
+  - apiGroups: ["k8s.cni.cncf.io"]
+    resources:
+      - '*'
+    verbs:
+      - '*'
+  - apiGroups:
+      - ""
+    resources:
+      - pods
+      - pods/status
+    verbs:
+      - get
+      - list
+      - update
+      - watch
+  - apiGroups:
+      - "resource.k8s.io"
+    resources:
+      - resourceclaims
+      - resourceclaims/status
+      - resourceslices
+    verbs:
+      - get
+      - list
+  - apiGroups:
+      - ""
+      - events.k8s.io
+    resources:
+      - events
+    verbs:
+      - create
+      - patch
+      - update
 ---
 kind: ClusterRoleBinding
 apiVersion: rbac.authorization.k8s.io/v1

docs/how-to-use.md

Lines changed: 87 additions & 61 deletions
@@ -645,112 +645,132 @@ If you wish to have auto configuration use the `readinessindicatorfile` in the c
 
 ### Run pod with network annotation and Dynamic Resource Allocation driver
 
-> :warning: Dynamic Resource Allocation (DRA) is [currently an alpha](https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/),
-> and is subject to change. Please consider this functionality as a preview. The architecture and usage of DRA in
-> Multus CNI may change in the future as this technology matures.
->
-> The current DRA integration is based on the DRA API for Kubernetes 1.26 to 1.30. With Kubernetes 1.31, the DRA API
-> will change and multus doesn't integrate with the new API yet.
 
-Dynamic Resource Allocation is alternative mechanism to device plugin which allows to requests pod and container
-resources.
+Dynamic Resource Allocation is an alternative mechanism to device plugin which allows pods to request pod and container
+resources dynamically.
 
-The following sections describe how to use DRA with multus and NVIDIA DRA driver. Other DRA networking driver vendors
-should follow similar concepts to make use of multus DRA support.
+The following sections describe how to use DRA with Multus. DRA networking driver vendors should follow similar
+concepts to make use of Multus DRA support.
 
 #### Prerequisite
 
-1. Kubernetes 1.27
-2. Container Runtime with CDI support enabled
-3. Kubernetes runtime-config=resource.k8s.io/v1alpha2
-4. Kubernetes feature-gates=DynamicResourceAllocation=True,KubeletPodResourcesDynamicResources=true
+1. Kubernetes 1.34+
 
 #### Install DRA driver
 
-The current example uses NVIDIA DRA driver for networking. This DRA driver is not publicly available. An alternative to
-this DRA driver is available at [dra-example-driver](https://github.com/kubernetes-sigs/dra-example-driver).
+You need to install a DRA driver that provides network devices. For example, you can use the SR-IOV DRA driver or
+other DRA networking drivers. Refer to your DRA driver documentation for installation instructions.
 
-#### Create dynamic resource class with NVIDIA network DRA driver
+The DRA driver MUST expose the following attribute `k8s.cni.cncf.io/deviceID` containing the device ID
+that Multus will pass to the CNI.
 
-The `ResourceClass` defines the resource pool of `sf-pool-1`.
+#### Create network attachment definition with resource name
+
+The `k8s.v1.cni.cncf.io/resourceName` annotation is used to associate a NetworkAttachmentDefinition with DRA resources.
+The format is: `<pod-resource-name>/<result-name>` where:
+- `pod-resource-name`: The name of the resource claim in the pod's `spec.resourceClaims`
+- `result-name`: The name of the device request in the ResourceClaimTemplate's `spec.devices.requests`
+
+Multus queries the ResourceClaim and ResourceSlices APIs to fetch information about allocated DRA devices. When a
+NetworkAttachmentDefinition has a `resourceName` annotation that matches a pod's resource claim and result name,
+Multus will pass the `k8s.cni.cncf.io/deviceID` to the CNI plugin in the DeviceID field.
+
+##### NetworkAttachmentDefinition for SR-IOV example:
+
+The following command creates a NetworkAttachmentDefinition for SR-IOV. The `resourceName` annotation `sriov/vf` indicates:
+- `sriov`: matches the pod's resourceClaim name
+- `vf`: matches the device request name in the ResourceClaimTemplate
 
 ```
 # Execute following command at Kubernetes master
 cat <<EOF | kubectl create -f -
-apiVersion: resource.k8s.io/v1alpha2
-kind: ResourceClass
+apiVersion: k8s.cni.cncf.io/v1
+kind: NetworkAttachmentDefinition
 metadata:
-  name: sf-pool-1
-driverName: net.resource.nvidia.com
+  name: sriov-net
+  namespace: default
+  annotations:
+    k8s.v1.cni.cncf.io/resourceName: sriov/vf
+spec:
+  config: |-
+    {
+      "cniVersion": "1.0.0",
+      "name": "sriov-net",
+      "type": "sriov",
+      "vlan": 0,
+      "spoofchk": "on",
+      "trust": "on",
+      "vlanQoS": 0,
+      "logLevel": "info",
+      "ipam": {
+        "type": "host-local",
+        "ranges": [
+          [
+            {
+              "subnet": "10.0.2.0/24"
+            }
+          ]
+        ]
+      }
+    }
 EOF
 ```
 
-#### Create network attachment definition with resource name
+#### Create Device Class
 
-The `k8s.v1.cni.cncf.io/resourceName` should match the `ResourceClass` name defined in the section above.
-In this example it is `sf-pool-1`. Multus query the K8s PodResource API to fetch the `resourceClass` name and also
-query the NetworkAttachmentDefinition `k8s.v1.cni.cncf.io/resourceName`. If both has the same name multus send the
-CDI device name in the DeviceID argument.
-
-##### NetworkAttachmentDefinition for ovn-kubernetes example:
-
-Following command creates NetworkAttachmentDefinition. CNI config is in `config:` field.
+The following command creates a `DeviceClass` for the `ResourceClaimTemplate` to request devices from.
 
 ```
 # Execute following command at Kubernetes master
 cat <<EOF | kubectl create -f -
-apiVersion: "k8s.cni.cncf.io/v1"
-kind: NetworkAttachmentDefinition
+apiVersion: resource.k8s.io/v1
+kind: DeviceClass
 metadata:
-  name: default
-  annotations:
-    k8s.v1.cni.cncf.io/resourceName: sf-pool-1
+  name: sriovnetwork.openshift.io
 spec:
-  config: '{
-    "cniVersion": "0.4.0",
-    "dns": {},
-    "ipam": {},
-    "logFile": "/var/log/ovn-kubernetes/ovn-k8s-cni-overlay.log",
-    "logLevel": "4",
-    "logfile-maxage": 5,
-    "logfile-maxbackups": 5,
-    "logfile-maxsize": 100,
-    "name": "ovn-kubernetes",
-    "type": "ovn-k8s-cni-overlay"
-  }'
+  selectors:
+  - cel:
+      expression: device.driver == "sriovnetwork.openshift.io"
 EOF
 ```
 
-#### Create DRA Resource Claim
+#### Create DRA Resource Claim Template
 
-Following command creates `ResourceClaim` `sf` which request resource from `ResourceClass` `sf-pool-1`.
+Following command creates a `ResourceClaimTemplate` that requests a VF device from the SR-IOV device class.
+Note the `name: vf` in the requests section, which corresponds to the second part of the resourceName annotation.
 
 ```
 # Execute following command at Kubernetes master
 cat <<EOF | kubectl create -f -
-apiVersion: resource.k8s.io/v1alpha2
-kind: ResourceClaim
+apiVersion: resource.k8s.io/v1
+kind: ResourceClaimTemplate
 metadata:
   namespace: default
-  name: sf
+  name: sriov-template
 spec:
   spec:
-    resourceClassName: sf-pool-1
+    devices:
+      requests:
+      - name: vf
+        deviceClassName: sriovnetwork.openshift.io
 EOF
 ```
 
 #### Launch pod with DRA Resource Claim
 
-Following command Launch a Pod with primiry network `default` and `ResourceClaim` `sf`.
+Following command launches a Pod with the secondary network `sriov-net` and a DRA resource claim named `sriov`.
+The resourceClaim name `sriov` matches the first part of the NetworkAttachmentDefinition's resourceName annotation.
 
 ```
+# Execute following command at Kubernetes master
+cat <<EOF | kubectl create -f -
 apiVersion: v1
 kind: Pod
 metadata:
   namespace: default
-  name: test-sf-claim
+  name: sriov-pod
   annotations:
-    v1.multus-cni.io/default-network: default
+    k8s.v1.cni.cncf.io/networks: sriov-net
 spec:
   restartPolicy: Always
   containers:
@@ -759,9 +779,15 @@ spec:
     command: ["/bin/sh", "-ec", "while :; do echo '.'; sleep 5 ; done"]
     resources:
       claims:
-      - name: resource
+      - name: sriov
   resourceClaims:
-  - name: resource
-    source:
-      resourceClaimName: sf
+  - name: sriov
+    resourceClaimTemplateName: sriov-template
+EOF
 ```
+
+In this example:
+- The pod has a resourceClaim named `sriov` that uses the `sriov-template`
+- The ResourceClaimTemplate has a device request named `vf`
+- The NetworkAttachmentDefinition has `resourceName: sriov/vf` which combines both names
+- Multus will match these and provide the allocated deviceID to the SR-IOV CNI plugin
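
The matching described in the documentation change above can be sketched in a few lines of Go. This is only an illustration under stated assumptions, not the code in this commit's `draclient` package: the `allocatedDevice` type and the `parseResourceName` and `findDeviceID` helpers are hypothetical names, the device list is a hand-built stand-in for what would be read from the ResourceClaim and ResourceSlice APIs, and the PCI address is a made-up value. Only the `<claim-name>/<request-name>` format and the `k8s.cni.cncf.io/deviceID` attribute name come from the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// deviceIDAttribute is the attribute the updated docs require DRA drivers to expose.
const deviceIDAttribute = "k8s.cni.cncf.io/deviceID"

// allocatedDevice is a hypothetical, simplified view of one allocation result:
// the pod-level claim name, the request name inside the claim template, and
// the attributes published for the allocated device.
type allocatedDevice struct {
	ClaimName   string
	RequestName string
	Attributes  map[string]string
}

// parseResourceName splits "<claim-name>/<request-name>" into its two parts
// and rejects values that do not have exactly two non-empty segments.
func parseResourceName(resourceName string) (claim, request string, err error) {
	parts := strings.Split(resourceName, "/")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("resourceName %q is not in <claim-name>/<request-name> form", resourceName)
	}
	return parts[0], parts[1], nil
}

// findDeviceID returns the deviceID attribute of the first allocated device
// whose claim and request names match the resourceName annotation value.
func findDeviceID(resourceName string, devices []allocatedDevice) (string, error) {
	claim, request, err := parseResourceName(resourceName)
	if err != nil {
		return "", err
	}
	for _, d := range devices {
		if d.ClaimName != claim || d.RequestName != request {
			continue
		}
		id, ok := d.Attributes[deviceIDAttribute]
		if !ok {
			return "", fmt.Errorf("device for %q does not expose %s", resourceName, deviceIDAttribute)
		}
		return id, nil
	}
	return "", fmt.Errorf("no allocated device matches %q", resourceName)
}

func main() {
	// Illustrative data only: one device allocated for claim "sriov", request "vf".
	devices := []allocatedDevice{{
		ClaimName:   "sriov",
		RequestName: "vf",
		Attributes:  map[string]string{deviceIDAttribute: "0000:03:00.2"},
	}}

	id, err := findDeviceID("sriov/vf", devices)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id) // prints "0000:03:00.2"
}
```

With the example manifests above, the annotation value `sriov/vf` selects the device allocated for the pod claim `sriov` and the request `vf`; a value without a slash, or one naming a claim the pod does not have, yields an error instead of a device ID.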

go.mod

Lines changed: 21 additions & 17 deletions
@@ -1,6 +1,8 @@
 module gopkg.in/k8snetworkplumbingwg/multus-cni.v4
 
-go 1.23.4
+go 1.24.0
+
+toolchain go1.24.6
 
 require (
 	github.com/blang/semver v3.5.1+incompatible
@@ -17,58 +19,60 @@ require (
 	golang.org/x/sys v0.33.0
 	google.golang.org/grpc v1.73.0
 	gopkg.in/natefinch/lumberjack.v2 v2.2.1
-	k8s.io/api v0.32.5
-	k8s.io/apimachinery v0.32.5
-	k8s.io/client-go v0.32.5
+	k8s.io/api v0.34.1
+	k8s.io/apimachinery v0.34.1
+	k8s.io/client-go v0.34.1
 	k8s.io/klog v1.0.0
 	k8s.io/klog/v2 v2.130.1
-	k8s.io/kubelet v0.32.5
+	k8s.io/kubelet v0.34.1
 )
 
 require (
 	github.com/beorn7/perks v1.0.1 // indirect
 	github.com/cespare/xxhash/v2 v2.3.0 // indirect
 	github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc // indirect
-	github.com/emicklei/go-restful/v3 v3.11.0 // indirect
-	github.com/fxamacker/cbor/v2 v2.7.0 // indirect
+	github.com/emicklei/go-restful/v3 v3.12.2 // indirect
+	github.com/fxamacker/cbor/v2 v2.9.0 // indirect
 	github.com/go-logr/logr v1.4.2 // indirect
 	github.com/go-openapi/jsonpointer v0.21.0 // indirect
 	github.com/go-openapi/jsonreference v0.20.2 // indirect
 	github.com/go-openapi/swag v0.23.0 // indirect
 	github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
 	github.com/gogo/protobuf v1.3.2 // indirect
-	github.com/golang/protobuf v1.5.4 // indirect
-	github.com/google/gnostic-models v0.6.9-0.20230804172637-c7be7c783f49 // indirect
+	github.com/google/gnostic-models v0.7.0 // indirect
 	github.com/google/go-cmp v0.7.0 // indirect
-	github.com/google/gofuzz v1.2.0 // indirect
 	github.com/google/pprof v0.0.0-20250403155104-27863c87afa6 // indirect
 	github.com/google/uuid v1.6.0 // indirect
 	github.com/josharian/intern v1.0.0 // indirect
 	github.com/json-iterator/go v1.1.12 // indirect
 	github.com/mailru/easyjson v0.7.7 // indirect
 	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
-	github.com/modern-go/reflect2 v1.0.2 // indirect
+	github.com/modern-go/reflect2 v1.0.3-0.20250322232337-35a7c28c31ee // indirect
 	github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
 	github.com/pkg/errors v0.9.1 // indirect
+	github.com/pmezard/go-difflib v1.0.0 // indirect
 	github.com/prometheus/client_model v0.6.1 // indirect
 	github.com/prometheus/common v0.62.0 // indirect
 	github.com/prometheus/procfs v0.15.1 // indirect
 	github.com/vishvananda/netns v0.0.5 // indirect
 	github.com/x448/float16 v0.8.4 // indirect
 	go.uber.org/automaxprocs v1.6.0 // indirect
+	go.yaml.in/yaml/v2 v2.4.2 // indirect
+	go.yaml.in/yaml/v3 v3.0.4 // indirect
 	golang.org/x/oauth2 v0.28.0 // indirect
 	golang.org/x/term v0.32.0 // indirect
 	golang.org/x/text v0.26.0 // indirect
-	golang.org/x/time v0.7.0 // indirect
+	golang.org/x/time v0.9.0 // indirect
 	golang.org/x/tools v0.33.0 // indirect
 	google.golang.org/genproto/googleapis/rpc v0.0.0-20250324211829-b45e905df463 // indirect
 	google.golang.org/protobuf v1.36.6 // indirect
 	gopkg.in/evanphx/json-patch.v4 v4.12.0 // indirect
 	gopkg.in/inf.v0 v0.9.1 // indirect
 	gopkg.in/yaml.v3 v3.0.1 // indirect
-	k8s.io/kube-openapi v0.0.0-20241105132330-32ad38e42d3f // indirect
-	k8s.io/utils v0.0.0-20241104100929-3ea5e8cea738 // indirect
-	sigs.k8s.io/json v0.0.0-20241010143419-9aa6b5e7a4b3 // indirect
-	sigs.k8s.io/structured-merge-diff/v4 v4.4.2 // indirect
-	sigs.k8s.io/yaml v1.4.0 // indirect
+	k8s.io/kube-openapi v0.0.0-20250710124328-f3f2b991d03b // indirect
+	k8s.io/utils v0.0.0-20250604170112-4c0f3b243397 // indirect
+	sigs.k8s.io/json v0.0.0-20241014173422-cfa47c3a1cc8 // indirect
+	sigs.k8s.io/randfill v1.0.0 // indirect
+	sigs.k8s.io/structured-merge-diff/v6 v6.3.0 // indirect
+	sigs.k8s.io/yaml v1.6.0 // indirect
 )
