Commit ebe3909

Merge pull request #56593 from sjhala-ccs/cnv-18331
CNV-18331: Cluster DPDK readiness checkup
2 parents 4e2dae9 + 909f849 commit ebe3909

3 files changed

+299
-0
lines changed

Lines changed: 212 additions & 0 deletions
@@ -0,0 +1,212 @@
// Module included in the following assemblies:
//
// * virt/support/monitoring/virt-running-cluster-checkups.adoc

:_content-type: PROCEDURE
[id="virt-checking-cluster-dpdk-readiness_{context}"]
= Checking cluster readiness to run DPDK applications with zero packet loss

Use a predefined checkup to verify that your {product-title} cluster node can run a virtual machine (VM) with a Data Plane Development Kit (DPDK) workload. The checkup runs traffic between a traffic generator pod and a VM that runs a test DPDK application, and then checks for packet loss.

.Prerequisites
* You have access to the cluster as a user with `cluster-admin` permissions.
* You have installed the OpenShift CLI (`oc`).
* You have configured the compute nodes to run DPDK applications on VMs with zero packet loss.

.Procedure
. Create a manifest file that contains the `ServiceAccount`, `Role`, and `RoleBinding` objects with permissions that the checkup requires for cluster access:
+
.Example roles manifest
[%collapsible]
====
[source,yaml]
----
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: dpdk-checkup-sa
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kiagnose-configmap-access
rules:
  - apiGroups: [ "" ]
    resources: [ "configmaps" ]
    verbs: [ "get", "update" ]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kiagnose-configmap-access
subjects:
  - kind: ServiceAccount
    name: dpdk-checkup-sa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kiagnose-configmap-access
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kubevirt-dpdk-checker
rules:
  - apiGroups: [ "kubevirt.io" ]
    resources: [ "virtualmachineinstances" ]
    verbs: [ "create", "get", "delete" ]
  - apiGroups: [ "subresources.kubevirt.io" ]
    resources: [ "virtualmachineinstances/console" ]
    verbs: [ "get" ]
  - apiGroups: [ "" ]
    resources: [ "pods" ]
    verbs: [ "create", "get", "delete" ]
  - apiGroups: [ "" ]
    resources: [ "pods/exec" ]
    verbs: [ "create" ]
  - apiGroups: [ "k8s.cni.cncf.io" ]
    resources: [ "network-attachment-definitions" ]
    verbs: [ "get" ]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kubevirt-dpdk-checker
subjects:
  - kind: ServiceAccount
    name: dpdk-checkup-sa
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kubevirt-dpdk-checker
----
====

. Apply the checkup roles manifest:
+
[source,terminal]
----
$ oc apply -n <target_namespace> -f <dpdk_roles>.yaml
----

. Create a `ConfigMap` manifest that contains the input parameters for the checkup. The config map also stores the results of the checkup.
+
.Example input config map
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: dpdk-checkup-config
data:
  spec.timeout: 10m
  spec.param.networkAttachmentDefinitionName: <network_name> <1>
  spec.param.trafficGeneratorRuntimeClassName: <runtimeclass_name> <2>
----
<1> The name of the `NetworkAttachmentDefinition` object.
<2> The `RuntimeClass` resource that the traffic generator pod uses.

. Apply the config map manifest in the target namespace:
+
[source,terminal]
----
$ oc apply -n <target_namespace> -f <dpdk_config_map>.yaml
----

. Create a `Job` object to run the checkup:
+
.Example job manifest
[source,yaml]
----
apiVersion: batch/v1
kind: Job
metadata:
  name: dpdk-checkup
spec:
  backoffLimit: 0
  template:
    spec:
      serviceAccountName: dpdk-checkup-sa
      restartPolicy: Never
      containers:
        - name: dpdk-checkup
          image: brew.registry.redhat.io/rh-osbs/container-native-virtualization-kubevirt-dpdk-checkup-rhel9:v4.13.0
          imagePullPolicy: Always
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
            runAsNonRoot: true
            seccompProfile:
              type: "RuntimeDefault"
          env:
            - name: CONFIGMAP_NAMESPACE
              value: <target_namespace>
            - name: CONFIGMAP_NAME
              value: dpdk-checkup-config
            - name: POD_UID
              valueFrom:
                fieldRef:
                  fieldPath: metadata.uid
----

. Apply the `Job` manifest:
+
[source,terminal]
----
$ oc apply -n <target_namespace> -f <dpdk_job>.yaml
----

. Wait for the job to complete:
+
[source,terminal]
----
$ oc wait job dpdk-checkup -n <target_namespace> --for condition=complete --timeout 10m
----

. Review the results of the checkup by running the following command:
+
[source,terminal]
----
$ oc get configmap dpdk-checkup-config -n <target_namespace> -o yaml
----
+
.Example output config map (success)
[source,yaml]
----
apiVersion: v1
kind: ConfigMap
metadata:
  name: dpdk-checkup-config
data:
  spec.timeout: 1h2m
  spec.param.networkAttachmentDefinitionName: "mlx-dpdk-network-1"
  spec.param.trafficGeneratorRuntimeClassName: performance-performance-zeus10
  status.succeeded: true
  status.failureReason: " "
  status.startTimestamp: 2022-12-21T09:33:06+00:00
  status.completionTimestamp: 2022-12-21T11:33:06+00:00
  status.result.actualTrafficGeneratorTargetNode: worker-dpdk1
  status.result.actualDPDKVMTargetNode: worker-dpdk2
  status.result.dropRate: 0
----

. Delete the job and config map resources that you previously created by running the following commands:
+
[source,terminal]
----
$ oc delete job -n <target_namespace> dpdk-checkup
----
+
[source,terminal]
----
$ oc delete configmap -n <target_namespace> dpdk-checkup-config
----

. Optional: If you do not plan to run another checkup, delete the resources that the checkup roles manifest defines:
+
[source,terminal]
----
$ oc delete -n <target_namespace> -f <dpdk_roles>.yaml
----
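
The review step in this procedure can also be scripted. The following is a minimal sketch, not part of the documented checkup: the `evaluate_checkup` function is a hypothetical helper, and treating any non-zero `status.result.dropRate` as a failure is an assumption based on the zero packet loss goal. The commented `oc` lines assume that the `jsonpath` output format accepts backslash-escaped dots in key names.

```shell
# Hypothetical helper: decide whether a checkup run passed, given two values
# read from the data stanza of the dpdk-checkup-config config map.
#   $1 - value of status.succeeded
#   $2 - value of status.result.dropRate (0 in the success example)
evaluate_checkup() {
  local succeeded="$1" drop_rate="$2"
  if [ "$succeeded" = "true" ] && [ "$drop_rate" = "0" ]; then
    echo "PASS"
  else
    echo "FAIL (succeeded=${succeeded}, dropRate=${drop_rate})"
    return 1
  fi
}

# Example wiring (requires cluster access, so it is left commented out):
# succeeded=$(oc get configmap dpdk-checkup-config -n <target_namespace> \
#   -o jsonpath='{.data.status\.succeeded}')
# drop_rate=$(oc get configmap dpdk-checkup-config -n <target_namespace> \
#   -o jsonpath='{.data.status\.result\.dropRate}')
# evaluate_checkup "$succeeded" "$drop_rate"
```

Keeping the decision logic separate from the `oc` calls means that the function can be exercised without a cluster. If the function reports a failure, the `status.failureReason` key in the same config map is the next place to look.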
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
// Module included in the following assemblies:
//
// * virt/support/monitoring/virt-running-cluster-checkups.adoc

:_content-type: REFERENCE
[id="virt-dpdk-config-map-parameters_{context}"]
= DPDK checkup config map parameters

The following table shows the mandatory and optional parameters that you can set in the `data` stanza of the input `ConfigMap` manifest when you run a cluster DPDK readiness checkup:

.DPDK checkup config map parameters
[cols="1,1,1", options="header"]
|====
|Parameter
|Description
|Is Mandatory

|`spec.timeout`
|The time, in minutes, before the checkup fails.
|True

|`spec.param.networkAttachmentDefinitionName`
|The name of the `NetworkAttachmentDefinition` object for the connected SR-IOV NICs.
|True

|`spec.param.trafficGeneratorRuntimeClassName`
|The `RuntimeClass` resource that the traffic generator pod uses.
|True

|`spec.param.trafficGeneratorImage`
|The container image for the traffic generator. The default value is `quay.io/kiagnose/kubevirt-dpdk-checkup-traffic-gen:main`.
|False

|`spec.param.trafficGeneratorNodeSelector`
|The node on which the traffic generator pod is to be scheduled. The node should be configured to allow DPDK traffic.
|False

|`spec.param.trafficGeneratorPacketsPerSecond`
|The number of packets per second, in kilo (k) or million (m). The default value is 14m.
|False

|`spec.param.trafficGeneratorEastMacAddress`
|The MAC address of the NIC connected to the traffic generator pod or VM. The default value is a random MAC address in the format `50:xx:xx:xx:xx:01`.
|False

|`spec.param.trafficGeneratorWestMacAddress`
|The MAC address of the NIC connected to the traffic generator pod or VM. The default value is a random MAC address in the format `50:xx:xx:xx:xx:02`.
|False

|`spec.param.vmContainerDiskImage`
|The container disk image for the VM. The default value is `quay.io/kiagnose/kubevirt-dpdk-checkup-vm:main`.
|False

|`spec.param.DPDKLabelSelector`
|The label of the node on which the VM runs. The node should be configured to allow DPDK traffic.
|False

|`spec.param.DPDKEastMacAddress`
|The MAC address of the NIC that is connected to the VM. The default value is a random MAC address in the format `60:xx:xx:xx:xx:01`.
|False

|`spec.param.DPDKWestMacAddress`
|The MAC address of the NIC that is connected to the VM. The default value is a random MAC address in the format `60:xx:xx:xx:xx:02`.
|False

|`spec.param.testDuration`
|The duration, in minutes, for which the traffic generator runs. The default value is 5 minutes.
|False

|`spec.param.portBandwidthGB`
|The maximum bandwidth of the SR-IOV NIC. The default value is 10GB.
|False

|`spec.param.verbose`
|When set to `true`, increases the verbosity of the checkup log. The default value is `false`.
|False
|====
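
As an illustration of how the mandatory parameters and one optional parameter fit together, the following sketch prints an input config map manifest. The `render_dpdk_config_map` function and its argument order are hypothetical and are not part of the checkup. The `10m` default mirrors the example input config map, and `spec.param.verbose` is quoted because config map `data` values must be strings.

```shell
# Hypothetical helper: print an input ConfigMap manifest for the DPDK checkup.
#   $1 - NetworkAttachmentDefinition name (mandatory)
#   $2 - RuntimeClass name for the traffic generator pod (mandatory)
#   $3 - checkup timeout (assumed default: 10m, as in the example input)
render_dpdk_config_map() {
  local nad_name="$1" runtime_class="$2" timeout="${3:-10m}"
  cat <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: dpdk-checkup-config
data:
  spec.timeout: ${timeout}
  spec.param.networkAttachmentDefinitionName: ${nad_name}
  spec.param.trafficGeneratorRuntimeClassName: ${runtime_class}
  spec.param.verbose: "true"
EOF
}
```

One possible use is to pipe the output directly to `oc apply -n <target_namespace> -f -`, which avoids keeping a separate manifest file for each run.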

virt/support/monitoring/virt-running-cluster-checkups.adoc

Lines changed: 10 additions & 0 deletions
@@ -15,3 +15,13 @@ include::modules/virt-about-cluster-checkup-framework.adoc[leveloffset=+1]

include::modules/virt-measuring-latency-vm-secondary-network.adoc[leveloffset=+1]

include::modules/virt-checking-cluster-dpdk-readiness.adoc[leveloffset=+1]

include::modules/virt-dpdk-config-map-parameters.adoc[leveloffset=+2]

[role="_additional-resources"]
[id="additional-resources_running-cluster-checkups"]
== Additional resources
* xref:../../../virt/virtual_machines/vm_networking/virt-attaching-vm-multiple-networks.adoc#virt-attaching-vm-multiple-networks[Attaching a virtual machine to multiple networks]
* xref:../../../networking/hardware_networks/using-dpdk-and-rdma.adoc#example-vf-use-in-dpdk-mode-intel_using-dpdk-and-rdma[Using a virtual function in DPDK mode with an Intel NIC]
* xref:../../../networking/hardware_networks/using-dpdk-and-rdma.adoc#nw-example-dpdk-line-rate_using-dpdk-and-rdma[Using SR-IOV and the Node Tuning Operator to achieve a DPDK line rate]
