To achieve a specific Data Plane Development Kit (DPDK) line rate, deploy a Node Tuning Operator and configure Single Root I/O Virtualization (SR-IOV). You must also tune the DPDK settings for the following resources:
- Isolated CPUs
- Hugepages
- The topology scheduler
[NOTE]
====
In previous versions of {product-title}, the Performance Addon Operator was used to implement automatic tuning to achieve low latency performance for {product-title} applications. In {product-title} 4.11 and later, this functionality is part of the Node Tuning Operator.
====
.DPDK test environment
The following diagram shows the components of a traffic-testing environment:
image::261_OpenShift_DPDK_0722.png[DPDK test environment]
- **Traffic generator**: An application that can generate high-volume packet traffic.
- **SR-IOV-supporting NIC**: A network interface card compatible with SR-IOV. The card runs a number of virtual functions on a physical interface.
- **Physical Function (PF)**: A PCI Express (PCIe) function of a network adapter that supports the SR-IOV interface.
- **Virtual Function (VF)**: A lightweight PCIe function on a network adapter that supports SR-IOV. The VF is associated with the PCIe PF on the network adapter. The VF represents a virtualized instance of the network adapter.
- **Switch**: A network switch. Nodes can also be connected back-to-back.
- **`testpmd`**: An example application included with DPDK. The `testpmd` application can be used to test DPDK in packet-forwarding mode. The `testpmd` application is also an example of how to build a fully-fledged application using the DPDK Software Development Kit (SDK).
- **worker 0** and **worker 1**: {product-title} nodes.
<1> You can use a different IP Address Management (IPAM) implementation, such as Whereabouts. For more information, see _Dynamic IP address assignment configuration with Whereabouts_.
<2> Specify the `networkNamespace` where the network attachment definition is created. You must create the `sriovNetwork` CR in the `openshift-sriov-network-operator` namespace.
<3> The `resourceName` value must match that of the `resourceName` created under the `sriovNetworkNodePolicy`.
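
A minimal sketch of an `SriovNetwork` object with these fields follows. The object name, target namespace, address range, and resource name are assumed placeholder values:

[source,yaml]
----
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: dpdk-network-1        # assumed placeholder name
  namespace: openshift-sriov-network-operator
spec:
  ipam: |
    {
      "type": "whereabouts",
      "range": "192.168.1.0/24"   # assumed placeholder range
    }
  networkNamespace: dpdk-test    # assumed placeholder target namespace
  resourceName: dpdk_nic_1       # must match the resourceName in the SriovNetworkNodePolicy
----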
<1> Request the SR-IOV networks you need. Resources for the devices will be injected automatically.
<2> Disable the CPU and IRQ load balancing. See _Disabling interrupt processing for individual pods_ for more information.
<3> Set the `runtimeClass` to `performance-performance`. Do not set the `runtimeClass` to `HostNetwork` or `privileged`.
<4> Request an equal number of resources for requests and limits to start the pod with `Guaranteed` Quality of Service (QoS).
[NOTE]
====
Do not start the pod with `SLEEP` and then exec into the pod to start the testpmd or the DPDK workload. This can add additional interrupts as the `exec` process is not pinned to any CPU.
====
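
A minimal sketch of a DPDK pod that applies these settings follows. The pod name, namespace, network names, container image, and resource quantities are assumed placeholder values, and the runtime class name assumes a `PerformanceProfile` named `performance`:

[source,yaml]
----
apiVersion: v1
kind: Pod
metadata:
  name: testpmd-dpdk                 # assumed placeholder name
  namespace: dpdk-test               # assumed placeholder namespace
  annotations:
    k8s.v1.cni.cncf.io/networks: dpdk-network-1,dpdk-network-2  # request the SR-IOV networks you need
    cpu-load-balancing.crio.io: "disable"   # disable CPU load balancing for this pod
    irq-load-balancing.crio.io: "disable"   # disable IRQ load balancing for this pod
spec:
  runtimeClassName: performance-performance
  containers:
  - name: dpdk
    image: <dpdk_image>              # assumed placeholder image
    resources:                       # equal requests and limits for Guaranteed QoS
      requests:
        cpu: "16"
        memory: "1Gi"
        hugepages-1Gi: "8Gi"
      limits:
        cpu: "16"
        memory: "1Gi"
        hugepages-1Gi: "8Gi"
----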
= Using a virtual function in DPDK mode with a Mellanox NIC
You can create a network node policy and create a Data Plane Development Kit (DPDK) pod using a virtual function in DPDK mode with a Mellanox NIC.
.Prerequisites
* You have installed the OpenShift CLI (`oc`).
* You have installed the Single Root I/O Virtualization (SR-IOV) Network Operator.
* You have logged in as a user with `cluster-admin` privileges.
.Procedure
. Save the following `SriovNetworkNodePolicy` YAML configuration to an `mlx-dpdk-node-policy.yaml` file:
+
[source,yaml]
----
spec:
# ...
  deviceType: netdevice <2>
  isRdma: true <3>
----
<1> Specify the device hex code of the SR-IOV network device. The only allowed values for Mellanox cards are `1015` and `1017`.
<2> Specify the driver type for the virtual functions to `netdevice`. A Mellanox SR-IOV Virtual Function (VF) can work in DPDK mode without using the `vfio-pci` device type. The VF device appears as a kernel network interface inside a container.
<3> Enable Remote Direct Memory Access (RDMA) mode. This is required for Mellanox cards to work in DPDK mode.
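+
For reference, a complete policy of this shape might look like the following sketch. The node selector, priority, number of VFs, and PCI root device are assumed placeholder values:
+
[source,yaml]
----
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetworkNodePolicy
metadata:
  name: mlx-dpdk-node-policy
  namespace: openshift-sriov-network-operator
spec:
  resourceName: mlxnics
  nodeSelector:                     # assumed placeholder selector
    feature.node.kubernetes.io/network-sriov.capable: "true"
  priority: 99                      # assumed placeholder priority
  numVfs: 10                        # assumed placeholder VF count
  nicSelector:
    vendor: "15b3"
    deviceID: "1015"
    rootDevices: ["0000:5e:00.0"]   # assumed placeholder PCI address
  deviceType: netdevice
  isRdma: true
----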
+
[NOTE]
=====
See _Configuring an SR-IOV network device_ for a detailed explanation of each option in the `SriovNetworkNodePolicy` object.

When applying the configuration specified in an `SriovNetworkNodePolicy` object, the SR-IOV Operator might drain the nodes, and in some cases, reboot nodes.
It might take several minutes for a configuration change to apply.
Ensure that there are enough available nodes in your cluster to handle the evicted workload beforehand.
After the configuration update is applied, all the pods in the `openshift-sriov-network-operator` namespace will change to a `Running` status.
=====

. Create the `SriovNetworkNodePolicy` object by running the following command:
+
[source,terminal]
----
$ oc create -f mlx-dpdk-node-policy.yaml
----
. Save the following `SriovNetwork` YAML configuration to an `mlx-dpdk-network.yaml` file:
+
[source,yaml]
----
apiVersion: sriovnetwork.openshift.io/v1
kind: SriovNetwork
metadata:
  name: mlx-dpdk-network
  namespace: openshift-sriov-network-operator
spec:
  networkNamespace: <target_namespace>
  ipam: |- <1>
...
  vlan: <vlan>
  resourceName: mlxnics
----
<1> Specify a configuration object for the IP Address Management (IPAM) Container Network Interface (CNI) plug-in as a YAML block scalar. The plug-in manages IP address assignment for the attachment definition.
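+
For example, the `ipam` block scalar can hold a `host-local` configuration such as the following sketch; the subnet, address range, and gateway are assumed placeholder values:
+
[source,yaml]
----
  ipam: |-
    {
      "type": "host-local",
      "subnet": "10.56.217.0/24",
      "rangeStart": "10.56.217.171",
      "rangeEnd": "10.56.217.181",
      "gateway": "10.56.217.1"
    }
----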
+
[NOTE]
=====
See _Configuring an SR-IOV network device_ for a detailed explanation of each option in the `SriovNetwork` object.
=====
+
The `app-netutil` optional library provides several API methods for gathering network information about the parent pod of a container.
. Create the `SriovNetwork` object by running the following command:
+
[source,terminal]
----
$ oc create -f mlx-dpdk-network.yaml
----
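+
Optionally, you can verify that the corresponding network attachment definition was created in the target namespace, for example:
+
[source,terminal]
----
$ oc get network-attachment-definitions -n <target_namespace>
----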
. Save the following `Pod` YAML configuration to an `mlx-dpdk-pod.yaml` file:
+
[source,yaml]
----
spec:
# ...
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages
----
<1> Specify the same `target_namespace` where the `SriovNetwork` object `mlx-dpdk-network` is created. To create the pod in a different namespace, change `target_namespace` in both the `Pod` spec and the `SriovNetwork` object.
<2> Specify the DPDK image which includes your application and the DPDK library used by the application.
<3> Specify additional capabilities required by the application inside the container for hugepage allocation, system resource allocation, and network interface access.
<4> Mount the hugepage volume to the DPDK pod under `/dev/hugepages`. The hugepage volume is backed by the `emptyDir` volume type with the medium being `Hugepages`.
<5> Optional: Specify the number of DPDK devices allocated for the DPDK pod. If not explicitly specified, this resource request and limit is automatically added by the SR-IOV network resource injector. The SR-IOV network resource injector is an admission controller component managed by the SR-IOV Operator. It is enabled by default and can be disabled by setting the `enableInjector` option to `false` in the default `SriovOperatorConfig` CR.
<6> Specify the number of CPUs. The DPDK pod usually requires that exclusive CPUs be allocated from the kubelet. To do this, set the CPU Manager policy to `static` and create a pod with `Guaranteed` Quality of Service (QoS).
<7> Specify hugepage size `hugepages-1Gi` or `hugepages-2Mi` and the quantity of hugepages that will be allocated to the DPDK pod. Configure `2Mi` and `1Gi` hugepages separately. Configuring `1Gi` hugepages requires adding kernel arguments to nodes.
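+
For example, the resources section of the DPDK pod that these callouts describe might look like the following sketch, assuming the default `openshift.io/` resource prefix; the CPU, memory, and hugepage quantities are assumed placeholder values:
+
[source,yaml]
----
    resources:
      requests:
        openshift.io/mlxnics: "1"   # SR-IOV device resource
        memory: "1Gi"
        cpu: "4"
        hugepages-1Gi: "4Gi"
      limits:                       # equal to requests for Guaranteed QoS
        openshift.io/mlxnics: "1"
        memory: "1Gi"
        cpu: "4"
        hugepages-1Gi: "4Gi"
----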
. Create the DPDK pod by running the following command:
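+
[source,terminal]
----
$ oc create -f mlx-dpdk-pod.yaml
----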

[source,terminal]
----
dpdk-testpmd -l ${CPU} -a ${PCIDEVICE_OPENSHIFT_IO_DPDK_NIC_1} -a ${PCIDEVICE_OPENSHIFT_IO_DPDK_NIC_2} -n 4 -- -i --nb-cores=15 --rxd=4096 --txd=4096 --rxq=7 --txq=7 --forward-mode=mac --eth-peer=0,50:00:00:00:00:01 --eth-peer=1,50:00:00:00:00:02
----
This example uses two different `sriovNetwork` CRs. The environment variables contain the Virtual Function (VF) PCI addresses that were allocated for the pod. If you use the same network in the pod definition, you must split the `pciAddress`.
It is important to configure the correct MAC addresses of the traffic generator. This example uses custom MAC addresses.
= Using SR-IOV and the Node Tuning Operator to achieve a DPDK line rate
You can use the Node Tuning Operator to configure isolated CPUs, hugepages, and a topology scheduler.
You can then use the Node Tuning Operator with Single Root I/O Virtualization (SR-IOV) to achieve a specific Data Plane Development Kit (DPDK) line rate.
.Prerequisites
* You have installed the OpenShift CLI (`oc`).
* You have installed the SR-IOV Network Operator.
* You have logged in as a user with `cluster-admin` privileges.
* You have deployed a standalone Node Tuning Operator.
+
[NOTE]
====
In previous versions of {product-title}, the Performance Addon Operator was used to implement automatic tuning to achieve low latency performance for {product-title} applications. In {product-title} 4.11 and later, this functionality is part of the Node Tuning Operator.
====
.Procedure
. Create a `PerformanceProfile` object based on the following example:
+
[source,yaml]
----
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  name: performance
spec:
  globallyDisableIrqLoadBalancing: true
  cpu:
    isolated: 21-51,73-103 <1>
    reserved: 0-20,52-72 <2>
  hugepages:
    defaultHugepagesSize: 1G <3>
    pages:
    - count: 32
      size: 1G
  net:
    userLevelNetworking: true
  numa:
    topologyPolicy: "single-numa-node"
  nodeSelector:
    node-role.kubernetes.io/worker-cnf: ""
----
<1> If hyperthreading is enabled on the system, allocate the relevant sibling threads to the `isolated` and `reserved` CPU groups. If the system contains multiple non-uniform memory access (NUMA) nodes, allocate CPUs from both NUMA nodes to both groups. You can also use the Performance Profile Creator for this task. For more information, see _Creating a performance profile_.
<2> You can also specify a list of devices that will have their queues set to the reserved CPU count. For more information, see _Reducing NIC queues using the Node Tuning Operator_.
<3> Allocate the number and size of hugepages needed. You can specify the NUMA configuration for the hugepages. By default, the system allocates an even number to every NUMA node on the system. If needed, you can request the use of a realtime kernel for the nodes. See _Provisioning a worker with real-time capabilities_ for more information.
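+
For example, to place hugepages on specific NUMA nodes, the `pages` entries can include a `node` field; the counts shown here are assumed placeholder values:
+
[source,yaml]
----
  hugepages:
    defaultHugepagesSize: 1G
    pages:
    - count: 16      # assumed placeholder count
      size: 1G
      node: 0        # allocate these pages on NUMA node 0
    - count: 16      # assumed placeholder count
      size: 1G
      node: 1        # allocate these pages on NUMA node 1
----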
. Save the `yaml` file as `mlx-dpdk-perfprofile-policy.yaml`.
. Apply the performance profile using the following command:
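+
[source,terminal]
----
$ oc apply -f mlx-dpdk-perfprofile-policy.yaml
----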