|
| 1 | +// Module included in the following assemblies: |
| 2 | +//CNF-1483 (4.8) |
| 3 | +// * scalability_and_performance/cnf-performance-addon-operator-for-low-latency-nodes.adoc |
| 4 | + |
| 5 | + |
| 6 | +[id="reducing-nic-queues-using-the-performance-addon-operator_{context}"] |
| 7 | += Reducing NIC queues using the Performance Addon Operator |
| 8 | + |
| 9 | +The Performance Addon Operator allows you to adjust the Network Interface Card (NIC) queue count for each network device by configuring the performance profile. Device network queues allow packets to be distributed among different physical queues and each queue gets a separate thread for packet processing. |
| 10 | + |
| 11 | +In real-time or low-latency systems, all unnecessary interrupt request lines (IRQs) that are pinned to the isolated CPUs must be moved to reserved or housekeeping CPUs. |
| 12 | + |
| 13 | +In deployments with applications that require system or {product-title} networking, or in mixed deployments with Data Plane Development Kit (DPDK) workloads, multiple queues are needed to achieve good throughput, and the number of NIC queues should either be adapted or remain unchanged. For example, to achieve low latency, the number of NIC queues for DPDK-based workloads should be reduced to just the number of reserved or housekeeping CPUs. |
| 14 | + |
| 15 | +By default, too many queues are created for each CPU, and these do not fit into the interrupt tables of the housekeeping CPUs when tuning for low latency. Reducing the number of queues makes proper tuning possible. A smaller number of queues means a smaller number of interrupts, which then fit in the IRQ table. |
| 16 | + |
| 17 | +[id="adjusting-nic-queues-with-the-performance-profile_{context}"] |
| 18 | +== Adjusting the NIC queues with the performance profile |
| 19 | + |
| 20 | +The performance profile lets you adjust the queue count for each network device. |
| 21 | + |
| 22 | +Supported network devices are: |
| 23 | + |
| 24 | +* Non-virtual network devices |
| 25 | + |
| 26 | +* Network devices that support multiple queues (channels) |
| 27 | + |
| 28 | +Unsupported network devices are: |
| 29 | + |
| 30 | +* Pure software network interfaces |
| 31 | + |
| 32 | +* Block devices |
| 33 | + |
| 34 | +* Intel DPDK virtual functions |
| 35 | + |
| 36 | +.Prerequisites |
| 37 | + |
| 38 | +* Access to the cluster as a user with the `cluster-admin` role. |
| 39 | +* Install the OpenShift CLI (`oc`). |
| 40 | + |
| 41 | + |
| 42 | +.Procedure |
| 43 | + |
| 44 | +. Log in to the OpenShift Container Platform cluster running the Performance Addon Operator as a user with `cluster-admin` privileges. |
| 45 | + |
| 46 | +. Create a performance profile that is appropriate for your hardware and topology. For more information, see the "Tuning nodes for low latency with the performance profile" section. Alternatively, edit an existing performance profile by using the following command: |
| 47 | ++ |
| 48 | +[source,terminal] |
| 49 | +---- |
| 50 | +$ oc edit performanceprofile <your_profile_name> |
| 51 | +---- |
| 52 | + |
| 53 | +. Populate the `spec` field of the profile with the `net` object. The `net` object can contain two fields: |
| 54 | + |
| 55 | +* `userLevelNetworking` is a required field that specifies a boolean flag. If `true`, the queue count is set to the reserved CPU count for all supported devices. The default is `false`. |
| 56 | +* `devices` is an optional field that specifies a list of devices that will have the queues set to the reserved CPU count. If the device list is empty, the configuration applies to all network devices. The configuration is as follows: |
| 57 | +** `interfaceName`: The interface name. It supports shell-style wildcards, which can be positive or negative. |
| 58 | +*** Example wildcard syntax is as follows: `<string> .*` |
| 59 | +*** Negative rules are prefixed with an exclamation mark. To apply the net queue changes to all devices other than the excluded list, use `!<device>`; for example, `!eno1`. |
| 60 | +** `vendorID`: Network device vendor ID represented as a 16-bit hexadecimal number with a `0x` prefix. |
| 61 | +** `deviceID`: Network device ID (model) represented as a 16-bit hexadecimal number with a `0x` prefix. |
| 62 | ++ |
| 63 | +[NOTE] |
| 64 | +==== |
| 65 | +When a `deviceID` is specified, the `vendorID` must also be defined. A device qualifies if it matches all of the device identifiers specified in a device entry: `interfaceName`, `vendorID`, or a pair of `vendorID` plus `deviceID`. A qualifying network device has its net queues count set to the reserved CPU count. |
| 66 | + |
| 67 | +When two or more device entries are specified, the net queues count is set for any network device that matches one of them. |
| 68 | +==== |
| 69 | ++ |
| 70 | +An example performance profile configuration is shown below: |
| 71 | ++ |
| 72 | +[source,yaml] |
| 73 | +---- |
| 74 | +apiVersion: performance.openshift.io/v2 |
| 75 | +kind: PerformanceProfile |
| 76 | +metadata: |
| 77 | +  name: manual |
| 78 | +spec: |
| 79 | +  cpu: |
| 80 | +    isolated: 3-51,55-103 |
| 81 | +    reserved: 0-2,52-54 |
| 82 | + |
| 83 | +  net: |
| 84 | +    userLevelNetworking: true <1> |
| 85 | + |
| 86 | +  # More examples. Each commented-out stanza below is an alternative |
| 87 | +  # to the net stanza above. Use only one net stanza per profile. |
| 88 | + |
| 89 | +  # net: |
| 90 | +  #   userLevelNetworking: true |
| 91 | +  #   devices: |
| 92 | +  #   - interfaceName: "eth0" |
| 93 | +  #   - interfaceName: "eth1" <2> |
| 94 | +  #   - vendorID: "0x1af4" |
| 95 | +  #     deviceID: "0x1000" |
| 96 | + |
| 97 | +  # net: |
| 98 | +  #   userLevelNetworking: true <3> |
| 99 | +  #   devices: |
| 100 | +  #   - interfaceName: "eth*" |
| 101 | + |
| 102 | +  # net: |
| 103 | +  #   userLevelNetworking: true <4> |
| 104 | +  #   devices: |
| 105 | +  #   - interfaceName: "!eno1" |
| 106 | + |
| 107 | +  # net: |
| 108 | +  #   userLevelNetworking: true <5> |
| 109 | +  #   devices: |
| 110 | +  #   - interfaceName: "eth0" |
| 111 | +  #     vendorID: "0x1af4" |
| 112 | +  #     deviceID: "0x1000" |
| 113 | + |
| 114 | + |
| 115 | + |
| 116 | +  nodeSelector: |
| 117 | +    node-role.kubernetes.io/worker-cnf: "" |
| 118 | + |
| 119 | +---- |
| 120 | +Set the queue count to the reserved CPU count for: |
| 121 | +<1> All devices. |
| 122 | +<2> All devices that match any of the defined device identifiers. |
| 123 | +<3> All devices starting with the interface name `eth`. |
| 124 | +<4> All devices with an interface named anything other than `eno1`. |
| 125 | +<5> All devices that have an interface name `eth0`, `vendorID` 0x1af4, and `deviceID` 0x1000. |
| 126 | + |
| 127 | +. Apply the performance profile. |
| 128 | ++ |
| 129 | +[source,terminal] |
| 130 | +---- |
| 131 | +$ oc apply -f <your_profile_name>.yaml |
| 132 | +---- |
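
The device selection rules described in the procedure above can be modeled in a few lines of Python. This is only an illustrative sketch of the documented behavior, not the Performance Addon Operator's implementation. The function names `entry_matches` and `selects` and the dictionary layout for a device are hypothetical. How several negative rules combine is not modeled here.

```python
from fnmatch import fnmatchcase


def entry_matches(entry, device):
    """Return True if the device satisfies every identifier set in one entry."""
    name_rule = entry.get("interfaceName")
    if name_rule is not None:
        if name_rule.startswith("!"):
            # Negative rule: the device must NOT match the pattern.
            if fnmatchcase(device["interfaceName"], name_rule[1:]):
                return False
        elif not fnmatchcase(device["interfaceName"], name_rule):
            return False
    for key in ("vendorID", "deviceID"):
        if key in entry and entry[key].lower() != device[key].lower():
            return False
    return True


def selects(device_entries, device):
    """An empty list selects every device; otherwise any one entry may match."""
    if not device_entries:
        return True
    return any(entry_matches(entry, device) for entry in device_entries)


# Hypothetical device descriptions used only for illustration.
ens4 = {"interfaceName": "ens4", "vendorID": "0x1af4", "deviceID": "0x1000"}
eth0 = {"interfaceName": "eth0", "vendorID": "0x1001", "deviceID": "0x1002"}

rules = [{"interfaceName": "eth0"}, {"vendorID": "0x1af4"}]
print(selects(rules, ens4), selects(rules, eth0))
```

Within one entry all identifiers must match, while across entries a single match is enough, which mirrors the note in the procedure.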
| 133 | + |
| 134 | +[id="verify-queue-status_{context}"] |
| 135 | +== Verify the queue status |
| 136 | + |
| 137 | +In this section, a number of examples illustrate different performance profiles and how to verify the changes are applied. |
| 138 | + |
| 139 | +.Example 1 |
| 140 | + |
| 141 | +In this example, the net queue count is set to the reserved CPU count (2) for _all_ supported devices. |
| 142 | + |
| 143 | +The relevant section from the performance profile is: |
| 144 | + |
| 145 | +[source,yaml] |
| 146 | +---- |
| 147 | +apiVersion: performance.openshift.io/v2 |
| 148 | +kind: PerformanceProfile |
| 149 | +metadata: |
| 150 | +  name: performance |
| 151 | +spec: |
| 152 | +  cpu: |
| 153 | +    # total reserved CPUs = 2 |
| 154 | +    reserved: 0-1 |
| 155 | +    isolated: 2-8 |
| 156 | +  net: |
| 157 | +    userLevelNetworking: true |
| 158 | +  # [...] |
| 159 | +---- |
| 160 | + |
| 161 | +Display the status of the queues associated with a device using the command: |
| 162 | +[source,terminal] |
| 163 | +---- |
| 164 | +$ ethtool -l <device> |
| 165 | +---- |
| 166 | +[NOTE] |
| 167 | +==== |
| 168 | +Run this command on the node where the performance profile was applied. |
| 169 | +==== |
| 170 | + |
| 171 | +Before the profile is applied, the queue status is: |
| 172 | + |
| 173 | +[source,terminal] |
| 174 | +---- |
| 175 | +# ethtool -l ens4 |
| 176 | +Channel parameters for ens4: |
| 177 | +Pre-set maximums: |
| 178 | +RX: 0 |
| 179 | +TX: 0 |
| 180 | +Other: 0 |
| 181 | +Combined: 4 |
| 182 | +Current hardware settings: |
| 183 | +RX: 0 |
| 184 | +TX: 0 |
| 185 | +Other: 0 |
| 186 | +Combined: 4 |
| 187 | +---- |
| 188 | +After the profile is applied, the queue status is: |
| 189 | + |
| 190 | +[source,terminal] |
| 191 | +---- |
| 192 | +# ethtool -l ens4 |
| 193 | +Channel parameters for ens4: |
| 194 | +Pre-set maximums: |
| 195 | +RX: 0 |
| 196 | +TX: 0 |
| 197 | +Other: 0 |
| 198 | +Combined: 4 |
| 199 | +Current hardware settings: |
| 200 | +RX: 0 |
| 201 | +TX: 0 |
| 202 | +Other: 0 |
| 203 | +Combined: 2 <1> |
| 204 | +---- |
| 205 | +<1> The combined channel shows that the total count of reserved CPUs for _all_ supported devices is 2. This matches what is configured in the performance profile. |
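
The check in Example 1 can also be scripted. The following is a minimal sketch that assumes the `ethtool -l` output has the layout shown above with numeric channel values. The helper names `cpu_count` and `current_combined` are hypothetical and are not part of any OpenShift tooling.

```python
def cpu_count(cpuset):
    """Count the CPUs in a cpuset string such as '0-1' or '0-2,52-54'."""
    total = 0
    for part in cpuset.split(","):
        low, _, high = part.strip().partition("-")
        total += int(high or low) - int(low) + 1
    return total


def current_combined(ethtool_output):
    """Return the Combined value under 'Current hardware settings', or None."""
    in_current = False
    for line in ethtool_output.splitlines():
        line = line.strip()
        if line.startswith("Current hardware settings"):
            in_current = True
        elif in_current and line.startswith("Combined:"):
            value = line.split(":", 1)[1].strip()
            return int(value) if value.isdigit() else None
    return None


# Output copied from the "after" listing in Example 1.
SAMPLE = """Channel parameters for ens4:
Pre-set maximums:
RX: 0
TX: 0
Other: 0
Combined: 4
Current hardware settings:
RX: 0
TX: 0
Other: 0
Combined: 2
"""

# The profile in Example 1 reserves CPUs 0-1.
print(current_combined(SAMPLE) == cpu_count("0-1"))
```

Comparing the two values gives a quick pass or fail signal per device without reading the full listing.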
| 206 | + |
| 207 | +.Example 2 |
| 208 | + |
| 209 | +In this example, the net queue count is set to the reserved CPU count (2) for _all_ supported network devices with a specific `vendorID`. |
| 210 | + |
| 211 | +The relevant section from the performance profile is: |
| 212 | + |
| 213 | +[source,yaml] |
| 214 | +---- |
| 215 | +apiVersion: performance.openshift.io/v2 |
| 216 | +kind: PerformanceProfile |
| 217 | +metadata: |
| 218 | +  name: performance |
| 219 | +spec: |
| 220 | +  cpu: |
| 221 | +    # total reserved CPUs = 2 |
| 222 | +    reserved: 0-1 |
| 223 | +    isolated: 2-8 |
| 224 | +  net: |
| 225 | +    userLevelNetworking: true |
| 226 | +    devices: |
| 227 | +    - vendorID: "0x1af4" |
| 228 | +  # [...] |
| 229 | +---- |
| 230 | + |
| 231 | +Display the status of the queues associated with a device using the command: |
| 232 | +[source,terminal] |
| 233 | +---- |
| 234 | +$ ethtool -l <device> |
| 235 | +---- |
| 236 | +[NOTE] |
| 237 | +==== |
| 238 | +Run this command on the node where the performance profile was applied. |
| 239 | +==== |
| 240 | + |
| 241 | +Verify the queue status after the profile is applied: |
| 242 | + |
| 243 | +[source,terminal] |
| 244 | +---- |
| 245 | +# ethtool -l ens4 |
| 246 | +Channel parameters for ens4: |
| 247 | +Pre-set maximums: |
| 248 | +RX: 0 |
| 249 | +TX: 0 |
| 250 | +Other: 0 |
| 251 | +Combined: 4 |
| 252 | +Current hardware settings: |
| 253 | +RX: 0 |
| 254 | +TX: 0 |
| 255 | +Other: 0 |
| 256 | +Combined: 2 <1> |
| 257 | +---- |
| 258 | + |
| 259 | +<1> The total count of reserved CPUs for all supported devices with `vendorID=0x1af4` is 2. |
| 260 | +For example, if there is another network device `ens2` with `vendorID=0x1af4` it will also have total net queues of 2. This matches what is configured in the performance profile. |
| 261 | + |
| 262 | +.Example 3 |
| 263 | + |
| 264 | +In this example, the net queue count is set to the reserved CPU count (2) for _all_ supported network devices that match any of the defined device identifiers. |
| 265 | + |
| 266 | +The command `udevadm info` provides a detailed report on a device. In this example the devices are: |
| 267 | + |
| 268 | +[source,terminal] |
| 269 | +---- |
| 270 | +# udevadm info -p /sys/class/net/ens4 |
| 271 | +... |
| 272 | +E: ID_MODEL_ID=0x1000 |
| 273 | +E: ID_VENDOR_ID=0x1af4 |
| 274 | +E: INTERFACE=ens4 |
| 275 | +... |
| 276 | +---- |
| 277 | + |
| 278 | +[source,terminal] |
| 279 | +---- |
| 280 | +# udevadm info -p /sys/class/net/eth0 |
| 281 | +... |
| 282 | +E: ID_MODEL_ID=0x1002 |
| 283 | +E: ID_VENDOR_ID=0x1001 |
| 284 | +E: INTERFACE=eth0 |
| 285 | +... |
| 286 | +---- |
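
To collect the identifiers used in the `devices` list, the `E:` property lines shown above can be parsed into a dictionary. This is a small sketch that assumes the `KEY=VALUE` layout from the listings. The function name `udev_properties` is hypothetical.

```python
def udev_properties(udevadm_output):
    """Parse 'E: KEY=VALUE' lines from udevadm info output into a dict."""
    props = {}
    for line in udevadm_output.splitlines():
        line = line.strip()
        if line.startswith("E: ") and "=" in line:
            key, _, value = line[3:].partition("=")
            props[key.strip()] = value.strip()
    return props


# Property lines copied from the ens4 listing above.
SAMPLE_UDEV = """E: ID_MODEL_ID=0x1000
E: ID_VENDOR_ID=0x1af4
E: INTERFACE=ens4
"""

props = udev_properties(SAMPLE_UDEV)
# Map the udev properties to the field names used in the performance profile.
entry = {
    "interfaceName": props["INTERFACE"],
    "vendorID": props["ID_VENDOR_ID"],
    "deviceID": props["ID_MODEL_ID"],
}
print(entry)
```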
| 287 | + |
| 288 | +Set the net queues to 2 for a device with `interfaceName` equal to `eth0` and any devices that have a `vendorID=0x1af4` with the following performance profile: |
| 289 | + |
| 290 | +[source,yaml] |
| 291 | +---- |
| 292 | +apiVersion: performance.openshift.io/v2 |
| 293 | +kind: PerformanceProfile |
| 294 | +metadata: |
| 295 | +  name: performance |
| 296 | +spec: |
| 297 | +  cpu: |
| 298 | +    # total reserved CPUs = 2 |
| 299 | +    reserved: 0-1 |
| 300 | +    isolated: 2-8 |
| 301 | +  net: |
| 302 | +    userLevelNetworking: true |
| 303 | +    devices: |
| 304 | +    - interfaceName: "eth0" |
| 305 | +    - vendorID: "0x1af4" |
| 306 | +  # [...] |
| 307 | +---- |
| 308 | + |
| 309 | +Verify the queue status after the profile is applied: |
| 310 | + |
| 311 | +[source,terminal] |
| 312 | +---- |
| 313 | +# ethtool -l ens4 |
| 314 | +Channel parameters for ens4: |
| 315 | +Pre-set maximums: |
| 316 | +RX: 0 |
| 317 | +TX: 0 |
| 318 | +Other: 0 |
| 319 | +Combined: 4 |
| 320 | +Current hardware settings: |
| 321 | +RX: 0 |
| 322 | +TX: 0 |
| 323 | +Other: 0 |
| 324 | +Combined: 2 <1> |
| 325 | +---- |
| 326 | + |
| 327 | +<1> The total count of reserved CPUs for all supported devices with `vendorID=0x1af4` is set to 2. |
| 328 | +For example, if there is another network device `ens2` with `vendorID=0x1af4`, it will also have the total net queues set to 2. Similarly, a device with `interfaceName` equal to `eth0` will have total net queues set to 2. |
| 329 | + |
| 330 | +[id="logging-associated-with-adjusting-nic-queues_{context}"] |
| 331 | +== Logging associated with adjusting NIC queues |
| 332 | + |
| 333 | +Log messages detailing the assigned devices are recorded in the respective Tuned daemon logs. The following messages might be recorded to `/var/log/tuned/tuned.log`: |
| 334 | + |
| 335 | +* An INFO message is recorded detailing the successfully assigned devices: |
| 336 | ++ |
| 337 | +[source, terminal] |
| 338 | +---- |
| 339 | +INFO tuned.plugins.base: instance net_test (net): assigning devices ens1, ens2, ens3 |
| 340 | +---- |
| 341 | +* A WARNING message is recorded if none of the devices can be assigned: |
| 342 | ++ |
| 343 | +[source, terminal] |
| 344 | +---- |
| 345 | +WARNING tuned.plugins.base: instance net_test: no matching devices available |
| 346 | +---- |