### Low throughput with use of --vdev="net_vdev_netvsc0,iface=eth1"
Failover configuration of either the `net_failsafe` or `net_vdev_netvsc` poll-mode drivers isn't recommended for high performance on Azure. The netvsc configuration with DPDK version 20.11 or higher may give better results. For optimal performance, ensure your Linux kernel, rdma-core, and DPDK packages meet the listed requirements for DPDK and MANA.
### Version mismatch for rdma-core
Mismatches between rdma-core and the Linux kernel can occur whenever you build some combination of rdma-core, DPDK, and the Linux kernel from source. This kind of mismatch can cause a number of issues. On MANA, it likely results in a failed probe of the MANA virtual function (VF).
```
mana_pci_probe_mac(): Probe device name mana_0 dev_name uverbs0 ibdev_path /sys/class/infiniband/mana_0
mana_probe_port(): device located port 2 address 00:0D:3A:76:3B:D0
mana_probe_port(): ibv_alloc_parent_domain failed port 2
mana_pci_probe_mac(): Probe on IB port 2 failed -12
EAL: Requested device 7870:00:00.0 cannot be used
EAL: Bus (pci) probe failed.
hn_vf_attach(): Couldn't find port for VF
hn_vf_add(): RNDIS reports VF but device not found, retrying
```
This failure likely results from combining a kernel that has backported patches for `mana_ib` with a newer version of rdma-core. The root cause is an interaction between the kernel RDMA drivers and the user-space rdma-core libraries.

The Linux kernel uapi for RDMA defines a list of RDMA provider IDs. In kernels with backported patches, the ID value for a provider can differ from the value in the rdma-core libraries.

> [!NOTE]
> Example snippets are from [Ubuntu 5.15.0-1045 linux-azure](https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/focal/tree/include/uapi/rdma/ib_user_ioctl_verbs.h?h=azure-5.15-next) and [rdma-core v46.0](https://github.com/linux-rdma/rdma-core/blob/4cce53f5be035137c9d31d28e204502231a56382/kernel-headers/rdma/ib_user_ioctl_verbs.h#L220).
```c
// Linux kernel header
// include/uapi/rdma/ib_user_ioctl_verbs.h
enum rdma_driver_id {
    RDMA_DRIVER_UNKNOWN,
    RDMA_DRIVER_MLX5,
    RDMA_DRIVER_MLX4,
    RDMA_DRIVER_CXGB3,
    RDMA_DRIVER_CXGB4,
    RDMA_DRIVER_MTHCA,
    RDMA_DRIVER_BNXT_RE,
    RDMA_DRIVER_OCRDMA,
    RDMA_DRIVER_NES,
    RDMA_DRIVER_I40IW,
    RDMA_DRIVER_IRDMA = RDMA_DRIVER_I40IW,
    RDMA_DRIVER_VMW_PVRDMA,
    RDMA_DRIVER_QEDR,
    RDMA_DRIVER_HNS,
    RDMA_DRIVER_USNIC,
    RDMA_DRIVER_RXE,
    RDMA_DRIVER_HFI1,
    RDMA_DRIVER_QIB,
    RDMA_DRIVER_EFA,
    RDMA_DRIVER_SIW,
    RDMA_DRIVER_MANA, // <- Note: MANA is added as the last member of the enum
};

// Example mismatched rdma-core ioctl verbs header
// on GitHub: kernel-headers/rdma/ib_user_ioctl_verbs.h
// or in release tar.gz: include/rdma/ib_user_ioctl_verbs.h
enum rdma_driver_id {
    RDMA_DRIVER_UNKNOWN,
    RDMA_DRIVER_MLX5,
    RDMA_DRIVER_MLX4,
    RDMA_DRIVER_CXGB3,
    RDMA_DRIVER_CXGB4,
    RDMA_DRIVER_MTHCA,
    RDMA_DRIVER_BNXT_RE,
    RDMA_DRIVER_OCRDMA,
    RDMA_DRIVER_NES,
    RDMA_DRIVER_I40IW,
    RDMA_DRIVER_IRDMA = RDMA_DRIVER_I40IW,
    RDMA_DRIVER_VMW_PVRDMA,
    RDMA_DRIVER_QEDR,
    RDMA_DRIVER_HNS,
    RDMA_DRIVER_USNIC,
    RDMA_DRIVER_RXE,
    RDMA_DRIVER_HFI1,
    RDMA_DRIVER_QIB,
    RDMA_DRIVER_EFA,
    RDMA_DRIVER_SIW,
    RDMA_DRIVER_ERDMA, // <- Upstream has the additional ERDMA provider here,
    RDMA_DRIVER_MANA,  // <- so MANA's ID in the enum does not match
};
```
This mismatch causes the MANA provider code to fail to load. If you use `gdb` to trace the execution in this example, you find that the provider for erdma is loaded instead. Either removing the erdma provider from the rdma-core source or forcing the ordering of the provider IDs allows the MANA provider to load correctly.
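To make the numeric effect of the two enum layouts concrete, here's a minimal sketch that places both orderings in one file and compares the value assigned to MANA. The `KERNEL_` and `CORE_` prefixes, the `COMMON_DRIVER_IDS` macro, and the helper functions are illustrative names chosen for this example. They aren't part of the kernel or rdma-core headers, and the lookup function is only a stand-in for how user space interprets a driver ID.

```c
#include <assert.h>
#include <stdio.h>

/* Entries common to both headers, in the order shown in the snippets above.
 * The prefix parameter P exists only so both enums can coexist in one file. */
#define COMMON_DRIVER_IDS(P)                                      \
    P##UNKNOWN, P##MLX5, P##MLX4, P##CXGB3, P##CXGB4, P##MTHCA,   \
    P##BNXT_RE, P##OCRDMA, P##NES, P##I40IW, P##IRDMA = P##I40IW, \
    P##VMW_PVRDMA, P##QEDR, P##HNS, P##USNIC, P##RXE, P##HFI1,    \
    P##QIB, P##EFA, P##SIW,

/* Backported kernel layout: MANA directly follows SIW. */
enum kernel_driver_id { COMMON_DRIVER_IDS(KERNEL_) KERNEL_MANA };

/* Mismatched rdma-core layout: ERDMA sits between SIW and MANA. */
enum core_driver_id { COMMON_DRIVER_IDS(CORE_) CORE_ERDMA, CORE_MANA };

/* The same driver gets two different numeric IDs. */
_Static_assert(KERNEL_MANA == 19, "kernel layout assigns 19 to MANA");
_Static_assert(CORE_MANA == 20, "rdma-core layout assigns 20 to MANA");
_Static_assert((int)KERNEL_MANA == (int)CORE_ERDMA,
               "the kernel's MANA ID collides with rdma-core's ERDMA ID");

/* Hypothetical stand-in for user space naming a driver ID it receives. */
const char *core_provider_name(int id)
{
    switch (id) {
    case CORE_ERDMA: return "erdma";
    case CORE_MANA:  return "mana";
    default:         return "other";
    }
}

/* Prints both IDs and the name user space would attach to the kernel's ID. */
void print_mismatch(void)
{
    printf("kernel MANA id = %d, rdma-core MANA id = %d\n",
           KERNEL_MANA, CORE_MANA);
    printf("rdma-core maps id %d to provider: %s\n",
           KERNEL_MANA, core_provider_name(KERNEL_MANA));
}
```

Calling `print_mismatch()` from a small `main` reports ID 19 for the kernel layout and ID 20 for the rdma-core layout, and shows that ID 19 resolves to `erdma` on the user-space side. This is consistent with the `gdb` observation described above.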