@shreeya-patel98 shreeya-patel98 commented Sep 1, 2025

Commit Message

    net: mana: Switch to page pool for jumbo frames
    
    jira LE-3907
    commit-author Haiyang Zhang <[email protected]>
    commit fa37a8849634db2dd3545116873da8cf4b1e67c6
    
    Frag allocators, such as netdev_alloc_frag(), were not designed to
    work for fragsz > PAGE_SIZE.
    
    So, switch to page pool for jumbo frames instead of using page frag
    allocators. This driver is using page pool for smaller MTUs already.
    
-----------------------------------------------------------------------------

    net: mana: Support holes in device list reply msg
    
    jira LE-3903
    commit-author Haiyang Zhang <[email protected]>
    commit 2fc8a346625eb1abfe202062c7e6a13d76cde5ea
    
    According to GDMA protocol, holes (zeros) are allowed at the beginning
    or middle of the gdma_list_devices_resp message. The existing code
    cannot properly handle this, and may miss some devices in the list.
    
    To fix, scan the entire list until the num_of_devs are found, or until
    the end of the list.
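
As a rough illustration of the scan described above, here is a minimal userspace sketch. It is not the driver's code: `struct dev_entry`, `collect_devices()`, and the convention that `type == 0` marks a hole are assumptions made for this example only. The loop stops when `num_of_devs` devices have been found or when the list ends, whichever comes first.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one slot of the device list reply. */
struct dev_entry {
    uint16_t type;      /* assumed: 0 marks a hole (no device in this slot) */
    uint16_t instance;
};

/*
 * Copy the non-empty entries of list[0..max_entries) into out.
 * Holes at the beginning or in the middle are skipped rather than
 * treated as the end of the list. Scanning stops once num_of_devs
 * devices have been found or the end of the list is reached.
 * Returns the number of devices copied to out.
 */
size_t collect_devices(const struct dev_entry *list, size_t max_entries,
                       size_t num_of_devs, struct dev_entry *out)
{
    size_t found = 0;

    for (size_t i = 0; i < max_entries && found < num_of_devs; i++) {
        if (list[i].type == 0)
            continue;               /* hole: keep scanning */
        out[found++] = list[i];
    }
    return found;
}
```

A scan that stopped at the first zero entry, or that only looked at the first `num_of_devs` slots, would miss devices placed after a hole; counting found devices separately from the slot index avoids both problems.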
    
-----------------------------------------------------------------------------

    RDMA/mana_ib: Handle net event for pointing to the current netdev
    
    jira LE-3893
    commit-author Long Li <[email protected]>
    commit bee35b7161aaaed9831e2f14876c374b9c566952
    upstream-diff There were conflicts when applying this patch
    due to the following missing commits :-
    79bccd746132 ("RDMA/mana_ib: Add port statistics support")
    df91c470d9e5 ("RDMA/mana_ib: create/destroy AH")
    
    When running under Hyper-V, the master device to the RDMA device is always
    bonded to this RDMA device. This is not user-configurable.
    
    The master device can be unbind/bind from the kernel. During those events,
    the RDMA device should set to the current netdev to reflect the change of
    master device from those events.
    
-----------------------------------------------------------------------------

    net: mana: Change the function signature of mana_get_primary_netdev_rcu
    
    jira LE-3893
    commit-author Long Li <[email protected]>
    commit a8445cfec101c42e9d64cdb2dac13973b22c205c
    
    Change mana_get_primary_netdev_rcu() to mana_get_primary_netdev(), and
    return the ndev with refcount held. The caller is responsible for dropping
    the refcount.
    
    Also drop the check for IFF_SLAVE as it is not necessary if the upper
    device is present.
    
    Note: modpost reported the following errors because of the quotes around NET_MANA:

    MODPOST modules-only.symvers
    ERROR: modpost: module mana_ib uses symbol mana_get_primary_netdev from namespace "NET_MANA", but does not import it.
    make[1]: *** [scripts/Makefile.modpost:128: modules-only.symvers] Error 1
    make: *** [Makefile:1861: modules] Error 2
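
The ownership rule stated in the mana_get_primary_netdev commit message (the function returns the ndev with a reference held, and the caller drops it) can be sketched in plain userspace C. Everything here is hypothetical: `struct fake_netdev`, `get_primary_netdev()`, and `put_netdev()` are illustrative names, the counter is a plain `int` rather than a kernel refcount, and the fallback to the port's own device when no upper device exists is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical miniature of a reference-counted net device. */
struct fake_netdev {
    int refcnt;
    const char *name;
};

/*
 * Return the primary device with one reference held.
 * Assumed behavior: prefer the upper (master) device when present,
 * otherwise fall back to the port's own device. No flag check on
 * the port is needed to make that choice.
 */
struct fake_netdev *get_primary_netdev(struct fake_netdev *upper,
                                       struct fake_netdev *port)
{
    struct fake_netdev *ndev = upper ? upper : port;

    ndev->refcnt++;     /* reference now owned by the caller */
    return ndev;
}

/* Drop the reference obtained from get_primary_netdev(). */
void put_netdev(struct fake_netdev *ndev)
{
    ndev->refcnt--;
}
```

The point of the pattern is that the returned pointer stays valid after the lookup finishes, so the caller no longer has to remain inside a read-side critical section while using it, but must pair every successful call with exactly one put.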

Kernel Build

/home/rocky/workspace/kernel-src-tree
Skipping make mrproper
[TIMER]{MRPROPER}: 0s
x86_64 architecture detected, copying config
'configs/kernel-x86_64-rhel.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-HEAD-901c28ff4d62"
Making olddefconfig
#
# configuration written to .config
#
Starting Build
  SYNC    include/config/auto.conf.cmd
mkdir -p /home/rocky/workspace/kernel-src-tree/tools/objtool && make O=/home/rocky/workspace/kernel-src-tree subdir=tools/objtool --no-print-directory -C objtool 
mkdir -p /home/rocky/workspace/kernel-src-tree/tools/bpf/resolve_btfids && make O=/home/rocky/workspace/kernel-src-tree subdir=tools/bpf/resolve_btfids --no-print-directory -C bpf/resolve_btfids 
  INSTALL libsubcmd_headers
  CALL    scripts/atomic/check-atomics.sh
warning: generated include/linux/atomic/atomic-instrumented.h has been modified.
  CALL    scripts/checksyscalls.sh
  CHK     include/generated/compile.h
  CHK     kernel/kheaders_data.tar.xz
  TEST    posttest
arch/x86/tools/insn_decoder_test: success: Decoded and checked 6993310 instructions
  TEST    posttest
arch/x86/tools/insn_sanity: Success: decoded and checked 1000000 random instructions with 0 errors (seed:0xc8873efd)
Kernel: arch/x86/boot/bzImage is ready  (#1)
[TIMER]{BUILD}: 34s
Making Modules
  INSTALL /lib/modules/5.14.0-HEAD-901c28ff4d62+/kernel/arch/x86/crypto/blake2s-x86_64.ko
  INSTALL /lib/modules/5.14.0-HEAD-901c28ff4d62+/kernel/arch/x86/crypto/blowfish-x86_64.ko
  INSTALL /lib/modules/5.14.0-HEAD-901c28ff4d62+/kernel/arch/x86/crypto/camellia-aesni-avx-x86_64.ko
  <--snip--> 
  SIGN    /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-f2df81b7d46+/kernel/sound/virtio/virtio_snd.ko
  SIGN    /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-f2df81b7d46+/kernel/sound/xen/snd_xen_front.ko
  SIGN    /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-f2df81b7d46+/kernel/sound/x86/snd-hdmi-lpe-audio.ko
  DEPMOD  /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-f2df81b7d46+
[TIMER]{MODULES}: 8s
Making Install
sh ./arch/x86/boot/install.sh 5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-f2df81b7d46+ \
	arch/x86/boot/bzImage System.map "/boot"
[TIMER]{INSTALL}: 26s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-f2df81b7d46+ and Index to 2
The default is /boot/loader/entries/ae477367830943118aa3354df0e829b2-5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-f2df81b7d46+.conf with index 2 and kernel /boot/vmlinuz-5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-f2df81b7d46+
The default is /boot/loader/entries/ae477367830943118aa3354df0e829b2-5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-f2df81b7d46+.conf with index 2 and kernel /boot/vmlinuz-5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-f2df81b7d46+
Generating grub configuration file ...
Adding boot menu entry for UEFI Firmware Settings ...
done
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 0s
[TIMER]{BUILD}: 314s
[TIMER]{MODULES}: 8s
[TIMER]{INSTALL}: 26s
[TIMER]{TOTAL} 351s
Rebooting in 10 seconds

kernel-build.log

Kselftest

[rocky@shreeya-scn9 workspace]$ grep '^ok ' kselftest-before.log | wc -l && grep '^ok ' kselftest-after.log | wc -l
379
379
[rocky@shreeya-scn9 workspace]$ grep '^not ok ' kselftest-before.log | wc -l && grep '^not ok ' kselftest-after.log | wc -l
92
92

kselftest-after.log
kselftest-before.log

Testing

[rocky@shreeya-scn9 kernel-src-tree]$ sudo dmesg | grep mana
[    4.563476] mana 7870:00:00.0: enabling device (0000 -> 0002)
[    4.582772] mana 7870:00:00.0: Microsoft Azure Network Adapter protocol version: 0.1.1
[    4.604309] mana 7870:00:00.0 eth1: joined to eth0
[    5.374651] mana 7870:00:00.0 eth1: Configured vPort 0 PD 18 DB 16
[    5.382692] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[    5.473061] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[    5.654248] mana 7870:00:00.0 eth1: Configured vPort 0 PD 18 DB 16
[    5.666330] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64

[rocky@shreeya-scn9 kernel-src-tree]$ lspci
7870:00:00.0 Ethernet controller: Microsoft Corporation Device 00ba
9f87:00:00.0 Non-Volatile memory controller: Microsoft Corporation Device b111 (rev 01)
c05b:00:00.0 Non-Volatile memory controller: Microsoft Corporation Device 00a9

[rocky@shreeya-scn9 kernel-src-tree]$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 60:45:bd:ef:7a:84 brd ff:ff:ff:ff:ff:ff
    alias Network Device
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master eth0 state UP mode DEFAULT group default qlen 1000
    link/ether 60:45:bd:ef:7a:84 brd ff:ff:ff:ff:ff:ff
    
[rocky@shreeya-scn9 kernel-src-tree]$ ethtool -S eth0 | grep -E "^[ \t]+vf"
     vf_rx_packets: 50578
     vf_rx_bytes: 7487477
     vf_tx_packets: 91037
     vf_tx_bytes: 18115964
     vf_tx_dropped: 0
     
[rocky@shreeya-scn9 kernel-src-tree]$ lsmod | grep mana
mana_ib                61440  0
ib_uverbs             208896  1 mana_ib
ib_core               565248  2 mana_ib,ib_uverbs
mana                  118784  1 mana_ib

[rocky@shreeya-scn9 kernel-src-tree]$ sudo rmmod mana_ib
[rocky@shreeya-scn9 kernel-src-tree]$ sudo rmmod mana

[rocky@shreeya-scn9 kernel-src-tree]$ lsmod | grep mana

[rocky@shreeya-scn9 kernel-src-tree]$ sudo modprobe mana
[rocky@shreeya-scn9 kernel-src-tree]$ sudo modprobe mana_ib

[rocky@shreeya-scn9 kernel-src-tree]$ lsmod | grep mana
mana_ib                61440  0
mana                  118784  1 mana_ib
ib_uverbs             208896  1 mana_ib
ib_core               565248  2 mana_ib,ib_uverbs

[rocky@shreeya-scn9 kernel-src-tree]$ sudo dmesg | grep mana
[    4.563476] mana 7870:00:00.0: enabling device (0000 -> 0002)
[    4.582772] mana 7870:00:00.0: Microsoft Azure Network Adapter protocol version: 0.1.1
[    4.604309] mana 7870:00:00.0 eth1: joined to eth0
[    5.374651] mana 7870:00:00.0 eth1: Configured vPort 0 PD 18 DB 16
[    5.382692] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[    5.473061] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[    5.654248] mana 7870:00:00.0 eth1: Configured vPort 0 PD 18 DB 16
[    5.666330] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[ 8167.543218] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[ 8188.615764] mana 7870:00:00.0: Microsoft Azure Network Adapter protocol version: 0.1.1
[ 8188.622764] mana 7870:00:00.0 eth1: joined to eth0
[ 8188.723636] mana 7870:00:00.0 eth1: Configured vPort 0 PD 18 DB 16
[ 8188.732300] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
    

net: mana: Change the function signature of mana_get_primary_netdev_rcu

jira LE-3893
commit-author Long Li <[email protected]>
commit a8445cf

Change mana_get_primary_netdev_rcu() to mana_get_primary_netdev(), and
return the ndev with refcount held. The caller is responsible for dropping
the refcount.

Also drop the check for IFF_SLAVE as it is not necessary if the upper
device is present.

	Signed-off-by: Long Li <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Leon Romanovsky <[email protected]>
(cherry picked from commit a8445cf)
	Signed-off-by: Shreeya Patel <[email protected]>

RDMA/mana_ib: Handle net event for pointing to the current netdev

jira LE-3893
commit-author Long Li <[email protected]>
commit bee35b7
upstream-diff There were conflicts when applying this patch
due to the following missing commits :-
79bccd7 ("RDMA/mana_ib: Add port statistics support")
df91c47 ("RDMA/mana_ib: create/destroy AH")

When running under Hyper-V, the master device to the RDMA device is always
bonded to this RDMA device. This is not user-configurable.

The master device can be unbind/bind from the kernel. During those events,
the RDMA device should set to the current netdev to reflect the change of
master device from those events.

	Signed-off-by: Long Li <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Leon Romanovsky <[email protected]>
(cherry picked from commit bee35b7)
	Signed-off-by: Shreeya Patel <[email protected]>
Signed-off-by: Shreeya Patel <[email protected]>

net: mana: Support holes in device list reply msg

jira LE-3903
commit-author Haiyang Zhang <[email protected]>
commit 2fc8a34

According to GDMA protocol, holes (zeros) are allowed at the beginning
or middle of the gdma_list_devices_resp message. The existing code
cannot properly handle this, and may miss some devices in the list.

To fix, scan the entire list until the num_of_devs are found, or until
the end of the list.

	Cc: [email protected]
Fixes: ca9c54d ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
	Signed-off-by: Haiyang Zhang <[email protected]>
	Reviewed-by: Long Li <[email protected]>
	Reviewed-by: Shradha Gupta <[email protected]>
	Reviewed-by: Michal Swiatkowski <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Paolo Abeni <[email protected]>

(cherry picked from commit 2fc8a34)
	Signed-off-by: Shreeya Patel <[email protected]>

net: mana: Switch to page pool for jumbo frames

jira LE-3907
commit-author Haiyang Zhang <[email protected]>
commit fa37a88

Frag allocators, such as netdev_alloc_frag(), were not designed to
work for fragsz > PAGE_SIZE.

So, switch to page pool for jumbo frames instead of using page frag
allocators. This driver is using page pool for smaller MTUs already.

	Cc: [email protected]
Fixes: 80f6215 ("net: mana: Add support for jumbo frame")
	Signed-off-by: Haiyang Zhang <[email protected]>
	Reviewed-by: Long Li <[email protected]>
	Reviewed-by: Shradha Gupta <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit fa37a88)
	Signed-off-by: Shreeya Patel <[email protected]>

@bmastbergen bmastbergen left a comment

🥌

@jdieter jdieter left a comment

LGTM

@shreeya-patel98 shreeya-patel98 merged commit 5cb8d67 into sig-cloud-9/5.14.0-570.33.2.el9_6 Sep 2, 2025
4 checks passed
@shreeya-patel98 shreeya-patel98 deleted the {shreeya}_sig-cloud-9/5.14.0-570.33.2.el9_6 branch September 2, 2025 15:17