Conversation

shreeya-patel98
Collaborator

Commit Messages

    net: mana: Handle Reset Request from MANA NIC
    
    jira LE-3923
    commit-author Haiyang Zhang <[email protected]>
    commit fbe346ce9d626680a4dd0f079e17c7b5dd32ffad
    upstream-diff There were conflicts seen when applying this
    patch due to the following missing commits :-
    ca8ac489ca33 ("net: mana: Handle unsupported HWC commands")
    505cc26bcae0 ("net: mana: Add support for auxiliary device servicing
    events")
    
    Upon receiving the Reset Request, pause the connection and clean up
    queues, wait for the specified period, then resume the NIC.
    In the cleanup phase, the HWC is no longer responding, so set hwc_timeout
    to zero to skip waiting on the response.
    
------------------------–––––––––––––––––––––––––––––––––––––––––––––––––

    net: mana: Add handler for hardware servicing events
    
    jira LE-3919
    commit-author Haiyang Zhang <[email protected]>
    commit 7768c5f417336fa58dbfef9bb7ecd7eeec6d8886
    
    To collaborate with hardware servicing events, upon receiving the special
    EQE notification from the HW channel, remove the devices on this bus.
    Then, after a waiting period based on the device specs, rescan the parent
    bus to recover the devices.
    
------------------------–––––––––––––––––––––––––––––––––––––––––––––––––

    net: mana: Expose additional hardware counters for drop and TC via ethtool.
    
    jira LE-3915
    commit-author Dipayaan Roy <[email protected]>
    commit c09ef59e17c6921c577d54bc8da4331b955d01a7
    
    Add support for reporting additional hardware counters for drop and
    TC using the ethtool -S interface.
    
    These counters include:
    
    - Aggregate Rx/Tx drop counters
    - Per-TC Rx/Tx packet counters
    - Per-TC Rx/Tx byte counters
    - Per-TC Rx/Tx pause frame counters
    
    The counters are exposed using ethtool_ops->get_ethtool_stats and
    ethtool_ops->get_strings. This feature/counters are not available
    to all versions of hardware.
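Since the counters are read through `ethtool -S`, one way to see which of them move under traffic is to compare two snapshots. The following is a minimal sketch, not part of this patch set: the `changed_counters` function name and the snapshot file names are made up for illustration, and it assumes the usual `name: value` line layout of `ethtool -S` output.

```shell
#!/bin/sh
# Hypothetical helper: list counters whose value differs between two
# saved `ethtool -S` snapshots. Assumes lines of the form "  name: 123".
changed_counters() {
    awk '
        /: [0-9]+$/ {
            value = $NF                     # numeric value is the last field
            name = $0
            sub(/: [0-9]+$/, "", name)      # drop the ": value" suffix
            sub(/^[ \t]+/, "", name)        # drop leading indentation
            if (FILENAME == ARGV[1])
                before[name] = value        # first file: remember the value
            else if ((name in before) && before[name] != value)
                print name ": " before[name] " -> " value
        }' "$1" "$2"
}

# Example usage (file names are placeholders):
#   ethtool -S eth0 > stats-before.txt
#   ... generate traffic ...
#   ethtool -S eth0 > stats-after.txt
#   changed_counters stats-before.txt stats-after.txt
```

Counters that exist in only one snapshot are ignored, which keeps the output readable on hardware that does not report the newer counters.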
    

Kernel Build

/home/rocky/workspace/kernel-src-tree
Skipping make mrproper
[TIMER]{MRPROPER}: 0s
x86_64 architecture detected, copying config
'configs/kernel-x86_64-rhel.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f"
Making olddefconfig
#
# configuration written to .config
#
Starting Build
  SYNC    include/config/auto.conf.cmd
mkdir -p /home/rocky/workspace/kernel-src-tree/tools/objtool && make O=/home/rocky/workspace/kernel-src-tree subdir=tools/objtool --no-print-directory -C objtool 
mkdir -p /home/rocky/workspace/kernel-src-tree/tools/bpf/resolve_btfids && make O=/home/rocky/workspace/kernel-src-tree subdir=tools/bpf/resolve_btfids --no-print-directory -C bpf/resolve_btfids 
  INSTALL libsubcmd_headers
  CALL    scripts/atomic/check-atomics.sh
warning: generated include/linux/atomic/atomic-instrumented.h has been modified.
  CALL    scripts/checksyscalls.sh
  CHK     include/generated/compile.h
  CHK     kernel/kheaders_data.tar.xz
  TEST    posttest
arch/x86/tools/insn_decoder_test: success: Decoded and checked 6993310 instructions
  TEST    posttest
arch/x86/tools/insn_sanity: Success: decoded and checked 1000000 random instructions with 0 errors (seed:0x9a983b02)
Kernel: arch/x86/boot/bzImage is ready  (#9)
[TIMER]{BUILD}: 35s
Making Modules
  INSTALL /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+/kernel/arch/x86/crypto/blake2s-x86_64.ko
  INSTALL /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+/kernel/arch/x86/crypto/blowfish-x86_64.ko
  <--snip-->  
  sound/usb/usx2y/snd-usb-usx2y.ko
  SIGN    /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+/kernel/sound/xen/snd_xen_front.ko
  SIGN    /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+/kernel/sound/virtio/virtio_snd.ko
  DEPMOD  /lib/modules/5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+
[TIMER]{MODULES}: 23s
Making Install
sh ./arch/x86/boot/install.sh 5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+ \
	arch/x86/boot/bzImage System.map "/boot"
[TIMER]{INSTALL}: 26s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+ and Index to 3
The default is /boot/loader/entries/ae477367830943118aa3354df0e829b2-5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+.conf with index 3 and kernel /boot/vmlinuz-5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+
The default is /boot/loader/entries/ae477367830943118aa3354df0e829b2-5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+.conf with index 3 and kernel /boot/vmlinuz-5.14.0-shreeya_sig-cloud-9_5.14.0-570.33.2.el9_6-ca56096680f+
Generating grub configuration file ...
Adding boot menu entry for UEFI Firmware Settings ...
done
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 0s
[TIMER]{BUILD}: 35s
[TIMER]{MODULES}: 23s
[TIMER]{INSTALL}: 26s
[TIMER]{TOTAL} 86s
Rebooting in 10 seconds

kernel-build.log

Kselftest

[rocky@shreeya-scn9 workspace]$ grep '^ok ' kselftest-before.log | wc -l && grep '^ok ' kselftest-after.log | wc -l
379
378
[rocky@shreeya-scn9 workspace]$ grep '^not ok ' kselftest-before.log | wc -l && grep '^not ok ' kselftest-after.log | wc -l
92
93

Note: the one additional failure is not related to these changes.
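The counts above show that one test moved from `ok` to `not ok`, but not which one. A minimal sketch for listing the tests whose status differs between the two logs is shown here. It assumes top-level TAP result lines such as `ok 3 selftests: net: some_test.sh`; the `changed_tests` function name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print tests whose pass/fail status differs between
# two kselftest logs. Only top-level result lines ("ok N ..." or
# "not ok N ...") are considered, matching the grep patterns used above.
changed_tests() {
    awk '
        /^(not )?ok [0-9]+ / {
            status = ($1 == "not") ? "not ok" : "ok"
            sub(/^(not )?ok [0-9]+ /, "")   # strip status and test number
            sub(/ *#.*$/, "")               # strip trailing "# exit=1" etc.
            if (FILENAME == ARGV[1])
                before[$0] = status         # first log: remember the status
            else if (($0 in before) && before[$0] != status)
                print $0 ": " before[$0] " -> " status
        }' "$1" "$2"
}

# Example usage:
#   changed_tests kselftest-before.log kselftest-after.log
```

If two tests share the same name within one log, the last result wins, so the output is a starting point for inspection rather than a complete report.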

kselftest-after.log
kselftest-before.log

Testing

[rocky@shreeya-scn9 workspace]$ sudo dmesg | grep mana
[    4.589249] mana 7870:00:00.0: enabling device (0000 -> 0002)
[    4.604961] mana 7870:00:00.0: Microsoft Azure Network Adapter protocol version: 0.1.1
[    4.614259] mana 7870:00:00.0 eth1: joined to eth0
[    5.370762] mana 7870:00:00.0 eth1: Configured vPort 0 PD 18 DB 16
[    5.379008] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[    5.474676] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[    5.665998] mana 7870:00:00.0 eth1: Configured vPort 0 PD 18 DB 16
[    5.673970] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64

[rocky@shreeya-scn9 workspace]$ lspci
7870:00:00.0 Ethernet controller: Microsoft Corporation Device 00ba
9f87:00:00.0 Non-Volatile memory controller: Microsoft Corporation Device b111 (rev 01)
c05b:00:00.0 Non-Volatile memory controller: Microsoft Corporation Device 00a9

[rocky@shreeya-scn9 workspace]$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 60:45:bd:ef:7a:84 brd ff:ff:ff:ff:ff:ff
    alias Network Device
3: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master eth0 state UP mode DEFAULT group default qlen 1000
    link/ether 60:45:bd:ef:7a:84 brd ff:ff:ff:ff:ff:ff
    
[rocky@shreeya-scn9 workspace]$ ethtool -S eth0 | grep -E "^[ \t]+vf"
     vf_rx_packets: 12927
     vf_rx_bytes: 2662247
     vf_tx_packets: 14675
     vf_tx_bytes: 7031249
     vf_tx_dropped: 0
[rocky@shreeya-scn9 workspace]$ sudo rmmod mana_ib
[rocky@shreeya-scn9 workspace]$ sudo rmmod mana

[rocky@shreeya-scn9 workspace]$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 60:45:bd:ef:7a:84 brd ff:ff:ff:ff:ff:ff
    alias Network Device
    
[rocky@shreeya-scn9 workspace]$ sudo modprobe mana
[rocky@shreeya-scn9 workspace]$ sudo modprobe mana_ib

[rocky@shreeya-scn9 workspace]$ ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/ether 60:45:bd:ef:7a:84 brd ff:ff:ff:ff:ff:ff
    alias Network Device
4: eth1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master eth0 state UP mode DEFAULT group default qlen 1000
    link/ether 60:45:bd:ef:7a:84 brd ff:ff:ff:ff:ff:ff
    
[rocky@shreeya-scn9 workspace]$ sudo dmesg | grep mana
[    4.589249] mana 7870:00:00.0: enabling device (0000 -> 0002)
[    4.604961] mana 7870:00:00.0: Microsoft Azure Network Adapter protocol version: 0.1.1
[    4.614259] mana 7870:00:00.0 eth1: joined to eth0
[    5.370762] mana 7870:00:00.0 eth1: Configured vPort 0 PD 18 DB 16
[    5.379008] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[    5.474676] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[    5.665998] mana 7870:00:00.0 eth1: Configured vPort 0 PD 18 DB 16
[    5.673970] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[ 3523.141417] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[ 3539.771212] mana 7870:00:00.0: Microsoft Azure Network Adapter protocol version: 0.1.1
[ 3539.777774] mana 7870:00:00.0 eth1: joined to eth0
[ 3539.882109] mana 7870:00:00.0 eth1: Configured vPort 0 PD 18 DB 16
[ 3539.891611] mana 7870:00:00.0 eth1: Configured steering vPort 0 entries 64
[rocky@shreeya-scn9 workspace]$ 
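The unload/reload check above can be scripted so it is repeatable. This is a sketch under stated assumptions, not the procedure used for this PR: `wait_for_iface`, `reload_mana`, and the `SYSFS_NET` variable are hypothetical names, the interface name defaults to the `eth1` seen in this session, and the script must run as root.

```shell
#!/bin/sh
# Hypothetical smoke test for the module reload sequence shown above.
# SYSFS_NET is overridable only so the helper can be tried without hardware.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# Wait up to $2 seconds (default 10) for interface $1 to appear in sysfs.
wait_for_iface() {
    iface=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        [ -e "$SYSFS_NET/$iface" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Unload and reload the drivers in the same order as the manual session
# (mana_ib removed first, loaded last), then wait for the VF interface.
reload_mana() {
    rmmod mana_ib && rmmod mana || return 1
    modprobe mana && modprobe mana_ib || return 1
    wait_for_iface "${1:-eth1}"
}

# Example usage (as root):
#   reload_mana eth1 && echo "eth1 is back"
```

Note that the interface index can change across a reload (3 before, 4 after in the output above), so the check keys on the interface name rather than the index.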
    

…htool.

jira LE-3915
commit-author Dipayaan Roy <[email protected]>
commit c09ef59

Add support for reporting additional hardware counters for drop and
TC using the ethtool -S interface.

These counters include:

- Aggregate Rx/Tx drop counters
- Per-TC Rx/Tx packet counters
- Per-TC Rx/Tx byte counters
- Per-TC Rx/Tx pause frame counters

The counters are exposed using ethtool_ops->get_ethtool_stats and
ethtool_ops->get_strings. This feature/counters are not available
to all versions of hardware.

	Signed-off-by: Dipayaan Roy <[email protected]>
	Reviewed-by: Subbaraya Sundeep <[email protected]>
	Reviewed-by: Haiyang Zhang <[email protected]>
Link: https://patch.msgid.link/20250609100103.GA7102@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit c09ef59)
	Signed-off-by: Shreeya Patel <[email protected]>
jira LE-3919
commit-author Haiyang Zhang <[email protected]>
commit 7768c5f

To collaborate with hardware servicing events, upon receiving the special
EQE notification from the HW channel, remove the devices on this bus.
Then, after a waiting period based on the device specs, rescan the parent
bus to recover the devices.

	Signed-off-by: Haiyang Zhang <[email protected]>
	Reviewed-by: Shradha Gupta <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 7768c5f)
	Signed-off-by: Shreeya Patel <[email protected]>

Collaborator

@bmastbergen bmastbergen left a comment

Seems like you may have picked up a couple of lines accidentally in one patch (commented)

if (err || resp->status) {
if (req->req.msg_type != MANA_QUERY_PHY_STAT)
if (err == -EOPNOTSUPP)
return err;

Collaborator

Was adding these two lines intentional? I don't think they are part of this changeset

Collaborator Author

Hmm, I added them because of the conflict caused by the missing patch ca8ac48.

I tried to add that patch, but it caused more conflicts, so I ended up adding just these two lines. Now I am thinking maybe I should just drop these lines instead?

Collaborator

I don't think these two lines hurt anything, but it feels like they don't belong with this changeset. I think I'd remove them.

Collaborator

@PlaidCat PlaidCat Sep 4, 2025

What were the conflicts? This seems pretty valid: if the underlying hardware isn't supported it would error out, and the code doesn't seem super complex ...

Collaborator Author

@shreeya-patel98 shreeya-patel98 Sep 4, 2025

@PlaidCat @bmastbergen The conflicts were related to the mana_query_link_cfg and mana_set_bw_clamp functions. Since these functions are not present in the current SCN 9/10 kernel, patch ca8ac48 failed to apply cleanly, so I added just the relevant part of that patch to the mana_send_request function.

@PlaidCat Brett mentioned to remove this part as it is not related to this changeset. What is your suggestion?
I'm okay with either of the options

Collaborator

@PlaidCat PlaidCat Sep 4, 2025

Let's do what @bmastbergen suggested.

We could probably do the backport and just exclude the missing stuff.

However, we might want to cut a ticket, as the commit you referenced seems like a good patch set to provide in future releases:
https://lore.kernel.org/all/[email protected]/

jira LE-3923
commit-author Haiyang Zhang <[email protected]>
commit fbe346c
upstream-diff There were conflicts seen when applying this
patch due to the following missing commits :-
ca8ac48 ("net: mana: Handle unsupported HWC commands")
505cc26 ("net: mana: Add support for auxiliary device servicing
events")

Upon receiving the Reset Request, pause the connection and clean up
queues, wait for the specified period, then resume the NIC.
In the cleanup phase, the HWC is no longer responding, so set hwc_timeout
to zero to skip waiting on the response.

	Signed-off-by: Haiyang Zhang <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit fbe346c)
	Signed-off-by: Shreeya Patel <[email protected]>
@shreeya-patel98 shreeya-patel98 force-pushed the {shreeya}_sig-cloud-9/5.14.0-570.33.2.el9_6 branch from 784a7bf to 2dc9cfa Compare September 5, 2025 13:13

Collaborator

@bmastbergen bmastbergen left a comment

🥌

Collaborator

@PlaidCat PlaidCat left a comment

:shipit:

@shreeya-patel98 shreeya-patel98 merged commit 7e4fcbb into sig-cloud-9/5.14.0-570.33.2.el9_6 Sep 5, 2025
4 checks passed
@shreeya-patel98 shreeya-patel98 deleted the {shreeya}_sig-cloud-9/5.14.0-570.33.2.el9_6 branch September 5, 2025 15:19