Skip to content

Conversation

PlaidCat
Copy link
Collaborator

General Process:

Contains the following: http://download.rockylinux.org/pub/rocky/9.5/BaseOS/source/tree/Packages/k/

BUILD

/mnt/code/kernel-src-tree-build
no .config file found, moving on
[TIMER]{MRPROPER}: 0s
x86_64 architecture detected, copying config
'configs/kernel-x86_64-rhel.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky9_5_rebuild-e6683295364e"
Making olddefconfig
  HOSTCC  scripts/kconfig/conf.o
  HOSTCC  scripts/kconfig/confdata.o
  HOSTCC  scripts/kconfig/expr.o
  LEX     scripts/kconfig/lexer.lex.c
  YACC    scripts/kconfig/parser.tab.[ch]
  HOSTCC  scripts/kconfig/lexer.lex.o
  HOSTCC  scripts/kconfig/menu.o
  HOSTCC  scripts/kconfig/parser.tab.o
  HOSTCC  scripts/kconfig/preprocess.o
  HOSTCC  scripts/kconfig/symbol.o
  HOSTLD  scripts/kconfig/conf
#
# configuration written to .config
#
Starting Build
  SYNC    include/config/auto.conf.cmd
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_64.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_x32.h

[SNIP]

  LD [M]  sound/xen/snd_xen_front.ko
  BTF [M] sound/x86/snd-hdmi-lpe-audio.ko
  BTF [M] sound/xen/snd_xen_front.ko
[TIMER]{BUILD}: 1880s
Making Modules
  INSTALL /lib/modules/5.14.0-rocky9_5_rebuild-e6683295364e/kernel/arch/x86/crypto/blake2s-x86_64.ko
  INSTALL /lib/modules/5.14.0-rocky9_5_rebuild-e6683295364e/kernel/arch/x86/crypto/blowfish-x86_64.ko
  INSTALL /lib/modules/5.14.0-rocky9_5_rebuild-e6683295364e/kernel/arch/x86/crypto/camellia-aesni-avx-x86_64.ko

[SNIP]

  STRIP   /lib/modules/5.14.0-rocky9_5_rebuild-e6683295364e/kernel/sound/xen/snd_xen_front.ko
  SIGN    /lib/modules/5.14.0-rocky9_5_rebuild-e6683295364e/kernel/sound/xen/snd_xen_front.ko
  DEPMOD  /lib/modules/5.14.0-rocky9_5_rebuild-e6683295364e
[TIMER]{MODULES}: 12s
Making Install
sh ./arch/x86/boot/install.sh 5.14.0-rocky9_5_rebuild-e6683295364e \
        arch/x86/boot/bzImage System.map "/boot"
[TIMER]{INSTALL}: 23s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-5.14.0-rocky9_5_rebuild-e6683295364e and Index to 2
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 0s
[TIMER]{BUILD}: 1880s
[TIMER]{MODULES}: 12s
[TIMER]{INSTALL}: 23s
[TIMER]{TOTAL} 1921s
Rebooting in 10 seconds

Boot

[maple@r9-sigcloud-builder code]$ uname -a
Linux r9-sigcloud-builder 5.14.0-rocky9_5_rebuild-e6683295364e #1 SMP PREEMPT_DYNAMIC Tue Mar 11 21:23:29 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

KSELFTEST

[maple@r9-sigcloud-builder code]$ ./run_kerselftests.sh 2                                                                                                                                   
Starting Test Loop 1                                                                                                                                                                        
Test Loop 1 Done                                                                                                                                                                            
Starting Test Loop 2                                                                                                                                                                        
Test Loop 2 Done

This kernel_5.14.0-rocky9_5_rebuild_2025-03-05_iteration_1.log was the previous .26.1

[jmaple@devbox code]$ grep ^ok kernel_5.14.0-rocky9_5_rebuild-e6683295364e_iteration_1_nocomments.log | wc -l
258
[jmaple@devbox code]$ grep ^ok kernel_5.14.0-rocky9_5_rebuild_2025-03-05_iteration_1.log | wc -l
257

PlaidCat added 30 commits March 11, 2025 16:46
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paolo Abeni <[email protected]>
commit e32d262

Bugged peer implementation can send corrupted DSS options, consistently
hitting a few warning in the data path. Use DEBUG_NET assertions, to
avoid the splat on some builds and handle consistently the error, dumping
related MIBs and performing fallback and/or reset according to the
subflow type.

Fixes: 6771bfd ("mptcp: update mptcp ack sequence from work queue")
	Cc: [email protected]
	Signed-off-by: Paolo Abeni <[email protected]>
	Reviewed-by: Matthieu Baerts (NGI0) <[email protected]>
	Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit e32d262)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paolo Abeni <[email protected]>
commit 4dabcdf
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.29.1.el9_5/4dabcdf5.failed

Syzkaller was able to trigger a DSS corruption:

  TCP: request_sock_subflow_v4: Possible SYN flooding on port [::]:20002. Sending cookies.
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 5227 at net/mptcp/protocol.c:695 __mptcp_move_skbs_from_subflow+0x20a9/0x21f0 net/mptcp/protocol.c:695
  Modules linked in:
  CPU: 0 UID: 0 PID: 5227 Comm: syz-executor350 Not tainted 6.11.0-syzkaller-08829-gaf9c191ac2a0 #0
  Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
  RIP: 0010:__mptcp_move_skbs_from_subflow+0x20a9/0x21f0 net/mptcp/protocol.c:695
  Code: 0f b6 dc 31 ff 89 de e8 b5 dd ea f5 89 d8 48 81 c4 50 01 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 98 da ea f5 90 <0f> 0b 90 e9 47 ff ff ff e8 8a da ea f5 90 0f 0b 90 e9 99 e0 ff ff
  RSP: 0018:ffffc90000006db8 EFLAGS: 00010246
  RAX: ffffffff8ba9df18 RBX: 00000000000055f0 RCX: ffff888030023c00
  RDX: 0000000000000100 RSI: 00000000000081e5 RDI: 00000000000055f0
  RBP: 1ffff110062bf1ae R08: ffffffff8ba9cf12 R09: 1ffff110062bf1b8
  R10: dffffc0000000000 R11: ffffed10062bf1b9 R12: 0000000000000000
  R13: dffffc0000000000 R14: 00000000700cec61 R15: 00000000000081e5
  FS:  000055556679c380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000020287000 CR3: 0000000077892000 CR4: 00000000003506f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <IRQ>
   move_skbs_to_msk net/mptcp/protocol.c:811 [inline]
   mptcp_data_ready+0x29c/0xa90 net/mptcp/protocol.c:854
   subflow_data_ready+0x34a/0x920 net/mptcp/subflow.c:1490
   tcp_data_queue+0x20fd/0x76c0 net/ipv4/tcp_input.c:5283
   tcp_rcv_established+0xfba/0x2020 net/ipv4/tcp_input.c:6237
   tcp_v4_do_rcv+0x96d/0xc70 net/ipv4/tcp_ipv4.c:1915
   tcp_v4_rcv+0x2dc0/0x37f0 net/ipv4/tcp_ipv4.c:2350
   ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205
   ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233
   NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
   NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
   __netif_receive_skb_one_core net/core/dev.c:5662 [inline]
   __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5775
   process_backlog+0x662/0x15b0 net/core/dev.c:6107
   __napi_poll+0xcb/0x490 net/core/dev.c:6771
   napi_poll net/core/dev.c:6840 [inline]
   net_rx_action+0x89b/0x1240 net/core/dev.c:6962
   handle_softirqs+0x2c5/0x980 kernel/softirq.c:554
   do_softirq+0x11b/0x1e0 kernel/softirq.c:455
   </IRQ>
   <TASK>
   __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
   local_bh_enable include/linux/bottom_half.h:33 [inline]
   rcu_read_unlock_bh include/linux/rcupdate.h:919 [inline]
   __dev_queue_xmit+0x1764/0x3e80 net/core/dev.c:4451
   dev_queue_xmit include/linux/netdevice.h:3094 [inline]
   neigh_hh_output include/net/neighbour.h:526 [inline]
   neigh_output include/net/neighbour.h:540 [inline]
   ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:236
   ip_local_out net/ipv4/ip_output.c:130 [inline]
   __ip_queue_xmit+0x118c/0x1b80 net/ipv4/ip_output.c:536
   __tcp_transmit_skb+0x2544/0x3b30 net/ipv4/tcp_output.c:1466
   tcp_transmit_skb net/ipv4/tcp_output.c:1484 [inline]
   tcp_mtu_probe net/ipv4/tcp_output.c:2547 [inline]
   tcp_write_xmit+0x641d/0x6bf0 net/ipv4/tcp_output.c:2752
   __tcp_push_pending_frames+0x9b/0x360 net/ipv4/tcp_output.c:3015
   tcp_push_pending_frames include/net/tcp.h:2107 [inline]
   tcp_data_snd_check net/ipv4/tcp_input.c:5714 [inline]
   tcp_rcv_established+0x1026/0x2020 net/ipv4/tcp_input.c:6239
   tcp_v4_do_rcv+0x96d/0xc70 net/ipv4/tcp_ipv4.c:1915
   sk_backlog_rcv include/net/sock.h:1113 [inline]
   __release_sock+0x214/0x350 net/core/sock.c:3072
   release_sock+0x61/0x1f0 net/core/sock.c:3626
   mptcp_push_release net/mptcp/protocol.c:1486 [inline]
   __mptcp_push_pending+0x6b5/0x9f0 net/mptcp/protocol.c:1625
   mptcp_sendmsg+0x10bb/0x1b10 net/mptcp/protocol.c:1903
   sock_sendmsg_nosec net/socket.c:730 [inline]
   __sock_sendmsg+0x1a6/0x270 net/socket.c:745
   ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2603
   ___sys_sendmsg net/socket.c:2657 [inline]
   __sys_sendmsg+0x2aa/0x390 net/socket.c:2686
   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
   do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
   entry_SYSCALL_64_after_hwframe+0x77/0x7f
  RIP: 0033:0x7fb06e9317f9
  Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
  RSP: 002b:00007ffe2cfd4f98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
  RAX: ffffffffffffffda RBX: 00007fb06e97f468 RCX: 00007fb06e9317f9
  RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000005
  RBP: 00007fb06e97f446 R08: 0000555500000000 R09: 0000555500000000
  R10: 0000555500000000 R11: 0000000000000246 R12: 00007fb06e97f406
  R13: 0000000000000001 R14: 00007ffe2cfd4fe0 R15: 0000000000000003
   </TASK>

Additionally syzkaller provided a nice reproducer. The repro enables
pmtu on the loopback device, leading to tcp_mtu_probe() generating
very large probe packets.

tcp_can_coalesce_send_queue_head() currently does not check for
mptcp-level invariants, and allowed the creation of cross-DSS probes,
leading to the mentioned corruption.

Address the issue teaching tcp_can_coalesce_send_queue_head() about
mptcp using the tcp_skb_can_collapse(), also reducing the code
duplication.

Fixes: 8571248 ("tcp: coalesce/collapse must respect MPTCP extensions")
	Cc: [email protected]
	Reported-by: [email protected]
Closes: multipath-tcp/mptcp_net-next#513
	Signed-off-by: Paolo Abeni <[email protected]>
	Acked-by: Matthieu Baerts (NGI0) <[email protected]>
	Signed-off-by: Matthieu Baerts (NGI0) <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 4dabcdf)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/ipv4/tcp_output.c
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Aleksandr Loktionov <[email protected]>
commit 07af482

Implement "mdd-auto-reset-vf" priv-flag to handle Tx and Rx MDD events for VFs.
This flag is also used in other network adapters like ICE.

Usage:
- "on"  - The problematic VF will be automatically reset
	  if a malformed descriptor is detected.
- "off" - The problematic VF will be disabled.

In cases where a VF sends malformed packets classified as malicious, it can
cause the Tx queue to freeze, rendering it unusable for several minutes. When
an MDD event occurs, this new implementation allows for a graceful VF reset to
quickly restore operational state.

Currently, VF queues are disabled if an MDD event occurs. This patch adds the
ability to reset the VF if a Tx or Rx MDD event occurs. It also includes MDD
event logging throttling to avoid dmesg pollution and unifies the format of
Tx and Rx MDD messages.

Note: Standard message rate limiting functions like dev_info_ratelimited()
do not meet our requirements. Custom rate limiting is implemented,
please see the code for details.

Co-developed-by: Jan Sokolowski <[email protected]>
	Signed-off-by: Jan Sokolowski <[email protected]>
Co-developed-by: Padraig J Connolly <[email protected]>
	Signed-off-by: Padraig J Connolly <[email protected]>
	Signed-off-by: Aleksandr Loktionov <[email protected]>
	Reviewed-by: Michal Schmidt <[email protected]>
	Tested-by: Rafal Romanowski <[email protected]>
	Signed-off-by: Tony Nguyen <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 07af482)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author David Lamparter <[email protected]>
commit d7c31cb
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.29.1.el9_5/d7c31cbd.failed

The IPv6 multicast routing code previously implemented only the dump
variant of RTM_GETROUTE.  Implement single MFC item retrieval by copying
and adapting the respective IPv4 code.

Tested against FRRouting's IPv6 PIM stack.

	Signed-off-by: David Lamparter <[email protected]>
	Reviewed-by: Nikolay Aleksandrov <[email protected]>
	Reviewed-by: David Ahern <[email protected]>
	Cc: Jakub Kicinski <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>
(cherry picked from commit d7c31cb)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/ipv6/ip6mr.c
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paolo Abeni <[email protected]>
commit 11b6e70

The multicast route tables lifecycle, for both ipv4 and ipv6, is
protected by RCU using the RTNL lock for write access. In many
places a table pointer escapes the RCU (or RTNL) protected critical
section, but such scenarios are actually safe because tables are
deleted only at namespace cleanup time or just after allocation, in
case of default rule creation failure.

Tables freed at namespace cleanup time are assured to be alive for the
whole netns lifetime; tables freed just after creation time are never
exposed to other possible users.

Ensure that the free conditions are respected in ip{,6}mr_free_table, to
document the locking schema and to prevent future possible introduction
of 'table del' operation from breaking it.

	Reviewed-by: David Ahern <[email protected]>
	Signed-off-by: Paolo Abeni <[email protected]>
(cherry picked from commit 11b6e70)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paolo Abeni <[email protected]>
commit 50b9420

Eric reported a syzkaller-triggered splat caused by recent ipmr changes:

WARNING: CPU: 2 PID: 6041 at net/ipv6/ip6mr.c:419
ip6mr_free_table+0xbd/0x120 net/ipv6/ip6mr.c:419
Modules linked in:
CPU: 2 UID: 0 PID: 6041 Comm: syz-executor183 Not tainted
6.12.0-syzkaller-10681-g65ae975e97d5 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:ip6mr_free_table+0xbd/0x120 net/ipv6/ip6mr.c:419
Code: 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c
02 00 75 58 49 83 bc 24 c0 0e 00 00 00 74 09 e8 44 ef a9 f7 90 <0f> 0b
90 e8 3b ef a9 f7 48 8d 7b 38 e8 12 a3 96 f7 48 89 df be 0f
RSP: 0018:ffffc90004267bd8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88803c710000 RCX: ffffffff89e4d844
RDX: ffff88803c52c880 RSI: ffffffff89e4d87c RDI: ffff88803c578ec0
RBP: 0000000000000001 R08: 0000000000000005 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88803c578000
R13: ffff88803c710000 R14: ffff88803c710008 R15: dead000000000100
FS: 00007f7a855ee6c0(0000) GS:ffff88806a800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7a85689938 CR3: 000000003c492000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
ip6mr_rules_exit+0x176/0x2d0 net/ipv6/ip6mr.c:283
ip6mr_net_exit_batch+0x53/0xa0 net/ipv6/ip6mr.c:1388
ops_exit_list+0x128/0x180 net/core/net_namespace.c:177
setup_net+0x4fe/0x860 net/core/net_namespace.c:394
copy_net_ns+0x2b4/0x6b0 net/core/net_namespace.c:500
create_new_namespaces+0x3ea/0xad0 kernel/nsproxy.c:110
unshare_nsproxy_namespaces+0xc0/0x1f0 kernel/nsproxy.c:228
ksys_unshare+0x45d/0xa40 kernel/fork.c:3334
__do_sys_unshare kernel/fork.c:3405 [inline]
__se_sys_unshare kernel/fork.c:3403 [inline]
__x64_sys_unshare+0x31/0x40 kernel/fork.c:3403
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7a856332d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7a855ee238 EFLAGS: 00000246 ORIG_RAX: 0000000000000110
RAX: ffffffffffffffda RBX: 00007f7a856bd308 RCX: 00007f7a856332d9
RDX: 00007f7a8560f8c6 RSI: 0000000000000000 RDI: 0000000062040200
RBP: 00007f7a856bd300 R08: 00007fff932160a7 R09: 00007f7a855ee6c0
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f7a856bd30c
R13: 0000000000000000 R14: 00007fff93215fc0 R15: 00007fff932160a8
</TASK>

The root cause is a network namespace creation failing after successful
initialization of the ipmr subsystem. Such a case is not currently
matched by the ipmr_can_free_table() helper.

New namespaces are zeroed on allocation and inserted into net ns list
only after successful creation; when deleting an ipmr table, the list
next pointer can be NULL only on netns initialization failure.

Update the ipmr_can_free_table() checks leveraging such condition.

	Reported-by: Eric Dumazet <[email protected]>
	Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=6e8cb445d4b43d006e0c
Fixes: 11b6e70 ("ipmr: add debug check for mr table cleanup")
	Signed-off-by: Paolo Abeni <[email protected]>
	Reviewed-by: Eric Dumazet <[email protected]>
Link: https://patch.msgid.link/8bde975e21bbca9d9c27e36209b2dd4f1d7a3f00.1733212078.git.pabeni@redhat.com
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 50b9420)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paolo Abeni <[email protected]>
commit f1553c9
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.29.1.el9_5/f1553c98.failed

Several places call ip6mr_get_table() with no RCU nor RTNL lock.
Add RCU protection inside such helper and provide a lockless variant
for the few callers that already acquired the relevant lock.

Note that some users additionally reference the table outside the RCU
lock. That is actually safe as the table deletion can happen only
after all table accesses are completed.

Fixes: e2d5776 ("net: Provide compat support for SIOCGETMIFCNT_IN6 and SIOCGETSGCNT_IN6.")
Fixes: d7c31cb ("net: ip6mr: add RTM_GETROUTE netlink op")
	Reviewed-by: David Ahern <[email protected]>
	Signed-off-by: Paolo Abeni <[email protected]>
(cherry picked from commit f1553c9)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/ipv6/ip6mr.c
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Pavan Kumar Linga <[email protected]>
commit 81d2fb4

When the device control plane is removed or the platform
running device control plane is rebooted, a reset is detected
on the driver. On driver reset, it releases the resources and
waits for the reset to complete. If the reset fails, it takes
the error path and releases the vport lock. At this time if the
monitoring tools tries to access link settings, it call traces
for accessing released vport pointer.

To avoid it, move link_speed_mbps to netdev_priv structure
which removes the dependency on vport pointer and the vport lock
in idpf_get_link_ksettings. Also use netif_carrier_ok()
to check the link status and adjust the offsetof to use link_up
instead of link_speed_mbps.

Fixes: 02cbfba ("idpf: add ethtool callbacks")
	Cc: [email protected] # 6.7+
	Reviewed-by: Tarun K Singh <[email protected]>
	Signed-off-by: Pavan Kumar Linga <[email protected]>
	Tested-by: Krishneil Singh <[email protected]>
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 81d2fb4)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Pavan Kumar Linga <[email protected]>
commit 9b58031

In an event where the platform running the device control plane
is rebooted, reset is detected on the driver. It releases
all the resources and waits for the reset to complete. Once the
reset is done, it tries to build the resources back. At this
time if the device control plane is not yet started, then
the driver timeouts on the virtchnl message and retries to
establish the mailbox again.

In the retry flow, mailbox is deinitialized but the mailbox
workqueue is still alive and polling for the mailbox message.
This results in accessing the released control queue leading to
null-ptr-deref. Fix it by unrolling the work queue cancellation
and mailbox deinitialization in the reverse order which they got
initialized.

Fixes: 4930fbf ("idpf: add core init and interrupt request")
Fixes: 34c21fa ("idpf: implement virtchnl transaction manager")
	Cc: [email protected] # 6.9+
	Reviewed-by: Tarun K Singh <[email protected]>
	Signed-off-by: Pavan Kumar Linga <[email protected]>
	Tested-by: Krishneil Singh <[email protected]>
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 9b58031)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Trond Myklebust <[email protected]>
commit b326df4

It appears that in certain cases, RDMA capable transports can benefit
from the ability to establish multiple connections to increase their
throughput. This patch therefore enables the use of the "nconnect" mount
option for those use cases.

	Signed-off-by: Trond Myklebust <[email protected]>
(cherry picked from commit b326df4)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit be7a6a7

It isn't guaranteed that NETWORK_INTERFACE_INFO::LinkSpeed will always
be set by the server, so the client must handle any values and then
prevent oopses like below from happening:

Oops: divide error: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 1323 Comm: cat Not tainted 6.13.0-rc7 #2
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-3.fc41
04/01/2014
RIP: 0010:cifs_debug_data_proc_show+0xa45/0x1460 [cifs] Code: 00 00 48
89 df e8 3b cd 1b c1 41 f6 44 24 2c 04 0f 84 50 01 00 00 48 89 ef e8
e7 d0 1b c1 49 8b 44 24 18 31 d2 49 8d 7c 24 28 <48> f7 74 24 18 48 89
c3 e8 6e cf 1b c1 41 8b 6c 24 28 49 8d 7c 24
RSP: 0018:ffffc90001817be0 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88811230022c RCX: ffffffffc041bd99
RDX: 0000000000000000 RSI: 0000000000000567 RDI: ffff888112300228
RBP: ffff888112300218 R08: fffff52000302f5f R09: ffffed1022fa58ac
R10: ffff888117d2c566 R11: 00000000fffffffe R12: ffff888112300200
R13: 000000012a15343f R14: 0000000000000001 R15: ffff888113f2db58
FS: 00007fe27119e740(0000) GS:ffff888148600000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe2633c5000 CR3: 0000000124da0000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ? __die_body.cold+0x19/0x27
 ? die+0x2e/0x50
 ? do_trap+0x159/0x1b0
 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs]
 ? do_error_trap+0x90/0x130
 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs]
 ? exc_divide_error+0x39/0x50
 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs]
 ? asm_exc_divide_error+0x1a/0x20
 ? cifs_debug_data_proc_show+0xa39/0x1460 [cifs]
 ? cifs_debug_data_proc_show+0xa45/0x1460 [cifs]
 ? seq_read_iter+0x42e/0x790
 seq_read_iter+0x19a/0x790
 proc_reg_read_iter+0xbe/0x110
 ? __pfx_proc_reg_read_iter+0x10/0x10
 vfs_read+0x469/0x570
 ? do_user_addr_fault+0x398/0x760
 ? __pfx_vfs_read+0x10/0x10
 ? find_held_lock+0x8a/0xa0
 ? __pfx_lock_release+0x10/0x10
 ksys_read+0xd3/0x170
 ? __pfx_ksys_read+0x10/0x10
 ? __rcu_read_unlock+0x50/0x270
 ? mark_held_locks+0x1a/0x90
 do_syscall_64+0xbb/0x1d0
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7fe271288911
Code: 00 48 8b 15 01 25 10 00 f7 d8 64 89 02 b8 ff ff ff ff eb bd e8
20 ad 01 00 f3 0f 1e fa 80 3d b5 a7 10 00 00 74 13 31 c0 0f 05 <48> 3d
00 f0 ff ff 77 4f c3 66 0f 1f 44 00 00 55 48 89 e5 48 83 ec
RSP: 002b:00007ffe87c079d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 0000000000040000 RCX: 00007fe271288911
RDX: 0000000000040000 RSI: 00007fe2633c6000 RDI: 0000000000000003
RBP: 00007ffe87c07a00 R08: 0000000000000000 R09: 00007fe2713e6380
R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000040000
R13: 00007fe2633c6000 R14: 0000000000000003 R15: 0000000000000000
 </TASK>

Fix this by setting cifs_server_iface::speed to a sane value (1Gbps)
by default when link speed is unset.

	Cc: Shyam Prasad N <[email protected]>
	Cc: Tom Talpey <[email protected]>
Fixes: a6d8fb5 ("cifs: distribute channels across interfaces based on speed")
	Reported-by: Frank Sorenson <[email protected]>
	Reported-by: Jay Shin <[email protected]>
	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit be7a6a7)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit ec68680

Just ignore reparse points that the client can't parse rather than
bailing out and not opening the file or directory.

	Reported-by: Marc <[email protected]>
Closes: https://lore.kernel.org/r/CAMHwNVv-B+Q6wa0FEXrAuzdchzcJRsPKDDRrNaYZJd6X-+iJzw@mail.gmail.com
Fixes: 539aad7 ("smb: client: introduce ->parse_reparse_point()")
	Tested-by: Anthony Nandaa (Microsoft) <[email protected]>
	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit ec68680)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Michael Chan <[email protected]>
commit 1e9614c

Instead of passing the 2nd parameter phc_cfg to bnxt_ptp_init().
Store it in bp->ptp_cfg so that the caller doesn't need to know what
the value should be.

In the next patch, we'll need to call bnxt_ptp_init() in bnxt_resume()
and this will make it easier.

	Reviewed-by: Somnath Kotur <[email protected]>
	Reviewed-by: Pavan Chebbi <[email protected]>
	Reviewed-by: Kalesh AP <[email protected]>
	Signed-off-by: Michael Chan <[email protected]>
	Signed-off-by: Paolo Abeni <[email protected]>

(cherry picked from commit 1e9614c)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Michael Chan <[email protected]>
commit 3661c05

If we go through the PCI shutdown or suspend path, we shutdown the
NIC but PTP remains registered.  If the kernel continues to run for
a little bit, the periodic PTP .do_aux_work() function may be called
and it will read the PHC from the BAR register.  Since the device
has already been disabled, it will cause a PCIe completion timeout.
Fix it by calling bnxt_ptp_clear() in the PCI shutdown/suspend
handlers.  bnxt_ptp_clear() will unregister from PTP and
.do_aux_work() will be canceled.

In bnxt_resume(), we need to re-initialize PTP.

Fixes: a521c8a ("bnxt_en: Move bnxt_ptp_init() from bnxt_open() back to bnxt_init_one()")
	Cc: Richard Cochran <[email protected]>
	Reviewed-by: Somnath Kotur <[email protected]>
	Reviewed-by: Pavan Chebbi <[email protected]>
	Reviewed-by: Kalesh AP <[email protected]>
	Signed-off-by: Michael Chan <[email protected]>
	Signed-off-by: Paolo Abeni <[email protected]>

(cherry picked from commit 3661c05)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Steve French <[email protected]>
commit 9ed9d83

This client was only requesting READ caching, not READ and HANDLE caching
in the LeaseState on the open requests we send for directories.  To
delay closing a handle (e.g. for caching directory contents) we should
be requesting HANDLE as well as READ (as we already do for deferred
close of files).   See MS-SMB2 3.3.1.4 e.g.

	Cc: [email protected]
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 9ed9d83)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
cve CVE-2024-53178
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paul Aurich <[email protected]>
commit 7afb867

open_cached_dir() may either race with the tcon reconnection even before
compound_send_recv() or directly trigger a reconnection via
SMB2_open_init() or SMB_query_info_init().

The reconnection process invokes invalidate_all_cached_dirs() via
cifs_mark_open_files_invalid(), which removes all cfids from the
cfids->entries list but doesn't drop a ref if has_lease isn't true. This
results in the currently-being-constructed cfid not being on the list,
but still having a refcount of 2. It leaks if returned from
open_cached_dir().

Fix this by setting cfid->has_lease when the ref is actually taken; the
cfid will not be used by other threads until it has a valid time.

Addresses these kmemleaks:

unreferenced object 0xffff8881090c4000 (size 1024):
  comm "bash", pid 1860, jiffies 4295126592
  hex dump (first 32 bytes):
    00 01 00 00 00 00 ad de 22 01 00 00 00 00 ad de  ........".......
    00 ca 45 22 81 88 ff ff f8 dc 4f 04 81 88 ff ff  ..E"......O.....
  backtrace (crc 6f58c20f):
    [<ffffffff8b895a1e>] __kmalloc_cache_noprof+0x2be/0x350
    [<ffffffff8bda06e3>] open_cached_dir+0x993/0x1fb0
    [<ffffffff8bdaa750>] cifs_readdir+0x15a0/0x1d50
    [<ffffffff8b9a853f>] iterate_dir+0x28f/0x4b0
    [<ffffffff8b9a9aed>] __x64_sys_getdents64+0xfd/0x200
    [<ffffffff8cf6da05>] do_syscall_64+0x95/0x1a0
    [<ffffffff8d00012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
unreferenced object 0xffff8881044fdcf8 (size 8):
  comm "bash", pid 1860, jiffies 4295126592
  hex dump (first 8 bytes):
    00 cc cc cc cc cc cc cc                          ........
  backtrace (crc 10c106a9):
    [<ffffffff8b89a3d3>] __kmalloc_node_track_caller_noprof+0x363/0x480
    [<ffffffff8b7d7256>] kstrdup+0x36/0x60
    [<ffffffff8bda0700>] open_cached_dir+0x9b0/0x1fb0
    [<ffffffff8bdaa750>] cifs_readdir+0x15a0/0x1d50
    [<ffffffff8b9a853f>] iterate_dir+0x28f/0x4b0
    [<ffffffff8b9a9aed>] __x64_sys_getdents64+0xfd/0x200
    [<ffffffff8cf6da05>] do_syscall_64+0x95/0x1a0
    [<ffffffff8d00012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

And addresses these BUG splats when unmounting the SMB filesystem:

BUG: Dentry ffff888140590ba0{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
WARNING: CPU: 3 PID: 3433 at fs/dcache.c:1536 umount_check+0xd0/0x100
Modules linked in:
CPU: 3 UID: 0 PID: 3433 Comm: bash Not tainted 6.12.0-rc4-g850925a8133c-dirty #49
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
RIP: 0010:umount_check+0xd0/0x100
Code: 8d 7c 24 40 e8 31 5a f4 ff 49 8b 54 24 40 41 56 49 89 e9 45 89 e8 48 89 d9 41 57 48 89 de 48 c7 c7 80 e7 db ac e8 f0 72 9a ff <0f> 0b 58 31 c0 5a 5b 5d 41 5c 41 5d 41 5e 41 5f e9 2b e5 5d 01 41
RSP: 0018:ffff88811cc27978 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff888140590ba0 RCX: ffffffffaaf20bae
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff8881f6fb6f40
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed1023984ee3
R10: ffff88811cc2771f R11: 00000000016cfcc0 R12: ffff888134383e08
R13: 0000000000000002 R14: ffff8881462ec668 R15: ffffffffaceab4c0
FS:  00007f23bfa98740(0000) GS:ffff8881f6f80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556de4a6f808 CR3: 0000000123c80000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 d_walk+0x6a/0x530
 shrink_dcache_for_umount+0x6a/0x200
 generic_shutdown_super+0x52/0x2a0
 kill_anon_super+0x22/0x40
 cifs_kill_sb+0x159/0x1e0
 deactivate_locked_super+0x66/0xe0
 cleanup_mnt+0x140/0x210
 task_work_run+0xfb/0x170
 syscall_exit_to_user_mode+0x29f/0x2b0
 do_syscall_64+0xa1/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f23bfb93ae7
Code: ff ff ff ff c3 66 0f 1f 44 00 00 48 8b 0d 11 93 0d 00 f7 d8 64 89 01 b8 ff ff ff ff eb bf 0f 1f 44 00 00 b8 50 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e9 92 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffee9138598 EFLAGS: 00000246 ORIG_RAX: 0000000000000050
RAX: 0000000000000000 RBX: 0000558f1803e9a0 RCX: 00007f23bfb93ae7
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000558f1803e9a0
RBP: 0000558f1803e600 R08: 0000000000000007 R09: 0000558f17fab610
R10: d91d5ec34ab757b0 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000015 R15: 0000000000000000
 </TASK>
irq event stamp: 1163486
hardirqs last  enabled at (1163485): [<ffffffffac98d344>] _raw_spin_unlock_irqrestore+0x34/0x60
hardirqs last disabled at (1163486): [<ffffffffac97dcfc>] __schedule+0xc7c/0x19a0
softirqs last  enabled at (1163482): [<ffffffffab79a3ee>] __smb_send_rqst+0x3de/0x990
softirqs last disabled at (1163480): [<ffffffffac2314f1>] release_sock+0x21/0xf0
---[ end trace 0000000000000000 ]---

VFS: Busy inodes after unmount of cifs (cifs)
------------[ cut here ]------------
kernel BUG at fs/super.c:661!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 1 UID: 0 PID: 3433 Comm: bash Tainted: G        W          6.12.0-rc4-g850925a8133c-dirty #49
Tainted: [W]=WARN
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
RIP: 0010:generic_shutdown_super+0x290/0x2a0
Code: e8 15 7c f7 ff 48 8b 5d 28 48 89 df e8 09 7c f7 ff 48 8b 0b 48 89 ee 48 8d 95 68 06 00 00 48 c7 c7 80 7f db ac e8 00 69 af ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 90 90 90 90 90
RSP: 0018:ffff88811cc27a50 EFLAGS: 00010246
RAX: 000000000000003e RBX: ffffffffae994420 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffffffffab06180e RDI: ffff8881f6eb18c8
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed103edd6319
R10: ffff8881f6eb18cb R11: 00000000016d3158 R12: ffff8881462ec9c0
R13: ffff8881462ec050 R14: 0000000000000001 R15: 0000000000000000
FS:  00007f23bfa98740(0000) GS:ffff8881f6e80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8364005d68 CR3: 0000000123c80000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 kill_anon_super+0x22/0x40
 cifs_kill_sb+0x159/0x1e0
 deactivate_locked_super+0x66/0xe0
 cleanup_mnt+0x140/0x210
 task_work_run+0xfb/0x170
 syscall_exit_to_user_mode+0x29f/0x2b0
 do_syscall_64+0xa1/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f23bfb93ae7
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:generic_shutdown_super+0x290/0x2a0
Code: e8 15 7c f7 ff 48 8b 5d 28 48 89 df e8 09 7c f7 ff 48 8b 0b 48 89 ee 48 8d 95 68 06 00 00 48 c7 c7 80 7f db ac e8 00 69 af ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 90 90 90 90 90
RSP: 0018:ffff88811cc27a50 EFLAGS: 00010246
RAX: 000000000000003e RBX: ffffffffae994420 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffffffffab06180e RDI: ffff8881f6eb18c8
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed103edd6319
R10: ffff8881f6eb18cb R11: 00000000016d3158 R12: ffff8881462ec9c0
R13: ffff8881462ec050 R14: 0000000000000001 R15: 0000000000000000
FS:  00007f23bfa98740(0000) GS:ffff8881f6e80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8364005d68 CR3: 0000000123c80000 CR4: 0000000000350ef0

This reproduces eventually with an SMB mount and two shells running
these loops concurrently

- while true; do
      cd ~; sleep 1;
      for i in {1..3}; do cd /mnt/test/subdir;
          echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1;
      done;
      echo ...;
  done
- while true; do
      iptables -F OUTPUT; mount -t cifs -a;
      for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done;
      iptables -I OUTPUT -p tcp --dport 445 -j DROP;
      sleep 10
      echo "unmounting"; umount -l -t cifs -a; echo "done unmounting";
      sleep 20
      echo "recovering"; iptables -F OUTPUT;
      sleep 10;
  done

Fixes: ebe98f1 ("cifs: enable caching of directories for which a lease is held")
Fixes: 5c86919 ("smb: client: fix use-after-free in smb2_query_info_compound()")
	Cc: [email protected]
	Signed-off-by: Paul Aurich <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 7afb867)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
cve CVE-2024-53177
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paul Aurich <[email protected]>
commit a9685b4

If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

	Cc: [email protected]
	Signed-off-by: Paul Aurich <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit a9685b4)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Henrique Carvalho <[email protected]>
commit f6e8883

Checks inside open_cached_dir() can be removed because if dir caching is
disabled then tcon->cfids is necessarily NULL. Therefore, all other checks
are redundant.

	Signed-off-by: Henrique Carvalho <[email protected]>
	Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
	Reviewed-by: Enzo Matsumiya <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit f6e8883)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Henrique Carvalho <[email protected]>
commit ceaf145

Setting dir_cache_timeout to zero should disable the caching of
directory contents. Currently, even when dir_cache_timeout is zero,
some caching related functions are still invoked, which is unintended
behavior.

Fix the issue by setting tcon->nohandlecache to true when
dir_cache_timeout is zero, ensuring that directory handle caching
is properly disabled.

Fixes: 238b351 ("smb3: allow controlling length of time directory entries are cached with dir leases")
	Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
	Reviewed-by: Enzo Matsumiya <[email protected]>
	Signed-off-by: Henrique Carvalho <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit ceaf145)
	Signed-off-by: Jonathan Maple <[email protected]>
…fids

jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Henrique Carvalho <[email protected]>
commit 07bdf92

Change return value from -ENOENT to -EOPNOTSUPP to maintain consistency
with the return value of open_cached_dir() for the same case. This
change is safe as the only calling function does not differentiate
between these return values.

	Reviewed-by: Paulo Alcantara (Red Hat) <[email protected]>
	Reviewed-by: Enzo Matsumiya <[email protected]>
	Signed-off-by: Henrique Carvalho <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 07bdf92)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Alexander Usyskin <[email protected]>
commit 0dc0411

Extend the quirk to disable MEI interface on Intel PCH Ignition (IGN)
and SPS firmwares for RPL-S devices. These firmwares do not support
the MEI protocol.

Fixes: 3ed8c7d ("mei: me: add raptor lake point S DID")
	Cc: [email protected]
	Signed-off-by: Alexander Usyskin <[email protected]>
	Signed-off-by: Tomas Winkler <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Greg Kroah-Hartman <[email protected]>
(cherry picked from commit 0dc0411)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Babu Moger <[email protected]>
commit 0976783

The QOS Memory Bandwidth Enforcement Limit is reported by
CPUID_Fn80000020_EAX_x01 and CPUID_Fn80000020_EAX_x02:

  Bits	 Description
  31:0	 BW_LEN: Size of the QOS Memory Bandwidth Enforcement Limit.

Newer processors can support higher bandwidth limit than the current
hard-coded value. Remove latter and detect using CPUID instead. Also,
update the register variables eax and edx to match the AMD CPUID
definition.

The CPUID details are documented in the Processor Programming Reference
(PPR) Vol 1.1 for AMD Family 19h Model 11h B1 - 55901 Rev 0.25 in the
Link tag below.

Fixes: 4d05bf7 ("x86/resctrl: Introduce AMD QOS feature")
	Signed-off-by: Babu Moger <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Reviewed-by: Reinette Chatre <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Link: https://lore.kernel.org/r/c26a8ca79d399ed076cf8bf2e9fbc58048808289.1705359148.git.babu.moger@amd.com
(cherry picked from commit 0976783)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Babu Moger <[email protected]>
commit 54e35eb

If the BMEC (Bandwidth Monitoring Event Configuration) feature is
supported, the bandwidth events can be configured. The maximum supported
bandwidth bitmask can be read from CPUID:

  CPUID_Fn80000020_ECX_x03 [Platform QoS Monitoring Bandwidth Event Configuration]
  Bits    Description
  31:7    Reserved
   6:0    Identifies the bandwidth sources that can be tracked.

While at it, move the mask checking to mon_config_write() before
iterating over all the domains. Also, print the valid bitmask when the
user tries to configure invalid event configuration value.

The CPUID details are documented in the Processor Programming Reference
(PPR) Vol 1.1 for AMD Family 19h Model 11h B1 - 55901 Rev 0.25 in the
Link tag.

Fixes: dc2a3e8 ("x86/resctrl: Add interface to read mbm_total_bytes_config")
	Signed-off-by: Babu Moger <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Reviewed-by: Reinette Chatre <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
Link: https://lore.kernel.org/r/669896fa512c7451319fa5ca2fdb6f7e015b5635.1705359148.git.babu.moger@amd.com
(cherry picked from commit 54e35eb)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Babu Moger <[email protected]>
commit fc747ee

The kernel test robot reported the following warning after commit

  54e35eb ("x86/resctrl: Read supported bandwidth sources from CPUID").

even though the issue is present even in the original commit

  92bd5a1 ("x86/resctrl: Add interface to write mbm_total_bytes_config")

which added this function. The reported warning is:

  $ make C=1 CHECK=scripts/coccicheck arch/x86/kernel/cpu/resctrl/rdtgroup.o
  ...
  arch/x86/kernel/cpu/resctrl/rdtgroup.c:1621:5-8: Unneeded variable: "ret". Return "0" on line 1655

Remove the local variable 'ret'.

  [ bp: Massage commit message, make mbm_config_write_domain() void. ]

Fixes: 92bd5a1 ("x86/resctrl: Add interface to write mbm_total_bytes_config")
	Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
	Signed-off-by: Babu Moger <[email protected]>
	Signed-off-by: Borislav Petkov (AMD) <[email protected]>
	Acked-by: Reinette Chatre <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit fc747ee)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Tony Luck <[email protected]>
commit 17c4fc3

New CPU #defines encode vendor and family as well as model.

	Signed-off-by: Tony Luck <[email protected]>
	Acked-by: Rafael J. Wysocki <[email protected]>
	Signed-off-by: Rafael J. Wysocki <[email protected]>
(cherry picked from commit 17c4fc3)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Artem Bityutskiy <[email protected]>
commit 370406b

Add Granite Rapids Xeon C-states, which are C1, C1E, C6, and C6P.

Comparing to previous Xeon Generations (e.g., Emerald Rapids), C6
requests end up only in core C6 state, and no package C-state promotion
takes place even if all cores in the package are in core C6.

C6P requests also end up in core C6, but if all cores have requested
C6P, the SoC will enter the package C6 state.

	Signed-off-by: Artem Bityutskiy <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Changelog edits ]
	Signed-off-by: Rafael J. Wysocki <[email protected]>
(cherry picked from commit 370406b)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Kai-Heng Feng <[email protected]>
commit 5bb3321

PCIe ethernet throughut is sub-optimal on Jasper Lake and Elkhart Lake.

The CPU can take long time to exit to C0 to handle IRQ and perform DMA
when C1E has been entered.

For this reason, adjust intel_idle to disable promotion to C1E and still
use C-states from ACPI _CST on those two platforms.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219023
	Signed-off-by: Kai-Heng Feng <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Subject and changelog edits ]
	Signed-off-by: Rafael J. Wysocki <[email protected]>
(cherry picked from commit 5bb3321)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Michael Chan <[email protected]>
commit de37faf

The existing code is using RSS profile to determine IPV4/IPV6 GSO type
on all chips older than 5760X.  This won't work on 5750X chips that may
be using modified RSS profiles.  This commit from 2018 has updated the
driver to not use RSS profile for HW GRO packets on newer chips:

50f011b ("bnxt_en: Update RSS setup and GRO-HW logic according to the latest spec.")

However, a recent commit to add support for the newest 5760X chip broke
the logic.  If the GRO packet needs to be re-segmented by the stack, the
wrong GSO type will cause the packet to be dropped.

Fix it to only use RSS profile to determine GSO type on the oldest
5730X/5740X chips which cannot use the new method and is safe to use the
RSS profiles.

Also fix the L3/L4 hash type for RX packets by not using the RSS
profile for the same reason.  Use the ITYPE field in the RX completion
to determine L3/L4 hash types correctly.

Fixes: a7445d6 ("bnxt_en: Add support for new RX and TPA_START completion types for P7")
	Reviewed-by: Colin Winegarden <[email protected]>
	Reviewed-by: Somnath Kotur <[email protected]>
	Reviewed-by: Kalesh AP <[email protected]>
	Signed-off-by: Michael Chan <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit de37faf)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Eric Dumazet <[email protected]>
commit 581073f

trafgen performance considerably sank on hosts with many cores
after the blamed commit.

packet_read_pending() is very expensive, and calling it
in af_packet fast path defeats Daniel intent in commit
b013840 ("packet: use percpu mmap tx frame pending refcount")

tpacket_destruct_skb() makes room for one packet, we can immediately
wakeup a producer, no need to completely drain the tx ring.

Fixes: 89ed5b5 ("af_packet: Block execution of tasks waiting for transmit to complete in AF_PACKET")
	Signed-off-by: Eric Dumazet <[email protected]>
	Cc: Neil Horman <[email protected]>
	Cc: Daniel Borkmann <[email protected]>
	Reviewed-by: Willem de Bruijn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 581073f)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Xiaxi Shen <[email protected]>
commit bdcffe4

Fixed typos in various files under fs/smb/client/

	Signed-off-by: Xiaxi Shen <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit bdcffe4)
	Signed-off-by: Jonathan Maple <[email protected]>
PlaidCat added 20 commits March 11, 2025 16:46
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit fa2f990

When shutting down the server in cifs_put_tcp_session(), cifsd thread
might be reconnecting to multiple DFS targets before it realizes it
should exit the loop, so @server->hostname can't be freed as long as
cifsd thread isn't done.  Otherwise the following can happen:

  RIP: 0010:__slab_free+0x223/0x3c0
  Code: 5e 41 5f c3 cc cc cc cc 4c 89 de 4c 89 cf 44 89 44 24 08 4c 89
  1c 24 e8 fb cf 8e 00 44 8b 44 24 08 4c 8b 1c 24 e9 5f fe ff ff <0f>
  0b 41 f7 45 08 00 0d 21 00 0f 85 2d ff ff ff e9 1f ff ff ff 80
  RSP: 0018:ffffb26180dbfd08 EFLAGS: 00010246
  RAX: ffff8ea34728e510 RBX: ffff8ea34728e500 RCX: 0000000000800068
  RDX: 0000000000800068 RSI: 0000000000000000 RDI: ffff8ea340042400
  RBP: ffffe112041ca380 R08: 0000000000000001 R09: 0000000000000000
  R10: 6170732e31303000 R11: 70726f632e786563 R12: ffff8ea34728e500
  R13: ffff8ea340042400 R14: ffff8ea34728e500 R15: 0000000000800068
  FS: 0000000000000000(0000) GS:ffff8ea66fd80000(0000)
  000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007ffc25376080 CR3: 000000012a2ba001 CR4:
  PKRU: 55555554
  Call Trace:
   <TASK>
   ? show_trace_log_lvl+0x1c4/0x2df
   ? show_trace_log_lvl+0x1c4/0x2df
   ? __reconnect_target_unlocked+0x3e/0x160 [cifs]
   ? __die_body.cold+0x8/0xd
   ? die+0x2b/0x50
   ? do_trap+0xce/0x120
   ? __slab_free+0x223/0x3c0
   ? do_error_trap+0x65/0x80
   ? __slab_free+0x223/0x3c0
   ? exc_invalid_op+0x4e/0x70
   ? __slab_free+0x223/0x3c0
   ? asm_exc_invalid_op+0x16/0x20
   ? __slab_free+0x223/0x3c0
   ? extract_hostname+0x5c/0xa0 [cifs]
   ? extract_hostname+0x5c/0xa0 [cifs]
   ? __kmalloc+0x4b/0x140
   __reconnect_target_unlocked+0x3e/0x160 [cifs]
   reconnect_dfs_server+0x145/0x430 [cifs]
   cifs_handle_standard+0x1ad/0x1d0 [cifs]
   cifs_demultiplex_thread+0x592/0x730 [cifs]
   ? __pfx_cifs_demultiplex_thread+0x10/0x10 [cifs]
   kthread+0xdd/0x100
   ? __pfx_kthread+0x10/0x10
   ret_from_fork+0x29/0x50
   </TASK>

Fixes: 7be3248 ("cifs: To match file servers, make sure the server hostname matches")
	Reported-by: Jay Shin <[email protected]>
	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit fa2f990)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 62eecd8

Use new helper in find_domain_name() and find_timestamp() to avoid
duplicating code.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 62eecd8)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 0e8ae9b

Parse FQDN of the domain in CHALLENGE_MESSAGE message as it's gonna be
useful when mounting DFS shares against old Windows Servers (2012 R2
or earlier) that return not fully qualified hostnames for DFS targets
by default.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 0e8ae9b)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit ad46faf

Old Windows servers will return not fully qualified DFS targets by
default as specified in

  MS-DFSC 3.2.5.5 Receiving a Root Referral Request or Link Referral
  Request

    | Servers SHOULD<30> return fully qualified DNS host names of
    | targets in responses to root referral requests and link referral
    | requests.
    | ...
    | <30> Section 3.2.5.5: By default, Windows Server 2003, Windows
    | Server 2008, Windows Server 2008 R2, Windows Server 2012, and
    | Windows Server 2012 R2 return DNS host names that are not fully
    | qualified for targets.

Fix this by converting all NetBIOS host names from DFS targets to
FQDNs and try resolving them first if DNS domain name was provided in
NTLMSSP CHALLENGE_MESSAGE message from previous SMB2_SESSION_SETUP.
This also prevents the client from translating the DFS target
hostnames to another domain depending on the network domain search
order.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit ad46faf)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 489d152

If the user specified a DNS domain name in domain= mount option, then
use it instead of parsing it in NTLMSSP CHALLENGE_MESSAGE message.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 489d152)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 4b1b4c8

Some places pass hostnames rather than UNC paths to resolve them to ip
addresses, so provide helpers to handle both cases and then stop
converting hostnames to UNC paths by inserting path delimiters into
them.  Also kill @expiry parameter as it's not used anywhere.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 4b1b4c8)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 5433c62

If a link referral request sent to root server was successful but
client failed to connect to all link targets, there is no need to
retry same link referral on a different root server.  Set an end
marker for the DFS root referral so the client will not attempt to
re-send link referrals to different root servers on failures.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 5433c62)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit bfc1155

Return -ENOENT in parse_dfs_referrals() when server returns no targets
for a referral request as specified in

  MS-DFSC 3.1.5.4.3 Receiving a Root Referral Response or Link
  Referral Response:

    > If the referral request is successful, but the NumberOfReferrals
    > field in the referral header (as specified in section 2.2.4) is
    > 0, the DFS server could not find suitable targets to return to
    > the client.  In this case, the client MUST fail the original I/O
    > operation with STATUS_OBJECT_PATH_NOT_FOUND.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit bfc1155)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 48aa995

If TCP Server is about to be destroyed (e.g. CifsExiting was set) and
it is reconnecting, stop retrying DFS targets from cached DFS referral
as this would potentially delay server shutdown in several seconds.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 48aa995)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 0a9b00e

TCP_Server_Info::leaf_fullpath is allocated in cifs_get_tcp_session()
and never changed afterwards, so there is no need to serialize its
access.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 0a9b00e)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 056e91c

The matching of DFS connections is already handled by @dfs_conn, so
remove @leaf_fullpath matching altogether.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 056e91c)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 77c2e45

Some servers don't respect the DFSREF_STORAGE_SERVER bit, so
unconditionally tree connect to DFS link target and then decide
whether or not continue chasing DFS referrals for DFS interlinks.
Otherwise the client would fail to mount such shares.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 77c2e45)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 773dc23

When the client attempts to tree connect to a domain-based DFS
namespace from a DFS interlink target, the server will return
STATUS_BAD_NETWORK_NAME and the following will appear on dmesg:

	CIFS: VFS:  BAD_NETWORK_NAME: \\dom\dfs

Since a DFS share might contain several DFS interlinks and they expire
after 10 minutes, the above message might end up being flooded on
dmesg when mounting or accessing them.

Print this only once per share.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 773dc23)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit be1963d

After commit 36008fe ("smb: client: don't try following DFS links
in cifs_tree_connect()"), TCP_Server_Info::leaf_fullpath will no
longer be changed, so there is no need to kstrdup() it.

	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit be1963d)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Jacob Keller <[email protected]>
commit 95aca43

The ice_read_phy_tstamp_ll_e810 function repeatedly reads the PF_SB_ATQBAL
register until the TS_LL_READ_TS bit is cleared. This is a perfect
candidate for using rd32_poll_timeout. However, the default implementation
uses a sleep-based wait. Use read_poll_timeout_atomic macro which is based
on the non-sleeping implementation and use it to replace the loop reading
in the ice_read_phy_tstamp_ll_e810 function.

Co-developed-by: Karol Kolacinski <[email protected]>
	Signed-off-by: Karol Kolacinski <[email protected]>
	Signed-off-by: Jacob Keller <[email protected]>
	Signed-off-by: Anton Nadezhdin <[email protected]>
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 95aca43)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Jacob Keller <[email protected]>
commit 5b15b1f

The TS_LL_READ macros are used as part of the low latency Tx timestamp
interface. A future firmware extension will add support for performing PHY
timer updates over this interface. Using TS_LL_READ as the prefix for these
macros will be confusing once the interface is used for other purposes.

Rename the macros, using the prefix REG_LL_PROXY_H, to better clarify that
this is for the low latency interface.
Additionally add macros for PF_SB_ATQBAH and PF_SB_ATQBAL registers to
better clarify content of this registers as PF_SB_ATQBAH contain low
part of Tx timestamp

Co-developed-by: Karol Kolacinski <[email protected]>
	Signed-off-by: Karol Kolacinski <[email protected]>
	Signed-off-by: Jacob Keller <[email protected]>
	Reviewed-by: Milena Olech <[email protected]>
	Signed-off-by: Anton Nadezhdin <[email protected]>
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 5b15b1f)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Jacob Keller <[email protected]>
commit 5032722

Newer firmware for the E810 devices support a 'low latency' interface to
interact with the PHY without using the Admin Queue. This is interacted
with via the REG_LL_PROXY_L and REG_LL_PROXY_H registers.

Currently, this interface is only used for Tx timestamps. There are two
different mechanisms, including one which uses an interrupt for firmware to
signal completion. However, these two methods are mutually exclusive, so no
synchronization between them was necessary.

This low latency interface is being extended in future firmware to support
also programming the PHY timers. Use of the interface for PHY timers will
need synchronization to ensure there is no overlap with a Tx timestamp.

The interrupt-based response complicates the locking somewhat. We can't use
a simple spinlock. This would require being acquired in
ice_ptp_req_tx_single_tstamp, and released in
ice_ptp_complete_tx_single_tstamp. The ice_ptp_req_tx_single_tstamp
function is called from the threaded IRQ, and the
ice_ptp_complete_tx_single_stamp is called from the low latency IRQ, so we
would need to acquire the lock with IRQs disabled.

To handle this, we'll use a wait queue along with
wait_event_interruptible_locked_irq in the update flows which don't use the
interrupt.

The interrupt flow will acquire the wait queue lock, set the
ATQBAL_FLAGS_INTR_IN_PROGRESS, and then initiate the firmware low latency
request, and unlock the wait queue lock.

Upon receipt of the low latency interrupt, the lock will be acquired, the
ATQBAL_FLAGS_INTR_IN_PROGRESS bit will be cleared, and the firmware
response will be captured, and wake_up_locked() will be called on the wait
queue.

The other flows will use wait_event_interruptible_locked_irq() to wait
until the ATQBAL_FLAGS_INTR_IN_PROGRESS is clear. This function checks the
condition under lock, but does not hold the lock while waiting. On return,
the lock is held, and a return of zero indicates we hold the lock and the
in-progress flag is not set.

This will ensure that threads which need to use the low latency interface
will sleep until they can acquire the lock without any pending low latency
interrupt flow interfering.

	Signed-off-by: Jacob Keller <[email protected]>
	Reviewed-by: Milena Olech <[email protected]>
	Signed-off-by: Anton Nadezhdin <[email protected]>
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit 5032722)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Jacob Keller <[email protected]>
commit a5c69d4

Newer versions of firmware support programming the PHY timer via the low
latency interface exposed over REG_LL_PROXY_L and REG_LL_PROXY_H. Add
support for checking the device capabilities for this feature.

Co-developed-by: Karol Kolacinski <[email protected]>
	Signed-off-by: Karol Kolacinski <[email protected]>
	Signed-off-by: Jacob Keller <[email protected]>
	Reviewed-by: Milena Olech <[email protected]>
	Signed-off-by: Anton Nadezhdin <[email protected]>
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit a5c69d4)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2522
Rebuild_History Non-Buildable kernel-5.14.0-503.29.1.el9_5
commit-author Jacob Keller <[email protected]>
commit ef9a64c

Programming the PHY registers in preparation for an increment value change
or a timer adjustment on E810 requires issuing Admin Queue commands for
each PHY register. It has been found that the firmware Admin Queue
processing occasionally has delays of tens or rarely up to hundreds of
milliseconds. This delay cascades to failures in the PTP applications which
depend on these updates being low latency.

Consider a standard PTP profile with a sync rate of 16 times per second.
This means there is ~62 milliseconds between sync messages. A complete
cycle of the PTP algorithm

1) Sync message (with Tx timestamp) from source
2) Follow-up message from source
3) Delay request (with Tx timestamp) from sink
4) Delay response (with Rx timestamp of request) from source
5) measure instantaneous clock offset
6) request time adjustment via CLOCK_ADJTIME systemcall

The Tx timestamps have a default maximum timeout of 10 milliseconds. If we
assume that the maximum possible time is used, this leaves us with ~42
milliseconds of processing time for a complete cycle.

The CLOCK_ADJTIME system call is synchronous and will block until the
driver completes its timer adjustment or frequency change.

If the writes to prepare the PHY timers get hit by a latency spike of 50
milliseconds, then the PTP application will be delayed past the point where
the next cycle should start. Packets from the next cycle may have already
arrived and are waiting on the socket.

In particular, LinuxPTP ptp4l may start complaining about missing an
announce message from the source, triggering a fault. In addition, the
clockcheck logic it uses may trigger. This clockcheck failure occurs
because the timestamp captured by hardware is compared against a reading of
CLOCK_MONOTONIC. It is assumed that the time when the Rx timestamp is
captured and the read from CLOCK_MONOTONIC are relatively close together.
This is not the case if there is a significant delay to processing the Rx
packet.

Newer firmware supports programming the PHY registers over a low latency
interface which bypasses the Admin Queue. Instead, software writes to the
REG_LL_PROXY_L and REG_LL_PROXY_H registers. Firmware reads these registers
and then programs the PHY timers.

Implement functions to use this interface when available to program the PHY
timers instead of using the Admin Queue. This avoids the Admin Queue
latency and ensures that adjustments happen within acceptable latency
bounds.

Co-developed-by: Karol Kolacinski <[email protected]>
	Signed-off-by: Karol Kolacinski <[email protected]>
	Signed-off-by: Jacob Keller <[email protected]>
	Signed-off-by: Anton Nadezhdin <[email protected]>
	Signed-off-by: Tony Nguyen <[email protected]>
(cherry picked from commit ef9a64c)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..master: 280817
Number of commits in rpm: 61
Number of commits matched with upstream: 58 (95.08%)
Number of commits in upstream but not in rpm: 280759
Number of commits NOT found in upstream: 3 (4.92%)

Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.29.1.el9_5 for kernel-5.14.0-503.29.1.el9_5
Clean Cherry Picks: 54 (93.10%)
Empty Cherry Picks: 3 (5.17%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-5.14.0-503.29.1.el9_5/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
Copy link
Collaborator

@bmastbergen bmastbergen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥌

@PlaidCat
Copy link
Collaborator Author

PlaidCat commented Mar 12, 2025

Thank You
I will fast-forward this locally soon.

This might be worth looking into as -ff-only is a basic requirement for these steps in our kernel.
https://github.com/orgs/community/discussions/4618#discussioncomment-2795556

[jmaple@devbox kernel-src-tree]$ git branch
 main
 master
 rocky9_5
* rocky9_5_rebuild
 rocky9_5_rebuild_kernel-5.14.0-503.29.1.el9_5
[jmaple@devbox kernel-src-tree]$ git checkout rocky9_5
Switched to branch 'rocky9_5'
Your branch is up to date with 'origin/rocky9_5'.
[jmaple@devbox kernel-src-tree]$ git merge --ff-only rocky9_5_rebuild
Updating 1173e76fd4c0..e6683295364e
Fast-forward

[jmaple@devbox kernel-src-tree]$ git log --decorate --oneline | head -n 2
e6683295364e (HEAD -> rocky9_5, tag: resf_kernel-5.14.0-503.29.1.el9_5, origin/rocky9_5_rebuild, rocky9_5_rebuild_kernel-5.14.0-503.29.1.el9_5, rocky9_5_rebuild) Rebuild rocky9_5 with kernel-5.14.0-503.29.1.el9_5
47b488370ef6 ice: implement low latency PHY timer updates

@PlaidCat PlaidCat merged commit e668329 into rocky9_5 Mar 12, 2025
4 checks passed
@PlaidCat
Copy link
Collaborator Author

[jmaple@devbox kernel-src-tree]$ git push origin rocky9_5
Total 0 (delta 0), reused 0 (delta 0), pack-reused 0
To github.com:ctrliq/kernel-src-tree.git
   1173e76fd4c0..e6683295364e  rocky9_5 -> rocky9_5

@PlaidCat PlaidCat deleted the rocky9_5_rebuild branch March 12, 2025 18:56
@PlaidCat
Copy link
Collaborator Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants