Skip to content

Conversation

PlaidCat
Copy link
Collaborator

@PlaidCat PlaidCat commented Apr 9, 2025

General Process:

Contains the following: http://download.rockylinux.org/pub/rocky/8.10/BaseOS/source/tree/Packages/k/

Checking Rebuild Commits for potentially missing commits:

$ ls ciq/ciq_backports/kernel-5.14.0-503.3*/rebuild.details.txt | while read line; do echo $line; cat $line; echo ""; echo ""; done
ciq/ciq_backports/kernel-5.14.0-503.31.1.el9_5/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..master: 294815
Number of commits in rpm: 44
Number of commits matched with upstream: 42 (95.45%)
Number of commits in upstream but not in rpm: 294790
Number of commits NOT found in upstream: 2 (4.55%)

Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.31.1.el9_5 for kernel-5.14.0-503.31.1.el9_5
Clean Cherry Picks: 17 (40.48%)
Empty Cherry Picks: 8 (19.05%)
_______________________________

__EMPTY COMMITS__________________________
64a66f4a3c89b4602ee1e6cd23b28729fc4562b3 cpufreq: intel_pstate: Update Balance performance EPP for Emerald Rapids
f8696133f6aa4e6a83c9fb2d9dddc6d194a2ba1f arp: Factorise ip_route_output() call in arp_req_set() and arp_req_delete().
51e9ba48d48786da89d2695be9a1cab40b2afc31 arp: Remove a nest in arp_req_get().
a428bfc77a4dd4ba19b7646e887fa655fcfee5a0 arp: Get dev after calling arp_req_(delete|set|get)().
0840556e5a3a331b6932ef17dd4bc94445df3297 net: Protect dev->name by seqlock.
bf4ea58874df3d43f7264709cec7fe320616552c arp: Convert ioctl(SIOCGARP) to RCU.
62e58ddb146502faff1dd23164a20688624eaaed net: add softirq safety to netdev_rename_lock
e361560a7912958ba3059f51e7dd21612d119169 dev: Acquire netdev_rename_lock before restoring dev->name in dev_change_name().

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 9, debranding and Rocky branding'
Ensure aarch64 kernel is not compressed'


ciq/ciq_backports/kernel-5.14.0-503.33.1.el9_5/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..master: 294815
Number of commits in rpm: 10
Number of commits matched with upstream: 8 (80.00%)
Number of commits in upstream but not in rpm: 294807
Number of commits NOT found in upstream: 2 (20.00%)

Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.33.1.el9_5 for kernel-5.14.0-503.33.1.el9_5
Clean Cherry Picks: 8 (100.00%)
Empty Cherry Picks: 0 (0.00%)
_______________________________

__EMPTY COMMITS__________________________

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 9, debranding and Rocky branding'
Ensure aarch64 kernel is not compressed'


ciq/ciq_backports/kernel-5.14.0-503.34.1.el9_5/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..master: 294815
Number of commits in rpm: 4
Number of commits matched with upstream: 1 (25.00%)
Number of commits in upstream but not in rpm: 294814
Number of commits NOT found in upstream: 3 (75.00%)

Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.34.1.el9_5 for kernel-5.14.0-503.34.1.el9_5
Clean Cherry Picks: 1 (100.00%)
Empty Cherry Picks: 0 (0.00%)
_______________________________

__EMPTY COMMITS__________________________

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 9, debranding and Rocky branding'
Ensure aarch64 kernel is not compressed'
crypto: rng - Fix extrng EFAULT handling


ciq/ciq_backports/kernel-5.14.0-503.35.1.el9_5/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..master: 294815
Number of commits in rpm: 11
Number of commits matched with upstream: 8 (72.73%)
Number of commits in upstream but not in rpm: 294807
Number of commits NOT found in upstream: 3 (27.27%)

Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.35.1.el9_5 for kernel-5.14.0-503.35.1.el9_5
Clean Cherry Picks: 7 (87.50%)
Empty Cherry Picks: 0 (0.00%)
_______________________________

__EMPTY COMMITS__________________________

__CHANGES NOT IN UPSTREAM________________
Porting to Rocky Linux 9, debranding and Rocky branding'
Ensure aarch64 kernel is not compressed'
redhat/configs: replace IOMMU_DEFAULT_DMA_STRICT with IOMMU_DEFAULT_DMA_LAZY


BUILD

/mnt/code/kernel-src-tree
no .config file found, moving on
[TIMER]{MRPROPER}: 0s
x86_64 architecture detected, copying config
'configs/kernel-5.14.0-x86_64.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky9_5_rebuild-c9a24474a2ec"
Making olddefconfig
#
# configuration written to .config
#
Starting Build
  SYNC    include/config/auto.conf.cmd
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_64.h

[SNIP]

  BTF [M] sound/virtio/virtio_snd.ko
  BTF [M] sound/xen/snd_xen_front.ko
[TIMER]{BUILD}: 1846s
Making Modules
  INSTALL /lib/modules/5.14.0-rocky9_5_rebuild-c9a24474a2ec/kernel/arch/x86/crypto/blake2s-x86_64.ko
  STRIP   /lib/modules/5.14.0-rocky9_5_rebuild-c9a24474a2ec/kernel/arch/x86/crypto/blake2s-x86_64.ko

[SNIP]

  SIGN    /lib/modules/5.14.0-rocky9_5_rebuild-c9a24474a2ec/kernel/sound/xen/snd_xen_front.ko
  DEPMOD  /lib/modules/5.14.0-rocky9_5_rebuild-c9a24474a2ec
[TIMER]{MODULES}: 41s
Making Install
sh ./arch/x86/boot/install.sh 5.14.0-rocky9_5_rebuild-c9a24474a2ec \
        arch/x86/boot/bzImage System.map "/boot"
[TIMER]{INSTALL}: 22s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-5.14.0-rocky9_5_rebuild-c9a24474a2ec and Index to 2
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 0s
[TIMER]{BUILD}: 1846s
[TIMER]{MODULES}: 41s
[TIMER]{INSTALL}: 22s
[TIMER]{TOTAL} 1913s
Rebooting in 10 seconds

Boot

[maple@r9-sigcloud-builder ~]$ uname -r
5.14.0-rocky9_5_rebuild-c9a24474a2ec

Kselftest

$ grep '^ok ' 5.14.0-rocky9_5_rebuild-c9a24474a2ec.kselftests.log | wc -l
316

PlaidCat added 30 commits April 8, 2025 18:11
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Pedro Henrique Kopper <[email protected]>
commit 64a66f4
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.31.1.el9_5/64a66f4a.failed

On Intel Emerald Rapids machines, we ship the Energy Performance Preference
(EPP) default for balance_performance as 128. However, during an internal
investigation together with Intel, we have determined that 32 is a more
suitable value. This leads to significant improvements in both performance
and energy:

POV-Ray: 32% faster | 12% less energy
OpenSSL: 12% faster | energy within 1%
Build Linux Kernel: 29% faster | 18% less energy

Therefore, we should move the default EPP for balance_performance to 32.
This is in line with what has already been done for Sapphire Rapids.

	Signed-off-by: Pedro Henrique Kopper <[email protected]>
	Acked-by: Srinivas Pandruvada <[email protected]>
Link: https://patch.msgid.link/Zqu6zjVMoiXwROBI@capivara
	Signed-off-by: Rafael J. Wysocki <[email protected]>
(cherry picked from commit 64a66f4)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/cpufreq/intel_pstate.c
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Bjorn Helgaas <[email protected]>
commit 199f968

Arul, Mateusz, Imcarneiro91, and Aman reported a regression caused by
07eab09 ("efi/x86: Remove EfiMemoryMappedIO from E820 map").  On the
Lenovo Legion 9i laptop, that commit removes the ECAM area from E820, which
means the early E820 validation fails, which means we don't enable ECAM in
the "early MCFG" path.

The static MCFG table describes ECAM without depending on the ACPI
interpreter.  Many Legion 9i ACPI methods rely on that, so they fail when
PCI config access isn't available, resulting in the embedded controller,
PS/2, audio, trackpad, and battery devices not being detected.  The _OSC
method also fails, so Linux can't take control of the PCIe hotplug, PME,
and AER features:

  # pci_mmcfg_early_init()

  PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0]
  PCI: not using ECAM ([mem 0xc0000000-0xce0fffff] not reserved)

  ACPI Error: AE_ERROR, Returned by Handler for [PCI_Config] (20230628/evregion-300)
  ACPI: Interpreter enabled
  ACPI: Ignoring error and continuing table load
  ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162)
  ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220)
  ACPI: Skipping parse of AML opcode: OpcodeName unavailable (0x0010)
  ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162)
  ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220)
  ...
  ACPI Error: Aborting method \_SB.PC00._OSC due to previous error (AE_NOT_FOUND) (20230628/psparse-529)
  acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_NOT_FOUND)

  # pci_mmcfg_late_init()

  PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0]
  PCI: [Firmware Info]: ECAM [mem 0xc0000000-0xce0fffff] not reserved in ACPI motherboard resources
  PCI: ECAM [mem 0xc0000000-0xce0fffff] is EfiMemoryMappedIO; assuming valid
  PCI: ECAM [mem 0xc0000000-0xce0fffff] reserved to work around lack of ACPI motherboard _CRS

Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be reserved by a PNP0C02
resource, but there's no requirement to mention it in E820, so we shouldn't
look at E820 to validate the ECAM space described by MCFG.

In 2006, 946f2ee ("[PATCH] i386/x86-64: Check that MCFG points to an
e820 reserved area") added a sanity check of E820 to work around buggy MCFG
tables, but that over-aggressive validation causes failures like this one.

Keep the E820 validation check for machines older than 2016, an arbitrary
ten years after 946f2ee, so machines that depend on it don't break.

Skip the early E820 check for 2016 and newer BIOSes since there's no
requirement to describe ECAM in E820.

Link: https://lore.kernel.org/r/[email protected]
Fixes: 07eab09 ("efi/x86: Remove EfiMemoryMappedIO from E820 map")
	Reported-by: Mateusz Kaduk <[email protected]>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218444
	Signed-off-by: Bjorn Helgaas <[email protected]>
	Tested-by: Mateusz Kaduk <[email protected]>
	Reviewed-by: Andy Shevchenko <[email protected]>
	Reviewed-by: Hans de Goede <[email protected]>
	Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
	Cc: [email protected]
(cherry picked from commit 199f968)
	Signed-off-by: Jonathan Maple <[email protected]>
…->trans

jira LE-2742
cve CVE-2024-50264
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Hyunwoo Kim <[email protected]>
commit 6ca5753

During loopback communication, a dangling pointer can be created in
vsk->trans, potentially leading to a Use-After-Free condition.  This
issue is resolved by initializing vsk->trans to NULL.

	Cc: stable <[email protected]>
Fixes: 06a8fc7 ("VSOCK: Introduce virtio_vsock_common.ko")
	Signed-off-by: Hyunwoo Kim <[email protected]>
	Signed-off-by: Wongi Lee <[email protected]>
	Signed-off-by: Greg Kroah-Hartman <[email protected]>
Message-Id: <2024102245-strive-crib-c8d3@gregkh>
	Signed-off-by: Michael S. Tsirkin <[email protected]>
(cherry picked from commit 6ca5753)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
cve CVE-2023-52605
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Prarit Bhargava <[email protected]>
commit 72d9b97

The gcc plugin -fanalyzer [1] tries to detect various
patterns of incorrect behaviour.  The tool reports:

drivers/acpi/acpi_extlog.c: In function ‘extlog_exit’:
drivers/acpi/acpi_extlog.c:307:12: warning: check of ‘extlog_l1_addr’ for NULL after already dereferencing it [-Wanalyzer-deref-before-check]
    |
    |  306 |         ((struct extlog_l1_head *)extlog_l1_addr)->flags &= ~FLAG_OS_OPTIN;
    |      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~
    |      |                                                  |
    |      |                                                  (1) pointer ‘extlog_l1_addr’ is dereferenced here
    |  307 |         if (extlog_l1_addr)
    |      |            ~
    |      |            |
    |      |            (2) pointer ‘extlog_l1_addr’ is checked for NULL here but it was already dereferenced at (1)
    |

Fix the NULL pointer dereference check in extlog_exit().

Link: https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Static-Analyzer-Options.html # [1]

	Signed-off-by: Prarit Bhargava <[email protected]>
	Signed-off-by: Rafael J. Wysocki <[email protected]>
(cherry picked from commit 72d9b97)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Kuniyuki Iwashima <[email protected]>
commit 42033d0

In arp_req_set(), if ATF_PERM is set in arpreq.arp_flags,
ATF_COM is set automatically.

The flag will be used later for neigh_update() only when
a neighbour entry is found.

Let's set ATF_COM just before calling neigh_update().

	Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 42033d0)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Kuniyuki Iwashima <[email protected]>
commit 0592367

When ioctl(SIOCDARP/SIOCSARP) is issued with ATF_PUBL, r.arp_netmask
must be 0.0.0.0 or 255.255.255.255.

Currently, the netmask is validated in arp_req_delete_public() or
arp_req_set_public() under rtnl_lock().

We have ATF_NETMASK test in arp_ioctl() before holding rtnl_lock(),
so let's move the netmask validation there.

	Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 0592367)
	Signed-off-by: Jonathan Maple <[email protected]>
…lete().

jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Kuniyuki Iwashima <[email protected]>
commit f869613
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.31.1.el9_5/f8696133.failed

When ioctl(SIOCDARP/SIOCSARP) is issued for non-proxy entry (no ATF_COM)
without arpreq.arp_dev[] set, arp_req_set() and arp_req_delete() looks up
dev based on IPv4 address by ip_route_output().

Let's factorise the same code as arp_req_dev().

	Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit f869613)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/ipv4/arp.c
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Kuniyuki Iwashima <[email protected]>
commit 51e9ba4
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.31.1.el9_5/51e9ba48.failed

This is a prep patch to make the following changes tidy.

No functional change intended.

	Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 51e9ba4)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/ipv4/arp.c
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Kuniyuki Iwashima <[email protected]>
commit a428bfc
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.31.1.el9_5/a428bfc7.failed

arp_ioctl() holds rtnl_lock() first regardless of cmd (SIOCDARP,
SIOCSARP, and SIOCGARP) to get net_device by __dev_get_by_name()
and copy dev->name safely.

In the SIOCGARP path, arp_req_get() calls neigh_lookup(), which
looks up a neighbour entry under RCU.

We will extend the RCU section not to take rtnl_lock() and instead
use dev_get_by_name_rcu() for SIOCGARP.

As a preparation, let's move __dev_get_by_name() into another
function and call it from arp_req_delete(), arp_req_set(), and
arp_req_get().

	Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit a428bfc)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/ipv4/arp.c
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author YueHaibing <[email protected]>
commit d0358c1

This is not used, so can remove it.

	Signed-off-by: YueHaibing <[email protected]>
	Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit d0358c1)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Kuniyuki Iwashima <[email protected]>
commit 0840556
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.31.1.el9_5/0840556e.failed

We will convert ioctl(SIOCGARP) to RCU, and then we need to copy
dev->name which is currently protected by rtnl_lock().

This patch does the following:

  1) Add seqlock netdev_rename_lock to protect dev->name

  2) Add netdev_copy_name() that copies dev->name to buffer
     under netdev_rename_lock

  3) Use netdev_copy_name() in netdev_get_name() and drop
     devnet_rename_sem

	Suggested-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/netdev/CANn89iJEWs7AYSJqGCUABeVqOCTkErponfZdT5kV-iD=-SajnQ@mail.gmail.com/
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 0840556)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/core/dev.c
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Kuniyuki Iwashima <[email protected]>
commit bf4ea58
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.31.1.el9_5/bf4ea588.failed

ioctl(SIOCGARP) holds rtnl_lock() to get netdev by __dev_get_by_name()
and copy dev->name safely and calls neigh_lookup() later, which looks
up a neighbour entry under RCU.

Let's replace __dev_get_by_name() with dev_get_by_name_rcu() and strscpy()
with netdev_copy_name() to avoid locking rtnl_lock().

	Signed-off-by: Kuniyuki Iwashima <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit bf4ea58)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/ipv4/arp.c
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Eric Dumazet <[email protected]>
commit 62e58dd
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.31.1.el9_5/62e58ddb.failed

syzbot reported a lockdep violation involving bridge driver [1]

Make sure netdev_rename_lock is softirq safe to fix this issue.

[1]
WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
6.10.0-rc2-syzkaller-00249-gbe27b8965297 #0 Not tainted
   -----------------------------------------------------
syz-executor.2/9449 [HC0[0]:SC0[2]:HE0:SE0] is trying to acquire:
 ffffffff8f5de668 (netdev_rename_lock.seqcount){+.+.}-{0:0}, at: rtnl_fill_ifinfo+0x38e/0x2270 net/core/rtnetlink.c:1839

and this task is already holding:
 ffff888060c64cb8 (&br->lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
 ffff888060c64cb8 (&br->lock){+.-.}-{2:2}, at: br_port_slave_changelink+0x3d/0x150 net/bridge/br_netlink.c:1212
which would create a new lock dependency:
 (&br->lock){+.-.}-{2:2} -> (netdev_rename_lock.seqcount){+.+.}-{0:0}

but this new dependency connects a SOFTIRQ-irq-safe lock:
 (&br->lock){+.-.}-{2:2}

... which became SOFTIRQ-irq-safe at:
   lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
   __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
   _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
   spin_lock include/linux/spinlock.h:351 [inline]
   br_forward_delay_timer_expired+0x50/0x440 net/bridge/br_stp_timer.c:86
   call_timer_fn+0x18e/0x650 kernel/time/timer.c:1792
   expire_timers kernel/time/timer.c:1843 [inline]
   __run_timers kernel/time/timer.c:2417 [inline]
   __run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2428
   run_timer_base kernel/time/timer.c:2437 [inline]
   run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2447
   handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
   __do_softirq kernel/softirq.c:588 [inline]
   invoke_softirq kernel/softirq.c:428 [inline]
   __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
   irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
   instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
   sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
   asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
   lock_acquire+0x264/0x550 kernel/locking/lockdep.c:5758
   fs_reclaim_acquire+0xaf/0x140 mm/page_alloc.c:3800
   might_alloc include/linux/sched/mm.h:334 [inline]
   slab_pre_alloc_hook mm/slub.c:3890 [inline]
   slab_alloc_node mm/slub.c:3980 [inline]
   kmalloc_trace_noprof+0x3d/0x2c0 mm/slub.c:4147
   kmalloc_noprof include/linux/slab.h:660 [inline]
   kzalloc_noprof include/linux/slab.h:778 [inline]
   class_dir_create_and_add drivers/base/core.c:3255 [inline]
   get_device_parent+0x2a7/0x410 drivers/base/core.c:3315
   device_add+0x325/0xbf0 drivers/base/core.c:3645
   netdev_register_kobject+0x17e/0x320 net/core/net-sysfs.c:2136
   register_netdevice+0x11d5/0x19e0 net/core/dev.c:10375
   nsim_init_netdevsim drivers/net/netdevsim/netdev.c:690 [inline]
   nsim_create+0x647/0x890 drivers/net/netdevsim/netdev.c:750
   __nsim_dev_port_add+0x6c0/0xae0 drivers/net/netdevsim/dev.c:1390
   nsim_dev_port_add_all drivers/net/netdevsim/dev.c:1446 [inline]
   nsim_dev_reload_create drivers/net/netdevsim/dev.c:1498 [inline]
   nsim_dev_reload_up+0x69b/0x8e0 drivers/net/netdevsim/dev.c:985
   devlink_reload+0x478/0x870 net/devlink/dev.c:474
   devlink_nl_reload_doit+0xbd6/0xe50 net/devlink/dev.c:586
   genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
   genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
   genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
   netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
   genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
   netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
   netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
   netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
   sock_sendmsg_nosec net/socket.c:730 [inline]
   __sock_sendmsg+0x221/0x270 net/socket.c:745
   ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
   ___sys_sendmsg net/socket.c:2639 [inline]
   __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
   do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
  entry_SYSCALL_64_after_hwframe+0x77/0x7f

to a SOFTIRQ-irq-unsafe lock:
 (netdev_rename_lock.seqcount){+.+.}-{0:0}

... which became SOFTIRQ-irq-unsafe at:
...
   lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
   do_write_seqcount_begin_nested include/linux/seqlock.h:469 [inline]
   do_write_seqcount_begin include/linux/seqlock.h:495 [inline]
   write_seqlock include/linux/seqlock.h:823 [inline]
   dev_change_name+0x184/0x920 net/core/dev.c:1229
   do_setlink+0xa4b/0x41f0 net/core/rtnetlink.c:2880
   __rtnl_newlink net/core/rtnetlink.c:3696 [inline]
   rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
   rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
   netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
   netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
   netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
   netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
   sock_sendmsg_nosec net/socket.c:730 [inline]
   __sock_sendmsg+0x221/0x270 net/socket.c:745
   __sys_sendto+0x3a4/0x4f0 net/socket.c:2192
   __do_sys_sendto net/socket.c:2204 [inline]
   __se_sys_sendto net/socket.c:2200 [inline]
   __x64_sys_sendto+0xde/0x100 net/socket.c:2200
   do_syscall_x64 arch/x86/entry/common.c:52 [inline]
   do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
  entry_SYSCALL_64_after_hwframe+0x77/0x7f

other info that might help us debug this:

 Possible interrupt unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(netdev_rename_lock.seqcount);
                               local_irq_disable();
                               lock(&br->lock);
                               lock(netdev_rename_lock.seqcount);
  <Interrupt>
    lock(&br->lock);

 *** DEADLOCK ***

3 locks held by syz-executor.2/9449:
  #0: ffffffff8f5e7448 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
  #0: ffffffff8f5e7448 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x842/0x1180 net/core/rtnetlink.c:6632
  #1: ffff888060c64cb8 (&br->lock){+.-.}-{2:2}, at: spin_lock_bh include/linux/spinlock.h:356 [inline]
  #1: ffff888060c64cb8 (&br->lock){+.-.}-{2:2}, at: br_port_slave_changelink+0x3d/0x150 net/bridge/br_netlink.c:1212
  #2: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:329 [inline]
  #2: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:781 [inline]
  #2: ffffffff8e333fa0 (rcu_read_lock){....}-{1:2}, at: team_change_rx_flags+0x29/0x330 drivers/net/team/team_core.c:1767

the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
-> (&br->lock){+.-.}-{2:2} {
   HARDIRQ-ON-W at:
                     lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
                     __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
                     _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
                     spin_lock_bh include/linux/spinlock.h:356 [inline]
                     br_add_if+0xb34/0xef0 net/bridge/br_if.c:682
                     do_set_master net/core/rtnetlink.c:2701 [inline]
                     do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2907
                     __rtnl_newlink net/core/rtnetlink.c:3696 [inline]
                     rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
                     rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
                     netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
                     netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
                     netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
                     netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
                     sock_sendmsg_nosec net/socket.c:730 [inline]
                     __sock_sendmsg+0x221/0x270 net/socket.c:745
                     __sys_sendto+0x3a4/0x4f0 net/socket.c:2192
                     __do_sys_sendto net/socket.c:2204 [inline]
                     __se_sys_sendto net/socket.c:2200 [inline]
                     __x64_sys_sendto+0xde/0x100 net/socket.c:2200
                     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
                     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
                    entry_SYSCALL_64_after_hwframe+0x77/0x7f
   IN-SOFTIRQ-W at:
                     lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
                     __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
                     _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
                     spin_lock include/linux/spinlock.h:351 [inline]
                     br_forward_delay_timer_expired+0x50/0x440 net/bridge/br_stp_timer.c:86
                     call_timer_fn+0x18e/0x650 kernel/time/timer.c:1792
                     expire_timers kernel/time/timer.c:1843 [inline]
                     __run_timers kernel/time/timer.c:2417 [inline]
                     __run_timer_base+0x66a/0x8e0 kernel/time/timer.c:2428
                     run_timer_base kernel/time/timer.c:2437 [inline]
                     run_timer_softirq+0xb7/0x170 kernel/time/timer.c:2447
                     handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
                     __do_softirq kernel/softirq.c:588 [inline]
                     invoke_softirq kernel/softirq.c:428 [inline]
                     __irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
                     irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
                     instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
                     sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
                     asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702
                     lock_acquire+0x264/0x550 kernel/locking/lockdep.c:5758
                     fs_reclaim_acquire+0xaf/0x140 mm/page_alloc.c:3800
                     might_alloc include/linux/sched/mm.h:334 [inline]
                     slab_pre_alloc_hook mm/slub.c:3890 [inline]
                     slab_alloc_node mm/slub.c:3980 [inline]
                     kmalloc_trace_noprof+0x3d/0x2c0 mm/slub.c:4147
                     kmalloc_noprof include/linux/slab.h:660 [inline]
                     kzalloc_noprof include/linux/slab.h:778 [inline]
                     class_dir_create_and_add drivers/base/core.c:3255 [inline]
                     get_device_parent+0x2a7/0x410 drivers/base/core.c:3315
                     device_add+0x325/0xbf0 drivers/base/core.c:3645
                     netdev_register_kobject+0x17e/0x320 net/core/net-sysfs.c:2136
                     register_netdevice+0x11d5/0x19e0 net/core/dev.c:10375
                     nsim_init_netdevsim drivers/net/netdevsim/netdev.c:690 [inline]
                     nsim_create+0x647/0x890 drivers/net/netdevsim/netdev.c:750
                     __nsim_dev_port_add+0x6c0/0xae0 drivers/net/netdevsim/dev.c:1390
                     nsim_dev_port_add_all drivers/net/netdevsim/dev.c:1446 [inline]
                     nsim_dev_reload_create drivers/net/netdevsim/dev.c:1498 [inline]
                     nsim_dev_reload_up+0x69b/0x8e0 drivers/net/netdevsim/dev.c:985
                     devlink_reload+0x478/0x870 net/devlink/dev.c:474
                     devlink_nl_reload_doit+0xbd6/0xe50 net/devlink/dev.c:586
                     genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
                     genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
                     genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
                     netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
                     genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
                     netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
                     netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
                     netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
                     sock_sendmsg_nosec net/socket.c:730 [inline]
                     __sock_sendmsg+0x221/0x270 net/socket.c:745
                     ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
                     ___sys_sendmsg net/socket.c:2639 [inline]
                     __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
                     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
                     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
                    entry_SYSCALL_64_after_hwframe+0x77/0x7f
   INITIAL USE at:
                    lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
                    __raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
                    _raw_spin_lock_bh+0x35/0x50 kernel/locking/spinlock.c:178
                    spin_lock_bh include/linux/spinlock.h:356 [inline]
                    br_add_if+0xb34/0xef0 net/bridge/br_if.c:682
                    do_set_master net/core/rtnetlink.c:2701 [inline]
                    do_setlink+0xe70/0x41f0 net/core/rtnetlink.c:2907
                    __rtnl_newlink net/core/rtnetlink.c:3696 [inline]
                    rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
                    rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
                    netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
                    netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
                    netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
                    netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
                    sock_sendmsg_nosec net/socket.c:730 [inline]
                    __sock_sendmsg+0x221/0x270 net/socket.c:745
                    __sys_sendto+0x3a4/0x4f0 net/socket.c:2192
                    __do_sys_sendto net/socket.c:2204 [inline]
                    __se_sys_sendto net/socket.c:2200 [inline]
                    __x64_sys_sendto+0xde/0x100 net/socket.c:2200
                    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
                    do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
                   entry_SYSCALL_64_after_hwframe+0x77/0x7f
 }
 ... key      at: [<ffffffff94b9a1a0>] br_dev_setup.__key+0x0/0x20

the dependencies between the lock to be acquired
 and SOFTIRQ-irq-unsafe lock:
-> (netdev_rename_lock.seqcount){+.+.}-{0:0} {
   HARDIRQ-ON-W at:
                     lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
                     do_write_seqcount_begin_nested include/linux/seqlock.h:469 [inline]
                     do_write_seqcount_begin include/linux/seqlock.h:495 [inline]
                     write_seqlock include/linux/seqlock.h:823 [inline]
                     dev_change_name+0x184/0x920 net/core/dev.c:1229
                     do_setlink+0xa4b/0x41f0 net/core/rtnetlink.c:2880
                     __rtnl_newlink net/core/rtnetlink.c:3696 [inline]
                     rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
                     rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
                     netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
                     netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
                     netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
                     netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
                     sock_sendmsg_nosec net/socket.c:730 [inline]
                     __sock_sendmsg+0x221/0x270 net/socket.c:745
                     __sys_sendto+0x3a4/0x4f0 net/socket.c:2192
                     __do_sys_sendto net/socket.c:2204 [inline]
                     __se_sys_sendto net/socket.c:2200 [inline]
                     __x64_sys_sendto+0xde/0x100 net/socket.c:2200
                     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
                     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
                    entry_SYSCALL_64_after_hwframe+0x77/0x7f
   SOFTIRQ-ON-W at:
                     lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
                     do_write_seqcount_begin_nested include/linux/seqlock.h:469 [inline]
                     do_write_seqcount_begin include/linux/seqlock.h:495 [inline]
                     write_seqlock include/linux/seqlock.h:823 [inline]
                     dev_change_name+0x184/0x920 net/core/dev.c:1229
                     do_setlink+0xa4b/0x41f0 net/core/rtnetlink.c:2880
                     __rtnl_newlink net/core/rtnetlink.c:3696 [inline]
                     rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
                     rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
                     netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
                     netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
                     netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
                     netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
                     sock_sendmsg_nosec net/socket.c:730 [inline]
                     __sock_sendmsg+0x221/0x270 net/socket.c:745
                     __sys_sendto+0x3a4/0x4f0 net/socket.c:2192
                     __do_sys_sendto net/socket.c:2204 [inline]
                     __se_sys_sendto net/socket.c:2200 [inline]
                     __x64_sys_sendto+0xde/0x100 net/socket.c:2200
                     do_syscall_x64 arch/x86/entry/common.c:52 [inline]
                     do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
                    entry_SYSCALL_64_after_hwframe+0x77/0x7f
   INITIAL USE at:
                    lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
                    do_write_seqcount_begin_nested include/linux/seqlock.h:469 [inline]
                    do_write_seqcount_begin include/linux/seqlock.h:495 [inline]
                    write_seqlock include/linux/seqlock.h:823 [inline]
                    dev_change_name+0x184/0x920 net/core/dev.c:1229
                    do_setlink+0xa4b/0x41f0 net/core/rtnetlink.c:2880
                    __rtnl_newlink net/core/rtnetlink.c:3696 [inline]
                    rtnl_newlink+0x180b/0x20a0 net/core/rtnetlink.c:3743
                    rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
                    netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
                    netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
                    netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
                    netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
                    sock_sendmsg_nosec net/socket.c:730 [inline]
                    __sock_sendmsg+0x221/0x270 net/socket.c:745
                    __sys_sendto+0x3a4/0x4f0 net/socket.c:2192
                    __do_sys_sendto net/socket.c:2204 [inline]
                    __se_sys_sendto net/socket.c:2200 [inline]
                    __x64_sys_sendto+0xde/0x100 net/socket.c:2200
                    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
                    do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
                   entry_SYSCALL_64_after_hwframe+0x77/0x7f
   INITIAL READ USE at:
                         lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
                         seqcount_lockdep_reader_access include/linux/seqlock.h:72 [inline]
                         read_seqbegin include/linux/seqlock.h:772 [inline]
                         netdev_copy_name+0x168/0x2c0 net/core/dev.c:949
                         rtnl_fill_ifinfo+0x38e/0x2270 net/core/rtnetlink.c:1839
                         rtmsg_ifinfo_build_skb+0x18a/0x260 net/core/rtnetlink.c:4073
                         rtmsg_ifinfo_event net/core/rtnetlink.c:4107 [inline]
                         rtmsg_ifinfo+0x91/0x1b0 net/core/rtnetlink.c:4116
                         register_netdevice+0x1665/0x19e0 net/core/dev.c:10422
                         register_netdev+0x3b/0x50 net/core/dev.c:10512
                         loopback_net_init+0x73/0x150 drivers/net/loopback.c:217
                         ops_init+0x359/0x610 net/core/net_namespace.c:139
                         __register_pernet_operations net/core/net_namespace.c:1247 [inline]
                         register_pernet_operations+0x2cb/0x660 net/core/net_namespace.c:1320
                         register_pernet_device+0x33/0x80 net/core/net_namespace.c:1407
                         net_dev_init+0xfcd/0x10d0 net/core/dev.c:11956
                         do_one_initcall+0x248/0x880 init/main.c:1267
                         do_initcall_level+0x157/0x210 init/main.c:1329
                         do_initcalls+0x3f/0x80 init/main.c:1345
                         kernel_init_freeable+0x435/0x5d0 init/main.c:1578
                         kernel_init+0x1d/0x2b0 init/main.c:1467
                         ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
                         ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
 }
 ... key      at: [<ffffffff8f5de668>] netdev_rename_lock+0x8/0xa0
 ... acquired at:
    lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
    seqcount_lockdep_reader_access include/linux/seqlock.h:72 [inline]
    read_seqbegin include/linux/seqlock.h:772 [inline]
    netdev_copy_name+0x168/0x2c0 net/core/dev.c:949
    rtnl_fill_ifinfo+0x38e/0x2270 net/core/rtnetlink.c:1839
    rtmsg_ifinfo_build_skb+0x18a/0x260 net/core/rtnetlink.c:4073
    rtmsg_ifinfo_event net/core/rtnetlink.c:4107 [inline]
    rtmsg_ifinfo+0x91/0x1b0 net/core/rtnetlink.c:4116
    __dev_notify_flags+0xf7/0x400 net/core/dev.c:8816
    __dev_set_promiscuity+0x152/0x5a0 net/core/dev.c:8588
    dev_set_promiscuity+0x51/0xe0 net/core/dev.c:8608
    team_change_rx_flags+0x203/0x330 drivers/net/team/team_core.c:1771
    dev_change_rx_flags net/core/dev.c:8541 [inline]
    __dev_set_promiscuity+0x406/0x5a0 net/core/dev.c:8585
    dev_set_promiscuity+0x51/0xe0 net/core/dev.c:8608
    br_port_clear_promisc net/bridge/br_if.c:135 [inline]
    br_manage_promisc+0x505/0x590 net/bridge/br_if.c:172
    nbp_update_port_count net/bridge/br_if.c:242 [inline]
    br_port_flags_change+0x161/0x1f0 net/bridge/br_if.c:761
    br_setport+0xcb5/0x16d0 net/bridge/br_netlink.c:1000
    br_port_slave_changelink+0x135/0x150 net/bridge/br_netlink.c:1213
    __rtnl_newlink net/core/rtnetlink.c:3689 [inline]
    rtnl_newlink+0x169f/0x20a0 net/core/rtnetlink.c:3743
    rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
    netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
    netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
    netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
    netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
    sock_sendmsg_nosec net/socket.c:730 [inline]
    __sock_sendmsg+0x221/0x270 net/socket.c:745
    ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
    ___sys_sendmsg net/socket.c:2639 [inline]
    __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
    do_syscall_x64 arch/x86/entry/common.c:52 [inline]
    do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
   entry_SYSCALL_64_after_hwframe+0x77/0x7f

stack backtrace:
CPU: 0 PID: 9449 Comm: syz-executor.2 Not tainted 6.10.0-rc2-syzkaller-00249-gbe27b8965297 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Call Trace:
 <TASK>
  __dump_stack lib/dump_stack.c:88 [inline]
  dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
  print_bad_irq_dependency kernel/locking/lockdep.c:2626 [inline]
  check_irq_usage kernel/locking/lockdep.c:2865 [inline]
  check_prev_add kernel/locking/lockdep.c:3138 [inline]
  check_prevs_add kernel/locking/lockdep.c:3253 [inline]
  validate_chain+0x4de0/0x5900 kernel/locking/lockdep.c:3869
  __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
  lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754
  seqcount_lockdep_reader_access include/linux/seqlock.h:72 [inline]
  read_seqbegin include/linux/seqlock.h:772 [inline]
  netdev_copy_name+0x168/0x2c0 net/core/dev.c:949
  rtnl_fill_ifinfo+0x38e/0x2270 net/core/rtnetlink.c:1839
  rtmsg_ifinfo_build_skb+0x18a/0x260 net/core/rtnetlink.c:4073
  rtmsg_ifinfo_event net/core/rtnetlink.c:4107 [inline]
  rtmsg_ifinfo+0x91/0x1b0 net/core/rtnetlink.c:4116
  __dev_notify_flags+0xf7/0x400 net/core/dev.c:8816
  __dev_set_promiscuity+0x152/0x5a0 net/core/dev.c:8588
  dev_set_promiscuity+0x51/0xe0 net/core/dev.c:8608
  team_change_rx_flags+0x203/0x330 drivers/net/team/team_core.c:1771
  dev_change_rx_flags net/core/dev.c:8541 [inline]
  __dev_set_promiscuity+0x406/0x5a0 net/core/dev.c:8585
  dev_set_promiscuity+0x51/0xe0 net/core/dev.c:8608
  br_port_clear_promisc net/bridge/br_if.c:135 [inline]
  br_manage_promisc+0x505/0x590 net/bridge/br_if.c:172
  nbp_update_port_count net/bridge/br_if.c:242 [inline]
  br_port_flags_change+0x161/0x1f0 net/bridge/br_if.c:761
  br_setport+0xcb5/0x16d0 net/bridge/br_netlink.c:1000
  br_port_slave_changelink+0x135/0x150 net/bridge/br_netlink.c:1213
  __rtnl_newlink net/core/rtnetlink.c:3689 [inline]
  rtnl_newlink+0x169f/0x20a0 net/core/rtnetlink.c:3743
  rtnetlink_rcv_msg+0x89b/0x1180 net/core/rtnetlink.c:6635
  netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2564
  netlink_unicast_kernel net/netlink/af_netlink.c:1335 [inline]
  netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1361
  netlink_sendmsg+0x8db/0xcb0 net/netlink/af_netlink.c:1905
  sock_sendmsg_nosec net/socket.c:730 [inline]
  __sock_sendmsg+0x221/0x270 net/socket.c:745
  ____sys_sendmsg+0x525/0x7d0 net/socket.c:2585
  ___sys_sendmsg net/socket.c:2639 [inline]
  __sys_sendmsg+0x2b0/0x3a0 net/socket.c:2668
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f3b3047cf29
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3b311740c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f3b305b4050 RCX: 00007f3b3047cf29
RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000008
RBP: 00007f3b304ec074 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007f3b305b4050 R15: 00007ffca2f3dc68
 </TASK>

Fixes: 0840556 ("net: Protect dev->name by seqlock.")
	Reported-by: syzbot <[email protected]>
	Signed-off-by: Eric Dumazet <[email protected]>
	Cc: Kuniyuki Iwashima <[email protected]>
	Reviewed-by: Kuniyuki Iwashima <[email protected]>
	Signed-off-by: David S. Miller <[email protected]>
(cherry picked from commit 62e58dd)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/core/dev.c
…nge_name().

jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Kuniyuki Iwashima <[email protected]>
commit e361560
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.31.1.el9_5/e361560a.failed

The cited commit forgot to add netdev_rename_lock in one of the
error paths in dev_change_name().

Let's hold netdev_rename_lock before restoring the old dev->name.

Fixes: 0840556 ("net: Protect dev->name by seqlock.")
	Signed-off-by: Kuniyuki Iwashima <[email protected]>
	Reviewed-by: Eric Dumazet <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit e361560)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	net/core/dev.c
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Koichiro Den <[email protected]>
commit d0f14f7

Previously, surplus allocations triggered by mmap were typically made from
the node where the process was running.  On a page fault, the area was
reliably dequeued from the hugepage_freelists for that node.  However,
since commit 003af99 ("hugetlb: force allocating surplus hugepages on
mempolicy allowed nodes"), dequeue_hugetlb_folio_vma() may fall back to
other nodes unnecessarily even if there is no MPOL_BIND policy, causing
folios to be dequeued from nodes other than the current one.

Also, allocating from the node where the current process is running is
likely to result in a performance win, as mmap-ing processes often touch
the area not so long after allocation.  This change minimizes surprises
for users relying on the previous behavior while maintaining the benefit
introduced by the commit.

So, prioritize the node the current process is running on when possible.

Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Koichiro Den <[email protected]>
	Acked-by: Aristeu Rozanski <[email protected]>
	Cc: Aristeu Rozanski <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Muchun Song <[email protected]>
	Cc: Vishal Moola (Oracle) <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit d0f14f7)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Paulo Alcantara <[email protected]>
commit 654292a

When the user sets a file or directory as read-only (e.g. ~S_IWUGO),
the client will set the ATTR_READONLY attribute by sending an
SMB2_SET_INFO request to the server in cifs_setattr_{,nounix}(), but
cifsInodeInfo::cifsAttrs will be left unchanged as the client will
only update the new file attributes in the next call to
{smb311_posix,cifs}_get_inode_info() with the new metadata filled in
@DaTa parameter.

Commit a18280e ("smb: cilent: set reparse mount points as
automounts") mistakenly removed the @DaTa NULL check when calling
is_inode_cache_good(), which broke the above case as the new
ATTR_READONLY attribute would end up not being updated on files with a
read lease.

Fix this by updating the inode whenever we have cached metadata in
@DaTa parameter.

	Reported-by: Horst Reiterer <[email protected]>
Closes: https://lore.kernel.org/r/[email protected]
Fixes: a18280e ("smb: cilent: set reparse mount points as automounts")
	Cc: [email protected]
	Signed-off-by: Paulo Alcantara (Red Hat) <[email protected]>
	Signed-off-by: Steve French <[email protected]>
(cherry picked from commit 654292a)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
cve CVE-2023-52922
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author YueHaibing <[email protected]>
commit 55c3b96

BUG: KASAN: slab-use-after-free in bcm_proc_show+0x969/0xa80
Read of size 8 at addr ffff888155846230 by task cat/7862

CPU: 1 PID: 7862 Comm: cat Not tainted 6.5.0-rc1-00153-gc8746099c197 #230
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
 <TASK>
 dump_stack_lvl+0xd5/0x150
 print_report+0xc1/0x5e0
 kasan_report+0xba/0xf0
 bcm_proc_show+0x969/0xa80
 seq_read_iter+0x4f6/0x1260
 seq_read+0x165/0x210
 proc_reg_read+0x227/0x300
 vfs_read+0x1d5/0x8d0
 ksys_read+0x11e/0x240
 do_syscall_64+0x35/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Allocated by task 7846:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 __kasan_kmalloc+0x9e/0xa0
 bcm_sendmsg+0x264b/0x44e0
 sock_sendmsg+0xda/0x180
 ____sys_sendmsg+0x735/0x920
 ___sys_sendmsg+0x11d/0x1b0
 __sys_sendmsg+0xfa/0x1d0
 do_syscall_64+0x35/0xb0
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 7846:
 kasan_save_stack+0x1e/0x40
 kasan_set_track+0x21/0x30
 kasan_save_free_info+0x27/0x40
 ____kasan_slab_free+0x161/0x1c0
 slab_free_freelist_hook+0x119/0x220
 __kmem_cache_free+0xb4/0x2e0
 rcu_core+0x809/0x1bd0

bcm_op is freed before procfs entry be removed in bcm_release(),
this lead to bcm_proc_show() may read the freed bcm_op.

Fixes: ffd980f ("[CAN]: Add broadcast manager (bcm) protocol")
	Signed-off-by: YueHaibing <[email protected]>
	Reviewed-by: Oliver Hartkopp <[email protected]>
	Acked-by: Oliver Hartkopp <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
	Cc: [email protected]
	Signed-off-by: Marc Kleine-Budde <[email protected]>
(cherry picked from commit 55c3b96)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
cve CVE-2024-53113
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Jinjiang Tu <[email protected]>
commit 8ce41b0

We triggered a NULL pointer dereference for ac.preferred_zoneref->zone in
alloc_pages_bulk_noprof() when the task is migrated between cpusets.

When cpuset is enabled, in prepare_alloc_pages(), ac->nodemask may be
&current->mems_allowed.  when first_zones_zonelist() is called to find
preferred_zoneref, the ac->nodemask may be modified concurrently if the
task is migrated between different cpusets.  Assuming we have 2 NUMA Node,
when traversing Node1 in ac->zonelist, the nodemask is 2, and when
traversing Node2 in ac->zonelist, the nodemask is 1.  As a result, the
ac->preferred_zoneref points to NULL zone.

In alloc_pages_bulk_noprof(), for_each_zone_zonelist_nodemask() finds a
allowable zone and calls zonelist_node_idx(ac.preferred_zoneref), leading
to NULL pointer dereference.

__alloc_pages_noprof() fixes this issue by checking NULL pointer in commit
ea57485 ("mm, page_alloc: fix check for NULL preferred_zone") and
commit df76cee ("mm, page_alloc: remove redundant checks from alloc
fastpath").

To fix it, check NULL pointer for preferred_zoneref->zone.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 387ba26 ("mm/page_alloc: add a bulk page allocator")
	Signed-off-by: Jinjiang Tu <[email protected]>
	Reviewed-by: Vlastimil Babka <[email protected]>
	Cc: Alexander Lobakin <[email protected]>
	Cc: David Hildenbrand <[email protected]>
	Cc: Kefeng Wang <[email protected]>
	Cc: Mel Gorman <[email protected]>
	Cc: Nanyong Sun <[email protected]>
	Cc: <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit 8ce41b0)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Tao Liu <[email protected]>
commit 5760929

A kexec kernel boot failure is sometimes observed on AMD CPUs due to an
unmapped EFI config table array.  This can be seen when "nogbpages" is on
the kernel command line, and has been observed as a full BIOS reboot rather
than a successful kexec.

This was also the cause of reported regressions attributed to Commit
7143c5f ("x86/mm/ident_map: Use gbpages only where full GB page should
be mapped.") which was subsequently reverted.

To avoid this page fault, explicitly include the EFI config table array in
the kexec identity map.

Further explanation:

The following 2 commits caused the EFI config table array to be
accessed when enabling sev at kernel startup.

    commit ec1c66a ("x86/compressed/64: Detect/setup SEV/SME features
                          earlier during boot")
    commit c01fce9 ("x86/compressed: Add SEV-SNP feature
                          detection/setup")

This is in the code that examines whether SEV should be enabled or not, so
it can even affect systems that are not SEV capable.

This may result in a page fault if the EFI config table array's address is
unmapped. Since the page fault occurs before the new kernel establishes its
own identity map and page fault routines, it is unrecoverable and kexec
fails.

Most often, this problem is not seen because the EFI config table array
gets included in the map by the luck of being placed at a memory address
close enough to other memory areas that *are* included in the map created
by kexec.

Both the "nogbpages" command line option and the "use gpbages only where
full GB page should be mapped" change greatly reduce the chance of being
included in the map by luck, which is why the problem appears.

	Signed-off-by: Tao Liu <[email protected]>
	Signed-off-by: Steve Wahl <[email protected]>
	Signed-off-by: Thomas Gleixner <[email protected]>
	Tested-by: Pavin Joseph <[email protected]>
	Tested-by: Sarah Brofeldt <[email protected]>
	Tested-by: Eric Hagberg <[email protected]>
	Reviewed-by: Ard Biesheuvel <[email protected]>
Link: https://lore.kernel.org/all/[email protected]
(cherry picked from commit 5760929)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Manuel Barrio Linares <[email protected]>
commit 44f69dd

This adds support for all sample rates supported by the
hardware,Digidesign Mbox 3 supports: {44100, 48000, 88200, 96000}

Fixes syncing clock issues that presented as pops. To test this, without
this patch playing 440hz tone produces pops.

Clock is now synced between playback and capture interfaces so no more
latency drift issue when using pipewire pro-profile.
(https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/3900)

	Signed-off-by: Manuel Barrio Linares <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Takashi Iwai <[email protected]>
(cherry picked from commit 44f69dd)
	Signed-off-by: Jonathan Maple <[email protected]>
…box devices

jira LE-2742
cve CVE-2024-53197
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Benoît Sevens <[email protected]>
commit b909df1

A bogus device can provide a bNumConfigurations value that exceeds the
initial value used in usb_get_configuration for allocating dev->config.

This can lead to out-of-bounds accesses later, e.g. in
usb_destroy_configuration.

	Signed-off-by: Benoît Sevens <[email protected]>
Fixes: 1da177e ("Linux-2.6.12-rc2")
	Cc: [email protected]
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Takashi Iwai <[email protected]>
(cherry picked from commit b909df1)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Manuel Barrio Linares <[email protected]>
commit 5005ccd

Fixed wrong use of usb_sndctrlpipe to usb_rcvctrlpipe

Fixes: 44f69dd ("ALSA: usb-audio: Add sampling rates support for Mbox3")
	Signed-off-by: Manuel Barrio Linares <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Takashi Iwai <[email protected]>
(cherry picked from commit 5005ccd)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Dan Carpenter <[email protected]>
commit f7d306b

The usb_get_descriptor() function does DMA so we're not allowed
to use a stack buffer for that.  Doing DMA to the stack is not portable
all architectures.  Move the "new_device_descriptor" from being stored
on the stack and allocate it with kmalloc() instead.

Fixes: b909df1 ("ALSA: usb-audio: Fix potential out-of-bound accesses for Extigy and Mbox devices")
	Cc: [email protected]
	Signed-off-by: Dan Carpenter <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Takashi Iwai <[email protected]>
(cherry picked from commit f7d306b)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Thomas Gleixner <[email protected]>
commit ea72ce5

iounmap() on x86 occasionally fails to unmap because the provided valid
ioremap address is not below high_memory. It turned out that this
happens due to KASLR.

KASLR uses the full address space between PAGE_OFFSET and vaddr_end to
randomize the starting points of the direct map, vmalloc and vmemmap
regions.  It thereby limits the size of the direct map by using the
installed memory size plus an extra configurable margin for hot-plug
memory.  This limitation is done to gain more randomization space
because otherwise only the holes between the direct map, vmalloc,
vmemmap and vaddr_end would be usable for randomizing.

The limited direct map size is not exposed to the rest of the kernel, so
the memory hot-plug and resource management related code paths still
operate under the assumption that the available address space can be
determined with MAX_PHYSMEM_BITS.

request_free_mem_region() allocates from (1 << MAX_PHYSMEM_BITS) - 1
downwards.  That means the first allocation happens past the end of the
direct map and if unlucky this address is in the vmalloc space, which
causes high_memory to become greater than VMALLOC_START and consequently
causes iounmap() to fail for valid ioremap addresses.

MAX_PHYSMEM_BITS cannot be changed for that because the randomization
does not align with address bit boundaries and there are other places
which actually require to know the maximum number of address bits.  All
remaining usage sites of MAX_PHYSMEM_BITS have been analyzed and found
to be correct.

Cure this by exposing the end of the direct map via PHYSMEM_END and use
that for the memory hot-plug and resource management related places
instead of relying on MAX_PHYSMEM_BITS. In the KASLR case PHYSMEM_END
maps to a variable which is initialized by the KASLR initialization and
otherwise it is based on MAX_PHYSMEM_BITS as before.

To prevent future hickups add a check into add_pages() to catch callers
trying to add memory above PHYSMEM_END.

Fixes: 0483e1f ("x86/mm: Implement ASLR for kernel memory regions")
	Reported-by: Max Ramanouski <[email protected]>
	Reported-by: Alistair Popple <[email protected]>
	Signed-off-by: Thomas Gleixner <[email protected]>
Tested-By: Max Ramanouski <[email protected]>
	Tested-by: Alistair Popple <[email protected]>
	Reviewed-by: Dan Williams <[email protected]>
	Reviewed-by: Alistair Popple <[email protected]>
	Reviewed-by: Kees Cook <[email protected]>
	Cc: [email protected]
Link: https://lore.kernel.org/all/87ed6soy3z.ffs@tglx
(cherry picked from commit ea72ce5)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
cve CVE-2024-50302
Rebuild_History Non-Buildable kernel-5.14.0-503.31.1.el9_5
commit-author Jiri Kosina <[email protected]>
commit 177f25d

Since the report buffer is used by all kinds of drivers in various ways, let's
zero-initialize it during allocation to make sure that it can't be ever used
to leak kernel memory via specially-crafted report.

Fixes: 27ce405 ("HID: fix data access in implement()")
	Reported-by: Benoît Sevens <[email protected]>
	Acked-by: Benjamin Tissoires <[email protected]>
	Signed-off-by: Jiri Kosina <[email protected]>
(cherry picked from commit 177f25d)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..master: 294815
Number of commits in rpm: 44
Number of commits matched with upstream: 42 (95.45%)
Number of commits in upstream but not in rpm: 294790
Number of commits NOT found in upstream: 2 (4.55%)

Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.31.1.el9_5 for kernel-5.14.0-503.31.1.el9_5
Clean Cherry Picks: 17 (40.48%)
Empty Cherry Picks: 8 (19.05%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-5.14.0-503.31.1.el9_5/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.33.1.el9_5
commit-author Benjamin Coddington <[email protected]>
commit 675d456

We've observed an NFS server shrink the TCP window and then reset the TCP
connection as part of a HA failover.  When the connection has TLS, often
the NFS client will hang indefinitely in this stack:

     wait_woken+0x70/0x80
     wait_on_pending_writer+0xe4/0x110 [tls]
     tls_sk_proto_close+0x368/0x3a0 [tls]
     inet_release+0x54/0xb0
     __sock_release+0x48/0xc8
     sock_close+0x20/0x38
     __fput+0xe0/0x2f0
     __fput_sync+0x58/0x70
     xs_reset_transport+0xe8/0x1f8 [sunrpc]
     xs_tcp_shutdown+0xa4/0x190 [sunrpc]
     xprt_autoclose+0x68/0x170 [sunrpc]
     process_one_work+0x180/0x420
     worker_thread+0x258/0x368
     kthread+0x104/0x118
     ret_from_fork+0x10/0x20

This hang prevents the client from closing the socket and reconnecting to
the server.

Because xs_nospace() elevates sk_write_pending, and sk_sndtimeo is
MAX_SCHEDULE_TIMEOUT, tls_sk_proto_close is never able to complete its wait
for pending writes to the socket.  For this case where we are resetting the
transport anyway, we don't expect the socket to ever have write space, so
fix this by simply clearing the sock's sndtimeo under the sock's lock.

	Signed-off-by: Benjamin Coddington <[email protected]>
	Signed-off-by: Trond Myklebust <[email protected]>
(cherry picked from commit 675d456)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.33.1.el9_5
commit-author Benjamin Coddington <[email protected]>
commit b341ca5

We've noticed that NFS can hang when using RPC over TLS on an unstable
connection, and investigation shows that the RPC layer is stuck in a tight
loop attempting to transmit, but forever getting -EBADMSG back from the
underlying network.  The loop begins when tcp_sendmsg_locked() returns
-EPIPE to tls_tx_records(), but that error is converted to -EBADMSG when
calling the socket's error reporting handler.

Instead of converting errors from tcp_sendmsg_locked(), let's pass them
along in this path.  The RPC layer handles -EPIPE by reconnecting the
transport, which prevents the endless attempts to transmit on a broken
connection.

	Signed-off-by: Benjamin Coddington <[email protected]>
Fixes: a42055e ("net/tls: Add support for async encryption of records for performance")
Link: https://patch.msgid.link/9594185559881679d81f071b181a10eb07cd079f.1736004079.git.bcodding@redhat.com
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit b341ca5)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.33.1.el9_5
commit-author Benjamin Coddington <[email protected]>
commit d7bdd84

We've noticed a situation where an unstable TCP connection can cause the
TLS handshake to timeout waiting for userspace to complete it.  When this
happens, we don't want to return from xs_tls_handshake_sync() with zero, as
this will cause the upper xprt to be set CONNECTED, and subsequent attempts
to transmit will be returned with -EPIPE.  The sunrpc machine does not
recover from this situation and will spin attempting to transmit.

The return value of tls_handshake_cancel() can be used to detect a race
with completion:

 * tls_handshake_cancel - cancel a pending handshake
 * Return values:
 *   %true - Uncompleted handshake request was canceled
 *   %false - Handshake request already completed or not found

If true, we do not want the upper xprt to be connected, so return
-ETIMEDOUT.  If false, its possible the handshake request was lost and
that may be the reason for our timeout.  Again we do not want the upper
xprt to be connected, so return -ETIMEDOUT.

Ensure that we alway return an error from xs_tls_handshake_sync() if we
call tls_handshake_cancel().

	Signed-off-by: Benjamin Coddington <[email protected]>
	Reviewed-by: Chuck Lever <[email protected]>
Fixes: 75eb6af ("SUNRPC: Add a TCP-with-TLS RPC transport class")
	Signed-off-by: Trond Myklebust <[email protected]>
(cherry picked from commit d7bdd84)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.33.1.el9_5
commit-author Benjamin Coddington <[email protected]>
commit 7a2f6f7

If the TLS handshake attempt returns -ETIMEDOUT, we currently translate
that error into -EACCES.  This becomes problematic for cases where the RPC
layer is attempting to re-connect in paths that don't resonably handle
-EACCES, for example: writeback.  The RPC layer can handle -ETIMEDOUT quite
well, however - so if the handshake returns this error let's just pass it
along.

Fixes: 75eb6af ("SUNRPC: Add a TCP-with-TLS RPC transport class")
	Signed-off-by: Benjamin Coddington <[email protected]>
	Signed-off-by: Anna Schumaker <[email protected]>
(cherry picked from commit 7a2f6f7)
	Signed-off-by: Jonathan Maple <[email protected]>
PlaidCat added 15 commits April 8, 2025 18:12
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.33.1.el9_5
commit-author Steve Wahl <[email protected]>
commit d794734

When ident_pud_init() uses only gbpages to create identity maps, large
ranges of addresses not actually requested can be included in the
resulting table; a 4K request will map a full GB.  On UV systems, this
ends up including regions that will cause hardware to halt the system
if accessed (these are marked "reserved" by BIOS).  Even processor
speculation into these regions is enough to trigger the system halt.

Only use gbpages when map creation requests include the full GB page
of space.  Fall back to using smaller 2M pages when only portions of a
GB page are included in the request.

No attempt is made to coalesce mapping requests. If a request requires
a map entry at the 2M (pmd) level, subsequent mapping requests within
the same 1G region will also be at the pmd level, even if adjacent or
overlapping such requests could have been combined to map a full
gbpage.  Existing usage starts with larger regions and then adds
smaller regions, so this should not have any great consequence.

[ dhansen: fix up comment formatting, simplifty changelog ]

	Signed-off-by: Steve Wahl <[email protected]>
	Signed-off-by: Dave Hansen <[email protected]>
	Cc: [email protected]
Link: https://lore.kernel.org/all/20240126164841.170866-1-steve.wahl%40hpe.com
(cherry picked from commit d794734)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.33.1.el9_5
commit-author Kai Mäkisara <[email protected]>
commit 5bb2d61

Struct mtget field mt_blkno -1 means it is unknown. Don't add anything to
it.

	Signed-off-by: Kai Mäkisara <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219419#c14
Link: https://lore.kernel.org/r/[email protected]
	Reviewed-by: John Meneghini <[email protected]>
	Tested-by: John Meneghini <[email protected]>
	Signed-off-by: Martin K. Petersen <[email protected]>
(cherry picked from commit 5bb2d61)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.33.1.el9_5
commit-author Kai Mäkisara <[email protected]>
commit 0b120ed

Most drives rewind the tape when the device is reset. Reading and writing
are not allowed until something is done to make the tape position match the
user's expectation (e.g., rewind the tape). Add MTIOCGET and MTLOAD to
operations allowed after reset. MTIOCGET is modified to not touch the tape
if pos_unknown is non-zero. The tape location is known after MTLOAD.

	Signed-off-by: Kai Mäkisara <[email protected]>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219419#c14
Link: https://lore.kernel.org/r/[email protected]
	Reviewed-by: John Meneghini <[email protected]>
	Tested-by: John Meneghini <[email protected]>
	Signed-off-by: Martin K. Petersen <[email protected]>
(cherry picked from commit 0b120ed)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.33.1.el9_5
commit-author Kai Mäkisara <[email protected]>
commit a4550b2

Currently the code starts new tape session when any Unit Attention
(UA) is seen when opening the device. This leads to incorrectly
clearing pos_unknown when the UA is for reset. Set new session only
when the UA is for a new tape.

	Signed-off-by: Kai Mäkisara <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Reviewed-by: John Meneghini <[email protected]>
	Tested-by: John Meneghini <[email protected]>
	Signed-off-by: Martin K. Petersen <[email protected]>
(cherry picked from commit a4550b2)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..master: 294815
Number of commits in rpm: 10
Number of commits matched with upstream: 8 (80.00%)
Number of commits in upstream but not in rpm: 294807
Number of commits NOT found in upstream: 2 (20.00%)

Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.33.1.el9_5 for kernel-5.14.0-503.33.1.el9_5
Clean Cherry Picks: 8 (100.00%)
Empty Cherry Picks: 0 (0.00%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-5.14.0-503.33.1.el9_5/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
jira LE-2742
cve CVE-2025-21785
Rebuild_History Non-Buildable kernel-5.14.0-503.34.1.el9_5
commit-author Radu Rendec <[email protected]>
commit 875d742

The loop that detects/populates cache information already has a bounds
check on the array size but does not account for cache levels with
separate data/instructions cache. Fix this by incrementing the index
for any populated leaf (instead of any populated level).

Fixes: 5d425c1 ("arm64: kernel: add support for cpu cache information")

	Signed-off-by: Radu Rendec <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Will Deacon <[email protected]>
(cherry picked from commit 875d742)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..master: 294815
Number of commits in rpm: 4
Number of commits matched with upstream: 1 (25.00%)
Number of commits in upstream but not in rpm: 294814
Number of commits NOT found in upstream: 3 (75.00%)

Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.34.1.el9_5 for kernel-5.14.0-503.34.1.el9_5
Clean Cherry Picks: 1 (100.00%)
Empty Cherry Picks: 0 (0.00%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-5.14.0-503.34.1.el9_5/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.35.1.el9_5
commit-author Artem Bityutskiy <[email protected]>
commit 4c411cc

Background
~~~~~~~~~~

The driver uses 'use_acpi = true' in C-state custom table for all Xeon
platforms. The meaning of this flag is as follows.

 1. If a C-state from the custom table is defined in ACPI _CST (matched
    by the mwait hint), then enable this C-state.

 2. Otherwise, disable this C-state, unless the C-sate definition in the
    custom table has the 'CPUIDLE_FLAG_ALWAYS_ENABLE' flag set, in which
    case enabled it.

The goal is to honor BIOS C6 settings - If BIOS disables C6, disable it
by default in the OS too (but it can be enabled via sysfs).

This works well on Xeons that expose only one flavor of C6. This are all
Xeons except for the newest Granite Rapids (GNR) and Sierra Forest (SRF).

The problem
~~~~~~~~~~~

GNR and SRF have 2 flavors of C6: C6/C6P on GNR, C6S/C6SP on SRF. The
the "P" flavor allows for the package C6, while the "non-P" flavor
allows only for core/module C6.

As far as this patch is concerned, both GNR and SRF platforms are
handled the same way. Therefore, further discussion is focused on GNR,
but it applies to SRF as well.

On Intel Xeon platforms, BIOS exposes only 2 ACPI C-states: C1 and C2.
Well, depending on BIOS settings, C2 may be named as C3. But there still
will be only 2 states - C1 and C3. But this is a non-essential detail,
so further discussion is focused on the ACPI C1 and C2 case.

On pre-GNR/SRF Xeon platforms, ACPI C1 is mapped to C1 or C1E, and ACPI
C2 is mapped to C6. The 'use_acpi' flag works just fine:

 * If ACPI C2 enabled, enable C6.
 * Otherwise, disable C6.

However, on GNR there are 2 flavors of C6, so BIOS maps ACPI C2 to
either C6 or C6P, depending on the user settings. As a result, due to
the 'use_acpi' flag, 'intel_idle' disables least one of the C6 flavors.

BIOS                   | OS                         | Verdict
----------------------------------------------------|---------
ACPI C2 disabled       | C6 disabled, C6P disabled  | OK
ACPI C2 mapped to C6   | C6 enabled,  C6P disabled  | Not OK
ACPI C2 mapped to C6P  | C6 disabled, C6P enabled   | Not OK

The goal of 'use_acpi' is to honor BIOS ACPI C2 disabled case, which
works fine. But if ACPI C2 is enabled, the goal is to enable all flavors
of C6, not just one of the flavors. This was overlooked when enabling
GNR/SRF platforms.

In other words, before GNR/SRF, the ACPI C2 status was binary - enabled
or disabled. But it is not binary on GNR/SRF, however the goal is to
continue treat it as binary.

The fix
~~~~~~~

Notice, that current algorithm matches ACPI and custom table C-states
by the mwait hint. However, mwait hint consists of the 'state' and
'sub-state' parts, and all C6 flavors have the same state value of 0x20,
but different sub-state values.

Introduce new C-state table flag - CPUIDLE_FLAG_PARTIAL_HINT_MATCH and
add it to both C6 flavors of the GNR/SRF platforms.

When matching ACPI _CST and custom table C-states, match only the start
part if the C-state has CPUIDLE_FLAG_PARTIAL_HINT_MATCH, other wise
match both state and sub-state parts (as before).

With this fix, GNR C-states enabled/disabled status looks like this.

BIOS                   | OS
----------------------------------------------------
ACPI C2 disabled       | C6 disabled, C6P disabled
ACPI C2 mapped to C6   | C6 enabled, C6P enabled
ACPI C2 mapped to C6P  | C6 enabled, C6P enabled

Possible alternative
~~~~~~~~~~~~~~~~~~~~

The alternative would be to remove 'use_acpi' flag for GNR and SRF.
This would be a simpler solution, but it would violate the principle of
least surprise - users of Xeon platforms are used to the fact that
intel_idle honors C6 enabled/disabled flag. It is more consistent user
experience if GNR/SRF continue doing so.

How tested
~~~~~~~~~~

Tested on GNR and SRF platform with all the 3 BIOS configurations: ACPI
C2 disabled, mapped to C6/C6S, mapped to C6P/C6SP.

Tested on Ice lake Xeon and Sapphire Rapids Xeon platforms with ACPI C2
enabled and disabled, just to verify that the patch does not break older
Xeons.

Fixes: 92813fd ("intel_idle: add Sierra Forest SoC support")
Fixes: 370406b ("intel_idle: add Granite Rapids Xeon support")
	Cc: 6.8+ <[email protected]> # 6.8+
	Signed-off-by: Artem Bityutskiy <[email protected]>
Link: https://patch.msgid.link/[email protected]
[ rjw: Changelog edits ]
	Signed-off-by: Rafael J. Wysocki <[email protected]>
(cherry picked from commit 4c411cc)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.35.1.el9_5
commit-author Nick Child <[email protected]>
commit 5cb431d

In ibmvnic_xmit() if ibmvnic_tx_scrq_flush() returns H_CLOSED then
it will inform upper level networking functions to disable tx
queues. H_CLOSED signals that the connection with the vnic server is
down and a transport event is expected to recover the device.

Previously, ibmvnic_tx_scrq_flush() was hard-coded to return success.
Therefore, the queues would remain active until ibmvnic_cleanup() is
called within do_reset().

The problem is that do_reset() depends on the RTNL lock. If several
ibmvnic devices are resetting then there can be a long wait time until
the last device can grab the lock. During this time the tx/rx queues
still appear active to upper level functions.

FYI, we do make a call to netif_carrier_off() outside the RTNL lock but
its calls to dev_deactivate() are also dependent on the RTNL lock.

As a result, large amounts of retransmissions were observed in a short
period of time, eventually leading to ETIMEOUT. This was specifically
seen with HNV devices, likely because of even more RTNL dependencies.

Therefore, ensure the return code of ibmvnic_tx_scrq_flush() is
propagated to the xmit function to allow for an earlier (and lock-less)
response to a transport event.

	Signed-off-by: Nick Child <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Paolo Abeni <[email protected]>
(cherry picked from commit 5cb431d)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.35.1.el9_5
commit-author Nick Child <[email protected]>
commit dda10fc

Previously, the driver would replenish the rx pool if the polling
function consumed less than the budget. The logic being that the driver
did not exhaust its budget so that must mean that the driver is not busy
and has cycles to spare for replenishing the pool.

So pool replenishment happens on every poll which did not consume
the budget. This can very costly during request-response tests.

In fact, an extra ~100pps can be seen in TCP_RR_150 tests when we remove
this conditional. Trace results (ftrace, graph-time=1) for the poll
function are below:
Previous results:
    ibmvnic_poll = 64951846.0 us / 4167628.0 hits = AVG 15.58
    replenish_rx_pool = 17602846.0 us / 4710437.0 hits = AVG 3.74
Now:
    ibmvnic_poll = 57673941.0 us / 4791737.0 hits = AVG 12.04
    replenish_rx_pool = 3938171.6 us / 4314.0 hits = AVG 912.88

While the replenish function takes longer, it is hit less frequently
meaning the ibmvnic_poll function, on average, is faster.

Furthermore, this change does not have a negative effect on
performance bandwidth/latency measurements.

	Signed-off-by: Nick Child <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit dda10fc)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.35.1.el9_5
commit-author Nick Child <[email protected]>
commit 1c33e29

Byte Queue Limits depends on dql_completed being called once per tx
completion round in order to adjust its algorithm appropriately. The
dql->limit value is an approximation of the amount of bytes that the NIC
can consume per irq interval. If this approximation is too high then the
NIC will become over-saturated. Too low and the NIC will starve.

The dql->limit depends on dql->prev-* stats to calculate an optimal
value. If dql_completed() is called more than once per irq handler then
those prev-* values become unreliable (because they are not an accurate
representation of the previous state of the NIC) resulting in a
sub-optimal limit value.

Therefore, move the call to netdev_tx_completed_queue() to the end of
ibmvnic_complete_tx().

When performing 150 sessions of TCP rr (request-response 1 byte packets)
workloads, one could observe:
  PREVIOUSLY: - limit and inflight values hovering around 130
              - transaction rate of around 750k pps.

  NOW:        - limit rises and falls in response to inflight (130-900)
              - transaction rate of around 1M pps (33% improvement)

	Signed-off-by: Nick Child <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 1c33e29)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
cve CVE-2024-43855
Rebuild_History Non-Buildable kernel-5.14.0-503.35.1.el9_5
commit-author Li Nan <[email protected]>
commit 611d5cb

Deadlock occurs when mddev is being suspended while some flush bio is in
progress. It is a complex issue.

T1. the first flush is at the ending stage, it clears 'mddev->flush_bio'
    and tries to submit data, but is blocked because mddev is suspended
    by T4.
T2. the second flush sets 'mddev->flush_bio', and attempts to queue
    md_submit_flush_data(), which is already running (T1) and won't
    execute again if on the same CPU as T1.
T3. the third flush inc active_io and tries to flush, but is blocked because
    'mddev->flush_bio' is not NULL (set by T2).
T4. mddev_suspend() is called and waits for active_io dec to 0 which is inc
    by T3.

  T1		T2		T3		T4
  (flush 1)	(flush 2)	(third 3)	(suspend)
  md_submit_flush_data
   mddev->flush_bio = NULL;
   .
   .	 	md_flush_request
   .	  	 mddev->flush_bio = bio
   .	  	 queue submit_flushes
   .		 .
   .		 .		md_handle_request
   .		 .		 active_io + 1
   .		 .		 md_flush_request
   .		 .		  wait !mddev->flush_bio
   .		 .
   .		 .				mddev_suspend
   .		 .				 wait !active_io
   .		 .
   .		 submit_flushes
   .		 queue_work md_submit_flush_data
   .		 //md_submit_flush_data is already running (T1)
   .
   md_handle_request
    wait resume

The root issue is non-atomic inc/dec of active_io during flush process.
active_io is dec before md_submit_flush_data is queued, and inc soon
after md_submit_flush_data() run.
  md_flush_request
    active_io + 1
    submit_flushes
      active_io - 1
      md_submit_flush_data
        md_handle_request
        active_io + 1
          make_request
        active_io - 1

If active_io is dec after md_handle_request() instead of within
submit_flushes(), make_request() can be called directly intead of
md_handle_request() in md_submit_flush_data(), and active_io will
only inc and dec once in the whole flush process. Deadlock will be
fixed.

Additionally, the only difference between fixing the issue and before is
that there is no return error handling of make_request(). But after
previous patch cleaned md_write_start(), make_requst() only return error
in raid5_make_request() by dm-raid, see commit 41425f9 ("dm-raid456,
md/raid456: fix a deadlock for dm-raid456 while io concurrent with
reshape)". Since dm always splits data and flush operation into two
separate io, io size of flush submitted by dm always is 0, make_request()
will not be called in md_submit_flush_data(). To prevent future
modifications from introducing issues, add WARN_ON to ensure
make_request() no error is returned in this context.

Fixes: fa2bbff ("md: synchronize flush io with array reconfiguration")
	Signed-off-by: Li Nan <[email protected]>
	Signed-off-by: Song Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 611d5cb)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.35.1.el9_5
commit-author Li Lingfeng <[email protected]>
commit b313a8c

Lockdep reported a warning in Linux version 6.6:

[  414.344659] ================================
[  414.345155] WARNING: inconsistent lock state
[  414.345658] 6.6.0-07439-gba2303cacfda #6 Not tainted
[  414.346221] --------------------------------
[  414.346712] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[  414.347545] kworker/u10:3/1152 [HC0[0]:SC0[0]:HE0:SE1] takes:
[  414.349245] ffff88810edd1098 (&sbq->ws[i].wait){+.?.}-{2:2}, at: blk_mq_dispatch_rq_list+0x131c/0x1ee0
[  414.351204] {IN-SOFTIRQ-W} state was registered at:
[  414.351751]   lock_acquire+0x18d/0x460
[  414.352218]   _raw_spin_lock_irqsave+0x39/0x60
[  414.352769]   __wake_up_common_lock+0x22/0x60
[  414.353289]   sbitmap_queue_wake_up+0x375/0x4f0
[  414.353829]   sbitmap_queue_clear+0xdd/0x270
[  414.354338]   blk_mq_put_tag+0xdf/0x170
[  414.354807]   __blk_mq_free_request+0x381/0x4d0
[  414.355335]   blk_mq_free_request+0x28b/0x3e0
[  414.355847]   __blk_mq_end_request+0x242/0xc30
[  414.356367]   scsi_end_request+0x2c1/0x830
[  414.345155] WARNING: inconsistent lock state
[  414.345658] 6.6.0-07439-gba2303cacfda #6 Not tainted
[  414.346221] --------------------------------
[  414.346712] inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
[  414.347545] kworker/u10:3/1152 [HC0[0]:SC0[0]:HE0:SE1] takes:
[  414.349245] ffff88810edd1098 (&sbq->ws[i].wait){+.?.}-{2:2}, at: blk_mq_dispatch_rq_list+0x131c/0x1ee0
[  414.351204] {IN-SOFTIRQ-W} state was registered at:
[  414.351751]   lock_acquire+0x18d/0x460
[  414.352218]   _raw_spin_lock_irqsave+0x39/0x60
[  414.352769]   __wake_up_common_lock+0x22/0x60
[  414.353289]   sbitmap_queue_wake_up+0x375/0x4f0
[  414.353829]   sbitmap_queue_clear+0xdd/0x270
[  414.354338]   blk_mq_put_tag+0xdf/0x170
[  414.354807]   __blk_mq_free_request+0x381/0x4d0
[  414.355335]   blk_mq_free_request+0x28b/0x3e0
[  414.355847]   __blk_mq_end_request+0x242/0xc30
[  414.356367]   scsi_end_request+0x2c1/0x830
[  414.356863]   scsi_io_completion+0x177/0x1610
[  414.357379]   scsi_complete+0x12f/0x260
[  414.357856]   blk_complete_reqs+0xba/0xf0
[  414.358338]   __do_softirq+0x1b0/0x7a2
[  414.358796]   irq_exit_rcu+0x14b/0x1a0
[  414.359262]   sysvec_call_function_single+0xaf/0xc0
[  414.359828]   asm_sysvec_call_function_single+0x1a/0x20
[  414.360426]   default_idle+0x1e/0x30
[  414.360873]   default_idle_call+0x9b/0x1f0
[  414.361390]   do_idle+0x2d2/0x3e0
[  414.361819]   cpu_startup_entry+0x55/0x60
[  414.362314]   start_secondary+0x235/0x2b0
[  414.362809]   secondary_startup_64_no_verify+0x18f/0x19b
[  414.363413] irq event stamp: 428794
[  414.363825] hardirqs last  enabled at (428793): [<ffffffff816bfd1c>] ktime_get+0x1dc/0x200
[  414.364694] hardirqs last disabled at (428794): [<ffffffff85470177>] _raw_spin_lock_irq+0x47/0x50
[  414.365629] softirqs last  enabled at (428444): [<ffffffff85474780>] __do_softirq+0x540/0x7a2
[  414.366522] softirqs last disabled at (428419): [<ffffffff813f65ab>] irq_exit_rcu+0x14b/0x1a0
[  414.367425]
               other info that might help us debug this:
[  414.368194]  Possible unsafe locking scenario:
[  414.368900]        CPU0
[  414.369225]        ----
[  414.369548]   lock(&sbq->ws[i].wait);
[  414.370000]   <Interrupt>
[  414.370342]     lock(&sbq->ws[i].wait);
[  414.370802]
                *** DEADLOCK ***
[  414.371569] 5 locks held by kworker/u10:3/1152:
[  414.372088]  #0: ffff88810130e938 ((wq_completion)writeback){+.+.}-{0:0}, at: process_scheduled_works+0x357/0x13f0
[  414.373180]  #1: ffff88810201fdb8 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_scheduled_works+0x3a3/0x13f0
[  414.374384]  #2: ffffffff86ffbdc0 (rcu_read_lock){....}-{1:2}, at: blk_mq_run_hw_queue+0x637/0xa00
[  414.375342]  #3: ffff88810edd1098 (&sbq->ws[i].wait){+.?.}-{2:2}, at: blk_mq_dispatch_rq_list+0x131c/0x1ee0
[  414.376377]  #4: ffff888106205a08 (&hctx->dispatch_wait_lock){+.-.}-{2:2}, at: blk_mq_dispatch_rq_list+0x1337/0x1ee0
[  414.378607]
               stack backtrace:
[  414.379177] CPU: 0 PID: 1152 Comm: kworker/u10:3 Not tainted 6.6.0-07439-gba2303cacfda #6
[  414.380032] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[  414.381177] Workqueue: writeback wb_workfn (flush-253:0)
[  414.381805] Call Trace:
[  414.382136]  <TASK>
[  414.382429]  dump_stack_lvl+0x91/0xf0
[  414.382884]  mark_lock_irq+0xb3b/0x1260
[  414.383367]  ? __pfx_mark_lock_irq+0x10/0x10
[  414.383889]  ? stack_trace_save+0x8e/0xc0
[  414.384373]  ? __pfx_stack_trace_save+0x10/0x10
[  414.384903]  ? graph_lock+0xcf/0x410
[  414.385350]  ? save_trace+0x3d/0xc70
[  414.385808]  mark_lock.part.20+0x56d/0xa90
[  414.386317]  mark_held_locks+0xb0/0x110
[  414.386791]  ? __pfx_do_raw_spin_lock+0x10/0x10
[  414.387320]  lockdep_hardirqs_on_prepare+0x297/0x3f0
[  414.387901]  ? _raw_spin_unlock_irq+0x28/0x50
[  414.388422]  trace_hardirqs_on+0x58/0x100
[  414.388917]  _raw_spin_unlock_irq+0x28/0x50
[  414.389422]  __blk_mq_tag_busy+0x1d6/0x2a0
[  414.389920]  __blk_mq_get_driver_tag+0x761/0x9f0
[  414.390899]  blk_mq_dispatch_rq_list+0x1780/0x1ee0
[  414.391473]  ? __pfx_blk_mq_dispatch_rq_list+0x10/0x10
[  414.392070]  ? sbitmap_get+0x2b8/0x450
[  414.392533]  ? __blk_mq_get_driver_tag+0x210/0x9f0
[  414.393095]  __blk_mq_sched_dispatch_requests+0xd99/0x1690
[  414.393730]  ? elv_attempt_insert_merge+0x1b1/0x420
[  414.394302]  ? __pfx___blk_mq_sched_dispatch_requests+0x10/0x10
[  414.394970]  ? lock_acquire+0x18d/0x460
[  414.395456]  ? blk_mq_run_hw_queue+0x637/0xa00
[  414.395986]  ? __pfx_lock_acquire+0x10/0x10
[  414.396499]  blk_mq_sched_dispatch_requests+0x109/0x190
[  414.397100]  blk_mq_run_hw_queue+0x66e/0xa00
[  414.397616]  blk_mq_flush_plug_list.part.17+0x614/0x2030
[  414.398244]  ? __pfx_blk_mq_flush_plug_list.part.17+0x10/0x10
[  414.398897]  ? writeback_sb_inodes+0x241/0xcc0
[  414.399429]  blk_mq_flush_plug_list+0x65/0x80
[  414.399957]  __blk_flush_plug+0x2f1/0x530
[  414.400458]  ? __pfx___blk_flush_plug+0x10/0x10
[  414.400999]  blk_finish_plug+0x59/0xa0
[  414.401467]  wb_writeback+0x7cc/0x920
[  414.401935]  ? __pfx_wb_writeback+0x10/0x10
[  414.402442]  ? mark_held_locks+0xb0/0x110
[  414.402931]  ? __pfx_do_raw_spin_lock+0x10/0x10
[  414.403462]  ? lockdep_hardirqs_on_prepare+0x297/0x3f0
[  414.404062]  wb_workfn+0x2b3/0xcf0
[  414.404500]  ? __pfx_wb_workfn+0x10/0x10
[  414.404989]  process_scheduled_works+0x432/0x13f0
[  414.405546]  ? __pfx_process_scheduled_works+0x10/0x10
[  414.406139]  ? do_raw_spin_lock+0x101/0x2a0
[  414.406641]  ? assign_work+0x19b/0x240
[  414.407106]  ? lock_is_held_type+0x9d/0x110
[  414.407604]  worker_thread+0x6f2/0x1160
[  414.408075]  ? __kthread_parkme+0x62/0x210
[  414.408572]  ? lockdep_hardirqs_on_prepare+0x297/0x3f0
[  414.409168]  ? __kthread_parkme+0x13c/0x210
[  414.409678]  ? __pfx_worker_thread+0x10/0x10
[  414.410191]  kthread+0x33c/0x440
[  414.410602]  ? __pfx_kthread+0x10/0x10
[  414.411068]  ret_from_fork+0x4d/0x80
[  414.411526]  ? __pfx_kthread+0x10/0x10
[  414.411993]  ret_from_fork_asm+0x1b/0x30
[  414.412489]  </TASK>

When interrupt is turned on while a lock holding by spin_lock_irq it
throws a warning because of potential deadlock.

blk_mq_prep_dispatch_rq
 blk_mq_get_driver_tag
  __blk_mq_get_driver_tag
   __blk_mq_alloc_driver_tag
    blk_mq_tag_busy -> tag is already busy
    // failed to get driver tag
 blk_mq_mark_tag_wait
  spin_lock_irq(&wq->lock) -> lock A (&sbq->ws[i].wait)
  __add_wait_queue(wq, wait) -> wait queue active
  blk_mq_get_driver_tag
  __blk_mq_tag_busy
-> 1) tag must be idle, which means there can't be inflight IO
   spin_lock_irq(&tags->lock) -> lock B (hctx->tags)
   spin_unlock_irq(&tags->lock) -> unlock B, turn on interrupt accidentally
-> 2) context must be preempt by IO interrupt to trigger deadlock.

As shown above, the deadlock is not possible in theory, but the warning
still need to be fixed.

Fix it by using spin_lock_irqsave to get lockB instead of spin_lock_irq.

Fixes: 4f1731d ("blk-mq: fix potential io hang by wrong 'wake_batch'")
	Signed-off-by: Li Lingfeng <[email protected]>
	Reviewed-by: Ming Lei <[email protected]>
	Reviewed-by: Yu Kuai <[email protected]>
	Reviewed-by: Bart Van Assche <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jens Axboe <[email protected]>
(cherry picked from commit b313a8c)
	Signed-off-by: Jonathan Maple <[email protected]>
jira LE-2742
Rebuild_History Non-Buildable kernel-5.14.0-503.35.1.el9_5
commit-author Jie Zhan <[email protected]>
commit c471956

The CPPC performance feedback counters could be 0 or unchanged when the
target cpu is in a low-power idle state, e.g. power-gated or clock-gated.

When the counters are 0, cppc_cpufreq_get_rate() returns 0 KHz, which makes
cpufreq_online() get a false error and fail to generate a cpufreq policy.

When the counters are unchanged, the existing cppc_perf_from_fbctrs()
returns a cached desired perf, but some platforms may update the real
frequency back to the desired perf reg.

For the above cases in cppc_cpufreq_get_rate(), get the latest desired perf
from the CPPC reg to reflect the frequency because some platforms may
update the actual frequency back there; if failed, use the cached desired
perf.

Fixes: 6a4fec4 ("cpufreq: cppc: cppc_cpufreq_get_rate() returns zero in all error cases.")
	Signed-off-by: Jie Zhan <[email protected]>
	Reviewed-by: Zeng Heng <[email protected]>
	Reviewed-by: Ionela Voinescu <[email protected]>
	Reviewed-by: Huisong Li <[email protected]>
	Signed-off-by: Viresh Kumar <[email protected]>
(cherry picked from commit c471956)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v5.14~1..master: 294815
Number of commits in rpm: 11
Number of commits matched with upstream: 8 (72.73%)
Number of commits in upstream but not in rpm: 294807
Number of commits NOT found in upstream: 3 (27.27%)

Rebuilding Kernel on Branch rocky9_5_rebuild_kernel-5.14.0-503.35.1.el9_5 for kernel-5.14.0-503.35.1.el9_5
Clean Cherry Picks: 7 (87.50%)
Empty Cherry Picks: 0 (0.00%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-5.14.0-503.35.1.el9_5/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
@thefossguy-ciq
Copy link

Do we verify if every "RPM NVR tag" (kernel-5.14.0-503.31.1.el9_5, kernel-5.14.0-503.33.1.el9_5, etc) compiles and works correctly? Or rather, is verifying every such "tag" necessary?

Copy link
Collaborator

@bmastbergen bmastbergen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥌

@PlaidCat
Copy link
Collaborator Author

PlaidCat commented Apr 9, 2025

Do we verify if every "RPM NVR tag" (kernel-5.14.0-503.31.1.el9_5, kernel-5.14.0-503.33.1.el9_5, etc) compiles and works correctly? Or rather, is verifying every such "tag" necessary?

No we only test the latest in the list, but the process is the same per src.rpm so if this happened in 5 PRs or 1 the getting to the test build is the same everytime. But we're behind on syncing so this is a speed up.
If they failed to build there would not have been a src.rpm and binaries created for the repo.

This is why we source from the built binaries in the RESF repo Rather than the staging dist-git.

Copy link

@thefossguy-ciq thefossguy-ciq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚤

@PlaidCat PlaidCat merged commit c9a2447 into rocky9_5 Apr 9, 2025
4 checks passed
@PlaidCat PlaidCat deleted the rocky9_5_rebuild branch April 9, 2025 17:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants