@PlaidCat PlaidCat commented Jan 16, 2026

General Process:

Checking Rebuild Commits for Potentially missing commits:

kernel-4.18.0-553.92.1.el8_10

[jmaple@devbox kernel-src-tree]$ cat ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/rebuild.details.txt
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 594898
Number of commits in rpm: 31
Number of commits matched with upstream: 20 (64.52%)
Number of commits in upstream but not in rpm: 594878
Number of commits NOT found in upstream: 11 (35.48%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.92.1.el8_10 for kernel-4.18.0-553.92.1.el8_10
Clean Cherry Picks: 11 (55.00%)
Empty Cherry Picks: 9 (45.00%)
_______________________________

__EMPTY COMMITS__________________________
6e9a2f8dbe93c8004c2af2c0158888628b7ca034 NFSv4: xattr handlers should check for absent nfs filehandles
a85c2257a8ac353af16dbcbf32c50d3380860bc5 sched/isolation: add cpu_is_isolated() API
a46c27026da10a126dd870f7b65380010bd20db5 blk-mq: don't schedule block kworker on isolated CPUs
c25c0c9035bb8b28c844dfddeda7b8bdbcfcae95 blk-mq: setup queue ->tag_set before initializing hctx
15f519e9f883b316d86e2bb6b767a023aafd9d83 ceph: fix race condition validating r_parent before applying state
bec324f33d1ed346394b2eee25bf6dbf3511f727 ceph: fix race condition where r_parent becomes stale before sending message
89e97eb8cec0f1af5ebf2380308913256ca7915a perf/x86/intel/ds: Fix the conversion from TSC to perf time
cd493dcf4f827700b66d1adb43f20d3838759beb gfs2: run_queue cleanup
fa0f61cc1d828178aa921475a9b786e7fbb65ccb media: rc: fix races with imon_disconnect()

__CHANGES NOT IN UPSTREAM________________
Adding prod certs and changed cert date to 20210620
Adding Rocky secure boot certs
Fixing vmlinuz removal
Fixing UEFI CA path
Porting to 8.10, debranding and Rocky branding
Fixing pesign_key_name values
gfs2: Do not cancel internal demote requests
gfs2: Retries missing in gfs2_{rename,exchange}
gfs2: glock cancelation flag fix
redhat: introduce RELEASE_LOCALVERSION variable
cifs: fix automount with passwords that contain commas

None of the following changes currently exist in mainline:

gfs2: Do not cancel internal demote requests
gfs2: Retries missing in gfs2_{rename,exchange}
gfs2: glock cancelation flag fix
cifs: fix automount with passwords that contain commas
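
The percentages in rebuild.details.txt follow directly from the counts it reports. As a quick sanity check, here is a minimal Python sketch that reproduces them. The helper name `pct` is ours and is not part of the rebuild tooling.

```python
def pct(part: int, whole: int) -> str:
    """Format part/whole as a percentage with two decimals."""
    return f"{100.0 * part / whole:.2f}%"

# Counts copied from rebuild.details.txt above.
upstream_range = 594898   # commits in v4.18~1..kernel-mainline
rpm_commits = 31          # commits listed in the rpm changelog
matched = 20              # rpm commits matched with upstream
clean, empty = 11, 9      # cherry-pick outcomes for the matched commits

print(pct(matched, rpm_commits))                  # matched with upstream
print(pct(rpm_commits - matched, rpm_commits))    # not found in upstream
print(pct(clean, matched), pct(empty, matched))   # clean vs. empty picks
print(upstream_range - matched)                   # in upstream but not in rpm
```

Note that the cherry-pick percentages are relative to the 20 matched commits, not to the 31 rpm changelog entries.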

BUILD

[jmaple@devbox code]$ egrep -B 5 -A 5 "\[TIMER\]|^Starting Build" $(ls -t kbuild* | head -n1)
/mnt/code/kernel-src-tree-build
Running make mrproper...
  CLEAN   scripts/basic
  CLEAN   scripts/kconfig
[TIMER]{MRPROPER}: 5s
x86_64 architecture detected, copying config
'configs/kernel-x86_64.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-rocky8_10_rebuild-4be5de5480b0"
Making olddefconfig
--
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf  --olddefconfig Kconfig
#
# configuration written to .config
#
Starting Build
scripts/kconfig/conf  --syncconfig Kconfig
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_64.h
--
  LD [M]  sound/usb/usx2y/snd-usb-usx2y.ko
  LD [M]  sound/virtio/virtio_snd.ko
  LD [M]  sound/x86/snd-hdmi-lpe-audio.ko
  LD [M]  sound/xen/snd_xen_front.ko
  LD [M]  virt/lib/irqbypass.ko
[TIMER]{BUILD}: 1439s
Making Modules
  INSTALL arch/x86/crypto/blowfish-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx2.ko
  INSTALL arch/x86/crypto/camellia-x86_64.ko
--
  INSTALL sound/virtio/virtio_snd.ko
  INSTALL sound/x86/snd-hdmi-lpe-audio.ko
  INSTALL sound/xen/snd_xen_front.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-rocky8_10_rebuild-4be5de5480b0+
[TIMER]{MODULES}: 14s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-rocky8_10_rebuild-4be5de5480b0+ arch/x86/boot/bzImage \
        System.map "/boot"
[TIMER]{INSTALL}: 21s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-rocky8_10_rebuild-4be5de5480b0+ and Index to 2
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 5s
[TIMER]{BUILD}: 1439s
[TIMER]{MODULES}: 14s
[TIMER]{INSTALL}: 21s
[TIMER]{TOTAL} 1485s
Rebooting in 10 seconds
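
For reference, the four per-stage timers above add up to 1479s, while the log reports a total of 1485s. The remaining 6s is presumably spent outside the timed stages, but that is an assumption. A small sketch for pulling the timers out of a build log, with a regular expression of our own that tolerates the missing colon on the TOTAL line:

```python
import re

# Matches "[TIMER]{BUILD}: 1439s" as well as "[TIMER]{TOTAL} 1485s".
TIMER_RE = re.compile(r"\[TIMER\]\{(\w+)\}:?\s*(\d+)s")

def parse_timers(log_text: str) -> dict:
    """Return {stage: seconds}; a repeated stage keeps its last value."""
    return {name: int(secs) for name, secs in TIMER_RE.findall(log_text)}

# Summary lines copied from the end of the build log above.
summary = """\
[TIMER]{MRPROPER}: 5s
[TIMER]{BUILD}: 1439s
[TIMER]{MODULES}: 14s
[TIMER]{INSTALL}: 21s
[TIMER]{TOTAL} 1485s
"""

timers = parse_timers(summary)
stage_sum = sum(secs for name, secs in timers.items() if name != "TOTAL")
print(stage_sum, timers["TOTAL"], timers["TOTAL"] - stage_sum)
```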

KSelfTests

[jmaple@devbox code]$ ~/workspace/auto_kernel_history_rebuild/Rocky10/rocky10/code/get_kselftest_diff.sh
kselftest.4.18.0-rocky8_10_rebuild-f01f784daddc+.log
207
kselftest.4.18.0-rocky8_10_rebuild-d79f08d232d6+.log
207
kselftest.4.18.0-rocky8_10_rebuild-a8a2edf9cea8+.log
207
kselftest.4.18.0-rocky8_10_rebuild-4be5de5480b0+.log
207
Before: kselftest.4.18.0-rocky8_10_rebuild-a8a2edf9cea8+.log
After: kselftest.4.18.0-rocky8_10_rebuild-4be5de5480b0+.log
Diff:
No differences found.
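
The internals of get_kselftest_diff.sh are not shown here. The sketch below is only a guess at the kind of comparison it performs. It assumes the logs contain TAP-style `ok N name` / `not ok N name` result lines, and both function names are hypothetical.

```python
def passed_tests(log_text: str) -> set:
    """Collect the names of passing tests from TAP-style 'ok N name' lines."""
    names = set()
    for line in log_text.splitlines():
        parts = line.strip().split(maxsplit=2)
        if len(parts) == 3 and parts[0] == "ok" and parts[1].isdigit():
            names.add(parts[2])
    return names

def diff_logs(before: str, after: str) -> dict:
    """Report tests whose pass status changed between two kernel builds."""
    b, a = passed_tests(before), passed_tests(after)
    return {"newly_failing": sorted(b - a), "newly_passing": sorted(a - b)}
```

With this model, "No differences found." corresponds to both lists being empty.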

jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Scott Mayhew <[email protected]>
commit 6e9a2f8
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/6e9a2f8d.failed

The nfs inodes for referral anchors that have not yet been followed have
their filehandles zeroed out.

Attempting to call getxattr() on one of these will cause the nfs client
to send a GETATTR to the nfs server with the preceding PUTFH sans
filehandle.  The server will reply NFS4ERR_NOFILEHANDLE, leading to -EIO
being returned to the application.

For example:

$ strace -e trace=getxattr getfattr -n system.nfs4_acl /mnt/t/ref
getxattr("/mnt/t/ref", "system.nfs4_acl", NULL, 0) = -1 EIO (Input/output error)
/mnt/t/ref: system.nfs4_acl: Input/output error
+++ exited with 1 +++

Have the xattr handlers return -ENODATA instead.

	Signed-off-by: Scott Mayhew <[email protected]>
	Signed-off-by: Anna Schumaker <[email protected]>
(cherry picked from commit 6e9a2f8)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	fs/nfs/nfs4proc.c
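
The behaviour change in commit 6e9a2f8 above can be illustrated with a small user-space model. This is not the kernel code. `NfsInode` and `nfs4_getxattr` are hypothetical names, and a zeroed-out filehandle is modelled here as an empty one.

```python
import errno

class NfsInode:
    """Toy inode: a filehandle plus a dict standing in for server-side xattrs."""
    def __init__(self, fh: bytes, xattrs=None):
        self.fh = fh
        self.xattrs = xattrs or {}

def nfs4_getxattr(inode: NfsInode, name: str):
    # A referral anchor that has not been followed has no usable filehandle.
    # Answer locally with -ENODATA instead of sending a request that the
    # server would reject with NFS4ERR_NOFILEHANDLE (surfacing as -EIO).
    if len(inode.fh) == 0:
        return -errno.ENODATA
    if name not in inode.xattrs:
        return -errno.ENODATA
    return inode.xattrs[name]
```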
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Frederic Weisbecker <[email protected]>
commit a85c225
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/a85c2257.failed

Patch series "memcg, cpuisol: do not interfere pcp cache charges draining
with cpuisol workloads".

Leonardo has reported [1] that pcp memcg charge draining can interfere
with cpu isolated workloads.  The said draining is done from a WQ context
with a pcp worker scheduled on each CPU which holds any cached charges for
a specific memcg hierarchy.  This is not really a common operation
[2].  It can be triggered from userspace, though, so some care is
definitely due.

Leonardo has tried to address the issue by allowing remote charge draining
[3].  This approach requires an additional locking to synchronize pcp
caches sync from a remote cpu from local pcp consumers.  Even though the
proposed lock was per-cpu there is still potential for contention and less
predictable behavior.

This patchset addresses the issue from a different angle.  Rather than
dealing with potential synchronization, cpus which are isolated are
simply never scheduled to be drained.  This means that a small amount of
charges could be lying around waiting for a later use, or they are
flushed when a different memcg is charged from the same cpu.  More details
are in patch 2.  The first patch, from Frederic, implements an
abstraction to tell whether a specific cpu has been isolated and therefore
requires special treatment.

This patch (of 2):

Provide this new API to check if a CPU has been isolated either through
isolcpus= or nohz_full= kernel parameter.

It aims at avoiding kernel load deemed to be safely spared on CPUs running
sensitive workload that can't bear any disturbance, such as pcp cache
draining.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
	Signed-off-by: Frederic Weisbecker <[email protected]>
	Signed-off-by: Michal Hocko <[email protected]>
	Suggested-by: Michal Hocko <[email protected]>
	Cc: Johannes Weiner <[email protected]>
	Cc: Marcelo Tosatti <[email protected]>
	Cc: Muchun Song <[email protected]>
	Cc: Peter Zijlstra <[email protected]>
	Cc: Roman Gushchin <[email protected]>
	Cc: Shakeel Butt <[email protected]>
	Cc: Thomas Gleixner <[email protected]>
	Cc: Leonardo Bras <[email protected]>
	Cc: Ingo Molnar <[email protected]>
	Signed-off-by: Andrew Morton <[email protected]>
(cherry picked from commit a85c225)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	include/linux/sched/isolation.h
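
The check described in commit a85c225 above (a CPU counts as isolated if it was named in either `isolcpus=` or `nohz_full=`) can be modelled in user space as follows. This is a sketch only: `parse_cpulist` is a hypothetical helper that handles plain lists such as `2-3,6`, not the flag prefixes or strides that the real kernel parameters also accept.

```python
def parse_cpulist(spec: str) -> set:
    """Parse a plain CPU list such as '2-3,6' into a set of CPU ids."""
    cpus = set()
    for chunk in filter(None, spec.split(",")):
        lo, _, hi = chunk.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus

def cpu_is_isolated(cpu: int, isolcpus: str = "", nohz_full: str = "") -> bool:
    """True if the CPU was isolated through either boot parameter."""
    return cpu in parse_cpulist(isolcpus) or cpu in parse_cpulist(nohz_full)
```

A caller such as the pcp charge draining path would skip scheduling work on any CPU for which this returns true.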
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Ming Lei <[email protected]>
commit a46c270
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/a46c2702.failed

The kernel parameters `isolcpus=` and `nohz_full=` are used to isolate CPUs
for specific tasks, and block IO is not expected to disturb these CPUs, so
the blk-mq kworker shouldn't be scheduled on them. Also, if the blk-mq
kworker runs on isolated CPUs, long block IO latency can result.

The kernel workqueue only respects CPU isolation for WQ_UNBOUND; for bound
workqueues the responsibility is on the user, because the CPU is specified
as a WQ API parameter, as in mod_delayed_work_on(cpu),
queue_delayed_work_on(cpu) and queue_work_on(cpu).

So don't run the blk-mq kworker on isolated CPUs, by removing isolated CPUs
from hctx->cpumask. Meanwhile, use the queue map instead of hctx->cpumask
to check whether all CPUs in this hw queue are offline. This avoids any
cost in the fast IO code path, and is safe since hctx->cpumask is only
used in these two cases.

	Cc: Tim Chen <[email protected]>
	Cc: Juri Lelli <[email protected]>
	Cc: Andrew Theurer <[email protected]>
	Cc: Joe Mario <[email protected]>
	Cc: Sebastian Jug <[email protected]>
	Cc: Frederic Weisbecker <[email protected]>
	Cc: Bart Van Assche <[email protected]>
	Cc: Tejun Heo <[email protected]>
Tested-by: Joe Mario <[email protected]>
	Signed-off-by: Ming Lei <[email protected]>
	Reviewed-by: Ewan D. Milne <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jens Axboe <[email protected]>
(cherry picked from commit a46c270)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	block/blk-mq.c
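
The two uses of CPU masks described in commit a46c270 above can be separated in a small model. This is a sketch with hypothetical names, not the blk-mq code: the kworker mask excludes isolated CPUs, while the "are all CPUs of this hw queue offline?" check uses the queue map (CPU to hw queue index) so that isolated CPUs still count.

```python
def kworker_cpumask(hctx_cpus: set, isolated: set) -> set:
    """CPUs the kworker may run on: the hctx's CPUs minus isolated ones."""
    return hctx_cpus - isolated

def hctx_all_cpus_offline(queue_map: dict, hctx_idx: int, online: set) -> bool:
    """Decide from the queue map, not from the kworker mask."""
    mapped = {cpu for cpu, idx in queue_map.items() if idx == hctx_idx}
    return not (mapped & online)

# Example: CPUs 2 and 3 are isolated and both map to hw queue 1.
queue_map = {0: 0, 1: 0, 2: 1, 3: 1}
isolated = {2, 3}
online = {0, 1, 3}

print(kworker_cpumask({2, 3}, isolated))             # empty kworker mask
print(hctx_all_cpus_offline(queue_map, 1, online))   # CPU 3 is still online
```

In this example the kworker mask for hw queue 1 is empty, yet the queue is not treated as offline, because the offline check no longer consults that mask.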
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Ming Lei <[email protected]>
commit 7b81581

Commit a46c270 ("blk-mq: don't schedule block kworker on isolated CPUs")
rules out isolated CPUs from hctx->cpumask, and hctx->cpumask should only be
used for scheduling kworker.

Add helper blk_mq_cpu_mapped_to_hctx() and apply it into cpuhp handlers.

This patch ensures that the INACTIVE bit of the hctx state is cleared when
an isolated CPU comes online, and fixes a hang when allocating a request
from this hctx's tags.

	Cc: Raju Cheerla <[email protected]>
Fixes: a46c270 ("blk-mq: don't schedule block kworker on isolated CPUs")
	Signed-off-by: Ming Lei <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Tested-by: Raju Cheerla <[email protected]>
	Signed-off-by: Jens Axboe <[email protected]>
(cherry picked from commit 7b81581)
	Signed-off-by: Jonathan Maple <[email protected]>
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Ming Lei <[email protected]>
commit c25c0c9
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/c25c0c90.failed

Commit 7b81581 ("blk-mq: add helper for checking if one CPU is mapped to specified hctx")
needs to check queue mapping via tag set in hctx's cpuhp handler.

However, q->tag_set may not be set up yet when the cpuhp handler is
enabled, which triggers a kernel oops.

Fix the issue by setting up the queue's tag_set before initializing the hctx.

	Cc: [email protected]
Reported-and-tested-by: Rick Koch <[email protected]>
Closes: https://lore.kernel.org/linux-block/CANa58eeNDozLaBHKPLxSAhEy__FPfJT_F71W=sEQw49UCrC9PQ@mail.gmail.com
Fixes: 7b81581 ("blk-mq: add helper for checking if one CPU is mapped to specified hctx")
	Signed-off-by: Ming Lei <[email protected]>
	Reviewed-by: Christoph Hellwig <[email protected]>
	Reviewed-by: John Garry <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jens Axboe <[email protected]>
(cherry picked from commit c25c0c9)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	block/blk-mq.c
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
Rebuild_CHGLOG: - ceph: fix client race condition validating r_parent before applying state (Alex Markuze) [RHEL-120226]
Rebuild_FUZZ: 94.96%
commit-author Alex Markuze <[email protected]>
commit 15f519e
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/15f519e9.failed

Add validation to ensure the cached parent directory inode matches the
directory info in MDS replies. This prevents client-side race conditions
where concurrent operations (e.g. rename) cause r_parent to become stale
between request initiation and reply processing, which could lead to
applying state changes to incorrect directory inodes.

[ idryomov: folded a kerneldoc fixup and a follow-up fix from Alex to
  move CEPH_CAP_PIN reference when r_parent is updated:

  When the parent directory lock is not held, req->r_parent can become
  stale and is updated to point to the correct inode.  However, the
  associated CEPH_CAP_PIN reference was not being adjusted.  The
  CEPH_CAP_PIN is a reference on an inode that is tracked for
  accounting purposes.  Moving this pin is important to keep the
  accounting balanced. When the pin was not moved from the old parent
  to the new one, it created two problems: The reference on the old,
  stale parent was never released, causing a reference leak.
  A reference for the new parent was never acquired, creating the risk
  of a reference underflow later in ceph_mdsc_release_request().  This
  patch corrects the logic by releasing the pin from the old parent and
  acquiring it for the new parent when r_parent is switched.  This
  ensures reference accounting stays balanced. ]

	Cc: [email protected]
	Signed-off-by: Alex Markuze <[email protected]>
	Reviewed-by: Viacheslav Dubeyko <[email protected]>
	Signed-off-by: Ilya Dryomov <[email protected]>
(cherry picked from commit 15f519e)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	fs/ceph/debugfs.c
#	fs/ceph/dir.c
#	fs/ceph/file.c
#	fs/ceph/inode.c
#	fs/ceph/mds_client.c
#	fs/ceph/mds_client.h
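
The reference accounting described in the folded follow-up fix of commit 15f519e above can be sketched as a toy model. All names are hypothetical, and `pin_refs` merely stands in for CEPH_CAP_PIN references.

```python
class Inode:
    def __init__(self, name: str):
        self.name = name
        self.pin_refs = 0             # stands in for CEPH_CAP_PIN references

class Request:
    def __init__(self, parent: Inode):
        self.r_parent = parent
        parent.pin_refs += 1          # pin taken when the request is set up

def switch_parent(req: Request, correct_parent: Inode) -> None:
    """Point the request at the correct parent and move the pin with it."""
    if req.r_parent is correct_parent:
        return
    stale = req.r_parent
    correct_parent.pin_refs += 1      # acquire the pin on the new parent
    stale.pin_refs -= 1               # release the pin on the stale parent
    req.r_parent = correct_parent

def release_request(req: Request) -> None:
    req.r_parent.pin_refs -= 1        # drop the pin held through r_parent
```

Without the two adjustments in `switch_parent`, releasing the request would leave the stale parent at 1 (a leak) and drive the new parent to -1 (an underflow), which matches the two problems the commit message lists.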

jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
Rebuild_CHGLOG: - ceph: fix client race condition where r_parent becomes stale before sending message (Alex Markuze) [RHEL-120226]
Rebuild_FUZZ: 95.60%
commit-author Alex Markuze <[email protected]>
commit bec324f
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/bec324f3.failed

When the parent directory's i_rwsem is not locked, req->r_parent may become
stale due to concurrent operations (e.g. rename) between dentry lookup and
message creation. Validate that r_parent matches the encoded parent inode
and update to the correct inode if a mismatch is detected.

[ idryomov: folded a follow-up fix from Alex to drop extra reference
  from ceph_get_reply_dir() in ceph_fill_trace():

  ceph_get_reply_dir() may return a different, referenced inode when
  r_parent is stale and the parent directory lock is not held.
  ceph_fill_trace() used that inode but failed to drop the reference
  when it differed from req->r_parent, leaking an inode reference.

  Keep the directory inode in a local variable and iput() it at
  function end if it does not match req->r_parent. ]

	Cc: [email protected]
	Signed-off-by: Alex Markuze <[email protected]>
	Reviewed-by: Viacheslav Dubeyko <[email protected]>
	Signed-off-by: Ilya Dryomov <[email protected]>
(cherry picked from commit bec324f)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	fs/ceph/inode.c
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Kan Liang <[email protected]>
commit 47a3aeb

The PEBS TSC-based timestamps do not appear correctly in the final
perf.data output file from perf record.

The data->time field setup by PEBS in the setup_pebs_fixed_sample_data()
is later overwritten by perf_events generic code in
perf_prepare_sample(). There is an ordering problem.

Set the sample flags when the data->time is updated by PEBS.
The data->time field will not be overwritten anymore.

	Reported-by: Andreas Kogler <[email protected]>
	Reported-by: Stephane Eranian <[email protected]>
	Signed-off-by: Kan Liang <[email protected]>
	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
(cherry picked from commit 47a3aeb)
	Signed-off-by: Jonathan Maple <[email protected]>
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Kan Liang <[email protected]>
commit 89e97eb
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/89e97eb8.failed

The time order is incorrect when the TSC in a PEBS record is used.

 $ perf record -e cycles:upp dd if=/dev/zero of=/dev/null count=10000
 $ perf script --show-task-events
       perf-exec     0     0.000000: PERF_RECORD_COMM: perf-exec:915/915
              dd   915   106.479872: PERF_RECORD_COMM exec: dd:915/915
              dd   915   106.483270: PERF_RECORD_EXIT(915:915):(914:914)
              dd   915   106.512429:          1 cycles:upp:
 ffffffff96c011b7 [unknown] ([unknown])
 ... ...

The perf time is from sched_clock_cpu(). The current PEBS code
unconditionally converts the TSC to native_sched_clock(). There is a
shift between the two clocks. If the TSC is stable, the shift is
constant: __sched_clock_offset. If the TSC is unstable, the shift has
to be calculated at runtime.

This patch doesn't support the conversion when the TSC is unstable. The
TSC unstable case is a corner case and very unlikely to happen. If it
happens, the TSC in a PEBS record will be dropped and fall back to
perf_event_clock().

Fixes: 47a3aeb ("perf/x86/intel/pebs: Fix PEBS timestamps overwritten")
	Reported-by: Namhyung Kim <[email protected]>
	Signed-off-by: Kan Liang <[email protected]>
	Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lore.kernel.org/all/CAM9d7cgWDVAq8-11RbJ2uGfwkKD6fA-OMwOKDrNUrU_=8MgEjg@mail.gmail.com/
(cherry picked from commit 89e97eb)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	arch/x86/events/intel/ds.c
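
The conversion rule described in commit 89e97eb above can be written down as a small model. This is a sketch, not the kernel code. `pebs_sample_time` is a hypothetical name, `tsc_ns` stands for the TSC already converted to native_sched_clock() nanoseconds, and returning `None` stands for "drop the PEBS TSC and fall back to perf_event_clock()".

```python
def pebs_sample_time(tsc_ns: int, tsc_stable: bool, sched_clock_offset: int):
    """Return the perf time for a PEBS record, or None to signal fallback."""
    if not tsc_stable:
        # The shift would have to be computed at runtime; not supported.
        return None
    # With a stable TSC the shift between the two clocks is constant.
    return tsc_ns + sched_clock_offset
```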
jira KERNEL-428
cve CVE-2025-40240
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Alexey Simakov <[email protected]>
commit 441f064

chunk->skb pointer is dereferenced in the if-block where it's supposed
to be NULL only.

chunk->skb can only be NULL if chunk->head_skb is not. Check for frag_list
instead and do it just before replacing chunk->skb. We're sure that
otherwise chunk->skb is non-NULL because of outer if() condition.

Fixes: 90017ac ("sctp: Add GSO support")
	Signed-off-by: Alexey Simakov <[email protected]>
	Acked-by: Marcelo Ricardo Leitner <[email protected]>
Link: https://patch.msgid.link/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 441f064)
	Signed-off-by: Jonathan Maple <[email protected]>
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Andreas Gruenbacher <[email protected]>
commit 75bb2dd

Commit 6cb3b1c changed how finish_xmote() clears the GLF_LOCK flag,
but it failed to adjust the equivalent code in do_xmote().  Fix that.

Fixes: 6cb3b1c ("gfs2: Fix additional unlikely request cancelation race")
	Signed-off-by: Andreas Gruenbacher <[email protected]>
(cherry picked from commit 75bb2dd)
	Signed-off-by: Jonathan Maple <[email protected]>
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Andreas Gruenbacher <[email protected]>
commit 262ee3a

The GLF_LOCK flag is protected by the gl->gl_lockref.lock spin lock
which is held when entering run_queue(), so we can use test_bit() and
set_bit() here.

	Signed-off-by: Andreas Gruenbacher <[email protected]>
(cherry picked from commit 262ee3a)
	Signed-off-by: Jonathan Maple <[email protected]>
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Andreas Gruenbacher <[email protected]>
commit 92cef39

As a follow-up to commit a431d49 ("gfs2: Fix request cancelation
bug"), it turns out that any call to finish_xmote() is always followed
by a call to run_queue(), either

 * directly when glock_work_func() calls finish_xmote() before calling
   run_queue(), or

 * indirectly when do_xmote() calls finish_xmote() before calling
   gfs2_glock_queue_work(), which queues a call to glock_work_func() in
   work queue context,

so remove the code in finish_xmote() that duplicates the functionality
of run_queue().

In addition, the code this commit removes is missing a check for the
GLF_DEMOTE flag which indicates that no further promotes should be
performed, so if that code didn't get removed, that check would have to
be added.

	Signed-off-by: Andreas Gruenbacher <[email protected]>
	Reviewed-by: Andrew Price <[email protected]>
(cherry picked from commit 92cef39)
	Signed-off-by: Jonathan Maple <[email protected]>
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Andreas Gruenbacher <[email protected]>
commit cd493dc
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/cd493dcf.failed

Transform the code in run_queue() to make it more readable.  No change
in functionality.

	Signed-off-by: Andreas Gruenbacher <[email protected]>
	Reviewed-by: Andrew Price <[email protected]>
(cherry picked from commit cd493dc)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	fs/gfs2/glock.c
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Flavius Georgescu <[email protected]>
commit cf33069

The device is an iMON UltraBay (0x98 in the config byte) with an LCD,
IR and a dual-knob front panel.

To work properly, the device also requires its own key table
and repeat suppression for all buttons.

	Signed-off-by: Flavius Georgescu <[email protected]>
Co-developed-by: Chris Vandomelen <[email protected]>
	Signed-off-by: Chris Vandomelen <[email protected]>
	Signed-off-by: Sean Young <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit cf33069)
	Signed-off-by: Jonathan Maple <[email protected]>
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Oliver Neukum <[email protected]>
commit af2aa3c

The point of using get/put_device() is to keep references
for as long as the device may be in use. That means dropping
them must be the penultimate action right before freeing the memory.

	Signed-off-by: Oliver Neukum <[email protected]>
	Signed-off-by: Sean Young <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit af2aa3c)
	Signed-off-by: Jonathan Maple <[email protected]>
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Tetsuo Handa <[email protected]>
commit db264d4

Since usb_register_dev() from imon_init_display() from imon_probe() holds
minor_rwsem while display_open() which holds driver_lock and ictx->lock is
called with minor_rwsem held from usb_open(), holding driver_lock or
ictx->lock when calling usb_register_dev() causes circular locking
dependency problem.

Since usb_deregister_dev() from imon_disconnect() holds minor_rwsem while
display_open() which holds driver_lock is called with minor_rwsem held,
holding driver_lock when calling usb_deregister_dev() also causes circular
locking dependency problem.

Sean Young explained that the problem is there are imon devices which have
two usb interfaces, even though it is one device. The probe and disconnect
function of both usb interfaces can run concurrently.

Alan Stern responded that the driver and USB cores guarantee that when an
interface is probed, both the interface and its USB device are locked.
Ditto for when the disconnect callback gets run. So concurrent probing/
disconnection of multiple interfaces on the same device is not possible.

Therefore, we don't need locks for handling race between imon_probe() and
imon_disconnect(). But we still need to handle race between display_open()
/vfd_write()/lcd_write()/display_close() and imon_disconnect(), for
disconnect event can happen while file descriptors are in use.

Since "struct file"->private_data is set by display_open(), vfd_write()/
lcd_write()/display_close() can assume that "struct file"->private_data
is not NULL even after usb_set_intfdata(interface, NULL) was called.

Replace insufficiently held driver_lock with refcount_t based management.
Add a boolean flag for recording whether imon_disconnect() was already
called. Use RCU for accessing this boolean flag and refcount_t.

Since the boolean flag for imon_disconnect() is shared, disconnect event
on either intf0 or intf1 affects both interfaces. But I assume that this
change does not matter, for usually disconnect event would not happen
while interfaces are in use.

Link: https://syzkaller.appspot.com/bug?extid=c558267ad910fc494497

	Reported-by: syzbot <[email protected]>
	Signed-off-by: Tetsuo Handa <[email protected]>
	Tested-by: syzbot <[email protected]>
	Cc: Alan Stern <[email protected]>
	Signed-off-by: Sean Young <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit db264d4)
	Signed-off-by: Jonathan Maple <[email protected]>
jira KERNEL-428
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Gautam Menghani <[email protected]>
commit 813ceef

The function send_packet() has a race condition as follows:

func send_packet()
{
    // do work
    call usb_submit_urb()
    mutex_unlock()
    wait_for_event_interruptible()  <-- lock gone
    mutex_lock()
}

func vfd_write()
{
    mutex_lock()
    call send_packet()  <- prev call is not completed
    mutex_unlock()
}

When the mutex is unlocked and the function send_packet() waits for the
call to complete, vfd_write() can start another call, which leads to the
"URB submitted while active" warning in usb_submit_urb().
Fix this by removing the mutex_unlock() call in send_packet() and using
mutex_lock_interruptible().

Link: https://syzkaller.appspot.com/bug?id=e378e6a51fbe6c5cc43e34f131cc9a315ef0337e

Fixes: 21677cf ("V4L/DVB: ir-core: add imon driver")
	Reported-by: [email protected]
	Signed-off-by: Gautam Menghani <[email protected]>
	Signed-off-by: Sean Young <[email protected]>
	Signed-off-by: Mauro Carvalho Chehab <[email protected]>
(cherry picked from commit 813ceef)
	Signed-off-by: Jonathan Maple <[email protected]>
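
The fixed locking scheme from commit 813ceef above can be exercised with a user-space model. This is a sketch with a toy `Urb` class of our own, using Python threads in place of kernel contexts and a plain lock in place of mutex_lock_interruptible(). Because the lock stays held while send_packet() waits for completion, no second submission can overlap the first.

```python
import threading

class Urb:
    """Toy URB: submitting it again before completion is an error."""
    def __init__(self):
        self.active = False
        self.done = threading.Event()

    def submit(self):
        if self.active:
            raise RuntimeError("URB submitted while active")
        self.active = True
        self.done.clear()
        # Completion arrives asynchronously, like a USB completion handler.
        threading.Timer(0.005, self._complete).start()

    def _complete(self):
        self.active = False
        self.done.set()

def send_packet(urb):
    # Fixed behaviour: the caller's lock is NOT dropped while waiting.
    urb.submit()
    urb.done.wait()

def vfd_write(urb, lock, errors):
    with lock:
        try:
            send_packet(urb)
        except RuntimeError as exc:
            errors.append(exc)

def run_writers(n_threads=4, writes_each=3):
    """Run concurrent writers and return any overlap errors seen."""
    urb, lock, errors = Urb(), threading.Lock(), []

    def worker():
        for _ in range(writes_each):
            vfd_write(urb, lock, errors)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors
```

Releasing the lock inside `send_packet` before `urb.done.wait()` would reintroduce the window in which another writer can submit the still-active URB.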
jira KERNEL-428
cve CVE-2025-39993
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Larshin Sergey <[email protected]>
commit fa0f61c
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/fa0f61cc.failed

Syzbot reports a KASAN issue as below:
BUG: KASAN: use-after-free in __create_pipe include/linux/usb.h:1945 [inline]
BUG: KASAN: use-after-free in send_packet+0xa2d/0xbc0 drivers/media/rc/imon.c:627
Read of size 4 at addr ffff8880256fb000 by task syz-executor314/4465

CPU: 2 PID: 4465 Comm: syz-executor314 Not tainted 6.0.0-rc1-syzkaller #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
 <TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:317 [inline]
print_report.cold+0x2ba/0x6e9 mm/kasan/report.c:433
kasan_report+0xb1/0x1e0 mm/kasan/report.c:495
__create_pipe include/linux/usb.h:1945 [inline]
send_packet+0xa2d/0xbc0 drivers/media/rc/imon.c:627
vfd_write+0x2d9/0x550 drivers/media/rc/imon.c:991
vfs_write+0x2d7/0xdd0 fs/read_write.c:576
ksys_write+0x127/0x250 fs/read_write.c:631
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

The iMON driver improperly releases the usb_device reference in
imon_disconnect without coordinating with active users of the
device.

Specifically, the fields usbdev_intf0 and usbdev_intf1 are not
protected by the users counter (ictx->users). During probe,
imon_init_intf0 or imon_init_intf1 increments the usb_device
reference count depending on the interface. However, during
disconnect, usb_put_dev is called unconditionally, regardless of
actual usage.

As a result, if vfd_write or other operations are still in
progress after disconnect, this can lead to a use-after-free of
the usb_device pointer.

Thread 1 vfd_write                      Thread 2 imon_disconnect
                                        ...
                                        if
                                          usb_put_dev(ictx->usbdev_intf0)
                                        else
                                          usb_put_dev(ictx->usbdev_intf1)
...
while
  send_packet
    if
      pipe = usb_sndintpipe(
        ictx->usbdev_intf0) UAF
    else
      pipe = usb_sndctrlpipe(
        ictx->usbdev_intf0, 0) UAF

Guard access to usbdev_intf0 and usbdev_intf1 after disconnect by
checking ictx->disconnected in all writer paths. Add early return
with -ENODEV in send_packet(), vfd_write(), lcd_write() and
display_open() if the device is no longer present.

Set and read ictx->disconnected under ictx->lock to ensure memory
synchronization. Acquire the lock in imon_disconnect() before setting
the flag to synchronize with any ongoing operations.

Ensure writers exit early and safely after disconnect before the USB
core proceeds with cleanup.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

	Reported-by: [email protected]
Closes: https://syzkaller.appspot.com/bug?extid=f1a69784f6efe748c3bf
Fixes: 21677cf ("V4L/DVB: ir-core: add imon driver")
	Cc: [email protected]

	Signed-off-by: Larshin Sergey <[email protected]>
	Signed-off-by: Sean Young <[email protected]>
	Signed-off-by: Hans Verkuil <[email protected]>
(cherry picked from commit fa0f61c)
	Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
#	drivers/media/rc/imon.c
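
The guard described in commit fa0f61c above can be sketched as a user-space model with hypothetical class and field names. The `disconnected` flag is set and read under the context lock, and the writer path returns -ENODEV early instead of touching the device reference.

```python
import errno
import threading

class ImonContext:
    def __init__(self):
        self.lock = threading.Lock()
        self.disconnected = False
        self.usbdev = object()        # stands in for the usb_device reference

def send_packet(ictx):
    """Caller must hold ictx.lock, as the writer paths do."""
    if ictx.disconnected:
        return -errno.ENODEV
    _ = ictx.usbdev                   # still valid: disconnect has not run
    return 0

def vfd_write(ictx):
    with ictx.lock:
        if ictx.disconnected:
            return -errno.ENODEV
        return send_packet(ictx)

def imon_disconnect(ictx):
    with ictx.lock:                   # waits for any writer in progress
        ictx.disconnected = True
    ictx.usbdev = None                # dropped only after the flag is set
```

Because writers check the flag under the same lock that the disconnect path takes, a write either completes before the reference is dropped or sees the flag and bails out.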
jira KERNEL-428
cve CVE-2025-68285
Rebuild_History Non-Buildable kernel-4.18.0-553.92.1.el8_10
commit-author Ilya Dryomov <[email protected]>
commit 076381c

The wait loop in __ceph_open_session() can race with the client
receiving a new monmap or osdmap shortly after the initial map is
received.  Both ceph_monc_handle_map() and handle_one_map() install
a new map immediately after freeing the old one

    kfree(monc->monmap);
    monc->monmap = monmap;

    ceph_osdmap_destroy(osdc->osdmap);
    osdc->osdmap = newmap;

under client->monc.mutex and client->osdc.lock respectively, but
because neither is taken in have_mon_and_osd_map() it's possible for
client->monc.monmap->epoch and client->osdc.osdmap->epoch arms in

    client->monc.monmap && client->monc.monmap->epoch &&
        client->osdc.osdmap && client->osdc.osdmap->epoch;

condition to dereference an already freed map.  This happens to be
reproducible with generic/395 and generic/397 with KASAN enabled:

    BUG: KASAN: slab-use-after-free in have_mon_and_osd_map+0x56/0x70
    Read of size 4 at addr ffff88811012d810 by task mount.ceph/13305
    CPU: 2 UID: 0 PID: 13305 Comm: mount.ceph Not tainted 6.14.0-rc2-build2+ #1266
    ...
    Call Trace:
    <TASK>
    have_mon_and_osd_map+0x56/0x70
    ceph_open_session+0x182/0x290
    ceph_get_tree+0x333/0x680
    vfs_get_tree+0x49/0x180
    do_new_mount+0x1a3/0x2d0
    path_mount+0x6dd/0x730
    do_mount+0x99/0xe0
    __do_sys_mount+0x141/0x180
    do_syscall_64+0x9f/0x100
    entry_SYSCALL_64_after_hwframe+0x76/0x7e
    </TASK>

    Allocated by task 13305:
    ceph_osdmap_alloc+0x16/0x130
    ceph_osdc_init+0x27a/0x4c0
    ceph_create_client+0x153/0x190
    create_fs_client+0x50/0x2a0
    ceph_get_tree+0xff/0x680
    vfs_get_tree+0x49/0x180
    do_new_mount+0x1a3/0x2d0
    path_mount+0x6dd/0x730
    do_mount+0x99/0xe0
    __do_sys_mount+0x141/0x180
    do_syscall_64+0x9f/0x100
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

    Freed by task 9475:
    kfree+0x212/0x290
    handle_one_map+0x23c/0x3b0
    ceph_osdc_handle_map+0x3c9/0x590
    mon_dispatch+0x655/0x6f0
    ceph_con_process_message+0xc3/0xe0
    ceph_con_v1_try_read+0x614/0x760
    ceph_con_workfn+0x2de/0x650
    process_one_work+0x486/0x7c0
    process_scheduled_works+0x73/0x90
    worker_thread+0x1c8/0x2a0
    kthread+0x2ec/0x300
    ret_from_fork+0x24/0x40
    ret_from_fork_asm+0x1a/0x30

Rewrite the wait loop to check the above condition directly with
client->monc.mutex and client->osdc.lock taken as appropriate.  While
at it, improve the timeout handling (previously mount_timeout could be
exceeded in case wait_event_interruptible_timeout() slept more than
once) and access client->auth_err under client->monc.mutex to match
how it's set in finish_auth().

monmap_show() and osdmap_show() now take the respective lock before
accessing the map as well.
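
As an illustration only, the two ideas above (check the map under its lock, and bound the whole wait by a single deadline) can be sketched in userspace C. This is not the libceph code: `struct client`, `have_map()` and `wait_for_map()` are hypothetical names, a pthread mutex stands in for the kernel locks, and a polling loop replaces the kernel wait queue.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

struct map { unsigned int epoch; };

/* Hypothetical client; the map pointer may be swapped by another thread. */
struct client {
	pthread_mutex_t lock;
	struct map *map;	/* replaced only with lock held */
};

static void client_init(struct client *c)
{
	pthread_mutex_init(&c->lock, NULL);
	c->map = NULL;
}

/* Install a new map; a real owner would free the old one here, under lock. */
static void client_install_map(struct client *c, struct map *newmap)
{
	pthread_mutex_lock(&c->lock);
	c->map = newmap;
	pthread_mutex_unlock(&c->lock);
}

/* Dereference the map only while holding the lock that guards its swap. */
static bool have_map(struct client *c)
{
	bool ret;

	pthread_mutex_lock(&c->lock);
	ret = c->map && c->map->epoch;
	pthread_mutex_unlock(&c->lock);
	return ret;
}

static long long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* One deadline for the whole wait, however many times the loop sleeps. */
static int wait_for_map(struct client *c, long timeout_ms)
{
	const long long deadline = now_ms() + timeout_ms;
	const struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 };

	while (!have_map(c)) {
		if (now_ms() >= deadline)
			return -ETIMEDOUT;
		nanosleep(&nap, NULL);
	}
	return 0;
}
```

Computing the deadline once, before the loop, is what keeps repeated sleeps from adding up to more than the requested timeout.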

	Cc: [email protected]
	Reported-by: David Howells <[email protected]>
	Signed-off-by: Ilya Dryomov <[email protected]>
	Reviewed-by: Viacheslav Dubeyko <[email protected]>
(cherry picked from commit 076381c)
	Signed-off-by: Jonathan Maple <[email protected]>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..kernel-mainline: 594898
Number of commits in rpm: 31
Number of commits matched with upstream: 20 (64.52%)
Number of commits in upstream but not in rpm: 594878
Number of commits NOT found in upstream: 11 (35.48%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.92.1.el8_10 for kernel-4.18.0-553.92.1.el8_10
Clean Cherry Picks: 11 (55.00%)
Empty Cherry Picks: 9 (45.00%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-4.18.0-553.92.1.el8_10/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual empty-commit failures are stored in the same directory as rebuild.details.txt.
The git message of each empty commit gives the path to its failure file.
File names are the first 8 characters of the upstream SHA.
@PlaidCat PlaidCat requested review from a team January 16, 2026 19:38
@PlaidCat PlaidCat self-assigned this Jan 16, 2026

@bmastbergen bmastbergen left a comment

🥌

@PlaidCat PlaidCat requested a review from a team January 16, 2026 20:23