Skip to content

Conversation

PlaidCat
Copy link
Collaborator

Update process (This kernel CentOS base for 4.18.0-553)

  • Kernel History Rebuild Process for all src.rpms hosted by RESF
  • Create sig-cloud-8/4.18.0-553.40.1.el8_10 branch
  • Check if any maintained code is included in the new el release.
  • Cherry-pick all code from previous branch into new branch (skipping unneeded code)
    • Fix conflicts as they arise
  • Build and Test

Removed Patches

  • None

Build

/mnt/code/kernel-src-tree-build
  CLEAN    resolve_btfids
  CLEAN   .tmp_versions
  CLEAN   scripts/basic
  CLEAN   scripts/genksyms
  CLEAN   scripts/kconfig
  CLEAN   scripts/mod
  CLEAN   scripts/selinux/genheaders
  CLEAN   scripts/selinux/mdp
  CLEAN   scripts
  CLEAN   include/config usr/include include/generated arch/x86/include/generated
  CLEAN   .config .config.old .version Module.symvers
[TIMER]{MRPROPER}: 8s
x86_64 architecture detected, copying config
'configs/kernel-x86_64.config' -> '.config'
Setting Local Version for build
CONFIG_LOCALVERSION="-jmaple_sig-cloud-8_4.18.0-553.40.1.el8_10-db599ef0953c"
Making olddefconfig
  HOSTCC  scripts/basic/fixdep
  HOSTCC  scripts/kconfig/conf.o
  YACC    scripts/kconfig/zconf.tab.c
  LEX     scripts/kconfig/zconf.lex.c
  HOSTCC  scripts/kconfig/zconf.tab.o
  HOSTLD  scripts/kconfig/conf
scripts/kconfig/conf  --olddefconfig Kconfig
#
# configuration written to .config
#
Starting Build
scripts/kconfig/conf  --syncconfig Kconfig
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h

[SNIP]
  
  LD [M]  sound/x86/snd-hdmi-lpe-audio.ko
  LD [M]  sound/xen/snd_xen_front.ko
  LD [M]  virt/lib/irqbypass.ko
[TIMER]{BUILD}: 2241s
Making Modules
  INSTALL arch/x86/crypto/blowfish-x86_64.ko
  INSTALL arch/x86/crypto/camellia-aesni-avx-x86_64.ko

[SNIP]

  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-jmaple_sig-cloud-8_4.18.0-553.40.1.el8_10-db599ef0953c+
[TIMER]{MODULES}: 58s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-jmaple_sig-cloud-8_4.18.0-553.40.1.el8_10-db599ef0953c+ arch/x86/boot/bzImage \
        System.map "/boot"
[TIMER]{INSTALL}: 28s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-jmaple_sig-cloud-8_4.18.0-553.40.1.el8_10-db599ef0953c+ and Index to 2
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 8s
[TIMER]{BUILD}: 2241s
[TIMER]{MODULES}: 58s
[TIMER]{INSTALL}: 28s
[TIMER]{TOTAL} 2339s
Rebooting in 10 seconds

Login

[maple@r8-sigcloud-builder ~]$ uname -a
Linux r8-sigcloud-builder 4.18.0-jmaple_sig-cloud-8_4.18.0-553.40.1.el8_10-db599ef0953c+ #1 SMP Tue Mar 11 20:58:59 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Testing

Just ran a basic two pass
Screenshot 2025-03-11 at 6 37 51 PM

gvrose8192 and others added 3 commits March 11, 2025 14:23
jira LE-1482
commit 415d832
upstream-diff - Conflicts around use of arch_atomic vs atomic - there
are no arch_atomic* ops in Rocky 8.x

    These operations are documented as always ordered in
    include/asm-generic/bitops/instrumented-atomic.h, and producer-consumer
    type use cases where one side needs to ensure a flag is left pending
    after some shared data was updated rely on this ordering, even in the
    failure case.

    This is the case with the workqueue code, which currently suffers from a
    reproducible ordering violation on Apple M1 platforms (which are
    notoriously out-of-order) that ends up causing the TTY layer to fail to
    deliver data to userspace properly under the right conditions.  This
    change fixes that bug.

    Change the documentation to restrict the "no order on failure" story to
    the _lock() variant (for which it makes sense), and remove the
    early-exit from the generic implementation, which is what causes the
    missing barrier semantics in that case.  Without this, the remaining
    atomic op is fully ordered (including on ARM64 LSE, as of recent
    versions of the architecture spec).

    Suggested-by: Linus Torvalds <[email protected]>
    Cc: [email protected]
    Fixes: e986a0d ("locking/atomics, asm-generic/bitops/atomic.h: Rewrite using atomic_*() APIs")
    Fixes: 61e0239 ("locking/atomic/bitops: Document and clarify ordering semantics for failed test_and_{}_bit()")
    Signed-off-by: Hector Martin <[email protected]>
    Acked-by: Will Deacon <[email protected]>
    Reviewed-by: Arnd Bergmann <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>
    (cherry picked from commit 415d832)

Signed-off-by: Greg Rose <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
…tead of a two-phase approach

jira roc-2673
commit fbf6449

Instead of setting x86_virt_bits to a possibly-correct value and then
correcting it later, do all the necessary checks before setting it.

At this point, the #VC handler references boot_cpu_data.x86_virt_bits,
and in the previous version, it would be triggered by the CPUIDs between
the point at which it is set to 48 and when it is set to the correct
value.

    Suggested-by: Dave Hansen <[email protected]>
    Signed-off-by: Adam Dunlap <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Tested-by: Jacob Xu <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]

Signed-off-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Jonathan Maple <[email protected]>
jira roc-2673
commit 3e32552

c->x86_cache_alignment is initialized from c->x86_clflush_size.
However, commit fbf6449 moved c->x86_clflush_size initialization
to later in boot without moving the c->x86_cache_alignment assignment:

  fbf6449 ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")

This presumably left c->x86_cache_alignment set to zero for longer
than it should be.

The result was an oops on 32-bit kernels while accessing a pointer
at 0x20.  The 0x20 came from accessing a structure member at offset
0x10 (buffer->cpumask) from a ZERO_SIZE_PTR=0x10.  kmalloc() can
evidently return ZERO_SIZE_PTR when it's given 0 as its alignment
requirement.

Move the c->x86_cache_alignment initialization to be after
c->x86_clflush_size has an actual value.

    Fixes: fbf6449 ("x86/sev-es: Set x86_virt_bits to the correct value straight away, instead of a two-phase approach")
    Signed-off-by: Dave Hansen <[email protected]>
    Signed-off-by: Ingo Molnar <[email protected]>
    Tested-by: Nathan Chancellor <[email protected]>
    Link: https://lore.kernel.org/r/[email protected]
    (cherry picked from commit 3e32552)
Signed-off-by: Ronnie Sahlberg <[email protected]>

Signed-off-by: Jonathan Maple <[email protected]>
@bmastbergen
Copy link
Collaborator

In sig-cloud-8/4.18.0-553.36.1.el8_10 we added the three commits below because they were fixes to fbf6449:

b2097aceefe7 x86/cpu: Provide default cache line size if not enumerated
f167c45a2e09 x86/cpu: Get rid of an unnecessary local variable in get_cpu_address_sizes()
06c0ca8aeecd x86/cpu: Allow reducing x86_phys_bits during early_identify_cpu()

Any reason not to have them here?

@PlaidCat
Copy link
Collaborator Author

PlaidCat commented Mar 12, 2025

Any reason not to have them here?

Because I based it on 22.1 since for my tree .36.1 was blank so I probably have a stale tree.

So I need to go fix that since, I'll be back and restarting reviews.

@PlaidCat PlaidCat requested a review from elguero March 12, 2025 13:21
@PlaidCat
Copy link
Collaborator Author

Actually I'm about to push a new update to rocky9.5 we'll just go with that.

@PlaidCat PlaidCat closed this Mar 12, 2025
@PlaidCat PlaidCat deleted the jmaple_sig-cloud-8/4.18.0-553.40.1.el8_10 branch March 12, 2025 13:25
github-actions bot pushed a commit that referenced this pull request Jun 7, 2025
During our test, it is found that a warning can be trigger in try_grab_folio
as follow:

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1678 at mm/gup.c:147 try_grab_folio+0x106/0x130
  Modules linked in:
  CPU: 0 UID: 0 PID: 1678 Comm: syz.3.31 Not tainted 6.15.0-rc5 #163 PREEMPT(undef)
  RIP: 0010:try_grab_folio+0x106/0x130
  Call Trace:
   <TASK>
   follow_huge_pmd+0x240/0x8e0
   follow_pmd_mask.constprop.0.isra.0+0x40b/0x5c0
   follow_pud_mask.constprop.0.isra.0+0x14a/0x170
   follow_page_mask+0x1c2/0x1f0
   __get_user_pages+0x176/0x950
   __gup_longterm_locked+0x15b/0x1060
   ? gup_fast+0x120/0x1f0
   gup_fast_fallback+0x17e/0x230
   get_user_pages_fast+0x5f/0x80
   vmci_host_unlocked_ioctl+0x21c/0xf80
  RIP: 0033:0x54d2cd
  ---[ end trace 0000000000000000 ]---

Digging into the source, context->notify_page may init by get_user_pages_fast
and can be seen in vmci_ctx_unset_notify which will try to put_page. However
get_user_pages_fast is not finished here and lead to following
try_grab_folio warning. The race condition is shown as follow:

cpu0			cpu1
vmci_host_do_set_notify
vmci_host_setup_notify
get_user_pages_fast(uva, 1, FOLL_WRITE, &context->notify_page);
lockless_pages_from_mm
gup_pgd_range
gup_huge_pmd  // update &context->notify_page
			vmci_host_do_set_notify
			vmci_ctx_unset_notify
			notify_page = context->notify_page;
			if (notify_page)
			put_page(notify_page);	// page is freed
__gup_longterm_locked
__get_user_pages
follow_trans_huge_pmd
try_grab_folio // warn here

To slove this, use local variable page to make notify_page can be seen
after finish get_user_pages_fast.

Fixes: a1d8843 ("VMCI: Fix two UVA mapping bugs")
Cc: stable <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Wupeng Ma <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
github-actions bot pushed a commit that referenced this pull request Jun 20, 2025
commit 1bd6406 upstream.

During our test, it is found that a warning can be trigger in try_grab_folio
as follow:

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1678 at mm/gup.c:147 try_grab_folio+0x106/0x130
  Modules linked in:
  CPU: 0 UID: 0 PID: 1678 Comm: syz.3.31 Not tainted 6.15.0-rc5 #163 PREEMPT(undef)
  RIP: 0010:try_grab_folio+0x106/0x130
  Call Trace:
   <TASK>
   follow_huge_pmd+0x240/0x8e0
   follow_pmd_mask.constprop.0.isra.0+0x40b/0x5c0
   follow_pud_mask.constprop.0.isra.0+0x14a/0x170
   follow_page_mask+0x1c2/0x1f0
   __get_user_pages+0x176/0x950
   __gup_longterm_locked+0x15b/0x1060
   ? gup_fast+0x120/0x1f0
   gup_fast_fallback+0x17e/0x230
   get_user_pages_fast+0x5f/0x80
   vmci_host_unlocked_ioctl+0x21c/0xf80
  RIP: 0033:0x54d2cd
  ---[ end trace 0000000000000000 ]---

Digging into the source, context->notify_page may init by get_user_pages_fast
and can be seen in vmci_ctx_unset_notify which will try to put_page. However
get_user_pages_fast is not finished here and lead to following
try_grab_folio warning. The race condition is shown as follow:

cpu0			cpu1
vmci_host_do_set_notify
vmci_host_setup_notify
get_user_pages_fast(uva, 1, FOLL_WRITE, &context->notify_page);
lockless_pages_from_mm
gup_pgd_range
gup_huge_pmd  // update &context->notify_page
			vmci_host_do_set_notify
			vmci_ctx_unset_notify
			notify_page = context->notify_page;
			if (notify_page)
			put_page(notify_page);	// page is freed
__gup_longterm_locked
__get_user_pages
follow_trans_huge_pmd
try_grab_folio // warn here

To slove this, use local variable page to make notify_page can be seen
after finish get_user_pages_fast.

Fixes: a1d8843 ("VMCI: Fix two UVA mapping bugs")
Cc: stable <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Wupeng Ma <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

5 participants