
Commit 122e794

Merge tag 'mm-hotfixes-stable-2023-07-28-15-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
 "11 hotfixes. Five are cc:stable and the remainder address post-6.4
  issues or aren't considered serious enough to justify backporting"

* tag 'mm-hotfixes-stable-2023-07-28-15-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mm/memory-failure: fix hardware poison check in unpoison_memory()
  proc/vmcore: fix signedness bug in read_from_oldmem()
  mailmap: update remaining active codeaurora.org email addresses
  mm: lock VMA in dup_anon_vma() before setting ->anon_vma
  mm: fix memory ordering for mm_lock_seq and vm_lock_seq
  scripts/spelling.txt: remove 'thead' as a typo
  mm/pagewalk: fix EFI_PGT_DUMP of espfix area
  shmem: minor fixes to splice-read implementation
  tmpfs: fix Documentation of noswap and huge mount options
  Revert "um: Use swap() to make code cleaner"
  mm/damon/core-test: initialise context before test in damon_test_set_attrs()
2 parents 20d3f24 + 6c54312 commit 122e794

File tree: 13 files changed, +197 −51 lines changed


.mailmap

Lines changed: 96 additions & 1 deletion
Large diffs are not rendered by default.

Documentation/filesystems/tmpfs.rst

Lines changed: 20 additions & 27 deletions
@@ -84,8 +84,6 @@ nr_inodes  The maximum number of inodes for this instance. The default
            is half of the number of your physical RAM pages, or (on a
            machine with highmem) the number of lowmem RAM pages,
            whichever is the lower.
-noswap     Disables swap. Remounts must respect the original settings.
-           By default swap is enabled.
 =========  ============================================================
 
 These parameters accept a suffix k, m or g for kilo, mega and giga and
@@ -99,36 +97,31 @@ mount with such options, since it allows any user with write access to
 use up all the memory on the machine; but enhances the scalability of
 that instance in a system with many CPUs making intensive use of it.
 
+tmpfs blocks may be swapped out, when there is a shortage of memory.
+tmpfs has a mount option to disable its use of swap:
+
+======  ===========================================================
+noswap  Disables swap. Remounts must respect the original settings.
+        By default swap is enabled.
+======  ===========================================================
+
 tmpfs also supports Transparent Huge Pages which requires a kernel
 configured with CONFIG_TRANSPARENT_HUGEPAGE and with huge supported for
 your system (has_transparent_hugepage(), which is architecture specific).
 The mount options for this are:
 
-======  ============================================================
-huge=0  never: disables huge pages for the mount
-huge=1  always: enables huge pages for the mount
-huge=2  within_size: only allocate huge pages if the page will be
-        fully within i_size, also respect fadvise()/madvise() hints.
-huge=3  advise: only allocate huge pages if requested with
-        fadvise()/madvise()
-======  ============================================================
-
-There is a sysfs file which you can also use to control system wide THP
-configuration for all tmpfs mounts, the file is:
-
-/sys/kernel/mm/transparent_hugepage/shmem_enabled
-
-This sysfs file is placed on top of THP sysfs directory and so is registered
-by THP code. It is however only used to control all tmpfs mounts with one
-single knob. Since it controls all tmpfs mounts it should only be used either
-for emergency or testing purposes. The values you can set for shmem_enabled are:
-
-==  ============================================================
--1  deny: disables huge on shm_mnt and all mounts, for
-    emergency use
--2  force: enables huge on shm_mnt and all mounts, w/o needing
-    option, for testing
-==  ============================================================
+================  ==============================================================
+huge=never        Do not allocate huge pages. This is the default.
+huge=always       Attempt to allocate huge page every time a new page is needed.
+huge=within_size  Only allocate huge page if it will be fully within i_size.
+                  Also respect madvise(2) hints.
+huge=advise       Only allocate huge page if requested with madvise(2).
+================  ==============================================================
+
+See also Documentation/admin-guide/mm/transhuge.rst, which describes the
+sysfs file /sys/kernel/mm/transparent_hugepage/shmem_enabled: which can
+be used to deny huge pages on all tmpfs mounts in an emergency, or to
+force huge pages on all tmpfs mounts for testing.
 
 tmpfs has a mount option to set the NUMA memory allocation policy for
 all files in that instance (if CONFIG_NUMA is enabled) - which can be
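
For readers who want to try the documented options, here is a minimal usage sketch that passes them through the data argument of mount(2). It is not part of this commit; the mount point and size are made up, and noswap is only accepted by kernels that implement it.

/* Hypothetical usage sketch: mount a tmpfs with the options documented above. */
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	/* huge=within_size: huge pages only when fully within i_size;
	 * noswap: this instance never uses swap. */
	if (mount("tmpfs", "/mnt/scratch", "tmpfs", 0,
		  "size=1g,noswap,huge=within_size") != 0) {
		perror("mount");	/* needs root / CAP_SYS_ADMIN */
		return 1;
	}
	return 0;
}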

arch/um/os-Linux/sigio.c

Lines changed: 4 additions & 3 deletions
@@ -3,7 +3,6 @@
  * Copyright (C) 2002 - 2008 Jeff Dike (jdike@{addtoit,linux.intel}.com)
  */
 
-#include <linux/minmax.h>
 #include <unistd.h>
 #include <errno.h>
 #include <fcntl.h>
@@ -51,7 +50,7 @@ static struct pollfds all_sigio_fds;
 
 static int write_sigio_thread(void *unused)
 {
-	struct pollfds *fds;
+	struct pollfds *fds, tmp;
 	struct pollfd *p;
 	int i, n, respond_fd;
 	char c;
@@ -78,7 +77,9 @@ static int write_sigio_thread(void *unused)
 				       "write_sigio_thread : "
 				       "read on socket failed, "
 				       "err = %d\n", errno);
-			swap(current_poll, next_poll);
+			tmp = current_poll;
+			current_poll = next_poll;
+			next_poll = tmp;
 			respond_fd = sigio_private[1];
 		}
 		else {
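
The revert drops the linux/minmax.h include and goes back to an open-coded three-step exchange, presumably because this file is built as part of UML's host (userspace) side, where pulling in a kernel header for the swap() macro is problematic. A standalone sketch of the same struct-by-value exchange follows; the field layout is made up and is not the real UML struct pollfds.

/* Illustrative only: three assignments swap two structs without any macro. */
#include <stdio.h>

struct pollfds {
	void *poll;	/* made-up members, for demonstration */
	int size;
	int used;
};

int main(void)
{
	struct pollfds current_poll = { NULL, 4, 1 };
	struct pollfds next_poll = { NULL, 8, 2 };
	struct pollfds tmp;

	/* Plain struct assignment copies every member, so a temporary plus
	 * two assignments exchanges the two descriptors. */
	tmp = current_poll;
	current_poll = next_poll;
	next_poll = tmp;

	printf("current.size=%d next.size=%d\n", current_poll.size, next_poll.size);
	return 0;
}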

fs/proc/vmcore.c

Lines changed: 1 addition & 1 deletion
@@ -132,7 +132,7 @@ ssize_t read_from_oldmem(struct iov_iter *iter, size_t count,
 			 u64 *ppos, bool encrypted)
 {
 	unsigned long pfn, offset;
-	size_t nr_bytes;
+	ssize_t nr_bytes;
 	ssize_t read = 0, tmp;
 	int idx;
 
include/linux/mm.h

Lines changed: 23 additions & 6 deletions
@@ -641,8 +641,14 @@ static inline void vma_numab_state_free(struct vm_area_struct *vma) {}
  */
 static inline bool vma_start_read(struct vm_area_struct *vma)
 {
-	/* Check before locking. A race might cause false locked result. */
-	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
+	/*
+	 * Check before locking. A race might cause false locked result.
+	 * We can use READ_ONCE() for the mm_lock_seq here, and don't need
+	 * ACQUIRE semantics, because this is just a lockless check whose result
+	 * we don't rely on for anything - the mm_lock_seq read against which we
+	 * need ordering is below.
+	 */
+	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(vma->vm_mm->mm_lock_seq))
 		return false;
 
 	if (unlikely(down_read_trylock(&vma->vm_lock->lock) == 0))
@@ -653,8 +659,13 @@ static inline bool vma_start_read(struct vm_area_struct *vma)
 	 * False unlocked result is impossible because we modify and check
 	 * vma->vm_lock_seq under vma->vm_lock protection and mm->mm_lock_seq
 	 * modification invalidates all existing locks.
+	 *
+	 * We must use ACQUIRE semantics for the mm_lock_seq so that if we are
+	 * racing with vma_end_write_all(), we only start reading from the VMA
+	 * after it has been unlocked.
+	 * This pairs with RELEASE semantics in vma_end_write_all().
 	 */
-	if (unlikely(vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
+	if (unlikely(vma->vm_lock_seq == smp_load_acquire(&vma->vm_mm->mm_lock_seq))) {
 		up_read(&vma->vm_lock->lock);
 		return false;
 	}
@@ -676,7 +687,7 @@ static bool __is_vma_write_locked(struct vm_area_struct *vma, int *mm_lock_seq)
 	 * current task is holding mmap_write_lock, both vma->vm_lock_seq and
 	 * mm->mm_lock_seq can't be concurrently modified.
 	 */
-	*mm_lock_seq = READ_ONCE(vma->vm_mm->mm_lock_seq);
+	*mm_lock_seq = vma->vm_mm->mm_lock_seq;
 	return (vma->vm_lock_seq == *mm_lock_seq);
 }
 
@@ -688,7 +699,13 @@ static inline void vma_start_write(struct vm_area_struct *vma)
 		return;
 
 	down_write(&vma->vm_lock->lock);
-	vma->vm_lock_seq = mm_lock_seq;
+	/*
+	 * We should use WRITE_ONCE() here because we can have concurrent reads
+	 * from the early lockless pessimistic check in vma_start_read().
+	 * We don't really care about the correctness of that early check, but
+	 * we should use WRITE_ONCE() for cleanliness and to keep KCSAN happy.
+	 */
+	WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
 	up_write(&vma->vm_lock->lock);
 }
 
@@ -702,7 +719,7 @@ static inline bool vma_try_start_write(struct vm_area_struct *vma)
 	if (!down_write_trylock(&vma->vm_lock->lock))
 		return false;
 
-	vma->vm_lock_seq = mm_lock_seq;
+	WRITE_ONCE(vma->vm_lock_seq, mm_lock_seq);
 	up_write(&vma->vm_lock->lock);
 	return true;
 }

include/linux/mm_types.h

Lines changed: 28 additions & 0 deletions
@@ -514,6 +514,20 @@ struct vm_area_struct {
 	};
 
 #ifdef CONFIG_PER_VMA_LOCK
+	/*
+	 * Can only be written (using WRITE_ONCE()) while holding both:
+	 *  - mmap_lock (in write mode)
+	 *  - vm_lock->lock (in write mode)
+	 * Can be read reliably while holding one of:
+	 *  - mmap_lock (in read or write mode)
+	 *  - vm_lock->lock (in read or write mode)
+	 * Can be read unreliably (using READ_ONCE()) for pessimistic bailout
+	 * while holding nothing (except RCU to keep the VMA struct allocated).
+	 *
+	 * This sequence counter is explicitly allowed to overflow; sequence
+	 * counter reuse can only lead to occasional unnecessary use of the
+	 * slowpath.
+	 */
 	int vm_lock_seq;
 	struct vma_lock *vm_lock;
 
@@ -679,6 +693,20 @@ struct mm_struct {
 					  * by mmlist_lock
 					  */
 #ifdef CONFIG_PER_VMA_LOCK
+		/*
+		 * This field has lock-like semantics, meaning it is sometimes
+		 * accessed with ACQUIRE/RELEASE semantics.
+		 * Roughly speaking, incrementing the sequence number is
+		 * equivalent to releasing locks on VMAs; reading the sequence
+		 * number can be part of taking a read lock on a VMA.
+		 *
+		 * Can be modified under write mmap_lock using RELEASE
+		 * semantics.
+		 * Can be read with no other protection when holding write
+		 * mmap_lock.
+		 * Can be read with ACQUIRE semantics if not holding write
+		 * mmap_lock.
+		 */
 		int mm_lock_seq;
 #endif
 
include/linux/mmap_lock.h

Lines changed: 8 additions & 2 deletions
@@ -76,8 +76,14 @@ static inline void mmap_assert_write_locked(struct mm_struct *mm)
 static inline void vma_end_write_all(struct mm_struct *mm)
 {
 	mmap_assert_write_locked(mm);
-	/* No races during update due to exclusive mmap_lock being held */
-	WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1);
+	/*
+	 * Nobody can concurrently modify mm->mm_lock_seq due to exclusive
+	 * mmap_lock being held.
+	 * We need RELEASE semantics here to ensure that preceding stores into
+	 * the VMA take effect before we unlock it with this store.
+	 * Pairs with ACQUIRE semantics in vma_start_read().
+	 */
+	smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
 }
 #else
 static inline void vma_end_write_all(struct mm_struct *mm) {}
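
Together with the include/linux/mm.h hunks above, this establishes a classic release/acquire pairing: vma_end_write_all() publishes the new sequence only after all preceding stores into the VMA, and a reader that observes that sequence with acquire semantics in vma_start_read() is guaranteed to see those stores. Below is a userspace sketch of that pairing, with C11 atomics standing in for smp_store_release()/smp_load_acquire(); the names and structure are illustrative, not the mm code.

/* Minimal message-passing sketch of the RELEASE/ACQUIRE contract (compile
 * with -pthread).  vma_data stands in for stores into the VMA, mm_lock_seq
 * for mm->mm_lock_seq. */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

static int vma_data;
static atomic_int mm_lock_seq;

static void *writer(void *arg)
{
	(void)arg;
	vma_data = 42;			/* store that must be visible to readers */
	/* RELEASE: vma_data is published before the new sequence becomes
	 * visible, mirroring smp_store_release() in vma_end_write_all(). */
	atomic_store_explicit(&mm_lock_seq, 1, memory_order_release);
	return NULL;
}

static void *reader(void *arg)
{
	(void)arg;
	/* ACQUIRE: once the new sequence is observed, vma_data == 42 is
	 * guaranteed to be observed too, mirroring smp_load_acquire() in
	 * vma_start_read() before the VMA is actually used. */
	while (atomic_load_explicit(&mm_lock_seq, memory_order_acquire) != 1)
		;			/* spin until the writer publishes */
	printf("reader sees vma_data = %d\n", vma_data);	/* always 42 */
	return NULL;
}

int main(void)
{
	pthread_t w, r;

	pthread_create(&r, NULL, reader, NULL);
	pthread_create(&w, NULL, writer, NULL);
	pthread_join(w, NULL);
	pthread_join(r, NULL);
	return 0;
}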

mm/damon/core-test.h

Lines changed: 5 additions & 5 deletions
@@ -320,25 +320,25 @@ static void damon_test_update_monitoring_result(struct kunit *test)
 
 static void damon_test_set_attrs(struct kunit *test)
 {
-	struct damon_ctx ctx;
+	struct damon_ctx *c = damon_new_ctx();
 	struct damon_attrs valid_attrs = {
 		.min_nr_regions = 10, .max_nr_regions = 1000,
 		.sample_interval = 5000, .aggr_interval = 100000,};
 	struct damon_attrs invalid_attrs;
 
-	KUNIT_EXPECT_EQ(test, damon_set_attrs(&ctx, &valid_attrs), 0);
+	KUNIT_EXPECT_EQ(test, damon_set_attrs(c, &valid_attrs), 0);
 
 	invalid_attrs = valid_attrs;
 	invalid_attrs.min_nr_regions = 1;
-	KUNIT_EXPECT_EQ(test, damon_set_attrs(&ctx, &invalid_attrs), -EINVAL);
+	KUNIT_EXPECT_EQ(test, damon_set_attrs(c, &invalid_attrs), -EINVAL);
 
 	invalid_attrs = valid_attrs;
 	invalid_attrs.max_nr_regions = 9;
-	KUNIT_EXPECT_EQ(test, damon_set_attrs(&ctx, &invalid_attrs), -EINVAL);
+	KUNIT_EXPECT_EQ(test, damon_set_attrs(c, &invalid_attrs), -EINVAL);
 
 	invalid_attrs = valid_attrs;
 	invalid_attrs.aggr_interval = 4999;
-	KUNIT_EXPECT_EQ(test, damon_set_attrs(&ctx, &invalid_attrs), -EINVAL);
+	KUNIT_EXPECT_EQ(test, damon_set_attrs(c, &invalid_attrs), -EINVAL);
 }
 
 static struct kunit_case damon_test_cases[] = {
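
The test previously handed damon_set_attrs() a pointer to an uninitialized on-stack damon_ctx; switching to damon_new_ctx() gives it a fully initialized context, presumably because damon_set_attrs() now touches state inside the context rather than only the new attribute values. A hypothetical standalone sketch of why that matters follows; the struct and helpers are made up and are not the DAMON API.

/* Illustrative only: a callee that walks pointers embedded in a context
 * must be given an initialized object. */
#include <stdio.h>
#include <stdlib.h>

struct node { struct node *next; };

struct ctx {
	struct node *head;		/* made-up embedded list */
	unsigned long sample_interval;
};

/* Stand-in for a damon_set_attrs()-like call that reads context state,
 * not only the new attribute values. */
static int set_attrs(struct ctx *c, unsigned long interval)
{
	for (struct node *n = c->head; n; n = n->next)
		;			/* walks garbage if c was never initialized */
	c->sample_interval = interval;
	return 0;
}

/* Stand-in for damon_new_ctx(): every field starts in a known state. */
static struct ctx *new_ctx(void)
{
	return calloc(1, sizeof(struct ctx));
}

int main(void)
{
	struct ctx *c = new_ctx();

	if (!c)
		return 1;
	printf("set_attrs: %d\n", set_attrs(c, 5000));
	free(c);
	return 0;
	/* By contrast, declaring "struct ctx c;" on the stack and passing it
	 * straight to set_attrs() would make the loop above follow an
	 * indeterminate pointer - the kind of failure the commit avoids. */
}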

mm/memory-failure.c

Lines changed: 1 addition & 1 deletion
@@ -2487,7 +2487,7 @@ int unpoison_memory(unsigned long pfn)
 		goto unlock_mutex;
 	}
 
-	if (!folio_test_hwpoison(folio)) {
+	if (!PageHWPoison(p)) {
 		unpoison_pr_info("Unpoison: Page was already unpoisoned %#lx\n",
 				 pfn, &unpoison_rs);
 		goto unlock_mutex;

mm/mmap.c

Lines changed: 1 addition & 0 deletions
@@ -615,6 +615,7 @@ static inline int dup_anon_vma(struct vm_area_struct *dst,
 	 * anon pages imported.
 	 */
 	if (src->anon_vma && !dst->anon_vma) {
+		vma_start_write(dst);
 		dst->anon_vma = src->anon_vma;
 		return anon_vma_clone(dst, src);
 	}
