Commit c4476fa

Kiryl Shutsemau authored and gregkh committed
mm/memory: do not populate page table entries beyond i_size
commit 74207de upstream.

Patch series "Fix SIGBUS semantics with large folios", v3.

Accessing memory within a VMA, but beyond i_size rounded up to the next
page size, is supposed to generate SIGBUS.

Darrick reported[1] an xfstests regression in v6.18-rc1. generic/749
failed due to missing SIGBUS. This was caused by my recent changes that
try to fault in the whole folio where possible:

	19773df ("mm/fault: try to map the entire file folio in finish_fault()")
	357b927 ("mm/filemap: map entire large folio faultaround")

These changes did not consider i_size when setting up PTEs, leading to
xfstest breakage.

However, the problem has been present in the kernel for a long time -
since huge tmpfs was introduced in 2016. The kernel happily maps
PMD-sized folios as PMD without checking i_size. And huge=always tmpfs
allocates PMD-size folios on any writes.

I considered this corner case when I implemented a large tmpfs, and my
conclusion was that no one in their right mind should rely on receiving
a SIGBUS signal when accessing beyond i_size. I cannot imagine how it
could be useful for the workload.

But apparently filesystem folks care a lot about preserving strict
SIGBUS semantics.

Generic/749 was introduced last year with reference to POSIX, but no
real workloads were mentioned. It also acknowledged the tmpfs deviation
from the test case.

POSIX indeed says[3]:

	References within the address range starting at pa and
	continuing for len bytes to whole pages following the end of an
	object shall result in delivery of a SIGBUS signal.

The patchset fixes the regression introduced by recent changes as well
as more subtle SIGBUS breakage due to split failure on truncation.

This patch (of 2):

Accesses within VMA, but beyond i_size rounded up to PAGE_SIZE are
supposed to generate SIGBUS.

Recent changes attempted to fault in full folio where possible. They did
not respect i_size, which led to populating PTEs beyond i_size and
breaking SIGBUS semantics.

Darrick reported generic/749 breakage because of this.

However, the problem existed before the recent changes. With huge=always
tmpfs, any write to a file leads to PMD-size allocation. Following the
fault-in of the folio will install PMD mapping regardless of i_size.

Fix filemap_map_pages() and finish_fault() to not install:
  - PTEs beyond i_size;
  - PMD mappings across i_size;

Make an exception for shmem/tmpfs that for long time intentionally
mapped with PMDs across i_size.

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Kiryl Shutsemau <[email protected]>
Fixes: 6795801 ("xfs: Support large folios")
Reported-by: "Darrick J. Wong" <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Kiryl Shutsemau <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
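The SIGBUS rule described in the commit message can be observed from userspace. The following is a minimal sketch, not taken from the patch or from generic/749: it maps two pages of a one-byte temporary file and probes one address in each page. The helper names (`probe_beyond_eof`, `read_raises_sigbus`) and the `/tmp` path template are assumptions made for illustration. On most filesystems one would expect the read inside the first page to succeed and the read in the second page to raise SIGBUS; as the message notes, tmpfs with huge pages has historically deviated.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_jmp;

static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(sigbus_jmp, 1);
}

/* Returns 1 if reading *addr raised SIGBUS, 0 if the read succeeded. */
static int read_raises_sigbus(const volatile char *addr)
{
	char c;

	if (sigsetjmp(sigbus_jmp, 1))
		return 1;
	c = *addr;	/* volatile read: may fault */
	(void)c;
	return 0;
}

/*
 * Hypothetical helper: map two pages of a one-byte file and probe both.
 * Stores 0/1 per page (1 == SIGBUS). Returns 0 on success, -1 on setup
 * failure.
 */
static int probe_beyond_eof(int *first_page, int *second_page)
{
	char path[] = "/tmp/sigbus-demo-XXXXXX";	/* assumed location */
	long page = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old;
	char *map;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	if (write(fd, "x", 1) != 1) {
		close(fd);
		return -1;
	}

	/* The VMA covers two pages, but i_size is a single byte. */
	map = mmap(NULL, 2 * (size_t)page, PROT_READ, MAP_SHARED, fd, 0);
	close(fd);
	if (map == MAP_FAILED)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	sigaction(SIGBUS, &sa, &old);

	/* Beyond i_size but inside the page that contains EOF. */
	*first_page = read_raises_sigbus(map + 1);
	/* A whole page past EOF: the case the patch is about. */
	*second_page = read_raises_sigbus(map + page);

	sigaction(SIGBUS, &old, NULL);
	munmap(map, 2 * (size_t)page);
	return 0;
}
```

Calling `probe_beyond_eof(&a, &b)` from a small `main()` and printing `a` and `b` is enough to compare filesystems or kernels.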
1 parent df92165 commit c4476fa

2 files changed: +36 -7 lines changed

mm/filemap.c

Lines changed: 15 additions & 5 deletions

@@ -3653,13 +3653,27 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 	vm_fault_t ret = 0;
 	unsigned long rss = 0;
 	unsigned int nr_pages = 0, mmap_miss = 0, mmap_miss_saved, folio_type;
+	bool can_map_large;
 
 	rcu_read_lock();
 	folio = next_uptodate_folio(&xas, mapping, end_pgoff);
 	if (!folio)
 		goto out;
 
-	if (filemap_map_pmd(vmf, folio, start_pgoff)) {
+	file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
+	end_pgoff = min(end_pgoff, file_end);
+
+	/*
+	 * Do not allow to map with PTEs beyond i_size and with PMD
+	 * across i_size to preserve SIGBUS semantics.
+	 *
+	 * Make an exception for shmem/tmpfs that for long time
+	 * intentionally mapped with PMDs across i_size.
+	 */
+	can_map_large = shmem_mapping(mapping) ||
+		file_end >= folio_next_index(folio);
+
+	if (can_map_large && filemap_map_pmd(vmf, folio, start_pgoff)) {
 		ret = VM_FAULT_NOPAGE;
 		goto out;
 	}
@@ -3672,10 +3686,6 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf,
 		goto out;
 	}
 
-	file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
-	if (end_pgoff > file_end)
-		end_pgoff = file_end;
-
 	folio_type = mm_counter_file(folio);
 	do {
 		unsigned long end;
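To make the arithmetic in the filemap_map_pages() hunk concrete, here is a small userspace sketch of the `file_end` clamp and the `can_map_large` predicate. It is an illustration only: the `demo_*` names are hypothetical, the 4 KiB page size is an assumption, and plain integers stand in for `i_size_read()`, `shmem_mapping()` and `folio_next_index()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB base pages. */
#define DEMO_PAGE_SIZE 4096ULL
#define DEMO_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Index of the last page that holds file data; i_size must be > 0 here. */
static uint64_t demo_file_end(uint64_t i_size)
{
	return DEMO_DIV_ROUND_UP(i_size, DEMO_PAGE_SIZE) - 1;
}

/* Mirrors: end_pgoff = min(end_pgoff, file_end); */
static uint64_t demo_clamp_end_pgoff(uint64_t end_pgoff, uint64_t i_size)
{
	uint64_t file_end = demo_file_end(i_size);

	return end_pgoff < file_end ? end_pgoff : file_end;
}

/*
 * Mirrors: can_map_large = shmem_mapping(mapping) ||
 *                          file_end >= folio_next_index(folio);
 * folio_next_index is the page index just past the folio.
 */
static bool demo_can_map_large(bool is_shmem, uint64_t i_size,
			       uint64_t folio_next_index)
{
	return is_shmem || demo_file_end(i_size) >= folio_next_index;
}
```

For example, with a 1 MiB file and a PMD-sized folio covering page indices 0 to 511 (next index 512), `demo_file_end()` is 255, so `demo_can_map_large(false, 1 << 20, 512)` is false while the shmem variant stays true, which matches the exception described in the comment.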

mm/memory.c

Lines changed: 21 additions & 2 deletions

@@ -68,6 +68,7 @@
 #include <linux/gfp.h>
 #include <linux/migrate.h>
 #include <linux/string.h>
+#include <linux/shmem_fs.h>
 #include <linux/memory-tiers.h>
 #include <linux/debugfs.h>
 #include <linux/userfaultfd_k.h>
@@ -5088,6 +5089,8 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 	else
 		page = vmf->page;
 
+	folio = page_folio(page);
+
 	/*
 	 * check even for read faults because we might have lost our CoWed
 	 * page
@@ -5098,8 +5101,25 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 			return ret;
 	}
 
+	if (!needs_fallback && vma->vm_file) {
+		struct address_space *mapping = vma->vm_file->f_mapping;
+		pgoff_t file_end;
+
+		file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
+
+		/*
+		 * Do not allow to map with PTEs beyond i_size and with PMD
+		 * across i_size to preserve SIGBUS semantics.
+		 *
+		 * Make an exception for shmem/tmpfs that for long time
+		 * intentionally mapped with PMDs across i_size.
+		 */
+		needs_fallback = !shmem_mapping(mapping) &&
+			file_end < folio_next_index(folio);
+	}
+
 	if (pmd_none(*vmf->pmd)) {
-		if (PageTransCompound(page)) {
+		if (!needs_fallback && PageTransCompound(page)) {
 			ret = do_set_pmd(vmf, page);
 			if (ret != VM_FAULT_FALLBACK)
 				return ret;
@@ -5111,7 +5131,6 @@ vm_fault_t finish_fault(struct vm_fault *vmf)
 			return VM_FAULT_OOM;
 	}
 
-	folio = page_folio(page);
 	nr_pages = folio_nr_pages(folio);
 
 	/*
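The `needs_fallback` check added to finish_fault() can be modelled the same way in userspace. Note that in this hunk `file_end` is a page count (there is no `- 1`), unlike the last-page index computed in the filemap_map_pages() hunk. The sketch below assumes 4 KiB pages and uses hypothetical `sketch_*` names, with plain integers standing in for the kernel helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for illustration: 4 KiB base pages. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Number of pages needed to hold i_size bytes: DIV_ROUND_UP(i_size, PAGE_SIZE). */
static uint64_t sketch_file_pages(uint64_t i_size)
{
	return (i_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/*
 * Mirrors: needs_fallback = !shmem_mapping(mapping) &&
 *                           file_end < folio_next_index(folio);
 * A true result means the whole folio must not be mapped at once.
 */
static bool sketch_needs_fallback(bool is_shmem, uint64_t i_size,
				  uint64_t folio_next_index)
{
	return !is_shmem && sketch_file_pages(i_size) < folio_next_index;
}
```

With a folio covering page indices 0 to 511 (next index 512), a 1 MiB non-shmem file gives 256 pages, so `sketch_needs_fallback(false, 1 << 20, 512)` is true. A file of exactly 2 MiB gives 512 pages, and the check does not force a fallback in that case.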
