Commit 9d57090

Barry Song authored and akpm00 committed
mm: fix swap_read_folio_zeromap() for large folios with partial zeromap
Patch series "mm: enable large folios swap-in support", v9.

Currently, we support mTHP swapout but not swapin. This means that once mTHP is swapped out, it will come back as small folios when swapped in. This is particularly detrimental for devices like Android, where more than half of the memory is in swap.

The lack of mTHP swapin functionality makes mTHP a showstopper in scenarios that heavily rely on swap. This patchset introduces mTHP swap-in support. It starts with synchronous devices similar to zRAM, aiming to benefit as many users as possible with minimal changes.

This patch (of 3):

There could be a corner case where the first entry is non-zeromap, but a subsequent entry is zeromap. In this case, we should not let swap_read_folio_zeromap() return false since we will still read corrupted data.

Additionally, the iteration of test_bit() is unnecessary and can be replaced with bitmap operations, which are more efficient.

We can adopt the style of swap_pte_batch() and folio_pte_batch() to introduce swap_zeromap_batch() which seems to provide the greatest flexibility for the caller. This approach allows the caller to either check if the zeromap status of all entries is consistent or determine the number of contiguous entries with the same status.

Since swap_read_folio() can't handle reading a large folio that's partially zeromap and partially non-zeromap, we've moved the code to mm/swap.h so that others, like those working on swap-in, can access it.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 0ca0c24 ("mm: store zero pages to be swapped out in a bitmap")
Signed-off-by: Barry Song <[email protected]>
Reviewed-by: Yosry Ahmed <[email protected]>
Reviewed-by: Usama Arif <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Chris Li <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Chuanhua Han <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Kairui Song <[email protected]>
Cc: Kairui Song <[email protected]>
Cc: Kalesh Singh <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nhat Pham <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Cc: Yang Shi <[email protected]>
Cc: Kanchana P Sridhar <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
1 parent a0c9fd2 commit 9d57090

2 files changed, +40 -25 lines changed

mm/page_io.c

Lines changed: 7 additions & 25 deletions
@@ -226,26 +226,6 @@ static void swap_zeromap_folio_clear(struct folio *folio)
 	}
 }
 
-/*
- * Return the index of the first subpage which is not zero-filled
- * according to swap_info_struct->zeromap.
- * If all pages are zero-filled according to zeromap, it will return
- * folio_nr_pages(folio).
- */
-static unsigned int swap_zeromap_folio_test(struct folio *folio)
-{
-	struct swap_info_struct *sis = swp_swap_info(folio->swap);
-	swp_entry_t entry;
-	unsigned int i;
-
-	for (i = 0; i < folio_nr_pages(folio); i++) {
-		entry = page_swap_entry(folio_page(folio, i));
-		if (!test_bit(swp_offset(entry), sis->zeromap))
-			return i;
-	}
-	return i;
-}
-
 /*
  * We may have stale swap cache pages in memory: notice
  * them here and get rid of the unnecessary final write.
@@ -522,19 +502,21 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
 
 static bool swap_read_folio_zeromap(struct folio *folio)
 {
-	unsigned int idx = swap_zeromap_folio_test(folio);
-
-	if (idx == 0)
-		return false;
+	int nr_pages = folio_nr_pages(folio);
+	bool is_zeromap;
 
 	/*
 	 * Swapping in a large folio that is partially in the zeromap is not
 	 * currently handled. Return true without marking the folio uptodate so
 	 * that an IO error is emitted (e.g. do_swap_page() will sigbus).
 	 */
-	if (WARN_ON_ONCE(idx < folio_nr_pages(folio)))
+	if (WARN_ON_ONCE(swap_zeromap_batch(folio->swap, nr_pages,
+			&is_zeromap) != nr_pages))
 		return true;
 
+	if (!is_zeromap)
+		return false;
+
 	folio_zero_range(folio, 0, folio_size(folio));
 	folio_mark_uptodate(folio);
 	return true;
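
To make the corner case from the commit message concrete, here is a minimal userspace sketch that contrasts the old and new decision logic of swap_read_folio_zeromap(). It is not kernel code: the zeromap is modelled as a plain bool array, and the names zeromap_result, old_decision() and new_decision() are hypothetical, introduced only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical labels for what the caller, swap_read_folio(), ends up doing. */
enum zeromap_result {
	READ_FROM_BACKEND,	/* helper returned false: normal swap read follows */
	FILLED_WITH_ZEROES,	/* helper returned true, folio zeroed and uptodate */
	REJECTED_MIXED,		/* helper returned true, folio left !uptodate */
};

/* Old logic: only the index of the first non-zeromap page is examined. */
static enum zeromap_result old_decision(const bool *zeromap, int nr_pages)
{
	int idx;

	for (idx = 0; idx < nr_pages; idx++)
		if (!zeromap[idx])
			break;
	if (idx == 0)
		return READ_FROM_BACKEND;	/* later zeromap pages go unnoticed */
	if (idx < nr_pages)
		return REJECTED_MIXED;
	return FILLED_WITH_ZEROES;
}

/* New logic: every page must share the zeromap status of the first page. */
static enum zeromap_result new_decision(const bool *zeromap, int nr_pages)
{
	bool first = zeromap[0];
	int batch;

	for (batch = 1; batch < nr_pages; batch++)
		if (zeromap[batch] != first)
			break;
	if (batch != nr_pages)
		return REJECTED_MIXED;
	return first ? FILLED_WITH_ZEROES : READ_FROM_BACKEND;
}
```

For a four-page folio whose zeromap bits are { 0, 1, 0, 0 }, old_decision() yields READ_FROM_BACKEND while new_decision() yields REJECTED_MIXED; for uniformly set or uniformly clear ranges the two agree.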

mm/swap.h

Lines changed: 33 additions & 0 deletions
@@ -80,6 +80,32 @@ static inline unsigned int folio_swap_flags(struct folio *folio)
 {
 	return swp_swap_info(folio->swap)->flags;
 }
+
+/*
+ * Return the count of contiguous swap entries that share the same
+ * zeromap status as the starting entry. If is_zeromap is not NULL,
+ * it will return the zeromap status of the starting entry.
+ */
+static inline int swap_zeromap_batch(swp_entry_t entry, int max_nr,
+		bool *is_zeromap)
+{
+	struct swap_info_struct *sis = swp_swap_info(entry);
+	unsigned long start = swp_offset(entry);
+	unsigned long end = start + max_nr;
+	bool first_bit;
+
+	first_bit = test_bit(start, sis->zeromap);
+	if (is_zeromap)
+		*is_zeromap = first_bit;
+
+	if (max_nr <= 1)
+		return max_nr;
+	if (first_bit)
+		return find_next_zero_bit(sis->zeromap, end, start) - start;
+	else
+		return find_next_bit(sis->zeromap, end, start) - start;
+}
+
 #else /* CONFIG_SWAP */
 struct swap_iocb;
 static inline void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
@@ -171,6 +197,13 @@ static inline unsigned int folio_swap_flags(struct folio *folio)
 {
 	return 0;
 }
+
+static inline int swap_zeromap_batch(swp_entry_t entry, int max_nr,
+		bool *has_zeromap)
+{
+	return 0;
+}
+
 #endif /* CONFIG_SWAP */
 
 #endif /* _MM_SWAP_H */
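
The contract of swap_zeromap_batch() can be tried outside the kernel with a rough userspace model. The sketch below assumes a bare unsigned long bitmap and a starting offset in place of swp_entry_t and swap_info_struct, and it replaces the kernel's find_next_bit()/find_next_zero_bit() with a naive loop. The names model_test_bit(), model_find_next() and model_zeromap_batch() are hypothetical and exist only for this illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Userspace stand-in for the kernel's test_bit(). */
static bool model_test_bit(unsigned long nr, const unsigned long *map)
{
	return (map[nr / MODEL_BITS_PER_LONG] >> (nr % MODEL_BITS_PER_LONG)) & 1UL;
}

/*
 * Naive stand-in for find_next_bit()/find_next_zero_bit(): return the first
 * index in [start, end) whose bit equals `want`, or `end` if there is none.
 */
static unsigned long model_find_next(const unsigned long *map,
				     unsigned long end, unsigned long start,
				     bool want)
{
	unsigned long i;

	for (i = start; i < end; i++)
		if (model_test_bit(i, map) == want)
			break;
	return i;
}

/* Mirrors the swap_zeromap_batch() contract on a plain bitmap and offset. */
static int model_zeromap_batch(const unsigned long *zeromap,
			       unsigned long start, int max_nr,
			       bool *is_zeromap)
{
	unsigned long end = start + max_nr;
	bool first_bit = model_test_bit(start, zeromap);

	if (is_zeromap)
		*is_zeromap = first_bit;
	if (max_nr <= 1)
		return max_nr;
	/* Stop at the first entry whose status differs from the first one. */
	return (int)(model_find_next(zeromap, end, start, !first_bit) - start);
}
```

With a bitmap of 0xCF (offsets 0-3 and 6-7 set, offsets 4-5 clear), a call starting at offset 0 with max_nr 8 reports a batch of 4 zeromap entries, and a call starting at offset 4 with max_nr 4 reports a batch of 2 non-zeromap entries. A caller that needs the whole range to be consistent compares the return value against max_nr, as swap_read_folio_zeromap() does.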
