Commit 3f74e6b

Yu Zhao authored and akpm00 committed
mm/mglru: fix overshooting shrinker memory
set_initial_priority() tries to jump-start global reclaim by estimating
the priority based on cold/hot LRU pages. The estimation does not account
for shrinker objects, and it cannot do so because their sizes can be in
units other than pages.

If shrinker objects are the majority, e.g., on TrueNAS SCALE 24.04.0
where ZFS ARC can use almost all system memory, set_initial_priority()
can vastly underestimate how much memory the ARC shrinker can evict and
assign extremely low values to scan_control->priority, resulting in
overshoots of shrinker objects.

To reproduce the problem, use TrueNAS SCALE 24.04.0 with 32GB DRAM, a
test ZFS pool and the following commands:

  fio --name=mglru.file --numjobs=36 --ioengine=io_uring \
      --directory=/root/test-zfs-pool/ --size=1024m --buffered=1 \
      --rw=randread --random_distribution=random \
      --time_based --runtime=1h &

  for ((i = 0; i < 20; i++))
  do
    sleep 120
    fio --name=mglru.anon --numjobs=16 --ioengine=mmap \
      --filename=/dev/zero --size=1024m --fadvise_hint=0 \
      --rw=randrw --random_distribution=random \
      --time_based --runtime=1m
  done

To fix the problem:
1. Cap scan_control->priority at or above DEF_PRIORITY/2, to prevent
   the jump-start from being overly aggressive.
2. Account for the progress from mm_account_reclaimed_pages(), to
   prevent kswapd_shrink_node() from raising the priority unnecessarily.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: e4dde56 ("mm: multi-gen LRU: per-node lru_gen_folio lists")
Signed-off-by: Yu Zhao <[email protected]>
Reported-by: Alexander Motin <[email protected]>
Cc: Wei Xu <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
1 parent 8b671fe commit 3f74e6b

1 file changed: +8 -2 lines changed


mm/vmscan.c

Lines changed: 8 additions & 2 deletions
@@ -4930,7 +4930,11 @@ static void set_initial_priority(struct pglist_data *pgdat, struct scan_control
 	/* round down reclaimable and round up sc->nr_to_reclaim */
 	priority = fls_long(reclaimable) - 1 - fls_long(sc->nr_to_reclaim - 1);
 
-	sc->priority = clamp(priority, 0, DEF_PRIORITY);
+	/*
+	 * The estimation is based on LRU pages only, so cap it to prevent
+	 * overshoots of shrinker objects by large margins.
+	 */
+	sc->priority = clamp(priority, DEF_PRIORITY / 2, DEF_PRIORITY);
 }
 
 static void lru_gen_shrink_node(struct pglist_data *pgdat, struct scan_control *sc)
@@ -6754,6 +6758,7 @@ static bool kswapd_shrink_node(pg_data_t *pgdat,
 {
 	struct zone *zone;
 	int z;
+	unsigned long nr_reclaimed = sc->nr_reclaimed;
 
 	/* Reclaim a number of pages proportional to the number of zones */
 	sc->nr_to_reclaim = 0;
@@ -6781,7 +6786,8 @@ static bool kswapd_shrink_node(pg_data_t *pgdat,
 	if (sc->order && sc->nr_reclaimed >= compact_gap(sc->order))
 		sc->order = 0;
 
-	return sc->nr_scanned >= sc->nr_to_reclaim;
+	/* account for progress from mm_account_reclaimed_pages() */
+	return max(sc->nr_scanned, sc->nr_reclaimed - nr_reclaimed) >= sc->nr_to_reclaim;
 }
 
 /* Page allocator PCP high watermark is lowered if reclaim is active. */
