Skip to content

Commit aa6f8b2

Browse files
johnhubbardakpm00
authored andcommitted
mm/gup: stop leaking pinned pages in low memory conditions
If a driver tries to call any of the pin_user_pages*(FOLL_LONGTERM) family of functions, and requests "too many" pages, then the call will erroneously leave pages pinned. This is visible in user space as an actual memory leak. Repro is trivial: just make enough pin_user_pages(FOLL_LONGTERM) calls to exhaust memory. The root cause of the problem is this sequence, within __gup_longterm_locked(): __get_user_pages_locked() rc = check_and_migrate_movable_pages() ...which gets retried in a loop. The loop error handling is incomplete, clearly due to a somewhat unusual and complicated tri-state error API. But anyway, if -ENOMEM, or in fact, any unexpected error is returned from check_and_migrate_movable_pages(), then __gup_longterm_locked() happily returns the error, while leaving the pages pinned. In the failed case, which is an app that requests (via a device driver) 30720000000 bytes to be pinned, and then exits, I see this: $ grep foll /proc/vmstat nr_foll_pin_acquired 7502048 nr_foll_pin_released 2048 And after applying this patch, it returns to balanced pins: $ grep foll /proc/vmstat nr_foll_pin_acquired 7502048 nr_foll_pin_released 7502048 Note that the child routine, check_and_migrate_movable_folios(), avoids this problem, by unpinning any folios in the **folios argument, before returning an error. Fix this by making check_and_migrate_movable_pages() behave in exactly the same way as check_and_migrate_movable_folios(): unpin all pages in **pages, before returning an error. Also, documentation was an aggravating factor, so: 1) Consolidate the documentation for these two routines, now that they have identical external behavior. 2) Rewrite the consolidated documentation: a) Clearly list the three return code cases, and what happens in each case. b) Mention that one of the cases unpins the pages or folios, before returning an error code. Link: https://lkml.kernel.org/r/[email protected] Fixes: 24a9599 ("mm/gup.c: simplify and fix check_and_migrate_movable_pages() return codes") Signed-off-by: John Hubbard <[email protected]> Reviewed-by: Alistair Popple <[email protected]> Suggested-by: David Hildenbrand <[email protected]> Cc: Shigeru Yoshida <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Pasha Tatashin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
1 parent 01626a1 commit aa6f8b2

File tree

1 file changed

+19
-14
lines changed

1 file changed

+19
-14
lines changed

mm/gup.c

Lines changed: 19 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -2394,20 +2394,25 @@ static int migrate_longterm_unpinnable_folios(
23942394
}
23952395

23962396
/*
2397-
* Check whether all folios are *allowed* to be pinned indefinitely (longterm).
2397+
* Check whether all folios are *allowed* to be pinned indefinitely (long term).
23982398
* Rather confusingly, all folios in the range are required to be pinned via
23992399
* FOLL_PIN, before calling this routine.
24002400
*
2401-
* If any folios in the range are not allowed to be pinned, then this routine
2402-
* will migrate those folios away, unpin all the folios in the range and return
2403-
* -EAGAIN. The caller should re-pin the entire range with FOLL_PIN and then
2404-
* call this routine again.
2401+
* Return values:
24052402
*
2406-
* If an error other than -EAGAIN occurs, this indicates a migration failure.
2407-
* The caller should give up, and propagate the error back up the call stack.
2408-
*
2409-
* If everything is OK and all folios in the range are allowed to be pinned,
2403+
* 0: if everything is OK and all folios in the range are allowed to be pinned,
24102404
* then this routine leaves all folios pinned and returns zero for success.
2405+
*
2406+
* -EAGAIN: if any folios in the range are not allowed to be pinned, then this
2407+
* routine will migrate those folios away, unpin all the folios in the range. If
2408+
* migration of the entire set of folios succeeds, then -EAGAIN is returned. The
2409+
* caller should re-pin the entire range with FOLL_PIN and then call this
2410+
* routine again.
2411+
*
2412+
* -ENOMEM, or any other -errno: if an error *other* than -EAGAIN occurs, this
2413+
* indicates a migration failure. The caller should give up, and propagate the
2414+
* error back up the call stack. The caller does not need to unpin any folios in
2415+
* that case, because this routine will do the unpinning.
24112416
*/
24122417
static long check_and_migrate_movable_folios(unsigned long nr_folios,
24132418
struct folio **folios)
@@ -2425,10 +2430,8 @@ static long check_and_migrate_movable_folios(unsigned long nr_folios,
24252430
}
24262431

24272432
/*
2428-
* This routine just converts all the pages in the @pages array to folios and
2429-
* calls check_and_migrate_movable_folios() to do the heavy lifting.
2430-
*
2431-
* Please see the check_and_migrate_movable_folios() documentation for details.
2433+
* Return values and behavior are the same as those for
2434+
* check_and_migrate_movable_folios().
24322435
*/
24332436
static long check_and_migrate_movable_pages(unsigned long nr_pages,
24342437
struct page **pages)
@@ -2437,8 +2440,10 @@ static long check_and_migrate_movable_pages(unsigned long nr_pages,
24372440
long i, ret;
24382441

24392442
folios = kmalloc_array(nr_pages, sizeof(*folios), GFP_KERNEL);
2440-
if (!folios)
2443+
if (!folios) {
2444+
unpin_user_pages(pages, nr_pages);
24412445
return -ENOMEM;
2446+
}
24422447

24432448
for (i = 0; i < nr_pages; i++)
24442449
folios[i] = page_folio(pages[i]);

0 commit comments

Comments
 (0)