Commit 7170130
x86/mm/init: Handle the special case of device private pages in add_pages(), to not increase max_pfn and trigger dma_addressing_limited() bounce buffers
As Bert Karwatzki reported, the following recent commit causes a
performance regression on AMD iGPU and dGPU systems:
7ffb791 ("x86/kaslr: Reduce KASLR entropy on most x86 systems")
It exposed a bug with nokaslr and zone device interaction.
The root cause of the bug is that, the GPU driver registers a zone
device private memory region. When KASLR is disabled or the above commit
is applied, the direct_map_physmem_end is set to much higher than 10 TiB
typically to the 64TiB address. When zone device private memory is added
to the system via add_pages(), it bumps up the max_pfn to the same
value. This causes dma_addressing_limited() to return true, since the
device cannot address memory all the way up to max_pfn.
This caused a regression for games played on the iGPU, as it resulted in
the DMA32 zone being used for GPU allocations.
Fix this by not bumping up max_pfn on x86 systems, when pgmap is passed
into add_pages(). The presence of pgmap is used to determine if device
private memory is being added via add_pages().
More details:
devm_request_mem_region() and request_free_mem_region() request for
device private memory. iomem_resource is passed as the base resource
with start and end parameters. iomem_resource's end depends on several
factors, including the platform and virtualization. On x86 for example
on bare metal, this value is set to boot_cpu_data.x86_phys_bits.
boot_cpu_data.x86_phys_bits can change depending on support for MKTME.
By default it is set to the same as log2(direct_map_physmem_end) which
is 46 to 52 bits depending on the number of levels in the page table.
The allocation routines used iomem_resource's end and
direct_map_physmem_end to figure out where to allocate the region.
[ arch/powerpc is also impacted by this problem, but this patch does not fix
the issue for PowerPC. ]
Testing:
1. Tested on a virtual machine with test_hmm for zone device inseration
2. A previous version of this patch was tested by Bert, please see:
https://lore.kernel.org/lkml/[email protected]/
[ mingo: Clarified the comments and the changelog. ]
Reported-by: Bert Karwatzki <[email protected]>
Tested-by: Bert Karwatzki <[email protected]>
Fixes: 7ffb791 ("x86/kaslr: Reduce KASLR entropy on most x86 systems")
Signed-off-by: Balbir Singh <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Pierre-Eric Pelloux-Prayer <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Christian König <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Simona Vetter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]1 parent f710202 commit 7170130
1 file changed
+12
-3
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
959 | 959 | | |
960 | 960 | | |
961 | 961 | | |
962 | | - | |
963 | | - | |
964 | | - | |
| 962 | + | |
| 963 | + | |
| 964 | + | |
| 965 | + | |
| 966 | + | |
| 967 | + | |
| 968 | + | |
| 969 | + | |
| 970 | + | |
| 971 | + | |
| 972 | + | |
| 973 | + | |
965 | 974 | | |
966 | 975 | | |
967 | 976 | | |
| |||
0 commit comments