Commit 63bef48

Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:

 - a lot more of MM, quite a bit more yet to come: (memcg, pagemap,
   vmalloc, pagealloc, migration, thp, ksm, madvise, virtio,
   userfaultfd, memory-hotplug, shmem, rmap, zswap, zsmalloc, cleanups)

 - various other subsystems (procfs, misc, MAINTAINERS, bitops, lib,
   checkpatch, epoll, binfmt, kallsyms, reiserfs, kmod, gcov, kconfig,
   ubsan, fault-injection, ipc)

* emailed patches from Andrew Morton <[email protected]>: (158 commits)
  ipc/shm.c: make compat_ksys_shmctl() static
  ipc/mqueue.c: fix a brace coding style issue
  lib/Kconfig.debug: fix a typo "capabilitiy" -> "capability"
  ubsan: include bug type in report header
  kasan: unset panic_on_warn before calling panic()
  ubsan: check panic_on_warn
  drivers/misc/lkdtm/bugs.c: add arithmetic overflow and array bounds checks
  ubsan: split "bounds" checker from other options
  ubsan: add trap instrumentation option
  init/Kconfig: clean up ANON_INODES and old IO schedulers options
  kernel/gcov/fs.c: replace zero-length array with flexible-array member
  gcov: gcc_3_4: replace zero-length array with flexible-array member
  gcov: gcc_4_7: replace zero-length array with flexible-array member
  kernel/kmod.c: fix a typo "assuems" -> "assumes"
  reiserfs: clean up several indentation issues
  kallsyms: unexport kallsyms_lookup_name() and kallsyms_on_each_symbol()
  samples/hw_breakpoint: drop use of kallsyms_lookup_name()
  samples/hw_breakpoint: drop HW_BREAKPOINT_R when reporting writes
  fs/binfmt_elf.c: don't free interpreter's ELF pheaders on common path
  fs/binfmt_elf.c: allocate less for static executable
  ...
2 parents 04de788 + 1cd377b commit 63bef48

169 files changed: +3630 -1190 lines changed

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 11 additions & 2 deletions
@@ -2573,13 +2573,22 @@
  For details see: Documentation/admin-guide/hw-vuln/mds.rst

  mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
- Amount of memory to be used when the kernel is not able
- to see the whole system memory or for test.
+ Amount of memory to be used in cases as follows:
+
+ 1 for test;
+ 2 when the kernel is not able to see the whole system memory;
+ 3 memory that lies after 'mem=' boundary is excluded from
+ the hypervisor, then assigned to KVM guests.
+
  [X86] Work as limiting max address. Use together
  with memmap= to avoid physical address space collisions.
  Without memmap= PCI devices could be placed at addresses
  belonging to unused RAM.

+ Note that this only takes effects during boot time since
+ in above case 3, memory may need be hot added after boot
+ if system memory of hypervisor is not sufficient.
+
  mem=nopentium [BUGS=X86-32] Disable usage of 4MB pages for kernel
  memory.
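The `mem=nn[KMG]` syntax documented in this hunk can be illustrated with a small userspace sketch. It is a simplified model, not the kernel's own parser: the helper name `parse_mem_size` is hypothetical, and it assumes a decimal number followed by at most one `K`, `M` or `G` suffix.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: convert the "nn[KMG]" value of a mem= option into
 * a byte count. Simplified userspace model, not the kernel's parser.
 * Returns 0 on success, -1 on malformed input or overflow.
 */
int parse_mem_size(const char *s, uint64_t *bytes)
{
	char *end;
	unsigned long long n;
	unsigned int shift = 0;

	/* Require at least one leading decimal digit. */
	if (s == NULL || *s < '0' || *s > '9')
		return -1;

	errno = 0;
	n = strtoull(s, &end, 10);
	if (errno != 0)
		return -1;

	/* Optional binary suffix: K = 2^10, M = 2^20, G = 2^30. */
	switch (*end) {
	case 'K': case 'k': shift = 10; end++; break;
	case 'M': case 'm': shift = 20; end++; break;
	case 'G': case 'g': shift = 30; end++; break;
	case '\0': break;
	default: return -1;
	}

	/* Reject trailing characters and values that would overflow. */
	if (*end != '\0')
		return -1;
	if (shift != 0 && n > (UINT64_MAX >> shift))
		return -1;

	*bytes = (uint64_t)n << shift;
	return 0;
}
```

Under these assumptions, `mem=512M` would correspond to a limit of 512 * 2^20 bytes; anything with an unknown suffix is rejected rather than guessed at.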

Documentation/admin-guide/mm/transhuge.rst

Lines changed: 14 additions & 0 deletions
@@ -310,6 +310,11 @@ thp_fault_fallback
  is incremented if a page fault fails to allocate
  a huge page and instead falls back to using small pages.

+ thp_fault_fallback_charge
+ is incremented if a page fault fails to charge a huge page and
+ instead falls back to using small pages even though the
+ allocation was successful.
+
  thp_collapse_alloc_failed
  is incremented if khugepaged found a range
  of pages that should be collapsed into one huge page but failed
@@ -319,6 +324,15 @@ thp_file_alloc
  is incremented every time a file huge page is successfully
  allocated.

+ thp_file_fallback
+ is incremented if a file huge page is attempted to be allocated
+ but fails and instead falls back to using small pages.
+
+ thp_file_fallback_charge
+ is incremented if a file huge page cannot be charged and instead
+ falls back to using small pages even though the allocation was
+ successful.
+
  thp_file_mapped
  is incremented every time a file huge page is mapped into
  user address space.
Documentation/admin-guide/mm/userfaultfd.rst

Lines changed: 51 additions & 0 deletions
@@ -108,6 +108,57 @@ UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see an
  half copied page since it'll keep userfaulting until the copy has
  finished.

+ Notes:
+
+ - If you requested UFFDIO_REGISTER_MODE_MISSING when registering then
+ you must provide some kind of page in your thread after reading from
+ the uffd. You must provide either UFFDIO_COPY or UFFDIO_ZEROPAGE.
+ The normal behavior of the OS automatically providing a zero page on
+ an annonymous mmaping is not in place.
+
+ - None of the page-delivering ioctls default to the range that you
+ registered with. You must fill in all fields for the appropriate
+ ioctl struct including the range.
+
+ - You get the address of the access that triggered the missing page
+ event out of a struct uffd_msg that you read in the thread from the
+ uffd. You can supply as many pages as you want with UFFDIO_COPY or
+ UFFDIO_ZEROPAGE. Keep in mind that unless you used DONTWAKE then
+ the first of any of those IOCTLs wakes up the faulting thread.
+
+ - Be sure to test for all errors including (pollfd[0].revents &
+ POLLERR). This can happen, e.g. when ranges supplied were
+ incorrect.
+
+ Write Protect Notifications
+ ---------------------------
+
+ This is equivalent to (but faster than) using mprotect and a SIGSEGV
+ signal handler.
+
+ Firstly you need to register a range with UFFDIO_REGISTER_MODE_WP.
+ Instead of using mprotect(2) you use ioctl(uffd, UFFDIO_WRITEPROTECT,
+ struct *uffdio_writeprotect) while mode = UFFDIO_WRITEPROTECT_MODE_WP
+ in the struct passed in. The range does not default to and does not
+ have to be identical to the range you registered with. You can write
+ protect as many ranges as you like (inside the registered range).
+ Then, in the thread reading from uffd the struct will have
+ msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP set. Now you send
+ ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect) again
+ while pagefault.mode does not have UFFDIO_WRITEPROTECT_MODE_WP set.
+ This wakes up the thread which will continue to run with writes. This
+ allows you to do the bookkeeping about the write in the uffd reading
+ thread before the ioctl.
+
+ If you registered with both UFFDIO_REGISTER_MODE_MISSING and
+ UFFDIO_REGISTER_MODE_WP then you need to think about the sequence in
+ which you supply a page and undo write protect. Note that there is a
+ difference between writes into a WP area and into a !WP area. The
+ former will have UFFD_PAGEFAULT_FLAG_WP set, the latter
+ UFFD_PAGEFAULT_FLAG_WRITE. The latter did not fail on protection but
+ you still need to supply a page when UFFDIO_REGISTER_MODE_MISSING was
+ used.
+
  QEMU/KVM
  ========
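The write-protect sequence described in the added text can be sketched in C as below. The helper names are hypothetical, the code assumes UAPI headers that already define `UFFDIO_WRITEPROTECT`, and it leaves out creating the uffd, registering the range with UFFDIO_REGISTER_MODE_WP, and the read loop.

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical helper: build the UFFDIO_WRITEPROTECT argument. The range
 * is never defaulted by the kernel, so it is always filled in explicitly.
 */
struct uffdio_writeprotect make_wp_request(uint64_t start, uint64_t len,
					   int protect)
{
	struct uffdio_writeprotect wp;

	memset(&wp, 0, sizeof(wp));
	wp.range.start = start;
	wp.range.len = len;
	/* Setting MODE_WP protects the range; clearing it undoes that. */
	wp.mode = protect ? UFFDIO_WRITEPROTECT_MODE_WP : 0;
	return wp;
}

/* Write-protect a range that lies inside the registered area. */
int uffd_write_protect(int uffd, uint64_t start, uint64_t len)
{
	struct uffdio_writeprotect wp = make_wp_request(start, len, 1);

	return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}

/* Remove write protection; without DONTWAKE this wakes the faulting thread. */
int uffd_write_unprotect(int uffd, uint64_t start, uint64_t len)
{
	struct uffdio_writeprotect wp = make_wp_request(start, len, 0);

	return ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);
}

/* True if a message read from the uffd reports a write-protect fault. */
int fault_is_wp(const struct uffd_msg *msg)
{
	return msg->event == UFFD_EVENT_PAGEFAULT &&
	       (msg->arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP) != 0;
}
```

In the reading thread, one plausible flow is to check `fault_is_wp()` on each message, do the bookkeeping for the write, and only then call `uffd_write_unprotect()` on the page containing `msg.arg.pagefault.address`.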

Lines changed: 40 additions & 0 deletions
@@ -0,0 +1,40 @@
+ .. _free_page_reporting:
+
+ =====================
+ Free Page Reporting
+ =====================
+
+ Free page reporting is an API by which a device can register to receive
+ lists of pages that are currently unused by the system. This is useful in
+ the case of virtualization where a guest is then able to use this data to
+ notify the hypervisor that it is no longer using certain pages in memory.
+
+ For the driver, typically a balloon driver, to use of this functionality
+ it will allocate and initialize a page_reporting_dev_info structure. The
+ field within the structure it will populate is the "report" function
+ pointer used to process the scatterlist. It must also guarantee that it can
+ handle at least PAGE_REPORTING_CAPACITY worth of scatterlist entries per
+ call to the function. A call to page_reporting_register will register the
+ page reporting interface with the reporting framework assuming no other
+ page reporting devices are already registered.
+
+ Once registered the page reporting API will begin reporting batches of
+ pages to the driver. The API will start reporting pages 2 seconds after
+ the interface is registered and will continue to do so 2 seconds after any
+ page of a sufficiently high order is freed.
+
+ Pages reported will be stored in the scatterlist passed to the reporting
+ function with the final entry having the end bit set in entry nent - 1.
+ While pages are being processed by the report function they will not be
+ accessible to the allocator. Once the report function has been completed
+ the pages will be returned to the free area from which they were obtained.
+
+ Prior to removing a driver that is making use of free page reporting it
+ is necessary to call page_reporting_unregister to have the
+ page_reporting_dev_info structure that is currently in use by free page
+ reporting removed. Doing this will prevent further reports from being
+ issued via the interface. If another driver or the same driver is
+ registered it is possible for it to resume where the previous driver had
+ left off in terms of reporting free pages.
+
+ Alexander Duyck, Dec 04, 2019

Documentation/vm/zswap.rst

Lines changed: 12 additions & 8 deletions
@@ -35,9 +35,11 @@ Zswap evicts pages from compressed cache on an LRU basis to the backing swap
  device when the compressed pool reaches its size limit. This requirement had
  been identified in prior community discussions.

- Zswap is disabled by default but can be enabled at boot time by setting
- the ``enabled`` attribute to 1 at boot time. ie: ``zswap.enabled=1``. Zswap
- can also be enabled and disabled at runtime using the sysfs interface.
+ Whether Zswap is enabled at the boot time depends on whether
+ the ``CONFIG_ZSWAP_DEFAULT_ON`` Kconfig option is enabled or not.
+ This setting can then be overridden by providing the kernel command line
+ ``zswap.enabled=`` option, for example ``zswap.enabled=0``.
+ Zswap can also be enabled and disabled at runtime using the sysfs interface.
  An example command to enable zswap at runtime, assuming sysfs is mounted
  at ``/sys``, is::

@@ -64,9 +66,10 @@ allocation in zpool is not directly accessible by address. Rather, a handle is
  returned by the allocation routine and that handle must be mapped before being
  accessed. The compressed memory pool grows on demand and shrinks as compressed
  pages are freed. The pool is not preallocated. By default, a zpool
- of type zbud is created, but it can be selected at boot time by
- setting the ``zpool`` attribute, e.g. ``zswap.zpool=zbud``. It can
- also be changed at runtime using the sysfs ``zpool`` attribute, e.g.::
+ of type selected in ``CONFIG_ZSWAP_ZPOOL_DEFAULT`` Kconfig option is created,
+ but it can be overridden at boot time by setting the ``zpool`` attribute,
+ e.g. ``zswap.zpool=zbud``. It can also be changed at runtime using the sysfs
+ ``zpool`` attribute, e.g.::

  echo zbud > /sys/module/zswap/parameters/zpool

@@ -97,8 +100,9 @@ controlled policy:
  * max_pool_percent - The maximum percentage of memory that the compressed
  pool can occupy.

- The default compressor is lzo, but it can be selected at boot time by
- setting the ``compressor`` attribute, e.g. ``zswap.compressor=lzo``.
+ The default compressor is selected in ``CONFIG_ZSWAP_COMPRESSOR_DEFAULT``
+ Kconfig option, but it can be overridden at boot time by setting the
+ ``compressor`` attribute, e.g. ``zswap.compressor=lzo``.
  It can also be changed at runtime using the sysfs "compressor"
  attribute, e.g.::
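The precedence described in the first hunk (a Kconfig default, overridden by `zswap.enabled=` on the kernel command line) can be modelled in userspace as below. This is an illustrative sketch, not kernel code: the function name is hypothetical, `default_on` stands in for `CONFIG_ZSWAP_DEFAULT_ON`, only the `0`/`1`/`y`/`n` spellings are recognised, and it assumes the last occurrence on the command line wins.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical model of the boot-time decision: start from the build-time
 * default and let each zswap.enabled= word on the command line override it.
 */
bool zswap_enabled_at_boot(bool default_on, const char *cmdline)
{
	static const char key[] = "zswap.enabled=";
	bool enabled = default_on;
	const char *p = cmdline;

	while ((p = strstr(p, key)) != NULL) {
		/* Accept the key only at the start of a space-separated word. */
		bool at_word_start = (p == cmdline) || (p[-1] == ' ');

		p += sizeof(key) - 1;
		if (!at_word_start)
			continue;

		if (*p == '1' || *p == 'y' || *p == 'Y')
			enabled = true;
		else if (*p == '0' || *p == 'n' || *p == 'N')
			enabled = false;
		/* Any other value leaves the current setting untouched. */
	}
	return enabled;
}
```

The runtime sysfs switch mentioned in the same paragraph is a separate, later step and is not part of this model.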

MAINTAINERS

Lines changed: 18 additions & 17 deletions
@@ -77,21 +77,13 @@ Tips for patch submitters

  8. Happy hacking.

- Descriptions of section entries
- -------------------------------
+ Descriptions of section entries and preferred order
+ ---------------------------------------------------

  M: *Mail* patches to: FullName <address@domain>
  R: Designated *Reviewer*: FullName <address@domain>
  These reviewers should be CCed on patches.
  L: *Mailing list* that is relevant to this area
- W: *Web-page* with status/info
- B: URI for where to file *bugs*. A web-page with detailed bug
- filing info, a direct bug tracker link, or a mailto: URI.
- C: URI for *chat* protocol, server and channel where developers
- usually hang out, for example irc://server/channel.
- Q: *Patchwork* web based patch tracking system site
- T: *SCM* tree type and location.
- Type is one of: git, hg, quilt, stgit, topgit
  S: *Status*, one of the following:
  Supported: Someone is actually paid to look after this.
  Maintained: Someone actually looks after it.
@@ -102,30 +94,39 @@ Descriptions of section entries
  Obsolete: Old code. Something tagged obsolete generally means
  it has been replaced by a better system and you
  should be using that.
+ W: *Web-page* with status/info
+ Q: *Patchwork* web based patch tracking system site
+ B: URI for where to file *bugs*. A web-page with detailed bug
+ filing info, a direct bug tracker link, or a mailto: URI.
+ C: URI for *chat* protocol, server and channel where developers
+ usually hang out, for example irc://server/channel.
  P: Subsystem Profile document for more details submitting
  patches to the given subsystem. This is either an in-tree file,
  or a URI. See Documentation/maintainer/maintainer-entry-profile.rst
  for details.
+ T: *SCM* tree type and location.
+ Type is one of: git, hg, quilt, stgit, topgit
  F: *Files* and directories wildcard patterns.
  A trailing slash includes all files and subdirectory files.
  F: drivers/net/ all files in and below drivers/net
  F: drivers/net/* all files in drivers/net, but not below
  F: */net/* all files in "any top level directory"/net
  One pattern per line. Multiple F: lines acceptable.
+ X: *Excluded* files and directories that are NOT maintained, same
+ rules as F:. Files exclusions are tested before file matches.
+ Can be useful for excluding a specific subdirectory, for instance:
+ F: net/
+ X: net/ipv6/
+ matches all files in and below net excluding net/ipv6/
  N: Files and directories *Regex* patterns.
- N: [^a-z]tegra all files whose path contains the word tegra
+ N: [^a-z]tegra all files whose path contains tegra
+ (not including files like integrator)
  One pattern per line. Multiple N: lines acceptable.
  scripts/get_maintainer.pl has different behavior for files that
  match F: pattern and matches of N: patterns. By default,
  get_maintainer will not look at git log history when an F: pattern
  match occurs. When an N: match occurs, git log history is used
  to also notify the people that have git commit signatures.
- X: *Excluded* files and directories that are NOT maintained, same
- rules as F:. Files exclusions are tested before file matches.
- Can be useful for excluding a specific subdirectory, for instance:
- F: net/
- X: net/ipv6/
- matches all files in and below net excluding net/ipv6/
  K: *Content regex* (perl extended) pattern match in a patch or file.
  For instance:
  K: of_get_profile
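The clarified `N: [^a-z]tegra` wording (paths containing tegra, but not ones like integrator) can be checked with a short sketch. It uses POSIX extended regular expressions as a stand-in for the Perl regexes that scripts/get_maintainer.pl applies; the helper name `path_matches` is hypothetical and the paths in the note below are illustrative.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if path matches the extended regex
 * pattern, 0 if it does not, and -1 if the pattern fails to compile.
 * POSIX ERE is used here only as an approximation of Perl matching.
 */
int path_matches(const char *pattern, const char *path)
{
	regex_t re;
	int rc;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;

	/* regexec() returns 0 when the pattern is found anywhere in path. */
	rc = regexec(&re, path, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

With this helper, `path_matches("[^a-z]tegra", "arch/arm/mach-tegra/board.c")` matches because `-` precedes `tegra`, while a path containing `mach-integrator` does not, since the only `tegra` in it follows the lowercase letter `n`.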

arch/alpha/include/asm/mmzone.h

Lines changed: 0 additions & 2 deletions
@@ -8,8 +8,6 @@

 #include <asm/smp.h>

-struct bootmem_data_t; /* stupid forward decl. */
-
 /*
  * Following are macros that are specific to this numa platform.
  */

arch/alpha/kernel/syscalls/syscallhdr.sh

Lines changed: 1 addition & 1 deletion
@@ -32,5 +32,5 @@ grep -E "^[0-9A-Fa-fXx]+[[:space:]]+${my_abis}" "$in" | sort -n | (
 	printf "#define __NR_syscalls\t%s\n" "${nxt}"
 	printf "#endif\n"
 	printf "\n"
-	printf "#endif /* %s */" "${fileguard}"
+	printf "#endif /* %s */\n" "${fileguard}"
 ) > "$out"

arch/csky/mm/fault.c

Lines changed: 1 addition & 1 deletion
@@ -141,7 +141,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs, unsigned long write,
 		if (!(vma->vm_flags & VM_WRITE))
 			goto bad_area;
 	} else {
-		if (!(vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)))
+		if (unlikely(!vma_is_accessible(vma)))
 			goto bad_area;
 	}
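The hunk swaps an open-coded flag test for the `vma_is_accessible()` helper. The userspace model below suggests why the two forms should behave identically. It is a sketch under assumptions: the struct is a stand-in with only the field the check needs, the `VM_*` values mirror the kernel's low flag bits, and the helper body is inferred from the check it replaces rather than copied from the kernel.

```c
#include <stdbool.h>

/* Userspace stand-ins for the kernel definitions (assumed values). */
#define VM_READ  0x00000001UL
#define VM_WRITE 0x00000002UL
#define VM_EXEC  0x00000004UL

/* Minimal model of the structure; the real one has many more fields. */
struct vm_area_struct {
	unsigned long vm_flags;
};

/* Assumed helper body: accessible if any of read/write/exec is permitted. */
static inline bool vma_is_accessible(const struct vm_area_struct *vma)
{
	return (vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC)) != 0;
}

/* The open-coded condition removed by the hunk, kept for comparison. */
static inline bool old_check_is_bad_area(const struct vm_area_struct *vma)
{
	return !(vma->vm_flags & (VM_READ | VM_WRITE | VM_EXEC));
}
```

For every combination of the three permission bits, `old_check_is_bad_area()` is the negation of `vma_is_accessible()`, so under this model the change affects readability (plus the `unlikely()` branch hint) rather than behavior.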

arch/ia64/kernel/syscalls/syscallhdr.sh

Lines changed: 1 addition & 1 deletion
@@ -32,5 +32,5 @@ grep -E "^[0-9A-Fa-fXx]+[[:space:]]+${my_abis}" "$in" | sort -n | (
 	printf "#define __NR_syscalls\t%s\n" "${nxt}"
 	printf "#endif\n"
 	printf "\n"
-	printf "#endif /* %s */" "${fileguard}"
+	printf "#endif /* %s */\n" "${fileguard}"
 ) > "$out"
