@@ -12,7 +12,10 @@ Pkeys Userspace (PKU) is a feature which can be found on:
  * Intel server CPUs, Skylake and later
  * Intel client CPUs, Tiger Lake (11th Gen Core) and later
  * Future AMD CPUs
+ * arm64 CPUs implementing the Permission Overlay Extension (FEAT_S1POE)
 
+x86_64
+======
 Pkeys work by dedicating 4 previously Reserved bits in each page table entry to
 a "protection key", giving 16 possible keys.
 
@@ -28,6 +31,22 @@ register. The feature is only available in 64-bit mode, even though there is
 theoretically space in the PAE PTEs. These permissions are enforced on data
 access only and have no effect on instruction fetches.
 
+arm64
+=====
+
+Pkeys use 3 bits in each page table entry, to encode a "protection key index",
+giving 8 possible keys.
+
+Protections for each key are defined with a per-CPU user-writable system
+register (POR_EL0). This is a 64-bit register encoding read, write and execute
+overlay permissions for each protection key index.
+
+Being a CPU register, POR_EL0 is inherently thread-local, potentially giving
+each thread a different set of protections from every other thread.
+
+Unlike x86_64, the protection key permissions also apply to instruction
+fetches.
+
 Syscalls
 ========
 
@@ -38,11 +57,10 @@ There are 3 system calls which directly interact with pkeys::
         int pkey_mprotect(unsigned long start, size_t len,
                           unsigned long prot, int pkey);
 
-Before a pkey can be used, it must first be allocated with
-pkey_alloc(). An application calls the WRPKRU instruction
-directly in order to change access permissions to memory covered
-with a key. In this example WRPKRU is wrapped by a C function
-called pkey_set().
+Before a pkey can be used, it must first be allocated with pkey_alloc(). An
+application writes to the architecture specific CPU register directly in order
+to change access permissions to memory covered with a key. In this example
+this is wrapped by a C function called pkey_set().
 ::
 
         int real_prot = PROT_READ|PROT_WRITE;
@@ -64,9 +82,9 @@ is no longer in use::
         munmap(ptr, PAGE_SIZE);
         pkey_free(pkey);
 
-.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
-          An example implementation can be found in
-          tools/testing/selftests/x86/protection_keys.c.
+.. note:: pkey_set() is a wrapper around writing to the CPU register.
+          Example implementations can be found in
+          tools/testing/selftests/mm/pkey-{arm64,powerpc,x86}.h
 
 Behavior
 ========
@@ -96,3 +114,7 @@ with a read()::
 The kernel will send a SIGSEGV in both cases, but si_code will be set
 to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
 the plain mprotect() permissions are violated.
+
+Note that kernel accesses from a kthread (such as io_uring) will use a default
+value for the protection key register and so will not be consistent with
+userspace's value of the register or mprotect().
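
Aside on the pkey_set() wording in this patch: the x86_64 half of such a wrapper can be sketched as a pure computation on a saved copy of the register value. This is a minimal sketch and not the selftest implementation. It assumes the x86_64 PKRU layout of two bits per key (access-disable in the low bit, write-disable in the high bit, 16 keys in a 32-bit value). The helper names pkru_with_rights() and pkru_rights() are hypothetical, and the SKETCH_* constants mirror the values of the kernel's PKEY_DISABLE_ACCESS and PKEY_DISABLE_WRITE. It never executes RDPKRU or WRPKRU, so it builds and runs on machines without pkey support.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel uapi values of PKEY_DISABLE_ACCESS and
 * PKEY_DISABLE_WRITE; redefined so the sketch needs no kernel headers. */
#define SKETCH_PKEY_DISABLE_ACCESS 0x1u
#define SKETCH_PKEY_DISABLE_WRITE  0x2u

/* Assumed x86_64 PKRU layout: 16 keys, 2 bits per key. */
#define SKETCH_NR_PKEYS          16
#define SKETCH_PKRU_BITS_PER_KEY 2

/* Hypothetical helper: return a copy of 'pkru' in which the two rights
 * bits for 'pkey' are replaced by 'rights'; all other keys are untouched. */
static inline uint32_t pkru_with_rights(uint32_t pkru, int pkey, uint32_t rights)
{
	assert(pkey >= 0 && pkey < SKETCH_NR_PKEYS);

	unsigned int shift = (unsigned int)pkey * SKETCH_PKRU_BITS_PER_KEY;
	uint32_t mask = 0x3u << shift;

	return (pkru & ~mask) | ((rights & 0x3u) << shift);
}

/* Hypothetical helper: extract the two rights bits for 'pkey'. */
static inline uint32_t pkru_rights(uint32_t pkru, int pkey)
{
	assert(pkey >= 0 && pkey < SKETCH_NR_PKEYS);

	return (pkru >> ((unsigned int)pkey * SKETCH_PKRU_BITS_PER_KEY)) & 0x3u;
}
```

A real pkey_set() on x86_64 would read the current register value, pass it through a helper like pkru_with_rights(), and write the result back. On arm64 the same read-modify-write pattern would target POR_EL0 instead, with a different field width and encoding that this sketch does not cover.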