Commit 57c78a2

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Support for 32-bit tasks on asymmetric AArch32 systems (on top of
   the scheduler changes merged via the tip tree).

 - More entry.S clean-ups and conversion to C.

 - MTE updates: allow a preferred tag checking mode to be set per CPU
   (the overhead of synchronous mode is smaller for some CPUs than
   others); optimisations for kernel entry/exit path; optionally
   disable MTE on the kernel command line.

 - Kselftest improvements for SVE and signal handling, PtrAuth.

 - Fix unlikely race where a TLBI could use stale ASID on an ASID
   roll-over (found by inspection).

 - Miscellaneous fixes: disable trapping of PMSNEVFR_EL1 to higher
   exception levels; drop unnecessary sigdelsetmask() call in the
   signal32 handling; remove BUG_ON when failing to allocate SVE state
   (just signal the process); SYM_CODE annotations.

 - Other trivial clean-ups: use macros instead of magic numbers, remove
   redundant returns, typos.

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (56 commits)
  arm64: Do not trap PMSNEVFR_EL1
  arm64: mm: fix comment typo of pud_offset_phys()
  arm64: signal32: Drop pointless call to sigdelsetmask()
  arm64/sve: Better handle failure to allocate SVE register storage
  arm64: Document the requirement for SCR_EL3.HCE
  arm64: head: avoid over-mapping in map_memory
  arm64/sve: Add a comment documenting the binutils needed for SVE asm
  arm64/sve: Add some comments for sve_save/load_state()
  kselftest/arm64: signal: Add a TODO list for signal handling tests
  kselftest/arm64: signal: Add test case for SVE register state in signals
  kselftest/arm64: signal: Verify that signals can't change the SVE vector length
  kselftest/arm64: signal: Check SVE signal frame shows expected vector length
  kselftest/arm64: signal: Support signal frames with SVE register data
  kselftest/arm64: signal: Add SVE to the set of features we can check for
  arm64: replace in_irq() with in_hardirq()
  kselftest/arm64: pac: Fix skipping of tests on systems without PAC
  Documentation: arm64: describe asymmetric 32-bit support
  arm64: Remove logic to kill 32-bit tasks on 64-bit-only cores
  arm64: Hook up cmdline parameter to allow mismatched 32-bit EL0
  arm64: Advertise CPUs capable of running 32-bit applications in sysfs
  ...
2 parents bcfeebb + 65266a7 commit 57c78a2

69 files changed: +1843 -464 lines changed

Documentation/ABI/testing/sysfs-devices-system-cpu

Lines changed: 26 additions & 0 deletions
@@ -494,6 +494,15 @@ Description: AArch64 CPU registers
                 'identification' directory exposes the CPU ID registers for
                 identifying model and revision of the CPU.

+What:           /sys/devices/system/cpu/aarch32_el0
+Date:           May 2021
+Contact:        Linux ARM Kernel Mailing list <[email protected]>
+Description:    Identifies the subset of CPUs in the system that can execute
+                AArch32 (32-bit ARM) applications. If present, the same format as
+                /sys/devices/system/cpu/{offline,online,possible,present} is used.
+                If absent, then all or none of the CPUs can execute AArch32
+                applications and execve() will behave accordingly.
+
 What:           /sys/devices/system/cpu/cpu#/cpu_capacity
 Date:           December 2016
 Contact:        Linux kernel mailing list <[email protected]>
@@ -640,3 +649,20 @@ Description: SPURR ticks for cpuX when it was idle.

                 This sysfs interface exposes the number of SPURR ticks
                 for cpuX when it was idle.
+
+What:           /sys/devices/system/cpu/cpuX/mte_tcf_preferred
+Date:           July 2021
+Contact:        Linux ARM Kernel Mailing list <[email protected]>
+Description:    Preferred MTE tag checking mode
+
+                When a user program specifies more than one MTE tag checking
+                mode, this sysfs node is used to specify which mode should
+                be preferred when scheduling a task on that CPU. Possible
+                values:
+
+                ================  ==============================================
+                "sync"            Prefer synchronous mode
+                "async"           Prefer asynchronous mode
+                ================  ==============================================
+
+                See also: Documentation/arm64/memory-tagging-extension.rst
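
The `aarch32_el0` entry above is plain text, so a userspace tool can consume it with ordinary string handling. As a minimal sketch, assuming the file uses the same CPU-list syntax as the `online` and `possible` files (for example `0-3,6`), the helper below turns such a string into a 64-bit mask. The name `parse_cpulist` and the 64-CPU limit are choices made for this illustration and are not part of the kernel ABI.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a CPU list such as "0-3,6\n" into a bitmask
 * (bit N set means CPU N is listed). Returns 0 on success, or -1 on
 * malformed input or CPU numbers >= 64, which this sketch does not handle.
 */
static int parse_cpulist(const char *s, uint64_t *mask)
{
	*mask = 0;
	while (*s && *s != '\n') {
		char *end;
		unsigned long lo, hi;

		if (!isdigit((unsigned char)*s))
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;
		if (*end == '-') {
			/* A range "lo-hi": parse the upper bound. */
			s = end + 1;
			if (!isdigit((unsigned char)*s))
				return -1;
			hi = strtoul(s, &end, 10);
		}
		if (hi < lo || hi >= 64)
			return -1;
		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			*mask |= UINT64_C(1) << cpu;
		s = end;
		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}
	return 0;
}
```

A caller would first read the file into a buffer. If the file is absent, the ABI text says that either all or none of the CPUs can run AArch32 code, so no mask is needed in that case.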

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 14 additions & 0 deletions
@@ -287,6 +287,17 @@
                         do not want to use tracing_snapshot_alloc() as it needs
                         to be done where GFP_KERNEL allocations are allowed.

+        allow_mismatched_32bit_el0 [ARM64]
+                        Allow execve() of 32-bit applications and setting of the
+                        PER_LINUX32 personality on systems where only a strict
+                        subset of the CPUs support 32-bit EL0. When this
+                        parameter is present, the set of CPUs supporting 32-bit
+                        EL0 is indicated by /sys/devices/system/cpu/aarch32_el0
+                        and hot-unplug operations may be restricted.
+
+                        See Documentation/arm64/asymmetric-32bit.rst for more
+                        information.
+
         amd_iommu=      [HW,X86-64]
                         Pass parameters to the AMD IOMMU driver in the system.
                         Possible values are:
@@ -380,6 +391,9 @@
         arm64.nopauth   [ARM64] Unconditionally disable Pointer Authentication
                         support

+        arm64.nomte     [ARM64] Unconditionally disable Memory Tagging Extension
+                        support
+
         ataflop=        [HW,M68k]

         atarimouse=     [HW,MOUSE] Atari Mouse
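
Both new parameters are presence-only switches, so a diagnostic tool may want to know whether the running kernel was booted with them. The following is a rough sketch, assuming the tool has already read the contents of `/proc/cmdline` into a string. The helper name `cmdline_has_param` is invented for this example, and the sketch splits on whitespace only, so it does not handle quoted values containing spaces.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `name` appears in `cmdline` as a
 * whole parameter, either bare ("arm64.nomte") or with a value
 * ("name=value"). Parameters are assumed to be whitespace-separated.
 */
static bool cmdline_has_param(const char *cmdline, const char *name)
{
	size_t len = strlen(name);

	while (*cmdline) {
		/* Skip separators, then measure the next token. */
		cmdline += strspn(cmdline, " \t\n");
		size_t tok = strcspn(cmdline, " \t\n");

		if (tok >= len && strncmp(cmdline, name, len) == 0 &&
		    (tok == len || cmdline[len] == '='))
			return true;
		cmdline += tok;
	}
	return false;
}
```

For instance, `cmdline_has_param(buf, "allow_mismatched_32bit_el0")` would tell the tool whether the opt-in described above was given at boot.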
Documentation/arm64/asymmetric-32bit.rst

Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
+======================
+Asymmetric 32-bit SoCs
+======================
+
+Author: Will Deacon <[email protected]>
+
+This document describes the impact of asymmetric 32-bit SoCs on the
+execution of 32-bit (``AArch32``) applications.
+
+Date: 2021-05-17
+
+Introduction
+============
+
+Some Armv9 SoCs suffer from a big.LITTLE misfeature where only a subset
+of the CPUs are capable of executing 32-bit user applications. On such
+a system, Linux by default treats the asymmetry as a "mismatch" and
+disables support for both the ``PER_LINUX32`` personality and
+``execve(2)`` of 32-bit ELF binaries, with the latter returning
+``-ENOEXEC``. If the mismatch is detected during late onlining of a
+64-bit-only CPU, then the onlining operation fails and the new CPU is
+unavailable for scheduling.
+
+Surprisingly, these SoCs have been produced with the intention of
+running legacy 32-bit binaries. Unsurprisingly, that doesn't work very
+well with the default behaviour of Linux.
+
+It seems inevitable that future SoCs will drop 32-bit support
+altogether, so if you're stuck in the unenviable position of needing to
+run 32-bit code on one of these transitionary platforms then you would
+be wise to consider alternatives such as recompilation, emulation or
+retirement. If neither of those options are practical, then read on.
+
+Enabling kernel support
+=======================
+
+Since the kernel support is not completely transparent to userspace,
+allowing 32-bit tasks to run on an asymmetric 32-bit system requires an
+explicit "opt-in" and can be enabled by passing the
+``allow_mismatched_32bit_el0`` parameter on the kernel command-line.
+
+For the remainder of this document we will refer to an *asymmetric
+system* to mean an asymmetric 32-bit SoC running Linux with this kernel
+command-line option enabled.
+
+Userspace impact
+================
+
+32-bit tasks running on an asymmetric system behave in mostly the same
+way as on a homogeneous system, with a few key differences relating to
+CPU affinity.
+
+sysfs
+-----
+
+The subset of CPUs capable of running 32-bit tasks is described in
+``/sys/devices/system/cpu/aarch32_el0`` and is documented further in
+``Documentation/ABI/testing/sysfs-devices-system-cpu``.
+
+**Note:** CPUs are advertised by this file as they are detected and so
+late-onlining of 32-bit-capable CPUs can result in the file contents
+being modified by the kernel at runtime. Once advertised, CPUs are never
+removed from the file.
+
+``execve(2)``
+-------------
+
+On a homogeneous system, the CPU affinity of a task is preserved across
+``execve(2)``. This is not always possible on an asymmetric system,
+specifically when the new program being executed is 32-bit yet the
+affinity mask contains 64-bit-only CPUs. In this situation, the kernel
+determines the new affinity mask as follows:
+
+  1. If the 32-bit-capable subset of the affinity mask is not empty,
+     then the affinity is restricted to that subset and the old affinity
+     mask is saved. This saved mask is inherited over ``fork(2)`` and
+     preserved across ``execve(2)`` of 32-bit programs.
+
+     **Note:** This step does not apply to ``SCHED_DEADLINE`` tasks.
+     See `SCHED_DEADLINE`_.
+
+  2. Otherwise, the cpuset hierarchy of the task is walked until an
+     ancestor is found containing at least one 32-bit-capable CPU. The
+     affinity of the task is then changed to match the 32-bit-capable
+     subset of the cpuset determined by the walk.
+
+  3. On failure (i.e. out of memory), the affinity is changed to the set
+     of all 32-bit-capable CPUs of which the kernel is aware.
+
+A subsequent ``execve(2)`` of a 64-bit program by the 32-bit task will
+invalidate the affinity mask saved in (1) and attempt to restore the CPU
+affinity of the task using the saved mask if it was previously valid.
+This restoration may fail due to intervening changes to the deadline
+policy or cpuset hierarchy, in which case the ``execve(2)`` continues
+with the affinity unchanged.
+
+Calls to ``sched_setaffinity(2)`` for a 32-bit task will consider only
+the 32-bit-capable CPUs of the requested affinity mask. On success, the
+affinity for the task is updated and any saved mask from a prior
+``execve(2)`` is invalidated.
+
+``SCHED_DEADLINE``
+------------------
+
+Explicit admission of a 32-bit deadline task to the default root domain
+(e.g. by calling ``sched_setattr(2)``) is rejected on an asymmetric
+32-bit system unless admission control is disabled by writing -1 to
+``/proc/sys/kernel/sched_rt_runtime_us``.
+
+``execve(2)`` of a 32-bit program from a 64-bit deadline task will
+return ``-ENOEXEC`` if the root domain for the task contains any
+64-bit-only CPUs and admission control is enabled. Concurrent offlining
+of 32-bit-capable CPUs may still necessitate the procedure described in
+`execve(2)`_, in which case step (1) is skipped and a warning is
+emitted on the console.
+
+**Note:** It is recommended that a set of 32-bit-capable CPUs are placed
+into a separate root domain if ``SCHED_DEADLINE`` is to be used with
+32-bit tasks on an asymmetric system. Failure to do so is likely to
+result in missed deadlines.
+
+Cpusets
+-------
+
+The affinity of a 32-bit task on an asymmetric system may include CPUs
+that are not explicitly allowed by the cpuset to which it is attached.
+This can occur as a result of the following two situations:
+
+  - A 64-bit task attached to a cpuset which allows only 64-bit CPUs
+    executes a 32-bit program.
+
+  - All of the 32-bit-capable CPUs allowed by a cpuset containing a
+    32-bit task are offlined.
+
+In both of these cases, the new affinity is calculated according to step
+(2) of the process described in `execve(2)`_ and the cpuset hierarchy is
+unchanged irrespective of the cgroup version.
+
+CPU hotplug
+-----------
+
+On an asymmetric system, the first detected 32-bit-capable CPU is
+prevented from being offlined by userspace and any such attempt will
+return ``-EPERM``. Note that suspend is still permitted even if the
+primary CPU (i.e. CPU 0) is 64-bit-only.
+
+KVM
+---
+
+Although KVM will not advertise 32-bit EL0 support to any vCPUs on an
+asymmetric system, a broken guest at EL1 could still attempt to execute
+32-bit code at EL0. In this case, an exit from a vCPU thread in 32-bit
+mode will return to host userspace with an ``exit_reason`` of
+``KVM_EXIT_FAIL_ENTRY`` and will remain non-runnable until successfully
+re-initialised by a subsequent ``KVM_ARM_VCPU_INIT`` operation.
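
The three-step affinity fallback that the new document describes for ``execve(2)`` of a 32-bit program can be modelled on plain bitmasks. This is a sketch of the documented behaviour only, not the scheduler code: `struct example_cpuset`, `compat_affinity` and the 64-bit masks are all invented for illustration, the walk is assumed to start at the task's own cpuset, and an exhausted walk stands in for the "failure" case of step 3.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one cpuset: its allowed CPUs and its parent. */
struct example_cpuset {
	uint64_t cpus;
	const struct example_cpuset *parent;
};

/*
 * Compute the affinity a task would get when it execs a 32-bit program.
 * `aarch32_cpus` is the set of 32-bit-capable CPUs. `*saved` reports
 * whether the old mask would be saved for later restoration (step 1).
 */
static uint64_t compat_affinity(uint64_t task_mask,
				const struct example_cpuset *cs,
				uint64_t aarch32_cpus,
				bool is_deadline, bool *saved)
{
	uint64_t subset = task_mask & aarch32_cpus;

	*saved = false;

	/* Step 1: restrict to the 32-bit-capable subset (not for deadline tasks). */
	if (subset && !is_deadline) {
		*saved = true;
		return subset;
	}

	/* Step 2: walk up the hierarchy until a cpuset has a capable CPU. */
	for (; cs != NULL; cs = cs->parent) {
		if (cs->cpus & aarch32_cpus)
			return cs->cpus & aarch32_cpus;
	}

	/* Step 3: fall back to every known 32-bit-capable CPU. */
	return aarch32_cpus;
}
```

With CPUs 4-7 as the 32-bit-capable set, a task whose mask covers CPUs 2-5 ends up on CPUs 4-5 with its old mask saved, while a task confined to CPUs 2-3 is moved according to its cpuset ancestors.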

Documentation/arm64/booting.rst

Lines changed: 33 additions & 4 deletions
@@ -207,10 +207,17 @@ Before jumping into the kernel, the following conditions must be met:
   software at a higher exception level to prevent execution in an UNKNOWN
   state.

-  - SCR_EL3.FIQ must have the same value across all CPUs the kernel is
-    executing on.
-  - The value of SCR_EL3.FIQ must be the same as the one present at boot
-    time whenever the kernel is executing.
+  For all systems:
+  - If EL3 is present:
+
+    - SCR_EL3.FIQ must have the same value across all CPUs the kernel is
+      executing on.
+    - The value of SCR_EL3.FIQ must be the same as the one present at boot
+      time whenever the kernel is executing.
+
+  - If EL3 is present and the kernel is entered at EL2:
+
+    - SCR_EL3.HCE (bit 8) must be initialised to 0b1.

   For systems with a GICv3 interrupt controller to be used in v3 mode:
   - If EL3 is present:
@@ -311,6 +318,28 @@ Before jumping into the kernel, the following conditions must be met:
     - ZCR_EL2.LEN must be initialised to the same value for all CPUs the
       kernel will execute on.

+  For CPUs with the Scalable Matrix Extension (FEAT_SME):
+
+  - If EL3 is present:
+
+    - CPTR_EL3.ESM (bit 12) must be initialised to 0b1.
+
+    - SCR_EL3.EnTP2 (bit 41) must be initialised to 0b1.
+
+    - SMCR_EL3.LEN must be initialised to the same value for all CPUs the
+      kernel will execute on.
+
+  - If the kernel is entered at EL1 and EL2 is present:
+
+    - CPTR_EL2.TSM (bit 12) must be initialised to 0b0.
+
+    - CPTR_EL2.SMEN (bits 25:24) must be initialised to 0b11.
+
+    - SCTLR_EL2.EnTP2 (bit 60) must be initialised to 0b1.
+
+    - SMCR_EL2.LEN must be initialised to the same value for all CPUs the
+      kernel will execute on.
+
 The requirements described above for CPU mode, caches, MMUs, architected
 timers, coherency and system registers apply to all CPUs. All CPUs must
 enter the kernel in the same exception level. Where the values documented
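
The bit-level requirements in the two hunks above lend themselves to a mechanical check. As an illustrative sketch, assuming a boot loader or test harness already holds the raw register values in ordinary integers, the helpers below test a few of the listed fields. The function names are invented for this example, and only the bit positions quoted in the text above are used.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract bits hi:lo (inclusive) from a 64-bit register value. */
static uint64_t reg_field(uint64_t reg, unsigned int hi, unsigned int lo)
{
	unsigned int width = hi - lo + 1;
	uint64_t mask = width == 64 ? ~UINT64_C(0)
				    : (UINT64_C(1) << width) - 1;

	return (reg >> lo) & mask;
}

/* SCR_EL3.HCE (bit 8) must be 0b1 when the kernel is entered at EL2. */
static bool scr_el3_ok_for_el2_entry(uint64_t scr_el3)
{
	return reg_field(scr_el3, 8, 8) == 1;
}

/*
 * For SME with entry at EL1 and EL2 present:
 * CPTR_EL2.TSM (bit 12) must be 0b0 and CPTR_EL2.SMEN (bits 25:24) 0b11.
 */
static bool cptr_el2_ok_for_sme(uint64_t cptr_el2)
{
	return reg_field(cptr_el2, 12, 12) == 0 &&
	       reg_field(cptr_el2, 25, 24) == 3;
}
```

Reading the registers themselves requires running at the relevant exception level, which is outside the scope of this sketch.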

Documentation/arm64/index.rst

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ ARM64 Architecture
     acpi_object_usage
     amu
     arm-acpi
+    asymmetric-32bit
     booting
     cpu-feature-registers
     elf_hwcaps

Documentation/arm64/memory-tagging-extension.rst

Lines changed: 41 additions & 7 deletions
@@ -77,14 +77,20 @@ configurable behaviours:
   address is unknown).

 The user can select the above modes, per thread, using the
-``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where
-``flags`` contain one of the following values in the ``PR_MTE_TCF_MASK``
+``prctl(PR_SET_TAGGED_ADDR_CTRL, flags, 0, 0, 0)`` system call where ``flags``
+contains any number of the following values in the ``PR_MTE_TCF_MASK``
 bit-field:

-- ``PR_MTE_TCF_NONE`` - *Ignore* tag check faults
+- ``PR_MTE_TCF_NONE``  - *Ignore* tag check faults
+                         (ignored if combined with other options)
 - ``PR_MTE_TCF_SYNC`` - *Synchronous* tag check fault mode
 - ``PR_MTE_TCF_ASYNC`` - *Asynchronous* tag check fault mode

+If no modes are specified, tag check faults are ignored. If a single
+mode is specified, the program will run in that mode. If multiple
+modes are specified, the mode is selected as described in the "Per-CPU
+preferred tag checking modes" section below.
+
 The current tag check fault mode can be read using the
 ``prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0)`` system call.

@@ -120,13 +126,39 @@ in the ``PR_MTE_TAG_MASK`` bit-field.
 interface provides an include mask. An include mask of ``0`` (exclusion
 mask ``0xffff``) results in the CPU always generating tag ``0``.

+Per-CPU preferred tag checking mode
+-----------------------------------
+
+On some CPUs the performance of MTE in stricter tag checking modes
+is similar to that of less strict tag checking modes. This makes it
+worthwhile to enable stricter checks on those CPUs when a less strict
+checking mode is requested, in order to gain the error detection
+benefits of the stricter checks without the performance downsides. To
+support this scenario, a privileged user may configure a stricter
+tag checking mode as the CPU's preferred tag checking mode.
+
+The preferred tag checking mode for each CPU is controlled by
+``/sys/devices/system/cpu/cpu<N>/mte_tcf_preferred``, to which a
+privileged user may write the value ``async`` or ``sync``. The default
+preferred mode for each CPU is ``async``.
+
+To allow a program to potentially run in the CPU's preferred tag
+checking mode, the user program may set multiple tag check fault mode
+bits in the ``flags`` argument to the ``prctl(PR_SET_TAGGED_ADDR_CTRL,
+flags, 0, 0, 0)`` system call. If the CPU's preferred tag checking
+mode is in the task's set of provided tag checking modes (this will
+always be the case at present because the kernel only supports two
+tag checking modes, but future kernels may support more modes), that
+mode will be selected. Otherwise, one of the modes in the task's mode
+set will be selected in a currently unspecified manner.
+
 Initial process state
 ---------------------

 On ``execve()``, the new process has the following configuration:

 - ``PR_TAGGED_ADDR_ENABLE`` set to 0 (disabled)
-- Tag checking mode set to ``PR_MTE_TCF_NONE``
+- No tag checking modes are selected (tag check faults ignored)
 - ``PR_MTE_TAG_MASK`` set to 0 (all tags excluded)
 - ``PSTATE.TCO`` set to 0
 - ``PROT_MTE`` not set on any of the initial memory maps
@@ -251,11 +283,13 @@ Example of correct usage
                     return EXIT_FAILURE;

             /*
-             * Enable the tagged address ABI, synchronous MTE tag check faults and
-             * allow all non-zero tags in the randomly generated set.
+             * Enable the tagged address ABI, synchronous or asynchronous MTE
+             * tag check faults (based on per-CPU preference) and allow all
+             * non-zero tags in the randomly generated set.
              */
             if (prctl(PR_SET_TAGGED_ADDR_CTRL,
-                      PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | (0xfffe << PR_MTE_TAG_SHIFT),
+                      PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC | PR_MTE_TCF_ASYNC |
+                      (0xfffe << PR_MTE_TAG_SHIFT),
                       0, 0, 0)) {
                     perror("prctl() failed");
                     return EXIT_FAILURE;
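
The selection rule in the "Per-CPU preferred tag checking mode" text above can be written as a small pure function. This is a sketch of the documented rule rather than the kernel's code. The `EX_TCF_*` constants are local stand-ins that are assumed to mirror the `PR_MTE_TCF_*` bit layout, and the tie-break in the "otherwise" branch (async before sync) is an arbitrary choice, since the text leaves that case unspecified.

```c
#include <stdint.h>

/* Local stand-ins, assumed to match the PR_MTE_TCF_* bit-field layout. */
#define EX_TCF_NONE  UINT64_C(0)
#define EX_TCF_SYNC  (UINT64_C(1) << 1)
#define EX_TCF_ASYNC (UINT64_C(1) << 2)
#define EX_TCF_MASK  (EX_TCF_SYNC | EX_TCF_ASYNC)

/*
 * Hypothetical helper: given the prctl() flags a task supplied and the
 * preferred mode of the CPU it runs on (EX_TCF_SYNC or EX_TCF_ASYNC),
 * return the tag check fault mode that would be in effect.
 */
static uint64_t resolve_tcf(uint64_t flags, uint64_t cpu_preferred)
{
	uint64_t requested = flags & EX_TCF_MASK;

	/* The CPU's preferred mode wins if the task allowed it. */
	if (requested & cpu_preferred)
		return cpu_preferred;

	/* Otherwise pick from the task's set; the order here is arbitrary. */
	if (requested & EX_TCF_ASYNC)
		return EX_TCF_ASYNC;
	if (requested & EX_TCF_SYNC)
		return EX_TCF_SYNC;

	/* No mode requested: tag check faults are ignored. */
	return EX_TCF_NONE;
}
```

A task that passes both mode bits, as the updated example does, therefore runs in synchronous mode on CPUs whose `mte_tcf_preferred` is `sync` and in asynchronous mode elsewhere, while a task that passes a single bit always gets that mode.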

arch/arm64/Makefile

Lines changed: 5 additions & 2 deletions
@@ -157,8 +157,11 @@ Image: vmlinux
 Image.%: Image
 	$(Q)$(MAKE) $(build)=$(boot) $(boot)/$@

-zinstall install:
-	$(Q)$(MAKE) $(build)=$(boot) $@
+install: install-image := Image
+zinstall: install-image := Image.gz
+install zinstall:
+	$(CONFIG_SHELL) $(srctree)/$(boot)/install.sh $(KERNELRELEASE) \
+	$(boot)/$(install-image) System.map "$(INSTALL_PATH)"

 PHONY += vdso_install
 vdso_install:

arch/arm64/boot/Makefile

Lines changed: 0 additions & 8 deletions
@@ -35,11 +35,3 @@ $(obj)/Image.lzma: $(obj)/Image FORCE

 $(obj)/Image.lzo: $(obj)/Image FORCE
 	$(call if_changed,lzo)
-
-install:
-	$(CONFIG_SHELL) $(srctree)/$(src)/install.sh $(KERNELRELEASE) \
-	$(obj)/Image System.map "$(INSTALL_PATH)"
-
-zinstall:
-	$(CONFIG_SHELL) $(srctree)/$(src)/install.sh $(KERNELRELEASE) \
-	$(obj)/Image.gz System.map "$(INSTALL_PATH)"
