Commit 65266a7

Merge remote-tracking branch 'tip/sched/arm64' into for-next/core

* tip/sched/arm64: (785 commits)
  Documentation: arm64: describe asymmetric 32-bit support
  arm64: Remove logic to kill 32-bit tasks on 64-bit-only cores
  arm64: Hook up cmdline parameter to allow mismatched 32-bit EL0
  arm64: Advertise CPUs capable of running 32-bit applications in sysfs
  arm64: Prevent offlining first CPU with 32-bit EL0 on mismatched system
  arm64: exec: Adjust affinity for compat tasks with mismatched 32-bit EL0
  arm64: Implement task_cpu_possible_mask()
  sched: Introduce dl_task_check_affinity() to check proposed affinity
  sched: Allow task CPU affinity to be restricted on asymmetric systems
  sched: Split the guts of sched_setaffinity() into a helper function
  sched: Introduce task_struct::user_cpus_ptr to track requested affinity
  sched: Reject CPU affinity changes based on task_cpu_possible_mask()
  cpuset: Cleanup cpuset_cpus_allowed_fallback() use in select_fallback_rq()
  cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()
  cpuset: Don't use the cpu_possible_mask as a last resort for cgroup v1
  sched: Introduce task_cpu_possible_mask() to limit fallback rq selection
  sched: Cgroup SCHED_IDLE support
  sched/topology: Skip updating masks for non-online nodes
  Linux 5.14-rc6
  lib: use PFN_PHYS() in devmem_is_allowed()
  ...
2 parents 1a7f67e + 702f438 commit 65266a7

787 files changed, +8629 -4082 lines changed

Documentation/ABI/testing/sysfs-devices-system-cpu

Lines changed: 9 additions & 0 deletions
@@ -494,6 +494,15 @@ Description: AArch64 CPU registers
 'identification' directory exposes the CPU ID registers for
 identifying model and revision of the CPU.
 
+What: /sys/devices/system/cpu/aarch32_el0
+Date: May 2021
+Contact: Linux ARM Kernel Mailing list <[email protected]>
+Description: Identifies the subset of CPUs in the system that can execute
+AArch32 (32-bit ARM) applications. If present, the same format as
+/sys/devices/system/cpu/{offline,online,possible,present} is used.
+If absent, then all or none of the CPUs can execute AArch32
+applications and execve() will behave accordingly.
+
 What: /sys/devices/system/cpu/cpu#/cpu_capacity
 Date: December 2016
 Contact: Linux kernel mailing list <[email protected]>
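
Reader's note (not part of the commit): the new ABI entry says aarch32_el0 uses the same format as the online/possible/present files, that is, a comma-separated list of CPU numbers and ranges such as 0-3,6. A minimal Python sketch of reading it might look like the following. The helper names parse_cpulist and read_aarch32_cpus are invented for illustration, and the parser assumes only plain numbers and a-b ranges appear in the file.

```python
from pathlib import Path

# Path taken from the ABI entry above.
AARCH32_EL0 = Path("/sys/devices/system/cpu/aarch32_el0")


def parse_cpulist(text):
    """Parse a cpulist string such as '0-3,6' into a set of CPU numbers.

    Assumption: only single numbers and 'a-b' ranges occur.
    """
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue  # empty file or stray comma
        first, sep, last = chunk.partition("-")
        if sep:
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(first))
    return cpus


def read_aarch32_cpus(path=AARCH32_EL0):
    """Return the set of 32-bit-capable CPUs, or None if the file is absent.

    Per the ABI text, an absent file means all or none of the CPUs can run
    AArch32 applications, so the caller cannot tell which from this alone.
    """
    try:
        return parse_cpulist(path.read_text())
    except FileNotFoundError:
        return None
```

A caller would treat a None result as "no asymmetry information available" rather than as an empty set.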

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 11 additions & 0 deletions
@@ -287,6 +287,17 @@
 do not want to use tracing_snapshot_alloc() as it needs
 to be done where GFP_KERNEL allocations are allowed.
 
+allow_mismatched_32bit_el0 [ARM64]
+Allow execve() of 32-bit applications and setting of the
+PER_LINUX32 personality on systems where only a strict
+subset of the CPUs support 32-bit EL0. When this
+parameter is present, the set of CPUs supporting 32-bit
+EL0 is indicated by /sys/devices/system/cpu/aarch32_el0
+and hot-unplug operations may be restricted.
+
+See Documentation/arm64/asymmetric-32bit.rst for more
+information.
+
 amd_iommu= [HW,X86-64]
 Pass parameters to the AMD IOMMU driver in the system.
 Possible values are:
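
Reader's note (not part of the commit): allow_mismatched_32bit_el0 is a bare flag on the kernel command line, so userspace can check whether it was passed by looking at /proc/cmdline. The sketch below is one way to do that in Python. The function names are invented for illustration, and the check is a plain whitespace split that does not attempt to handle quoted arguments.

```python
def has_kernel_param(cmdline, name="allow_mismatched_32bit_el0"):
    """Return True if `name` appears as a standalone token in `cmdline`."""
    return name in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Return the boot command line of the running kernel as a string."""
    with open(path) as f:
        return f.read()
```

This only shows that the option was requested at boot. Whether the SoC is actually asymmetric is indicated by the presence of /sys/devices/system/cpu/aarch32_el0, as the parameter description states.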

Documentation/arm64/asymmetric-32bit.rst

Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
+======================
+Asymmetric 32-bit SoCs
+======================
+
+Author: Will Deacon <[email protected]>
+
+This document describes the impact of asymmetric 32-bit SoCs on the
+execution of 32-bit (``AArch32``) applications.
+
+Date: 2021-05-17
+
+Introduction
+============
+
+Some Armv9 SoCs suffer from a big.LITTLE misfeature where only a subset
+of the CPUs are capable of executing 32-bit user applications. On such
+a system, Linux by default treats the asymmetry as a "mismatch" and
+disables support for both the ``PER_LINUX32`` personality and
+``execve(2)`` of 32-bit ELF binaries, with the latter returning
+``-ENOEXEC``. If the mismatch is detected during late onlining of a
+64-bit-only CPU, then the onlining operation fails and the new CPU is
+unavailable for scheduling.
+
+Surprisingly, these SoCs have been produced with the intention of
+running legacy 32-bit binaries. Unsurprisingly, that doesn't work very
+well with the default behaviour of Linux.
+
+It seems inevitable that future SoCs will drop 32-bit support
+altogether, so if you're stuck in the unenviable position of needing to
+run 32-bit code on one of these transitionary platforms then you would
+be wise to consider alternatives such as recompilation, emulation or
+retirement. If neither of those options are practical, then read on.
+
+Enabling kernel support
+=======================
+
+Since the kernel support is not completely transparent to userspace,
+allowing 32-bit tasks to run on an asymmetric 32-bit system requires an
+explicit "opt-in" and can be enabled by passing the
+``allow_mismatched_32bit_el0`` parameter on the kernel command-line.
+
+For the remainder of this document we will refer to an *asymmetric
+system* to mean an asymmetric 32-bit SoC running Linux with this kernel
+command-line option enabled.
+
+Userspace impact
+================
+
+32-bit tasks running on an asymmetric system behave in mostly the same
+way as on a homogeneous system, with a few key differences relating to
+CPU affinity.
+
+sysfs
+-----
+
+The subset of CPUs capable of running 32-bit tasks is described in
+``/sys/devices/system/cpu/aarch32_el0`` and is documented further in
+``Documentation/ABI/testing/sysfs-devices-system-cpu``.
+
+**Note:** CPUs are advertised by this file as they are detected and so
+late-onlining of 32-bit-capable CPUs can result in the file contents
+being modified by the kernel at runtime. Once advertised, CPUs are never
+removed from the file.
+
+``execve(2)``
+-------------
+
+On a homogeneous system, the CPU affinity of a task is preserved across
+``execve(2)``. This is not always possible on an asymmetric system,
+specifically when the new program being executed is 32-bit yet the
+affinity mask contains 64-bit-only CPUs. In this situation, the kernel
+determines the new affinity mask as follows:
+
+1. If the 32-bit-capable subset of the affinity mask is not empty,
+then the affinity is restricted to that subset and the old affinity
+mask is saved. This saved mask is inherited over ``fork(2)`` and
+preserved across ``execve(2)`` of 32-bit programs.
+
+**Note:** This step does not apply to ``SCHED_DEADLINE`` tasks.
+See `SCHED_DEADLINE`_.
+
+2. Otherwise, the cpuset hierarchy of the task is walked until an
+ancestor is found containing at least one 32-bit-capable CPU. The
+affinity of the task is then changed to match the 32-bit-capable
+subset of the cpuset determined by the walk.
+
+3. On failure (i.e. out of memory), the affinity is changed to the set
+of all 32-bit-capable CPUs of which the kernel is aware.
+
+A subsequent ``execve(2)`` of a 64-bit program by the 32-bit task will
+invalidate the affinity mask saved in (1) and attempt to restore the CPU
+affinity of the task using the saved mask if it was previously valid.
+This restoration may fail due to intervening changes to the deadline
+policy or cpuset hierarchy, in which case the ``execve(2)`` continues
+with the affinity unchanged.
+
+Calls to ``sched_setaffinity(2)`` for a 32-bit task will consider only
+the 32-bit-capable CPUs of the requested affinity mask. On success, the
+affinity for the task is updated and any saved mask from a prior
+``execve(2)`` is invalidated.
+
+``SCHED_DEADLINE``
+------------------
+
+Explicit admission of a 32-bit deadline task to the default root domain
+(e.g. by calling ``sched_setattr(2)``) is rejected on an asymmetric
+32-bit system unless admission control is disabled by writing -1 to
+``/proc/sys/kernel/sched_rt_runtime_us``.
+
+``execve(2)`` of a 32-bit program from a 64-bit deadline task will
+return ``-ENOEXEC`` if the root domain for the task contains any
+64-bit-only CPUs and admission control is enabled. Concurrent offlining
+of 32-bit-capable CPUs may still necessitate the procedure described in
+`execve(2)`_, in which case step (1) is skipped and a warning is
+emitted on the console.
+
+**Note:** It is recommended that a set of 32-bit-capable CPUs are placed
+into a separate root domain if ``SCHED_DEADLINE`` is to be used with
+32-bit tasks on an asymmetric system. Failure to do so is likely to
+result in missed deadlines.
+
+Cpusets
+-------
+
+The affinity of a 32-bit task on an asymmetric system may include CPUs
+that are not explicitly allowed by the cpuset to which it is attached.
+This can occur as a result of the following two situations:
+
+- A 64-bit task attached to a cpuset which allows only 64-bit CPUs
+executes a 32-bit program.
+
+- All of the 32-bit-capable CPUs allowed by a cpuset containing a
+32-bit task are offlined.
+
+In both of these cases, the new affinity is calculated according to step
+(2) of the process described in `execve(2)`_ and the cpuset hierarchy is
+unchanged irrespective of the cgroup version.
+
+CPU hotplug
+-----------
+
+On an asymmetric system, the first detected 32-bit-capable CPU is
+prevented from being offlined by userspace and any such attempt will
+return ``-EPERM``. Note that suspend is still permitted even if the
+primary CPU (i.e. CPU 0) is 64-bit-only.
+
+KVM
+---
+
+Although KVM will not advertise 32-bit EL0 support to any vCPUs on an
+asymmetric system, a broken guest at EL1 could still attempt to execute
+32-bit code at EL0. In this case, an exit from a vCPU thread in 32-bit
+mode will return to host userspace with an ``exit_reason`` of
+``KVM_EXIT_FAIL_ENTRY`` and will remain non-runnable until successfully
+re-initialised by a subsequent ``KVM_ARM_VCPU_INIT`` operation.
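
Reader's note (not part of the commit): the three-step affinity selection described in the ``execve(2)`` section of the new document can be modelled with plain sets. The Python sketch below is a toy model for reasoning about the rules, not the kernel implementation. The function name and its arguments are invented for illustration, and because the model cannot run out of memory, step (3) is reached here only when the cpuset walk finds no 32-bit-capable CPU.

```python
def compat_exec_affinity(affinity, cpuset_chain, aarch32_cpus, deadline=False):
    """Model the affinity chosen when a task execve()s a 32-bit program.

    affinity:     set of CPUs in the task's current affinity mask
    cpuset_chain: list of CPU sets, the task's own cpuset first, then ancestors
    aarch32_cpus: set of all 32-bit-capable CPUs known to the kernel
    deadline:     True for a SCHED_DEADLINE task (step 1 does not apply)

    Returns (new_affinity, saved_mask). saved_mask is None unless step 1 ran.
    """
    # No 64-bit-only CPUs in the mask: affinity is preserved as usual.
    if affinity <= aarch32_cpus:
        return set(affinity), None

    # Step 1: restrict to the 32-bit-capable subset and save the old mask.
    subset = affinity & aarch32_cpus
    if subset and not deadline:
        return subset, set(affinity)

    # Step 2: walk the cpuset hierarchy until a 32-bit-capable CPU is found.
    for cpus in cpuset_chain:
        capable = cpus & aarch32_cpus
        if capable:
            return capable, None

    # Step 3 (modelled fallback): all known 32-bit-capable CPUs.
    return set(aarch32_cpus), None
```

For example, with 32-bit-capable CPUs {0, 1, 2, 3}, a task whose mask is {2, 3, 4, 5} ends up on {2, 3} with {2, 3, 4, 5} saved, while a task confined to {4, 5} is moved to the capable subset of the nearest ancestor cpuset that has one.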

Documentation/arm64/index.rst

Lines changed: 1 addition & 0 deletions
@@ -10,6 +10,7 @@ ARM64 Architecture
 acpi_object_usage
 amu
 arm-acpi
+asymmetric-32bit
 booting
 cpu-feature-registers
 elf_hwcaps

Documentation/bpf/libbpf/libbpf_naming_convention.rst

Lines changed: 2 additions & 2 deletions
@@ -108,7 +108,7 @@ This bump in ABI version is at most once per kernel development cycle.
 
 For example, if current state of ``libbpf.map`` is:
 
-.. code-block:: c
+.. code-block:: none
 
 LIBBPF_0.0.1 {
 global:
@@ -121,7 +121,7 @@ For example, if current state of ``libbpf.map`` is:
 , and a new symbol ``bpf_func_c`` is being introduced, then
 ``libbpf.map`` should be changed like this:
 
-.. code-block:: c
+.. code-block:: none
 
 LIBBPF_0.0.1 {
 global:

Documentation/devicetree/bindings/iio/st,st-sensors.yaml

Lines changed: 0 additions & 41 deletions
@@ -152,47 +152,6 @@ allOf:
 maxItems: 1
 st,drdy-int-pin: false
 
-- if:
-properties:
-compatible:
-enum:
-# Two intertial interrupts i.e. accelerometer/gyro interrupts
-- st,h3lis331dl-accel
-- st,l3g4200d-gyro
-- st,l3g4is-gyro
-- st,l3gd20-gyro
-- st,l3gd20h-gyro
-- st,lis2de12
-- st,lis2dw12
-- st,lis2hh12
-- st,lis2dh12-accel
-- st,lis331dl-accel
-- st,lis331dlh-accel
-- st,lis3de
-- st,lis3dh-accel
-- st,lis3dhh
-- st,lis3mdl-magn
-- st,lng2dm-accel
-- st,lps331ap-press
-- st,lsm303agr-accel
-- st,lsm303dlh-accel
-- st,lsm303dlhc-accel
-- st,lsm303dlm-accel
-- st,lsm330-accel
-- st,lsm330-gyro
-- st,lsm330d-accel
-- st,lsm330d-gyro
-- st,lsm330dl-accel
-- st,lsm330dl-gyro
-- st,lsm330dlc-accel
-- st,lsm330dlc-gyro
-- st,lsm9ds0-gyro
-- st,lsm9ds1-magn
-then:
-properties:
-interrupts:
-maxItems: 2
-
 required:
 - compatible
 - reg

Documentation/gpu/rfc/i915_gem_lmem.rst

Lines changed: 0 additions & 109 deletions
@@ -18,114 +18,5 @@ real, with all the uAPI bits is:
 * Route shmem backend over to TTM SYSTEM for discrete
 * TTM purgeable object support
 * Move i915 buddy allocator over to TTM
-* MMAP ioctl mode(see `I915 MMAP`_)
-* SET/GET ioctl caching(see `I915 SET/GET CACHING`_)
 * Send RFC(with mesa-dev on cc) for final sign off on the uAPI
 * Add pciid for DG1 and turn on uAPI for real
-
-New object placement and region query uAPI
-==========================================
-Starting from DG1 we need to give userspace the ability to allocate buffers from
-device local-memory. Currently the driver supports gem_create, which can place
-buffers in system memory via shmem, and the usual assortment of other
-interfaces, like dumb buffers and userptr.
-
-To support this new capability, while also providing a uAPI which will work
-beyond just DG1, we propose to offer three new bits of uAPI:
-
-DRM_I915_QUERY_MEMORY_REGIONS
------------------------------
-New query ID which allows userspace to discover the list of supported memory
-regions(like system-memory and local-memory) for a given device. We identify
-each region with a class and instance pair, which should be unique. The class
-here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
-like DG1.
-
-Side note: The class/instance design is borrowed from our existing engine uAPI,
-where we describe every physical engine in terms of its class, and the
-particular instance, since we can have more than one per class.
-
-In the future we also want to expose more information which can further
-describe the capabilities of a region.
-
-.. kernel-doc:: include/uapi/drm/i915_drm.h
-:functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions
-
-GEM_CREATE_EXT
---------------
-New ioctl which is basically just gem_create but now allows userspace to provide
-a chain of possible extensions. Note that if we don't provide any extensions and
-set flags=0 then we get the exact same behaviour as gem_create.
-
-Side note: We also need to support PXP[1] in the near future, which is also
-applicable to integrated platforms, and adds its own gem_create_ext extension,
-which basically lets userspace mark a buffer as "protected".
-
-.. kernel-doc:: include/uapi/drm/i915_drm.h
-:functions: drm_i915_gem_create_ext
-
-I915_GEM_CREATE_EXT_MEMORY_REGIONS
-----------------------------------
-Implemented as an extension for gem_create_ext, we would now allow userspace to
-optionally provide an immutable list of preferred placements at creation time,
-in priority order, for a given buffer object. For the placements we expect
-them each to use the class/instance encoding, as per the output of the regions
-query. Having the list in priority order will be useful in the future when
-placing an object, say during eviction.
-
-.. kernel-doc:: include/uapi/drm/i915_drm.h
-:functions: drm_i915_gem_create_ext_memory_regions
-
-One fair criticism here is that this seems a little over-engineered[2]. If we
-just consider DG1 then yes, a simple gem_create.flags or something is totally
-all that's needed to tell the kernel to allocate the buffer in local-memory or
-whatever. However looking to the future we need uAPI which can also support
-upcoming Xe HP multi-tile architecture in a sane way, where there can be
-multiple local-memory instances for a given device, and so using both class and
-instance in our uAPI to describe regions is desirable, although specifically
-for DG1 it's uninteresting, since we only have a single local-memory instance.
-
-Existing uAPI issues
-====================
-Some potential issues we still need to resolve.
-
-I915 MMAP
----------
-In i915 there are multiple ways to MMAP GEM object, including mapping the same
-object using different mapping types(WC vs WB), i.e multiple active mmaps per
-object. TTM expects one MMAP at most for the lifetime of the object. If it
-turns out that we have to backpedal here, there might be some potential
-userspace fallout.
-
-I915 SET/GET CACHING
---------------------
-In i915 we have set/get_caching ioctl. TTM doesn't let us to change this, but
-DG1 doesn't support non-snooped pcie transactions, so we can just always
-allocate as WB for smem-only buffers. If/when our hw gains support for
-non-snooped pcie transactions then we must fix this mode at allocation time as
-a new GEM extension.
-
-This is related to the mmap problem, because in general (meaning, when we're
-not running on intel cpus) the cpu mmap must not, ever, be inconsistent with
-allocation mode.
-
-Possible idea is to let the kernel picks the mmap mode for userspace from the
-following table:
-
-smem-only: WB. Userspace does not need to call clflush.
-
-smem+lmem: We only ever allow a single mode, so simply allocate this as uncached
-memory, and always give userspace a WC mapping. GPU still does snooped access
-here(assuming we can't turn it off like on DG1), which is a bit inefficient.
-
-lmem only: always WC
-
-This means on discrete you only get a single mmap mode, all others must be
-rejected. That's probably going to be a new default mode or something like
-that.
-
-Links
-=====
-[1] https://patchwork.freedesktop.org/series/86798/
-
-[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791

Documentation/i2c/index.rst

Lines changed: 1 addition & 0 deletions
@@ -17,6 +17,7 @@ Introduction
 busses/index
 i2c-topology
 muxes/i2c-mux-gpio
+i2c-sysfs
 
 Writing device drivers
 ======================
