Commit d7a9590

ChangSeokBae authored and suryasaimadhu committed
Documentation/x86: Add documentation for using dynamic XSTATE features
Explain how dynamic XSTATE features can be enabled via the
architecture-specific prctl() along with dynamic sigframe size and
first use trap handling.

Fix:

  Documentation/x86/xstate.rst:15: WARNING: Title underline too short.

as reported by Stephen Rothwell <[email protected]>

Originally-by: Thomas Gleixner <[email protected]>
Signed-off-by: Chang S. Bae <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
1 parent 868c250 commit d7a9590

2 files changed, +66 -0 lines changed
Documentation/x86/index.rst

Lines changed: 1 addition & 0 deletions
@@ -37,3 +37,4 @@ x86-specific Documentation
   sgx
   features
   elf_auxvec
+  xstate

Documentation/x86/xstate.rst

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
Using XSTATE features in user space applications
================================================

The x86 architecture supports floating-point extensions which are
enumerated via CPUID. Applications consult CPUID and use XGETBV to
evaluate which features have been enabled by the kernel in XCR0.

Up to AVX-512 and PKRU states, these features are automatically enabled by
the kernel if available. Features like AMX TILE_DATA (XSTATE component 18)
are enabled in XCR0 as well, but the first use of a related instruction is
trapped by the kernel because by default the required large XSTATE buffers
are not allocated automatically.
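
As an illustration of the XCR0 check described above, the following is a minimal sketch rather than a definitive implementation. The helper names `xfeature_enabled()` and `read_xcr0()` and the macro `XFEATURE_XTILE_DATA` are chosen for this example and do not come from the document. It assumes the CPU and kernel already report OSXSAVE in CPUID, which a real application should check before executing XGETBV.

```c
#include <stdbool.h>
#include <stdint.h>

/* XSTATE component number of AMX TILE_DATA, as stated in the text.
 * The macro name is local to this sketch. */
#define XFEATURE_XTILE_DATA 18

/* Return true if bit 'nr' is set in an XCR0-style feature bitmap. */
static inline bool xfeature_enabled(uint64_t xcr0, unsigned int nr)
{
	return nr < 64 && ((xcr0 >> nr) & 1);
}

#if defined(__x86_64__) || defined(__i386__)
/* Read XCR0 with XGETBV (ECX = 0).
 * Assumption: CPUID.1:ECX.OSXSAVE is set, otherwise XGETBV faults. */
static inline uint64_t read_xcr0(void)
{
	uint32_t lo, hi;

	__asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0));
	return ((uint64_t)hi << 32) | lo;
}
#endif
```

A caller would typically evaluate `xfeature_enabled(read_xcr0(), XFEATURE_XTILE_DATA)`. Note that a set bit only says the component is enabled in XCR0; for dynamically enabled features the first use is still trapped until permission has been requested.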

Using dynamically enabled XSTATE features in user space applications
--------------------------------------------------------------------

The kernel provides an arch_prctl(2) based mechanism for applications to
request the usage of such features. The arch_prctl(2) options related to
this are:

-ARCH_GET_XCOMP_SUPP

 arch_prctl(ARCH_GET_XCOMP_SUPP, &features);

 ARCH_GET_XCOMP_SUPP stores the supported features in userspace storage of
 type uint64_t. The second argument is a pointer to that storage.

-ARCH_GET_XCOMP_PERM

 arch_prctl(ARCH_GET_XCOMP_PERM, &features);

 ARCH_GET_XCOMP_PERM stores the features for which the userspace process
 has permission in userspace storage of type uint64_t. The second argument
 is a pointer to that storage.

-ARCH_REQ_XCOMP_PERM

 arch_prctl(ARCH_REQ_XCOMP_PERM, feature_nr);

 ARCH_REQ_XCOMP_PERM requests permission for a dynamically enabled
 feature or a feature set. A feature set can be mapped to a facility, e.g.
 AMX, and can require one or more XSTATE components to be enabled.

 The feature argument is the number of the highest XSTATE component which
 is required for a facility to work.
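
The three options above can be combined as in the following sketch, which is illustrative only. The option values are the ones defined in the kernel's `asm/prctl.h` and are repeated here in case the installed headers predate them. The helper names `xcomp_prctl()` and `request_amx_permission()` and the macro `XFEATURE_XTILE_DATA` are hypothetical names for this example. The raw `syscall()` form is used on the assumption that the C library does not declare an `arch_prctl()` wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values from the kernel's asm/prctl.h, repeated for older headers. */
#ifndef ARCH_GET_XCOMP_SUPP
#define ARCH_GET_XCOMP_SUPP 0x1021
#define ARCH_GET_XCOMP_PERM 0x1022
#define ARCH_REQ_XCOMP_PERM 0x1023
#endif

/* AMX TILE_DATA is XSTATE component 18; the macro name is local here. */
#define XFEATURE_XTILE_DATA 18

/* Thin wrapper: returns 0 on success, -1 with errno set on failure. */
static inline long xcomp_prctl(int option, unsigned long arg)
{
#ifdef SYS_arch_prctl
	return syscall(SYS_arch_prctl, option, arg);
#else
	(void)option;
	(void)arg;
	errno = ENOSYS;	/* not an x86 build */
	return -1;
#endif
}

/* Ask for permission to use AMX. TILE_DATA is passed because it is the
 * highest XSTATE component the facility needs. Returns 0 or -1. */
static inline int request_amx_permission(void)
{
	const uint64_t mask = 1ULL << XFEATURE_XTILE_DATA;
	uint64_t supported = 0, permitted = 0;

	if (xcomp_prctl(ARCH_GET_XCOMP_SUPP, (unsigned long)&supported))
		return -1;
	if (!(supported & mask)) {
		errno = EOPNOTSUPP;
		return -1;
	}
	if (xcomp_prctl(ARCH_REQ_XCOMP_PERM, XFEATURE_XTILE_DATA))
		return -1;
	if (xcomp_prctl(ARCH_GET_XCOMP_PERM, (unsigned long)&permitted))
		return -1;
	return (permitted & mask) ? 0 : -1;
}
```

An application would call `request_amx_permission()` once, early in `main()`, before any thread executes an AMX instruction, and fall back to a non-AMX code path if it returns -1.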

When requesting permission for a feature, the kernel checks the
availability. The kernel ensures that sigaltstacks in the process's tasks
are large enough to accommodate the resulting large signal frame. It
enforces this both during ARCH_REQ_XCOMP_PERM and during any subsequent
sigaltstack(2) calls. If an installed sigaltstack is smaller than the
resulting sigframe size, ARCH_REQ_XCOMP_PERM results in -ENOTSUPP. Also,
sigaltstack(2) results in -ENOMEM if the requested altstack is too small
for the permitted features.
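
One way to avoid both failures is to size the alternate stack at run time instead of relying on a compile-time constant. The sketch below assumes that the `AT_MINSIGSTKSZ` auxiliary vector entry reported by the kernel covers the enlarged signal frame; that entry is not described in this document, so treat it as an assumption to verify. The helper names and the `ALTSTACK_HEADROOM` value are hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/auxv.h>

/* Auxiliary vector key for the minimum signal stack size on x86;
 * defined here in case the installed headers lack it. */
#ifndef AT_MINSIGSTKSZ
#define AT_MINSIGSTKSZ 51
#endif

/* Extra room for the handler's own stack usage. Arbitrary for this sketch. */
#define ALTSTACK_HEADROOM (16 * 1024)

/* Pick a size at least as large as both SIGSTKSZ and the kernel's
 * reported minimum. getauxval() returns 0 if the entry is absent. */
static inline size_t altstack_size(void)
{
	size_t size = (size_t)SIGSTKSZ;
	size_t min = (size_t)getauxval(AT_MINSIGSTKSZ);

	if (min > size)
		size = min;
	return size + ALTSTACK_HEADROOM;
}

/* Install an alternate signal stack for the calling thread.
 * Returns 0 on success, -1 on failure with errno set. */
static inline int install_altstack(void)
{
	stack_t ss = { 0 };

	ss.ss_size = altstack_size();
	ss.ss_sp = malloc(ss.ss_size);
	if (!ss.ss_sp)
		return -1;
	if (sigaltstack(&ss, NULL)) {
		free(ss.ss_sp);
		return -1;
	}
	return 0;
}
```

Installing the alternate stack this way before requesting permission keeps the request from failing because of an undersized stack. Each thread that uses sigaltstack(2) needs its own call.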

Permission, when granted, is valid per process. Permissions are inherited
on fork(2) and cleared on exec(3).

The first use of an instruction related to a dynamically enabled feature is
trapped by the kernel. The trap handler checks whether the process has
permission to use the feature. If the process has no permission then the
kernel sends SIGILL to the application. If the process has permission then
the handler allocates a larger xstate buffer for the task so the large
state can be context switched. In the unlikely case that the allocation
fails, the kernel sends SIGSEGV.
