
Commit 30d0255

hansendc authored and KAGA-KOKO committed
x86/fpu: Optimize out sigframe xfeatures when in init state
tl;dr: AMX state is ~8k. Signal frames can have space for this ~8k and
each signal entry writes out all 8k even if it is zeros. Skip writing
zeros for AMX to speed up signal delivery by about 4% overall when AMX
is in its init state.

This is a user-visible change to the sigframe ABI.

== Hardware XSAVE Background ==

XSAVE state components may be tracked by the processor as being in
their initial configuration. Software can detect which features are in
this configuration by looking at the XSTATE_BV field in an XSAVE buffer
or with the XGETBV(1) instruction.

Both the XSAVE and XSAVEOPT instructions enumerate features as being in
the initial configuration via the XSTATE_BV field in the XSAVE header.
However, XSAVEOPT declines to actually write features in their initial
configuration to the buffer. XSAVE writes the feature unconditionally,
regardless of whether it is in the initial configuration or not.

Basically, XSAVE users never need to inspect XSTATE_BV to determine if
the feature has been written to the buffer. XSAVEOPT users *do* need to
inspect XSTATE_BV. They might also need to clear out the buffer if they
want to make an isolated change to the state, like modifying one
register.

== Software Signal / XSAVE Background ==

Signal frames have historically been written with XSAVE itself. Each
state is written in its entirety, regardless of being in its initial
configuration.

In other words, the signal frame ABI uses the XSAVE behavior, not the
XSAVEOPT behavior.

== Problem ==

This means that any application which has acquired permission to use
AMX via ARCH_REQ_XCOMP_PERM will write 8k of state to the signal frame.
This 8k write will occur even when AMX was in its initial configuration
and software *knows* this because of XSTATE_BV.

This problem also exists to a lesser degree with AVX-512 and its 2k of
state. However, AVX-512 use does not require ARCH_REQ_XCOMP_PERM and is
more likely to have existing users which would be impacted by any
change in behavior.

== Solution ==

Stop writing out AMX xfeatures which are in their initial state to the
signal frame. This effectively makes the signal frame XSAVE buffer look
as if it were written with a combination of XSAVEOPT and XSAVE
behavior. Userspace which handles XSAVEOPT-style buffers should be able
to handle this naturally.

For now, include only the AMX xfeatures: XTILE and XTILEDATA in this
new behavior. These require new ABI to use anyway, which makes their
users very unlikely to be broken. This XSAVEOPT-like behavior should be
expected for all future dynamic xfeatures. It may also be extended to
legacy features like AVX-512 in the future.

Only attempt this optimization on systems with dynamic features.
Disable dynamic feature support (XFD) if XGETBV1 is unavailable by
adding a CPUID dependency.

This has been measured to reduce the *overall* cycle cost of signal
delivery by about 4%.

Fixes: 2308ee5 ("x86/fpu/amx: Enable the AMX feature in 64-bit mode")
Signed-off-by: Dave Hansen <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: "Chang S. Bae" <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
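The background section notes that consumers of XSAVEOPT-style buffers may need to clear a component before making an isolated change to it. A minimal userspace sketch of that step follows. The helper name and its parameters are hypothetical: the component offset and size are passed in by the caller (on real hardware they are enumerated by CPUID leaf 0xD), and the sketch assumes the component's init state is all zeros, which holds for AMX tile state but not for every component (x87 control words, for example).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The XSAVE header follows the 512-byte legacy region; XSTATE_BV is its first field. */
#define XSAVE_HDR_OFFSET 512

/*
 * Hypothetical helper: change `len` bytes at `rel_off` inside one state
 * component of a standard-format XSAVE buffer that may have been written
 * XSAVEOPT-style. If the component's XSTATE_BV bit is clear, its bytes in
 * the buffer are stale, so reset them to the (assumed all-zero) init state
 * and mark the component present before applying the change.
 */
static void xsave_modify_component(uint8_t *buf, unsigned int xfeature_nr,
				   size_t comp_off, size_t comp_size,
				   size_t rel_off, const void *src, size_t len)
{
	uint64_t bv;

	/* memcpy avoids alignment assumptions about the buffer */
	memcpy(&bv, buf + XSAVE_HDR_OFFSET, sizeof(bv));

	if (!(bv & (1ULL << xfeature_nr))) {
		memset(buf + comp_off, 0, comp_size);
		bv |= 1ULL << xfeature_nr;
		memcpy(buf + XSAVE_HDR_OFFSET, &bv, sizeof(bv));
	}

	memcpy(buf + comp_off + rel_off, src, len);
}
```

Without the clearing step, setting the XSTATE_BV bit would expose whatever bytes happened to be in the skipped region as if they were register state.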
1 parent 879dbe9 commit 30d0255

5 files changed: +64 -2 lines changed

Documentation/x86/xstate.rst

Lines changed: 9 additions & 0 deletions
@@ -63,3 +63,12 @@ kernel sends SIGILL to the application. If the process has permission then
 the handler allocates a larger xstate buffer for the task so the large
 state can be context switched. In the unlikely cases that the allocation
 fails, the kernel sends SIGSEGV.
+
+Dynamic features in signal frames
+---------------------------------
+
+Dynamically enabled features are not written to the signal frame upon signal
+entry if the feature is in its initial configuration. This differs from
+non-dynamic features which are always written regardless of their
+configuration. Signal handlers can examine the XSAVE buffer's XSTATE_BV
+field to determine if a feature was written.
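The XSTATE_BV check described in this documentation hunk can be sketched in userspace C as follows. The helper names are hypothetical. The sketch relies on two architectural facts: the XSAVE header starts at byte 512 of the buffer with XSTATE_BV as its first 8 bytes, and the AMX components are numbered 17 (tile configuration) and 18 (tile data).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* The XSAVE header follows the 512-byte legacy FXSAVE region. */
#define XSAVE_HDR_OFFSET	512

/* AMX state-component numbers (bit positions in XSTATE_BV). */
#define XFEATURE_XTILE_CFG	17
#define XFEATURE_XTILE_DATA	18

/* Hypothetical helper: read XSTATE_BV from a raw XSAVE buffer. */
static uint64_t xsave_xstate_bv(const void *xsave_buf)
{
	uint64_t bv;

	/* memcpy avoids alignment and aliasing assumptions about the buffer */
	memcpy(&bv, (const uint8_t *)xsave_buf + XSAVE_HDR_OFFSET, sizeof(bv));
	return bv;
}

/* Hypothetical helper: was this component actually written to the buffer? */
static bool xfeature_was_written(const void *xsave_buf, unsigned int xfeature_nr)
{
	return (xsave_xstate_bv(xsave_buf) >> xfeature_nr) & 1;
}
```

In an `SA_SIGINFO` handler on x86-64 Linux, the buffer would typically be reached through `((ucontext_t *)ctx)->uc_mcontext.fpregs`, assuming the frame carries an XSAVE-format area. A handler should treat the tile data region as meaningful only when `xfeature_was_written(buf, XFEATURE_XTILE_DATA)` returns true.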

arch/x86/include/asm/fpu/xcr.h

Lines changed: 12 additions & 0 deletions
@@ -3,6 +3,7 @@
 #define _ASM_X86_FPU_XCR_H

 #define XCR_XFEATURE_ENABLED_MASK	0x00000000
+#define XCR_XFEATURE_IN_USE_MASK	0x00000001

 static inline u64 xgetbv(u32 index)
 {
@@ -20,4 +21,15 @@ static inline void xsetbv(u32 index, u64 value)
 	asm volatile("xsetbv" :: "a" (eax), "d" (edx), "c" (index));
 }

+/*
+ * Return a mask of xfeatures which are currently being tracked
+ * by the processor as in use, i.e. not in the initial configuration.
+ *
+ * Callers should check X86_FEATURE_XGETBV1.
+ */
+static inline u64 xfeatures_in_use(void)
+{
+	return xgetbv(XCR_XFEATURE_IN_USE_MASK);
+}
+
 #endif /* _ASM_X86_FPU_XCR_H */

arch/x86/include/asm/fpu/xstate.h

Lines changed: 7 additions & 0 deletions
@@ -92,6 +92,13 @@
 #define XFEATURE_MASK_FPSTATE	(XFEATURE_MASK_USER_RESTORE | \
 				 XFEATURE_MASK_SUPERVISOR_SUPPORTED)

+/*
+ * Features in this mask have space allocated in the signal frame, but may not
+ * have that space initialized when the feature is in its init state.
+ */
+#define XFEATURE_MASK_SIGFRAME_INITOPT	(XFEATURE_MASK_XTILE | \
+					 XFEATURE_MASK_USER_DYNAMIC)
+
 extern u64 xstate_fx_sw_bytes[USER_XSTATE_FX_SW_WORDS];

 extern void __init update_regset_xstate_info(unsigned int size,

arch/x86/kernel/cpu/cpuid-deps.c

Lines changed: 1 addition & 0 deletions
@@ -76,6 +76,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_SGX1,	X86_FEATURE_SGX },
 	{ X86_FEATURE_SGX2,	X86_FEATURE_SGX1 },
 	{ X86_FEATURE_XFD,	X86_FEATURE_XSAVES },
+	{ X86_FEATURE_XFD,	X86_FEATURE_XGETBV1 },
 	{ X86_FEATURE_AMX_TILE,	X86_FEATURE_XFD },
 	{}
 };
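The effect of the added dependency can be illustrated with a simplified model: when a feature is unavailable, everything that depends on it, directly or transitively, is disabled too. The following is only a sketch under that assumption, not the kernel's implementation; the enum, table, and function names are hypothetical stand-ins.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the X86_FEATURE_* bits involved in this hunk. */
enum model_feature { F_XSAVES, F_XGETBV1, F_XFD, F_AMX_TILE, F_COUNT };

struct model_dep {
	enum model_feature feature;	/* this feature ... */
	enum model_feature depends;	/* ... requires this one */
};

static const struct model_dep model_deps[] = {
	{ F_XFD,	F_XSAVES },
	{ F_XFD,	F_XGETBV1 },	/* the dependency added by this commit */
	{ F_AMX_TILE,	F_XFD },
};

/* Clear a feature, then keep clearing dependents until nothing changes. */
static void model_clear_feature(bool enabled[F_COUNT], enum model_feature f)
{
	bool changed;

	enabled[f] = false;
	do {
		changed = false;
		for (size_t i = 0; i < sizeof(model_deps) / sizeof(model_deps[0]); i++) {
			const struct model_dep *d = &model_deps[i];

			if (!enabled[d->depends] && enabled[d->feature]) {
				enabled[d->feature] = false;
				changed = true;
			}
		}
	} while (changed);
}
```

In this model, clearing `F_XGETBV1` also clears `F_XFD` and, through it, `F_AMX_TILE`, which mirrors the commit message's intent of disabling dynamic feature support when XGETBV1 is unavailable.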

arch/x86/kernel/fpu/xstate.h

Lines changed: 35 additions & 2 deletions
@@ -4,6 +4,7 @@

 #include <asm/cpufeature.h>
 #include <asm/fpu/xstate.h>
+#include <asm/fpu/xcr.h>

 #ifdef CONFIG_X86_64
 DECLARE_PER_CPU(u64, xfd_state);
@@ -198,6 +199,32 @@ static inline void os_xrstor_supervisor(struct fpstate *fpstate)
 	XSTATE_XRESTORE(&fpstate->regs.xsave, lmask, hmask);
 }

+/*
+ * XSAVE itself always writes all requested xfeatures.  Removing features
+ * from the request bitmap reduces the features which are written.
+ * Generate a mask of features which must be written to a sigframe.  The
+ * unset features can be optimized away and not written.
+ *
+ * This optimization is user-visible.  Only use for states where
+ * uninitialized sigframe contents are tolerable, like dynamic features.
+ *
+ * Users of buffers produced with this optimization must check XSTATE_BV
+ * to determine which features have been optimized out.
+ */
+static inline u64 xfeatures_need_sigframe_write(void)
+{
+	u64 xfeaures_to_write;
+
+	/* In-use features must be written: */
+	xfeaures_to_write = xfeatures_in_use();
+
+	/* Also write all non-optimizable sigframe features: */
+	xfeaures_to_write |= XFEATURE_MASK_USER_SUPPORTED &
+			     ~XFEATURE_MASK_SIGFRAME_INITOPT;
+
+	return xfeaures_to_write;
+}
+
 /*
  * Save xstate to user space xsave area.
  *
@@ -220,10 +247,16 @@ static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 	 */
 	struct fpstate *fpstate = current->thread.fpu.fpstate;
 	u64 mask = fpstate->user_xfeatures;
-	u32 lmask = mask;
-	u32 hmask = mask >> 32;
+	u32 lmask;
+	u32 hmask;
 	int err;

+	/* Optimize away writing unnecessary xfeatures: */
+	if (fpu_state_size_dynamic())
+		mask &= xfeatures_need_sigframe_write();
+
+	lmask = mask;
+	hmask = mask >> 32;
 	xfd_validate_state(fpstate, mask, false);

 	stac();
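The mask arithmetic in this file can be worked through with a small standalone model. The `MODEL_*` masks below are illustrative stand-ins for `XFEATURE_MASK_USER_SUPPORTED` and `XFEATURE_MASK_SIGFRAME_INITOPT`, reduced to five components, and the in-use mask is passed in as a parameter rather than read with XGETBV(1). Bit positions follow the architectural state-component numbering.

```c
#include <stdint.h>

/* Architectural state-component bits used in this model. */
#define M_FP		(1ULL << 0)
#define M_SSE		(1ULL << 1)
#define M_YMM		(1ULL << 2)
#define M_XTILE_CFG	(1ULL << 17)
#define M_XTILE_DATA	(1ULL << 18)

/* Illustrative stand-ins for the kernel masks, not their real values. */
#define MODEL_USER_SUPPORTED	(M_FP | M_SSE | M_YMM | M_XTILE_CFG | M_XTILE_DATA)
#define MODEL_SIGFRAME_INITOPT	(M_XTILE_CFG | M_XTILE_DATA)

/*
 * Model of the request bitmap handed to XSAVE for the sigframe:
 * in-use features are always written, non-optimizable features are
 * written regardless of state, and the result is limited to the
 * features the task exposes to userspace.
 */
static uint64_t model_sigframe_write_mask(uint64_t user_xfeatures, uint64_t in_use)
{
	uint64_t to_write = in_use;

	to_write |= MODEL_USER_SUPPORTED & ~MODEL_SIGFRAME_INITOPT;

	return user_xfeatures & to_write;
}
```

With all five components enabled and only FP and SSE in use, the model returns `M_FP | M_SSE | M_YMM`: YMM is still written despite being in its init state, while both AMX components are dropped from the request. Once tile data is in use, its bit reappears in the mask.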
