Commit 0266e7c

mm: Add guard pages around a shadow stack.
The x86 Control-flow Enforcement Technology (CET) feature includes a new
type of memory called shadow stack. This shadow stack memory has some
unusual properties, which require some core mm changes to function
properly.

The architecture of shadow stack constrains the ability of userspace to
move the shadow stack pointer (SSP) in order to prevent corrupting or
switching to other shadow stacks. The RSTORSSP instruction can move the
SSP to different shadow stacks, but it requires a specially placed token
in order to do this. However, the architecture does not prevent
incrementing the stack pointer to wander onto an adjacent shadow stack.
To prevent this in software, enforce guard pages at the beginning of
shadow stack VMAs, such that there will always be a gap between adjacent
shadow stacks. Make the gap big enough so that no userspace SSP changing
operations (besides RSTORSSP) can move the SSP from one stack to the
next.

The SSP can be incremented or decremented by CALL, RET and INCSSP. CALL
and RET can move the SSP by a maximum of 8 bytes, at which point the
shadow stack would be accessed.

The INCSSP instruction can also increment the shadow stack pointer. It
is the shadow stack analog of an instruction like:

	addq $0x80, %rsp

However, there is one important difference between an ADD on %rsp and
INCSSP. In addition to modifying SSP, INCSSP also reads from the memory
of the first and last elements that were "popped". It can be thought of
as acting like this:

	READ_ONCE(ssp);       // read+discard top element on stack
	ssp += nr_to_pop * 8; // move the shadow stack
	READ_ONCE(ssp-8);     // read+discard last popped stack element

The maximum distance INCSSP can move the SSP is 2040 bytes, before it
would read the memory. Therefore, a single page gap will be enough to
prevent any operation from shifting the SSP to an adjacent stack, since
it would have to land in the gap at least once, causing a fault.

This could be accomplished by using VM_GROWSDOWN, but this has a
downside: it would allow shadow stacks to grow, which is unneeded and
adds a strange difference to how most regular stacks work.

In the maple tree code, there is some logic for retrying the unmapped
area search if a guard gap is violated. This retry should happen for
shadow stack guard gap violations as well. This logic currently only
checks VM_GROWSDOWN for start gaps. Since shadow stacks also have a
start gap, create a new define, VM_STARTGAP_FLAGS, to hold all the VM
flag bits that have start gaps, and make mmap use it.

Co-developed-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Yu-cheng Yu <[email protected]>
Signed-off-by: Rick Edgecombe <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Reviewed-by: Borislav Petkov (AMD) <[email protected]>
Reviewed-by: Kees Cook <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
Acked-by: Mike Rapoport (IBM) <[email protected]>
Tested-by: Pengfei Xu <[email protected]>
Tested-by: John Allen <[email protected]>
Tested-by: Kees Cook <[email protected]>
Link: https://lore.kernel.org/all/20230613001108.3040476-17-rick.p.edgecombe%40intel.com
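For concreteness, the 2040-byte bound and the single-page-gap argument can be checked with a small standalone C model of INCSSP's read/move/read behavior. This is an illustrative sketch, not kernel code: the constants follow the commit text and the x86 ISA (INCSSPQ's element count is an 8-bit value, shadow stack slots are 8 bytes, pages are 4 KiB), and all names here are made up.

#include <assert.h>
#include <stdio.h>

#define PAGE_SIZE   4096UL
#define SLOT        8UL    /* shadow stack entries are 8 bytes on x86-64 */
#define INCSSP_MAX  255UL  /* INCSSPQ takes an 8-bit element count */

/* Model of INCSSPQ: record the two addresses the instruction reads. */
static void incssp_model(unsigned long ssp, unsigned long nr_to_pop,
			 unsigned long *first_read, unsigned long *last_read)
{
	*first_read = ssp;                          /* read+discard top element */
	*last_read  = ssp + (nr_to_pop - 1) * SLOT; /* read+discard last popped */
}

int main(void)
{
	unsigned long first, last;
	unsigned long max_move = INCSSP_MAX * SLOT; /* 2040 bytes */

	incssp_model(0, INCSSP_MAX, &first, &last);
	printf("max SSP move %lu bytes; reads at +%lu and +%lu\n",
	       max_move, first, last);

	/* A PAGE_SIZE guard gap cannot be stepped over: any single move is
	 * shorter than the gap, so crossing it means reading inside it. */
	assert(max_move < PAGE_SIZE);
	return 0;
}

This runs on any host; it only demonstrates the arithmetic, not the hardware behavior.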

File tree: include/linux/mm.h, mm/mmap.c

2 files changed: 50 additions, 8 deletions

include/linux/mm.h

Lines changed: 48 additions & 6 deletions
@@ -342,7 +342,36 @@ extern unsigned int kobjsize(const void *objp);
 #endif /* CONFIG_ARCH_HAS_PKEYS */
 
 #ifdef CONFIG_X86_USER_SHADOW_STACK
-# define VM_SHADOW_STACK	VM_HIGH_ARCH_5 /* Should not be set with VM_SHARED */
+/*
+ * This flag should not be set with VM_SHARED because of a lack of
+ * support in core mm. It will also get a guard page. This helps
+ * userspace protect itself from attacks. The reasoning is as follows:
+ *
+ * The shadow stack pointer (SSP) is moved by CALL, RET, and INCSSPQ. The
+ * INCSSP instruction can increment the shadow stack pointer. It is the
+ * shadow stack analog of an instruction like:
+ *
+ *   addq $0x80, %rsp
+ *
+ * However, there is one important difference between an ADD on %rsp
+ * and INCSSP. In addition to modifying SSP, INCSSP also reads from the
+ * memory of the first and last elements that were "popped". It can be
+ * thought of as acting like this:
+ *
+ * READ_ONCE(ssp);       // read+discard top element on stack
+ * ssp += nr_to_pop * 8; // move the shadow stack
+ * READ_ONCE(ssp-8);     // read+discard last popped stack element
+ *
+ * The maximum distance INCSSP can move the SSP is 2040 bytes, before
+ * it would read the memory. Therefore a single page gap will be enough
+ * to prevent any operation from shifting the SSP to an adjacent stack,
+ * since it would have to land in the gap at least once, causing a
+ * fault.
+ *
+ * Prevent using INCSSP to move the SSP between shadow stacks by
+ * having a PAGE_SIZE guard gap.
+ */
+# define VM_SHADOW_STACK	VM_HIGH_ARCH_5
 #else
 # define VM_SHADOW_STACK	VM_NONE
 #endif
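The resulting guard gap can be observed from userspace. Below is a hedged probe, assuming an x86-64 kernel built with CONFIG_X86_USER_SHADOW_STACK and a process that already has shadow stack enabled (e.g. via a CET-enabled glibc); the __NR_map_shadow_stack number and the SHADOW_STACK_SET_TOKEN bit come from the same patch series and should be treated as assumptions here:

#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __NR_map_shadow_stack
#define __NR_map_shadow_stack 453	/* x86-64; assumption */
#endif
#define SHADOW_STACK_SET_TOKEN 0x1UL	/* place a restore token; assumption */

int main(void)
{
	unsigned long size = 0x10000;
	long a = syscall(__NR_map_shadow_stack, 0UL, size, SHADOW_STACK_SET_TOKEN);
	long b = syscall(__NR_map_shadow_stack, 0UL, size, SHADOW_STACK_SET_TOKEN);

	if (a == -1 || b == -1) {
		/* Fails (e.g. EOPNOTSUPP) unless shadow stack is enabled. */
		perror("map_shadow_stack");
		return 1;
	}
	printf("shadow stack A: %#lx-%#lx\n", (unsigned long)a, (unsigned long)a + size);
	printf("shadow stack B: %#lx-%#lx\n", (unsigned long)b, (unsigned long)b + size);
	/* With this patch, the kernel's unmapped-area search keeps at least
	 * one page between a new mapping and an existing shadow stack. */
	return 0;
}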
@@ -405,6 +434,8 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_STACK_DEFAULT_FLAGS VM_DATA_DEFAULT_FLAGS
 #endif
 
+#define VM_STARTGAP_FLAGS (VM_GROWSDOWN | VM_SHADOW_STACK)
+
 #ifdef CONFIG_STACK_GROWSUP
 #define VM_STACK	VM_GROWSUP
 #define VM_STACK_EARLY	VM_GROWSDOWN
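The value of the combined define is that callers can test "does this VMA have a start gap?" with a single mask instead of checking each flag. A minimal sketch with illustrative bit values (the real VM_* definitions live in include/linux/mm.h; VM_SHADOW_STACK is actually VM_HIGH_ARCH_5):

#include <stdbool.h>
#include <stdio.h>

/* Illustrative values only, not the kernel's. */
#define VM_GROWSDOWN      0x00000100UL
#define VM_SHADOW_STACK   0x00800000UL
#define VM_STARTGAP_FLAGS (VM_GROWSDOWN | VM_SHADOW_STACK)

static bool has_start_gap(unsigned long vm_flags)
{
	return vm_flags & VM_STARTGAP_FLAGS;	/* one test covers both */
}

int main(void)
{
	printf("growsdown: %d, shadow stack: %d, plain: %d\n",
	       has_start_gap(VM_GROWSDOWN),
	       has_start_gap(VM_SHADOW_STACK),
	       has_start_gap(0));
	return 0;
}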
@@ -3273,15 +3304,26 @@ struct vm_area_struct *vma_lookup(struct mm_struct *mm, unsigned long addr)
 	return mtree_load(&mm->mm_mt, addr);
 }
 
+static inline unsigned long stack_guard_start_gap(struct vm_area_struct *vma)
+{
+	if (vma->vm_flags & VM_GROWSDOWN)
+		return stack_guard_gap;
+
+	/* See reasoning around the VM_SHADOW_STACK definition */
+	if (vma->vm_flags & VM_SHADOW_STACK)
+		return PAGE_SIZE;
+
+	return 0;
+}
+
 static inline unsigned long vm_start_gap(struct vm_area_struct *vma)
 {
+	unsigned long gap = stack_guard_start_gap(vma);
 	unsigned long vm_start = vma->vm_start;
 
-	if (vma->vm_flags & VM_GROWSDOWN) {
-		vm_start -= stack_guard_gap;
-		if (vm_start > vma->vm_start)
-			vm_start = 0;
-	}
+	vm_start -= gap;
+	if (vm_start > vma->vm_start)
+		vm_start = 0;
 	return vm_start;
 }
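One subtlety in the reworked vm_start_gap() is the clamp: vm_start is unsigned, so subtracting a gap from a VMA that starts below the gap wraps around to a huge value, which the comparison then detects and clamps to 0. A standalone sketch of the same pattern (names are illustrative, not kernel code):

#include <assert.h>
#include <stdio.h>

/* Same clamp pattern as vm_start_gap(): if the subtraction wraps below
 * zero, the result becomes larger than the starting value and is clamped. */
static unsigned long start_minus_gap(unsigned long vm_start, unsigned long gap)
{
	unsigned long start = vm_start - gap;

	if (start > vm_start)	/* unsigned underflow occurred */
		start = 0;
	return start;
}

int main(void)
{
	assert(start_minus_gap(0x10000, 0x1000) == 0xf000); /* normal case */
	assert(start_minus_gap(0x800, 0x1000) == 0);	    /* would go below 0: clamp */
	printf("clamp ok\n");
	return 0;
}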
mm/mmap.c

Lines changed: 2 additions & 2 deletions
@@ -1572,7 +1572,7 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
 	gap = mas.index;
 	gap += (info->align_offset - gap) & info->align_mask;
 	tmp = mas_next(&mas, ULONG_MAX);
-	if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) { /* Avoid prev check if possible */
+	if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if possible */
 		if (vm_start_gap(tmp) < gap + length - 1) {
 			low_limit = tmp->vm_end;
 			mas_reset(&mas);
@@ -1624,7 +1624,7 @@ static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info)
 	gap -= (gap - info->align_offset) & info->align_mask;
 	gap_end = mas.last;
 	tmp = mas_next(&mas, ULONG_MAX);
-	if (tmp && (tmp->vm_flags & VM_GROWSDOWN)) { /* Avoid prev check if possible */
+	if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if possible */
 		if (vm_start_gap(tmp) <= gap_end) {
 			high_limit = vm_start_gap(tmp);
 			mas_reset(&mas);
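Conceptually, the change above makes the retry path fire for any VMA with a start gap: if the candidate gap runs into the next VMA's vm_start_gap(), the search window is narrowed past that VMA and the walk restarts. A toy model of the bottom-up case, using a sorted array in place of the maple tree (all names and the collision check are simplifications):

#include <stdio.h>

/* Toy VMA list, sorted by start; start_gap models vm_start_gap(). */
struct vma { unsigned long start, end, start_gap; };

static const struct vma vmas[] = {
	{ 0x0,     0x20000, 0 },
	{ 0x21000, 0x30000, 0x1000 },	/* e.g. a shadow stack: one-page gap */
};
#define NVMA (sizeof(vmas) / sizeof(vmas[0]))

/* First free range of 'length' bytes at or above low_limit, ignoring
 * start gaps -- the analog of the raw maple-tree gap search. */
static unsigned long raw_search(unsigned long low_limit, unsigned long length)
{
	unsigned long gap = low_limit;

	for (unsigned i = 0; i < NVMA; i++) {
		if (gap + length <= vmas[i].start)
			return gap;
		if (vmas[i].end > gap)
			gap = vmas[i].end;
	}
	return gap;
}

/* The retry loop: if the chosen gap runs into the next VMA's start gap,
 * move low_limit past that VMA and search again. */
static unsigned long find_gap(unsigned long length)
{
	unsigned long low_limit = 0;

	for (;;) {
		unsigned long gap = raw_search(low_limit, length);
		unsigned i;

		for (i = 0; i < NVMA && vmas[i].start <= gap; i++)
			;	/* find the VMA just above the candidate gap */
		if (i < NVMA &&
		    vmas[i].start - vmas[i].start_gap < gap + length) {
			low_limit = vmas[i].end;	/* gap invaded: retry above */
			continue;
		}
		return gap;
	}
}

int main(void)
{
	/* 0x1000 bytes would fit at 0x20000, but the shadow stack at
	 * 0x21000 claims a one-page start gap, so the search retries
	 * and places the mapping at 0x30000 instead. */
	printf("placed at %#lx\n", find_gap(0x1000));
	return 0;
}

In the real code the same effect is achieved by low_limit = tmp->vm_end; mas_reset(&mas); and retrying, as in the hunk above.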
