Conversation

@Camsyn Camsyn (Contributor) commented Jun 18, 2025

This commit changes the interval check for shadow/meta addresses from inclusive-inclusive ( $[\mathrm{start}, \mathrm{end}]$ ) to inclusive-exclusive ( $[\mathrm{start}, \mathrm{end})$ ) to resolve the ambiguity of the endpoint address. This also aligns the logic with the check for IsAppMem (which is already inclusive-exclusive), ensuring consistent behavior across all memory classifications.

  1. The IsShadowMem and IsMetaMem checks previously used an inclusive-inclusive interval, i.e., $[\mathrm{start}, \mathrm{end}]$, which could lead to a boundary address being incorrectly classified as both Shadow and Meta memory, e.g., 0x3000_0000_0000 in Mapping48AddressSpace.

    • Moreover, even when Shadow does not border Meta, ShadowMem::end cannot be considered a legal shadow address, because TSan protects the gap, i.e., ProtectRange(ShadowEnd(), MetaShadowBeg());
  2. ShadowMem/MetaMem addresses are derived from AppMem using an affine-like transformation (* factor + bias). This transformation includes two extra modifications: high- and low-order bits are masked out, and for Shadow memory an optional XOR may be applied to prevent conflicts with certain AppMem regions.

    • Given that all AppMem regions are defined as inclusive-exclusive intervals, $[\mathrm{start}, \mathrm{end})$, the resulting Shadow/Meta regions should logically also be inclusive-exclusive.

Note: This change is purely for improving code consistency and should have no functional impact. In practice, the exact endpoint addresses of the Shadow/Meta regions are generally not reached.
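
To make the boundary ambiguity concrete, here is a small standalone sketch. The constants are illustrative only: the Shadow-end == Meta-begin boundary at 0x3000_0000_0000 is taken from the example above, while the other bounds are assumed and are not copied from tsan_platform.h. With an inclusive end, the boundary address satisfies both predicates; with an exclusive end, it belongs to exactly one region.

```cpp
#include <cstdint>
#include <cstdio>

using uptr = uint64_t;

constexpr uptr kShadowBeg     = 0x010000000000ull;  // assumed
constexpr uptr kShadowEnd     = 0x300000000000ull;  // borders Meta, per the example above
constexpr uptr kMetaShadowBeg = 0x300000000000ull;
constexpr uptr kMetaShadowEnd = 0x340000000000ull;  // assumed

// Old checks: inclusive end, so the boundary satisfies both predicates.
bool IsShadowMemOld(uptr mem) { return mem >= kShadowBeg && mem <= kShadowEnd; }
bool IsMetaMemOld(uptr mem) { return mem >= kMetaShadowBeg && mem <= kMetaShadowEnd; }

// New checks: exclusive end, so the boundary belongs to exactly one region.
bool IsShadowMemNew(uptr mem) { return mem >= kShadowBeg && mem < kShadowEnd; }
bool IsMetaMemNew(uptr mem) { return mem >= kMetaShadowBeg && mem < kMetaShadowEnd; }

int main() {
  uptr boundary = 0x300000000000ull;
  printf("old: shadow=%d meta=%d\n", IsShadowMemOld(boundary), IsMetaMemOld(boundary));  // 1 1
  printf("new: shadow=%d meta=%d\n", IsShadowMemNew(boundary), IsMetaMemNew(boundary));  // 0 1
}
```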

@llvmbot (Member) commented Jun 18, 2025

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Kunqiu Chen (Camsyn)



Full diff: https://github.com/llvm/llvm-project/pull/144647.diff

2 Files Affected:

  • (modified) compiler-rt/lib/tsan/rtl/tsan_platform.h (+2-2)
  • (modified) compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp (+3-3)
diff --git a/compiler-rt/lib/tsan/rtl/tsan_platform.h b/compiler-rt/lib/tsan/rtl/tsan_platform.h
index 354f6da6a64a1..ada594bc11fc7 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_platform.h
+++ b/compiler-rt/lib/tsan/rtl/tsan_platform.h
@@ -931,7 +931,7 @@ bool IsAppMem(uptr mem) { return SelectMapping<IsAppMemImpl>(mem); }
 struct IsShadowMemImpl {
   template <typename Mapping>
   static bool Apply(uptr mem) {
-    return mem >= Mapping::kShadowBeg && mem <= Mapping::kShadowEnd;
+    return mem >= Mapping::kShadowBeg && mem < Mapping::kShadowEnd;
   }
 };
 
@@ -943,7 +943,7 @@ bool IsShadowMem(RawShadow *p) {
 struct IsMetaMemImpl {
   template <typename Mapping>
   static bool Apply(uptr mem) {
-    return mem >= Mapping::kMetaShadowBeg && mem <= Mapping::kMetaShadowEnd;
+    return mem >= Mapping::kMetaShadowBeg && mem < Mapping::kMetaShadowEnd;
   }
 };
 
diff --git a/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp b/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
index cf07686d968dc..b0ce5680c95c1 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_rtl_access.cpp
@@ -525,7 +525,7 @@ ALWAYS_INLINE USED void UnalignedMemoryAccess(ThreadState* thr, uptr pc,
 void ShadowSet(RawShadow* p, RawShadow* end, RawShadow v) {
   DCHECK_LE(p, end);
   DCHECK(IsShadowMem(p));
-  DCHECK(IsShadowMem(end));
+  DCHECK(p == end || IsShadowMem(end - 1));
   UNUSED const uptr kAlign = kShadowCnt * kShadowSize;
   DCHECK_EQ(reinterpret_cast<uptr>(p) % kAlign, 0);
   DCHECK_EQ(reinterpret_cast<uptr>(end) % kAlign, 0);
@@ -675,7 +675,7 @@ void MemoryAccessRangeT(ThreadState* thr, uptr pc, uptr addr, uptr size) {
     Printf("Access to non app mem start: %p\n", (void*)addr);
     DCHECK(IsAppMem(addr));
   }
-  if (!IsAppMem(addr + size - 1)) {
+  if (size > 0 && !IsAppMem(addr + size - 1)) {
     Printf("Access to non app mem end: %p\n", (void*)(addr + size - 1));
     DCHECK(IsAppMem(addr + size - 1));
   }
@@ -686,7 +686,7 @@ void MemoryAccessRangeT(ThreadState* thr, uptr pc, uptr addr, uptr size) {
 
   RawShadow* shadow_mem_end = reinterpret_cast<RawShadow*>(
       reinterpret_cast<uptr>(shadow_mem) + size * kShadowMultiplier - 1);
-  if (!IsShadowMem(shadow_mem_end)) {
+  if (size > 0 && !IsShadowMem(shadow_mem_end)) {
     Printf("Bad shadow end addr: %p (%p)\n", shadow_mem_end,
            (void*)(addr + size - 1));
     Printf(

@melver melver requested a review from dvyukov June 18, 2025 07:29
     DCHECK(IsAppMem(addr));
   }
-  if (!IsAppMem(addr + size - 1)) {
+  if (size > 0 && !IsAppMem(addr + size - 1)) {
Contributor

This is called from MemoryAccessRange, which checks for size == 0, and since size is a uptr, size will always be > 0 here. Therefore this check is pointless.

@Camsyn Camsyn (Contributor, Author) commented Jun 18, 2025

I understand this. But MemoryAccessRangeT does not guarantee size != 0 in its context -- is it possible for MemoryAccessRangeT to be called in the future without checking for size == 0?

It just so happens that these checks are wrapped in the SANITIZER_DEBUG macro, so I think adding this check is better than adding an assertion DCHECK(size != 0).

Contributor

> But MemoryAccessRangeT does not guarantee size != 0 in its context -- is it possible for MemoryAccessRangeT to be called in the future without checking for size == 0?

We don't know.

But it's clear that the current invariant here is size != 0, so why not add a DCHECK and catch it if that assumption changes in the future?

@Camsyn Camsyn (Contributor, Author)


> But it's clear that the current invariant here is size != 0, so why not add a DCHECK and catch it if that assumption changes in the future?

Done.
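
A minimal, standalone sketch of the direction settled on above: assert the size != 0 invariant once, instead of adding size > 0 guards to the individual checks. The DCHECK macros, the IsAppMem stand-in, and the app-range constants here are toy stand-ins for illustration, not the merged TSan code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

using uptr = uint64_t;

// Toy stand-ins for the sanitizer's DCHECK macros.
#define DCHECK(x) assert(x)
#define DCHECK_NE(a, b) assert((a) != (b))

// Hypothetical app range, standing in for the real mapping.
constexpr uptr kAppBeg = 0x7e8000000000ull;
constexpr uptr kAppEnd = 0x800000000000ull;
static bool IsAppMem(uptr mem) { return mem >= kAppBeg && mem < kAppEnd; }

// Pattern discussed above: the size != 0 invariant is asserted up front,
// so the end-of-range check may safely use addr + size - 1.
static void MemoryAccessRangeChecks(uptr addr, uptr size) {
  DCHECK_NE(size, 0);                 // callers currently filter out size == 0
  DCHECK(IsAppMem(addr));             // range start must be app memory
  DCHECK(IsAppMem(addr + size - 1));  // safe: size != 0 was asserted above
}

int main() {
  MemoryAccessRangeChecks(kAppBeg, 4096);  // passes
  // MemoryAccessRangeChecks(kAppBeg, 0);  // would trip DCHECK_NE(size, 0)
  printf("range checks passed\n");
}
```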

@melver melver requested a review from thurstond June 18, 2025 08:52
@thurstond thurstond (Contributor) left a comment


Thanks for the patch!

@thurstond (Contributor) commented

@hctim FYI Meta is becoming exclusive

@vitalybuka vitalybuka (Collaborator) left a comment


Unfortunately, most of the sanitizer code uses an inclusive end,
which is annoying given that the typical C++ convention is an exclusive end.

LGTM to change this incrementally, especially if there is an issue.

@Camsyn Camsyn merged commit 681db06 into llvm:main Jun 19, 2025
7 checks passed
@Camsyn Camsyn (Contributor, Author) commented Jun 20, 2025

This change introduces some test failures in mmap_lots.cpp.

After analysis, I believe it is due to the following:

A large mmap -> MemoryRangeSet -> only the first and last pages of the shadow are set.

static void MemoryRangeSet(uptr addr, uptr size, RawShadow val) {
  if (size == 0)
    return;
  RawShadow* begin = MemToShadow(addr);
  // end can equal ShadowMem::end
  RawShadow* end = begin + size / kShadowCell * kShadowCnt;
  ...
  RawShadow* mid2 = RoundDown(end, kPageSize);
  // What if mid2 == end == ShadowMem::end?
  ShadowSet(mid2, end, val);

void ShadowSet(RawShadow* p, RawShadow* end, RawShadow v) {
  DCHECK_LE(p, end);      // passes: p == end
  DCHECK(IsShadowMem(p)); // fails: p == ShadowMem::end is no longer a shadow address
  DCHECK(p == end || IsShadowMem(end - 1));

I think this actually exposes something of a flaw in the current implementation.
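
A toy, self-contained reproduction of that corner case (small made-up constants, not the real TSan mapping): when the shadow range of the mapped region ends exactly at the page-aligned end of the shadow region, RoundDown leaves end unchanged, so ShadowSet is entered with p == end == ShadowMem::end and the now-exclusive IsShadowMem(p) check fails.

```cpp
#include <cstdint>
#include <cstdio>

using uptr = uint64_t;

// Made-up, page-aligned shadow bounds for illustration only.
constexpr uptr kPageSize  = 4096;
constexpr uptr kShadowBeg = 0x100000;
constexpr uptr kShadowEnd = 0x200000;

static uptr RoundDown(uptr x, uptr align) { return x & ~(align - 1); }

// The post-patch, inclusive-exclusive check.
static bool IsShadowMem(uptr mem) { return mem >= kShadowBeg && mem < kShadowEnd; }

int main() {
  uptr end  = kShadowEnd;                 // shadow end of a huge mmap'ed range
  uptr mid2 = RoundDown(end, kPageSize);  // == end, since end is page-aligned
  printf("mid2 == end: %d\n", (int)(mid2 == end));            // 1
  printf("IsShadowMem(mid2): %d\n", (int)IsShadowMem(mid2));  // 0 -> DCHECK(IsShadowMem(p)) fires
}
```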


Possible fixes:

  1. Do not allow p == end in ShadowSet:
    • This is reasonable because both callers of ShadowSet (i.e., MemoryRangeSet and __tsan_java_move) actually guarantee size > 0.
    • Although MemoryRangeSet's dedicated handling for overly large mmaps bypasses this limit, we can append a guard that checks for size > 0, i.e., if (mid2 < end) ShadowSet(mid2, end, val); (see the sketch after this list)

    //   DCHECK_LE(p, end);
    DCHECK_LT(p, end);

  2. Modify the assertion in ShadowSet as follows:

    // DCHECK(IsShadowMem(p));
    DCHECK(p == end || IsShadowMem(p));

  3. Have ShadowSet simply return when p == end:

    void ShadowSet(RawShadow* p, RawShadow* end, RawShadow v) {
      if (p == end) return;
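
For concreteness, a tiny standalone sketch of option 1 (toy DCHECK, made-up page size and addresses; not the actual TSan code): the if (mid2 < end) guard keeps an empty range from ever reaching ShadowSet, which can then assert the strict inequality.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

using uptr = uint64_t;

#define DCHECK_LT(a, b) assert((a) < (b))

constexpr uptr kPageSize = 4096;

static uptr RoundDown(uptr x, uptr align) { return x & ~(align - 1); }

// Stand-in for ShadowSet under option 1: callers must pass a non-empty range.
static void ShadowSetToy(uptr p, uptr end) {
  DCHECK_LT(p, end);  // was DCHECK_LE(p, end)
  printf("setting shadow [%#llx, %#llx)\n",
         (unsigned long long)p, (unsigned long long)end);
}

// Stand-in for the tail of MemoryRangeSet's large-range path.
static void SetLastPage(uptr end) {
  uptr mid2 = RoundDown(end, kPageSize);
  if (mid2 < end)             // guard: skip the call when the range is empty
    ShadowSetToy(mid2, end);
}

int main() {
  SetLastPage(0x201100);  // mid2 < end: the last partial page's shadow is set
  SetLastPage(0x200000);  // mid2 == end: guard skips the call, no DCHECK fires
}
```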
