Skip to content

Commit 9971844

Browse files
[analyzer] Canonicalize the Decls of FieldRegions (#156668)
When calculating the offset of a FieldRegion, we need to find out which field index the given field refers to. Previously, if for some reason the field was not found, then the `Idx` passed to `Layout.getFieldOffset` was out of bounds and caused undefined behavior when dereferenced an out of bounds element in `ASTVector::FieldOffsets::operator[]`, which asserts this in debug builds, but exposes the undefined behavior in release builds. In this patch, I refactored how we enumerate the fields, and gracefully handle the scenario where the field is not found. That case is still bad, but at least it should not expose the undefined behavior in release builds, and should assert earlier in debug builds than before. The motivational case was transformed into a regression test, that would fail if no canonicalization would happen when creating a FieldRegion. This was reduced from a production crash. In the test case, due to how modules work, there would be multiple copies of the same template specialization in the AST. This could lead into inconsistent state when the FieldRegion's Decl was different to the RecordDecl's field - because one referred to the first and the other to the second. This made `calculateOffset` fail to compute the field index, triggering the undefined behavior in production. While this inconsistency gets fixed now, I think the assertion is still warranted to avoid potential undefined behavior in release builds. CPP-6691,CPP-6849 Co-authored-by: Marco Borgeaud <[email protected]>
1 parent a23a5b0 commit 9971844

File tree

3 files changed

+55
-13
lines changed

3 files changed

+55
-13
lines changed

clang/include/clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1548,8 +1548,8 @@ class MemRegionManager {
15481548
/// a specified FieldDecl. 'superRegion' corresponds to the containing
15491549
/// memory region (which typically represents the memory representing
15501550
/// a structure or class).
1551-
const FieldRegion *getFieldRegion(const FieldDecl *fd,
1552-
const SubRegion* superRegion);
1551+
const FieldRegion *getFieldRegion(const FieldDecl *FD,
1552+
const SubRegion *SuperRegion);
15531553

15541554
const FieldRegion *getFieldRegionWithSuper(const FieldRegion *FR,
15551555
const SubRegion *superRegion) {

clang/lib/StaticAnalyzer/Core/MemRegion.cpp

Lines changed: 18 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1268,10 +1268,10 @@ const SymbolicRegion *MemRegionManager::getSymbolicHeapRegion(SymbolRef Sym) {
12681268
return getSubRegion<SymbolicRegion>(Sym, getHeapRegion());
12691269
}
12701270

1271-
const FieldRegion*
1272-
MemRegionManager::getFieldRegion(const FieldDecl *d,
1273-
const SubRegion* superRegion){
1274-
return getSubRegion<FieldRegion>(d, superRegion);
1271+
const FieldRegion *
1272+
MemRegionManager::getFieldRegion(const FieldDecl *FD,
1273+
const SubRegion *SuperRegion) {
1274+
return getSubRegion<FieldRegion>(FD->getCanonicalDecl(), SuperRegion);
12751275
}
12761276

12771277
const ObjCIvarRegion*
@@ -1704,16 +1704,23 @@ static RegionOffset calculateOffset(const MemRegion *R) {
17041704
if (SymbolicOffsetBase)
17051705
continue;
17061706

1707-
// Get the field number.
1708-
unsigned idx = 0;
1709-
for (RecordDecl::field_iterator FI = RD->field_begin(),
1710-
FE = RD->field_end(); FI != FE; ++FI, ++idx) {
1711-
if (FR->getDecl() == *FI)
1712-
break;
1707+
assert(FR->getDecl()->getCanonicalDecl() == FR->getDecl());
1708+
auto MaybeFieldIdx = [FR, RD]() -> std::optional<unsigned> {
1709+
for (auto [Idx, Field] : llvm::enumerate(RD->fields())) {
1710+
if (FR->getDecl() == Field->getCanonicalDecl())
1711+
return Idx;
1712+
}
1713+
return std::nullopt;
1714+
}();
1715+
1716+
if (!MaybeFieldIdx.has_value()) {
1717+
assert(false && "Field not found");
1718+
goto Finish; // Invalid offset.
17131719
}
1720+
17141721
const ASTRecordLayout &Layout = R->getContext().getASTRecordLayout(RD);
17151722
// This is offset in bits.
1716-
Offset += Layout.getFieldOffset(idx);
1723+
Offset += Layout.getFieldOffset(MaybeFieldIdx.value());
17171724
break;
17181725
}
17191726
}
Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,35 @@
1+
// RUN: rm -rf %t
2+
// RUN: mkdir -p %t
3+
// RUN: split-file %s %t
4+
//
5+
// DEFINE: %{common-flags}= -std=c++20 -I %t -fprebuilt-module-path=%t
6+
//
7+
// RUN: %clang_cc1 %{common-flags} %t/other.cppm -emit-module-interface -o %t/other.pcm
8+
// RUN: %clang_analyze_cc1 -analyzer-checker=core %{common-flags} %t/entry.cppm -verify
9+
10+
11+
//--- MyStruct.h
12+
template <typename T> struct MyStruct {
13+
T data = 0;
14+
};
15+
template struct MyStruct<int>; // Explicit template instantiation.
16+
17+
//--- other.cppm
18+
module;
19+
#include "MyStruct.h"
20+
export module other;
21+
static void implicit_instantiate_MyStruct() {
22+
MyStruct<int> var;
23+
(void)var;
24+
}
25+
26+
//--- entry.cppm
27+
// expected-no-diagnostics
28+
module;
29+
#include "MyStruct.h"
30+
module other;
31+
32+
void entry_point() {
33+
MyStruct<int> var; // no-crash
34+
(void)var;
35+
}

0 commit comments

Comments
 (0)