Skip to content
Draft
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
51 changes: 9 additions & 42 deletions clang/lib/StaticAnalyzer/Checkers/ArrayBoundChecker.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -478,17 +478,6 @@ static Messages getNonTaintMsgs(const ASTContext &ACtx,
std::string(Buf)};
}

static Messages getTaintMsgs(const MemSpaceRegion *Space,
const SubRegion *Region, const char *OffsetName,
bool AlsoMentionUnderflow) {
std::string RegName = getRegionName(Space, Region);
return {formatv("Potential out of bound access to {0} with tainted {1}",
RegName, OffsetName),
formatv("Access of {0} with a tainted {1} that may be {2}too large",
RegName, OffsetName,
AlsoMentionUnderflow ? "negative or " : "")};
}

const NoteTag *StateUpdateReporter::createNoteTag(CheckerContext &C) const {
// Don't create a note tag if we didn't assume anything:
if (!AssumedNonNegative && !AssumedUpperBound)
Expand Down Expand Up @@ -686,37 +675,15 @@ void ArrayBoundChecker::performCheck(const Expr *E, CheckerContext &C) const {
C.addTransition(ExceedsUpperBound, SUR.createNoteTag(C));
return;
}

BadOffsetKind Problem = AlsoMentionUnderflow
? BadOffsetKind::Indeterminate
: BadOffsetKind::Overflowing;
Messages Msgs =
getNonTaintMsgs(C.getASTContext(), Space, Reg, ByteOffset,
*KnownSize, Location, Problem);
reportOOB(C, ExceedsUpperBound, Msgs, ByteOffset, KnownSize);
return;
}
// ...and it can be valid as well...
if (isTainted(State, ByteOffset)) {
// ...but it's tainted, so report an error.

// Diagnostic detail: saying "tainted offset" is always correct, but
// the common case is that 'idx' is tainted in 'arr[idx]' and then it's
// nicer to say "tainted index".
const char *OffsetName = "offset";
if (const auto *ASE = dyn_cast<ArraySubscriptExpr>(E))
if (isTainted(State, ASE->getIdx(), C.getLocationContext()))
OffsetName = "index";

Messages Msgs =
getTaintMsgs(Space, Reg, OffsetName, AlsoMentionUnderflow);
reportOOB(C, ExceedsUpperBound, Msgs, ByteOffset, KnownSize,
/*IsTaintBug=*/true);
return;
}
// ...and it isn't tainted, so the checker will (optimistically) assume
// that the offset is in bounds and mention this in the note tag.
SUR.recordUpperBoundAssumption(*KnownSize);

BadOffsetKind Problem = AlsoMentionUnderflow
? BadOffsetKind::Indeterminate
: BadOffsetKind::Overflowing;
Messages Msgs = getNonTaintMsgs(C.getASTContext(), Space, Reg, ByteOffset,
*KnownSize, Location, Problem);
reportOOB(C, ExceedsUpperBound, Msgs, ByteOffset, KnownSize);
return;
}

// Actually update the state. The "if" only fails in the extremely unlikely
Expand Down Expand Up @@ -758,7 +725,7 @@ void ArrayBoundChecker::reportOOB(CheckerContext &C, ProgramStateRef ErrorState,
std::optional<NonLoc> Extent,
bool IsTaintBug /*=false*/) const {

ExplodedNode *ErrorNode = C.generateErrorNode(ErrorState);
ExplodedNode *ErrorNode = C.generateNonFatalErrorNode(ErrorState);
if (!ErrorNode)
return;

Expand Down
84 changes: 46 additions & 38 deletions clang/test/Analysis/ArrayBound/assumption-reporting.c
Original file line number Diff line number Diff line change
Expand Up @@ -15,8 +15,8 @@ int TenElements[10];

int irrelevantAssumptions(int arg) {
int a = TenElements[arg];
// Here the analyzer assumes that `arg` is in bounds, but doesn't report this
// because `arg` is not interesting for the bug.
// expected-warning@-1 {{Out of bound access to memory around 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative or overflowing index, while it holds only 10 'int' elements}}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was working as intended. This checker isn't trying to force the user to check the bounds before every array access. Normally it's only telling the user that the OOB access definitely happens. Otherwise, even if the value isn't sufficiently constrained, this doesn't mean that it can actually take all the values it's constrained to. There may be implicit constraints that we aren't aware of.

And then there's the tainted case where we confirm that the value isn't merely unknown, it's actually read from an untrusted source. Like, when we've observed a literal read operation from user input or from network. In this case we're certain that there can be no implicit constraints. The constraints we've recorded so far are all we have. The value can truly be any value within those constraints. Then that's definitely a true positive.

So in this case we can't tell whether this was a false negative or a true negative. The checker is designed to stay silent here.

So if you're looking to catch this case, the right solution is to probably acknowledge the taint source. Eg., you may be interested in an "aggressive" mode that treats all function parameters as taint sources. Or, indeed, treat all unknown values as if they're tainted, which effectively means turning off the taint logic entirely like you did in your patch. But we gotta ask the user to explicitly opt in into such an aggressive mode because most of the in-the-wild users don't usually like this sort of stuff. For them the checker is working as intended.

int b = TenElements[13];
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at index 13, while it holds only 10 'int' elements}}
Expand All @@ -26,8 +26,10 @@ int irrelevantAssumptions(int arg) {

int assumingBoth(int arg) {
int a = TenElements[arg];
// expected-note@-1 {{Assuming index is non-negative and less than 10, the number of 'int' elements in 'TenElements'}}
int b = TenElements[arg]; // no additional note, we already assumed that 'arg' is in bounds
// expected-warning@-1 {{Out of bound access to memory around 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative or overflowing index, while it holds only 10 'int' elements}}
int b = TenElements[arg]; // expected-warning{{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-1 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
int c = TenElements[arg + 10];
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
Expand All @@ -39,9 +41,11 @@ int assumingBothPointerToMiddle(int arg) {
// will speak about the "byte offset" measured from the beginning of the TenElements.
int *p = TenElements + 2;
int a = p[arg];
// expected-note@-1 {{Assuming byte offset is non-negative and less than 40, the extent of 'TenElements'}}
// expected-warning@-1 {{Out of bound access to memory around 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative or overflowing index, while it holds only 10 'int' elements}}

int b = TenElements[arg]; // This is normal access, and only the lower bound is new.
int b = TenElements[arg]; // expected-warning{{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-1 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
int c = TenElements[arg + 10];
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
Expand All @@ -62,15 +66,16 @@ int assumingLower(int arg) {
}

int assumingUpper(int arg) {
// expected-note@+2 {{Assuming 'arg' is >= 0}}
// expected-note@+1 {{Taking false branch}}
// expected-note@+2 2{{Assuming 'arg' is >= 0}}
// expected-note@+1 2{{Taking false branch}}
if (arg < 0)
return 0;
int a = TenElements[arg];
// expected-note@-1 {{Assuming index is less than 10, the number of 'int' elements in 'TenElements'}}
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
int b = TenElements[arg - 10];
// expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative index}}
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
return a + b;
}

Expand All @@ -82,12 +87,13 @@ int assumingUpperIrrelevant(int arg) {
// filter out assumptions that are logically irrelevant but "touch"
// interesting symbols; eventually it would be good to add support for this.

// expected-note@+2 {{Assuming 'arg' is >= 0}}
// expected-note@+1 {{Taking false branch}}
// expected-note@+2 2{{Assuming 'arg' is >= 0}}
// expected-note@+1 2{{Taking false branch}}
if (arg < 0)
return 0;
int a = TenElements[arg];
// expected-note@-1 {{Assuming index is less than 10, the number of 'int' elements in 'TenElements'}}
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
int b = TenElements[arg + 10];
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
Expand All @@ -96,10 +102,11 @@ int assumingUpperIrrelevant(int arg) {

int assumingUpperUnsigned(unsigned arg) {
int a = TenElements[arg];
// expected-note@-1 {{Assuming index is less than 10, the number of 'int' elements in 'TenElements'}}
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
int b = TenElements[(int)arg - 10];
// expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative index}}
// expected-warning@-1 {{Out of bound access to memory around 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative or overflowing index, while it holds only 10 'int' elements}}
return a + b;
}

Expand All @@ -120,8 +127,10 @@ short assumingConvertedToCharP(int arg) {
// result type of the subscript operator.
char *cp = (char*)TenElements;
char a = cp[arg];
// expected-note@-1 {{Assuming index is non-negative and less than 40, the number of 'char' elements in 'TenElements'}}
char b = cp[arg]; // no additional note, we already assumed that 'arg' is in bounds
// expected-warning@-1 {{Out of bound access to memory around 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative or overflowing index, while it holds only 40 'char' elements}}
char b = cp[arg]; // expected-warning{{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-1 {{Access of 'TenElements' at an overflowing index, while it holds only 40 'char' elements}}
char c = cp[arg + 40];
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at an overflowing index, while it holds only 40 'char' elements}}
Expand All @@ -138,14 +147,14 @@ int assumingConvertedToIntP(struct foo f, int arg) {
// When indices are reported, the note will use the element type that's the
// result type of the subscript operator.
int a = ((int*)(f.a))[arg];
// expected-note@-1 {{Assuming index is non-negative and less than 2, the number of 'int' elements in 'f.a'}}
// However, if the extent of the memory region is not divisible by the
// element size, the checker measures the offset and extent in bytes.
// expected-warning@-1 {{Out of bound access to memory around 'f.a'}}
// expected-note@-2 {{Access of 'f.a' at a negative or overflowing index, while it holds only 2 'int' elements}}
int b = ((int*)(f.b))[arg];
// expected-note@-1 {{Assuming byte offset is less than 5, the extent of 'f.b'}}
// expected-warning@-1 {{Out of bound access to memory after the end of 'f.b'}}
// expected-note@-2 {{Access of 'f.b' at an overflowing byte offset, while it holds only 5 bytes}}
int c = TenElements[arg-2];
// expected-warning@-1 {{Out of bound access to memory preceding 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative index}}
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
return a + b + c;
}

Expand All @@ -154,16 +163,14 @@ int assumingPlainOffset(struct foo f, int arg) {
// shorter "offset" instead of "byte offset" when it's irrelevant that the
// offset is measured in bytes.

// expected-note@+2 {{Assuming 'arg' is < 2}}
// expected-note@+1 {{Taking false branch}}
// expected-note@+2 2{{Assuming 'arg' is < 2}}
// expected-note@+1 2{{Taking false branch}}
if (arg >= 2)
return 0;

int b = ((int*)(f.b))[arg];
// expected-note@-1 {{Assuming byte offset is non-negative and less than 5, the extent of 'f.b'}}
// FIXME: this should be {{Assuming offset is non-negative}}
// but the current simplification algorithm doesn't realize that arg <= 1
// implies that the byte offset arg*4 will be less than 5.
// expected-warning@-1 {{Out of bound access to memory around 'f.b'}}
// expected-note@-2 {{Access of 'f.b' at a negative or overflowing byte offset, while it holds only 5 bytes}}

int c = TenElements[arg+10];
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
Expand All @@ -181,13 +188,14 @@ int assumingExtent(int arg) {
int *mem = (int*)malloc(arg);

mem[12] = 123;
// expected-note@-1 {{Assuming index '12' is less than the number of 'int' elements in the heap area}}
// expected-warning@-1 {{Out of bound access to memory after the end of the heap area}}
// expected-note@-2 {{Access of 'int' element in the heap area at index 12}}

free(mem);

return TenElements[arg];
// expected-warning@-1 {{Out of bound access to memory after the end of 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at an overflowing index, while it holds only 10 'int' elements}}
// expected-warning@-1 {{Out of bound access to memory around 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative or overflowing index, while it holds only 10 'int' elements}}
}

int *extentInterestingness(int arg) {
Expand All @@ -196,7 +204,8 @@ int *extentInterestingness(int arg) {
int *mem = (int*)malloc(arg);

TenElements[arg] = 123;
// expected-note@-1 {{Assuming index is non-negative and less than 10, the number of 'int' elements in 'TenElements'}}
// expected-warning@-1 {{Out of bound access to memory around 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative or overflowing index, while it holds only 10 'int' elements}}

return &mem[12];
// expected-warning@-1 {{Out of bound access to memory after the end of the heap area}}
Expand All @@ -208,8 +217,7 @@ int triggeredByAnyReport(int arg) {
// not limited to ArrayBound reports but will appear on any bug report (that
// marks the relevant symbol as interesting).
TenElements[arg + 10] = 8;
// expected-note@-1 {{Assuming index is non-negative and less than 10, the number of 'int' elements in 'TenElements'}}
// expected-warning@-1 {{Out of bound access to memory around 'TenElements'}}
// expected-note@-2 {{Access of 'TenElements' at a negative or overflowing index, while it holds only 10 'int' elements}}
return 1024 >> arg;
// expected-warning@-1 {{Right operand is negative in right shift}}
// expected-note@-2 {{The result of right shift is undefined because the right operand is negative}}
}
Loading
Loading