Skip to content

Commit 8ae4b67

Browse files
committed
Improve the comments
This commit cleans up some typos thet were reported by the reviewers and tries to provide better explanations for some parts of the patch that turned out to be confusing.
1 parent 77fb227 commit 8ae4b67

File tree

4 files changed

+50
-33
lines changed

4 files changed

+50
-33
lines changed

clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h

Lines changed: 24 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -121,19 +121,28 @@ struct EvalCallOptions {
121121
EvalCallOptions() {}
122122
};
123123

124-
/// Simple control flow statements like `if` only produce a single state split,
125-
/// so the fact that they are included in the source code implies that both
126-
/// branches are possible (at least under some conditions) and the analyzer can
127-
/// freely assume either of them. (This is not entirely true, because there may
128-
/// be unmarked logical correlations between `if` statements, but is a good
129-
/// enough heuristic and the analyzer strongly relies on it.)
130-
/// On the other hand, in a loop the state may be split repeatedly at each
131-
/// evaluation of the loop condition, and this can lead to following "weak"
132-
/// assumptions even though the code does not imply that they're valid and the
133-
/// programmer intended to cover them.
134-
/// This function is called to mark the `State` when the engine makes an
135-
/// assumption which is weak. Checkers may use this heuristical mark to discard
136-
/// result and reduce the amount of false positives.
124+
/// Simple control flow statements like `if` can only produce a single two-way
125+
/// state split, so when the analyzer cannot determine the value of the
126+
/// condition, it can assume either of the two options, because the fact that
127+
/// they are in the source code implies that the programmer thought that they
128+
/// are possible (at least under some conditions).
129+
/// (Note that this heuristic is not entirely correct when there are _several_
130+
/// `if` statements with unmarked logical connections between them, but it's
131+
/// still good enough and the analyzer heavily relies on it.)
132+
/// In contrast with this, a single loop statement can produce multiple state
133+
/// splits, and we cannot always single out safe assumptions where we can say
134+
/// that "the programmer included this loop in the source code, so they clearly
135+
/// thought that this execution path is possible".
136+
/// However, the analyzer wants to explore the code in and after the loop, so
137+
/// it makes assumptions about the loop condition (to get a concrete execution
138+
/// path) even when they are not justified.
139+
/// This function is called by the engine to mark the `State` when it makes an
140+
/// assumption which is "weak". Checkers may use this heuristical mark to
141+
/// discard the result and reduce the amount of false positives.
142+
/// TODO: Instead of just marking these branches for checker-specific handling,
143+
/// we could discard them completely. I suspect that this could eliminate some
144+
/// false positives without suppressing too many true positives, but I didn't
145+
/// have time to measure its effects.
137146
ProgramStateRef recordWeakLoopAssumption(ProgramStateRef State);
138147

139148
/// Returns true if `recordWeakLoopAssumption()` was called on the execution
@@ -341,9 +350,9 @@ class ExprEngine {
341350
ExplodedNode *Pred);
342351

343352
/// ProcessBranch - Called by CoreEngine. Used to generate successor
344-
/// nodes by processing the 'effects' of a branch condition.
353+
/// nodes by processing the 'effects' of a branch condition.
345354
/// If the branch condition is a loop condition, IterationsFinishedInLoop is
346-
/// the number of already finished iterations (0, 1, 2...); otherwise it's
355+
/// the number of already finished iterations (0, 1, 2, ...); otherwise it's
347356
/// std::nullopt.
348357
void processBranch(const Stmt *Condition, NodeBuilderContext &BuilderCtx,
349358
ExplodedNode *Pred, ExplodedNodeSet &Dst,

clang/lib/StaticAnalyzer/Core/CoreEngine.cpp

Lines changed: 4 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -448,13 +448,12 @@ void CoreEngine::HandleBranch(const Stmt *Cond, const Stmt *Term,
448448
Counter.getNumVisited(LC->getStackFrame(), B->getBlockID());
449449
std::optional<unsigned> IterationsFinishedInLoop = std::nullopt;
450450
if (isa<ForStmt, WhileStmt, CXXForRangeStmt>(Term)) {
451-
// FIXME: This code approximates the number of finished iteration based on
451+
// FIXME: This code approximates the number of finished iterations based on
452452
// the block count, i.e. the number of evaluations of the terminator block
453453
// on the current execution path (which includes the current evaluation, so
454-
// is always at least 1). This is probably acceptable for the
455-
// checker-specific false positive suppression that currently uses this
456-
// value, but it would be better to calcuate an accurate count of
457-
// iterations.
454+
// is always >= 1). This is probably acceptable for the checker-specific
455+
// false positive suppression that currently uses this value, but it would
456+
// be better to calcuate an accurate count of iterations.
458457
assert(BlockCount >= 1);
459458
IterationsFinishedInLoop = BlockCount - 1;
460459
} else if (isa<DoStmt>(Term)) {

clang/lib/StaticAnalyzer/Core/ExprEngine.cpp

Lines changed: 12 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -226,9 +226,8 @@ bool clang::ento::seenWeakLoopAssumption(ProgramStateRef State) {
226226

227227
// This trait points to the last expression (logical operator) where an eager
228228
// assumption introduced a state split (i.e. both cases were feasible). This is
229-
// used by the WeakLoopAssumption heuristic to find situations where the an
230-
// eager assumption introduces a state split within the evaluation of a loop
231-
// condition.
229+
// used by the WeakLoopAssumption heuristic to find situations where an eager
230+
// assumption introduces a state split in the evaluation of a loop condition.
232231
REGISTER_TRAIT_WITH_PROGRAMSTATE(LastEagerlyAssumeAssumptionAt, const Expr *)
233232

234233
//===----------------------------------------------------------------------===//
@@ -2838,8 +2837,12 @@ void ExprEngine::processBranch(
28382837
const Expr *EagerlyAssumeExpr =
28392838
PrevState->get<LastEagerlyAssumeAssumptionAt>();
28402839
const Expr *ConditionExpr = dyn_cast<Expr>(Condition);
2841-
if (ConditionExpr)
2840+
if (ConditionExpr) {
2841+
// Ignore casts to ensure equivalent behavior with and without
2842+
// eagerly-assume. This is a mostly theoretical question an I don't see a
2843+
// good reason for putting casts around a conditional expression.
28422844
ConditionExpr = ConditionExpr->IgnoreParenCasts();
2845+
}
28432846
bool DidEagerlyAssume = EagerlyAssumeExpr == ConditionExpr;
28442847
bool BothFeasible = (DidEagerlyAssume || (StTrue && StFalse)) &&
28452848
builder.isFeasible(true) && builder.isFeasible(false);
@@ -2852,9 +2855,11 @@ void ExprEngine::processBranch(
28522855
// When programmers write a loop, they imply that at least two
28532856
// iterations are possible (otherwise they would just write an `if`),
28542857
// but the third iteration is not implied: there are situations where
2855-
// the programmer knows that there won't be a third iteration (e.g.
2856-
// they iterate over a structure that has <= 2 elements) but this is
2857-
// not marked in the source code.
2858+
// the programmer knows that there won't be a third iteration, but
2859+
// this is not marked in the source code. (For example, the ffmpeg
2860+
// project has 2-element arrays which are accessed from loops where
2861+
// the number of steps is opaque and the analyzer cannot deduce that
2862+
// there are <= 2 iterations.)
28582863
// Checkers may use this heuristic mark to discard results found on
28592864
// branches that contain this "weak" assumption.
28602865
StTrue = recordWeakLoopAssumption(StTrue);

clang/test/Analysis/out-of-bounds.c

Lines changed: 10 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -210,10 +210,10 @@ int loop_suppress_after_zero_iterations(unsigned len) {
210210
// Previously this would have produced an overflow warning because splitting
211211
// the state on the loop condition introduced an execution path where the
212212
// analyzer thinks that len == 0.
213-
// There are very many situations where the programmer knows that an argument
214-
// is positive, but this is not indicated in the source code, so we must
215-
// avoid reporting errors (especially out of bounds errors) on these
216-
// branches, because otherwise we'd get prohibitively many false positives.
213+
// There are many situations where the programmer knows that an argument is
214+
// positive, but this is not indicated in the source code, so we must avoid
215+
// reporting errors (especially out of bounds errors) on these branches,
216+
// because otherwise we'd get prohibitively many false positives.
217217
return GlobalArray[len - 1]; // no-warning
218218
}
219219

@@ -231,7 +231,8 @@ void loop_suppress_in_third_iteration(int len) {
231231
for (int i = 0; i < len; i++) {
232232
// We should suppress array bounds errors on the third and later iterations
233233
// of loops, because sometimes programmers write a loop in sitiuations
234-
// where they know that there will be at most two iterations.
234+
// where they know that there will be at most two iterations, but the
235+
// analyzer cannot deduce this property.
235236
buf[i] = 1; // no-warning
236237
}
237238
}
@@ -263,7 +264,10 @@ void loop_suppress_in_third_iteration_logical_and(int len, int flag) {
263264
void loop_suppress_in_third_iteration_logical_and_2(int len, int flag) {
264265
int buf[2] = {0};
265266
for (int i = 0; flag && i < len; i++) {
266-
// If the two operands of '&&' are flipped, the suppression works.
267+
// If the two operands of '&&' are flipped, the suppression works, because
268+
// then 'flag' is the terminator statement associated with '&&' (which
269+
// determines whether the short-circuiting happens or not) and 'i < len' is
270+
// the terminator statement of the loop itself.
267271
buf[i] = 1; // no-warning
268272
}
269273
}

0 commit comments

Comments
 (0)