-
Notifications
You must be signed in to change notification settings - Fork 14.9k
[Sema] Diagnose use of if/else-if condition variable inside else-if/else branch(s) #156436
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So from a high-level point of view, I definitely think there is value in having this because I personally have made this mistake so many times, and honestly, I think we should just straight-up error on it (or at least make it a warning that defaults to an error) because I cannot think of a situation where doing this would ever be valid.
That said, using an AST visitor for this is... probably not great because we now do an AST traversal on every else
branch ever, which sounds like it might hurt performance quite a bit—that said, this is pretty much my only concern w/ this patch, so if it turns out that this isn’t the case at all then this approach is fine I suppose.
CC @nikic could you put this on llvm-compile-time-tracker so we know what the overhead of this is if any?
At the same time, I’m not sure I can think of a better way to do this. I suppose another option might be to add a flag to VarDecl
that keeps track of whether it’s declared in the condition of an if
statement and then somehow have scope information to figure out where a given use of it is relative to that if
statement, but that sounds like it’d get a bit complicated, and I’m not sure running this sort of check on every DRE would be any better...
(Side note: sometimes when there are performance considerations we prefer to move these sorts of diagnostics into the static analyser, but I think this is situation is bad enough to where we want to do this in the compiler proper—at least I can think of many instances where this would have helped me quite a bit.)
@@ -13594,6 +13594,9 @@ def warn_acc_var_referenced_lacks_op | |||
"reference has no effect">, | |||
InGroup<DiagGroup<"openacc-var-lacks-operation">>, | |||
DefaultError; | |||
def warn_out_of_scope_var_usage | |||
: Warning<"variable %0 used in else/else if block is out of scope">, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
: Warning<"variable %0 used in else/else if block is out of scope">, | |
: Warning<"variable %0 used in 'else'/'else if' block is out of scope">, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable is in scope per the C++ spec, so that's technically untrue and/or misleading.
I'd rather go for something like 'variable %s declared in if block is {null|zero|falsey|etc} in this branch'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Alcaro As you said 'variable %s declared in if block is {null|zero|falsey|etc} in this branch'. Should we write if/if-else instead of if, because the variable can be declared in if-else block too right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd vote no. The variable is declared in the if part of the if-statement, not in the else.
If you mean if/else-if, then I'd still vote no - else-if is not a thing in the C++ spec, it's just an if-statement whose statement-false is another if-statement (and the variable is still declared inside the if part). No need to be pedantic at the expense of conciseness.
|
||
if (ConditionVar) { | ||
struct ElseVariableUsageChecker | ||
: public RecursiveASTVisitor<ElseVariableUsageChecker> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you’re going to do it this way, please use DynamicRecursiveASTVisitor
instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Sirraide RecursiveASTVisitor uses CRTP method which is already optimized if I am not wrong. So, don't you think that DynamicRecursiveASTVisitor will slow down the compilation process?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CRTP-based version has a horrible binary size (and compile-time) overhead and we’ve actively tried to stop using it as much as possible; see #110040 for more context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should do another pass and migrate any visitors that may have been introduced in the meantime or are still left unmigrated
VarDecl *ConditionVar = nullptr; | ||
|
||
if (auto *CondVar = Cond.get().first) { | ||
ConditionVar = dyn_cast<VarDecl>(CondVar); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VarDecl *ConditionVar = nullptr; | |
if (auto *CondVar = Cond.get().first) { | |
ConditionVar = dyn_cast<VarDecl>(CondVar); | |
} | |
VarDecl *ConditionVar = dyn_cast_if_present<VarDecl>(Cond.get().first); |
Oh yeah, two more things I was going to point out: we should definiltely have some tests in unevaluated contexts (e.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also have done this a ton, but it is something we need to leave legal, as it IS a pattern folks actually use intentionally. So making this an error isn't something we can really do.
Doing this as a visitor is, IMO, not the right solution here. We can actually do this via https://clang.llvm.org/doxygen/classclang_1_1Scope.html I think instead? I wonder if anyone else has better ideas, but a visitor on every variable reference inside of an else is not acceptable.
Right, I think I wasn’t quite awake yet earlier because for some reason I was only thinking about the very specific use case where the After thinking about this some more, I wonder if it would make sense to restrict this warning to situations where we both
Not doing 1. would cause us to diagnose cases such as if (Foo* x = foo(); not x) {
// ...
} else {
bar(*x); // We should not diagnose this.
} And 2. should help minimise false positives; e.g. using a variable that is known to be null in the else branch feels like a bit of a weird thing to do—though I guess someone could reuse it by assigning to it.
Yeah, that’s more or less what I meant by ‘somehow have scope information to figure out where a given use of it is’. |
You can test this locally with the following command:git-clang-format --diff origin/main HEAD --extensions cpp -- clang/test/Sema/warn-conditional-scope.cpp clang/lib/Sema/SemaStmt.cpp
View the diff from clang-format here.diff --git a/clang/lib/Sema/SemaStmt.cpp b/clang/lib/Sema/SemaStmt.cpp
index f0f5c6238..a6dc71973 100644
--- a/clang/lib/Sema/SemaStmt.cpp
+++ b/clang/lib/Sema/SemaStmt.cpp
@@ -21,11 +21,11 @@
#include "clang/AST/ExprCXX.h"
#include "clang/AST/ExprObjC.h"
#include "clang/AST/IgnoreExpr.h"
+#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/AST/StmtCXX.h"
#include "clang/AST/StmtObjC.h"
#include "clang/AST/TypeLoc.h"
#include "clang/AST/TypeOrdering.h"
-#include "clang/AST/RecursiveASTVisitor.h"
#include "clang/Basic/TargetInfo.h"
#include "clang/Lex/Preprocessor.h"
#include "clang/Sema/EnterExpressionEvaluationContext.h"
@@ -978,26 +978,28 @@ StmtResult Sema::ActOnIfStmt(SourceLocation IfLoc,
if (auto *CondVar = Cond.get().first) {
ConditionVar = dyn_cast<VarDecl>(CondVar);
- }
+ }
if (ConditionVar) {
struct ElseVariableUsageChecker
- : public RecursiveASTVisitor<ElseVariableUsageChecker> {
+ : public RecursiveASTVisitor<ElseVariableUsageChecker> {
VarDecl *TargetVar;
Sema &SemaRef;
- ElseVariableUsageChecker(VarDecl *Var, Sema &S) : TargetVar(Var), SemaRef(S) {}
+ ElseVariableUsageChecker(VarDecl *Var, Sema &S)
+ : TargetVar(Var), SemaRef(S) {}
bool VisitDeclRefExpr(DeclRefExpr *DRE) {
if (DRE->getDecl() == TargetVar) {
- SemaRef.Diag(DRE->getBeginLoc(), diag::warn_out_of_scope_var_usage) << TargetVar->getName();
+ SemaRef.Diag(DRE->getBeginLoc(), diag::warn_out_of_scope_var_usage)
+ << TargetVar->getName();
}
return true;
}
- };
+ };
ElseVariableUsageChecker Checker(ConditionVar, *this);
Checker.TraverseStmt(elseStmt);
- }
+ }
}
if (ConstevalOrNegatedConsteval ||
|
I think that is reasonable, though we'd need to make sure
|
Yes, std::expected would be a false positive. But how would you tell apart std::expected, where a falsey object still contains something usable, from std::unique_ptr, where it does not? One option would be simply ignore that - this warning applies to raw pointers only, not even integers. It can always be expanded later, if we can think of how.
I wouldn't get my hopes up, not until forgetting a return statement errors by default. Though I do agree both should error by default. |
Yeah, this is pretty much what I had in mind. We could hard-code some common standard-library types too (e.g.
Tell me about it... |
Not seeing an impact when applying this patch -- but that's probably expected as the warning is not enabled by default? I'd probably additional changes to enable it. |
It does the analysis all the time, but the situation in which it uses the analysis (if with condition var, also has an else) is reasonably limited. So you'd need a fairly sizable 'else' for the visitor to have to do a bunch of work. I don't anticipate the situation there to be particularly common in our build suite though. That said, this is still going to be a pretty bad perf in cases that DO have that, just not in our test suite. |
I agree the worst case would be bad; if I'm reading things correctly, it'd be quadratic for things like if (int* a = something()) return *a;
else if (int* b = something()) return *b;
else if (int* c = something()) return *c;
else if (int* d = something()) return *d;
else if (int* e = something()) return *e;
else return -1; where it'd check the Though I'd expect such code patterns to be rare in practice, and the constant factor to be good. (Also github stop re-checking mergeability every single time someone updates main, it messes with the scrollbar) |
Any specific code changes you want to suggest? |
I suggested something using the |
return ptr[0] * ptr[0]; | ||
} | ||
else if (int *ptr2 = get_something_else()) { | ||
// expected-warning@+1{{variable ptr used in else/else if block is out of scope}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this diagnostic is a good explanation of the problem.
The issue is that ptr
must be a nullptr
in the else
branch and therefore dereferencing it will be undefined behavior.
It is definitely in scope otherwise that would be ill-formed.
This patch adds a warning when variable(s) declared in if/else-if scope are used in else-if/else branch(s)