-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[clang-tidy] Added bugprone-unsequenced-global-accesses check #130421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
|
@llvm/pr-subscribers-clang-tidy @llvm/pr-subscribers-clang-tools-extra Author: cc (ConcreteCactus) ChangesThis checker attempts to cover the global variable case in rule EXP-30 in the SEI CERT C and rule EXP-50 in the SEI CERT C++ Coding Standards. It recurses into functions in the same translation unit and can handle global struct and union members as well. Patch is 40.93 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/130421.diff 5 Files Affected:
diff --git a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
index b780a85bdf3fe..09e3f481e9d1d 100644
--- a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
@@ -20,6 +20,7 @@
#include "CastingThroughVoidCheck.h"
#include "ChainedComparisonCheck.h"
#include "ComparePointerToMemberVirtualFunctionCheck.h"
+#include "ConflictingGlobalAccesses.h"
#include "CopyConstructorInitCheck.h"
#include "CrtpConstructorAccessibilityCheck.h"
#include "DanglingHandleCheck.h"
@@ -127,6 +128,8 @@ class BugproneModule : public ClangTidyModule {
"bugprone-chained-comparison");
CheckFactories.registerCheck<ComparePointerToMemberVirtualFunctionCheck>(
"bugprone-compare-pointer-to-member-virtual-function");
+ CheckFactories.registerCheck<ConflictingGlobalAccesses>(
+ "bugprone-conflicting-global-accesses");
CheckFactories.registerCheck<CopyConstructorInitCheck>(
"bugprone-copy-constructor-init");
CheckFactories.registerCheck<DanglingHandleCheck>(
diff --git a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
index e310ea9c94543..9754df6c06995 100644
--- a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
@@ -16,6 +16,7 @@ add_clang_library(clangTidyBugproneModule STATIC
CastingThroughVoidCheck.cpp
ChainedComparisonCheck.cpp
ComparePointerToMemberVirtualFunctionCheck.cpp
+ ConflictingGlobalAccesses.cpp
CopyConstructorInitCheck.cpp
CrtpConstructorAccessibilityCheck.cpp
DanglingHandleCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp b/clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
new file mode 100644
index 0000000000000..48906bf715ab2
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
@@ -0,0 +1,748 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "ConflictingGlobalAccesses.h"
+
+#include "clang/AST/RecursiveASTVisitor.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::bugprone {
+
+namespace {
+// An AccesKind represents one access to a global variable.
+//
+// The unchecked versions represent reads/writes that are not handled by
+// -Wunsequenced. (e.g. accesses inside functions).
+using AccessKind = uint8_t;
+static constexpr AccessKind AkRead = 0;
+static constexpr AccessKind AkWrite = 1;
+static constexpr AccessKind AkUncheckedRead = 2;
+static constexpr AccessKind AkUncheckedWrite = 3;
+
+static constexpr uint8_t AkCount = 4;
+
+// The TraversalResultKind represents a set of accesses.
+// Bits are corresponding to the AccessKind enum values.
+using TraversalResultKind = uint8_t;
+static constexpr TraversalResultKind TrInvalid = 0;
+static constexpr TraversalResultKind TrRead = 1 << AkRead;
+static constexpr TraversalResultKind TrWrite = 1 << AkWrite;
+static constexpr TraversalResultKind TrUncheckedWrite = 1 << AkUncheckedWrite;
+
+// To represent fields in structs or unions we use numbered FieldIndices. The
+// FieldIndexArray represents one field inside a global struct/union system.
+// The FieldIndexArray can be thought of as a path inside a tree.
+using FieldIndex = uint16_t;
+static constexpr FieldIndex FiUnion = 0x8000;
+
+// Note: This bit signals whether the field is a *field of* a struct or a
+// union, not whether the type of the field itself is a struct or a union.
+using FieldIndexArray = SmallVector<FieldIndex>;
+
+/// One traversal recurses into one side of a binary expression or one
+/// parameter of a function call. At least two of these traversals are used to
+/// find conflicting accesses.
+///
+/// A TraversalResult represents one traversal.
+struct TraversalResult {
+ int IndexCreated; // We use indices to keep track of which
+ // traversal we are in currently. The current
+ // index is stored in GlobalRWVisitor with the
+ // name TraversalIndex.
+ SourceLocation Loc[AkCount];
+ TraversalResultKind Kind;
+
+ TraversalResult();
+ TraversalResult(int Index, SourceLocation Loc, AccessKind Access);
+ void addNewAccess(SourceLocation Loc, AccessKind Access);
+};
+
+/// The result of a number of traversals.
+class TraversalAggregation {
+ DeclarationName DeclName; // The name of the global variable being checked.
+
+ // We only store the result of two traversals as two conflicting accesses
+ // are enough to detect undefined behavior. The two stored TraversalResults
+ // have different traversal indices.
+ //
+ // Note: Sometimes multiple traversals are merged into one
+ // TraversalResult.
+ TraversalResult MainPart, OtherPart;
+ // Pairings that are not reportable: Read-Read, Read-Write,
+ // Read-UncheckedRead, Write-Write, UncheckedRead-UncheckedRead.
+
+public:
+ TraversalAggregation();
+ TraversalAggregation(DeclarationName Name, SourceLocation Loc,
+ AccessKind Access, int Index);
+ void addGlobalRW(SourceLocation Loc, AccessKind Access, int Index);
+ DeclarationName getDeclName() const;
+
+ bool isValid() const;
+
+ // If there is a conflict and that conflict isn't reported by -Wunsequenced
+ // then we report the conflict.
+ bool shouldBeReported() const;
+ bool hasConflictingOperations() const;
+
+private:
+ bool hasTwoAccesses() const;
+ bool isReportedByWunsequenced() const;
+};
+
+/// The ObjectAccessTree stores the TraversalAggregations of one global
+/// struct/union. Because each field can be handled as a single variable, the
+/// tree stores one TraversalAggregation for every field.
+///
+/// Note: structs, classes, and unions are called objects in the code.
+struct ObjectAccessTree {
+ using FieldMap = llvm::DenseMap<int, std::unique_ptr<ObjectAccessTree>>;
+ TraversalAggregation OwnAccesses;
+
+ // In a union, new fields should inherit from UnionTemporalAccesses
+ // instead of OwnAccesses. That's because an access to a field of a union is
+ // also an access to every other field of the same union.
+ TraversalAggregation UnionTemporalAccesses;
+
+ // We try to be lazy and only store fields that are actually accessed.
+ FieldMap Fields;
+ bool IsUnion;
+
+ ObjectAccessTree(TraversalAggregation Own);
+
+ void addFieldToAll(SourceLocation Loc, AccessKind Access, int Index);
+ void addFieldToAllExcept(uint16_t ExceptIndex, SourceLocation Loc,
+ AccessKind Access, int Index);
+};
+
+/// This object is the root of all ObjectAccessTrees.
+class ObjectTraversalAggregation {
+ DeclarationName DeclName; // The name of the global struct/union.
+ ObjectAccessTree AccessTree;
+
+public:
+ ObjectTraversalAggregation(DeclarationName Name, SourceLocation Loc,
+ FieldIndexArray FieldIndices, AccessKind Access,
+ int Index);
+ void addFieldRW(SourceLocation Loc, FieldIndexArray FieldIndices,
+ AccessKind Access, int Index);
+ DeclarationName getDeclName() const;
+ bool shouldBeReported() const;
+
+private:
+ bool shouldBeReportedRec(const ObjectAccessTree *Node) const;
+};
+
+/// GlobalRWVisitor (global read write visitor) does all the traversals.
+class GlobalRWVisitor : public RecursiveASTVisitor<GlobalRWVisitor> {
+ friend RecursiveASTVisitor<GlobalRWVisitor>;
+
+public:
+ GlobalRWVisitor(bool IsWritePossibleThroughFunctionParam);
+
+ // startTraversal is called to start a new traversal. It increments the
+ // TraversalIndex, which in turn will generate new TraversalResults.
+ void startTraversal(Expr *E);
+
+ const llvm::SmallVector<TraversalAggregation> &getGlobalsFound() const;
+
+ const llvm::SmallVector<ObjectTraversalAggregation>
+ &getObjectGlobalsFound() const;
+
+protected:
+ // RecursiveASTVisitor overrides
+ bool VisitDeclRefExpr(DeclRefExpr *S);
+ bool VisitUnaryOperator(UnaryOperator *Op);
+ bool VisitBinaryOperator(BinaryOperator *Op);
+ bool VisitCallExpr(CallExpr *CE);
+ bool VisitMemberExpr(MemberExpr *ME);
+
+private:
+ void visitCallExprArgs(CallExpr *CE);
+ void traverseFunctionsToBeChecked();
+
+ llvm::SmallVector<TraversalAggregation> GlobalsFound;
+ llvm::SmallVector<ObjectTraversalAggregation> ObjectGlobalsFound;
+
+ // We check inside functions only if the functions hasn't been checked in
+ // the current traversal. We use this array to check if the function is
+ // already registered to be checked.
+ llvm::SmallVector<std::pair<DeclarationName, Stmt *>> FunctionsToBeChecked;
+
+ // The TraversalIndex is used to differentiate between two sides of a binary
+ // expression or the parameters of a function. Every traversal represents
+ // one such expression and the TraversalIndex is incremented between them.
+ int TraversalIndex;
+
+ // Accesses that are inside functions are not checked by -Wunsequenced,
+ // therefore we keep track of whether we are inside of a function or not.
+ bool IsInFunction;
+
+ // Same as the HandleMutableFunctionParametersAsWrites option.
+ bool IsWritePossibleThroughFunctionParam;
+
+ void addGlobal(DeclarationName Name, SourceLocation Loc, bool IsWrite,
+ bool IsUnchecked);
+ void addGlobal(const DeclRefExpr *DR, bool IsWrite, bool IsUnchecked);
+ void addField(DeclarationName Name, FieldIndexArray FieldIndices,
+ SourceLocation Loc, bool IsWrite, bool IsUnchecked);
+ bool handleModified(const Expr *Modified, bool IsUnchecked);
+ bool handleModifiedVariable(const DeclRefExpr *DE, bool IsUnchecked);
+ bool handleAccessedObject(const Expr *E, bool IsWrite, bool IsUnchecked);
+ bool isVariable(const Expr *E);
+};
+} // namespace
+
+static bool isGlobalDecl(const VarDecl *VD) {
+ return VD && VD->hasGlobalStorage() && VD->getLocation().isValid() &&
+ !VD->getType().isConstQualified();
+}
+
+AST_MATCHER_P(Expr, twoGlobalWritesBetweenSequencePoints, const LangStandard *,
+ LangStd) {
+ assert(LangStd);
+
+ const Expr *E = &Node;
+
+ if (const BinaryOperator *Op = dyn_cast<BinaryOperator>(E)) {
+ const BinaryOperator::Opcode Code = Op->getOpcode();
+ if (Code == BO_LAnd || Code == BO_LOr || Code == BO_Comma) {
+ return false;
+ }
+
+ if (Op->isAssignmentOp() && isa<DeclRefExpr>(Op->getLHS())) {
+ return false;
+ }
+
+ if (LangStd->isCPlusPlus17() &&
+ (Code == BO_Shl || Code == BO_Shr || Code == BO_PtrMemD ||
+ Code == BO_PtrMemI || Op->isAssignmentOp())) {
+ return false;
+ }
+
+ return true;
+ }
+
+ if (isa<CallExpr>(E)) {
+ return true;
+ }
+
+ return false;
+}
+
+ConflictingGlobalAccesses::ConflictingGlobalAccesses(StringRef Name,
+ ClangTidyContext *Context)
+ : ClangTidyCheck(Name, Context),
+ HandleMutableFunctionParametersAsWrites(
+ Options.get("HandleMutableFunctionParametersAsWrites", false))
+{}
+
+void ConflictingGlobalAccesses::storeOptions(ClangTidyOptions::OptionMap& Opts)
+{
+ Options.store(Opts, "HandleMutableFunctionParametersAsWrites",
+ HandleMutableFunctionParametersAsWrites);
+}
+
+void ConflictingGlobalAccesses::registerMatchers(MatchFinder *Finder) {
+
+ const LangStandard *LangStd =
+ &LangStandard::getLangStandardForKind(getLangOpts().LangStd);
+
+ ast_matchers::internal::Matcher<Expr> GlobalAccessMatcher =
+ twoGlobalWritesBetweenSequencePoints(LangStd);
+
+ Finder->addMatcher(
+ stmt(traverse(TK_AsIs, expr(GlobalAccessMatcher).bind("gw"))), this);
+}
+
+void ConflictingGlobalAccesses::check(const MatchFinder::MatchResult &Result) {
+ const Expr *E = Result.Nodes.getNodeAs<Expr>("gw");
+ assert(E);
+
+ GlobalRWVisitor Visitor(HandleMutableFunctionParametersAsWrites);
+ if (const auto *Op = dyn_cast<BinaryOperator>(E)) {
+ Visitor.startTraversal(Op->getLHS());
+ Visitor.startTraversal(Op->getRHS());
+
+ } else if (const auto *CE = dyn_cast<CallExpr>(E)) {
+ for (uint32_t I = 0; I < CE->getNumArgs(); I++) {
+ Visitor.startTraversal(const_cast<Expr *>(CE->getArg(I)));
+ }
+ }
+
+ const llvm::SmallVector<TraversalAggregation> &Globals =
+ Visitor.getGlobalsFound();
+
+ for (uint32_t I = 0; I < Globals.size(); I++) {
+ if (Globals[I].shouldBeReported()) {
+ diag(E->getBeginLoc(), "read/write conflict on global variable " +
+ Globals[I].getDeclName().getAsString());
+ }
+ }
+ const llvm::SmallVector<ObjectTraversalAggregation> &ObjectGlobals =
+ Visitor.getObjectGlobalsFound();
+ for (uint32_t I = 0; I < ObjectGlobals.size(); I++) {
+ if (ObjectGlobals[I].shouldBeReported()) {
+ diag(E->getBeginLoc(), "read/write conflict on the field of the global "
+ "object " +
+ ObjectGlobals[I].getDeclName().getAsString());
+ }
+ }
+}
+
+GlobalRWVisitor::GlobalRWVisitor(bool IsWritePossibleThroughFunctionParam)
+ : TraversalIndex(0), IsInFunction(false),
+ IsWritePossibleThroughFunctionParam(IsWritePossibleThroughFunctionParam)
+{}
+
+void GlobalRWVisitor::startTraversal(Expr *E) {
+ TraversalIndex++;
+ FunctionsToBeChecked.clear();
+ IsInFunction = false;
+ TraverseStmt(E);
+
+ // We keep a list of functions to be checked during traversal so that they are
+ // not checked multiple times.
+ traverseFunctionsToBeChecked();
+}
+
+void GlobalRWVisitor::traverseFunctionsToBeChecked() {
+ IsInFunction = true;
+
+ // We could find more functions to be checked while checking functions.
+ // Because a simple iterator could get invalidated, we index into the array.
+ for (size_t I = 0; I < FunctionsToBeChecked.size(); ++I) {
+ TraverseStmt(FunctionsToBeChecked[I].second);
+ }
+}
+
+bool GlobalRWVisitor::isVariable(const Expr *E) {
+ const Type *T = E->getType().getTypePtrOrNull();
+ assert(T);
+
+ return isa<DeclRefExpr>(E) && (!T->isRecordType() || T->isUnionType());
+}
+
+bool GlobalRWVisitor::VisitDeclRefExpr(DeclRefExpr *DR) {
+ if (!isa<VarDecl>(DR->getDecl())) {
+ return true;
+ }
+ const auto *VD = dyn_cast<VarDecl>(DR->getDecl());
+ if (!isVariable(DR)) {
+ return handleAccessedObject(DR, /*IsWrite*/ false, /*IsUnchecked*/ false);
+ }
+ if (isGlobalDecl(VD)) {
+ addGlobal(VD->getDeclName(), VD->getBeginLoc(), /*IsWrite*/ false,
+ /*IsUnchecked*/ false);
+ return true;
+ }
+ return true;
+}
+
+bool GlobalRWVisitor::VisitMemberExpr(MemberExpr *ME) {
+ return handleAccessedObject(ME, /*IsWrite*/ false, /*IsUnchecked*/ false);
+}
+
+bool GlobalRWVisitor::handleModifiedVariable(const DeclRefExpr *DR,
+ bool IsUnchecked) {
+ if (!isa<VarDecl>(DR->getDecl())) {
+ return true;
+ }
+ const auto *VD = dyn_cast<VarDecl>(DR->getDecl());
+
+ if (isGlobalDecl(VD)) {
+ addGlobal(VD->getDeclName(), VD->getBeginLoc(), /*IsWrite*/ true,
+ IsUnchecked);
+ return false;
+ }
+ return true;
+}
+
+bool GlobalRWVisitor::handleAccessedObject(const Expr *E, bool IsWrite,
+ bool IsUnchecked) {
+ const Expr *CurrentNode = E;
+ int NodeCount = 0;
+ while (isa<MemberExpr>(CurrentNode)) {
+ const MemberExpr *CurrentField = dyn_cast<MemberExpr>(CurrentNode);
+
+ if (CurrentField->isArrow()) {
+ return true;
+ }
+
+ const ValueDecl *Decl = CurrentField->getMemberDecl();
+ if (!isa<FieldDecl>(Decl)) {
+ return true;
+ }
+
+ CurrentNode = CurrentField->getBase();
+ NodeCount++;
+ }
+
+ if (!isa<DeclRefExpr>(CurrentNode)) {
+ return true;
+ }
+ const DeclRefExpr *Base = dyn_cast<DeclRefExpr>(CurrentNode);
+
+ if (!isa<VarDecl>(Base->getDecl())) {
+ return true;
+ }
+ const VarDecl *BaseDecl = dyn_cast<VarDecl>(Base->getDecl());
+
+ if (!isGlobalDecl(BaseDecl)) {
+ return true;
+ }
+
+ FieldIndexArray FieldIndices(NodeCount);
+ CurrentNode = E;
+ while (isa<MemberExpr>(CurrentNode)) {
+ const MemberExpr *CurrentField = dyn_cast<MemberExpr>(CurrentNode);
+ const FieldDecl *Decl = dyn_cast<FieldDecl>(CurrentField->getMemberDecl());
+ assert(Decl);
+
+ FieldIndices[NodeCount - 1] = Decl->getFieldIndex();
+ const RecordDecl *Record = Decl->getParent();
+ assert(Record);
+
+ if (Record->isUnion()) {
+ FieldIndices[NodeCount - 1] |= FiUnion;
+ }
+
+ CurrentNode = CurrentField->getBase();
+ NodeCount--;
+ }
+
+ addField(BaseDecl->getDeclName(), FieldIndices, Base->getBeginLoc(), IsWrite,
+ IsUnchecked);
+ return false;
+}
+
+bool GlobalRWVisitor::handleModified(const Expr *Modified, bool IsUnchecked) {
+ assert(Modified);
+
+ if (isVariable(Modified)) {
+ return handleModifiedVariable(dyn_cast<DeclRefExpr>(Modified), IsUnchecked);
+ }
+
+ return handleAccessedObject(Modified, /*IsWrite*/ true, IsUnchecked);
+}
+
+bool GlobalRWVisitor::VisitUnaryOperator(UnaryOperator *Op) {
+ UnaryOperator::Opcode Code = Op->getOpcode();
+ if (Code == UO_PostInc || Code == UO_PostDec || Code == UO_PreInc ||
+ Code == UO_PreDec) {
+ return handleModified(Op->getSubExpr(), /*IsUnchecked*/ false);
+ }
+ return true;
+}
+
+bool GlobalRWVisitor::VisitBinaryOperator(BinaryOperator *Op) {
+ if (Op->isAssignmentOp()) {
+ return handleModified(Op->getLHS(), /*IsUnchecked*/ false);
+ }
+
+ return true;
+}
+
+void GlobalRWVisitor::visitCallExprArgs(CallExpr *CE) {
+ const Type *CT = CE->getCallee()->getType().getTypePtrOrNull();
+ if (const auto *PT = dyn_cast_if_present<PointerType>(CT)) {
+ CT = PT->getPointeeType().getTypePtrOrNull();
+ }
+ const auto *ProtoType = dyn_cast_if_present<FunctionProtoType>(CT);
+
+ for (uint32_t I = 0; I < CE->getNumArgs(); I++) {
+ Expr *Arg = CE->getArg(I);
+
+ if (!ProtoType || I >= ProtoType->getNumParams()) {
+ continue;
+ }
+
+ if (const auto *Op = dyn_cast<UnaryOperator>(Arg)) {
+ if (Op->getOpcode() != UO_AddrOf) {
+ continue;
+ }
+
+ if (const auto *PtrType = dyn_cast_if_present<PointerType>(
+ ProtoType->getParamType(I).getTypePtrOrNull())) {
+ if (PtrType->getPointeeType().isConstQualified()) {
+ continue;
+ }
+
+ if (handleModified(Op->getSubExpr(), /*IsUnchecked*/ true)) {
+ continue;
+ }
+ }
+ }
+
+ if (const auto *RefType = dyn_cast_if_present<ReferenceType>(
+ ProtoType->getParamType(I).getTypePtrOrNull())) {
+ if (RefType->getPointeeType().isConstQualified()) {
+ continue;
+ }
+
+ if (handleModified(Arg, /*IsUnchecked*/ true)) {
+ continue;
+ }
+ }
+ }
+}
+
+bool GlobalRWVisitor::VisitCallExpr(CallExpr *CE) {
+
+ if (IsWritePossibleThroughFunctionParam || isa<CXXOperatorCallExpr>(CE)) {
+ visitCallExprArgs(CE);
+ }
+
+ if (!isa_and_nonnull<FunctionDecl>(CE->getCalleeDecl())) {
+ return true;
+ }
+ const auto *FD = dyn_cast<FunctionDecl>(CE->getCalleeDecl());
+
+ if (!FD->hasBody()) {
+ return true;
+ }
+
+ for (const std::pair<DeclarationName, Stmt *> &Fun : FunctionsToBeChecked) {
+ if (Fun.first == FD->getDeclName()) {
+ return true;
+ }
+ }
+ FunctionsToBeChecked.emplace_back(FD->getDeclName(), FD->getBody());
+
+ return true;
+}
+
+const llvm::SmallVector<TraversalAggregation> &
+GlobalRWVisitor::getGlobalsFound() const {
+ return GlobalsFound;
+}
+
+const llvm::SmallVector<ObjectTraversalAggregation> &
+GlobalRWVisitor::getObjectGlobalsFound() const {
+ return ObjectGlobalsFound;
+}
+
+void GlobalRWVisitor::addGlobal(DeclarationName Name, SourceLocation Loc,
+ bool IsWrite, bool IsUnchecked) {
+ AccessKind Access = (IsInFunction || IsUnchecked)
+ ? (IsWrite ? AkUncheckedWrite : AkUncheckedRead)
+ : (IsWrite ? AkWrite : AkRead);
+ for (uint32_t I = 0; I < GlobalsFound.size(); I++) {
+ if (GlobalsFound[I].getDeclName() == Name) {
+ Global...
[truncated]
|
PiotrZSL
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about name, some simpler name would be better.
Missing documentation/release notes.
Add check need to be simplified.
clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.h
Outdated
Show resolved
Hide resolved
clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.h
Outdated
Show resolved
Hide resolved
clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
Outdated
Show resolved
Hide resolved
|
If this is a CERT check, please add the corresponding aliases towards the CERT module. |
clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.h
Outdated
Show resolved
Hide resolved
clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
Outdated
Show resolved
Hide resolved
clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
Outdated
Show resolved
Hide resolved
clang-tools-extra/test/clang-tidy/checkers/bugprone/conflicting-global-accesses.cpp
Outdated
Show resolved
Hide resolved
clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
Outdated
Show resolved
Hide resolved
clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
Outdated
Show resolved
Hide resolved
clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
Outdated
Show resolved
Hide resolved
clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
Outdated
Show resolved
Hide resolved
6879af3 to
7b02721
Compare
0196b1b to
983fb8e
Compare
|
Hi Everyone, Can I get a re-review on this PR? I think I addressed most of the comments. |
vbvictor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly left comments in docs about general improvements. Didn't look closely at the code, but here are some general recommendations that you should do before major review:
There are a lot of new code that is relatively hard to review, you should try to follow LLVM coding guidelines, most importantly these:
- Don’t Use Braces on Simple Single-Statement Bodies of if/else/loop Statements (many violations, with this fix the code should become smaller by line count, thus easier to review)
- Use range-based for loops wherever possible
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto code-block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the code block directive as requested, but I wasn't sure how it works with the option directive, so I added the code block outside of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works inside option, for example look at https://clang.llvm.org/extra/clang-tidy/checks/cppcoreguidelines/avoid-do-while.html.
You should put code under option
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, code-blocks need to start with a new line, I'm not sure if they are rendered correctly.
Did you run ninja docs-clang-tools-html to check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, my ninja can't find target docs-clang-tools-html. Maybe I didn't enable its project when I was generating with cmake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need Sphinx installed in your system.
Please test your docs that they look correct.
https://clang.llvm.org/extra/clang-tidy/Contributing.html#documenting-your-check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can read full guide at https://clang.llvm.org/extra/clang-tidy/Contributing.html, it says how and what you should install and do.
Also, Fix clang-format issues and clang-tidy issues (if present), Currently there isn't a CI job for clang-tidy run so please do it locally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you I'll test and fix the documentation and the formatting issues. I'll run clang-tidy as well.
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This option need some rephrasing and clarification IMO.
It should be something like:
"When `true`, treat calls of functions that accept a mutable reference or a pointer as a write to the variable that was passed to the function."
I'm not totally satisfied with my wording, but I tried to eliminate this ambiguity:
What does "handling parameters as writes" mean here? Especially word "handling".
Also, give an example of some real code that doesn't get flagged without this option and gets flagged with the option enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I clarified the description and added a more descriptive code example. Hope this helps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should not explicitly exclude cases that are covered by "-Wunsequenced". It's okay for check to have duplicate functionality with compiler flags. Moreover, for gcc there is "-Wsequence-point" and we don't know to what extent "-Wunsequenced" can find bugs, so It's better just to mention that functionality of this check may overlap with existing compiler flags such as "-Wunsequenced", "-Wsequence-point".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I wasn't sure whether to include/exclude -Wunsequenced. I removed the code parts that excluded flagging those cases, and noted later in the documentation that there could be some overlap.
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
clang-tools-extra/docs/clang-tidy/checks/bugprone/conflicting-global-accesses.rst
Outdated
Show resolved
Hide resolved
|
|
||
| .. option:: HandleMutableFunctionParametersAsWrites | ||
|
|
||
| When `true`, treat function calls with mutable reference or pointer parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whole description of the option should be with 2 spaces indent
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
This check attempts to detect unsequenced accesses to global
variables. It recurses into function calls in the same translation unit,
and can handle fields on global structs/unions.
|
Looking at the check code in general, I have some concerns whether this check should be implemented as a Clang Static Analyzer (CSA) check instead of a clang-tidy check. I kindly ask for an opinion of clang-tidy maintainers on this matter, @HerrCai0907 @carlosgalvezp @PiotrZSL. Also, kindly ask CSA maintainers to take a brief look at this checker. From your point of view, will it be easier and less code consuming to implement such checker as a part of CSA? @steakhal @NagyDonat @haoNoQ. |
| return globalVar + incFun(); // This is not detected by -Wunsequenced. | ||
| } | ||
|
|
||
| This clang-tidy check attempts to detect such cases. It recurses into functions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| This clang-tidy check attempts to detect such cases. It recurses into functions | |
| This check attempts to detect such cases. It recurses into functions |
|
|
||
| .. option:: HandleMutableFunctionParametersAsWrites | ||
|
|
||
| When ``true``, treat function calls with mutable reference or pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| When ``true``, treat function calls with mutable reference or pointer | |
| When `true`, treat function calls with mutable reference or pointer |
| When ``true``, treat function calls with mutable reference or pointer | ||
| parameters as writes to the parameter. | ||
|
|
||
| The default value is ``false``. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| The default value is ``false``. | |
| Default is `false`. |
| The default value is ``false``. | ||
|
|
||
| For example, the following code block will get flagged if | ||
| ``HandleMutableFunctionParametersAsWrites`` is ``true``: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ``HandleMutableFunctionParametersAsWrites`` is ``true``: | |
| :option:`HandleMutableFunctionParametersAsWrites` is `true`: |
| int a = globalVar + func(globalVar); | ||
| } | ||
|
|
||
| When ``HandleMutableFunctionParametersAsWrites`` is set to `true`, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| When ``HandleMutableFunctionParametersAsWrites`` is set to `true`, the | |
| When :option:`HandleMutableFunctionParametersAsWrites` is set to `true`, the |
| sequencing is defined for the ``+`` operator, a write to ``globalVar`` | ||
| inside ``c`` would be undefined behavior. | ||
|
|
||
| When ``HandleMutableFunctionParametersAsWrites`` is set to ``false``, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| When ``HandleMutableFunctionParametersAsWrites`` is set to ``false``, the | |
| When :option:`HandleMutableFunctionParametersAsWrites` is set to `false`, the |
Oh good question. Generally it's a bit more nuanced than that. The static analyzer is doing a lot more than AST traversal; it's doing so-called "path sensitive" analysis which tries to enumerate different execution paths in the program. Which is very good at finding realistic execution paths on which something "bad" may happen, and not suitable for all other kinds of analysis. Yes, I think this checker could be implemented as a path-sensitive checker in the static analyzer. It does indeed search for "bad" things that may happen, which is exactly what the path-sensitive engine is good at. Path-sensitive analysis is more precise, it may help you naturally eliminate certain classes of false positives that would be extremely difficult to eliminate manually. However, it's also significantly more expensive, so it'll run slower for the users who aren't already using path-other sensitive checks (the performance cost will be negligible for users who are already enabling at least one other path-sensitive checker). Additionally, because of being more expensive and explosive in complexity, it may suffer from more false negatives, simply because it's running out of the analysis "budget". This won't necessarily eliminate a lot of code from the checker. Path-sensitive checkers often end up more simple and elegant. You'll still need to identify the places where the execution may be unsequenced, we won't do that for you. But then you'll put markers in the path-sensitive state when you're inside those statements, and focus on responding to global variable accesses while those markers are present. We'll do the rest of the traversal for you. Path-sensitive analysis may also save you a lot of follow-up polish work (I haven't looked at how much of that you've already done). Now, in practice, the best thing about path-sensitive analysis is that it naturally helps you avoid false positives of the following nature: int globalVar;
int foo(bool shouldInc) {
if (shouldInc)
globalVar++;
return globalVar;
}
void bar() {
globalVar + foo(false); // should not warn
}In this case the problematic mutation cannot happen in practice because the branch on which the increment happens is entirely unreachable in the context of These kinds of false positives may happen in a variety of different ways and you'll have a very hard time pattern-matching them all down on the AST level without a powerful principled solution that the path-sensitive engine implements. If you're using path-sensitive analysis, the false positive is eliminated automatically, by construction, without any effort on your part. You will simply never notice the increment of the global variable when you're analyzing Whether you actually worry about these false positives, is entirely up to you though. It's a matter of your relationship with your users, a matter of understanding their demands, and also a matter of observing the behavior of the checker on real-world code. I cannot answer that question for you, you'll need to experiment. We've had a similar situation in our "don't call virtual methods on a partially constructed object" checker (https://clang.llvm.org/docs/analyzer/checkers.html#cplusplus-purevirtualcall-c, https://clang.llvm.org/docs/analyzer/checkers.html#optin-cplusplus-virtualcall). It turned out that in many cases the code was deliberately written in such a way that the constructor may call a function in turn may eventually call a virtual method but in reality the constructor passes a flag that makes sure the virtual method is never called from inside that function. So our initial attempt to avoid path-sensitive analysis has failed and we had to rewrite the checker as a path-sensitive checker. Path-sensitive analysis also offers other bonuses. For example, it's able to track references and aliasing to some extent, so you're getting the following case covered "for free": int globalVar;
int foo(int &var) {
return var++;
}
void bar() {
int &var = globalVar;
globalVar + foo(var); // should warn
}It's also able to resolve calls through function pointers and virtual calls in some cases: int globalVar;
int foo() {
return globalVar++;
}
void bar() {
int (*func)() = foo;
globalVar + func(); // should warn
}But these are relatively minor and they probably won't ruin your checker if you don't support them. So, overall, up to you. If you go for path-sensitive analysis, it would be a major rewrite, and not without downsides. You probably don't need to do it just for the purposes of elegance or code reuse. (Though I definitely won't stop you. I love elegance and code reuse.) I think the main question you need to answer is whether the increased precision - in terms of both false positives and false negatives - is of value to you. The potential false negatives due to budget limitations are probably the second most important factor. But either way, it all depends on your experimental data and on your relationship with your future users. |
|
Ok I kinda forgot about the elephant in the room. With path sensitivity you'll be able to handle more kinds of variables. Not just global variables but also stuff in heap memory. As well as someone else's stack variables passed by reference down the stack. Or it could be, like, just some pointer or reference passed to us from god knows where. We don't have to know where and by whom exactly the memory it points to has been allocated. We just need to know whether it's the same memory address at two different points of the program. The static analyzer will model such aliasing for you and give you that information for free, so that at any point of time you know which pointer points to which storage, even if you can't (or don't want to) describe that storage in terms of source code. It'd magically work correctly even if that information depends on the exact execution path up to that point. You don't even need to handle all these cases separately. It'd just be "some memory" to you. This could make your check significantly more powerful. And that's very hard without path sensitivity. FWIW one case you can probably handle without path sensitivity is: member variables inside a method. Eg., in class MyClass {
int memberVar;
public:
int foo() {
memberVar++;
return memberVar;
}
void bar() {
memberVar + foo(); // should warn
}
};you don't really need to know what You can probably handle references similarly, since they also cannot be reassigned at runtime. It doesn't matter that you don't know what the reference refers to: int foo(int &refVar) {
refVar++;
return refVar;
}
void bar(int &refVar) {
refVar + foo(refVar); // should warn
}But when it comes to pointers and heap storage, you'll probably have a very hard time with path sensitivity. |
This piece of complexity is also trivialized by path sensitive analysis. You'll simply ask whether it's an access to the same memory address or not. No need to describe what the address is or who owns it. |
|
Thank you for your long and detailed answer! Indeed, the check can benefit a lot from path-sensitive analysis:
So from my perspective, this check can live as a clang-tidy check. However, looking at all benefits of path-sensitive analysis of CSA, the check would look a lot better as a CSA check. I understand that CSA checks might be harder to write since CSA itself has higher entry threshold than clang-tidy, but it's hard to give a thoughtful and careful review of such big clang-tidy check, especially that has non-trivial manual AST-visiting. |
|
It does not look like traditional clang-tidy check. Personally I prefer to move it to CSA instead of clang-tidy. |
|
Hi Everyone, Thank you for the detailed suggestions. Now that I have a better idea of what CSA is, I also think that this check could be better implemented in clang-static-analyzer. Being able to reduce the number of false positives using path sensitive analysis is in my opinion the most important advantage of using CSA. What do you think? Should I close this PR, and open a new one when I'm ready with the CSA checker? By the way, I don't have any users. This check is more like a University project I picked. The reason I started with clang-tidy is simply because that was the suggestion my teacher gave me. |
To keep everything organized, I'd suggest creating a new branch for CSA checker and close this PR (you can always reopen it later). For the CSA part, you can start by reading this developer guide (probably a bit outdated?). This original white paper is very helpful in understanding the core principles of the CSA and how it all started. haoNoQ's guide gives excellent practical description of how to write checkers and what functions the CSA has. |
Thank you, I'll open a new PR when I'm ready with the CSA checker. |
This checker attempts to cover the global variable case in rule EXP-30 in the SEI CERT C and rule EXP-50 in the SEI CERT C++ Coding Standards.
It recurses into functions in the same translation unit and can handle global struct and union members as well.