Skip to content

Conversation

@ConcreteCactus
Copy link

@ConcreteCactus ConcreteCactus commented Mar 8, 2025

This checker attempts to cover the global variable case in rule EXP-30 in the SEI CERT C and rule EXP-50 in the SEI CERT C++ Coding Standards.

It recurses into functions in the same translation unit and can handle global struct and union members as well.

@github-actions
Copy link

github-actions bot commented Mar 8, 2025

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@ConcreteCactus ConcreteCactus changed the title added Conflicting Global Accesses checker [clang-tidy] Added Conflicting Global Accesses checker Mar 9, 2025
@ConcreteCactus ConcreteCactus marked this pull request as ready for review March 31, 2025 16:14
@llvmbot
Copy link
Member

llvmbot commented Mar 31, 2025

@llvm/pr-subscribers-clang-tidy

@llvm/pr-subscribers-clang-tools-extra

Author: cc (ConcreteCactus)

Changes

This checker attempts to cover the global variable case in rule EXP-30 in the SEI CERT C and rule EXP-50 in the SEI CERT C++ Coding Standards.

It recurses into functions in the same translation unit and can handle global struct and union members as well.


Patch is 40.93 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/130421.diff

5 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp (+3)
  • (modified) clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt (+1)
  • (added) clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp (+748)
  • (added) clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.h (+77)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/conflicting-global-accesses.cpp (+290)
diff --git a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
index b780a85bdf3fe..09e3f481e9d1d 100644
--- a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
@@ -20,6 +20,7 @@
 #include "CastingThroughVoidCheck.h"
 #include "ChainedComparisonCheck.h"
 #include "ComparePointerToMemberVirtualFunctionCheck.h"
+#include "ConflictingGlobalAccesses.h"
 #include "CopyConstructorInitCheck.h"
 #include "CrtpConstructorAccessibilityCheck.h"
 #include "DanglingHandleCheck.h"
@@ -127,6 +128,8 @@ class BugproneModule : public ClangTidyModule {
         "bugprone-chained-comparison");
     CheckFactories.registerCheck<ComparePointerToMemberVirtualFunctionCheck>(
         "bugprone-compare-pointer-to-member-virtual-function");
+    CheckFactories.registerCheck<ConflictingGlobalAccesses>(
+        "bugprone-conflicting-global-accesses");
     CheckFactories.registerCheck<CopyConstructorInitCheck>(
         "bugprone-copy-constructor-init");
     CheckFactories.registerCheck<DanglingHandleCheck>(
diff --git a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
index e310ea9c94543..9754df6c06995 100644
--- a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
@@ -16,6 +16,7 @@ add_clang_library(clangTidyBugproneModule STATIC
   CastingThroughVoidCheck.cpp
   ChainedComparisonCheck.cpp
   ComparePointerToMemberVirtualFunctionCheck.cpp
+  ConflictingGlobalAccesses.cpp
   CopyConstructorInitCheck.cpp
   CrtpConstructorAccessibilityCheck.cpp
   DanglingHandleCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp b/clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
new file mode 100644
index 0000000000000..48906bf715ab2
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/ConflictingGlobalAccesses.cpp
@@ -0,0 +1,748 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "ConflictingGlobalAccesses.h"
+
+#include "clang/AST/RecursiveASTVisitor.h"
+
+using namespace clang::ast_matchers;
+
+namespace clang::tidy::bugprone {
+
+namespace {
+// An AccesKind represents one access to a global variable.
+//
+// The unchecked versions represent reads/writes that are not handled by
+// -Wunsequenced. (e.g. accesses inside functions).
+using AccessKind = uint8_t;
+static constexpr AccessKind AkRead = 0;
+static constexpr AccessKind AkWrite = 1;
+static constexpr AccessKind AkUncheckedRead = 2;
+static constexpr AccessKind AkUncheckedWrite = 3;
+
+static constexpr uint8_t AkCount = 4;
+
+// The TraversalResultKind represents a set of accesses.
+// Bits are corresponding to the AccessKind enum values.
+using TraversalResultKind = uint8_t;
+static constexpr TraversalResultKind TrInvalid = 0;
+static constexpr TraversalResultKind TrRead = 1 << AkRead;
+static constexpr TraversalResultKind TrWrite = 1 << AkWrite;
+static constexpr TraversalResultKind TrUncheckedWrite = 1 << AkUncheckedWrite;
+
+// To represent fields in structs or unions we use numbered FieldIndices. The
+// FieldIndexArray represents one field inside a global struct/union system.
+// The FieldIndexArray can be thought of as a path inside a tree.
+using FieldIndex = uint16_t;
+static constexpr FieldIndex FiUnion = 0x8000;
+
+// Note: This bit signals whether the field is a *field of* a struct or a
+// union, not whether the type of the field itself is a struct or a union.
+using FieldIndexArray = SmallVector<FieldIndex>;
+
+/// One traversal recurses into one side of a binary expression or one
+/// parameter of a function call. At least two of these traversals are used to
+/// find conflicting accesses.
+///
+/// A TraversalResult represents one traversal.
+struct TraversalResult {
+  int IndexCreated; // We use indices to keep track of which
+                    // traversal we are in currently. The current
+                    // index is stored in GlobalRWVisitor with the
+                    // name TraversalIndex.
+  SourceLocation Loc[AkCount];
+  TraversalResultKind Kind;
+
+  TraversalResult();
+  TraversalResult(int Index, SourceLocation Loc, AccessKind Access);
+  void addNewAccess(SourceLocation Loc, AccessKind Access);
+};
+
+/// The result of a number of traversals.
+class TraversalAggregation {
+  DeclarationName DeclName; // The name of the global variable being checked.
+
+  // We only store the result of two traversals as two conflicting accesses
+  // are enough to detect undefined behavior. The two stored TraversalResults
+  // have different traversal indices.
+  //
+  // Note: Sometimes multiple traversals are merged into one
+  // TraversalResult.
+  TraversalResult MainPart, OtherPart;
+  // Pairings that are not reportable: Read-Read, Read-Write,
+  // Read-UncheckedRead, Write-Write, UncheckedRead-UncheckedRead.
+
+public:
+  TraversalAggregation();
+  TraversalAggregation(DeclarationName Name, SourceLocation Loc,
+                       AccessKind Access, int Index);
+  void addGlobalRW(SourceLocation Loc, AccessKind Access, int Index);
+  DeclarationName getDeclName() const;
+
+  bool isValid() const;
+
+  // If there is a conflict and that conflict isn't reported by -Wunsequenced
+  // then we report the conflict.
+  bool shouldBeReported() const;
+  bool hasConflictingOperations() const;
+
+private:
+  bool hasTwoAccesses() const;
+  bool isReportedByWunsequenced() const;
+};
+
+/// The ObjectAccessTree stores the TraversalAggregations of one global
+/// struct/union. Because each field can be handled as a single variable, the
+/// tree stores one TraversalAggregation for every field.
+///
+/// Note: structs, classes, and unions are called objects in the code.
+struct ObjectAccessTree {
+  using FieldMap = llvm::DenseMap<int, std::unique_ptr<ObjectAccessTree>>;
+  TraversalAggregation OwnAccesses;
+
+  // In a union, new fields should inherit from UnionTemporalAccesses
+  // instead of OwnAccesses. That's because an access to a field of a union is
+  // also an access to every other field of the same union.
+  TraversalAggregation UnionTemporalAccesses;
+
+  // We try to be lazy and only store fields that are actually accessed.
+  FieldMap Fields;
+  bool IsUnion;
+
+  ObjectAccessTree(TraversalAggregation Own);
+
+  void addFieldToAll(SourceLocation Loc, AccessKind Access, int Index);
+  void addFieldToAllExcept(uint16_t ExceptIndex, SourceLocation Loc,
+                           AccessKind Access, int Index);
+};
+
+/// This object is the root of all ObjectAccessTrees.
+class ObjectTraversalAggregation {
+  DeclarationName DeclName; // The name of the global struct/union.
+  ObjectAccessTree AccessTree;
+
+public:
+  ObjectTraversalAggregation(DeclarationName Name, SourceLocation Loc,
+                             FieldIndexArray FieldIndices, AccessKind Access,
+                             int Index);
+  void addFieldRW(SourceLocation Loc, FieldIndexArray FieldIndices,
+                  AccessKind Access, int Index);
+  DeclarationName getDeclName() const;
+  bool shouldBeReported() const;
+
+private:
+  bool shouldBeReportedRec(const ObjectAccessTree *Node) const;
+};
+
+/// GlobalRWVisitor (global read write visitor) does all the traversals.
+class GlobalRWVisitor : public RecursiveASTVisitor<GlobalRWVisitor> {
+  friend RecursiveASTVisitor<GlobalRWVisitor>;
+
+public:
+  GlobalRWVisitor(bool IsWritePossibleThroughFunctionParam);
+
+  // startTraversal is called to start a new traversal. It increments the
+  // TraversalIndex, which in turn will generate new TraversalResults.
+  void startTraversal(Expr *E);
+
+  const llvm::SmallVector<TraversalAggregation> &getGlobalsFound() const;
+
+  const llvm::SmallVector<ObjectTraversalAggregation>
+      &getObjectGlobalsFound() const;
+
+protected:
+  // RecursiveASTVisitor overrides
+  bool VisitDeclRefExpr(DeclRefExpr *S);
+  bool VisitUnaryOperator(UnaryOperator *Op);
+  bool VisitBinaryOperator(BinaryOperator *Op);
+  bool VisitCallExpr(CallExpr *CE);
+  bool VisitMemberExpr(MemberExpr *ME);
+
+private:
+  void visitCallExprArgs(CallExpr *CE);
+  void traverseFunctionsToBeChecked();
+
+  llvm::SmallVector<TraversalAggregation> GlobalsFound;
+  llvm::SmallVector<ObjectTraversalAggregation> ObjectGlobalsFound;
+
+  // We check inside functions only if the functions hasn't been checked in
+  // the current traversal. We use this array to check if the function is
+  // already registered to be checked.
+  llvm::SmallVector<std::pair<DeclarationName, Stmt *>> FunctionsToBeChecked;
+
+  // The TraversalIndex is used to differentiate between two sides of a binary
+  // expression or the parameters of a function. Every traversal represents
+  // one such expression and the TraversalIndex is incremented between them.
+  int TraversalIndex;
+
+  // Accesses that are inside functions are not checked by -Wunsequenced,
+  // therefore we keep track of whether we are inside of a function or not.
+  bool IsInFunction;
+
+  // Same as the HandleMutableFunctionParametersAsWrites option.
+  bool IsWritePossibleThroughFunctionParam;
+
+  void addGlobal(DeclarationName Name, SourceLocation Loc, bool IsWrite,
+                 bool IsUnchecked);
+  void addGlobal(const DeclRefExpr *DR, bool IsWrite, bool IsUnchecked);
+  void addField(DeclarationName Name, FieldIndexArray FieldIndices,
+                SourceLocation Loc, bool IsWrite, bool IsUnchecked);
+  bool handleModified(const Expr *Modified, bool IsUnchecked);
+  bool handleModifiedVariable(const DeclRefExpr *DE, bool IsUnchecked);
+  bool handleAccessedObject(const Expr *E, bool IsWrite, bool IsUnchecked);
+  bool isVariable(const Expr *E);
+};
+} // namespace
+
+static bool isGlobalDecl(const VarDecl *VD) {
+  return VD && VD->hasGlobalStorage() && VD->getLocation().isValid() &&
+         !VD->getType().isConstQualified();
+}
+
+AST_MATCHER_P(Expr, twoGlobalWritesBetweenSequencePoints, const LangStandard *,
+              LangStd) {
+  assert(LangStd);
+
+  const Expr *E = &Node;
+
+  if (const BinaryOperator *Op = dyn_cast<BinaryOperator>(E)) {
+    const BinaryOperator::Opcode Code = Op->getOpcode();
+    if (Code == BO_LAnd || Code == BO_LOr || Code == BO_Comma) {
+      return false;
+    }
+
+    if (Op->isAssignmentOp() && isa<DeclRefExpr>(Op->getLHS())) {
+      return false;
+    }
+
+    if (LangStd->isCPlusPlus17() &&
+        (Code == BO_Shl || Code == BO_Shr || Code == BO_PtrMemD ||
+         Code == BO_PtrMemI || Op->isAssignmentOp())) {
+      return false;
+    }
+
+    return true;
+  }
+
+  if (isa<CallExpr>(E)) {
+    return true;
+  }
+
+  return false;
+}
+
+ConflictingGlobalAccesses::ConflictingGlobalAccesses(StringRef Name,
+                                                     ClangTidyContext *Context)
+    : ClangTidyCheck(Name, Context),
+      HandleMutableFunctionParametersAsWrites(
+              Options.get("HandleMutableFunctionParametersAsWrites", false))
+{}
+
+void ConflictingGlobalAccesses::storeOptions(ClangTidyOptions::OptionMap& Opts)
+{
+    Options.store(Opts, "HandleMutableFunctionParametersAsWrites",
+                  HandleMutableFunctionParametersAsWrites);
+}
+
+void ConflictingGlobalAccesses::registerMatchers(MatchFinder *Finder) {
+
+  const LangStandard *LangStd =
+      &LangStandard::getLangStandardForKind(getLangOpts().LangStd);
+
+  ast_matchers::internal::Matcher<Expr> GlobalAccessMatcher =
+      twoGlobalWritesBetweenSequencePoints(LangStd);
+
+  Finder->addMatcher(
+      stmt(traverse(TK_AsIs, expr(GlobalAccessMatcher).bind("gw"))), this);
+}
+
+void ConflictingGlobalAccesses::check(const MatchFinder::MatchResult &Result) {
+  const Expr *E = Result.Nodes.getNodeAs<Expr>("gw");
+  assert(E);
+
+  GlobalRWVisitor Visitor(HandleMutableFunctionParametersAsWrites);
+  if (const auto *Op = dyn_cast<BinaryOperator>(E)) {
+    Visitor.startTraversal(Op->getLHS());
+    Visitor.startTraversal(Op->getRHS());
+
+  } else if (const auto *CE = dyn_cast<CallExpr>(E)) {
+    for (uint32_t I = 0; I < CE->getNumArgs(); I++) {
+      Visitor.startTraversal(const_cast<Expr *>(CE->getArg(I)));
+    }
+  }
+
+  const llvm::SmallVector<TraversalAggregation> &Globals =
+      Visitor.getGlobalsFound();
+
+  for (uint32_t I = 0; I < Globals.size(); I++) {
+    if (Globals[I].shouldBeReported()) {
+      diag(E->getBeginLoc(), "read/write conflict on global variable " +
+                                 Globals[I].getDeclName().getAsString());
+    }
+  }
+  const llvm::SmallVector<ObjectTraversalAggregation> &ObjectGlobals =
+      Visitor.getObjectGlobalsFound();
+  for (uint32_t I = 0; I < ObjectGlobals.size(); I++) {
+    if (ObjectGlobals[I].shouldBeReported()) {
+      diag(E->getBeginLoc(), "read/write conflict on the field of the global "
+                             "object " +
+                                 ObjectGlobals[I].getDeclName().getAsString());
+    }
+  }
+}
+
+GlobalRWVisitor::GlobalRWVisitor(bool IsWritePossibleThroughFunctionParam) 
+    : TraversalIndex(0), IsInFunction(false),
+      IsWritePossibleThroughFunctionParam(IsWritePossibleThroughFunctionParam)
+{}
+
+void GlobalRWVisitor::startTraversal(Expr *E) {
+  TraversalIndex++;
+  FunctionsToBeChecked.clear();
+  IsInFunction = false;
+  TraverseStmt(E);
+
+  // We keep a list of functions to be checked during traversal so that they are
+  // not checked multiple times.
+  traverseFunctionsToBeChecked();
+}
+
+void GlobalRWVisitor::traverseFunctionsToBeChecked() {
+  IsInFunction = true;
+
+  // We could find more functions to be checked while checking functions.
+  // Because a simple iterator could get invalidated, we index into the array.
+  for (size_t I = 0; I < FunctionsToBeChecked.size(); ++I) {
+    TraverseStmt(FunctionsToBeChecked[I].second);
+  }
+}
+
+bool GlobalRWVisitor::isVariable(const Expr *E) {
+  const Type *T = E->getType().getTypePtrOrNull();
+  assert(T);
+
+  return isa<DeclRefExpr>(E) && (!T->isRecordType() || T->isUnionType());
+}
+
+bool GlobalRWVisitor::VisitDeclRefExpr(DeclRefExpr *DR) {
+  if (!isa<VarDecl>(DR->getDecl())) {
+    return true;
+  }
+  const auto *VD = dyn_cast<VarDecl>(DR->getDecl());
+  if (!isVariable(DR)) {
+    return handleAccessedObject(DR, /*IsWrite*/ false, /*IsUnchecked*/ false);
+  }
+  if (isGlobalDecl(VD)) {
+    addGlobal(VD->getDeclName(), VD->getBeginLoc(), /*IsWrite*/ false,
+              /*IsUnchecked*/ false);
+    return true;
+  }
+  return true;
+}
+
+bool GlobalRWVisitor::VisitMemberExpr(MemberExpr *ME) {
+  return handleAccessedObject(ME, /*IsWrite*/ false, /*IsUnchecked*/ false);
+}
+
+bool GlobalRWVisitor::handleModifiedVariable(const DeclRefExpr *DR,
+                                             bool IsUnchecked) {
+  if (!isa<VarDecl>(DR->getDecl())) {
+    return true;
+  }
+  const auto *VD = dyn_cast<VarDecl>(DR->getDecl());
+
+  if (isGlobalDecl(VD)) {
+    addGlobal(VD->getDeclName(), VD->getBeginLoc(), /*IsWrite*/ true,
+              IsUnchecked);
+    return false;
+  }
+  return true;
+}
+
+bool GlobalRWVisitor::handleAccessedObject(const Expr *E, bool IsWrite,
+                                           bool IsUnchecked) {
+  const Expr *CurrentNode = E;
+  int NodeCount = 0;
+  while (isa<MemberExpr>(CurrentNode)) {
+    const MemberExpr *CurrentField = dyn_cast<MemberExpr>(CurrentNode);
+
+    if (CurrentField->isArrow()) {
+      return true;
+    }
+
+    const ValueDecl *Decl = CurrentField->getMemberDecl();
+    if (!isa<FieldDecl>(Decl)) {
+      return true;
+    }
+
+    CurrentNode = CurrentField->getBase();
+    NodeCount++;
+  }
+
+  if (!isa<DeclRefExpr>(CurrentNode)) {
+    return true;
+  }
+  const DeclRefExpr *Base = dyn_cast<DeclRefExpr>(CurrentNode);
+
+  if (!isa<VarDecl>(Base->getDecl())) {
+    return true;
+  }
+  const VarDecl *BaseDecl = dyn_cast<VarDecl>(Base->getDecl());
+
+  if (!isGlobalDecl(BaseDecl)) {
+    return true;
+  }
+
+  FieldIndexArray FieldIndices(NodeCount);
+  CurrentNode = E;
+  while (isa<MemberExpr>(CurrentNode)) {
+    const MemberExpr *CurrentField = dyn_cast<MemberExpr>(CurrentNode);
+    const FieldDecl *Decl = dyn_cast<FieldDecl>(CurrentField->getMemberDecl());
+    assert(Decl);
+
+    FieldIndices[NodeCount - 1] = Decl->getFieldIndex();
+    const RecordDecl *Record = Decl->getParent();
+    assert(Record);
+
+    if (Record->isUnion()) {
+      FieldIndices[NodeCount - 1] |= FiUnion;
+    }
+
+    CurrentNode = CurrentField->getBase();
+    NodeCount--;
+  }
+
+  addField(BaseDecl->getDeclName(), FieldIndices, Base->getBeginLoc(), IsWrite,
+           IsUnchecked);
+  return false;
+}
+
+bool GlobalRWVisitor::handleModified(const Expr *Modified, bool IsUnchecked) {
+  assert(Modified);
+
+  if (isVariable(Modified)) {
+    return handleModifiedVariable(dyn_cast<DeclRefExpr>(Modified), IsUnchecked);
+  }
+
+  return handleAccessedObject(Modified, /*IsWrite*/ true, IsUnchecked);
+}
+
+bool GlobalRWVisitor::VisitUnaryOperator(UnaryOperator *Op) {
+  UnaryOperator::Opcode Code = Op->getOpcode();
+  if (Code == UO_PostInc || Code == UO_PostDec || Code == UO_PreInc ||
+      Code == UO_PreDec) {
+    return handleModified(Op->getSubExpr(), /*IsUnchecked*/ false);
+  }
+  return true;
+}
+
+bool GlobalRWVisitor::VisitBinaryOperator(BinaryOperator *Op) {
+  if (Op->isAssignmentOp()) {
+    return handleModified(Op->getLHS(), /*IsUnchecked*/ false);
+  }
+
+  return true;
+}
+
+void GlobalRWVisitor::visitCallExprArgs(CallExpr *CE) {
+  const Type *CT = CE->getCallee()->getType().getTypePtrOrNull();
+  if (const auto *PT = dyn_cast_if_present<PointerType>(CT)) {
+    CT = PT->getPointeeType().getTypePtrOrNull();
+  }
+  const auto *ProtoType = dyn_cast_if_present<FunctionProtoType>(CT);
+
+  for (uint32_t I = 0; I < CE->getNumArgs(); I++) {
+    Expr *Arg = CE->getArg(I);
+
+    if (!ProtoType || I >= ProtoType->getNumParams()) {
+      continue;
+    }
+
+    if (const auto *Op = dyn_cast<UnaryOperator>(Arg)) {
+      if (Op->getOpcode() != UO_AddrOf) {
+        continue;
+      }
+
+      if (const auto *PtrType = dyn_cast_if_present<PointerType>(
+              ProtoType->getParamType(I).getTypePtrOrNull())) {
+        if (PtrType->getPointeeType().isConstQualified()) {
+          continue;
+        }
+
+        if (handleModified(Op->getSubExpr(), /*IsUnchecked*/ true)) {
+          continue;
+        }
+      }
+    }
+
+    if (const auto *RefType = dyn_cast_if_present<ReferenceType>(
+            ProtoType->getParamType(I).getTypePtrOrNull())) {
+      if (RefType->getPointeeType().isConstQualified()) {
+        continue;
+      }
+
+      if (handleModified(Arg, /*IsUnchecked*/ true)) {
+        continue;
+      }
+    }
+  }
+}
+
+bool GlobalRWVisitor::VisitCallExpr(CallExpr *CE) {
+
+  if (IsWritePossibleThroughFunctionParam || isa<CXXOperatorCallExpr>(CE)) {
+    visitCallExprArgs(CE);
+  }
+
+  if (!isa_and_nonnull<FunctionDecl>(CE->getCalleeDecl())) {
+    return true;
+  }
+  const auto *FD = dyn_cast<FunctionDecl>(CE->getCalleeDecl());
+
+  if (!FD->hasBody()) {
+    return true;
+  }
+
+  for (const std::pair<DeclarationName, Stmt *> &Fun : FunctionsToBeChecked) {
+    if (Fun.first == FD->getDeclName()) {
+      return true;
+    }
+  }
+  FunctionsToBeChecked.emplace_back(FD->getDeclName(), FD->getBody());
+
+  return true;
+}
+
+const llvm::SmallVector<TraversalAggregation> &
+GlobalRWVisitor::getGlobalsFound() const {
+  return GlobalsFound;
+}
+
+const llvm::SmallVector<ObjectTraversalAggregation> &
+GlobalRWVisitor::getObjectGlobalsFound() const {
+  return ObjectGlobalsFound;
+}
+
+void GlobalRWVisitor::addGlobal(DeclarationName Name, SourceLocation Loc,
+                                bool IsWrite, bool IsUnchecked) {
+  AccessKind Access = (IsInFunction || IsUnchecked)
+                          ? (IsWrite ? AkUncheckedWrite : AkUncheckedRead)
+                          : (IsWrite ? AkWrite : AkRead);
+  for (uint32_t I = 0; I < GlobalsFound.size(); I++) {
+    if (GlobalsFound[I].getDeclName() == Name) {
+      Global...
[truncated]

Copy link
Member

@PiotrZSL PiotrZSL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about name, some simpler name would be better.
Missing documentation/release notes.

Add check need to be simplified.

@carlosgalvezp
Copy link
Contributor

If this is a CERT check, please add the corresponding aliases towards the CERT module.

@ConcreteCactus ConcreteCactus force-pushed the main branch 3 times, most recently from 6879af3 to 7b02721 Compare April 27, 2025 14:18
@ConcreteCactus ConcreteCactus force-pushed the main branch 2 times, most recently from 0196b1b to 983fb8e Compare May 28, 2025 15:41
@ConcreteCactus
Copy link
Author

Hi Everyone,
I fixed a crash in the ExecutionVisitor. It wasn't checking whether the definition of a class exists before trying to recurse into the destructor.

Can I get a re-review on this PR? I think I addressed most of the comments.

Copy link
Contributor

@vbvictor vbvictor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly left comments in docs about general improvements. Didn't look closely at the code, but here are some general recommendations that you should do before major review:

There are a lot of new code that is relatively hard to review, you should try to follow LLVM coding guidelines, most importantly these:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto code-block

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the code block directive as requested, but I wasn't sure how it works with the option directive, so I added the code block outside of it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It works inside option, for example look at https://clang.llvm.org/extra/clang-tidy/checks/cppcoreguidelines/avoid-do-while.html.
You should put code under option

Copy link
Contributor

@vbvictor vbvictor Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, code-blocks need to start with a new line, I'm not sure if they are rendered correctly.
Did you run ninja docs-clang-tools-html to check?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, my ninja can't find target docs-clang-tools-html. Maybe I didn't enable its project when I was generating with cmake.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need Sphinx installed in your system.
Please test your docs that they look correct.
https://clang.llvm.org/extra/clang-tidy/Contributing.html#documenting-your-check

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can read full guide at https://clang.llvm.org/extra/clang-tidy/Contributing.html, it says how and what you should install and do.
Also, Fix clang-format issues and clang-tidy issues (if present), Currently there isn't a CI job for clang-tidy run so please do it locally.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you I'll test and fix the documentation and the formatting issues. I'll run clang-tidy as well.

Comment on lines 62 to 64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option need some rephrasing and clarification IMO.
It should be something like:
"When `true`, treat calls of functions that accept a mutable reference or a pointer as a write to the variable that was passed to the function."
I'm not totally satisfied with my wording, but I tried to eliminate this ambiguity:
What does "handling parameters as writes" mean here? Especially word "handling".

Also, give an example of some real code that doesn't get flagged without this option and gets flagged with the option enabled.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I clarified the description and added a more descriptive code example. Hope this helps.

Comment on lines 10 to 12
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should not explicitly exclude cases that are covered by "-Wunsequenced". It's okay for check to have duplicate functionality with compiler flags. Moreover, for gcc there is "-Wsequence-point" and we don't know to what extent "-Wunsequenced" can find bugs, so It's better just to mention that functionality of this check may overlap with existing compiler flags such as "-Wunsequenced", "-Wsequence-point".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I wasn't sure whether to include/exclude -Wunsequenced. I removed the code parts that excluded flagging those cases, and noted later in the documentation that there could be some overlap.

@vbvictor vbvictor requested a review from EugeneZelenko June 1, 2025 14:45
@ConcreteCactus ConcreteCactus changed the title [clang-tidy] Added Conflicting Global Accesses checker [clang-tidy] Added bugprone-conflicting-global-accesses check Jun 10, 2025
@ConcreteCactus ConcreteCactus changed the title [clang-tidy] Added bugprone-conflicting-global-accesses check [clang-tidy] Added bugprone-unsequenced-global-accesses check Jun 10, 2025

.. option:: HandleMutableFunctionParametersAsWrites

When `true`, treat function calls with mutable reference or pointer parameters
Copy link
Contributor

@vbvictor vbvictor Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whole description of the option should be with 2 spaces indent

@github-actions
Copy link

github-actions bot commented Jun 10, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

    This check attempts to detect unsequenced accesses to global
    variables. It recurses into function calls in the same translation unit,
    and can handle fields on global structs/unions.
@vbvictor
Copy link
Contributor

Looking at the check code in general, I have some concerns whether this check should be implemented as a Clang Static Analyzer (CSA) check instead of a clang-tidy check.
My main reason is that we now have a lot of manual node traversing, even an ExecutionVisitor. This whole traversing part must be already implemented in the core engine of the CSA. Even in Getting Involved guide it mentions that CSA should be used for control flow analysis, and we clearly have it here.

I kindly ask for an opinion of clang-tidy maintainers on this matter, @HerrCai0907 @carlosgalvezp @PiotrZSL.

Also, kindly ask CSA maintainers to take a brief look at this checker. From your point of view, will it be easier and less code consuming to implement such checker as a part of CSA? @steakhal @NagyDonat @haoNoQ.

@steakhal
Copy link
Contributor

CC @michael-jabbour-sonarsource

return globalVar + incFun(); // This is not detected by -Wunsequenced.
}

This clang-tidy check attempts to detect such cases. It recurses into functions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This clang-tidy check attempts to detect such cases. It recurses into functions
This check attempts to detect such cases. It recurses into functions


.. option:: HandleMutableFunctionParametersAsWrites

When ``true``, treat function calls with mutable reference or pointer
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When ``true``, treat function calls with mutable reference or pointer
When `true`, treat function calls with mutable reference or pointer

When ``true``, treat function calls with mutable reference or pointer
parameters as writes to the parameter.

The default value is ``false``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The default value is ``false``.
Default is `false`.

The default value is ``false``.

For example, the following code block will get flagged if
``HandleMutableFunctionParametersAsWrites`` is ``true``:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
``HandleMutableFunctionParametersAsWrites`` is ``true``:
:option:`HandleMutableFunctionParametersAsWrites` is `true`:

int a = globalVar + func(globalVar);
}

When ``HandleMutableFunctionParametersAsWrites`` is set to `true`, the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When ``HandleMutableFunctionParametersAsWrites`` is set to `true`, the
When :option:`HandleMutableFunctionParametersAsWrites` is set to `true`, the

sequencing is defined for the ``+`` operator, a write to ``globalVar``
inside ``c`` would be undefined behavior.

When ``HandleMutableFunctionParametersAsWrites`` is set to ``false``, the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When ``HandleMutableFunctionParametersAsWrites`` is set to ``false``, the
When :option:`HandleMutableFunctionParametersAsWrites` is set to `false`, the

@haoNoQ
Copy link
Collaborator

haoNoQ commented Jun 12, 2025

Looking at the check code in general, I have some concerns whether this check should be implemented as a Clang Static Analyzer (CSA) check instead of a clang-tidy check. My main reason is that we now have a lot of manual node traversing, even an ExecutionVisitor. This whole traversing part must be already implemented in the core engine of the CSA. Even in Getting Involved guide it mentions that CSA should be used for control flow analysis, and we clearly have it here.

I kindly ask for an opinion of clang-tidy maintainers on this matter, @HerrCai0907 @carlosgalvezp @PiotrZSL.

Also, kindly ask CSA maintainers to take a brief look at this checker. From your point of view, will it be easier and less code consuming to implement such checker as a part of CSA? @steakhal @NagyDonat @haoNoQ.

Oh good question. Generally it's a bit more nuanced than that. The static analyzer is doing a lot more than AST traversal; it's doing so-called "path sensitive" analysis which tries to enumerate different execution paths in the program. Which is very good at finding realistic execution paths on which something "bad" may happen, and not suitable for all other kinds of analysis.

Yes, I think this checker could be implemented as a path-sensitive checker in the static analyzer. It does indeed search for "bad" things that may happen, which is exactly what the path-sensitive engine is good at. Path-sensitive analysis is more precise, it may help you naturally eliminate certain classes of false positives that would be extremely difficult to eliminate manually. However, it's also significantly more expensive, so it'll run slower for the users who aren't already using path-other sensitive checks (the performance cost will be negligible for users who are already enabling at least one other path-sensitive checker). Additionally, because of being more expensive and explosive in complexity, it may suffer from more false negatives, simply because it's running out of the analysis "budget".

This won't necessarily eliminate a lot of code from the checker. Path-sensitive checkers often end up more simple and elegant. You'll still need to identify the places where the execution may be unsequenced, we won't do that for you. But then you'll put markers in the path-sensitive state when you're inside those statements, and focus on responding to global variable accesses while those markers are present. We'll do the rest of the traversal for you. Path-sensitive analysis may also save you a lot of follow-up polish work (I haven't looked at how much of that you've already done).


Now, in practice, the best thing about path-sensitive analysis is that it naturally helps you avoid false positives of the following nature:

int globalVar;

int foo(bool shouldInc) {
  if (shouldInc)
    globalVar++;
  return globalVar;
}

void bar() {
  globalVar + foo(false); // should not warn
}

In this case the problematic mutation cannot happen in practice because the branch on which the increment happens is entirely unreachable in the context of bar(). The function foo() may increment the variable in other situations but it will never do that when called from inside bar() specifically.

These kinds of false positives may happen in a variety of different ways and you'll have a very hard time pattern-matching them all down on the AST level without a powerful principled solution that the path-sensitive engine implements. If you're using path-sensitive analysis, the false positive is eliminated automatically, by construction, without any effort on your part. You will simply never notice the increment of the global variable when you're analyzing bar() so you won't have to worry about reacting to it incorrectly. The analyzer simply knows that this execution path does not exist, so it'll never show it to you.

Whether you actually worry about these false positives, is entirely up to you though. It's a matter of your relationship with your users, a matter of understanding their demands, and also a matter of observing the behavior of the checker on real-world code. I cannot answer that question for you, you'll need to experiment.

We've had a similar situation in our "don't call virtual methods on a partially constructed object" checker (https://clang.llvm.org/docs/analyzer/checkers.html#cplusplus-purevirtualcall-c, https://clang.llvm.org/docs/analyzer/checkers.html#optin-cplusplus-virtualcall). It turned out that in many cases the code was deliberately written in such a way that the constructor may call a function in turn may eventually call a virtual method but in reality the constructor passes a flag that makes sure the virtual method is never called from inside that function. So our initial attempt to avoid path-sensitive analysis has failed and we had to rewrite the checker as a path-sensitive checker.


Path-sensitive analysis also offers other bonuses. For example, it's able to track references and aliasing to some extent, so you're getting the following case covered "for free":

int globalVar;

int foo(int &var) {
  return var++;
}

void bar() {
  int &var = globalVar;
  globalVar + foo(var); // should warn
}

It's also able to resolve calls through function pointers and virtual calls in some cases:

int globalVar;

int foo() {
  return globalVar++;
}

void bar() {
  int (*func)() = foo;
  globalVar + func(); // should warn
}

But these are relatively minor and they probably won't ruin your checker if you don't support them.


So, overall, up to you. If you go for path-sensitive analysis, it would be a major rewrite, and not without downsides. You probably don't need to do it just for the purposes of elegance or code reuse. (Though I definitely won't stop you. I love elegance and code reuse.)

I think the main question you need to answer is whether the increased precision - in terms of both false positives and false negatives - is of value to you. The potential false negatives due to budget limitations are probably the second most important factor. But either way, it all depends on your experimental data and on your relationship with your future users.

@haoNoQ
Copy link
Collaborator

haoNoQ commented Jun 12, 2025

Ok I kinda forgot about the elephant in the room.

With path sensitivity you'll be able to handle more kinds of variables. Not just global variables but also stuff in heap memory. As well as someone else's stack variables passed by reference down the stack. Or it could be, like, just some pointer or reference passed to us from god knows where. We don't have to know where and by whom exactly the memory it points to has been allocated. We just need to know whether it's the same memory address at two different points of the program.

The static analyzer will model such aliasing for you and give you that information for free, so that at any point of time you know which pointer points to which storage, even if you can't (or don't want to) describe that storage in terms of source code. It'd magically work correctly even if that information depends on the exact execution path up to that point. You don't even need to handle all these cases separately. It'd just be "some memory" to you.

This could make your check significantly more powerful. And that's very hard without path sensitivity.


FWIW one case you can probably handle without path sensitivity is: member variables inside a method. Eg., in

class MyClass {
  int memberVar;

public:
  int foo() {
    memberVar++;
    return memberVar;
  }

  void bar() {
    memberVar + foo(); // should warn
  }
};

you don't really need to know what this points to, as long as you confirm that it points to the same thing inside foo() and bar(). Which is trivial because the foo() invocation inside bar() is invoked directly on this. This is so easy because the this pointer cannot be overwritten at runtime, so you don't have to worry about code like this = &other.

You can probably handle references similarly, since they also cannot be reassigned at runtime. It doesn't matter that you don't know what the reference refers to:

int foo(int &refVar) {
  refVar++;
  return refVar;
}

void bar(int &refVar) {
  refVar + foo(refVar); // should warn
}

But when it comes to pointers and heap storage, you'll probably have a very hard time with path sensitivity.

@haoNoQ
Copy link
Collaborator

haoNoQ commented Jun 12, 2025

// To represent fields in structs or unions we use numbered FieldIndices.

This piece of complexity is also trivialized by path sensitive analysis. You'll simply ask whether it's an access to the same memory address or not. No need to describe what the address is or who owns it.

@vbvictor
Copy link
Contributor

Thank you for your long and detailed answer! Indeed, the check can benefit a lot from path-sensitive analysis:

  • Easier traversing with less boilerplate code.
  • Easier handling of edge-cases like struct fields and unions, support of local variables.
  • Fewer false-positives with more positives.

So from my perspective, this check can live as a clang-tidy check. However, looking at all benefits of path-sensitive analysis of CSA, the check would look a lot better as a CSA check. I understand that CSA checks might be harder to write since CSA itself has higher entry threshold than clang-tidy, but it's hard to give a thoughtful and careful review of such big clang-tidy check, especially that has non-trivial manual AST-visiting.

@HerrCai0907
Copy link
Contributor

It does not look like traditional clang-tidy check. Personally I prefer to move it to CSA instead of clang-tidy.

@ConcreteCactus
Copy link
Author

Hi Everyone,

Thank you for the detailed suggestions. Now that I have a better idea of what CSA is, I also think that this check could be better implemented in clang-static-analyzer. Being able to reduce the number of false positives using path sensitive analysis is in my opinion the most important advantage of using CSA.

What do you think? Should I close this PR, and open a new one when I'm ready with the CSA checker?

By the way, I don't have any users. This check is more like a University project I picked. The reason I started with clang-tidy is simply because that was the suggestion my teacher gave me.

@vbvictor
Copy link
Contributor

Should I close this PR, and open a new one when I'm ready with the CSA checker?

To keep everything organized, I'd suggest creating a new branch for CSA checker and close this PR (you can always reopen it later).

For the CSA part, you can start by reading this developer guide (probably a bit outdated?). This original white paper is very helpful in understanding the core principles of the CSA and how it all started. haoNoQ's guide gives excellent practical description of how to write checkers and what functions the CSA has.

@ConcreteCactus
Copy link
Author

To keep everything organized, I'd suggest creating a new branch for CSA checker and close this PR (you can always reopen it later).

For the CSA part, you can start by reading this developer guide (probably a bit outdated?). This original white paper is very helpful in understanding the core principles of the CSA and how it all started. haoNoQ's guide gives excellent practical description of how to write checkers and what functions the CSA has.

Thank you, I'll open a new PR when I'm ready with the CSA checker.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants