Skip to content

Conversation

@gamesh411
Copy link
Contributor

@gamesh411 gamesh411 commented Sep 23, 2024

This patch's primary driving force is reducing code duplication in BlockInCriticalSectionChecker and PthreadLockChecker.
This is mainly NFC. The user-facing API is the same, but at least one false positive has been fixed, and the internal state representation has changed.

@gamesh411 gamesh411 force-pushed the thread-modeling-upstreaming branch from b660744 to c9219d6 Compare September 23, 2024 12:43
@github-actions
Copy link

github-actions bot commented Sep 23, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 82e314e3664d2c8768212e74f751187d51950b87 e09f147f8298e1d8a0657e8dafd4f96f57519b8b --extensions c,cpp,h -- clang/lib/StaticAnalyzer/Checkers/MutexModeling.cpp clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingAPI.h clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingDefs.h clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingDomain.h clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingGDM.h clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexRegionExtractor.h clang/lib/StaticAnalyzer/Checkers/BlockInCriticalSectionChecker.cpp clang/lib/StaticAnalyzer/Checkers/PthreadLockChecker.cpp clang/test/Analysis/analyzer-enabled-checkers.c clang/test/Analysis/block-in-critical-section.cpp clang/test/Analysis/pthreadlock_state.c clang/test/Analysis/pthreadlock_state_nottracked.c clang/test/Analysis/std-c-library-functions-arg-enabled-checkers.c
View the diff from clang-format here.
diff --git a/clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingAPI.h b/clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingAPI.h
index bcc076a256..a597d860e4 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingAPI.h
+++ b/clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingAPI.h
@@ -53,9 +53,9 @@ MakeFirstArgExtractor(ArrayRef<StringRef> NameParts, int NumArgsRequired = 1,
       CallDescription{MatchAs, NameParts, NumArgsRequired}};
 }
 
-inline auto MakeMemberExtractor(ArrayRef<StringRef> NameParts,
-                                int NumArgsRequired = 0,
-                                CallDescription::Mode MatchAs = CDM::CXXMethod) {
+inline auto
+MakeMemberExtractor(ArrayRef<StringRef> NameParts, int NumArgsRequired = 0,
+                    CallDescription::Mode MatchAs = CDM::CXXMethod) {
   return MemberMutexExtractor{
       CallDescription{MatchAs, NameParts, NumArgsRequired}};
 }
diff --git a/clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingDefs.h b/clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingDefs.h
index d0b2f5957b..48d1bedffb 100644
--- a/clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingDefs.h
+++ b/clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingDefs.h
@@ -22,8 +22,8 @@ namespace clang::ento::mutex_modeling {
 
 static auto getHandledEvents(){return std::vector<EventDescriptor> {
   // - Pthread
-  EventDescriptor{MakeFirstArgExtractor({"pthread_mutex_init"}), EventKind::Init,
-                  LibraryKind::Pthread},
+  EventDescriptor{MakeFirstArgExtractor({"pthread_mutex_init"}),
+                  EventKind::Init, LibraryKind::Pthread},
 #if 0
              // TODO: pthread_rwlock_init(2 arguments).
              // TODO: lck_mtx_init(3 arguments).

@gamesh411 gamesh411 force-pushed the thread-modeling-upstreaming branch 4 times, most recently from 3e6439e to 5f80c4a Compare October 7, 2024 00:17
@gamesh411 gamesh411 changed the title [clang][analyzer][WIP] Introduce modeling for threading related checkers [clang][analyzer] Introduce modeling for threading related checkers Oct 7, 2024
@gamesh411 gamesh411 marked this pull request as ready for review October 7, 2024 08:42
@gamesh411 gamesh411 requested review from NagyDonat and haoNoQ October 7, 2024 08:42
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Oct 7, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 7, 2024

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Endre Fülöp (gamesh411)

Changes

This patch's primary driving force is reducing code duplication in BlockInCriticalSectionChecker and PthreadLockChecker.
This is mainly NFC. The user-facing API is the same, but at least one false positive has been fixed, and the internal state representation has changed.


Patch is 115.54 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/109636.diff

14 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Checkers/Checkers.td (+8-1)
  • (modified) clang/lib/StaticAnalyzer/Checkers/BlockInCriticalSectionChecker.cpp (+55-318)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt (+1)
  • (added) clang/lib/StaticAnalyzer/Checkers/MutexModeling.cpp (+773)
  • (added) clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingAPI.h (+244)
  • (added) clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingDomain.h (+126)
  • (added) clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexModelingGDM.h (+169)
  • (added) clang/lib/StaticAnalyzer/Checkers/MutexModeling/MutexRegionExtractor.h (+139)
  • (modified) clang/lib/StaticAnalyzer/Checkers/PthreadLockChecker.cpp (+148-609)
  • (modified) clang/test/Analysis/analyzer-enabled-checkers.c (+1)
  • (modified) clang/test/Analysis/block-in-critical-section.cpp (+8-13)
  • (modified) clang/test/Analysis/pthreadlock_state.c (+48-5)
  • (modified) clang/test/Analysis/pthreadlock_state_nottracked.c (+5)
  • (modified) clang/test/Analysis/std-c-library-functions-arg-enabled-checkers.c (+1)
diff --git a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
index 349040c15eeb83..b41580dcfba575 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -219,6 +219,11 @@ def DereferenceChecker : Checker<"NullDereference">,
   ]>,
   Documentation<HasDocumentation>;
 
+def MutexModeling : Checker<"MutexModeling">,
+  HelpText<"Model mutexes lock and unlock events">,
+  Documentation<NotDocumented>,
+  Hidden;
+
 def NonNullParamChecker : Checker<"NonNullParamChecker">,
   HelpText<"Check for null pointers passed as arguments to a function whose "
            "arguments are references or marked with the 'nonnull' attribute">,
@@ -307,6 +312,7 @@ def StackAddrAsyncEscapeChecker : Checker<"StackAddressAsyncEscape">,
 
 def PthreadLockBase : Checker<"PthreadLockBase">,
   HelpText<"Helper registering multiple checks.">,
+  Dependencies<[MutexModeling]>,
   Documentation<NotDocumented>,
   Hidden;
 
@@ -505,6 +511,7 @@ def UnixAPIMisuseChecker : Checker<"API">,
 
 def BlockInCriticalSectionChecker : Checker<"BlockInCriticalSection">,
   HelpText<"Check for calls to blocking functions inside a critical section">,
+  Dependencies<[MutexModeling]>,
   Documentation<HasDocumentation>;
 
 def DynamicMemoryModeling: Checker<"DynamicMemoryModeling">,
@@ -610,7 +617,7 @@ def ChrootChecker : Checker<"Chroot">,
 
 def PthreadLockChecker : Checker<"PthreadLock">,
   HelpText<"Simple lock -> unlock checker">,
-  Dependencies<[PthreadLockBase]>,
+  Dependencies<[PthreadLockBase, MutexModeling]>,
   Documentation<HasDocumentation>;
 
 def SimpleStreamChecker : Checker<"SimpleStream">,
diff --git a/clang/lib/StaticAnalyzer/Checkers/BlockInCriticalSectionChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/BlockInCriticalSectionChecker.cpp
index 7460781799d08a..e5be2dfc19ff42 100644
--- a/clang/lib/StaticAnalyzer/Checkers/BlockInCriticalSectionChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/BlockInCriticalSectionChecker.cpp
@@ -14,165 +14,24 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "MutexModeling/MutexModelingAPI.h"
+
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
-#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h"
-#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramStateTrait.h"
-#include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState_Fwd.h"
-#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h"
-#include "llvm/ADT/STLExtras.h"
-#include "llvm/ADT/SmallString.h"
-#include "llvm/ADT/StringExtras.h"
 
-#include <iterator>
 #include <utility>
-#include <variant>
 
 using namespace clang;
 using namespace ento;
+using namespace mutex_modeling;
 
 namespace {
-
-struct CritSectionMarker {
-  const Expr *LockExpr{};
-  const MemRegion *LockReg{};
-
-  void Profile(llvm::FoldingSetNodeID &ID) const {
-    ID.Add(LockExpr);
-    ID.Add(LockReg);
-  }
-
-  [[nodiscard]] constexpr bool
-  operator==(const CritSectionMarker &Other) const noexcept {
-    return LockExpr == Other.LockExpr && LockReg == Other.LockReg;
-  }
-  [[nodiscard]] constexpr bool
-  operator!=(const CritSectionMarker &Other) const noexcept {
-    return !(*this == Other);
-  }
-};
-
-class CallDescriptionBasedMatcher {
-  CallDescription LockFn;
-  CallDescription UnlockFn;
-
-public:
-  CallDescriptionBasedMatcher(CallDescription &&LockFn,
-                              CallDescription &&UnlockFn)
-      : LockFn(std::move(LockFn)), UnlockFn(std::move(UnlockFn)) {}
-  [[nodiscard]] bool matches(const CallEvent &Call, bool IsLock) const {
-    if (IsLock) {
-      return LockFn.matches(Call);
-    }
-    return UnlockFn.matches(Call);
-  }
-};
-
-class FirstArgMutexDescriptor : public CallDescriptionBasedMatcher {
-public:
-  FirstArgMutexDescriptor(CallDescription &&LockFn, CallDescription &&UnlockFn)
-      : CallDescriptionBasedMatcher(std::move(LockFn), std::move(UnlockFn)) {}
-
-  [[nodiscard]] const MemRegion *getRegion(const CallEvent &Call, bool) const {
-    return Call.getArgSVal(0).getAsRegion();
-  }
-};
-
-class MemberMutexDescriptor : public CallDescriptionBasedMatcher {
-public:
-  MemberMutexDescriptor(CallDescription &&LockFn, CallDescription &&UnlockFn)
-      : CallDescriptionBasedMatcher(std::move(LockFn), std::move(UnlockFn)) {}
-
-  [[nodiscard]] const MemRegion *getRegion(const CallEvent &Call, bool) const {
-    return cast<CXXMemberCall>(Call).getCXXThisVal().getAsRegion();
-  }
-};
-
-class RAIIMutexDescriptor {
-  mutable const IdentifierInfo *Guard{};
-  mutable bool IdentifierInfoInitialized{};
-  mutable llvm::SmallString<32> GuardName{};
-
-  void initIdentifierInfo(const CallEvent &Call) const {
-    if (!IdentifierInfoInitialized) {
-      // In case of checking C code, or when the corresponding headers are not
-      // included, we might end up query the identifier table every time when
-      // this function is called instead of early returning it. To avoid this, a
-      // bool variable (IdentifierInfoInitialized) is used and the function will
-      // be run only once.
-      const auto &ASTCtx = Call.getState()->getStateManager().getContext();
-      Guard = &ASTCtx.Idents.get(GuardName);
-    }
-  }
-
-  template <typename T> bool matchesImpl(const CallEvent &Call) const {
-    const T *C = dyn_cast<T>(&Call);
-    if (!C)
-      return false;
-    const IdentifierInfo *II =
-        cast<CXXRecordDecl>(C->getDecl()->getParent())->getIdentifier();
-    return II == Guard;
-  }
-
-public:
-  RAIIMutexDescriptor(StringRef GuardName) : GuardName(GuardName) {}
-  [[nodiscard]] bool matches(const CallEvent &Call, bool IsLock) const {
-    initIdentifierInfo(Call);
-    if (IsLock) {
-      return matchesImpl<CXXConstructorCall>(Call);
-    }
-    return matchesImpl<CXXDestructorCall>(Call);
-  }
-  [[nodiscard]] const MemRegion *getRegion(const CallEvent &Call,
-                                           bool IsLock) const {
-    const MemRegion *LockRegion = nullptr;
-    if (IsLock) {
-      if (std::optional<SVal> Object = Call.getReturnValueUnderConstruction()) {
-        LockRegion = Object->getAsRegion();
-      }
-    } else {
-      LockRegion = cast<CXXDestructorCall>(Call).getCXXThisVal().getAsRegion();
-    }
-    return LockRegion;
-  }
-};
-
-using MutexDescriptor =
-    std::variant<FirstArgMutexDescriptor, MemberMutexDescriptor,
-                 RAIIMutexDescriptor>;
-
 class BlockInCriticalSectionChecker : public Checker<check::PostCall> {
 private:
-  const std::array<MutexDescriptor, 8> MutexDescriptors{
-      // NOTE: There are standard library implementations where some methods
-      // of `std::mutex` are inherited from an implementation detail base
-      // class, and those aren't matched by the name specification {"std",
-      // "mutex", "lock"}.
-      // As a workaround here we omit the class name and only require the
-      // presence of the name parts "std" and "lock"/"unlock".
-      // TODO: Ensure that CallDescription understands inherited methods.
-      MemberMutexDescriptor(
-          {/*MatchAs=*/CDM::CXXMethod,
-           /*QualifiedName=*/{"std", /*"mutex",*/ "lock"},
-           /*RequiredArgs=*/0},
-          {CDM::CXXMethod, {"std", /*"mutex",*/ "unlock"}, 0}),
-      FirstArgMutexDescriptor({CDM::CLibrary, {"pthread_mutex_lock"}, 1},
-                              {CDM::CLibrary, {"pthread_mutex_unlock"}, 1}),
-      FirstArgMutexDescriptor({CDM::CLibrary, {"mtx_lock"}, 1},
-                              {CDM::CLibrary, {"mtx_unlock"}, 1}),
-      FirstArgMutexDescriptor({CDM::CLibrary, {"pthread_mutex_trylock"}, 1},
-                              {CDM::CLibrary, {"pthread_mutex_unlock"}, 1}),
-      FirstArgMutexDescriptor({CDM::CLibrary, {"mtx_trylock"}, 1},
-                              {CDM::CLibrary, {"mtx_unlock"}, 1}),
-      FirstArgMutexDescriptor({CDM::CLibrary, {"mtx_timedlock"}, 1},
-                              {CDM::CLibrary, {"mtx_unlock"}, 1}),
-      RAIIMutexDescriptor("lock_guard"),
-      RAIIMutexDescriptor("unique_lock")};
-
   const CallDescriptionSet BlockingFunctions{{CDM::CLibrary, {"sleep"}},
                                              {CDM::CLibrary, {"getc"}},
                                              {CDM::CLibrary, {"fgets"}},
@@ -182,147 +41,25 @@ class BlockInCriticalSectionChecker : public Checker<check::PostCall> {
   const BugType BlockInCritSectionBugType{
       this, "Call to blocking function in critical section", "Blocking Error"};
 
-  void reportBlockInCritSection(const CallEvent &call, CheckerContext &C) const;
-
-  [[nodiscard]] const NoteTag *createCritSectionNote(CritSectionMarker M,
-                                                     CheckerContext &C) const;
-
-  [[nodiscard]] std::optional<MutexDescriptor>
-  checkDescriptorMatch(const CallEvent &Call, CheckerContext &C,
-                       bool IsLock) const;
-
-  void handleLock(const MutexDescriptor &Mutex, const CallEvent &Call,
-                  CheckerContext &C) const;
-
-  void handleUnlock(const MutexDescriptor &Mutex, const CallEvent &Call,
-                    CheckerContext &C) const;
-
   [[nodiscard]] bool isBlockingInCritSection(const CallEvent &Call,
                                              CheckerContext &C) const;
 
+  void reportBlockInCritSection(const CallEvent &call, CheckerContext &C) const;
+
 public:
-  /// Process unlock.
-  /// Process lock.
-  /// Process blocking functions (sleep, getc, fgets, read, recv)
+  BlockInCriticalSectionChecker();
   void checkPostCall(const CallEvent &Call, CheckerContext &C) const;
 };
 
 } // end anonymous namespace
 
-REGISTER_LIST_WITH_PROGRAMSTATE(ActiveCritSections, CritSectionMarker)
-
-// Iterator traits for ImmutableList data structure
-// that enable the use of STL algorithms.
-// TODO: Move these to llvm::ImmutableList when overhauling immutable data
-// structures for proper iterator concept support.
-template <>
-struct std::iterator_traits<
-    typename llvm::ImmutableList<CritSectionMarker>::iterator> {
-  using iterator_category = std::forward_iterator_tag;
-  using value_type = CritSectionMarker;
-  using difference_type = std::ptrdiff_t;
-  using reference = CritSectionMarker &;
-  using pointer = CritSectionMarker *;
-};
-
-std::optional<MutexDescriptor>
-BlockInCriticalSectionChecker::checkDescriptorMatch(const CallEvent &Call,
-                                                    CheckerContext &C,
-                                                    bool IsLock) const {
-  const auto Descriptor =
-      llvm::find_if(MutexDescriptors, [&Call, IsLock](auto &&Descriptor) {
-        return std::visit(
-            [&Call, IsLock](auto &&DescriptorImpl) {
-              return DescriptorImpl.matches(Call, IsLock);
-            },
-            Descriptor);
-      });
-  if (Descriptor != MutexDescriptors.end())
-    return *Descriptor;
-  return std::nullopt;
-}
-
-static const MemRegion *skipStdBaseClassRegion(const MemRegion *Reg) {
-  while (Reg) {
-    const auto *BaseClassRegion = dyn_cast<CXXBaseObjectRegion>(Reg);
-    if (!BaseClassRegion || !isWithinStdNamespace(BaseClassRegion->getDecl()))
-      break;
-    Reg = BaseClassRegion->getSuperRegion();
-  }
-  return Reg;
-}
-
-static const MemRegion *getRegion(const CallEvent &Call,
-                                  const MutexDescriptor &Descriptor,
-                                  bool IsLock) {
-  return std::visit(
-      [&Call, IsLock](auto &Descr) -> const MemRegion * {
-        return skipStdBaseClassRegion(Descr.getRegion(Call, IsLock));
-      },
-      Descriptor);
-}
-
-void BlockInCriticalSectionChecker::handleLock(
-    const MutexDescriptor &LockDescriptor, const CallEvent &Call,
-    CheckerContext &C) const {
-  const MemRegion *MutexRegion =
-      getRegion(Call, LockDescriptor, /*IsLock=*/true);
-  if (!MutexRegion)
-    return;
-
-  const CritSectionMarker MarkToAdd{Call.getOriginExpr(), MutexRegion};
-  ProgramStateRef StateWithLockEvent =
-      C.getState()->add<ActiveCritSections>(MarkToAdd);
-  C.addTransition(StateWithLockEvent, createCritSectionNote(MarkToAdd, C));
-}
-
-void BlockInCriticalSectionChecker::handleUnlock(
-    const MutexDescriptor &UnlockDescriptor, const CallEvent &Call,
-    CheckerContext &C) const {
-  const MemRegion *MutexRegion =
-      getRegion(Call, UnlockDescriptor, /*IsLock=*/false);
-  if (!MutexRegion)
-    return;
-
-  ProgramStateRef State = C.getState();
-  const auto ActiveSections = State->get<ActiveCritSections>();
-  const auto MostRecentLock =
-      llvm::find_if(ActiveSections, [MutexRegion](auto &&Marker) {
-        return Marker.LockReg == MutexRegion;
-      });
-  if (MostRecentLock == ActiveSections.end())
-    return;
-
-  // Build a new ImmutableList without this element.
-  auto &Factory = State->get_context<ActiveCritSections>();
-  llvm::ImmutableList<CritSectionMarker> NewList = Factory.getEmptyList();
-  for (auto It = ActiveSections.begin(), End = ActiveSections.end(); It != End;
-       ++It) {
-    if (It != MostRecentLock)
-      NewList = Factory.add(*It, NewList);
-  }
-
-  State = State->set<ActiveCritSections>(NewList);
-  C.addTransition(State);
+BlockInCriticalSectionChecker::BlockInCriticalSectionChecker() {
+  RegisterBugTypeForMutexModeling(&BlockInCritSectionBugType);
 }
 
 bool BlockInCriticalSectionChecker::isBlockingInCritSection(
     const CallEvent &Call, CheckerContext &C) const {
-  return BlockingFunctions.contains(Call) &&
-         !C.getState()->get<ActiveCritSections>().isEmpty();
-}
-
-void BlockInCriticalSectionChecker::checkPostCall(const CallEvent &Call,
-                                                  CheckerContext &C) const {
-  if (isBlockingInCritSection(Call, C)) {
-    reportBlockInCritSection(Call, C);
-  } else if (std::optional<MutexDescriptor> LockDesc =
-                 checkDescriptorMatch(Call, C, /*IsLock=*/true)) {
-    handleLock(*LockDesc, Call, C);
-  } else if (std::optional<MutexDescriptor> UnlockDesc =
-                 checkDescriptorMatch(Call, C, /*IsLock=*/false)) {
-    handleUnlock(*UnlockDesc, Call, C);
-  }
+  return BlockingFunctions.contains(Call) && AreAnyCritsectionsActive(C);
 }
 
 void BlockInCriticalSectionChecker::reportBlockInCritSection(
@@ -342,56 +79,56 @@ void BlockInCriticalSectionChecker::reportBlockInCritSection(
   C.emitReport(std::move(R));
 }
 
-const NoteTag *
-BlockInCriticalSectionChecker::createCritSectionNote(CritSectionMarker M,
-                                                     CheckerContext &C) const {
-  const BugType *BT = &this->BlockInCritSectionBugType;
-  return C.getNoteTag([M, BT](PathSensitiveBugReport &BR,
-                              llvm::raw_ostream &OS) {
-    if (&BR.getBugType() != BT)
-      return;
-
-    // Get the lock events for the mutex of the current line's lock event.
-    const auto CritSectionBegins =
-        BR.getErrorNode()->getState()->get<ActiveCritSections>();
-    llvm::SmallVector<CritSectionMarker, 4> LocksForMutex;
-    llvm::copy_if(
-        CritSectionBegins, std::back_inserter(LocksForMutex),
-        [M](const auto &Marker) { return Marker.LockReg == M.LockReg; });
-    if (LocksForMutex.empty())
-      return;
-
-    // As the ImmutableList builds the locks by prepending them, we
-    // reverse the list to get the correct order.
-    std::reverse(LocksForMutex.begin(), LocksForMutex.end());
-
-    // Find the index of the lock expression in the list of all locks for a
-    // given mutex (in acquisition order).
-    const auto Position =
-        llvm::find_if(std::as_const(LocksForMutex), [M](const auto &Marker) {
-          return Marker.LockExpr == M.LockExpr;
-        });
-    if (Position == LocksForMutex.end())
-      return;
-
-    // If there is only one lock event, we don't need to specify how many times
-    // the critical section was entered.
-    if (LocksForMutex.size() == 1) {
-      OS << "Entering critical section here";
-      return;
-    }
-
-    const auto IndexOfLock =
-        std::distance(std::as_const(LocksForMutex).begin(), Position);
-
-    const auto OrdinalOfLock = IndexOfLock + 1;
-    OS << "Entering critical section for the " << OrdinalOfLock
-       << llvm::getOrdinalSuffix(OrdinalOfLock) << " time here";
-  });
+void BlockInCriticalSectionChecker::checkPostCall(const CallEvent &Call,
+                                                  CheckerContext &C) const {
+  if (!isBlockingInCritSection(Call, C))
+    return;
+  reportBlockInCritSection(Call, C);
 }
 
+// Checker registration
 void ento::registerBlockInCriticalSectionChecker(CheckerManager &mgr) {
   mgr.registerChecker<BlockInCriticalSectionChecker>();
+
+  // Register events for std::mutex lock and unlock
+  // NOTE: There are standard library implementations where some methods
+  // of `std::mutex` are inherited from an implementation detail base
+  // class, and those aren't matched by the name specification {"std",
+  // "mutex", "lock"}.
+  // As a workaround here we omit the class name and only require the
+  // presence of the name parts "std" and "lock"/"unlock".
+  // TODO: Ensure that CallDescription understands inherited methods.
+  RegisterEvent(EventDescriptor{
+      mutex_modeling::MakeMemberExtractor({"std", /*"mutex"*/ "lock"}),
+      EventKind::Acquire, LibraryKind::NotApplicable,
+      SemanticsKind::XNUSemantics});
+  RegisterEvent(EventDescriptor{
+      mutex_modeling::MakeMemberExtractor({"std", /*"mutex"*/ "unlock"}),
+      EventKind::Release});
+
+  // Register events for std::lock_guard
+  RegisterEvent(EventDescriptor{
+      mutex_modeling::MakeRAIILockExtractor("lock_guard"), EventKind::Acquire,
+      LibraryKind::NotApplicable, SemanticsKind::XNUSemantics});
+  RegisterEvent(
+      EventDescriptor{mutex_modeling::MakeRAIIReleaseExtractor("lock_guard"),
+                      EventKind::Release});
+
+  // Register events for std::unique_lock
+  RegisterEvent(EventDescriptor{
+      mutex_modeling::MakeRAIILockExtractor("unique_lock"), EventKind::Acquire,
+      LibraryKind::NotApplicable, SemanticsKind::XNUSemantics});
+  RegisterEvent(
+      EventDescriptor{mutex_modeling::MakeRAIIReleaseExtractor("unique_lock"),
+                      EventKind::Release});
+
+  // Register events for std::scoped_lock
+  RegisterEvent(EventDescriptor{
+      mutex_modeling::MakeRAIILockExtractor("scoped_lock"), EventKind::Acquire,
+      LibraryKind::NotApplicable, SemanticsKind::XNUSemantics});
+  RegisterEvent(
+      EventDescriptor{mutex_modeling::MakeRAIIReleaseExtractor("scoped_lock"),
+                      EventKind::Release});
 }
 
 bool ento::shouldRegisterBlockInCriticalSectionChecker(
diff --git a/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt b/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
index 6da3665ab9a4df..c3987ba76529db 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
+++ b/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
@@ -68,6 +68,7 @@ add_clang_library(clangStaticAnalyzerCheckers
   MmapWriteExecChecker.cpp
   MIGChecker.cpp
   MoveChecker.cpp
+  MutexModeling.cpp
   MPI-Checker/MPIBugReporter.cpp
   MPI-Checker/MPIChecker.cpp
   MPI-Checker/MPIFunctionClassifier.cpp
diff --git a/clang/lib/StaticAnalyzer/Checkers/MutexModeling.cpp b/clang/lib/StaticAnalyzer/Checkers/MutexModeling.cpp
new file mode 100644
index 00000000000000..f0e13c5c95e432
--- /dev/null
+++ b/clang/lib/StaticAnalyzer/Checkers/MutexModeling.cpp
@@ -0,0 +1,773 @@
+//===--- MutexModeling.cpp - Modeling of mutexes --------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Defines modeling checker for tracking mutex states.
+//
+//===----------------------------------------------------------------------===//
+
+#include "MutexModeling/MutexM...
[truncated]

@steakhal
Copy link
Contributor

steakhal commented Oct 7, 2024

Should this be reviewed commit by commit?

@steakhal
Copy link
Contributor

steakhal commented Oct 7, 2024

Usually its hard to see how the implementation changes if anyways all the surrounding code moves.
For such changes, if the change is large (like in this case, its a 2k+ diff), it feels better to review the individual commits instead of everything all at once.
If we do that right, the bulk of the change will be attributed to a handful of NFC mechanical changes, and I could focus on whats left.

@gamesh411
Copy link
Contributor Author

gamesh411 commented Oct 7, 2024

Usually its hard to see how the implementation changes if anyways all the surrounding code moves. For such changes, if the change is large (like in this case, its a 2k+ diff), it feels better to review the individual commits instead of everything all at once. If we do that right, the bulk of the change will be attributed to a handful of NFC mechanical changes, and I could focus on whats left.

We can do that. I will just create separate PRs for these commits, emphasizing in each one whether they are NFCs or the key points where they deviate from being NFCs.

The first one is here: #111381

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sharing my unfinished review as I have it now to leave you a feedback.
In the LLVM community we encourage contributors to develop incrementally, and post chunks in a reviewable sizes.
Landing a completely new modeling in 1 PR and and then drooping all the old implementations in another PR is not something that fits really well this model.

This would affect multiple checkers, some of which are in heavy use at places where continuity is a key asset. I don't see a way to review these two changes with confidence.
Review time usually grows exponentially with the size of the PR if done right.

There are other ways to gain confidence, for example, by having a huge refactor (and by nature an NFC change) that we can verify at scale; and then start the increments from there, changing the desired behavior. This may need to have tricks in the "newer", refactored implementation to leave room for keeping the bugs we had in the original implementation, what you will later fix. The important part is that they are not done in the same PR.

To put this into perspective, in the Clang Static Analyzer since 2019.10.25 (llvmorg-10-init), in apprx. 5 years, we had only 15 commits that were larger than 500 lines that don't have NFC in their title - only counting changes in clang/lib/StaticAnalyzer and clang/include/clang/StaticAnalyzer.

hash (ins+del) title
0202c3596c52d453d1e9e5a43d7533b83444df4e 867 - [analyzer] CastValueChecker: Store the dynamic types and casts
82923c71efa57600d015dbc281202941d3d64dde 518 - [analyzer] Add Fuchsia Handle checker
9a08a3fab9993f9b93167de5c783dfed6dd7efc0 2339 - [Analyzer] Split container modeling from iterator modeling
f5086b3803ac2f908a734bbb2c7a50018fb3cd8c 941 - [analyzer] StdLibraryFunctionsChecker refactor: remove macros
239c53b72b18d6fd6c5ad9a6d27cd09b950dc97a 709 - [analyzer] Track runtime types represented by Obj-C Class objects
f7c7e8a523f56b0ed1b14c0756ba4e5d1ccb48d2 511 - [Analyzer][WebKit] RefCntblBaseVirtualDtorChecker
db4d5f7048a26a7708821e46095742aecfd8ba46 614 - [analyzer][StdLibraryFunctionsChecker] Add POSIX file handling functions
b13d9878b8dcef4354ddfc86f382ca9b537e65aa 696 - [analyzer][solver] Track symbol equivalence
a787a4ed16d6867f56d81159a8fcf2b711d18a8a 1228 - [analyzer][StdLibraryFunctionsChecker] Use Optionals throughout the summary API
11d2e63ab0060c656398afd8ea26760031a9fb96 849 - [analyzer][StdLibraryFunctionsChecker] Separate the signature from the summaries
170c67d5b8cc58dd8a4bd0ea7c5ca02290fac39c 604 - [analyzer] Use the MacroExpansionContext for macro expansions in plists
02b51e5316cd501ede547d0223e0f5416ebd9845 789 - [analyzer][solver] Redesign constraint ranges data structure
957014da2d2791359181d89a04a0d27da65474d4 555 - [clang][Analyzer] Add errno state to standard functions modeling.
343bdb10940cb2387c0b9bd3caccee7bb56c937b 565 - [analyzer] Show taint origin and propagation correctly
527fcb8e5d6b1d491b6699cde818db1127bbb12c 521 - [analyzer] Add std::variant checker (#66481)

Out of this list, I can only see 3 commits where we merged a semantic change that was not a refactor/mechanical change or adds a brand new checker. 2 of which is touching the Solver, where it's unexpectedly hard to land increments. And the remaining 1 was Show taint origin where I regret not pushing back and we had serious bugs because of the improper quality of review - I admit. This served me a lesson, which is why I speak up now.

Comment on lines +131 to +128
struct ProgramStateTrait<clang::ento::mutex_modeling::MutexEvents>
: public ProgramStatePartialTrait<
clang::ento::mutex_modeling::MutexEventsTy> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Within the namespace clang::ento you don't need to use the fully qualified name to refer to names with that namespace.

Comment on lines +128 to +124
namespace clang {
namespace ento {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
namespace clang {
namespace ento {
namespace clang::ento {

Same goes whenever applicable.

Comment on lines +73 to +72
// Iterator traits for ImmutableList and ImmutableMap data structures
// that enable the use of STL algorithms.
namespace std {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you are allowed to put these within the std namespace.
I was thinking about why are you doing this.

It is because the llvm::ImmutableList<*>::iterator does not define the types to let the std::iterator_traits<T> recognize it as an iterator.
I'd highly suggest fixing the root cause, and not opening the std namespace.

Just add define the following for the ImmutableList iterator:

  • difference_type
  • value_type
  • pointer
  • reference
  • iterator_category

This is the reason why you could use llvm::make_filter_range() of an ImmutableList.

Comment on lines +128 to +121
inline void printState(raw_ostream &Out, ProgramStateRef State, const char *NL,
const char *Sep) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest not having function bodies in API headers. A minimal header helps to understand better what is the intended use without the need to glass over implementation details.
It also helps in parsing times, as the less code is in headers, the better.

This applies to all headers.

Comment on lines 44 to 41
// Vector of registered event descriptors
inline llvm::SmallVector<EventDescriptor, 0> RegisteredEvents{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm really concerned about inline globals. I'd prefer not exposing this in a header.
Or at least, only expose an extern declaration, and keep the definition in the cpp file.

Comment on lines +25 to +27
namespace clang {

namespace ento {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
namespace clang {
namespace ento {
namespace clang::ento {

Comment on lines +25 to +26
using namespace mutex_modeling;

namespace {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per llvm coding style, we prefer marking functions static, and use the anonymous namespaces to wrap classes.

This patch introduces the basic structure for the MutexModeling
checker.It adds the checker definition, creates the MutexModeling.cpp
file, supporting header files, and updates the necessary build files.
The MutexModeling checker will be responsible for modeling mutexlock and
unlock events, improving the static analysis ofmultithreaded code
This patch updates the dependencies of existing checkers to use the
newMutexModeling checker. Specifically, it modifies the
PthreadLockBase,BlockInCriticalSectionChecker, and PthreadLockChecker to
depend onMutexModeling.
This patch updates the BlockInCriticalSectionChecker to use the
newMutexModeling API. It removes the old mutex modeling logic and
replacesit with calls to the new API.
This patch updates the PthreadLockChecker to use the new MutexModeling
API.It removes the old mutex modeling logic and replaces it with calls
to thenew API. The checker now handles mutex events through the
centralizedMutexModeling system.
This patch updates the existing tests to reflect the changes made to
themutex modeling system. It modifies the expected output for various
testcases to match the new MutexModeling API and the updated checker
behavior.
@gamesh411 gamesh411 force-pushed the thread-modeling-upstreaming branch from 5f80c4a to e09f147 Compare October 17, 2024 09:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:static analyzer clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants