Skip to content

Conversation

@flovent
Copy link
Contributor

@flovent flovent commented May 8, 2025

This PR aims to fix crash caused by std::bit_cast(#137417) and __builtin_bit_cast(#71174).

The crash caused by std::bit_cast is actually due to the __builtin_bit_cast used in its implementation (see the code added in test/Analysis/builtin_bitcast.cpp), so both issues share the same root cause.

Unlike other casts, such as reinterpret_cast, __builtin_bit_cast does not have an explicit cast kind (e.g., CK_IntegralToPointer) in its AST node. For example, consider the following code:

 __builtin_bit_cast(int*, static_cast<long>(-1))

Its corresponding AST node is:

BuiltinBitCastExpr 0x5636acdc0dc0 <col:4, col:50> 'int *' <LValueToRValueBitCast>
  `-MaterializeTemporaryExpr 0x5636acdc0da8 <col:29, col:49> 'long' xvalue
    `-CXXStaticCastExpr 0x5636acdc0d60 <col:29, col:49> 'long' static_cast<long> <NoOp>
      `-ImplicitCastExpr 0x5636acdc0d48 <col:47, col:48> 'long' <IntegralCast> part_of_explicit_cast
        `-UnaryOperator 0x5636acdc0d18 <col:47, col:48> 'int' prefix '-'
          `-IntegerLiteral 0x5636acdc0cf8 <col:48> 'int' 1

ExprEngine will call evalLoad for LValueToRValueBitCast, which eventually leads to the RegionStore::getBinding API. Before this PR's change, it will return original SVal with it's orignal type. For the example code above, whole expression's SVal will be evaluated to -1's SVal(NonLoc), if we compare it with a Loc(e.g., a pointer), assert fails and then crash happens.

This change only affects the testcase for __builtin_bit_cast itself, in gh_69922 it should be evaluated to Loc, but since orignal region is a SymbolReigon, and it can't casted to Loc now,

// FIXME: We should be able to cast NonLoc -> Loc
// (when Loc::isLocType(CastTy) is true)
// But it's hard to do as SymbolicRegions can't refer to SymbolCasts holding
// generic SymExprs. Check the commit message for the details.
// Symbol to pointer and whatever else.
return UnknownVal();

UnknownVal is reasonable and it is consistent with reinterpret_cast.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels May 8, 2025
@llvmbot
Copy link
Member

llvmbot commented May 8, 2025

@llvm/pr-subscribers-clang

Author: None (flovent)

Changes

This PR aims to fix crash caused by std::bit_cast(#137417) and __builtin_bit_cast(#71174).

The crash caused by std::bit_cast is actually due to the __builtin_bit_cast used in its implementation (see the code added in test/Analysis/builtin_bitcast.cpp), so both issues share the same root cause.

Unlike other casts, such as reinterpret_cast, __builtin_bit_cast does not have an explicit cast kind (e.g., CK_IntegralToPointer) in its AST node. For example, consider the following code:

 __builtin_bit_cast(int*, static_cast&lt;long&gt;(-1))

Its corresponding AST node is:

BuiltinBitCastExpr 0x5636acdc0dc0 &lt;col:4, col:50&gt; 'int *' &lt;LValueToRValueBitCast&gt;
  `-MaterializeTemporaryExpr 0x5636acdc0da8 &lt;col:29, col:49&gt; 'long' xvalue
    `-CXXStaticCastExpr 0x5636acdc0d60 &lt;col:29, col:49&gt; 'long' static_cast&lt;long&gt; &lt;NoOp&gt;
      `-ImplicitCastExpr 0x5636acdc0d48 &lt;col:47, col:48&gt; 'long' &lt;IntegralCast&gt; part_of_explicit_cast
        `-UnaryOperator 0x5636acdc0d18 &lt;col:47, col:48&gt; 'int' prefix '-'
          `-IntegerLiteral 0x5636acdc0cf8 &lt;col:48&gt; 'int' 1

ExprEngine will call evalLoad for LValueToRValueBitCast, which eventually leads to the RegionStore::getBinding API. Before this PR's change, it will return original SVal with it's orignal type. For the example code above, whole expression's SVal will be evaluated to -1's SVal(NonLoc), if we compare it with a Loc(e.g., a pointer), assert fails and then crash happens.

This change only affects the testcase for __builtin_bit_cast itself, in gh_69922 it should be evaluated to Loc, but since orignal region is a SymbolReigon, and it can't casted to Loc now,

// FIXME: We should be able to cast NonLoc -> Loc
// (when Loc::isLocType(CastTy) is true)
// But it's hard to do as SymbolicRegions can't refer to SymbolCasts holding
// generic SymExprs. Check the commit message for the details.
// Symbol to pointer and whatever else.
return UnknownVal();

UnknownVal is reasonable and it is consistent with reinterpret_cast.


Full diff: https://github.com/llvm/llvm-project/pull/139095.diff

2 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Core/RegionStore.cpp (+1-1)
  • (modified) clang/test/Analysis/builtin_bitcast.cpp (+30-2)
diff --git a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
index 1cc9cb84cbfa4..84331804edc9e 100644
--- a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
+++ b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -1650,7 +1650,7 @@ SVal RegionStoreManager::getBinding(RegionBindingsConstRef B, Loc L, QualType T)
 
   // Check if the region has a binding.
   if (V)
-    return *V;
+    return svalBuilder.evalCast(*V, T, QualType{});
 
   // The location does not have a bound value.  This means that it has
   // the value it had upon its creation and/or entry to the analyzed
diff --git a/clang/test/Analysis/builtin_bitcast.cpp b/clang/test/Analysis/builtin_bitcast.cpp
index 5a0d9e7189b8e..2cd1b96bf4550 100644
--- a/clang/test/Analysis/builtin_bitcast.cpp
+++ b/clang/test/Analysis/builtin_bitcast.cpp
@@ -1,5 +1,7 @@
 // RUN: %clang_analyze_cc1 -triple x86_64-unknown-unknown -verify %s \
 // RUN:   -analyzer-checker=core,debug.ExprInspection
+// RUN: %clang_analyze_cc1 -triple x86_64-unknown-unknown -verify -std=c++20 %s \
+// RUN:   -analyzer-checker=core,debug.ExprInspection
 
 template <typename T> void clang_analyzer_dump(T);
 using size_t = decltype(sizeof(int));
@@ -39,7 +41,7 @@ struct A {
   }
 };
 void gh_69922(size_t p) {
-  // expected-warning-re@+1 {{(reg_${{[0-9]+}}<size_t p>) & 1U}}
+  // expected-warning@+1 {{Unknown}}
   clang_analyzer_dump(__builtin_bit_cast(A*, p & 1));
 
   __builtin_bit_cast(A*, p & 1)->set(2); // no-crash
@@ -49,5 +51,31 @@ void gh_69922(size_t p) {
   // store to the member variable `n`.
 
   clang_analyzer_dump(__builtin_bit_cast(A*, p & 1)->n); // Ideally, this should print "2".
-  // expected-warning-re@-1 {{(reg_${{[0-9]+}}<size_t p>) & 1U}}
+  // expected-warning@-1 {{Unknown}}
+}
+
+static void issue_71174() {
+  auto res = __builtin_bit_cast(unsigned long long, &issue_71174) | 1; // no-crash
+}
+
+#if __cplusplus >= 202002L
+#include "Inputs/system-header-simulator-cxx.h"
+using intptr_t = decltype(sizeof(int*));
+
+namespace std {
+template< class To, class From >
+constexpr To bit_cast( const From& from ) noexcept {
+  #if __has_builtin(__builtin_bit_cast)
+  return __builtin_bit_cast(To, from);
+#else
+  To to;
+  std::memcpy(&to, &from, sizeof(To));
+  return to;
+#endif
+}
+}
+
+bool issue_137417(std::string* x) {
+  return x == std::bit_cast<std::string*>(static_cast<intptr_t>(-1)); // no-crash
 }
+#endif
\ No newline at end of file

@llvmbot
Copy link
Member

llvmbot commented May 8, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

Author: None (flovent)

Changes

This PR aims to fix crash caused by std::bit_cast(#137417) and __builtin_bit_cast(#71174).

The crash caused by std::bit_cast is actually due to the __builtin_bit_cast used in its implementation (see the code added in test/Analysis/builtin_bitcast.cpp), so both issues share the same root cause.

Unlike other casts, such as reinterpret_cast, __builtin_bit_cast does not have an explicit cast kind (e.g., CK_IntegralToPointer) in its AST node. For example, consider the following code:

 __builtin_bit_cast(int*, static_cast&lt;long&gt;(-1))

Its corresponding AST node is:

BuiltinBitCastExpr 0x5636acdc0dc0 &lt;col:4, col:50&gt; 'int *' &lt;LValueToRValueBitCast&gt;
  `-MaterializeTemporaryExpr 0x5636acdc0da8 &lt;col:29, col:49&gt; 'long' xvalue
    `-CXXStaticCastExpr 0x5636acdc0d60 &lt;col:29, col:49&gt; 'long' static_cast&lt;long&gt; &lt;NoOp&gt;
      `-ImplicitCastExpr 0x5636acdc0d48 &lt;col:47, col:48&gt; 'long' &lt;IntegralCast&gt; part_of_explicit_cast
        `-UnaryOperator 0x5636acdc0d18 &lt;col:47, col:48&gt; 'int' prefix '-'
          `-IntegerLiteral 0x5636acdc0cf8 &lt;col:48&gt; 'int' 1

ExprEngine will call evalLoad for LValueToRValueBitCast, which eventually leads to the RegionStore::getBinding API. Before this PR's change, it will return original SVal with it's orignal type. For the example code above, whole expression's SVal will be evaluated to -1's SVal(NonLoc), if we compare it with a Loc(e.g., a pointer), assert fails and then crash happens.

This change only affects the testcase for __builtin_bit_cast itself, in gh_69922 it should be evaluated to Loc, but since orignal region is a SymbolReigon, and it can't casted to Loc now,

// FIXME: We should be able to cast NonLoc -> Loc
// (when Loc::isLocType(CastTy) is true)
// But it's hard to do as SymbolicRegions can't refer to SymbolCasts holding
// generic SymExprs. Check the commit message for the details.
// Symbol to pointer and whatever else.
return UnknownVal();

UnknownVal is reasonable and it is consistent with reinterpret_cast.


Full diff: https://github.com/llvm/llvm-project/pull/139095.diff

2 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Core/RegionStore.cpp (+1-1)
  • (modified) clang/test/Analysis/builtin_bitcast.cpp (+30-2)
diff --git a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
index 1cc9cb84cbfa4..84331804edc9e 100644
--- a/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
+++ b/clang/lib/StaticAnalyzer/Core/RegionStore.cpp
@@ -1650,7 +1650,7 @@ SVal RegionStoreManager::getBinding(RegionBindingsConstRef B, Loc L, QualType T)
 
   // Check if the region has a binding.
   if (V)
-    return *V;
+    return svalBuilder.evalCast(*V, T, QualType{});
 
   // The location does not have a bound value.  This means that it has
   // the value it had upon its creation and/or entry to the analyzed
diff --git a/clang/test/Analysis/builtin_bitcast.cpp b/clang/test/Analysis/builtin_bitcast.cpp
index 5a0d9e7189b8e..2cd1b96bf4550 100644
--- a/clang/test/Analysis/builtin_bitcast.cpp
+++ b/clang/test/Analysis/builtin_bitcast.cpp
@@ -1,5 +1,7 @@
 // RUN: %clang_analyze_cc1 -triple x86_64-unknown-unknown -verify %s \
 // RUN:   -analyzer-checker=core,debug.ExprInspection
+// RUN: %clang_analyze_cc1 -triple x86_64-unknown-unknown -verify -std=c++20 %s \
+// RUN:   -analyzer-checker=core,debug.ExprInspection
 
 template <typename T> void clang_analyzer_dump(T);
 using size_t = decltype(sizeof(int));
@@ -39,7 +41,7 @@ struct A {
   }
 };
 void gh_69922(size_t p) {
-  // expected-warning-re@+1 {{(reg_${{[0-9]+}}<size_t p>) & 1U}}
+  // expected-warning@+1 {{Unknown}}
   clang_analyzer_dump(__builtin_bit_cast(A*, p & 1));
 
   __builtin_bit_cast(A*, p & 1)->set(2); // no-crash
@@ -49,5 +51,31 @@ void gh_69922(size_t p) {
   // store to the member variable `n`.
 
   clang_analyzer_dump(__builtin_bit_cast(A*, p & 1)->n); // Ideally, this should print "2".
-  // expected-warning-re@-1 {{(reg_${{[0-9]+}}<size_t p>) & 1U}}
+  // expected-warning@-1 {{Unknown}}
+}
+
+static void issue_71174() {
+  auto res = __builtin_bit_cast(unsigned long long, &issue_71174) | 1; // no-crash
+}
+
+#if __cplusplus >= 202002L
+#include "Inputs/system-header-simulator-cxx.h"
+using intptr_t = decltype(sizeof(int*));
+
+namespace std {
+template< class To, class From >
+constexpr To bit_cast( const From& from ) noexcept {
+  #if __has_builtin(__builtin_bit_cast)
+  return __builtin_bit_cast(To, from);
+#else
+  To to;
+  std::memcpy(&to, &from, sizeof(To));
+  return to;
+#endif
+}
+}
+
+bool issue_137417(std::string* x) {
+  return x == std::bit_cast<std::string*>(static_cast<intptr_t>(-1)); // no-crash
 }
+#endif
\ No newline at end of file

@flovent flovent closed this May 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:static analyzer clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants