Skip to content

Conversation

@balazske
Copy link
Collaborator

CallAndMessageChecker did have a warning for the case when pointer to uninitialized data is passed to a function when the argument type is pointer to const. This did not work for struct types. The check is improved to handle cases when struct with uninitialized data is passed to a function.

… struct

CallAndMessageChecker did have a warning for the case when pointer to
uninitialized data is passed to a function when the argument type is
pointer to const. This did not work for struct types. The check is
improved to handle cases when struct with uninitialized data is
passed to a function.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Oct 22, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 22, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Balázs Kéri (balazske)

Changes

CallAndMessageChecker did have a warning for the case when pointer to uninitialized data is passed to a function when the argument type is pointer to const. This did not work for struct types. The check is improved to handle cases when struct with uninitialized data is passed to a function.


Full diff: https://github.com/llvm/llvm-project/pull/164600.diff

3 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp (+8-4)
  • (modified) clang/test/Analysis/PR40625.cpp (+2-2)
  • (modified) clang/test/Analysis/call-and-message.c (+12)
diff --git a/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
index 7cc146ed29d0d..cd24b1e816e01 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
@@ -183,8 +183,6 @@ bool CallAndMessageChecker::uninitRefOrPointer(
     CheckerContext &C, SVal V, SourceRange ArgRange, const Expr *ArgEx,
     const BugType &BT, const ParmVarDecl *ParamDecl, int ArgumentNumber) const {
 
-  // The pointee being uninitialized is a sign of code smell, not a bug, no need
-  // to sink here.
   if (!ChecksEnabled[CK_ArgPointeeInitializedness])
     return false;
 
@@ -212,8 +210,14 @@ bool CallAndMessageChecker::uninitRefOrPointer(
 
   if (const MemRegion *SValMemRegion = V.getAsRegion()) {
     const ProgramStateRef State = C.getState();
-    const SVal PSV = State->getSVal(SValMemRegion, C.getASTContext().CharTy);
-    if (PSV.isUndef()) {
+    QualType T = ParamDecl->getType()->getPointeeType();
+    if (T->isVoidType())
+      T = C.getASTContext().CharTy;
+    const SVal PSV = State->getSVal(SValMemRegion, T);
+    bool IsUndef = PSV.isUndef();
+    if (auto LCV = PSV.getAs<nonloc::LazyCompoundVal>())
+      IsUndef = LCV->getStore() == nullptr;
+    if (IsUndef) {
       if (ExplodedNode *N = C.generateErrorNode()) {
         auto R = std::make_unique<PathSensitiveBugReport>(BT, Os.str(), N);
         R->addRange(ArgRange);
diff --git a/clang/test/Analysis/PR40625.cpp b/clang/test/Analysis/PR40625.cpp
index 5ebe2122945e6..ab3faa328298a 100644
--- a/clang/test/Analysis/PR40625.cpp
+++ b/clang/test/Analysis/PR40625.cpp
@@ -5,11 +5,11 @@
 void f(const int *end);
 
 void g(const int (&arrr)[10]) {
-  f(arrr); // expected-warning{{1st function call argument is a pointer to uninitialized value}}
+  f(arrr);
 }
 
 void h() {
   int arr[10];
 
-  g(arr);
+  g(arr); // expected-warning{{1st function call argument is an uninitialized value}}
 }
diff --git a/clang/test/Analysis/call-and-message.c b/clang/test/Analysis/call-and-message.c
index ade51145e2a93..fdac77176569b 100644
--- a/clang/test/Analysis/call-and-message.c
+++ b/clang/test/Analysis/call-and-message.c
@@ -24,6 +24,18 @@ void pointee_uninit(void) {
   doStuff_pointerToConstInt(p); // expected-warning{{1st function call argument is a pointer to uninitialized value [core.CallAndMessage]}}
 }
 
+typedef struct S {
+  int a;
+  short b;
+} S;
+
+void doStuff_pointerToConstStruct(const S *s){};
+void pointee_uninit_struct(void) {
+  S s;
+  S *p = &s;
+  doStuff_pointerToConstStruct(p); // expected-warning{{1st function call argument is a pointer to uninitialized value [core.CallAndMessage]}}
+}
+
 // TODO: If this hash ever changes, turn
 // core.CallAndMessage:ArgPointeeInitializedness from a checker option into a
 // checker, as described in the CallAndMessage comments!

@llvmbot
Copy link
Member

llvmbot commented Oct 22, 2025

@llvm/pr-subscribers-clang

Author: Balázs Kéri (balazske)

Changes

CallAndMessageChecker did have a warning for the case when pointer to uninitialized data is passed to a function when the argument type is pointer to const. This did not work for struct types. The check is improved to handle cases when struct with uninitialized data is passed to a function.


Full diff: https://github.com/llvm/llvm-project/pull/164600.diff

3 Files Affected:

  • (modified) clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp (+8-4)
  • (modified) clang/test/Analysis/PR40625.cpp (+2-2)
  • (modified) clang/test/Analysis/call-and-message.c (+12)
diff --git a/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
index 7cc146ed29d0d..cd24b1e816e01 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
@@ -183,8 +183,6 @@ bool CallAndMessageChecker::uninitRefOrPointer(
     CheckerContext &C, SVal V, SourceRange ArgRange, const Expr *ArgEx,
     const BugType &BT, const ParmVarDecl *ParamDecl, int ArgumentNumber) const {
 
-  // The pointee being uninitialized is a sign of code smell, not a bug, no need
-  // to sink here.
   if (!ChecksEnabled[CK_ArgPointeeInitializedness])
     return false;
 
@@ -212,8 +210,14 @@ bool CallAndMessageChecker::uninitRefOrPointer(
 
   if (const MemRegion *SValMemRegion = V.getAsRegion()) {
     const ProgramStateRef State = C.getState();
-    const SVal PSV = State->getSVal(SValMemRegion, C.getASTContext().CharTy);
-    if (PSV.isUndef()) {
+    QualType T = ParamDecl->getType()->getPointeeType();
+    if (T->isVoidType())
+      T = C.getASTContext().CharTy;
+    const SVal PSV = State->getSVal(SValMemRegion, T);
+    bool IsUndef = PSV.isUndef();
+    if (auto LCV = PSV.getAs<nonloc::LazyCompoundVal>())
+      IsUndef = LCV->getStore() == nullptr;
+    if (IsUndef) {
       if (ExplodedNode *N = C.generateErrorNode()) {
         auto R = std::make_unique<PathSensitiveBugReport>(BT, Os.str(), N);
         R->addRange(ArgRange);
diff --git a/clang/test/Analysis/PR40625.cpp b/clang/test/Analysis/PR40625.cpp
index 5ebe2122945e6..ab3faa328298a 100644
--- a/clang/test/Analysis/PR40625.cpp
+++ b/clang/test/Analysis/PR40625.cpp
@@ -5,11 +5,11 @@
 void f(const int *end);
 
 void g(const int (&arrr)[10]) {
-  f(arrr); // expected-warning{{1st function call argument is a pointer to uninitialized value}}
+  f(arrr);
 }
 
 void h() {
   int arr[10];
 
-  g(arr);
+  g(arr); // expected-warning{{1st function call argument is an uninitialized value}}
 }
diff --git a/clang/test/Analysis/call-and-message.c b/clang/test/Analysis/call-and-message.c
index ade51145e2a93..fdac77176569b 100644
--- a/clang/test/Analysis/call-and-message.c
+++ b/clang/test/Analysis/call-and-message.c
@@ -24,6 +24,18 @@ void pointee_uninit(void) {
   doStuff_pointerToConstInt(p); // expected-warning{{1st function call argument is a pointer to uninitialized value [core.CallAndMessage]}}
 }
 
+typedef struct S {
+  int a;
+  short b;
+} S;
+
+void doStuff_pointerToConstStruct(const S *s){};
+void pointee_uninit_struct(void) {
+  S s;
+  S *p = &s;
+  doStuff_pointerToConstStruct(p); // expected-warning{{1st function call argument is a pointer to uninitialized value [core.CallAndMessage]}}
+}
+
 // TODO: If this hash ever changes, turn
 // core.CallAndMessage:ArgPointeeInitializedness from a checker option into a
 // checker, as described in the CallAndMessage comments!

@balazske
Copy link
Collaborator Author

Do we need more tests for this checker option (and in cpp mode)?

const SVal PSV = State->getSVal(SValMemRegion, T);
bool IsUndef = PSV.isUndef();
if (auto LCV = PSV.getAs<nonloc::LazyCompoundVal>())
IsUndef = LCV->getStore() == nullptr;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not find information about if this is the correct way of detecting that the value is not initialized.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly more conditions are needed, the current version is LCV->getStore() == nullptr && isa<NonParamVarRegion>(LCV->getRegion()). Still at some C++ cases (for example memory allocated with new) the uninitialized struct is still not detected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no way you can ask for the underlying value of an LCV, so by extension you also can't detect if it's uninitialized or not. Here you would need to know if any of the bindings in the cluster of the region of the LCV in the Store of the LCV refers to undef.
That can be undef for 2 reasons: 1) the default binding at least partially covering that region is undef., or 2) there is a direct binding of Undef that at least partially overlaps with that region.

iterBindings could give you the list of direct bindings, but I don't think there is a way to query a default binding. NVM, there is one api, getDefaultBinding. But I start to really hate these LCVs and the regionstore.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that there is no binding at all? Even with iterBindings over subregions and super-regions, no binding is found. LazyCoupoundValue is not necessary for this search (memregion and store is available from elsewhere). The LCV can contain a null Store pointer (and exactly this case looks interesting).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no cluster for the region, than it means that we saw no stores to that place.
Reading from such a place might be still valid for global storage duration objects such as: globals, local statics, thread local objects. This can be inferred from the memory space associated with the region.

void pointee_uninit_struct(void) {
S s;
S *p = &s;
doStuff_pointerToConstStruct(p); // expected-warning{{1st function call argument is a pointer to uninitialized value [core.CallAndMessage]}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand it correctly that this testcase wouldn't pass without your change in the checker?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this was not detected before.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit that my first impression is that "I don't know" -- I'm not familiar enough with the possible representations of undefined values in the analyzer to form an opinion about this PR. I could research this if it is really needed -- but perhaps @steakhal @haoNoQ or @Xazax-hun could say something wise more directly?

My first guess is that the old code should have been sufficient to detect whether the memory area is undefined -- and it is a bit sad that you need to write this more elaborate code. In general, I tend to notice a pattern that LazyCompoundVal is too opaque and its performance benefits come with lots of added code complexity as we always need to explicitly check for it. (However I don't know enough, perhaps this is still the least wrong approach.)

@balazske
Copy link
Collaborator Author

There are more FIXME comments (in test files) about that it is not possible to read undefined values from a LazyCompoundVal, instead these will be unknown values. Probably this is the same problem here. To have a good solution this issue should be fixed, but it needs again more investigation why it works this way or how difficult is it to improve.

@steakhal
Copy link
Contributor

steakhal commented Nov 2, 2025

In general, I tend to notice a pattern that LazyCompoundVal is too opaque and its performance benefits come with lots of added code complexity as we always need to explicitly check for it. (However I don't know enough, perhaps this is still the least wrong approach.)

LCVs are complicated. However, having nonloc::CompoundVals only make this worse. If I had to choose today, I'd only have LCVs for representing compount vals. But note that I have a strong believe that LCVs are necessary for their laziness. I'm pretty sure though that we could have nice APIs around it that would bring consistency and no surprises.

@NagyDonat
Copy link
Contributor

<offtopic>
@steakhal

However, having nonloc::CompoundVals only make this worse.

Technically I didn't claim that they are better 😅, and I can easily believe that they are indeed worse...

If I had to choose today, I'd only have LCVs for representing compount vals.

I think non-lazy compound values are necessary from a theoretical POV to cover cases where we want to create a compound value directly (without "building it up" gradually in a state). For example, assume that we need to write an EvalCall for a function that construct a class object and returns it by value: this return value (which we want to bind into the Environment) is a compound value which cannot be expressed as a lazy compound value (because it is not the contents of a memory region that we model). (I admit that I don't know about a use case where we actually need an EvalCall like this.)

I'm pretty sure though that we could have nice APIs around it that would bring consistency and no surprises.

I agree and I hope that we can reach this eventually.
</offtopic>

@steakhal
Copy link
Contributor

steakhal commented Nov 3, 2025

<offtopic> @steakhal

However, having nonloc::CompoundVals only make this worse.

Technically I didn't claim that they are better 😅, and I can easily believe that they are indeed worse...

If I had to choose today, I'd only have LCVs for representing compount vals.

I think non-lazy compound values are necessary from a theoretical POV to cover cases where we want to create a compound value directly (without "building it up" gradually in a state). For example, assume that we need to write an EvalCall for a function that construct a class object and returns it by value: this return value (which we want to bind into the Environment) is a compound value which cannot be expressed as a lazy compound value (because it is not the contents of a memory region that we model). (I admit that I don't know about a use case where we actually need an EvalCall like this.)

As I understand this, the refcounted Store object doesn't need to be built up gradually. You can create a Store object for your liking and then set your LCV to refer to that Store. That's true that what LCVs are currently built up gradually, but that's because we have regular compound vals that take that use case where we would create a branch new Store object in a single step, holding each binding.

@NagyDonat
Copy link
Contributor

As I understand this, the refcounted Store object doesn't need to be built up gradually. You can create a Store object for your liking and then set your LCV to refer to that Store. That's true that what LCVs are currently built up gradually, but that's because we have regular compound vals that take that use case where we would create a branch new Store object in a single step, holding each binding.

Fair point, it would be indeed possible to replace non-lazy compound vals with LCVs that have an "artifical" state which is created just for them (and e.g. there is no node which uses it as the state).

@github-actions
Copy link

github-actions bot commented Nov 17, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Os << (ArgumentNumber + 1) << llvm::getOrdinalSuffix(ArgumentNumber + 1)
<< " function call argument is ";
if (ParamT->isPointerType())
Os << "a pointer to uninitialized value";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be "a const pointer to uninitialized value" because the constness of the pointer is the reason why this is a bug.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change the text (it was part of the existing code).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(The plan is to extend this check to functions that have in-out parameters which must be initialized.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, I didn't notice that this message was already part of the existing code.

Then I think the best approach would be keeping the message text unchanged in this commit, and updating it in a later follow-up change.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me.

I think the message formatting could be simplified by using llvm::formatv instead of the more verbose raw_ostream handling (as I write in inline comments), but the current implementation is also correct.

@github-actions
Copy link

github-actions bot commented Nov 20, 2025

🐧 Linux x64 Test Results

  • 111432 tests passed
  • 4455 tests skipped

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update! I really like the solution with the format_provider specialization, it is even more elegant than what I was suggesting.

I think this commit is ready to be merged. @steakhal Do you have anything to add?

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 30 seconds, I didn't see anything that would make me concerned.
Thanks both of you.
I appreciate the consistent and steady support for clang.

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's merge this! 🚀

@balazske
Copy link
Collaborator Author

I was waiting with the merge because the new check can produce incorrect results for functions where it is enough to initialize a part of an input structure (there can be relatively many of these). Now a new option was added to require totally un-initialized structs to emit the warning.

@balazske
Copy link
Collaborator Author

The new option affects only the ArgPointeeInitializedness check (not ArgInitializedness) to not change existing behavior. (Documentation should be updated.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:static analyzer clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants