-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[clang][analyzer] CallAndMessage warnings at pointer to uninitialized struct #164600
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… struct CallAndMessageChecker did have a warning for the case when pointer to uninitialized data is passed to a function when the argument type is pointer to const. This did not work for struct types. The check is improved to handle cases when struct with uninitialized data is passed to a function.
|
@llvm/pr-subscribers-clang-static-analyzer-1 Author: Balázs Kéri (balazske) Changes
Full diff: https://github.com/llvm/llvm-project/pull/164600.diff 3 Files Affected:
diff --git a/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
index 7cc146ed29d0d..cd24b1e816e01 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
@@ -183,8 +183,6 @@ bool CallAndMessageChecker::uninitRefOrPointer(
CheckerContext &C, SVal V, SourceRange ArgRange, const Expr *ArgEx,
const BugType &BT, const ParmVarDecl *ParamDecl, int ArgumentNumber) const {
- // The pointee being uninitialized is a sign of code smell, not a bug, no need
- // to sink here.
if (!ChecksEnabled[CK_ArgPointeeInitializedness])
return false;
@@ -212,8 +210,14 @@ bool CallAndMessageChecker::uninitRefOrPointer(
if (const MemRegion *SValMemRegion = V.getAsRegion()) {
const ProgramStateRef State = C.getState();
- const SVal PSV = State->getSVal(SValMemRegion, C.getASTContext().CharTy);
- if (PSV.isUndef()) {
+ QualType T = ParamDecl->getType()->getPointeeType();
+ if (T->isVoidType())
+ T = C.getASTContext().CharTy;
+ const SVal PSV = State->getSVal(SValMemRegion, T);
+ bool IsUndef = PSV.isUndef();
+ if (auto LCV = PSV.getAs<nonloc::LazyCompoundVal>())
+ IsUndef = LCV->getStore() == nullptr;
+ if (IsUndef) {
if (ExplodedNode *N = C.generateErrorNode()) {
auto R = std::make_unique<PathSensitiveBugReport>(BT, Os.str(), N);
R->addRange(ArgRange);
diff --git a/clang/test/Analysis/PR40625.cpp b/clang/test/Analysis/PR40625.cpp
index 5ebe2122945e6..ab3faa328298a 100644
--- a/clang/test/Analysis/PR40625.cpp
+++ b/clang/test/Analysis/PR40625.cpp
@@ -5,11 +5,11 @@
void f(const int *end);
void g(const int (&arrr)[10]) {
- f(arrr); // expected-warning{{1st function call argument is a pointer to uninitialized value}}
+ f(arrr);
}
void h() {
int arr[10];
- g(arr);
+ g(arr); // expected-warning{{1st function call argument is an uninitialized value}}
}
diff --git a/clang/test/Analysis/call-and-message.c b/clang/test/Analysis/call-and-message.c
index ade51145e2a93..fdac77176569b 100644
--- a/clang/test/Analysis/call-and-message.c
+++ b/clang/test/Analysis/call-and-message.c
@@ -24,6 +24,18 @@ void pointee_uninit(void) {
doStuff_pointerToConstInt(p); // expected-warning{{1st function call argument is a pointer to uninitialized value [core.CallAndMessage]}}
}
+typedef struct S {
+ int a;
+ short b;
+} S;
+
+void doStuff_pointerToConstStruct(const S *s){};
+void pointee_uninit_struct(void) {
+ S s;
+ S *p = &s;
+ doStuff_pointerToConstStruct(p); // expected-warning{{1st function call argument is a pointer to uninitialized value [core.CallAndMessage]}}
+}
+
// TODO: If this hash ever changes, turn
// core.CallAndMessage:ArgPointeeInitializedness from a checker option into a
// checker, as described in the CallAndMessage comments!
|
|
@llvm/pr-subscribers-clang Author: Balázs Kéri (balazske) Changes
Full diff: https://github.com/llvm/llvm-project/pull/164600.diff 3 Files Affected:
diff --git a/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
index 7cc146ed29d0d..cd24b1e816e01 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/CallAndMessageChecker.cpp
@@ -183,8 +183,6 @@ bool CallAndMessageChecker::uninitRefOrPointer(
CheckerContext &C, SVal V, SourceRange ArgRange, const Expr *ArgEx,
const BugType &BT, const ParmVarDecl *ParamDecl, int ArgumentNumber) const {
- // The pointee being uninitialized is a sign of code smell, not a bug, no need
- // to sink here.
if (!ChecksEnabled[CK_ArgPointeeInitializedness])
return false;
@@ -212,8 +210,14 @@ bool CallAndMessageChecker::uninitRefOrPointer(
if (const MemRegion *SValMemRegion = V.getAsRegion()) {
const ProgramStateRef State = C.getState();
- const SVal PSV = State->getSVal(SValMemRegion, C.getASTContext().CharTy);
- if (PSV.isUndef()) {
+ QualType T = ParamDecl->getType()->getPointeeType();
+ if (T->isVoidType())
+ T = C.getASTContext().CharTy;
+ const SVal PSV = State->getSVal(SValMemRegion, T);
+ bool IsUndef = PSV.isUndef();
+ if (auto LCV = PSV.getAs<nonloc::LazyCompoundVal>())
+ IsUndef = LCV->getStore() == nullptr;
+ if (IsUndef) {
if (ExplodedNode *N = C.generateErrorNode()) {
auto R = std::make_unique<PathSensitiveBugReport>(BT, Os.str(), N);
R->addRange(ArgRange);
diff --git a/clang/test/Analysis/PR40625.cpp b/clang/test/Analysis/PR40625.cpp
index 5ebe2122945e6..ab3faa328298a 100644
--- a/clang/test/Analysis/PR40625.cpp
+++ b/clang/test/Analysis/PR40625.cpp
@@ -5,11 +5,11 @@
void f(const int *end);
void g(const int (&arrr)[10]) {
- f(arrr); // expected-warning{{1st function call argument is a pointer to uninitialized value}}
+ f(arrr);
}
void h() {
int arr[10];
- g(arr);
+ g(arr); // expected-warning{{1st function call argument is an uninitialized value}}
}
diff --git a/clang/test/Analysis/call-and-message.c b/clang/test/Analysis/call-and-message.c
index ade51145e2a93..fdac77176569b 100644
--- a/clang/test/Analysis/call-and-message.c
+++ b/clang/test/Analysis/call-and-message.c
@@ -24,6 +24,18 @@ void pointee_uninit(void) {
doStuff_pointerToConstInt(p); // expected-warning{{1st function call argument is a pointer to uninitialized value [core.CallAndMessage]}}
}
+typedef struct S {
+ int a;
+ short b;
+} S;
+
+void doStuff_pointerToConstStruct(const S *s){};
+void pointee_uninit_struct(void) {
+ S s;
+ S *p = &s;
+ doStuff_pointerToConstStruct(p); // expected-warning{{1st function call argument is a pointer to uninitialized value [core.CallAndMessage]}}
+}
+
// TODO: If this hash ever changes, turn
// core.CallAndMessage:ArgPointeeInitializedness from a checker option into a
// checker, as described in the CallAndMessage comments!
|
|
Do we need more tests for this checker option (and in cpp mode)? |
| const SVal PSV = State->getSVal(SValMemRegion, T); | ||
| bool IsUndef = PSV.isUndef(); | ||
| if (auto LCV = PSV.getAs<nonloc::LazyCompoundVal>()) | ||
| IsUndef = LCV->getStore() == nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not find information about if this is the correct way of detecting that the value is not initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly more conditions are needed, the current version is LCV->getStore() == nullptr && isa<NonParamVarRegion>(LCV->getRegion()). Still at some C++ cases (for example memory allocated with new) the uninitialized struct is still not detected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no way you can ask for the underlying value of an LCV, so by extension you also can't detect if it's uninitialized or not. Here you would need to know if any of the bindings in the cluster of the region of the LCV in the Store of the LCV refers to undef.
That can be undef for 2 reasons: 1) the default binding at least partially covering that region is undef., or 2) there is a direct binding of Undef that at least partially overlaps with that region.
iterBindings could give you the list of direct bindings, but I don't think there is a way to query a default binding. NVM, there is one api, getDefaultBinding. But I start to really hate these LCVs and the regionstore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible that there is no binding at all? Even with iterBindings over subregions and super-regions, no binding is found. LazyCoupoundValue is not necessary for this search (memregion and store is available from elsewhere). The LCV can contain a null Store pointer (and exactly this case looks interesting).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is no cluster for the region, than it means that we saw no stores to that place.
Reading from such a place might be still valid for global storage duration objects such as: globals, local statics, thread local objects. This can be inferred from the memory space associated with the region.
| void pointee_uninit_struct(void) { | ||
| S s; | ||
| S *p = &s; | ||
| doStuff_pointerToConstStruct(p); // expected-warning{{1st function call argument is a pointer to uninitialized value [core.CallAndMessage]}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand it correctly that this testcase wouldn't pass without your change in the checker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this was not detected before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I admit that my first impression is that "I don't know" -- I'm not familiar enough with the possible representations of undefined values in the analyzer to form an opinion about this PR. I could research this if it is really needed -- but perhaps @steakhal @haoNoQ or @Xazax-hun could say something wise more directly?
My first guess is that the old code should have been sufficient to detect whether the memory area is undefined -- and it is a bit sad that you need to write this more elaborate code. In general, I tend to notice a pattern that LazyCompoundVal is too opaque and its performance benefits come with lots of added code complexity as we always need to explicitly check for it. (However I don't know enough, perhaps this is still the least wrong approach.)
|
There are more FIXME comments (in test files) about that it is not possible to read undefined values from a |
LCVs are complicated. However, having nonloc::CompoundVals only make this worse. If I had to choose today, I'd only have LCVs for representing compount vals. But note that I have a strong believe that LCVs are necessary for their laziness. I'm pretty sure though that we could have nice APIs around it that would bring consistency and no surprises. |
|
Technically I didn't claim that they are better 😅, and I can easily believe that they are indeed worse...
I think non-lazy compound values are necessary from a theoretical POV to cover cases where we want to create a compound value directly (without "building it up" gradually in a state). For example, assume that we need to write an
I agree and I hope that we can reach this eventually. |
As I understand this, the refcounted Store object doesn't need to be built up gradually. You can create a Store object for your liking and then set your LCV to refer to that Store. That's true that what LCVs are currently built up gradually, but that's because we have regular compound vals that take that use case where we would create a branch new Store object in a single step, holding each binding. |
Fair point, it would be indeed possible to replace non-lazy compound vals with LCVs that have an "artifical" state which is created just for them (and e.g. there is no node which uses it as the state). |
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
| Os << (ArgumentNumber + 1) << llvm::getOrdinalSuffix(ArgumentNumber + 1) | ||
| << " function call argument is "; | ||
| if (ParamT->isPointerType()) | ||
| Os << "a pointer to uninitialized value"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be "a const pointer to uninitialized value" because the constness of the pointer is the reason why this is a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can change the text (it was part of the existing code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(The plan is to extend this check to functions that have in-out parameters which must be initialized.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry, I didn't notice that this message was already part of the existing code.
Then I think the best approach would be keeping the message text unchanged in this commit, and updating it in a later follow-up change.
NagyDonat
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change looks good to me.
I think the message formatting could be simplified by using llvm::formatv instead of the more verbose raw_ostream handling (as I write in inline comments), but the current implementation is also correct.
🐧 Linux x64 Test Results
|
NagyDonat
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update! I really like the solution with the format_provider specialization, it is even more elegant than what I was suggesting.
I think this commit is ready to be merged. @steakhal Do you have anything to add?
steakhal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 30 seconds, I didn't see anything that would make me concerned.
Thanks both of you.
I appreciate the consistent and steady support for clang.
NagyDonat
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, let's merge this! 🚀
|
I was waiting with the merge because the new check can produce incorrect results for functions where it is enough to initialize a part of an input structure (there can be relatively many of these). Now a new option was added to require totally un-initialized structs to emit the warning. |
|
The new option affects only the ArgPointeeInitializedness check (not ArgInitializedness) to not change existing behavior. (Documentation should be updated.) |
CallAndMessageCheckerdid have a warning for the case when pointer to uninitialized data is passed to a function when the argument type is pointer to const. This did not work for struct types. The check is improved to handle cases when struct with uninitialized data is passed to a function.