-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[clang] Generate appropriate assume in presence of libc's memcpy #119704
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[clang] Generate appropriate assume in presence of libc's memcpy #119704
Conversation
It is an undefined behavior to pass null arguments as memcpy source or destination parameter, so generate the appropriate llvm.assume call to communicate this to the backend. Don't do it for builtin memcpy as they don't suffer the same behavior in case of null size.
|
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-codegen Author: None (serge-sans-paille) ChangesIt is an undefined behavior to pass null arguments as memcpy source or destination parameter, so generate the appropriate llvm.assume call to communicate this to the backend. Don't do it for builtin memcpy as they don't suffer the same behavior in case of null size. Full diff: https://github.com/llvm/llvm-project/pull/119704.diff 2 Files Affected:
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 497a0b3952af8c..50cd8529120d36 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -4531,6 +4531,12 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
Value *SizeVal = EmitScalarExpr(E->getArg(2));
EmitArgCheck(TCK_Store, Dest, E->getArg(0), 0);
EmitArgCheck(TCK_Load, Src, E->getArg(1), 1);
+ if (BuiltinID == Builtin::BImemcpy || BuiltinID == Builtin::BImempcpy) {
+ Builder.CreateAssumption(
+ Builder.CreateIsNotNull(Dest.emitRawPointer(*this)));
+ Builder.CreateAssumption(
+ Builder.CreateIsNotNull(Src.emitRawPointer(*this)));
+ }
Builder.CreateMemCpy(Dest, Src, SizeVal, false);
if (BuiltinID == Builtin::BImempcpy ||
BuiltinID == Builtin::BI__builtin_mempcpy)
diff --git a/clang/test/CodeGen/catch-undef-behavior.c b/clang/test/CodeGen/catch-undef-behavior.c
index 7580290b0b0333..7ef7c78f1cebb7 100644
--- a/clang/test/CodeGen/catch-undef-behavior.c
+++ b/clang/test/CodeGen/catch-undef-behavior.c
@@ -371,6 +371,10 @@ void call_memcpy_nonnull(void *p, void *q, int sz) {
// CHECK-TRAP: call void @llvm.ubsantrap(i8 16)
// CHECK-COMMON-NOT: call
+ // CHECK-COMMON: %5 = icmp ne ptr %0, null
+ // CHECK-COMMON: call void @llvm.assume(i1 %5)
+ // CHECK-COMMON: %6 = icmp ne ptr %0, null
+ // CHECK-COMMON: call void @llvm.assume(i1 %6)
// CHECK-COMMON: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %0, ptr align 1 %1, i64 %conv, i1 false)
memcpy(p, q, sz);
}
|
nikic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is no longer the case after N3322.
(Incidentally, I wrote a blog post about this yesterday: https://developers.redhat.com/articles/2024/12/11/making-memcpynull-null-0-well-defined)
|
what a funny timing :-) |
Nice! The next step might be to update glibc string.h |
|
For the record, I've added the assumption that seems to correctly capture current UB. I'm not 100% sure it's worth generating them now though? |
|
Yes, I'd expect this to more likely be a regression than an improvement. What was you original motivation for this? |
|
My original motivation was a colleague pointing me at https://godbolt.org/z/cPdx6v13r which was specific to gcc but an interesting situation and I wanted clang to be on-par with GCC on that behavior. |
|
On the clang side, we've intentionally ignored the nonnull marking on memcpy and friends for a long time; we made a decision a few years back the optimization wasn't worthwhile compared to the security risk. The updated patch doesn't do anything useful; we can prove the pointers must point to valid memory from the llvm.memcpy call itself. |
It is an undefined behavior to pass null arguments as memcpy source or destination parameter, so generate the appropriate llvm.assume call to communicate this to the backend.
Don't do it for builtin memcpy as they don't suffer the same behavior in case of null size.