Skip to content

Conversation

@serge-sans-paille
Copy link
Collaborator

It is an undefined behavior to pass null arguments as memcpy source or destination parameter, so generate the appropriate llvm.assume call to communicate this to the backend.

Don't do it for builtin memcpy as they don't suffer the same behavior in case of null size.

It is an undefined behavior to pass null arguments as memcpy source or
destination parameter, so generate the appropriate llvm.assume call to
communicate this to the backend.

Don't do it for builtin memcpy as they don't suffer the same behavior in
case of null size.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. labels Dec 12, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 12, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: None (serge-sans-paille)

Changes

It is an undefined behavior to pass null arguments as memcpy source or destination parameter, so generate the appropriate llvm.assume call to communicate this to the backend.

Don't do it for builtin memcpy as they don't suffer the same behavior in case of null size.


Full diff: https://github.com/llvm/llvm-project/pull/119704.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+6)
  • (modified) clang/test/CodeGen/catch-undef-behavior.c (+4)
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 497a0b3952af8c..50cd8529120d36 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -4531,6 +4531,12 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     Value *SizeVal = EmitScalarExpr(E->getArg(2));
     EmitArgCheck(TCK_Store, Dest, E->getArg(0), 0);
     EmitArgCheck(TCK_Load, Src, E->getArg(1), 1);
+    if (BuiltinID == Builtin::BImemcpy || BuiltinID == Builtin::BImempcpy) {
+      Builder.CreateAssumption(
+          Builder.CreateIsNotNull(Dest.emitRawPointer(*this)));
+      Builder.CreateAssumption(
+          Builder.CreateIsNotNull(Src.emitRawPointer(*this)));
+    }
     Builder.CreateMemCpy(Dest, Src, SizeVal, false);
     if (BuiltinID == Builtin::BImempcpy ||
         BuiltinID == Builtin::BI__builtin_mempcpy)
diff --git a/clang/test/CodeGen/catch-undef-behavior.c b/clang/test/CodeGen/catch-undef-behavior.c
index 7580290b0b0333..7ef7c78f1cebb7 100644
--- a/clang/test/CodeGen/catch-undef-behavior.c
+++ b/clang/test/CodeGen/catch-undef-behavior.c
@@ -371,6 +371,10 @@ void call_memcpy_nonnull(void *p, void *q, int sz) {
   // CHECK-TRAP: call void @llvm.ubsantrap(i8 16)
   // CHECK-COMMON-NOT: call
 
+  // CHECK-COMMON: %5 = icmp ne ptr %0, null
+  // CHECK-COMMON: call void @llvm.assume(i1 %5)
+  // CHECK-COMMON: %6 = icmp ne ptr %0, null
+  // CHECK-COMMON: call void @llvm.assume(i1 %6)
   // CHECK-COMMON: call void @llvm.memcpy.p0.p0.i64(ptr align 1 %0, ptr align 1 %1, i64 %conv, i1 false)
   memcpy(p, q, sz);
 }

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is no longer the case after N3322.

(Incidentally, I wrote a blog post about this yesterday: https://developers.redhat.com/articles/2024/12/11/making-memcpynull-null-0-well-defined)

@serge-sans-paille
Copy link
Collaborator Author

what a funny timing :-)
If I understand correctly, the assume should be dst != 0 || len ==0, correct?

@MaskRay
Copy link
Member

MaskRay commented Dec 12, 2024

This is no longer the case after N3322.

(Incidentally, I wrote a blog post about this yesterday: developers.redhat.com/articles/2024/12/11/making-memcpynull-null-0-well-defined)

Nice! The next step might be to update glibc string.h __nonnull

@serge-sans-paille
Copy link
Collaborator Author

For the record, I've added the assumption that seems to correctly capture current UB. I'm not 100% sure it's worth generating them now though?

@nikic
Copy link
Contributor

nikic commented Dec 13, 2024

Yes, I'd expect this to more likely be a regression than an improvement. What was you original motivation for this?

@serge-sans-paille
Copy link
Collaborator Author

My original motivation was a colleague pointing me at https://godbolt.org/z/cPdx6v13r which was specific to gcc but an interesting situation and I wanted clang to be on-par with GCC on that behavior.

@efriedma-quic
Copy link
Collaborator

On the clang side, we've intentionally ignored the nonnull marking on memcpy and friends for a long time; we made a decision a few years back the optimization wasn't worthwhile compared to the security risk.

The updated patch doesn't do anything useful; we can prove the pointers must point to valid memory from the llvm.memcpy call itself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants