Skip to content

Conversation

@fhahn
Copy link
Contributor

@fhahn fhahn commented Jun 10, 2024

This adds an attribute version of -fpass-by-value-is-noalias (added in a874d63)

Still needs proper docs.

Attribute to be used by libc++ types: #90119

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jun 10, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 10, 2024

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Florian Hahn (fhahn)

Changes

This adds an attribute version of -fpass-by-value-is-noalias (added in a874d63)

Still needs proper docs.


Full diff: https://github.com/llvm/llvm-project/pull/95004.diff

4 Files Affected:

  • (modified) clang/include/clang/Basic/Attr.td (+10)
  • (modified) clang/lib/CodeGen/CGCall.cpp (+4-3)
  • (added) clang/test/CodeGenCXX/value-type-attr.cpp (+40)
  • (modified) clang/test/Misc/pragma-attribute-supported-attributes-list.test (+1)
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index b70b0c8b836a5..811e1620f38d7 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -1811,6 +1811,16 @@ def TrivialABI : InheritableAttr {
   let SimpleHandler = 1;
 }
 
+def ValueType : InheritableAttr {
+  // This attribute does not have a C [[]] spelling because it requires the
+  // CPlusPlus language option.
+  let Spellings = [Clang<"value_type", 0>];
+  let Subjects = SubjectList<[CXXRecord]>;
+  let Documentation = [TrivialABIDocs];
+  let LangOpts = [CPlusPlus];
+  let SimpleHandler = 1;
+}
+
 def MaxFieldAlignment : InheritableAttr {
   // This attribute has no spellings as it is only ever created implicitly.
   let Spellings = [];
diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp
index 65d82285b907b..55733734d1de2 100644
--- a/clang/lib/CodeGen/CGCall.cpp
+++ b/clang/lib/CodeGen/CGCall.cpp
@@ -2744,9 +2744,10 @@ void CodeGenModule::ConstructAttributeList(StringRef Name,
         Attrs.addByValAttr(getTypes().ConvertTypeForMem(ParamType));
 
       auto *Decl = ParamType->getAsRecordDecl();
-      if (CodeGenOpts.PassByValueIsNoAlias && Decl &&
-          Decl->getArgPassingRestrictions() ==
-              RecordArgPassingKind::CanPassInRegs)
+      if (Decl && ((CodeGenOpts.PassByValueIsNoAlias &&
+                    Decl->getArgPassingRestrictions() ==
+                        RecordArgPassingKind::CanPassInRegs) ||
+                   Decl->hasAttr<ValueTypeAttr>()))
         // When calling the function, the pointer passed in will be the only
         // reference to the underlying object. Mark it accordingly.
         Attrs.addAttribute(llvm::Attribute::NoAlias);
diff --git a/clang/test/CodeGenCXX/value-type-attr.cpp b/clang/test/CodeGenCXX/value-type-attr.cpp
new file mode 100644
index 0000000000000..46938dd7d271a
--- /dev/null
+++ b/clang/test/CodeGenCXX/value-type-attr.cpp
@@ -0,0 +1,40 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple arm64-apple-ios11 -std=c++11 -fcxx-exceptions -fexceptions -emit-llvm -o - %s | FileCheck %s
+
+struct Small {
+  int *p;
+  Small();
+  ~Small();
+  Small(const Small &) noexcept;
+  Small &operator=(const Small &);
+};
+
+
+struct __attribute__((value_type)) SmallVT {
+  int *p;
+  SmallVT();
+  ~SmallVT();
+  SmallVT(const SmallVT &) noexcept;
+  SmallVT &operator=(const SmallVT &);
+};
+
+struct Large {
+  int *p;
+  int a[128];
+  Large();
+  ~Large();
+  Large(const Large &) noexcept;
+  Large &operator=(const Large &);
+};
+
+struct __attribute__((value_type)) LargeVT {
+  int *p;
+  int a[128];
+  LargeVT();
+  ~LargeVT();
+  LargeVT(const LargeVT &) noexcept;
+  LargeVT &operator=(const LargeVT &);
+};
+
+void foo(Small a, SmallVT b, Large c, LargeVT d) {}
+// CHECK: define void @_Z3foo5Small7SmallVT5Large7LargeVT(ptr noundef %a, ptr noalias noundef %b, ptr noundef %c, ptr noalias noundef %d)
diff --git a/clang/test/Misc/pragma-attribute-supported-attributes-list.test b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
index 99732694f72a5..4474a2c415bb2 100644
--- a/clang/test/Misc/pragma-attribute-supported-attributes-list.test
+++ b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
@@ -201,6 +201,7 @@
 // CHECK-NEXT: Uninitialized (SubjectMatchRule_variable_is_local)
 // CHECK-NEXT: UnsafeBufferUsage (SubjectMatchRule_function)
 // CHECK-NEXT: UseHandle (SubjectMatchRule_variable_is_parameter)
+// CHECK-NEXT: ValueType (SubjectMatchRule_record)
 // CHECK-NEXT: VecReturn (SubjectMatchRule_record)
 // CHECK-NEXT: VecTypeHint (SubjectMatchRule_function)
 // CHECK-NEXT: WarnUnused (SubjectMatchRule_record)

@ldionne
Copy link
Member

ldionne commented Jun 10, 2024

CC @philnik777

@philnik777
Copy link
Contributor

This is really sweet. However, I feel like this is a property that needs to propagate through types. We can't guarantee that a std::vector<int, my_nasty_allocator<int>> doesn't escape the pointer. Similarly, I think we'd like

struct my_pair {
  int i;
  std::string str;
};

to have this property.

@efriedma-quic
Copy link
Collaborator

I feel like this is a property that needs to propagate through types

You mean, similar to the way trivial_abi works? That makes sense.

@efriedma-quic
Copy link
Collaborator

Also, we probably want to specify the interaction between trivial_abi and value_type here... should trivial_abi imply value_type?

@fhahn
Copy link
Contributor Author

fhahn commented Jun 19, 2024

Thanks, I'll hope to get back to this and make the suggested adjustments soonish

fhahn added 2 commits March 14, 2025 14:18
This adds an attribute version of -fpass-by-value-is-noalias (added in
a874d63)

Still needs proper docs.
@fhahn fhahn force-pushed the clang-value-type-attr branch from 34e880a to 82e6bdb Compare March 14, 2025 16:25
@fhahn
Copy link
Contributor Author

fhahn commented Mar 14, 2025

Just rebased the PR and added docs

I feel like this is a property that needs to propagate through types

You mean, similar to the way trivial_abi works? That makes sense.

I am not sure about propagating. I added some documentation to make the spec of the attribute clearer.

Specifically, it requires the objects of the type to be considered copied into new objects when passing to functions, i.e. code using the type isn't allowed to rely on previous pointers to the actual object e.g. via copy elision or NRVO.

Similarly, I think the guarantees are also different to trivial_abi, as there could be other pointers to the same object even if it is trivial via copy elision or NRVO. There's a thread where this problem in general has been discussed extensively a while ago https://lists.llvm.org/pipermail/cfe-dev/2020-July/066357.html

let Category = DocCatDecl;
let Content = [{
The ``value_type`` attribute can be used to mark user-defined types as 'value
types'. When objects of value types are passed value to functions, the objects
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
types'. When objects of value types are passed value to functions, the objects
types'. When objects of value types are passed by value to functions, the objects

are always considered to be formally copied into a new object. This means the
argument itself must be the only value referencing the passed object. This
allows the optimizer to assume that no other pointers may alias the object when
it is passed indirectly to a function..
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
it is passed indirectly to a function..
it is passed indirectly to a function.

The ``value_type`` attribute can be used to mark user-defined types as 'value
types'. When objects of value types are passed value to functions, the objects
are always considered to be formally copied into a new object. This means the
argument itself must be the only value referencing the passed object. This
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am having a hard time figuring out why pass-by-value doesn't already imply this? Pass-by-value means it was copied into the new object, right? So I guess I have a hard time figuring out what we really mean here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NRVO copy-elision can enable a function which returns by value to stash the address of the returned value. Or, C++ constructors can do the same.

Then, copy-elision on the call can result in the callee receiving a object whose address is escaped, despite it receiving it "passed by value". E.g. in the following example, &a == global in the body of foo.

struct NonTrivial { NonTrivial(); };
NonTrivial *global;
NonTrivial frob() {
    NonTrivial result;
    global = &result;
    return result;
}

void foo(NonTrivial a);
int main() {
  foo(frob());
}

But, I don't think this new attribute is viable; the intent expressed by the PR message is to use it in libc++ to mark stdlib types. But, this cannot be a type property, because any code could do the above with any type.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see what you mean now. Yeah, that doesn't look like something that could be a property of the type so much as the uses itself. So I again find myself wondering what property of the type we can annotate that makes it usable in the opt requested.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the argument satisfies the formal conditions for being passed in registers, the implementation is allowed to introduce a copy despite NRVO, which might be enough to make any escaped/self pointers formally invalid even if we don't actually copy. But yeah, NRVO might mean we're out of luck for an arbitrary type to which that doesn't apply.

We could probably talk to the C++ committee about this if we can show that we could do useful optimizations except for some contrived and dangerous code pattern like escaping the address of a copyable class value prior to returning it via NRVO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jyknight thanks for sharing the example! I made a few changes to make it runable: https://clang.godbolt.org/z/WrE8Ga6d9

What's interesting is that we get different results with Clang with and without optimizations (without optimizations, there's a copy due to byval, while with optimizations to function is inlined and the byval is gone). GCC seems to create copies both with and without optimizations and gives consistent results.

On AArch64, the argument isn't marked as byval, so we get the same results with and w/o optimizations. But it behaves differently to X86 w/o optimizations.

So overall it seems like there already are a number of inconsistencies when trying to capture the address of a pass-by-value argument across multiple dimensions; IIUC the program should return the same results for both GCC, Clang, with opts, w/o opts and different architectures.

The attribute itself is intended to forbid such uses of the type, but I agree it is not clear if it would be possible to use other for types in libc++.

Copy link
Contributor

@rjmccall rjmccall Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, right, stack arguments.

Anyway, that code specifically has... at least unspecified behavior. NonTrivial satisfies the conditions of [class.temporary]p3, and so the compiler is permitted (but not required) to perform additional copies when passing and returning it, which makes it impossible to specify under the standard whether the object addresses compare the same. They do if all copies are avoided, they don't if the return copy is avoided but the parameter copy is not, and it's UB if there's a return copy because the pointer no longer points to allocated storage.

I don't know whether that's a strong enough argument to actually treat the existence of pointers to the parameter as UB, though. And it's specific to formally trivial types, which presumably doesn't really help us with our optimization efforts because we want to optimize std::string, std::vector, and the like.

// CPlusPlus language option.
let Spellings = [Clang<"value_type", 0>];
let Subjects = SubjectList<[CXXRecord]>;
let Documentation = [TrivialABIDocs];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we adding docs, but setting the documentation link here to something else?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants