a-nogikh commented Nov 7, 2025

The "malloc" attribute can only be applied to functions that return a pointer, which excludes some non-standard allocation function variants. For example, P0901R11 proposed ::operator new overloads that return a sized_allocation_t result: a struct that contains a pointer to the allocated memory as well as the actual size of the allocation. Another example is __size_returning_new.

Introduce a new "malloc_span" attribute that exhibits similar semantics, but applies to functions returning span-like records: structs with exactly two fields, a pointer (assumed to point to the allocated memory) followed by an integer (assumed to hold the allocated size). This is the case for sized_allocation_t as well as for common implementations of std::span, should one be returned from such an annotated function.
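
As an illustration, a size-returning allocator could be annotated roughly as follows. This is a minimal sketch, not code from the PR: `sized_ptr` mirrors the struct used in the PR's test file, while `my_alloc_at_least`, `MY_MALLOC_SPAN`, and the round-up-to-16-bytes policy are hypothetical choices made for this example. The guard macro only keeps the snippet compilable on compilers that lack the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Expand to the attribute only where the compiler recognizes it
 * (assumption: a Clang build that includes this PR); otherwise to nothing. */
#if defined(__has_attribute)
#  if __has_attribute(malloc_span)
#    define MY_MALLOC_SPAN __attribute__((malloc_span))
#  endif
#endif
#ifndef MY_MALLOC_SPAN
#  define MY_MALLOC_SPAN
#endif

/* Span-like result: pointer first, integer size second, nothing else. */
typedef struct {
  void *ptr;
  size_t n;
} sized_ptr;

/* Hypothetical allocator: reserves at least `size` bytes and reports how
 * many bytes were actually reserved. */
sized_ptr my_alloc_at_least(size_t size) MY_MALLOC_SPAN;

sized_ptr my_alloc_at_least(size_t size) {
  sized_ptr result = {NULL, 0};
  if (size > SIZE_MAX - 15u)
    return result; /* rounding up would overflow */
  /* Round the request up to a multiple of 16 bytes (illustrative policy). */
  size_t actual = (size + 15u) & ~(size_t)15u;
  assert(actual >= size && actual % 16u == 0);
  result.ptr = malloc(actual);
  if (result.ptr)
    result.n = actual;
  return result;
}
```

A caller reads both `ptr` and `n` from the result and releases the memory with `free(p.ptr)`; for a request of 100 bytes this sketch reports `n == 112`.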

An alternative approach would be to relax the restrictions of the existing "malloc" attribute so that it applies both to functions returning pointers and to functions returning span-like structs. However, that would complicate user code by requiring checks for specific Clang versions. In contrast, the presence of a new attribute can be verified directly via the __has_attribute macro. Introducing a new attribute also avoids concerns about potential incompatibility with GCC's "malloc" semantics.
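
The feature check described here might look like the sketch below. `__has_attribute` is the existing Clang/GCC feature-test macro; `ALLOC_MALLOC_SPAN`, `ALLOC_HAS_MALLOC_SPAN`, `alloc_result`, `alloc_sized`, and `alloc_has_malloc_span` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Compilers without __has_attribute get a stub that always answers 0. */
#ifndef __has_attribute
#  define __has_attribute(x) 0
#endif

/* No compiler version comparison is needed: the attribute is either
 * known to this compiler or it is not. */
#if __has_attribute(malloc_span)
#  define ALLOC_HAS_MALLOC_SPAN 1
#  define ALLOC_MALLOC_SPAN __attribute__((malloc_span))
#else
#  define ALLOC_HAS_MALLOC_SPAN 0
#  define ALLOC_MALLOC_SPAN /* attribute unavailable: expands to nothing */
#endif

typedef struct {
  void *ptr;
  size_t n;
} alloc_result;

/* Annotated only when the compiler supports malloc_span. */
alloc_result alloc_sized(size_t size) ALLOC_MALLOC_SPAN;

alloc_result alloc_sized(size_t size) {
  alloc_result r = {malloc(size), size};
  if (!r.ptr)
    r.n = 0; /* report zero usable bytes on failure */
  assert(r.ptr != NULL || r.n == 0);
  return r;
}

/* Lets callers or tests report whether the annotation is in effect. */
int alloc_has_malloc_span(void) { return ALLOC_HAS_MALLOC_SPAN; }
```

The same source then builds with and without the attribute, which is the portability argument made for introducing a separate attribute name.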

Codegen does not yet treat the pointer returned inside a span-like struct as noalias; future commits can add that.

This change helps unlock the alloc token instrumentation for such non-standard allocation functions:
https://clang.llvm.org/docs/AllocToken.html#instrumenting-non-standard-allocation-functions

llvmbot added the clang, clang:frontend, and clang:codegen labels Nov 7, 2025

llvmbot commented Nov 7, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-codegen

Author: Aleksandr Nogikh (a-nogikh)

Full diff: https://github.com/llvm/llvm-project/pull/167010.diff

8 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+4)
  • (modified) clang/include/clang/Basic/Attr.td (+6)
  • (modified) clang/include/clang/Basic/AttrDocs.td (+15)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+4)
  • (modified) clang/lib/CodeGen/CGExpr.cpp (+2-1)
  • (modified) clang/lib/Sema/SemaDeclAttr.cpp (+36)
  • (modified) clang/test/Misc/pragma-attribute-supported-attributes-list.test (+1)
  • (added) clang/test/Sema/attr-malloc_span.c (+31)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index e8339fa13ffba..53ed395b54ee5 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -327,6 +327,10 @@ Attribute Changes in Clang
 - New format attributes ``gnu_printf``, ``gnu_scanf``, ``gnu_strftime`` and ``gnu_strfmon`` are added
   as aliases for ``printf``, ``scanf``, ``strftime`` and ``strfmon``. (#GH16219)
 
+- New function attribute ``malloc_span`` is added. It has the ``malloc`` semantics, but applies to
+  functions returning span-like structures (i.e. those whose first field is a pointer and whose
+  second field is an integer size) rather than to functions returning pointers.
+
 Improvements to Clang's diagnostics
 -----------------------------------
 - Diagnostics messages now refer to ``structured binding`` instead of ``decomposition``,
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 1013bfc575747..27987b4da3cc9 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -2068,6 +2068,12 @@ def Restrict : InheritableAttr {
   let Documentation = [RestrictDocs];
 }
 
+def MallocSpan : InheritableAttr {
+  let Spellings = [Clang<"malloc_span">];
+  let Subjects = SubjectList<[Function]>;
+  let Documentation = [MallocSpanDocs];
+}
+
 def LayoutVersion : InheritableAttr, TargetSpecificAttr<TargetMicrosoftRecordLayout> {
   let Spellings = [Declspec<"layout_version">];
   let Args = [UnsignedArgument<"Version">];
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 1be9a96aa44de..0964ea70f345d 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -5247,6 +5247,21 @@ yet implemented in clang.
   }];
 }
 
+def MallocSpanDocs : Documentation {
+  let Category = DocCatFunction;
+  let Heading = "malloc_span";
+  let Content = [{
+The ``malloc_span`` attribute can be used to mark that a function, which acts
+like a system memory allocation function and returns a span-like structure,
+returns a pointer to memory that does not alias storage from any other object
+accessible to the caller.
+
+In this context, a span-like structure is assumed to have a pointer to the
+allocated memory as its first field and an integer with the size of the
+actually allocated memory as the second field.
+  }];
+}
+
 def ReturnsNonNullDocs : Documentation {
   let Category = NullabilityDocs;
   let Content = [{
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 04f2e8d654fd5..7687eb5ca0ec6 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -3449,6 +3449,10 @@ def err_attribute_integers_only : Error<
 def warn_attribute_return_pointers_only : Warning<
   "%0 attribute only applies to return values that are pointers">,
   InGroup<IgnoredAttributes>;
+def warn_attribute_return_span_only
+    : Warning<"%0 attribute only applies to return values that are span-like "
+              "structures">,
+      InGroup<IgnoredAttributes>;
 def warn_attribute_return_pointers_refs_only : Warning<
   "%0 attribute only applies to return values that are pointers or references">,
   InGroup<IgnoredAttributes>;
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index 01f2161f27555..cc6174578a87c 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -6642,7 +6642,8 @@ RValue CodeGenFunction::EmitCall(QualType CalleeType,
                                   CalleeDecl);
     }
     if (CalleeDecl->hasAttr<RestrictAttr>() ||
-        CalleeDecl->hasAttr<AllocSizeAttr>()) {
+        CalleeDecl->hasAttr<AllocSizeAttr>() ||
+        CalleeDecl->hasAttr<MallocSpanAttr>()) {
       // Function has 'malloc' (aka. 'restrict') or 'alloc_size' attribute.
       if (SanOpts.has(SanitizerKind::AllocToken)) {
         // Set !alloc_token metadata.
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index a9e7b44ac9d73..34d6ad0de1f70 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -1839,6 +1839,39 @@ static void handleRestrictAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
                  RestrictAttr(S.Context, AL, DeallocE, DeallocPtrIdx));
 }
 
+static bool isSpanLikeType(const QualType &Ty) {
+  // Check that the type is a plain record with the first field being a pointer
+  // type and the second field being an integer.
+  // This matches the common implementation of std::span or sized_allocation_t
+  // in P0901R11.
+  // Note that there may also be numerous cases of pointer+integer structures
+  // that do not actually exhibit std::span-like semantics, so this heuristic
+  // is expected to produce occasional false positives.
+  const RecordDecl *RD = Ty->getAsRecordDecl();
+  if (!RD || RD->isUnion())
+    return false;
+  const RecordDecl *Def = RD->getDefinition();
+  if (!Def)
+    return false; // This is an incomplete type.
+  auto FieldsBegin = Def->field_begin();
+  if (std::distance(FieldsBegin, Def->field_end()) != 2)
+    return false;
+  const FieldDecl *FirstField = *FieldsBegin;
+  const FieldDecl *SecondField = *std::next(FieldsBegin);
+  return FirstField->getType()->isAnyPointerType() &&
+         SecondField->getType()->isIntegerType();
+}
+
+static void handleMallocSpanAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
+  QualType ResultType = getFunctionOrMethodResultType(D);
+  if (!isSpanLikeType(ResultType)) {
+    S.Diag(AL.getLoc(), diag::warn_attribute_return_span_only)
+        << AL << getFunctionOrMethodResultSourceRange(D);
+    return;
+  }
+  D->addAttr(::new (S.Context) MallocSpanAttr(S.Context, AL));
+}
+
 static void handleCPUSpecificAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
   // Ensure we don't combine these with themselves, since that causes some
   // confusing behavior.
@@ -7278,6 +7311,9 @@ ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D, const ParsedAttr &AL,
   case ParsedAttr::AT_Restrict:
     handleRestrictAttr(S, D, AL);
     break;
+  case ParsedAttr::AT_MallocSpan:
+    handleMallocSpanAttr(S, D, AL);
+    break;
   case ParsedAttr::AT_Mode:
     handleModeAttr(S, D, AL);
     break;
diff --git a/clang/test/Misc/pragma-attribute-supported-attributes-list.test b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
index ab4153a64f028..747eb17446c87 100644
--- a/clang/test/Misc/pragma-attribute-supported-attributes-list.test
+++ b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
@@ -102,6 +102,7 @@
 // CHECK-NEXT: MIGServerRoutine (SubjectMatchRule_function, SubjectMatchRule_objc_method, SubjectMatchRule_block)
 // CHECK-NEXT: MSConstexpr (SubjectMatchRule_function)
 // CHECK-NEXT: MSStruct (SubjectMatchRule_record)
+// CHECK-NEXT: MallocSpan (SubjectMatchRule_function)
 // CHECK-NEXT: MaybeUndef (SubjectMatchRule_variable_is_parameter)
 // CHECK-NEXT: MicroMips (SubjectMatchRule_function)
 // CHECK-NEXT: MinSize (SubjectMatchRule_function, SubjectMatchRule_objc_method)
diff --git a/clang/test/Sema/attr-malloc_span.c b/clang/test/Sema/attr-malloc_span.c
new file mode 100644
index 0000000000000..05f29ccf6dd83
--- /dev/null
+++ b/clang/test/Sema/attr-malloc_span.c
@@ -0,0 +1,31 @@
+// RUN: %clang_cc1 -verify -fsyntax-only %s
+// RUN: %clang_cc1 -emit-llvm -o %t %s
+
+#include <stddef.h>
+
+typedef struct {
+  void *ptr;
+  size_t n;
+} sized_ptr;
+sized_ptr  returns_sized_ptr  (void) __attribute((malloc_span)); // no-warning
+
+// The first struct field must be a pointer and the second must be an integer.
+// Check the possible ways to violate it.
+typedef struct {
+  size_t n;
+  void *ptr;
+} invalid_span1;
+invalid_span1  returns_non_std_span1  (void) __attribute((malloc_span)); // expected-warning {{attribute only applies to return values that are span-like structures}}
+
+typedef struct {
+  void *ptr;
+  void *ptr2;
+} invalid_span2;
+invalid_span2  returns_non_std_span2  (void) __attribute((malloc_span)); // expected-warning {{attribute only applies to return values that are span-like structures}}
+
+typedef struct {
+  void *ptr;
+  size_t n;
+  size_t n2;
+} invalid_span3;
+invalid_span3  returns_non_std_span3  (void) __attribute((malloc_span)); // expected-warning {{attribute only applies to return values that are span-like structures}}
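
The shape check performed by `isSpanLikeType` in the diff above can be modeled outside of Clang. The sketch below is a toy model, not Clang code: `field_kind`, `record_model`, `is_span_like`, and `check_pr_test_cases` are hypothetical names, and the real implementation inspects the fields of a `RecordDecl` (where, for instance, `isIntegerType()` covers more than plain `size_t`) rather than a hand-written descriptor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Coarse classification of a field's type (toy model). */
typedef enum { FIELD_POINTER, FIELD_INTEGER, FIELD_OTHER } field_kind;

/* Hand-written stand-in for the information Clang reads from a record. */
typedef struct {
  bool is_union;
  bool is_complete;
  size_t num_fields;
  const field_kind *fields;
} record_model;

/* Same order of checks as the PR: not a union, complete, exactly two
 * fields, pointer first and integer second. */
bool is_span_like(const record_model *r) {
  if (r == NULL || r->is_union)
    return false;
  if (!r->is_complete)
    return false; /* incomplete type: fields are unknown */
  if (r->num_fields != 2)
    return false;
  return r->fields[0] == FIELD_POINTER && r->fields[1] == FIELD_INTEGER;
}

/* Mirrors the four cases in clang/test/Sema/attr-malloc_span.c. */
void check_pr_test_cases(void) {
  static const field_kind ptr_int[] = {FIELD_POINTER, FIELD_INTEGER};
  static const field_kind int_ptr[] = {FIELD_INTEGER, FIELD_POINTER};
  static const field_kind ptr_ptr[] = {FIELD_POINTER, FIELD_POINTER};
  static const field_kind ptr_int_int[] = {FIELD_POINTER, FIELD_INTEGER,
                                           FIELD_INTEGER};

  const record_model sized_ptr = {false, true, 2, ptr_int};
  const record_model invalid_span1 = {false, true, 2, int_ptr};
  const record_model invalid_span2 = {false, true, 2, ptr_ptr};
  const record_model invalid_span3 = {false, true, 3, ptr_int_int};

  assert(is_span_like(&sized_ptr));      /* accepted, no warning */
  assert(!is_span_like(&invalid_span1)); /* fields in the wrong order */
  assert(!is_span_like(&invalid_span2)); /* second field is not an integer */
  assert(!is_span_like(&invalid_span3)); /* more than two fields */
}
```

Unions and incomplete types are rejected before the fields are examined, matching the early returns in the PR's helper.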

a-nogikh commented Nov 7, 2025

This PR is a reworked version of #165433. The original PR proposed changing the semantics of the malloc attribute itself, but the consensus was that adding a separate malloc_span attribute would be a better approach.
