Skip to content

Conversation

@kees
Copy link
Contributor

@kees kees commented Oct 23, 2025

The counted_by attribute currently rejects void* members because void has no defined size. However, the sized_by attribute accepts void* since it explicitly measures bytes. As a GNU extension, void pointer arithmetic treats void as having size 1 byte, so counted_by on void* should behave identically to sized_by (treating the count as bytes).

Allow counted_by on void* as a GNU extension. The implementation validates this only at declaration time in SemaBoundsSafety.cpp, emitting a -Wpointer-arith warning that the attribute is treated as a GNU extension equivalent to sized_by. Both use-site validation and code generation trust this earlier validation, avoiding redundant checks.

In CodeGen, __builtin_dynamic_object_size now correctly handles counted_by on void* by treating any CountAttributedType with zero element size as having 1-byte elements, matching the GNU void pointer arithmetic semantics.

Add tests validating both Sema diagnostics and CodeGen behavior (correct byte counts from __builtin_dynamic_object_size). Update existing counted_by tests to explicitly use -Wpointer-arith to preserve their original intent of rejecting void* in strict C mode.

The counted_by attribute currently rejects void* members because void
has no defined size. However, the sized_by attribute accepts void*
since it explicitly measures bytes. In GNU mode, void pointer
arithmetic treats void as having size 1 byte, so counted_by on void*
should behave identically to sized_by (treating the count as bytes).

Allow counted_by on void* when GNU extensions are enabled. The
implementation validates this only at declaration time in
SemaBoundsSafety.cpp, emitting a -Wpointer-arith warning that the
attribute is treated as a GNU extension equivalent to sized_by. Both
use-site validation and code generation trust this earlier validation,
avoiding redundant checks.

In CodeGen, __builtin_dynamic_object_size now correctly handles
counted_by on void* by treating any CountAttributedType with zero
element size as having 1-byte elements, matching the GNU void pointer
arithmetic semantics.

Add tests validating both Sema diagnostics (warnings in GNU mode,
errors in strict C) and CodeGen behavior (correct byte counts from
__builtin_dynamic_object_size). Update existing counted_by tests to
explicitly use -std=c11 to preserve their original intent of rejecting
void* in strict C mode.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Oct 23, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 23, 2025

@llvm/pr-subscribers-clang

Author: Kees Cook (kees)

Changes

The counted_by attribute currently rejects void* members because void has no defined size. However, the sized_by attribute accepts void* since it explicitly measures bytes. In GNU mode, void pointer arithmetic treats void as having size 1 byte, so counted_by on void* should behave identically to sized_by (treating the count as bytes).

Allow counted_by on void* when GNU extensions are enabled. The implementation validates this only at declaration time in SemaBoundsSafety.cpp, emitting a -Wpointer-arith warning that the attribute is treated as a GNU extension equivalent to sized_by. Both use-site validation and code generation trust this earlier validation, avoiding redundant checks.

In CodeGen, __builtin_dynamic_object_size now correctly handles counted_by on void* by treating any CountAttributedType with zero element size as having 1-byte elements, matching the GNU void pointer arithmetic semantics.

Add tests validating both Sema diagnostics (warnings in GNU mode, errors in strict C) and CodeGen behavior (correct byte counts from __builtin_dynamic_object_size). Update existing counted_by tests to explicitly use -std=c11 to preserve their original intent of rejecting void* in strict C mode.


Full diff: https://github.com/llvm/llvm-project/pull/164737.diff

10 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+6)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+5-3)
  • (modified) clang/lib/Sema/SemaBoundsSafety.cpp (+19-3)
  • (added) clang/test/CodeGen/attr-counted-by-void-ptr-gnu.c (+65)
  • (modified) clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c (+1-1)
  • (modified) clang/test/Sema/attr-counted-by-or-null-last-field.c (+2-2)
  • (modified) clang/test/Sema/attr-counted-by-or-null-late-parsed-struct-ptrs.c (+1-1)
  • (modified) clang/test/Sema/attr-counted-by-or-null-struct-ptrs.c (+2-2)
  • (modified) clang/test/Sema/attr-counted-by-struct-ptrs.c (+2-2)
  • (added) clang/test/Sema/attr-counted-by-void-ptr-gnu.c (+65)
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 20b499462ae94..33f28f251f5d0 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -8114,6 +8114,12 @@ def ext_gnu_ptr_func_arith : Extension<
   "arithmetic on%select{ a|}0 pointer%select{|s}0 to%select{ the|}2 function "
   "type%select{|s}2 %1%select{| and %3}2 is a GNU extension">,
   InGroup<GNUPointerArith>;
+def ext_gnu_counted_by_void_ptr
+    : Extension<
+          "'%select{counted_by|sized_by|counted_by_or_null|sized_by_or_null}0' "
+          "on a pointer to void is a GNU extension, treated as "
+          "'%select{sized_by|sized_by|sized_by_or_null|sized_by_or_null}0'">,
+      InGroup<GNUPointerArith>;
 def err_readonly_message_assignment : Error<
   "assigning to 'readonly' return result of an Objective-C message not allowed">;
 def ext_c2y_increment_complex : Extension<
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index fd14cd6926fe2..aaf8d59414498 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -1211,11 +1211,13 @@ llvm::Value *CodeGenFunction::emitCountedByPointerSize(
         getContext().getTypeSizeInChars(ElementTy->getPointeeType());
 
     if (ElementSize.isZero()) {
-      // This might be a __sized_by on a 'void *', which counts bytes, not
-      // elements.
+      // This might be a __sized_by (or __counted_by in GNU mode) on a
+      // 'void *', which counts bytes, not elements.
       auto *CAT = ElementTy->getAs<CountAttributedType>();
       if (!CAT || (CAT->getKind() != CountAttributedType::SizedBy &&
-                   CAT->getKind() != CountAttributedType::SizedByOrNull))
+                   CAT->getKind() != CountAttributedType::SizedByOrNull &&
+                   CAT->getKind() != CountAttributedType::CountedBy &&
+                   CAT->getKind() != CountAttributedType::CountedByOrNull))
         // Okay, not sure what it is now.
         // FIXME: Should this be an assert?
         return std::optional<CharUnits>();
diff --git a/clang/lib/Sema/SemaBoundsSafety.cpp b/clang/lib/Sema/SemaBoundsSafety.cpp
index 39ab13653f5fe..abc7397aaa888 100644
--- a/clang/lib/Sema/SemaBoundsSafety.cpp
+++ b/clang/lib/Sema/SemaBoundsSafety.cpp
@@ -132,9 +132,20 @@ bool Sema::CheckCountedByAttrOnField(FieldDecl *FD, Expr *E, bool CountInBytes,
     // `BoundsSafetyCheckUseOfCountAttrPtr`
     //
     // * When the pointee type is always an incomplete type (e.g.
-    // `void`) the attribute is disallowed by this method because we know the
-    // type can never be completed so there's no reason to allow it.
-    InvalidTypeKind = CountedByInvalidPointeeTypeKind::INCOMPLETE;
+    // `void` in strict C mode) the attribute is disallowed by this method
+    // because we know the type can never be completed so there's no reason
+    // to allow it.
+    //
+    // Exception: In GNU mode, void has an implicit size of 1 byte for pointer
+    // arithmetic. Therefore, counted_by on void* is allowed as a GNU extension
+    // and behaves equivalently to sized_by (treating the count as bytes).
+    bool IsVoidPtrInGNUMode = PointeeTy->isVoidType() && getLangOpts().GNUMode;
+    if (IsVoidPtrInGNUMode) {
+      // Emit a warning that this is a GNU extension
+      Diag(FD->getBeginLoc(), diag::ext_gnu_counted_by_void_ptr) << Kind;
+    } else {
+      InvalidTypeKind = CountedByInvalidPointeeTypeKind::INCOMPLETE;
+    }
   } else if (PointeeTy->isSizelessType()) {
     InvalidTypeKind = CountedByInvalidPointeeTypeKind::SIZELESS;
   } else if (PointeeTy->isFunctionType()) {
@@ -272,6 +283,11 @@ GetCountedByAttrOnIncompletePointee(QualType Ty, NamedDecl **ND) {
   if (!PointeeTy->isIncompleteType(ND))
     return {};
 
+  // If counted_by is on void*, it was already validated at declaration time
+  // as a GNU extension. No need to re-check GNU mode here.
+  if (PointeeTy->isVoidType())
+    return {};
+
   return {CATy, PointeeTy};
 }
 
diff --git a/clang/test/CodeGen/attr-counted-by-void-ptr-gnu.c b/clang/test/CodeGen/attr-counted-by-void-ptr-gnu.c
new file mode 100644
index 0000000000000..e22aad306f60c
--- /dev/null
+++ b/clang/test/CodeGen/attr-counted-by-void-ptr-gnu.c
@@ -0,0 +1,65 @@
+// RUN: %clang_cc1 -std=gnu11 -triple x86_64-unknown-linux-gnu -O2 -emit-llvm -o - %s | FileCheck %s
+
+// Test that counted_by on void* in GNU mode treats void as having size 1 (byte count)
+
+#define __counted_by(f)  __attribute__((counted_by(f)))
+#define __sized_by(f)  __attribute__((sized_by(f)))
+
+struct with_counted_by_void {
+  int count;
+  void* buf __counted_by(count);
+};
+
+struct with_sized_by_void {
+  int size;
+  void* buf __sized_by(size);
+};
+
+struct with_counted_by_int {
+  int count;
+  int* buf __counted_by(count);
+};
+
+// CHECK-LABEL: define dso_local {{.*}}@test_counted_by_void(
+// CHECK:         %[[COUNT:.*]] = load i32, ptr %s
+// CHECK:         %[[NARROW:.*]] = tail call i32 @llvm.smax.i32(i32 %[[COUNT]], i32 0)
+// CHECK:         %[[ZEXT:.*]] = zext nneg i32 %[[NARROW]] to i64
+// CHECK:         ret i64 %[[ZEXT]]
+//
+// Verify: counted_by on void* returns the count directly (count * 1 byte)
+long long test_counted_by_void(struct with_counted_by_void *s) {
+  return __builtin_dynamic_object_size(s->buf, 0);
+}
+
+// CHECK-LABEL: define dso_local {{.*}}@test_sized_by_void(
+// CHECK:         %[[SIZE:.*]] = load i32, ptr %s
+// CHECK:         %[[NARROW:.*]] = tail call i32 @llvm.smax.i32(i32 %[[SIZE]], i32 0)
+// CHECK:         %[[ZEXT:.*]] = zext nneg i32 %[[NARROW]] to i64
+// CHECK:         ret i64 %[[ZEXT]]
+//
+// Verify: sized_by on void* returns the size directly
+long long test_sized_by_void(struct with_sized_by_void *s) {
+  return __builtin_dynamic_object_size(s->buf, 0);
+}
+
+// CHECK-LABEL: define dso_local {{.*}}@test_counted_by_int(
+// CHECK:         %[[COUNT:.*]] = load i32, ptr %s
+// CHECK:         %[[SEXT:.*]] = sext i32 %[[COUNT]] to i64
+// CHECK:         %[[SIZE:.*]] = shl nsw i64 %[[SEXT]], 2
+// CHECK:         ret i64
+//
+// Verify: counted_by on int* returns count * sizeof(int) = count * 4
+long long test_counted_by_int(struct with_counted_by_int *s) {
+  return __builtin_dynamic_object_size(s->buf, 0);
+}
+
+// CHECK-LABEL: define dso_local ptr @test_void_ptr_arithmetic(
+// CHECK:         %[[BUF:.*]] = load ptr, ptr
+// CHECK:         %[[EXT:.*]] = sext i32 %offset to i64
+// CHECK:         %[[PTR:.*]] = getelementptr inbounds i8, ptr %[[BUF]], i64 %[[EXT]]
+// CHECK:         ret ptr %[[PTR]]
+//
+// Verify: pointer arithmetic on void* uses i8 (byte offsets), not i32 or other sizes
+void* test_void_ptr_arithmetic(struct with_counted_by_void *s, int offset) {
+  return s->buf + offset;  // GNU extension: void* arithmetic
+}
diff --git a/clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c b/clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c
index 8d4e0c510603a..b6c2ba68732ee 100644
--- a/clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c
+++ b/clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c11 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
 
 #define __counted_by(f)  __attribute__((counted_by(f)))
 
diff --git a/clang/test/Sema/attr-counted-by-or-null-last-field.c b/clang/test/Sema/attr-counted-by-or-null-last-field.c
index 60a1f571b19e9..4cf8cb64557d2 100644
--- a/clang/test/Sema/attr-counted-by-or-null-last-field.c
+++ b/clang/test/Sema/attr-counted-by-or-null-last-field.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -fsyntax-only -verify=expected,immediate %s
-// RUN: %clang_cc1 -fsyntax-only -fexperimental-late-parse-attributes -verify=expected,late %s
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -verify=expected,immediate %s
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -fexperimental-late-parse-attributes -verify=expected,late %s
 
 #define __counted_by_or_null(f)  __attribute__((counted_by_or_null(f)))
 
diff --git a/clang/test/Sema/attr-counted-by-or-null-late-parsed-struct-ptrs.c b/clang/test/Sema/attr-counted-by-or-null-late-parsed-struct-ptrs.c
index 2150c81f9e9be..29e235463f822 100644
--- a/clang/test/Sema/attr-counted-by-or-null-late-parsed-struct-ptrs.c
+++ b/clang/test/Sema/attr-counted-by-or-null-late-parsed-struct-ptrs.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c11 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
 
 #define __counted_by_or_null(f)  __attribute__((counted_by_or_null(f)))
 #define __counted_by(f)  __attribute__((counted_by(f)))
diff --git a/clang/test/Sema/attr-counted-by-or-null-struct-ptrs.c b/clang/test/Sema/attr-counted-by-or-null-struct-ptrs.c
index 0bb09059c97f9..7fe43a20c0c6e 100644
--- a/clang/test/Sema/attr-counted-by-or-null-struct-ptrs.c
+++ b/clang/test/Sema/attr-counted-by-or-null-struct-ptrs.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s
-// RUN: %clang_cc1 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c11 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
 
 #define __counted_by_or_null(f)  __attribute__((counted_by_or_null(f)))
 #define __counted_by(f)  __attribute__((counted_by(f)))
diff --git a/clang/test/Sema/attr-counted-by-struct-ptrs.c b/clang/test/Sema/attr-counted-by-struct-ptrs.c
index c05d18262e2b7..d75533d7d135b 100644
--- a/clang/test/Sema/attr-counted-by-struct-ptrs.c
+++ b/clang/test/Sema/attr-counted-by-struct-ptrs.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s
-// RUN: %clang_cc1 -fsyntax-only -fexperimental-late-parse-attributes %s -verify
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -fexperimental-late-parse-attributes %s -verify
 
 #define __counted_by(f)  __attribute__((counted_by(f)))
 
diff --git a/clang/test/Sema/attr-counted-by-void-ptr-gnu.c b/clang/test/Sema/attr-counted-by-void-ptr-gnu.c
new file mode 100644
index 0000000000000..c6aa371434bc4
--- /dev/null
+++ b/clang/test/Sema/attr-counted-by-void-ptr-gnu.c
@@ -0,0 +1,65 @@
+// RUN: %clang_cc1 -std=gnu11 -fsyntax-only -verify=expected-nowarn %s
+// RUN: %clang_cc1 -std=gnu11 -Wpointer-arith -fsyntax-only -verify=expected-warn %s
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -verify=expected-strict %s
+
+// expected-nowarn-no-diagnostics
+
+#define __counted_by(f)  __attribute__((counted_by(f)))
+#define __counted_by_or_null(f)  __attribute__((counted_by_or_null(f)))
+#define __sized_by(f)  __attribute__((sized_by(f)))
+
+//==============================================================================
+// Test: counted_by on void* is allowed in GNU mode, rejected in strict mode
+//==============================================================================
+
+struct test_void_ptr_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by' on a pointer to void is a GNU extension, treated as 'sized_by'}}
+  // expected-strict-error@+1{{'counted_by' cannot be applied to a pointer with pointee of unknown size because 'void' is an incomplete type}}
+  void* buf __counted_by(count);
+};
+
+struct test_const_void_ptr_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by' on a pointer to void is a GNU extension, treated as 'sized_by'}}
+  // expected-strict-error@+1{{'counted_by' cannot be applied to a pointer with pointee of unknown size because 'const void' is an incomplete type}}
+  const void* buf __counted_by(count);
+};
+
+struct test_volatile_void_ptr_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by' on a pointer to void is a GNU extension, treated as 'sized_by'}}
+  // expected-strict-error@+1{{'counted_by' cannot be applied to a pointer with pointee of unknown size because 'volatile void' is an incomplete type}}
+  volatile void* buf __counted_by(count);
+};
+
+struct test_const_volatile_void_ptr_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by' on a pointer to void is a GNU extension, treated as 'sized_by'}}
+  // expected-strict-error@+1{{'counted_by' cannot be applied to a pointer with pointee of unknown size because 'const volatile void' is an incomplete type}}
+  const volatile void* buf __counted_by(count);
+};
+
+// Verify sized_by still works the same way (always allowed, no warning)
+struct test_sized_by_void_ptr {
+  int size;
+  void* buf __sized_by(size);  // OK in both modes, no warning
+};
+
+//==============================================================================
+// Test: counted_by_or_null on void* behaves the same
+//==============================================================================
+
+struct test_void_ptr_or_null_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by_or_null' on a pointer to void is a GNU extension, treated as 'sized_by_or_null'}}
+  // expected-strict-error@+1{{'counted_by_or_null' cannot be applied to a pointer with pointee of unknown size because 'void' is an incomplete type}}
+  void* buf __counted_by_or_null(count);
+};
+
+struct test_const_void_ptr_or_null_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by_or_null' on a pointer to void is a GNU extension, treated as 'sized_by_or_null'}}
+  // expected-strict-error@+1{{'counted_by_or_null' cannot be applied to a pointer with pointee of unknown size because 'const void' is an incomplete type}}
+  const void* buf __counted_by_or_null(count);
+};

@llvmbot
Copy link
Member

llvmbot commented Oct 23, 2025

@llvm/pr-subscribers-clang-codegen

Author: Kees Cook (kees)

Changes

The counted_by attribute currently rejects void* members because void has no defined size. However, the sized_by attribute accepts void* since it explicitly measures bytes. In GNU mode, void pointer arithmetic treats void as having size 1 byte, so counted_by on void* should behave identically to sized_by (treating the count as bytes).

Allow counted_by on void* when GNU extensions are enabled. The implementation validates this only at declaration time in SemaBoundsSafety.cpp, emitting a -Wpointer-arith warning that the attribute is treated as a GNU extension equivalent to sized_by. Both use-site validation and code generation trust this earlier validation, avoiding redundant checks.

In CodeGen, __builtin_dynamic_object_size now correctly handles counted_by on void* by treating any CountAttributedType with zero element size as having 1-byte elements, matching the GNU void pointer arithmetic semantics.

Add tests validating both Sema diagnostics (warnings in GNU mode, errors in strict C) and CodeGen behavior (correct byte counts from __builtin_dynamic_object_size). Update existing counted_by tests to explicitly use -std=c11 to preserve their original intent of rejecting void* in strict C mode.


Full diff: https://github.com/llvm/llvm-project/pull/164737.diff

10 Files Affected:

  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+6)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+5-3)
  • (modified) clang/lib/Sema/SemaBoundsSafety.cpp (+19-3)
  • (added) clang/test/CodeGen/attr-counted-by-void-ptr-gnu.c (+65)
  • (modified) clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c (+1-1)
  • (modified) clang/test/Sema/attr-counted-by-or-null-last-field.c (+2-2)
  • (modified) clang/test/Sema/attr-counted-by-or-null-late-parsed-struct-ptrs.c (+1-1)
  • (modified) clang/test/Sema/attr-counted-by-or-null-struct-ptrs.c (+2-2)
  • (modified) clang/test/Sema/attr-counted-by-struct-ptrs.c (+2-2)
  • (added) clang/test/Sema/attr-counted-by-void-ptr-gnu.c (+65)
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 20b499462ae94..33f28f251f5d0 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -8114,6 +8114,12 @@ def ext_gnu_ptr_func_arith : Extension<
   "arithmetic on%select{ a|}0 pointer%select{|s}0 to%select{ the|}2 function "
   "type%select{|s}2 %1%select{| and %3}2 is a GNU extension">,
   InGroup<GNUPointerArith>;
+def ext_gnu_counted_by_void_ptr
+    : Extension<
+          "'%select{counted_by|sized_by|counted_by_or_null|sized_by_or_null}0' "
+          "on a pointer to void is a GNU extension, treated as "
+          "'%select{sized_by|sized_by|sized_by_or_null|sized_by_or_null}0'">,
+      InGroup<GNUPointerArith>;
 def err_readonly_message_assignment : Error<
   "assigning to 'readonly' return result of an Objective-C message not allowed">;
 def ext_c2y_increment_complex : Extension<
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index fd14cd6926fe2..aaf8d59414498 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -1211,11 +1211,13 @@ llvm::Value *CodeGenFunction::emitCountedByPointerSize(
         getContext().getTypeSizeInChars(ElementTy->getPointeeType());
 
     if (ElementSize.isZero()) {
-      // This might be a __sized_by on a 'void *', which counts bytes, not
-      // elements.
+      // This might be a __sized_by (or __counted_by in GNU mode) on a
+      // 'void *', which counts bytes, not elements.
       auto *CAT = ElementTy->getAs<CountAttributedType>();
       if (!CAT || (CAT->getKind() != CountAttributedType::SizedBy &&
-                   CAT->getKind() != CountAttributedType::SizedByOrNull))
+                   CAT->getKind() != CountAttributedType::SizedByOrNull &&
+                   CAT->getKind() != CountAttributedType::CountedBy &&
+                   CAT->getKind() != CountAttributedType::CountedByOrNull))
         // Okay, not sure what it is now.
         // FIXME: Should this be an assert?
         return std::optional<CharUnits>();
diff --git a/clang/lib/Sema/SemaBoundsSafety.cpp b/clang/lib/Sema/SemaBoundsSafety.cpp
index 39ab13653f5fe..abc7397aaa888 100644
--- a/clang/lib/Sema/SemaBoundsSafety.cpp
+++ b/clang/lib/Sema/SemaBoundsSafety.cpp
@@ -132,9 +132,20 @@ bool Sema::CheckCountedByAttrOnField(FieldDecl *FD, Expr *E, bool CountInBytes,
     // `BoundsSafetyCheckUseOfCountAttrPtr`
     //
     // * When the pointee type is always an incomplete type (e.g.
-    // `void`) the attribute is disallowed by this method because we know the
-    // type can never be completed so there's no reason to allow it.
-    InvalidTypeKind = CountedByInvalidPointeeTypeKind::INCOMPLETE;
+    // `void` in strict C mode) the attribute is disallowed by this method
+    // because we know the type can never be completed so there's no reason
+    // to allow it.
+    //
+    // Exception: In GNU mode, void has an implicit size of 1 byte for pointer
+    // arithmetic. Therefore, counted_by on void* is allowed as a GNU extension
+    // and behaves equivalently to sized_by (treating the count as bytes).
+    bool IsVoidPtrInGNUMode = PointeeTy->isVoidType() && getLangOpts().GNUMode;
+    if (IsVoidPtrInGNUMode) {
+      // Emit a warning that this is a GNU extension
+      Diag(FD->getBeginLoc(), diag::ext_gnu_counted_by_void_ptr) << Kind;
+    } else {
+      InvalidTypeKind = CountedByInvalidPointeeTypeKind::INCOMPLETE;
+    }
   } else if (PointeeTy->isSizelessType()) {
     InvalidTypeKind = CountedByInvalidPointeeTypeKind::SIZELESS;
   } else if (PointeeTy->isFunctionType()) {
@@ -272,6 +283,11 @@ GetCountedByAttrOnIncompletePointee(QualType Ty, NamedDecl **ND) {
   if (!PointeeTy->isIncompleteType(ND))
     return {};
 
+  // If counted_by is on void*, it was already validated at declaration time
+  // as a GNU extension. No need to re-check GNU mode here.
+  if (PointeeTy->isVoidType())
+    return {};
+
   return {CATy, PointeeTy};
 }
 
diff --git a/clang/test/CodeGen/attr-counted-by-void-ptr-gnu.c b/clang/test/CodeGen/attr-counted-by-void-ptr-gnu.c
new file mode 100644
index 0000000000000..e22aad306f60c
--- /dev/null
+++ b/clang/test/CodeGen/attr-counted-by-void-ptr-gnu.c
@@ -0,0 +1,65 @@
+// RUN: %clang_cc1 -std=gnu11 -triple x86_64-unknown-linux-gnu -O2 -emit-llvm -o - %s | FileCheck %s
+
+// Test that counted_by on void* in GNU mode treats void as having size 1 (byte count)
+
+#define __counted_by(f)  __attribute__((counted_by(f)))
+#define __sized_by(f)  __attribute__((sized_by(f)))
+
+struct with_counted_by_void {
+  int count;
+  void* buf __counted_by(count);
+};
+
+struct with_sized_by_void {
+  int size;
+  void* buf __sized_by(size);
+};
+
+struct with_counted_by_int {
+  int count;
+  int* buf __counted_by(count);
+};
+
+// CHECK-LABEL: define dso_local {{.*}}@test_counted_by_void(
+// CHECK:         %[[COUNT:.*]] = load i32, ptr %s
+// CHECK:         %[[NARROW:.*]] = tail call i32 @llvm.smax.i32(i32 %[[COUNT]], i32 0)
+// CHECK:         %[[ZEXT:.*]] = zext nneg i32 %[[NARROW]] to i64
+// CHECK:         ret i64 %[[ZEXT]]
+//
+// Verify: counted_by on void* returns the count directly (count * 1 byte)
+long long test_counted_by_void(struct with_counted_by_void *s) {
+  return __builtin_dynamic_object_size(s->buf, 0);
+}
+
+// CHECK-LABEL: define dso_local {{.*}}@test_sized_by_void(
+// CHECK:         %[[SIZE:.*]] = load i32, ptr %s
+// CHECK:         %[[NARROW:.*]] = tail call i32 @llvm.smax.i32(i32 %[[SIZE]], i32 0)
+// CHECK:         %[[ZEXT:.*]] = zext nneg i32 %[[NARROW]] to i64
+// CHECK:         ret i64 %[[ZEXT]]
+//
+// Verify: sized_by on void* returns the size directly
+long long test_sized_by_void(struct with_sized_by_void *s) {
+  return __builtin_dynamic_object_size(s->buf, 0);
+}
+
+// CHECK-LABEL: define dso_local {{.*}}@test_counted_by_int(
+// CHECK:         %[[COUNT:.*]] = load i32, ptr %s
+// CHECK:         %[[SEXT:.*]] = sext i32 %[[COUNT]] to i64
+// CHECK:         %[[SIZE:.*]] = shl nsw i64 %[[SEXT]], 2
+// CHECK:         ret i64
+//
+// Verify: counted_by on int* returns count * sizeof(int) = count * 4
+long long test_counted_by_int(struct with_counted_by_int *s) {
+  return __builtin_dynamic_object_size(s->buf, 0);
+}
+
+// CHECK-LABEL: define dso_local ptr @test_void_ptr_arithmetic(
+// CHECK:         %[[BUF:.*]] = load ptr, ptr
+// CHECK:         %[[EXT:.*]] = sext i32 %offset to i64
+// CHECK:         %[[PTR:.*]] = getelementptr inbounds i8, ptr %[[BUF]], i64 %[[EXT]]
+// CHECK:         ret ptr %[[PTR]]
+//
+// Verify: pointer arithmetic on void* uses i8 (byte offsets), not i32 or other sizes
+void* test_void_ptr_arithmetic(struct with_counted_by_void *s, int offset) {
+  return s->buf + offset;  // GNU extension: void* arithmetic
+}
diff --git a/clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c b/clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c
index 8d4e0c510603a..b6c2ba68732ee 100644
--- a/clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c
+++ b/clang/test/Sema/attr-counted-by-late-parsed-struct-ptrs.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c11 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
 
 #define __counted_by(f)  __attribute__((counted_by(f)))
 
diff --git a/clang/test/Sema/attr-counted-by-or-null-last-field.c b/clang/test/Sema/attr-counted-by-or-null-last-field.c
index 60a1f571b19e9..4cf8cb64557d2 100644
--- a/clang/test/Sema/attr-counted-by-or-null-last-field.c
+++ b/clang/test/Sema/attr-counted-by-or-null-last-field.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -fsyntax-only -verify=expected,immediate %s
-// RUN: %clang_cc1 -fsyntax-only -fexperimental-late-parse-attributes -verify=expected,late %s
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -verify=expected,immediate %s
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -fexperimental-late-parse-attributes -verify=expected,late %s
 
 #define __counted_by_or_null(f)  __attribute__((counted_by_or_null(f)))
 
diff --git a/clang/test/Sema/attr-counted-by-or-null-late-parsed-struct-ptrs.c b/clang/test/Sema/attr-counted-by-or-null-late-parsed-struct-ptrs.c
index 2150c81f9e9be..29e235463f822 100644
--- a/clang/test/Sema/attr-counted-by-or-null-late-parsed-struct-ptrs.c
+++ b/clang/test/Sema/attr-counted-by-or-null-late-parsed-struct-ptrs.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c11 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
 
 #define __counted_by_or_null(f)  __attribute__((counted_by_or_null(f)))
 #define __counted_by(f)  __attribute__((counted_by(f)))
diff --git a/clang/test/Sema/attr-counted-by-or-null-struct-ptrs.c b/clang/test/Sema/attr-counted-by-or-null-struct-ptrs.c
index 0bb09059c97f9..7fe43a20c0c6e 100644
--- a/clang/test/Sema/attr-counted-by-or-null-struct-ptrs.c
+++ b/clang/test/Sema/attr-counted-by-or-null-struct-ptrs.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s
-// RUN: %clang_cc1 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c11 -fexperimental-late-parse-attributes -fsyntax-only -verify %s
 
 #define __counted_by_or_null(f)  __attribute__((counted_by_or_null(f)))
 #define __counted_by(f)  __attribute__((counted_by(f)))
diff --git a/clang/test/Sema/attr-counted-by-struct-ptrs.c b/clang/test/Sema/attr-counted-by-struct-ptrs.c
index c05d18262e2b7..d75533d7d135b 100644
--- a/clang/test/Sema/attr-counted-by-struct-ptrs.c
+++ b/clang/test/Sema/attr-counted-by-struct-ptrs.c
@@ -1,5 +1,5 @@
-// RUN: %clang_cc1 -fsyntax-only -verify %s
-// RUN: %clang_cc1 -fsyntax-only -fexperimental-late-parse-attributes %s -verify
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -fexperimental-late-parse-attributes %s -verify
 
 #define __counted_by(f)  __attribute__((counted_by(f)))
 
diff --git a/clang/test/Sema/attr-counted-by-void-ptr-gnu.c b/clang/test/Sema/attr-counted-by-void-ptr-gnu.c
new file mode 100644
index 0000000000000..c6aa371434bc4
--- /dev/null
+++ b/clang/test/Sema/attr-counted-by-void-ptr-gnu.c
@@ -0,0 +1,65 @@
+// RUN: %clang_cc1 -std=gnu11 -fsyntax-only -verify=expected-nowarn %s
+// RUN: %clang_cc1 -std=gnu11 -Wpointer-arith -fsyntax-only -verify=expected-warn %s
+// RUN: %clang_cc1 -std=c11 -fsyntax-only -verify=expected-strict %s
+
+// expected-nowarn-no-diagnostics
+
+#define __counted_by(f)  __attribute__((counted_by(f)))
+#define __counted_by_or_null(f)  __attribute__((counted_by_or_null(f)))
+#define __sized_by(f)  __attribute__((sized_by(f)))
+
+//==============================================================================
+// Test: counted_by on void* is allowed in GNU mode, rejected in strict mode
+//==============================================================================
+
+struct test_void_ptr_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by' on a pointer to void is a GNU extension, treated as 'sized_by'}}
+  // expected-strict-error@+1{{'counted_by' cannot be applied to a pointer with pointee of unknown size because 'void' is an incomplete type}}
+  void* buf __counted_by(count);
+};
+
+struct test_const_void_ptr_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by' on a pointer to void is a GNU extension, treated as 'sized_by'}}
+  // expected-strict-error@+1{{'counted_by' cannot be applied to a pointer with pointee of unknown size because 'const void' is an incomplete type}}
+  const void* buf __counted_by(count);
+};
+
+struct test_volatile_void_ptr_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by' on a pointer to void is a GNU extension, treated as 'sized_by'}}
+  // expected-strict-error@+1{{'counted_by' cannot be applied to a pointer with pointee of unknown size because 'volatile void' is an incomplete type}}
+  volatile void* buf __counted_by(count);
+};
+
+struct test_const_volatile_void_ptr_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by' on a pointer to void is a GNU extension, treated as 'sized_by'}}
+  // expected-strict-error@+1{{'counted_by' cannot be applied to a pointer with pointee of unknown size because 'const volatile void' is an incomplete type}}
+  const volatile void* buf __counted_by(count);
+};
+
+// Verify sized_by still works the same way (always allowed, no warning)
+struct test_sized_by_void_ptr {
+  int size;
+  void* buf __sized_by(size);  // OK in both modes, no warning
+};
+
+//==============================================================================
+// Test: counted_by_or_null on void* behaves the same
+//==============================================================================
+
+struct test_void_ptr_or_null_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by_or_null' on a pointer to void is a GNU extension, treated as 'sized_by_or_null'}}
+  // expected-strict-error@+1{{'counted_by_or_null' cannot be applied to a pointer with pointee of unknown size because 'void' is an incomplete type}}
+  void* buf __counted_by_or_null(count);
+};
+
+struct test_const_void_ptr_or_null_gnu {
+  int count;
+  // expected-warn-warning@+2{{'counted_by_or_null' on a pointer to void is a GNU extension, treated as 'sized_by_or_null'}}
+  // expected-strict-error@+1{{'counted_by_or_null' cannot be applied to a pointer with pointee of unknown size because 'const void' is an incomplete type}}
+  const void* buf __counted_by_or_null(count);
+};

@delcypher delcypher added the clang:bounds-safety Issue/PR relating to the experimental -fbounds-safety feature in Clang label Oct 23, 2025
@ojhunt
Copy link
Contributor

ojhunt commented Oct 24, 2025

I think this particular gcc feature falls into "this is a hazard rather than a feature" - it feels like if people want to use a pointer as a byte array they should simply convert the type to an appropriate type.

@delcypher
Copy link
Contributor

delcypher commented Oct 24, 2025

I'm not huge fan of diverging the behavior of the __counted_by attribute because this divergence will probably be a long term thing that we will never get rid of in projects that start using these attributes. The correct solution here is for the source code to use __sized_by instead of __counted_by under the -fbound-safety language model.

That being said, as this is a GNU extension and presumably GCC already implemented this we probably need to allow this in order to be source compatible.

@delcypher
Copy link
Contributor

What happens with this?

struct T {
    int count;
    void* __counted_by(count) ptr;
};

void use(struct T* t) {
    t = 0;
}

Previously this was completely disallowed and when using the GNU language mode it will be. I don't think you're testing this.

@yeoul We probably need to check what happens when we lift this restriction in -fbounds-safety. I'm not sure our implementation will do the right thing.

// Exception: In GNU mode, void has an implicit size of 1 byte for pointer
// arithmetic. Therefore, counted_by on void* is allowed as a GNU extension
// and behaves equivalently to sized_by (treating the count as bytes).
bool IsVoidPtrInGNUMode = PointeeTy->isVoidType() && getLangOpts().GNUMode;
Copy link
Contributor

@delcypher delcypher Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rapidsna What do you think about changing this to

 bool IsVoidPtrInGNUMode = PointeeTy->isVoidType() && getLangOpts().GNUMode && !getLangOpts().BoundsSafety;

?

This would mean in the -fbounds-safety language model we would still disallow this. Or for consistency with -fno-bounds-safety do you think we should allow this language extension?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does -fbounds-safety need this? I'd like to be bringing -fbounds-safety incrementally to Linux...

Copy link
Contributor

@rapidsna rapidsna Oct 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @kees. From my perspective, allowing counted_by on void * without -fbounds-safety means we would have to allow it with -fbounds-safety as well (in GNU mode), since we want the same annotated headers to be consumable by the compiler with or without -fbounds-safety (except in some inevitable cases like inline function bodies in headers).

I think this is actually a reasonable compromise because treating void * as having a stride of 1 is a de facto standard in C.

However, in general, we cannot change the behavior of -fbounds-safety solely because the Linux community pushes back strongly. I think the issue is that they don't have the full context for -fbounds-safety, so decisions about individual attributes without that broader context can easily become opinionated (while this change isn't not unreasonable). I will be working to improve this situation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, this is just my opinion. This change may lead to more of a radical shift than I initially thought because it affects the default behavior of -fbounds-safety in Clang (since GNU mode is on by default). So this is something I should discuss with our core -fbounds-safety engineers before making any decisions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't add new code checking GNUMode. It doesn't indicate whether we're trying to be gcc-compatible; it indicates whether the user used -std=c11 vs -std=gnu11.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Detecting the -std was the intension here. I.e. that's what had been proposed for this feature. I'm open to alternatives, but this was what @rapidsna and I had discussed earlier.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason -std=c11 and -std=gnu11 both exist is for compatibility with old gcc extensions: very old versions of gcc were not careful about avoiding keywords/constructs that could overlap with future standardized keywords, and "gnu" mode enables those keywords. We shouldn't extend it beyond that.

If you think this construct likely indicates a bug, make it warn by default; users can decide if they want to fix their code, or just disable the warning.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've switched to only using -Wpointer-arith as the control for this now.

@kees
Copy link
Contributor Author

kees commented Oct 24, 2025

I think this particular gcc feature falls into "this is a hazard rather than a feature" - it feels like if people want to use a pointer as a byte array they should simply convert the type to an appropriate type.

This is exactly what does happen, but it's usually as a generic allocation space, and the type usage comes later via implicit casts. For counted_by to be usable in Linux, it needs sized_by behavior for void * members. Putting this behind the GNU extension which also covers void * arithmetic should be a reasonable compromise.

@kees
Copy link
Contributor Author

kees commented Oct 24, 2025

What happens with this?

struct T {
    int count;
    void* __counted_by(count) ptr;
};

void use(struct T* t) {
    t = 0;
}

Previously this was completely disallowed and when using the GNU language mode it will be. I don't think you're testing this.

@yeoul We probably need to check what happens when we lift this restriction in -fbounds-safety. I'm not sure our implementation will do the right thing.

I don't understand what you mean? This builds with gnu and not with C11. I have tests in this PR for it? I must be misunderstanding something.

$ clang -std=gnu11 -o wat.o -c wat.c
$ clang -std=c11 -o wat.o -c wat.c
wat.c:4:5: error: 'counted_by' cannot be applied to a pointer with pointee of
      unknown size because 'void' is an incomplete type
    4 |     void* __counted_by(count) ptr;
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

@hnrklssn
Copy link
Member

For counted_by to be usable in Linux, it needs sized_by behavior for void * members. Putting this behind the GNU extension which also covers void * arithmetic should be a reasonable compromise.

I didn't quite get the context for why this is the case - maybe I missed something. If a user can write:

struct with_counted_by_void {
  int count;
  void* buf __counted_by(count);
};

why would they not be able to instead write this?

struct with_sized_by_void {
  int size;
  void* buf __sized_by(size);
};

@kees
Copy link
Contributor Author

kees commented Oct 24, 2025

For counted_by to be usable in Linux, it needs sized_by behavior for void * members. Putting this behind the GNU extension which also covers void * arithmetic should be a reasonable compromise.

I didn't quite get the context for why this is the case - maybe I missed something. If a user can write:

struct with_counted_by_void {
  int count;
  void* buf __counted_by(count);
};

why would they not be able to instead write this?

struct with_sized_by_void {
  int size;
  void* buf __sized_by(size);
};

Please know that I'm not trying to be difficult with this reply; I'm just trying to state the reality of the situation: because they(we) just will not. The Linux developer community is extraordinarily opinionated about these kinds of things and from their(our) perspective, void * points to an incomplete type whose allocation size is measured in bytes and always has. And counted_by is seen as counting elements which makes sized_by redundant. They(we) do not want 2 different spellings for doing conceptually the same thing. (In the same linked thread you'll see I am even getting push back for a versioned macro for flexible arrays vs pointer members, which is needed to handle the compiler version skew between counted_by coverage of those two cases.)

Since I have no preprocessor way to interrogate the type of the member when applying an attribute, I have no way to make Linux's __counted_by macro hide the implementation details. So, to make this work for Linux and retain the intended strict behavior, we can control it with a toggle. And since void *ptr = foo + 1 is valid under the GNU extensions, that seems the clear toggle for it.

Alternatively, if there were some other attribute name (strawman: alloc_elements) that internally used counted_by for non-void * and used sized_by for void * members, I could trivially hide that behind the __counted_by macro. What is critically needed for Linux to adopt this is having a single spelling for "the number of elements of this pointer is counted by this other member". Or just convert counted_by into sized_by with no need for another spelling (i.e. I could nudge this patch slightly so that CountAttributedType::SizedBy gets set earlier in Sema and then CodeGenFunction::emitCountedByPointerSize would need no changes, etc.)

I just need to find a way to keep both camps happy: both have legitimate reasons for why they want things the way they do, so I just need a knob to select between them (without compromising either side's implementation details).

@kees
Copy link
Contributor Author

kees commented Oct 24, 2025

Or just convert counted_by into sized_by with no need for another spelling (i.e. I could nudge this patch slightly so that CountAttributedType::SizedBy gets set earlier in Sema and then CodeGenFunction::emitCountedByPointerSize would need no changes, etc.)

I.e. the only logic change (i.e. keep all the tests) needed would be this:

diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 964a2a791e18..c97ec9cea0c6 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -6577,6 +6577,20 @@ static void handleCountedByAttrField(Sema &S, Decl *D, const ParsedAttr &AL) {
     llvm_unreachable("unexpected counted_by family attribute");
   }
 
+  // In GNU mode, silently convert counted_by/counted_by_or_null on void*
+  // to sized_by/sized_by_or_null, since void has size 1 in GNU pointer
+  // arithmetic.
+  if (!CountInBytes && FD->getType()->isPointerType()) {
+    QualType PointeeTy = FD->getType()->getPointeeType();
+    if (PointeeTy->isVoidType() && S.getLangOpts().GNUMode) {
+      // Emit diagnostic before conversion
+      S.Diag(FD->getBeginLoc(), diag::ext_gnu_counted_by_void_ptr)
+          << (OrNull ? CountAttributedType::CountedByOrNull
+                     : CountAttributedType::CountedBy);
+      CountInBytes = true; // Convert to sized_by variant
+    }
+  }
+
   if (S.CheckCountedByAttrOnField(FD, CountExpr, CountInBytes, OrNull))
     return;
 

This is, perhaps, much cleaner?

@hnrklssn
Copy link
Member

hnrklssn commented Oct 24, 2025

Please know that I'm not trying to be difficult with this reply; I'm just trying to state the reality of the situation: because they(we) just will not. The Linux developer community is extraordinarily opinionated about these kinds of things and from their(our) perspective, void * points to an incomplete type whose allocation size is measured in bytes and always has. And counted_by is seen as counting elements which makes sized_by redundant. They(we) do not want 2 different spellings for doing conceptually the same thing. (In the same linked thread you'll see I am even getting push back for a versioned macro for flexible arrays vs pointer members, which is needed to handle the compiler version skew between counted_by coverage of those two cases.)

Interesting... I wonder what their reaction to __counted_by_or_null will be.
Unrelated: to solve your version skew issue, would it be out of the question to do something like this:

#if VERSION < SUPPORTS_COUNTED_BY_POINTER_MEMBERS
#define __counted_by(x)
#warning "__counted_by disabled - upgrade to version " #SUPPORTS_COUNTED_BY_POINTER_MEMBERS " for __counted_by support"
#else
#define __counted_by(x) __attribute__((__counted_by__(x)))
#endif

Since I have no preprocessor way to interrogate the type of the member when applying an attribute, I have no way to make Linux's __counted_by macro hide the implementation details. So, to make this work for Linux and retain the intended strict behavior, we can control it with a toggle. And since void *ptr = foo + 1 is valid under the GNU extensions, that seems the clear toggle for it.

Alternatively, if there were some other attribute name (strawman: alloc_elements) that internally used counted_by for non-void * and used sized_by for void * members, I could trivially hide that behind the __counted_by macro. What is critically needed for Linux to adopt this is having a single spelling for "the number of elements of this pointer is counted by this other member". Or just convert counted_by into sized_by with no need for another spelling (i.e. I could nudge this patch slightly so that CountAttributedType::SizedBy gets set earlier in Sema and then CodeGenFunction::emitCountedByPointerSize would need no changes, etc.)

I just need to find a way to keep both camps happy: both have legitimate reasons for why they want things the way they do, so I just need a knob to select between them (without compromising either side's implementation details).

I'm not personally opposed to allowing __counted_by on void* in GNU-mode, but I do think it's unfortunate that clang defaults to GNU-mode.

As an potential alternative I think you might be able to do something this for pointers (but not arrays):

#define __counted_by_ptr(t, x) t __attribute__((__sized_by(x * sizeof(t))))
void foo(int len, __counted_by_ptr(void *, len) p);

@hnrklssn
Copy link
Member

Or just convert counted_by into sized_by with no need for another spelling (i.e. I could nudge this patch slightly so that CountAttributedType::SizedBy gets set earlier in Sema and then CodeGenFunction::emitCountedByPointerSize would need no changes, etc.)

I.e. the only logic change (i.e. keep all the tests) needed would be this:

diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 964a2a791e18..c97ec9cea0c6 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -6577,6 +6577,20 @@ static void handleCountedByAttrField(Sema &S, Decl *D, const ParsedAttr &AL) {
     llvm_unreachable("unexpected counted_by family attribute");
   }
 
+  // In GNU mode, silently convert counted_by/counted_by_or_null on void*
+  // to sized_by/sized_by_or_null, since void has size 1 in GNU pointer
+  // arithmetic.
+  if (!CountInBytes && FD->getType()->isPointerType()) {
+    QualType PointeeTy = FD->getType()->getPointeeType();
+    if (PointeeTy->isVoidType() && S.getLangOpts().GNUMode) {
+      // Emit diagnostic before conversion
+      S.Diag(FD->getBeginLoc(), diag::ext_gnu_counted_by_void_ptr)
+          << (OrNull ? CountAttributedType::CountedByOrNull
+                     : CountAttributedType::CountedBy);
+      CountInBytes = true; // Convert to sized_by variant
+    }
+  }
+
   if (S.CheckCountedByAttrOnField(FD, CountExpr, CountInBytes, OrNull))
     return;
 

This is, perhaps, much cleaner?

That seems a lot easier to maintain! You may have to live with diagnostics saying __sized_by despite the code saying __counted_by though.

@kees
Copy link
Contributor Author

kees commented Oct 24, 2025

Weirdly, I think a mismatched diagnostic string would be much more well tolerated by the Linux dev community. :) That kind of thing I can document along with the __counted_by macro. If others prefer this route, I can update the PR to do this instead?

@rapidsna
Copy link
Contributor

Or just convert counted_by into sized_by with no need for another spelling (i.e. I could nudge this patch slightly so that CountAttributedType::SizedBy gets set earlier in Sema and then CodeGenFunction::emitCountedByPointerSize would need no changes, etc.)

I.e. the only logic change (i.e. keep all the tests) needed would be this:

diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index 964a2a791e18..c97ec9cea0c6 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -6577,6 +6577,20 @@ static void handleCountedByAttrField(Sema &S, Decl *D, const ParsedAttr &AL) {
     llvm_unreachable("unexpected counted_by family attribute");
   }
 
+  // In GNU mode, silently convert counted_by/counted_by_or_null on void*
+  // to sized_by/sized_by_or_null, since void has size 1 in GNU pointer
+  // arithmetic.
+  if (!CountInBytes && FD->getType()->isPointerType()) {
+    QualType PointeeTy = FD->getType()->getPointeeType();
+    if (PointeeTy->isVoidType() && S.getLangOpts().GNUMode) {
+      // Emit diagnostic before conversion
+      S.Diag(FD->getBeginLoc(), diag::ext_gnu_counted_by_void_ptr)
+          << (OrNull ? CountAttributedType::CountedByOrNull
+                     : CountAttributedType::CountedBy);
+      CountInBytes = true; // Convert to sized_by variant
+    }
+  }
+
   if (S.CheckCountedByAttrOnField(FD, CountExpr, CountInBytes, OrNull))
     return;
 

This is, perhaps, much cleaner?

That seems a lot easier to maintain! You may have to live with diagnostics saying __sized_by despite the code saying __counted_by though.

Yes, this would make things easier, as we wouldn't need to update the rest of our code that currently assumes __counted_by isn't used with void *.

However, even if we do that, we should still preserve the original attribute kind so the AST can recover the attribute as originally written (e.g., enabling TypePrinter to print it as counted_by rather than sized_by). This aligns with Clang AST's principle of "faithfulness"—the AST should faithfully represent the original source code so that generating source code from the AST can reproduce the original source https://clang.llvm.org/docs/InternalsManual.html#faithfulness. Maintaining the original attribute kind will also address the mismatched diagnostic issue.

That's some work, and keeping two attribute kinds might be error-prone, so it might actually be simpler to just keep the attribute as is and update some of our code relying on "counted_by != void *" assumption for -fbounds-safety instead. The rest of Clang already has code assuming the stride of void * is 1 byte, so this would be consistent with existing behavior.

@delcypher
Copy link
Contributor

What happens with this?

struct T {
    int count;
    void* __counted_by(count) ptr;
};

void use(struct T* t) {
    t = 0;
}

Previously this was completely disallowed and when using the GNU language mode it will be. I don't think you're testing this.
@yeoul We probably need to check what happens when we lift this restriction in -fbounds-safety. I'm not sure our implementation will do the right thing.

I don't understand what you mean? This builds with gnu and not with C11. I have tests in this PR for it? I must be misunderstanding something.

$ clang -std=gnu11 -o wat.o -c wat.c
$ clang -std=c11 -o wat.o -c wat.c
wat.c:4:5: error: 'counted_by' cannot be applied to a pointer with pointee of
      unknown size because 'void' is an incomplete type
    4 |     void* __counted_by(count) ptr;
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

Sorry I should've been clearer. What I'm looking for is a Sema test that checks that the using a void* __counted_by(...) pointer works, not just that declaring it works. The change you've made in Sema::CheckCountedByAttrOnField basically means that a void* __counted_by pointer is treated like we treat incomplete types that might be completed later (e.g. struct Foo;). Those kinds of pointer are checked when they are used (e.g. as an rvalue or assigned to).

The clang/test/Sema/attr-counted-by-void-ptr-gnu.c test you added doesn't test this and none of the other Sema tests check this AFAICT. You do have a codegen test that checks some of this but it only tests use as an rvalue and not assignment to the pointer. Can you add a sema test that checks the behavior both in gnu11 and c11?

@kees
Copy link
Contributor Author

kees commented Oct 30, 2025

Given the preference to keep the diagnostic matching the usage, I'll stick with the originally proposed code changes. I've pushed fixes for the other review comments.

Sorry I should've been clearer. What I'm looking for is a Sema test that checks that the using a void* __counted_by(...) pointer works, not just that declaring it works

Okay, I've added this now, if I'm understanding what you're asking for. I can't test for -std=c11 since it won't accept it. But I think you mean "with -fbounds-safety and GNU mode", which is what I've added.

@kees kees requested a review from delcypher October 30, 2025 02:23
@kees kees requested a review from hnrklssn October 30, 2025 05:08
@kees kees changed the title [Clang][Sema] Allow counted_by on void* in GNU mode [Clang][Sema] Allow counted_by on void* as GNU extension Nov 4, 2025
@kees kees requested a review from efriedma-quic November 4, 2025 00:47
@kees
Copy link
Contributor Author

kees commented Nov 4, 2025

FYI, GCC has landed their corresponding change to support void * with counted_by: gcc-mirror/gcc@6951e95

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:bounds-safety Issue/PR relating to the experimental -fbounds-safety feature in Clang clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants