Skip to content

Conversation

@jhuber6
Copy link
Contributor

@jhuber6 jhuber6 commented Sep 5, 2025

Summary:
The added bit counting builtins for vectors used cttz and ctlz,
which is consistent with the LLVM naming convention. However, these are
clang builtins and implement exactly the __builtin_ctzg and
__builtin_clzg behavior. It is confusing to people familiar with other
other builtins that these are the only bit counting intrinsics named
differently. This includes the additional operation for the undefined
zero case, which was added as a clzg extension.

Summary:
The added bit counting builtins for vectors used `cttz` and `ctlz`,
which is consistent with the LLVM naming convention. However, these are
clang builtins and implement exactly the `__builtin_ctzg` and
`__builtin_clzg` behavior. It is confusing to people familiar with other
other builtins that these are the only bit counting intrinsics named
differently. This includes the additional operation for the undefined
zero case, which was added as a `clzg` extension.
@llvmbot llvmbot added clang Clang issues not falling into any other category backend:X86 clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:headers Headers provided by Clang, e.g. for intrinsics clang:codegen IR generation bugs: mangling, exceptions, etc. clang:bytecode Issues for the clang bytecode constexpr interpreter labels Sep 5, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 5, 2025

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-clang

Author: Joseph Huber (jhuber6)

Changes

Summary:
The added bit counting builtins for vectors used cttz and ctlz,
which is consistent with the LLVM naming convention. However, these are
clang builtins and implement exactly the __builtin_ctzg and
__builtin_clzg behavior. It is confusing to people familiar with other
other builtins that these are the only bit counting intrinsics named
differently. This includes the additional operation for the undefined
zero case, which was added as a clzg extension.


Patch is 30.49 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/157128.diff

12 Files Affected:

  • (modified) clang/docs/LanguageExtensions.rst (+3-3)
  • (modified) clang/include/clang/Basic/Builtins.td (+2-2)
  • (modified) clang/include/clang/Basic/DiagnosticASTKinds.td (+1-1)
  • (modified) clang/lib/AST/ByteCode/InterpBuiltin.cpp (+4-4)
  • (modified) clang/lib/AST/ExprConstant.cpp (+11-11)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+6-6)
  • (modified) clang/lib/Headers/avx512cdintrin.h (+2-2)
  • (modified) clang/lib/Headers/avx512vlcdintrin.h (+4-4)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+2-2)
  • (modified) clang/test/CodeGen/builtins-elementwise-math.c (+20-20)
  • (modified) clang/test/Sema/builtins-elementwise-math.c (+14-14)
  • (modified) clang/test/Sema/constant-builtins-vector.cpp (+37-37)
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index ad190eace5b05..a150b1c73bc92 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -772,7 +772,7 @@ The elementwise intrinsics ``__builtin_elementwise_popcount``,
 ``__builtin_elementwise_bitreverse``, ``__builtin_elementwise_add_sat``,
 ``__builtin_elementwise_sub_sat``, ``__builtin_elementwise_max``,
 ``__builtin_elementwise_min``, ``__builtin_elementwise_abs``,
-``__builtin_elementwise_ctlz``, ``__builtin_elementwise_cttz``, and
+``__builtin_elementwise_clzg``, ``__builtin_elementwise_ctzg``, and
 ``__builtin_elementwise_fma`` can be called in a ``constexpr`` context.
 
 No implicit promotion of integer types takes place. The mixing of integer types
@@ -884,11 +884,11 @@ T __builtin_elementwise_fshr(T x, T y, T z)     perform a funnel shift right. Co
                                                 right by z (modulo the bit width of the original arguments),
                                                 and the least significant bits are extracted to produce
                                                 a result that is the same size as the original arguments.
- T __builtin_elementwise_ctlz(T x[, T y])       return the number of leading 0 bits in the first argument. If          integer types
+ T __builtin_elementwise_clzg(T x[, T y])       return the number of leading 0 bits in the first argument. If          integer types
                                                 the first argument is 0 and an optional second argument is provided,
                                                 the second argument is returned. It is undefined behaviour if the
                                                 first argument is 0 and no second argument is provided.
- T __builtin_elementwise_cttz(T x[, T y])       return the number of trailing 0 bits in the first argument. If         integer types
+ T __builtin_elementwise_ctzg(T x[, T y])       return the number of trailing 0 bits in the first argument. If         integer types
                                                 the first argument is 0 and an optional second argument is provided,
                                                 the second argument is returned. It is undefined behaviour if the
                                                 first argument is 0 and no second argument is provided.
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 27fc6f008d743..1111cfacb8559 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1551,13 +1551,13 @@ def ElementwiseFshr : Builtin {
 }
 
 def ElementwiseCtlz : Builtin {
-  let Spellings = ["__builtin_elementwise_ctlz"];
+  let Spellings = ["__builtin_elementwise_clzg"];
   let Attributes = [NoThrow, Const, CustomTypeChecking, Constexpr];
   let Prototype = "void(...)";
 }
 
 def ElementwiseCttz : Builtin {
-  let Spellings = ["__builtin_elementwise_cttz"];
+  let Spellings = ["__builtin_elementwise_ctzg"];
   let Attributes = [NoThrow, Const, CustomTypeChecking, Constexpr];
   let Prototype = "void(...)";
 }
diff --git a/clang/include/clang/Basic/DiagnosticASTKinds.td b/clang/include/clang/Basic/DiagnosticASTKinds.td
index a63bd80b89657..0be9146f70364 100644
--- a/clang/include/clang/Basic/DiagnosticASTKinds.td
+++ b/clang/include/clang/Basic/DiagnosticASTKinds.td
@@ -401,7 +401,7 @@ def note_constexpr_non_const_vectorelements : Note<
 def note_constexpr_assumption_failed : Note<
   "assumption evaluated to false">;
 def note_constexpr_countzeroes_zero : Note<
-  "evaluation of %select{__builtin_elementwise_ctlz|__builtin_elementwise_cttz}0 "
+  "evaluation of %select{__builtin_elementwise_clzg|__builtin_elementwise_ctzg}0 "
   "with a zero value is undefined">;
 def err_experimental_clang_interp_failed : Error<
   "the experimental clang interpreter failed to evaluate an expression">;
diff --git a/clang/lib/AST/ByteCode/InterpBuiltin.cpp b/clang/lib/AST/ByteCode/InterpBuiltin.cpp
index ff6ef5a1f6864..57533db93d816 100644
--- a/clang/lib/AST/ByteCode/InterpBuiltin.cpp
+++ b/clang/lib/AST/ByteCode/InterpBuiltin.cpp
@@ -1831,7 +1831,7 @@ static bool interp__builtin_elementwise_countzeroes(InterpState &S,
                                                     const CallExpr *Call,
                                                     unsigned BuiltinID) {
   const bool HasZeroArg = Call->getNumArgs() == 2;
-  const bool IsCTTZ = BuiltinID == Builtin::BI__builtin_elementwise_cttz;
+  const bool IsCTTZ = BuiltinID == Builtin::BI__builtin_elementwise_ctzg;
   assert(Call->getNumArgs() == 1 || HasZeroArg);
   if (Call->getArg(0)->getType()->isIntegerType()) {
     PrimType ArgT = *S.getContext().classify(Call->getArg(0)->getType());
@@ -1855,7 +1855,7 @@ static bool interp__builtin_elementwise_countzeroes(InterpState &S,
       return false;
     }
 
-    if (BuiltinID == Builtin::BI__builtin_elementwise_ctlz) {
+    if (BuiltinID == Builtin::BI__builtin_elementwise_clzg) {
       pushInteger(S, Val.countLeadingZeros(), Call->getType());
     } else {
       pushInteger(S, Val.countTrailingZeros(), Call->getType());
@@ -3164,8 +3164,8 @@ bool InterpretBuiltin(InterpState &S, CodePtr OpPC, const CallExpr *Call,
   case Builtin::BI__builtin_ctzg:
     return interp__builtin_ctz(S, OpPC, Frame, Call, BuiltinID);
 
-  case Builtin::BI__builtin_elementwise_ctlz:
-  case Builtin::BI__builtin_elementwise_cttz:
+  case Builtin::BI__builtin_elementwise_clzg:
+  case Builtin::BI__builtin_elementwise_ctzg:
     return interp__builtin_elementwise_countzeroes(S, OpPC, Frame, Call,
                                                    BuiltinID);
 
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 6c6909e5b2370..0479aeecb1a5e 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -12039,8 +12039,8 @@ bool VectorExprEvaluator::VisitCallExpr(const CallExpr *E) {
 
     return Success(APValue(ResultElements.data(), ResultElements.size()), E);
   }
-  case Builtin::BI__builtin_elementwise_ctlz:
-  case Builtin::BI__builtin_elementwise_cttz: {
+  case Builtin::BI__builtin_elementwise_clzg:
+  case Builtin::BI__builtin_elementwise_ctzg: {
     APValue SourceLHS;
     std::optional<APValue> Fallback;
     if (!EvaluateAsRValue(Info, E->getArg(0), SourceLHS))
@@ -12064,19 +12064,19 @@ bool VectorExprEvaluator::VisitCallExpr(const CallExpr *E) {
         if (!Fallback) {
           Info.FFDiag(E, diag::note_constexpr_countzeroes_zero)
               << /*IsTrailing=*/(E->getBuiltinCallee() ==
-                                 Builtin::BI__builtin_elementwise_cttz);
+                                 Builtin::BI__builtin_elementwise_ctzg);
           return false;
         }
         ResultElements.push_back(Fallback->getVectorElt(EltNum));
         continue;
       }
       switch (E->getBuiltinCallee()) {
-      case Builtin::BI__builtin_elementwise_ctlz:
+      case Builtin::BI__builtin_elementwise_clzg:
         ResultElements.push_back(APValue(
             APSInt(APInt(Info.Ctx.getIntWidth(DestEltTy), LHS.countl_zero()),
                    DestEltTy->isUnsignedIntegerOrEnumerationType())));
         break;
-      case Builtin::BI__builtin_elementwise_cttz:
+      case Builtin::BI__builtin_elementwise_ctzg:
         ResultElements.push_back(APValue(
             APSInt(APInt(Info.Ctx.getIntWidth(DestEltTy), LHS.countr_zero()),
                    DestEltTy->isUnsignedIntegerOrEnumerationType())));
@@ -13694,7 +13694,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
   case Builtin::BI__builtin_clzll:
   case Builtin::BI__builtin_clzs:
   case Builtin::BI__builtin_clzg:
-  case Builtin::BI__builtin_elementwise_ctlz:
+  case Builtin::BI__builtin_elementwise_clzg:
   case Builtin::BI__lzcnt16: // Microsoft variants of count leading-zeroes
   case Builtin::BI__lzcnt:
   case Builtin::BI__lzcnt64: {
@@ -13710,7 +13710,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
 
     std::optional<APSInt> Fallback;
     if ((BuiltinOp == Builtin::BI__builtin_clzg ||
-         BuiltinOp == Builtin::BI__builtin_elementwise_ctlz) &&
+         BuiltinOp == Builtin::BI__builtin_elementwise_clzg) &&
         E->getNumArgs() > 1) {
       APSInt FallbackTemp;
       if (!EvaluateInteger(E->getArg(1), FallbackTemp, Info))
@@ -13729,7 +13729,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
                              BuiltinOp != Builtin::BI__lzcnt &&
                              BuiltinOp != Builtin::BI__lzcnt64;
 
-      if (BuiltinOp == Builtin::BI__builtin_elementwise_ctlz) {
+      if (BuiltinOp == Builtin::BI__builtin_elementwise_clzg) {
         Info.FFDiag(E, diag::note_constexpr_countzeroes_zero)
             << /*IsTrailing=*/false;
       }
@@ -13789,7 +13789,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
   case Builtin::BI__builtin_ctzll:
   case Builtin::BI__builtin_ctzs:
   case Builtin::BI__builtin_ctzg:
-  case Builtin::BI__builtin_elementwise_cttz: {
+  case Builtin::BI__builtin_elementwise_ctzg: {
     APSInt Val;
     if (E->getArg(0)->getType()->isExtVectorBoolType()) {
       APValue Vec;
@@ -13802,7 +13802,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
 
     std::optional<APSInt> Fallback;
     if ((BuiltinOp == Builtin::BI__builtin_ctzg ||
-         BuiltinOp == Builtin::BI__builtin_elementwise_cttz) &&
+         BuiltinOp == Builtin::BI__builtin_elementwise_ctzg) &&
         E->getNumArgs() > 1) {
       APSInt FallbackTemp;
       if (!EvaluateInteger(E->getArg(1), FallbackTemp, Info))
@@ -13814,7 +13814,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
       if (Fallback)
         return Success(*Fallback, E);
 
-      if (BuiltinOp == Builtin::BI__builtin_elementwise_cttz) {
+      if (BuiltinOp == Builtin::BI__builtin_elementwise_ctzg) {
         Info.FFDiag(E, diag::note_constexpr_countzeroes_zero)
             << /*IsTrailing=*/true;
       }
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 172a521e63c17..c9c98e16fab43 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3343,10 +3343,10 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
   case Builtin::BI__builtin_ctzl:
   case Builtin::BI__builtin_ctzll:
   case Builtin::BI__builtin_ctzg:
-  case Builtin::BI__builtin_elementwise_cttz: {
+  case Builtin::BI__builtin_elementwise_ctzg: {
     bool HasFallback =
         (BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_ctzg ||
-         BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_cttz) &&
+         BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_ctzg) &&
         E->getNumArgs() > 1;
 
     Value *ArgValue =
@@ -3360,7 +3360,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     // The elementwise builtins always exhibit zero-is-undef behaviour
     Value *ZeroUndef = Builder.getInt1(
         HasFallback || getTarget().isCLZForZeroUndef() ||
-        BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_cttz);
+        BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_ctzg);
     Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef});
     if (Result->getType() != ResultType)
       Result =
@@ -3380,10 +3380,10 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
   case Builtin::BI__builtin_clzl:
   case Builtin::BI__builtin_clzll:
   case Builtin::BI__builtin_clzg:
-  case Builtin::BI__builtin_elementwise_ctlz: {
+  case Builtin::BI__builtin_elementwise_clzg: {
     bool HasFallback =
         (BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_clzg ||
-         BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_ctlz) &&
+         BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_clzg) &&
         E->getNumArgs() > 1;
 
     Value *ArgValue =
@@ -3397,7 +3397,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     // The elementwise builtins always exhibit zero-is-undef behaviour
     Value *ZeroUndef = Builder.getInt1(
         HasFallback || getTarget().isCLZForZeroUndef() ||
-        BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_ctlz);
+        BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_clzg);
     Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef});
     if (Result->getType() != ResultType)
       Result =
diff --git a/clang/lib/Headers/avx512cdintrin.h b/clang/lib/Headers/avx512cdintrin.h
index 39e76711ca7b3..6d572daaff11e 100644
--- a/clang/lib/Headers/avx512cdintrin.h
+++ b/clang/lib/Headers/avx512cdintrin.h
@@ -71,7 +71,7 @@ _mm512_maskz_conflict_epi32 (__mmask16 __U, __m512i __A)
 
 static __inline__ __m512i __DEFAULT_FN_ATTRS_CONSTEXPR
 _mm512_lzcnt_epi32(__m512i __A) {
-  return (__m512i)__builtin_elementwise_ctlz((__v16si)__A,
+  return (__m512i)__builtin_elementwise_clzg((__v16si)__A,
                                              (__v16si)_mm512_set1_epi32(32));
 }
 
@@ -91,7 +91,7 @@ _mm512_maskz_lzcnt_epi32(__mmask16 __U, __m512i __A) {
 
 static __inline__ __m512i __DEFAULT_FN_ATTRS_CONSTEXPR
 _mm512_lzcnt_epi64(__m512i __A) {
-  return (__m512i)__builtin_elementwise_ctlz(
+  return (__m512i)__builtin_elementwise_clzg(
       (__v8di)__A, (__v8di)_mm512_set1_epi64((long long)64));
 }
 
diff --git a/clang/lib/Headers/avx512vlcdintrin.h b/clang/lib/Headers/avx512vlcdintrin.h
index 8f42675ba9b5d..62656f794ce2c 100644
--- a/clang/lib/Headers/avx512vlcdintrin.h
+++ b/clang/lib/Headers/avx512vlcdintrin.h
@@ -146,7 +146,7 @@ _mm256_maskz_conflict_epi32 (__mmask8 __U, __m256i __A)
 
 static __inline__ __m128i __DEFAULT_FN_ATTRS128_CONSTEXPR
 _mm_lzcnt_epi32(__m128i __A) {
-  return (__m128i)__builtin_elementwise_ctlz((__v4si)__A,
+  return (__m128i)__builtin_elementwise_clzg((__v4si)__A,
                                              (__v4si)_mm_set1_epi32(32));
 }
 
@@ -166,7 +166,7 @@ _mm_maskz_lzcnt_epi32(__mmask8 __U, __m128i __A) {
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256_CONSTEXPR
 _mm256_lzcnt_epi32(__m256i __A) {
-  return (__m256i)__builtin_elementwise_ctlz((__v8si)__A,
+  return (__m256i)__builtin_elementwise_clzg((__v8si)__A,
                                              (__v8si)_mm256_set1_epi32(32));
 }
 
@@ -186,7 +186,7 @@ _mm256_maskz_lzcnt_epi32(__mmask8 __U, __m256i __A) {
 
 static __inline__ __m128i __DEFAULT_FN_ATTRS128_CONSTEXPR
 _mm_lzcnt_epi64(__m128i __A) {
-  return (__m128i)__builtin_elementwise_ctlz(
+  return (__m128i)__builtin_elementwise_clzg(
       (__v2di)__A, (__v2di)_mm_set1_epi64x((long long)64));
 }
 
@@ -206,7 +206,7 @@ _mm_maskz_lzcnt_epi64(__mmask8 __U, __m128i __A) {
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256_CONSTEXPR
 _mm256_lzcnt_epi64(__m256i __A) {
-  return (__m256i)__builtin_elementwise_ctlz(
+  return (__m256i)__builtin_elementwise_clzg(
       (__v4di)__A, (__v4di)_mm256_set1_epi64x((long long)64));
 }
 
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 077f4311ed729..dc1d06894d48e 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -3181,8 +3181,8 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
     TheCall->setType(Magnitude.get()->getType());
     break;
   }
-  case Builtin::BI__builtin_elementwise_ctlz:
-  case Builtin::BI__builtin_elementwise_cttz:
+  case Builtin::BI__builtin_elementwise_clzg:
+  case Builtin::BI__builtin_elementwise_ctzg:
     // These builtins can be unary or binary. Note for empty calls we call the
     // unary checker in order to not emit an error that says the function
     // expects 2 arguments, which would be misleading.
diff --git a/clang/test/CodeGen/builtins-elementwise-math.c b/clang/test/CodeGen/builtins-elementwise-math.c
index 188a6c3a30f0e..e9344d8fe0b8b 100644
--- a/clang/test/CodeGen/builtins-elementwise-math.c
+++ b/clang/test/CodeGen/builtins-elementwise-math.c
@@ -1266,98 +1266,98 @@ void test_builtin_elementwise_fshl(long long int i1, long long int i2,
   u4 tmp_vu_r = __builtin_elementwise_fshr(vu1, vu2, vu3);
 }
 
-void test_builtin_elementwise_ctlz(si8 vs1, si8 vs2, u4 vu1,
+void test_builtin_elementwise_clzg(si8 vs1, si8 vs2, u4 vu1,
                                    long long int lli, short si,
                                    _BitInt(31) bi, int i,
                                    char ci) {
   // CHECK:      [[V8S1:%.+]] = load <8 x i16>, ptr %vs1.addr
   // CHECK-NEXT: call <8 x i16> @llvm.ctlz.v8i16(<8 x i16> [[V8S1]], i1 true)
-  vs1 = __builtin_elementwise_ctlz(vs1);
+  vs1 = __builtin_elementwise_clzg(vs1);
 
   // CHECK:      [[V8S1:%.+]] = load <8 x i16>, ptr %vs1.addr
   // CHECK-NEXT: [[CLZ:%.+]] = call <8 x i16> @llvm.ctlz.v8i16(<8 x i16> [[V8S1]], i1 true)
   // CHECK-NEXT: [[ISZERO:%.+]] = icmp eq <8 x i16> [[V8S1]], zeroinitializer
   // CHECK-NEXT: [[V8S2:%.+]] = load <8 x i16>, ptr %vs2.addr
   // select <8 x i1> [[ISZERO]], <8 x i16> [[CLZ]], <8 x i16> [[V8S2]]
-  vs1 = __builtin_elementwise_ctlz(vs1, vs2);
+  vs1 = __builtin_elementwise_clzg(vs1, vs2);
 
   // CHECK:      [[V4U1:%.+]] = load <4 x i32>, ptr %vu1.addr
   // CHECK-NEXT: call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> [[V4U1]], i1 true)
-  vu1 = __builtin_elementwise_ctlz(vu1);
+  vu1 = __builtin_elementwise_clzg(vu1);
 
   // CHECK:      [[LLI:%.+]] = load i64, ptr %lli.addr
   // CHECK-NEXT: call i64 @llvm.ctlz.i64(i64 [[LLI]], i1 true)
-  lli = __builtin_elementwise_ctlz(lli);
+  lli = __builtin_elementwise_clzg(lli);
 
   // CHECK:      [[SI:%.+]] = load i16, ptr %si.addr
   // CHECK-NEXT: call i16 @llvm.ctlz.i16(i16 [[SI]], i1 true)
-  si = __builtin_elementwise_ctlz(si);
+  si = __builtin_elementwise_clzg(si);
 
   // CHECK:      [[BI1:%.+]] = load i32, ptr %bi.addr
   // CHECK-NEXT: [[BI2:%.+]] = trunc i32 [[BI1]] to i31
   // CHECK-NEXT: call i31 @llvm.ctlz.i31(i31 [[BI2]], i1 true)
-  bi = __builtin_elementwise_ctlz(bi);
+  bi = __builtin_elementwise_clzg(bi);
 
   // CHECK:      [[BI1:%.+]] = load i32, ptr %bi.addr
   // CHECK-NEXT: [[BI2:%.+]] = trunc i32 [[BI1]] to i31
   // CHECK-NEXT: [[CLZ:%.+]] = call i31 @llvm.ctlz.i31(i31 [[BI2]], i1 true)
   // CHECK-NEXT: [[ISZERO:%.+]] = icmp eq i31 [[BI2]], 0
   // CHECK-NEXT: select i1 [[ISZERO]], i31 1, i31 [[CLZ]]
-  bi = __builtin_elementwise_ctlz(bi, (_BitInt(31))1);
+  bi = __builtin_elementwise_clzg(bi, (_BitInt(31))1);
 
   // CHECK:      [[I:%.+]] = load i32, ptr %i.addr
   // CHECK-NEXT: call i32 @llvm.ctlz.i32(i32 [[I]], i1 true)
-  i = __builtin_elementwise_ctlz(i);
+  i = __builtin_elementwise_clzg(i);
 
   // CHECK:      [[CI:%.+]] = load i8, ptr %ci.addr
   // CHECK-NEXT: call i8 @llvm.ctlz.i8(i8 [[CI]], i1 true)
-  ci = __builtin_elementwise_ctlz(ci);
+  ci = __builtin_elementwise_clzg(ci);
 }
 
-void test_builtin_elementwise_cttz(si8 vs1, si8 vs2, u4 vu1,
+void test_builtin_elementwise_ctzg(si8 vs1, si8 vs2, u4 vu1,
                                    long long int lli, short si,
                                    _BitInt(31) bi, int i,
                                    char ci) {
   // CHECK:      [[V8S1:%.+]] = load <8 x i16>, ptr %vs1.addr
   // CHECK-NEXT: call <8 x i16> @llvm.cttz.v8i16(<8 x i16> [[V8S1]], i1 true)
-  vs1 = __builtin_elementwise_cttz(vs1);
+  vs1 = __builtin_elementwise_ctzg(vs1);
 
   // CHECK:      [[V8S1:%.+]] = load <8 x i16>, ptr %vs1.addr
   // CHECK-NEXT: [[ctz:%.+]] = call <8 x i16> @llvm.cttz.v8i16(<8 x i16> [[V8S1]], i1 true)
   // CHECK-NEXT: [[ISZERO:%.+]] = icmp eq <8 x i16> [[V8S1]], zeroinitializer
   // CHECK-NEXT: [[V8S2:%.+]] = load <8 x i16>, ptr %vs2.addr
   // select <8 x i1> [[ISZERO]], <8 x i16> [[ctz]], <8 x i16> [[V8S2]]
-  vs1 = __builtin_elementwise_cttz(vs1, vs2);
+  vs1 = __builtin_elementwise_ctzg(vs1, vs2);
 
   // CHECK:      [[V4U1:%.+]] = load <4 x i32>, ptr %vu1.addr
   // CHECK-NEXT: call <4 x i32> @llvm.cttz.v4i32(<4 x i32> [[V4U1]], i1 true)
-  vu1 = __builtin_elementwise_cttz(vu1);
+  vu1 = __builtin_elementwise_ctzg(vu1);
 
   // CHECK:      [[LLI:%.+]] = load i64, ptr %lli.addr
   // CHECK-NEXT: call i64 @llvm.cttz.i64(i64 [[LLI]], i1 true)
-  lli = __builtin_elementwise_cttz(lli);
+  lli = __builtin_elementwise_ctzg(lli);
 
   // CHECK:      [[SI:%.+]] = load i16, ptr %si.addr
   /...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Sep 5, 2025

@llvm/pr-subscribers-clang-codegen

Author: Joseph Huber (jhuber6)

Changes

Summary:
The added bit counting builtins for vectors used cttz and ctlz,
which is consistent with the LLVM naming convention. However, these are
clang builtins and implement exactly the __builtin_ctzg and
__builtin_clzg behavior. It is confusing to people familiar with other
other builtins that these are the only bit counting intrinsics named
differently. This includes the additional operation for the undefined
zero case, which was added as a clzg extension.


Patch is 30.49 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/157128.diff

12 Files Affected:

  • (modified) clang/docs/LanguageExtensions.rst (+3-3)
  • (modified) clang/include/clang/Basic/Builtins.td (+2-2)
  • (modified) clang/include/clang/Basic/DiagnosticASTKinds.td (+1-1)
  • (modified) clang/lib/AST/ByteCode/InterpBuiltin.cpp (+4-4)
  • (modified) clang/lib/AST/ExprConstant.cpp (+11-11)
  • (modified) clang/lib/CodeGen/CGBuiltin.cpp (+6-6)
  • (modified) clang/lib/Headers/avx512cdintrin.h (+2-2)
  • (modified) clang/lib/Headers/avx512vlcdintrin.h (+4-4)
  • (modified) clang/lib/Sema/SemaChecking.cpp (+2-2)
  • (modified) clang/test/CodeGen/builtins-elementwise-math.c (+20-20)
  • (modified) clang/test/Sema/builtins-elementwise-math.c (+14-14)
  • (modified) clang/test/Sema/constant-builtins-vector.cpp (+37-37)
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index ad190eace5b05..a150b1c73bc92 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -772,7 +772,7 @@ The elementwise intrinsics ``__builtin_elementwise_popcount``,
 ``__builtin_elementwise_bitreverse``, ``__builtin_elementwise_add_sat``,
 ``__builtin_elementwise_sub_sat``, ``__builtin_elementwise_max``,
 ``__builtin_elementwise_min``, ``__builtin_elementwise_abs``,
-``__builtin_elementwise_ctlz``, ``__builtin_elementwise_cttz``, and
+``__builtin_elementwise_clzg``, ``__builtin_elementwise_ctzg``, and
 ``__builtin_elementwise_fma`` can be called in a ``constexpr`` context.
 
 No implicit promotion of integer types takes place. The mixing of integer types
@@ -884,11 +884,11 @@ T __builtin_elementwise_fshr(T x, T y, T z)     perform a funnel shift right. Co
                                                 right by z (modulo the bit width of the original arguments),
                                                 and the least significant bits are extracted to produce
                                                 a result that is the same size as the original arguments.
- T __builtin_elementwise_ctlz(T x[, T y])       return the number of leading 0 bits in the first argument. If          integer types
+ T __builtin_elementwise_clzg(T x[, T y])       return the number of leading 0 bits in the first argument. If          integer types
                                                 the first argument is 0 and an optional second argument is provided,
                                                 the second argument is returned. It is undefined behaviour if the
                                                 first argument is 0 and no second argument is provided.
- T __builtin_elementwise_cttz(T x[, T y])       return the number of trailing 0 bits in the first argument. If         integer types
+ T __builtin_elementwise_ctzg(T x[, T y])       return the number of trailing 0 bits in the first argument. If         integer types
                                                 the first argument is 0 and an optional second argument is provided,
                                                 the second argument is returned. It is undefined behaviour if the
                                                 first argument is 0 and no second argument is provided.
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 27fc6f008d743..1111cfacb8559 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -1551,13 +1551,13 @@ def ElementwiseFshr : Builtin {
 }
 
 def ElementwiseCtlz : Builtin {
-  let Spellings = ["__builtin_elementwise_ctlz"];
+  let Spellings = ["__builtin_elementwise_clzg"];
   let Attributes = [NoThrow, Const, CustomTypeChecking, Constexpr];
   let Prototype = "void(...)";
 }
 
 def ElementwiseCttz : Builtin {
-  let Spellings = ["__builtin_elementwise_cttz"];
+  let Spellings = ["__builtin_elementwise_ctzg"];
   let Attributes = [NoThrow, Const, CustomTypeChecking, Constexpr];
   let Prototype = "void(...)";
 }
diff --git a/clang/include/clang/Basic/DiagnosticASTKinds.td b/clang/include/clang/Basic/DiagnosticASTKinds.td
index a63bd80b89657..0be9146f70364 100644
--- a/clang/include/clang/Basic/DiagnosticASTKinds.td
+++ b/clang/include/clang/Basic/DiagnosticASTKinds.td
@@ -401,7 +401,7 @@ def note_constexpr_non_const_vectorelements : Note<
 def note_constexpr_assumption_failed : Note<
   "assumption evaluated to false">;
 def note_constexpr_countzeroes_zero : Note<
-  "evaluation of %select{__builtin_elementwise_ctlz|__builtin_elementwise_cttz}0 "
+  "evaluation of %select{__builtin_elementwise_clzg|__builtin_elementwise_ctzg}0 "
   "with a zero value is undefined">;
 def err_experimental_clang_interp_failed : Error<
   "the experimental clang interpreter failed to evaluate an expression">;
diff --git a/clang/lib/AST/ByteCode/InterpBuiltin.cpp b/clang/lib/AST/ByteCode/InterpBuiltin.cpp
index ff6ef5a1f6864..57533db93d816 100644
--- a/clang/lib/AST/ByteCode/InterpBuiltin.cpp
+++ b/clang/lib/AST/ByteCode/InterpBuiltin.cpp
@@ -1831,7 +1831,7 @@ static bool interp__builtin_elementwise_countzeroes(InterpState &S,
                                                     const CallExpr *Call,
                                                     unsigned BuiltinID) {
   const bool HasZeroArg = Call->getNumArgs() == 2;
-  const bool IsCTTZ = BuiltinID == Builtin::BI__builtin_elementwise_cttz;
+  const bool IsCTTZ = BuiltinID == Builtin::BI__builtin_elementwise_ctzg;
   assert(Call->getNumArgs() == 1 || HasZeroArg);
   if (Call->getArg(0)->getType()->isIntegerType()) {
     PrimType ArgT = *S.getContext().classify(Call->getArg(0)->getType());
@@ -1855,7 +1855,7 @@ static bool interp__builtin_elementwise_countzeroes(InterpState &S,
       return false;
     }
 
-    if (BuiltinID == Builtin::BI__builtin_elementwise_ctlz) {
+    if (BuiltinID == Builtin::BI__builtin_elementwise_clzg) {
       pushInteger(S, Val.countLeadingZeros(), Call->getType());
     } else {
       pushInteger(S, Val.countTrailingZeros(), Call->getType());
@@ -3164,8 +3164,8 @@ bool InterpretBuiltin(InterpState &S, CodePtr OpPC, const CallExpr *Call,
   case Builtin::BI__builtin_ctzg:
     return interp__builtin_ctz(S, OpPC, Frame, Call, BuiltinID);
 
-  case Builtin::BI__builtin_elementwise_ctlz:
-  case Builtin::BI__builtin_elementwise_cttz:
+  case Builtin::BI__builtin_elementwise_clzg:
+  case Builtin::BI__builtin_elementwise_ctzg:
     return interp__builtin_elementwise_countzeroes(S, OpPC, Frame, Call,
                                                    BuiltinID);
 
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 6c6909e5b2370..0479aeecb1a5e 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -12039,8 +12039,8 @@ bool VectorExprEvaluator::VisitCallExpr(const CallExpr *E) {
 
     return Success(APValue(ResultElements.data(), ResultElements.size()), E);
   }
-  case Builtin::BI__builtin_elementwise_ctlz:
-  case Builtin::BI__builtin_elementwise_cttz: {
+  case Builtin::BI__builtin_elementwise_clzg:
+  case Builtin::BI__builtin_elementwise_ctzg: {
     APValue SourceLHS;
     std::optional<APValue> Fallback;
     if (!EvaluateAsRValue(Info, E->getArg(0), SourceLHS))
@@ -12064,19 +12064,19 @@ bool VectorExprEvaluator::VisitCallExpr(const CallExpr *E) {
         if (!Fallback) {
           Info.FFDiag(E, diag::note_constexpr_countzeroes_zero)
               << /*IsTrailing=*/(E->getBuiltinCallee() ==
-                                 Builtin::BI__builtin_elementwise_cttz);
+                                 Builtin::BI__builtin_elementwise_ctzg);
           return false;
         }
         ResultElements.push_back(Fallback->getVectorElt(EltNum));
         continue;
       }
       switch (E->getBuiltinCallee()) {
-      case Builtin::BI__builtin_elementwise_ctlz:
+      case Builtin::BI__builtin_elementwise_clzg:
         ResultElements.push_back(APValue(
             APSInt(APInt(Info.Ctx.getIntWidth(DestEltTy), LHS.countl_zero()),
                    DestEltTy->isUnsignedIntegerOrEnumerationType())));
         break;
-      case Builtin::BI__builtin_elementwise_cttz:
+      case Builtin::BI__builtin_elementwise_ctzg:
         ResultElements.push_back(APValue(
             APSInt(APInt(Info.Ctx.getIntWidth(DestEltTy), LHS.countr_zero()),
                    DestEltTy->isUnsignedIntegerOrEnumerationType())));
@@ -13694,7 +13694,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
   case Builtin::BI__builtin_clzll:
   case Builtin::BI__builtin_clzs:
   case Builtin::BI__builtin_clzg:
-  case Builtin::BI__builtin_elementwise_ctlz:
+  case Builtin::BI__builtin_elementwise_clzg:
   case Builtin::BI__lzcnt16: // Microsoft variants of count leading-zeroes
   case Builtin::BI__lzcnt:
   case Builtin::BI__lzcnt64: {
@@ -13710,7 +13710,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
 
     std::optional<APSInt> Fallback;
     if ((BuiltinOp == Builtin::BI__builtin_clzg ||
-         BuiltinOp == Builtin::BI__builtin_elementwise_ctlz) &&
+         BuiltinOp == Builtin::BI__builtin_elementwise_clzg) &&
         E->getNumArgs() > 1) {
       APSInt FallbackTemp;
       if (!EvaluateInteger(E->getArg(1), FallbackTemp, Info))
@@ -13729,7 +13729,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
                              BuiltinOp != Builtin::BI__lzcnt &&
                              BuiltinOp != Builtin::BI__lzcnt64;
 
-      if (BuiltinOp == Builtin::BI__builtin_elementwise_ctlz) {
+      if (BuiltinOp == Builtin::BI__builtin_elementwise_clzg) {
         Info.FFDiag(E, diag::note_constexpr_countzeroes_zero)
             << /*IsTrailing=*/false;
       }
@@ -13789,7 +13789,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
   case Builtin::BI__builtin_ctzll:
   case Builtin::BI__builtin_ctzs:
   case Builtin::BI__builtin_ctzg:
-  case Builtin::BI__builtin_elementwise_cttz: {
+  case Builtin::BI__builtin_elementwise_ctzg: {
     APSInt Val;
     if (E->getArg(0)->getType()->isExtVectorBoolType()) {
       APValue Vec;
@@ -13802,7 +13802,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
 
     std::optional<APSInt> Fallback;
     if ((BuiltinOp == Builtin::BI__builtin_ctzg ||
-         BuiltinOp == Builtin::BI__builtin_elementwise_cttz) &&
+         BuiltinOp == Builtin::BI__builtin_elementwise_ctzg) &&
         E->getNumArgs() > 1) {
       APSInt FallbackTemp;
       if (!EvaluateInteger(E->getArg(1), FallbackTemp, Info))
@@ -13814,7 +13814,7 @@ bool IntExprEvaluator::VisitBuiltinCallExpr(const CallExpr *E,
       if (Fallback)
         return Success(*Fallback, E);
 
-      if (BuiltinOp == Builtin::BI__builtin_elementwise_cttz) {
+      if (BuiltinOp == Builtin::BI__builtin_elementwise_ctzg) {
         Info.FFDiag(E, diag::note_constexpr_countzeroes_zero)
             << /*IsTrailing=*/true;
       }
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 172a521e63c17..c9c98e16fab43 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -3343,10 +3343,10 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
   case Builtin::BI__builtin_ctzl:
   case Builtin::BI__builtin_ctzll:
   case Builtin::BI__builtin_ctzg:
-  case Builtin::BI__builtin_elementwise_cttz: {
+  case Builtin::BI__builtin_elementwise_ctzg: {
     bool HasFallback =
         (BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_ctzg ||
-         BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_cttz) &&
+         BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_ctzg) &&
         E->getNumArgs() > 1;
 
     Value *ArgValue =
@@ -3360,7 +3360,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     // The elementwise builtins always exhibit zero-is-undef behaviour
     Value *ZeroUndef = Builder.getInt1(
         HasFallback || getTarget().isCLZForZeroUndef() ||
-        BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_cttz);
+        BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_ctzg);
     Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef});
     if (Result->getType() != ResultType)
       Result =
@@ -3380,10 +3380,10 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
   case Builtin::BI__builtin_clzl:
   case Builtin::BI__builtin_clzll:
   case Builtin::BI__builtin_clzg:
-  case Builtin::BI__builtin_elementwise_ctlz: {
+  case Builtin::BI__builtin_elementwise_clzg: {
     bool HasFallback =
         (BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_clzg ||
-         BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_ctlz) &&
+         BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_clzg) &&
         E->getNumArgs() > 1;
 
     Value *ArgValue =
@@ -3397,7 +3397,7 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     // The elementwise builtins always exhibit zero-is-undef behaviour
     Value *ZeroUndef = Builder.getInt1(
         HasFallback || getTarget().isCLZForZeroUndef() ||
-        BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_ctlz);
+        BuiltinIDIfNoAsmLabel == Builtin::BI__builtin_elementwise_clzg);
     Value *Result = Builder.CreateCall(F, {ArgValue, ZeroUndef});
     if (Result->getType() != ResultType)
       Result =
diff --git a/clang/lib/Headers/avx512cdintrin.h b/clang/lib/Headers/avx512cdintrin.h
index 39e76711ca7b3..6d572daaff11e 100644
--- a/clang/lib/Headers/avx512cdintrin.h
+++ b/clang/lib/Headers/avx512cdintrin.h
@@ -71,7 +71,7 @@ _mm512_maskz_conflict_epi32 (__mmask16 __U, __m512i __A)
 
 static __inline__ __m512i __DEFAULT_FN_ATTRS_CONSTEXPR
 _mm512_lzcnt_epi32(__m512i __A) {
-  return (__m512i)__builtin_elementwise_ctlz((__v16si)__A,
+  return (__m512i)__builtin_elementwise_clzg((__v16si)__A,
                                              (__v16si)_mm512_set1_epi32(32));
 }
 
@@ -91,7 +91,7 @@ _mm512_maskz_lzcnt_epi32(__mmask16 __U, __m512i __A) {
 
 static __inline__ __m512i __DEFAULT_FN_ATTRS_CONSTEXPR
 _mm512_lzcnt_epi64(__m512i __A) {
-  return (__m512i)__builtin_elementwise_ctlz(
+  return (__m512i)__builtin_elementwise_clzg(
       (__v8di)__A, (__v8di)_mm512_set1_epi64((long long)64));
 }
 
diff --git a/clang/lib/Headers/avx512vlcdintrin.h b/clang/lib/Headers/avx512vlcdintrin.h
index 8f42675ba9b5d..62656f794ce2c 100644
--- a/clang/lib/Headers/avx512vlcdintrin.h
+++ b/clang/lib/Headers/avx512vlcdintrin.h
@@ -146,7 +146,7 @@ _mm256_maskz_conflict_epi32 (__mmask8 __U, __m256i __A)
 
 static __inline__ __m128i __DEFAULT_FN_ATTRS128_CONSTEXPR
 _mm_lzcnt_epi32(__m128i __A) {
-  return (__m128i)__builtin_elementwise_ctlz((__v4si)__A,
+  return (__m128i)__builtin_elementwise_clzg((__v4si)__A,
                                              (__v4si)_mm_set1_epi32(32));
 }
 
@@ -166,7 +166,7 @@ _mm_maskz_lzcnt_epi32(__mmask8 __U, __m128i __A) {
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256_CONSTEXPR
 _mm256_lzcnt_epi32(__m256i __A) {
-  return (__m256i)__builtin_elementwise_ctlz((__v8si)__A,
+  return (__m256i)__builtin_elementwise_clzg((__v8si)__A,
                                              (__v8si)_mm256_set1_epi32(32));
 }
 
@@ -186,7 +186,7 @@ _mm256_maskz_lzcnt_epi32(__mmask8 __U, __m256i __A) {
 
 static __inline__ __m128i __DEFAULT_FN_ATTRS128_CONSTEXPR
 _mm_lzcnt_epi64(__m128i __A) {
-  return (__m128i)__builtin_elementwise_ctlz(
+  return (__m128i)__builtin_elementwise_clzg(
       (__v2di)__A, (__v2di)_mm_set1_epi64x((long long)64));
 }
 
@@ -206,7 +206,7 @@ _mm_maskz_lzcnt_epi64(__mmask8 __U, __m128i __A) {
 
 static __inline__ __m256i __DEFAULT_FN_ATTRS256_CONSTEXPR
 _mm256_lzcnt_epi64(__m256i __A) {
-  return (__m256i)__builtin_elementwise_ctlz(
+  return (__m256i)__builtin_elementwise_clzg(
       (__v4di)__A, (__v4di)_mm256_set1_epi64x((long long)64));
 }
 
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 077f4311ed729..dc1d06894d48e 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -3181,8 +3181,8 @@ Sema::CheckBuiltinFunctionCall(FunctionDecl *FDecl, unsigned BuiltinID,
     TheCall->setType(Magnitude.get()->getType());
     break;
   }
-  case Builtin::BI__builtin_elementwise_ctlz:
-  case Builtin::BI__builtin_elementwise_cttz:
+  case Builtin::BI__builtin_elementwise_clzg:
+  case Builtin::BI__builtin_elementwise_ctzg:
     // These builtins can be unary or binary. Note for empty calls we call the
     // unary checker in order to not emit an error that says the function
     // expects 2 arguments, which would be misleading.
diff --git a/clang/test/CodeGen/builtins-elementwise-math.c b/clang/test/CodeGen/builtins-elementwise-math.c
index 188a6c3a30f0e..e9344d8fe0b8b 100644
--- a/clang/test/CodeGen/builtins-elementwise-math.c
+++ b/clang/test/CodeGen/builtins-elementwise-math.c
@@ -1266,98 +1266,98 @@ void test_builtin_elementwise_fshl(long long int i1, long long int i2,
   u4 tmp_vu_r = __builtin_elementwise_fshr(vu1, vu2, vu3);
 }
 
-void test_builtin_elementwise_ctlz(si8 vs1, si8 vs2, u4 vu1,
+void test_builtin_elementwise_clzg(si8 vs1, si8 vs2, u4 vu1,
                                    long long int lli, short si,
                                    _BitInt(31) bi, int i,
                                    char ci) {
   // CHECK:      [[V8S1:%.+]] = load <8 x i16>, ptr %vs1.addr
   // CHECK-NEXT: call <8 x i16> @llvm.ctlz.v8i16(<8 x i16> [[V8S1]], i1 true)
-  vs1 = __builtin_elementwise_ctlz(vs1);
+  vs1 = __builtin_elementwise_clzg(vs1);
 
   // CHECK:      [[V8S1:%.+]] = load <8 x i16>, ptr %vs1.addr
   // CHECK-NEXT: [[CLZ:%.+]] = call <8 x i16> @llvm.ctlz.v8i16(<8 x i16> [[V8S1]], i1 true)
   // CHECK-NEXT: [[ISZERO:%.+]] = icmp eq <8 x i16> [[V8S1]], zeroinitializer
   // CHECK-NEXT: [[V8S2:%.+]] = load <8 x i16>, ptr %vs2.addr
   // select <8 x i1> [[ISZERO]], <8 x i16> [[CLZ]], <8 x i16> [[V8S2]]
-  vs1 = __builtin_elementwise_ctlz(vs1, vs2);
+  vs1 = __builtin_elementwise_clzg(vs1, vs2);
 
   // CHECK:      [[V4U1:%.+]] = load <4 x i32>, ptr %vu1.addr
   // CHECK-NEXT: call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> [[V4U1]], i1 true)
-  vu1 = __builtin_elementwise_ctlz(vu1);
+  vu1 = __builtin_elementwise_clzg(vu1);
 
   // CHECK:      [[LLI:%.+]] = load i64, ptr %lli.addr
   // CHECK-NEXT: call i64 @llvm.ctlz.i64(i64 [[LLI]], i1 true)
-  lli = __builtin_elementwise_ctlz(lli);
+  lli = __builtin_elementwise_clzg(lli);
 
   // CHECK:      [[SI:%.+]] = load i16, ptr %si.addr
   // CHECK-NEXT: call i16 @llvm.ctlz.i16(i16 [[SI]], i1 true)
-  si = __builtin_elementwise_ctlz(si);
+  si = __builtin_elementwise_clzg(si);
 
   // CHECK:      [[BI1:%.+]] = load i32, ptr %bi.addr
   // CHECK-NEXT: [[BI2:%.+]] = trunc i32 [[BI1]] to i31
   // CHECK-NEXT: call i31 @llvm.ctlz.i31(i31 [[BI2]], i1 true)
-  bi = __builtin_elementwise_ctlz(bi);
+  bi = __builtin_elementwise_clzg(bi);
 
   // CHECK:      [[BI1:%.+]] = load i32, ptr %bi.addr
   // CHECK-NEXT: [[BI2:%.+]] = trunc i32 [[BI1]] to i31
   // CHECK-NEXT: [[CLZ:%.+]] = call i31 @llvm.ctlz.i31(i31 [[BI2]], i1 true)
   // CHECK-NEXT: [[ISZERO:%.+]] = icmp eq i31 [[BI2]], 0
   // CHECK-NEXT: select i1 [[ISZERO]], i31 1, i31 [[CLZ]]
-  bi = __builtin_elementwise_ctlz(bi, (_BitInt(31))1);
+  bi = __builtin_elementwise_clzg(bi, (_BitInt(31))1);
 
   // CHECK:      [[I:%.+]] = load i32, ptr %i.addr
   // CHECK-NEXT: call i32 @llvm.ctlz.i32(i32 [[I]], i1 true)
-  i = __builtin_elementwise_ctlz(i);
+  i = __builtin_elementwise_clzg(i);
 
   // CHECK:      [[CI:%.+]] = load i8, ptr %ci.addr
   // CHECK-NEXT: call i8 @llvm.ctlz.i8(i8 [[CI]], i1 true)
-  ci = __builtin_elementwise_ctlz(ci);
+  ci = __builtin_elementwise_clzg(ci);
 }
 
-void test_builtin_elementwise_cttz(si8 vs1, si8 vs2, u4 vu1,
+void test_builtin_elementwise_ctzg(si8 vs1, si8 vs2, u4 vu1,
                                    long long int lli, short si,
                                    _BitInt(31) bi, int i,
                                    char ci) {
   // CHECK:      [[V8S1:%.+]] = load <8 x i16>, ptr %vs1.addr
   // CHECK-NEXT: call <8 x i16> @llvm.cttz.v8i16(<8 x i16> [[V8S1]], i1 true)
-  vs1 = __builtin_elementwise_cttz(vs1);
+  vs1 = __builtin_elementwise_ctzg(vs1);
 
   // CHECK:      [[V8S1:%.+]] = load <8 x i16>, ptr %vs1.addr
   // CHECK-NEXT: [[ctz:%.+]] = call <8 x i16> @llvm.cttz.v8i16(<8 x i16> [[V8S1]], i1 true)
   // CHECK-NEXT: [[ISZERO:%.+]] = icmp eq <8 x i16> [[V8S1]], zeroinitializer
   // CHECK-NEXT: [[V8S2:%.+]] = load <8 x i16>, ptr %vs2.addr
   // select <8 x i1> [[ISZERO]], <8 x i16> [[ctz]], <8 x i16> [[V8S2]]
-  vs1 = __builtin_elementwise_cttz(vs1, vs2);
+  vs1 = __builtin_elementwise_ctzg(vs1, vs2);
 
   // CHECK:      [[V4U1:%.+]] = load <4 x i32>, ptr %vu1.addr
   // CHECK-NEXT: call <4 x i32> @llvm.cttz.v4i32(<4 x i32> [[V4U1]], i1 true)
-  vu1 = __builtin_elementwise_cttz(vu1);
+  vu1 = __builtin_elementwise_ctzg(vu1);
 
   // CHECK:      [[LLI:%.+]] = load i64, ptr %lli.addr
   // CHECK-NEXT: call i64 @llvm.cttz.i64(i64 [[LLI]], i1 true)
-  lli = __builtin_elementwise_cttz(lli);
+  lli = __builtin_elementwise_ctzg(lli);
 
   // CHECK:      [[SI:%.+]] = load i16, ptr %si.addr
   /...
[truncated]

@RKSimon
Copy link
Collaborator

RKSimon commented Sep 5, 2025

No objections, but only because we haven't had these builtins for very long.

@jhuber6
Copy link
Contributor Author

jhuber6 commented Sep 5, 2025

No objections, but only because we haven't had these builtins for very long.

Yeah, they haven't made it into a release so it should be fair game.

@frasercrmck
Copy link
Contributor

I don't really mind what they're called. As I said in the original RFC and PR that introduced the builtins, I have no particular favourites.

However, do note that the elementwise builtins are not exactly like clzg/ctzg in that they don't have target-specific zero-is-undef behaviour - it's unconditionally zero-is-undef. That's perhaps one of the (few) arguments for naming them differently.

This PR should also update libclc which is using these builtins.

@jhuber6
Copy link
Contributor Author

jhuber6 commented Sep 7, 2025

I don't really mind what they're called. As I said in the original RFC and PR that introduced the builtins, I have no particular favourites.

However, do note that the elementwise builtins are not exactly like clzg/ctzg in that they don't have target-specific zero-is-undef behaviour - it's unconditionally zero-is-undef. That's perhaps one of the (few) arguments for naming them differently.

This PR should also update libclc which is using these builtins.

To me, maybe undefined is roughly equivalent to undefined in practice, so I think they're still more or less the same for the targets that people are more familiar with. Changes the libclc uses.

@llvmbot llvmbot added the libclc libclc OpenCL library label Sep 7, 2025
@jhuber6
Copy link
Contributor Author

jhuber6 commented Sep 15, 2025

ping

Copy link
Contributor

@philnik777 philnik777 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I'm in favour of the renaming. Having consistent naming for the __builtin_elementwise_* and __builtin_* functions is IMO much more valuable than being consistent with compiler-internal naming.

Copy link
Collaborator

@RKSimon RKSimon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@jhuber6 jhuber6 merged commit 1597fad into llvm:main Sep 19, 2025
11 checks passed
@jhuber6 jhuber6 deleted the rename branch September 19, 2025 12:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:X86 clang:bytecode Issues for the clang bytecode constexpr interpreter clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:headers Headers provided by Clang, e.g. for intrinsics clang Clang issues not falling into any other category libclc libclc OpenCL library

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants