[Clang] VectorExprEvaluator::VisitCallExpr / InterpretBuiltin - allow AVX/AVX512 subvector extraction intrinsics to be used in constexpr #157712 #162836
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository, in which case you can instead tag reviewers by name in a comment.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment "Ping". The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.
@llvm/pr-subscribers-backend-x86 @llvm/pr-subscribers-clang

Author: SeongJaePark (SeongjaeP)

Changes

This PR supersedes and replaces PR #158853.

The original branch diverged too far from the main branch, resulting in significant merge conflicts that were difficult to resolve cleanly. To provide a clean and reviewable history, this new PR was created by cherry-picking the necessary commits onto a fresh branch based on the latest main.

(Original Description)

This patch enables the use of AVX/AVX512 subvector extraction intrinsics within constexpr functions. This is achieved by implementing the evaluation logic for these intrinsics in VectorExprEvaluator::VisitCallExpr and InterpretBuiltin.

The original discussion and review comments can be found in the previous pull request for context: #158853

Patch is 40.23 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/162836.diff

13 Files Affected:
diff --git a/clang/include/clang/Basic/BuiltinsX86.td b/clang/include/clang/Basic/BuiltinsX86.td
index 217589d7add1d..b3b4d5e076fd8 100644
--- a/clang/include/clang/Basic/BuiltinsX86.td
+++ b/clang/include/clang/Basic/BuiltinsX86.td
@@ -471,7 +471,7 @@ let Features = "avx512f,vpclmulqdq", Attributes = [NoThrow, Const, RequiredVecto
def pclmulqdq512 : X86Builtin<"_Vector<8, long long int>(_Vector<8, long long int>, _Vector<8, long long int>, _Constant char)">;
}
-let Features = "avx", Attributes = [NoThrow, Const, RequiredVectorWidth<256>] in {
+let Features = "avx", Attributes = [NoThrow, Const, Constexpr, RequiredVectorWidth<256>] in {
def vpermilvarpd : X86Builtin<"_Vector<2, double>(_Vector<2, double>, _Vector<2, long long int>)">;
def vpermilvarps : X86Builtin<"_Vector<4, float>(_Vector<4, float>, _Vector<4, int>)">;
def vpermilvarpd256 : X86Builtin<"_Vector<4, double>(_Vector<4, double>, _Vector<4, long long int>)">;
@@ -576,7 +576,7 @@ let Features = "avx", Attributes = [NoThrow, Const, Constexpr, RequiredVectorWid
def vec_set_v8si : X86Builtin<"_Vector<8, int>(_Vector<8, int>, int, _Constant int)">;
}
-let Features = "avx2", Attributes = [NoThrow, Const, RequiredVectorWidth<256>] in {
+let Features = "avx2", Attributes = [NoThrow, Const, Constexpr, RequiredVectorWidth<256>] in {
def mpsadbw256 : X86Builtin<"_Vector<32, char>(_Vector<32, char>, _Vector<32, char>, _Constant char)">;
def palignr256 : X86Builtin<"_Vector<32, char>(_Vector<32, char>, _Vector<32, char>, _Constant int)">;
def phaddw256 : X86Builtin<"_Vector<16, short>(_Vector<16, short>, _Vector<16, short>)">;
@@ -1065,7 +1065,7 @@ let Features = "avx512vl", Attributes = [NoThrow, Const, RequiredVectorWidth<256
def alignq256 : X86Builtin<"_Vector<4, long long int>(_Vector<4, long long int>, _Vector<4, long long int>, _Constant int)">;
}
-let Features = "avx512f", Attributes = [NoThrow, Const, RequiredVectorWidth<512>] in {
+let Features = "avx512f", Attributes = [NoThrow, Const, Constexpr, RequiredVectorWidth<512>] in {
def extractf64x4_mask : X86Builtin<"_Vector<4, double>(_Vector<8, double>, _Constant int, _Vector<4, double>, unsigned char)">;
def extractf32x4_mask : X86Builtin<"_Vector<4, float>(_Vector<16, float>, _Constant int, _Vector<4, float>, unsigned char)">;
}
@@ -2944,24 +2944,24 @@ let Features = "avx512vl", Attributes = [NoThrow, RequiredVectorWidth<256>] in {
def pmovqw256mem_mask : X86Builtin<"void(_Vector<8, short *>, _Vector<4, long long int>, unsigned char)">;
}
-let Features = "avx512dq", Attributes = [NoThrow, Const, RequiredVectorWidth<512>] in {
+let Features = "avx512dq", Attributes = [NoThrow, Const, Constexpr, RequiredVectorWidth<512>] in {
def extractf32x8_mask : X86Builtin<"_Vector<8, float>(_Vector<16, float>, _Constant int, _Vector<8, float>, unsigned char)">;
def extractf64x2_512_mask : X86Builtin<"_Vector<2, double>(_Vector<8, double>, _Constant int, _Vector<2, double>, unsigned char)">;
def extracti32x8_mask : X86Builtin<"_Vector<8, int>(_Vector<16, int>, _Constant int, _Vector<8, int>, unsigned char)">;
def extracti64x2_512_mask : X86Builtin<"_Vector<2, long long int>(_Vector<8, long long int>, _Constant int, _Vector<2, long long int>, unsigned char)">;
}
-let Features = "avx512f", Attributes = [NoThrow, Const, RequiredVectorWidth<512>] in {
+let Features = "avx512f", Attributes = [NoThrow, Const, Constexpr, RequiredVectorWidth<512>] in {
def extracti32x4_mask : X86Builtin<"_Vector<4, int>(_Vector<16, int>, _Constant int, _Vector<4, int>, unsigned char)">;
def extracti64x4_mask : X86Builtin<"_Vector<4, long long int>(_Vector<8, long long int>, _Constant int, _Vector<4, long long int>, unsigned char)">;
}
-let Features = "avx512dq,avx512vl", Attributes = [NoThrow, Const, RequiredVectorWidth<256>] in {
+let Features = "avx512dq,avx512vl", Attributes = [NoThrow, Const, Constexpr, RequiredVectorWidth<256>] in {
def extractf64x2_256_mask : X86Builtin<"_Vector<2, double>(_Vector<4, double>, _Constant int, _Vector<2, double>, unsigned char)">;
def extracti64x2_256_mask : X86Builtin<"_Vector<2, long long int>(_Vector<4, long long int>, _Constant int, _Vector<2, long long int>, unsigned char)">;
}
-let Features = "avx512vl", Attributes = [NoThrow, Const, RequiredVectorWidth<256>] in {
+let Features = "avx512vl", Attributes = [NoThrow, Const, Constexpr, RequiredVectorWidth<256>] in {
def extractf32x4_256_mask : X86Builtin<"_Vector<4, float>(_Vector<8, float>, _Constant int, _Vector<4, float>, unsigned char)">;
def extracti32x4_256_mask : X86Builtin<"_Vector<4, int>(_Vector<8, int>, _Constant int, _Vector<4, int>, unsigned char)">;
}
diff --git a/clang/lib/AST/ByteCode/InterpBuiltin.cpp b/clang/lib/AST/ByteCode/InterpBuiltin.cpp
index 922d67940e22f..d844f93d1b983 100644
--- a/clang/lib/AST/ByteCode/InterpBuiltin.cpp
+++ b/clang/lib/AST/ByteCode/InterpBuiltin.cpp
@@ -2819,6 +2819,92 @@ static bool interp__builtin_elementwise_triop(
return true;
}
+static bool interp__builtin_x86_extract_vector(InterpState &S, CodePtr OpPC,
+ const CallExpr *Call,
+ unsigned ID) {
+ assert(Call->getNumArgs() == 2);
+
+ APSInt ImmAPS = popToAPSInt(S, Call->getArg(1));
+ uint64_t Index = ImmAPS.getZExtValue();
+
+ const Pointer &Src = S.Stk.pop<Pointer>();
+ if (!Src.getFieldDesc()->isPrimitiveArray())
+ return false;
+
+ const Pointer &Dst = S.Stk.peek<Pointer>();
+ if (!Dst.getFieldDesc()->isPrimitiveArray())
+ return false;
+
+ unsigned SrcElems = Src.getNumElems();
+ unsigned DstElems = Dst.getNumElems();
+
+ if (SrcElems == 0 || DstElems == 0 || (SrcElems % DstElems) != 0)
+ return false;
+
+ unsigned NumLanes = SrcElems / DstElems;
+ unsigned Lane = static_cast<unsigned>(Index % NumLanes);
+ unsigned ExtractPos = Lane * DstElems;
+
+ PrimType ElemT = Src.getFieldDesc()->getPrimType();
+ if (ElemT != Dst.getFieldDesc()->getPrimType())
+ return false;
+
+ TYPE_SWITCH(ElemT, {
+ for (unsigned I = 0; I != DstElems; ++I) {
+ Dst.elem<T>(I) = Src.elem<T>(ExtractPos + I);
+ }
+ });
+
+ Dst.initializeAllElements();
+ return true;
+}
+
+static bool interp__builtin_x86_extract_vector_masked(InterpState &S, CodePtr OpPC,
+ const CallExpr *Call,
+ unsigned ID) {
+ assert(Call->getNumArgs() == 4);
+
+ APSInt MaskAPS = popToAPSInt(S, Call->getArg(3));
+ const Pointer &Merge = S.Stk.pop<Pointer>();
+ APSInt ImmAPS = popToAPSInt(S, Call->getArg(1));
+ const Pointer &Src = S.Stk.pop<Pointer>();
+
+ if (!Src.getFieldDesc()->isPrimitiveArray() || !Merge.getFieldDesc()->isPrimitiveArray())
+ return false;
+
+ const Pointer &Dst = S.Stk.peek<Pointer>();
+ if (!Dst.getFieldDesc()->isPrimitiveArray())
+ return false;
+
+ unsigned SrcElems = Src.getNumElems();
+ unsigned DstElems = Dst.getNumElems();
+ if (!SrcElems || !DstElems || (SrcElems % DstElems) != 0)
+ return false;
+
+ PrimType ElemT = Src.getFieldDesc()->getPrimType();
+ if (ElemT != Dst.getFieldDesc()->getPrimType() ||
+ ElemT != Merge.getFieldDesc()->getPrimType())
+ return false;
+
+ unsigned NumLanes = SrcElems / DstElems;
+ unsigned Lane = static_cast<unsigned>(ImmAPS.getZExtValue() % NumLanes);
+ unsigned Base = Lane * DstElems;
+
+ uint64_t Mask = MaskAPS.getZExtValue();
+
+ TYPE_SWITCH(ElemT, {
+ for (unsigned I = 0; I != DstElems; ++I) {
+ if ((Mask >> I) & 1)
+ Dst.elem<T>(I) = Src.elem<T>(Base + I);
+ else
+ Dst.elem<T>(I) = Merge.elem<T>(I);
+ }
+ });
+
+ Dst.initializeAllElements();
+ return true;
+}
+
static bool interp__builtin_x86_insert_subvector(InterpState &S, CodePtr OpPC,
const CallExpr *Call,
unsigned ID) {
@@ -3451,6 +3537,25 @@ bool InterpretBuiltin(InterpState &S, CodePtr OpPC, const CallExpr *Call,
S, OpPC, Call, [](const APSInt &LHS, const APSInt &RHS) {
return LHS.isSigned() ? LHS.ssub_sat(RHS) : LHS.usub_sat(RHS);
});
+ case X86::BI__builtin_ia32_extract128i256:
+ case X86::BI__builtin_ia32_vextractf128_pd256:
+ case X86::BI__builtin_ia32_vextractf128_ps256:
+ case X86::BI__builtin_ia32_vextractf128_si256:
+ return interp__builtin_x86_extract_vector(S, OpPC, Call, BuiltinID);
+
+ case X86::BI__builtin_ia32_extractf32x4_256_mask:
+ case X86::BI__builtin_ia32_extractf32x4_mask:
+ case X86::BI__builtin_ia32_extractf32x8_mask:
+ case X86::BI__builtin_ia32_extractf64x2_256_mask:
+ case X86::BI__builtin_ia32_extractf64x2_512_mask:
+ case X86::BI__builtin_ia32_extractf64x4_mask:
+ case X86::BI__builtin_ia32_extracti32x4_256_mask:
+ case X86::BI__builtin_ia32_extracti32x4_mask:
+ case X86::BI__builtin_ia32_extracti32x8_mask:
+ case X86::BI__builtin_ia32_extracti64x2_256_mask:
+ case X86::BI__builtin_ia32_extracti64x2_512_mask:
+ case X86::BI__builtin_ia32_extracti64x4_mask:
+ return interp__builtin_x86_extract_vector_masked(S, OpPC, Call, BuiltinID);
case clang::X86::BI__builtin_ia32_pavgb128:
case clang::X86::BI__builtin_ia32_pavgw128:
diff --git a/clang/lib/AST/ExprConstant.cpp b/clang/lib/AST/ExprConstant.cpp
index 35a866ea5010f..5ac7837aa0fe3 100644
--- a/clang/lib/AST/ExprConstant.cpp
+++ b/clang/lib/AST/ExprConstant.cpp
@@ -11769,6 +11769,81 @@ bool VectorExprEvaluator::VisitCallExpr(const CallExpr *E) {
return EvaluateBinOpExpr([](const APSInt &LHS, const APSInt &RHS) {
return LHS.isSigned() ? LHS.ssub_sat(RHS) : LHS.usub_sat(RHS);
});
+
+ case X86::BI__builtin_ia32_extract128i256:
+ case X86::BI__builtin_ia32_vextractf128_pd256:
+ case X86::BI__builtin_ia32_vextractf128_ps256:
+ case X86::BI__builtin_ia32_vextractf128_si256: {
+ APValue SourceVec, SourceImm;
+ if (!EvaluateAsRValue(Info, E->getArg(0), SourceVec) ||
+ !EvaluateAsRValue(Info, E->getArg(1), SourceImm))
+ return false;
+
+ if (!SourceVec.isVector())
+ return false;
+
+ const auto *RetVT = E->getType()->castAs<VectorType>();
+ if (!RetVT) return false;
+
+ unsigned RetLen = RetVT->getNumElements();
+ unsigned SrcLen = SourceVec.getVectorLength();
+ if (SrcLen != RetLen * 2)
+ return false;
+
+ unsigned Idx = SourceImm.getInt().getZExtValue() & 1;
+
+ SmallVector<APValue, 32> ResultElements;
+ ResultElements.reserve(RetLen);
+
+ for (unsigned I = 0; I < RetLen; I++)
+ ResultElements.push_back(SourceVec.getVectorElt(Idx * RetLen + I));
+
+ return Success(APValue(ResultElements.data(), RetLen), E);
+ }
+
+ case X86::BI__builtin_ia32_extracti32x4_256_mask:
+ case X86::BI__builtin_ia32_extractf32x4_256_mask:
+ case X86::BI__builtin_ia32_extracti32x4_mask:
+ case X86::BI__builtin_ia32_extractf32x4_mask:
+ case X86::BI__builtin_ia32_extracti32x8_mask:
+ case X86::BI__builtin_ia32_extractf32x8_mask:
+ case X86::BI__builtin_ia32_extracti64x2_256_mask:
+ case X86::BI__builtin_ia32_extractf64x2_256_mask:
+ case X86::BI__builtin_ia32_extracti64x2_512_mask:
+ case X86::BI__builtin_ia32_extractf64x2_512_mask:
+ case X86::BI__builtin_ia32_extracti64x4_mask:
+ case X86::BI__builtin_ia32_extractf64x4_mask:{
+ APValue SourceVec, MergeVec;
+ APSInt Imm, MaskImm;
+
+ if (!EvaluateAsRValue(Info, E->getArg(0), SourceVec) ||
+ !EvaluateInteger(E->getArg(1), Imm, Info) ||
+ !EvaluateAsRValue(Info, E->getArg(2), MergeVec) ||
+ !EvaluateInteger(E->getArg(3), MaskImm, Info))
+ return false;
+
+ const auto *RetVT = E->getType()->castAs<VectorType>();
+ unsigned RetLen = RetVT->getNumElements();
+
+ if (!SourceVec.isVector() || !MergeVec.isVector()) return false;
+ unsigned SrcLen = SourceVec.getVectorLength();
+ if (!SrcLen || !RetLen || (SrcLen % RetLen) != 0) return false;
+
+ unsigned Lanes = SrcLen / RetLen;
+ unsigned Lane = static_cast<unsigned>(Imm.getZExtValue() % Lanes);
+ unsigned Base = Lane * RetLen;
+ uint64_t Mask = MaskImm.getZExtValue();
+
+ SmallVector<APValue, 32> ResultElements;
+ ResultElements.reserve(RetLen);
+ for (unsigned I = 0; I < RetLen; ++I) {
+ if ((Mask >> I) & 1)
+ ResultElements.push_back(SourceVec.getVectorElt(Base + I));
+ else
+ ResultElements.push_back(MergeVec.getVectorElt(I));
+ }
+ return Success(APValue(ResultElements.data(), ResultElements.size()), E);
+ }
case clang::X86::BI__builtin_ia32_pavgb128:
case clang::X86::BI__builtin_ia32_pavgw128:
diff --git a/clang/lib/Headers/avx512dqintrin.h b/clang/lib/Headers/avx512dqintrin.h
index fb65bf933b8ad..953285d6ab414 100644
--- a/clang/lib/Headers/avx512dqintrin.h
+++ b/clang/lib/Headers/avx512dqintrin.h
@@ -1214,7 +1214,7 @@ _mm512_maskz_broadcast_i64x2(__mmask8 __M, __m128i __A)
#define _mm512_extractf32x8_ps(A, imm) \
((__m256)__builtin_ia32_extractf32x8_mask((__v16sf)(__m512)(A), (int)(imm), \
- (__v8sf)_mm256_undefined_ps(), \
+ (__v8sf)_mm256_setzero_ps(), \
(__mmask8)-1))
#define _mm512_mask_extractf32x8_ps(W, U, A, imm) \
@@ -1230,7 +1230,7 @@ _mm512_maskz_broadcast_i64x2(__mmask8 __M, __m128i __A)
#define _mm512_extractf64x2_pd(A, imm) \
((__m128d)__builtin_ia32_extractf64x2_512_mask((__v8df)(__m512d)(A), \
(int)(imm), \
- (__v2df)_mm_undefined_pd(), \
+ (__v2df)_mm_setzero_pd(), \
(__mmask8)-1))
#define _mm512_mask_extractf64x2_pd(W, U, A, imm) \
@@ -1247,7 +1247,7 @@ _mm512_maskz_broadcast_i64x2(__mmask8 __M, __m128i __A)
#define _mm512_extracti32x8_epi32(A, imm) \
((__m256i)__builtin_ia32_extracti32x8_mask((__v16si)(__m512i)(A), (int)(imm), \
- (__v8si)_mm256_undefined_si256(), \
+ (__v8si)_mm256_setzero_si256(), \
(__mmask8)-1))
#define _mm512_mask_extracti32x8_epi32(W, U, A, imm) \
@@ -1263,7 +1263,7 @@ _mm512_maskz_broadcast_i64x2(__mmask8 __M, __m128i __A)
#define _mm512_extracti64x2_epi64(A, imm) \
((__m128i)__builtin_ia32_extracti64x2_512_mask((__v8di)(__m512i)(A), \
(int)(imm), \
- (__v2di)_mm_undefined_si128(), \
+ (__v2di)_mm_setzero_si128(), \
(__mmask8)-1))
#define _mm512_mask_extracti64x2_epi64(W, U, A, imm) \
diff --git a/clang/lib/Headers/avx512fintrin.h b/clang/lib/Headers/avx512fintrin.h
index 80e58425cdd71..2768a5bae887d 100644
--- a/clang/lib/Headers/avx512fintrin.h
+++ b/clang/lib/Headers/avx512fintrin.h
@@ -3166,7 +3166,7 @@ _mm512_maskz_permutex2var_epi64(__mmask8 __U, __m512i __A, __m512i __I,
#define _mm512_extractf64x4_pd(A, I) \
((__m256d)__builtin_ia32_extractf64x4_mask((__v8df)(__m512d)(A), (int)(I), \
- (__v4df)_mm256_undefined_pd(), \
+ (__v4df)_mm256_setzero_pd(), \
(__mmask8)-1))
#define _mm512_mask_extractf64x4_pd(W, U, A, imm) \
@@ -3181,7 +3181,7 @@ _mm512_maskz_permutex2var_epi64(__mmask8 __U, __m512i __A, __m512i __I,
#define _mm512_extractf32x4_ps(A, I) \
((__m128)__builtin_ia32_extractf32x4_mask((__v16sf)(__m512)(A), (int)(I), \
- (__v4sf)_mm_undefined_ps(), \
+ (__v4sf)_mm_setzero_ps(), \
(__mmask8)-1))
#define _mm512_mask_extractf32x4_ps(W, U, A, imm) \
@@ -7107,7 +7107,7 @@ _mm512_mask_cvtepi64_storeu_epi16 (void *__P, __mmask8 __M, __m512i __A)
#define _mm512_extracti32x4_epi32(A, imm) \
((__m128i)__builtin_ia32_extracti32x4_mask((__v16si)(__m512i)(A), (int)(imm), \
- (__v4si)_mm_undefined_si128(), \
+ (__v4si)_mm_setzero_si128(), \
(__mmask8)-1))
#define _mm512_mask_extracti32x4_epi32(W, U, A, imm) \
@@ -7122,7 +7122,7 @@ _mm512_mask_cvtepi64_storeu_epi16 (void *__P, __mmask8 __M, __m512i __A)
#define _mm512_extracti64x4_epi64(A, imm) \
((__m256i)__builtin_ia32_extracti64x4_mask((__v8di)(__m512i)(A), (int)(imm), \
- (__v4di)_mm256_undefined_si256(), \
+ (__v4di)_mm256_setzero_si256(), \
(__mmask8)-1))
#define _mm512_mask_extracti64x4_epi64(W, U, A, imm) \
diff --git a/clang/lib/Headers/avx512vldqintrin.h b/clang/lib/Headers/avx512vldqintrin.h
index 68bd52e43981a..2d3c4b551e3b0 100644
--- a/clang/lib/Headers/avx512vldqintrin.h
+++ b/clang/lib/Headers/avx512vldqintrin.h
@@ -1075,7 +1075,7 @@ _mm256_maskz_broadcast_i64x2 (__mmask8 __M, __m128i __A)
#define _mm256_extractf64x2_pd(A, imm) \
((__m128d)__builtin_ia32_extractf64x2_256_mask((__v4df)(__m256d)(A), \
(int)(imm), \
- (__v2df)_mm_undefined_pd(), \
+ (__v2df)_mm_setzero_pd(), \
(__mmask8)-1))
#define _mm256_mask_extractf64x2_pd(W, U, A, imm) \
@@ -1093,7 +1093,7 @@ _mm256_maskz_broadcast_i64x2 (__mmask8 __M, __m128i __A)
#define _mm256_extracti64x2_epi64(A, imm) \
((__m128i)__builtin_ia32_extracti64x2_256_mask((__v4di)(__m256i)(A), \
(int)(imm), \
- (__v2di)_mm_undefined_si128(), \
+ (__v2di)_mm_setzero_si128(), \
(__mmask8)-1))
#define _mm256_mask_extracti64x2_epi64(W, U, A, imm) \
diff --git a/clang/lib/Headers/avx512vlintrin.h b/clang/lib/Headers/avx512vlintrin.h
index 965741f0ff944..252fb111988b0 100644
--- a/clang/lib/Headers/avx512vlintrin.h
+++ b/clang/lib/Headers/avx512vlintrin.h
@@ -7609,7 +7609,7 @@ _mm256_mask_cvtepi64_storeu_epi16 (void * __P, __mmask8 __M, __m256i __A)
#define _mm256_extractf32x4_ps(A, imm) \
((__m128)__builtin_ia32_extractf32x4_256_mask((__v8sf)(__m256)(A), \
(int)(imm), \
- (__v4sf)_mm_undefined_ps(), \
+ (__v4sf)_mm_setzero_ps(), \
(__mmask8)-1))
#define _mm256_mask_extractf32x4_ps(W, U, A, imm) \
@@ -7627,7 +7627,7 @@ _mm256_mask_cvtepi64_storeu_epi16 (void * __P, __mmask8 __M, __m256i __A)
#define _mm256_extracti32x4_epi32(A, imm) \
((__m128i)__builtin_ia32_extracti32x4_256_mask((__v8si)(__m256i)(A), \
(int)(imm), \
- (__v4si)_mm_undefined_si128(), \
+ (__v4si)_mm_setzero_si128(), \
(__mmask8)-1))
#define _mm256_mask_extracti32x4_epi32(W, U, A, imm) \
diff --git a/clang/test/CodeGen/X86/avx-builtins.c b/clang/test/CodeGen/X86/avx-builtins.c
index 5f08b6be81ab7..11ed8498b8ecd 100644
--- a/clang/test/CodeGen/X86/avx-builtins.c
+++ b/clang/test/CodeGen/X86/avx-builtins.c
@@ -1070,19 +1070,25 @@ __m128d test_mm256_extractf128_pd(__m256d A) {
// CHECK: shufflevector <4 x double> %{{.*}}, <4 x double> poison, <2 x i32> <i32 2, i32 3>
return _mm256_extractf128_pd(A, 1);
}
+TEST_CONSTEXPR(match_m128d(_mm256_extractf128_pd(((__m256d){0.0, 1.0, 2.0, 3.0}), 1),
+ 2.0, 3.0));
__m128 test_mm256_extractf128_ps(__m256 A) {
// CHECK-LABEL: test_mm256_extractf128_ps
// CHECK: shufflevector <8 x float> %{{.*}}, <8 x float> poison, <4...
[truncated]
✅ With the latest revision this PR passed the C/C++ code formatter.
Force-pushed from aa54ae5 to 26cb15c
I really appreciate your review.
LGTM - cheers
Head branch was pushed to by a user without write access
Just wondering — do I need to look into the CI failures, or are they safe to ignore?
CI failure
Head branch was pushed to by a user without write access
Force-pushed from 6e5d7d8 to f025bd6
@SeongjaeP Congratulations on having your first Pull Request (PR) merged into the LLVM Project!

Your changes will be combined with recent changes from other authors, then tested by our build bots. If there is a problem with a build, you may receive a report in an email or a comment on this PR. Please check whether problems have been caused by your change specifically, as the builds can include changes from many authors. It is not uncommon for your change to be included in a build that fails due to someone else's changes, or infrastructure issues. How to do this, and the rest of the post-merge process, is covered in detail here.

If your change does cause a problem, it may be reverted, or you can revert it yourself. This is a normal part of LLVM development. You can fix your changes and open a new PR to merge them again.

If you don't get any reports, no action is required from you. Your changes are working as expected, well done!
This PR supersedes and replaces PR #158853.

The original branch diverged too far from the main branch, resulting in significant merge conflicts that were difficult to resolve cleanly. To provide a clean and reviewable history, this new PR was created by cherry-picking the necessary commits onto a fresh branch based on the latest main.

(Original Description)

This patch enables the use of AVX/AVX512 subvector extraction intrinsics within constexpr functions. This is achieved by implementing the evaluation logic for these intrinsics in VectorExprEvaluator::VisitCallExpr and InterpretBuiltin.

The original discussion and review comments can be found in the previous pull request for context: #158853

Fixes #157712
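For illustration, here is a minimal sketch of the kind of compile-time usage this change enables. It is not taken from the patch: the in-tree tests use the TEST_CONSTEXPR / match_m128d helpers visible in the diff above, and the sketch assumes a Clang build containing this change with AVX-512 enabled at compile time (e.g. -mavx512f).

```c++
// Minimal sketch (hypothetical, not part of the patch): with the extraction
// builtins marked Constexpr, subvector extracts can be folded at compile time.
#include <immintrin.h>

// 256-bit AVX extract: take the upper 128-bit lane of a 4 x double vector.
constexpr __m256d V4 = {0.0, 1.0, 2.0, 3.0};
constexpr __m128d Hi = _mm256_extractf128_pd(V4, 1);
static_assert(Hi[0] == 2.0 && Hi[1] == 3.0,
              "upper 128-bit lane extracted at compile time");

// 512-bit AVX-512 extract: take the second 256-bit lane of an 8 x double
// vector. The _mm512_extractf64x4_pd macro expands to the masked builtin
// with an all-ones (__mmask8)-1 mask, so this exercises the masked path.
constexpr __m512d V8 = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0};
constexpr __m256d Q1 = _mm512_extractf64x4_pd(V8, 1);
static_assert(Q1[0] == 4.0 && Q1[3] == 7.0,
              "second 256-bit lane extracted at compile time");
```

The first static_assert mirrors the TEST_CONSTEXPR case added in clang/test/CodeGen/X86/avx-builtins.c; the second relies on the header change from _mm256_undefined_pd() to _mm256_setzero_pd() so the default merge operand is itself constant-evaluable.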