Skip to content

Conversation

@HendrikHuebner
Copy link
Contributor

Upstreams the sse2 prefetch builtin and adds two more builtins not present in the incubator repo (prefetchh and prefetchw)

@llvmbot llvmbot added clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project labels Nov 14, 2025
@llvmbot
Copy link
Member

llvmbot commented Nov 14, 2025

@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clangir

Author: Hendrik Hübner (HendrikHuebner)

Changes

Upstreams the sse2 prefetch builtin and adds two more builtins not present in the incubator repo (prefetchh and prefetchw)


Full diff: https://github.com/llvm/llvm-project/pull/168051.diff

5 Files Affected:

  • (modified) clang/include/clang/CIR/Dialect/IR/CIROps.td (+2-2)
  • (modified) clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp (+30)
  • (added) clang/test/CIR/CodeGen/X86/prefetchw-builtin.c (+36)
  • (modified) clang/test/CIR/CodeGen/X86/sse-builtins.c (+18)
  • (modified) clang/test/CIR/CodeGen/X86/sse2-builtins.c (-1)
diff --git a/clang/include/clang/CIR/Dialect/IR/CIROps.td b/clang/include/clang/CIR/Dialect/IR/CIROps.td
index 2124b1dc62a81..219846ec7a884 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIROps.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIROps.td
@@ -4116,7 +4116,7 @@ def CIR_PrefetchOp : CIR_Op<"prefetch"> {
     $locality is a temporal locality specifier ranging from (0) - no locality,
     to (3) - extremely local, keep in cache. If $locality is not present, the
     default value is 3.
-    
+
     $isWrite specifies whether the prefetch is for a 'read' or 'write'. If
     $isWrite is not specified, it means that prefetch is prepared for 'read'.
   }];
@@ -4150,7 +4150,7 @@ def CIR_ObjSizeOp : CIR_Op<"objsize", [Pure]> {
     When the `min` attribute is present, the operation returns the minimum
     guaranteed accessible size. When absent (max mode), it returns the maximum
     possible object size. Corresponds to `llvm.objectsize`'s `min` argument.
-    
+
     The `dynamic` attribute determines if the value should be evaluated at
     runtime. Corresponds to `llvm.objectsize`'s `dynamic` argument.
 
diff --git a/clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp b/clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
index 2d6cf30fa2ded..6f23f2be79e91 100644
--- a/clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
@@ -21,6 +21,11 @@
 using namespace clang;
 using namespace clang::CIRGen;
 
+/// Get integer from a mlir::Value that is an int constant or a constant op.
+static int64_t getIntValueFromConstOp(mlir::Value val) {
+  return val.getDefiningOp<cir::ConstantOp>().getIntValue().getSExtValue();
+}
+
 template <typename... Operands>
 static mlir::Value emitIntrinsicCallOp(CIRGenFunction &cgf, const CallExpr *e,
                                        const std::string &str,
@@ -34,6 +39,28 @@ static mlir::Value emitIntrinsicCallOp(CIRGenFunction &cgf, const CallExpr *e,
       .getResult();
 }
 
+static mlir::Value emitPrefetch(CIRGenFunction &cgf, unsigned builtinID,
+                                const CallExpr *e,
+                                mlir::Value &addr, int64_t hint) {
+  CIRGenBuilderTy &builder = cgf.getBuilder();
+  mlir::Location location = cgf.getLoc(e->getExprLoc());
+  mlir::Type voidTy = builder.getVoidTy();
+  mlir::Value address = builder.createPtrBitcast(addr, voidTy);
+  bool isWrite{};
+  int locality{};
+
+  if (builtinID == X86::BI_mm_prefetch) {
+    isWrite = (hint >> 2) & 0x1;
+    locality = hint & 0x3;
+  } else {
+    isWrite = (builtinID == X86::BI_m_prefetchw);
+    locality = 0x3;
+  }
+
+  cir::PrefetchOp::create(builder, location, address, locality, isWrite);
+  return {};
+}
+
 mlir::Value CIRGenFunction::emitX86BuiltinExpr(unsigned builtinID,
                                                const CallExpr *e) {
   if (builtinID == Builtin::BI__builtin_cpu_is) {
@@ -87,6 +114,9 @@ mlir::Value CIRGenFunction::emitX86BuiltinExpr(unsigned builtinID,
   case X86::BI_mm_sfence:
     return emitIntrinsicCallOp(*this, e, "x86.sse.sfence", voidTy);
   case X86::BI_mm_prefetch:
+  case X86::BI_m_prefetch:
+  case X86::BI_m_prefetchw:
+      return emitPrefetch(*this, builtinID, e, ops[0], getIntValueFromConstOp(ops[1]));
   case X86::BI__rdtsc:
   case X86::BI__builtin_ia32_rdtscp:
   case X86::BI__builtin_ia32_lzcnt_u16:
diff --git a/clang/test/CIR/CodeGen/X86/prefetchw-builtin.c b/clang/test/CIR/CodeGen/X86/prefetchw-builtin.c
new file mode 100644
index 0000000000000..fbf7894ba69b2
--- /dev/null
+++ b/clang/test/CIR/CodeGen/X86/prefetchw-builtin.c
@@ -0,0 +1,36 @@
+
+// RUN: %clang_cc1 -x c -flax-vector-conversions=none -ffreestanding %s -triple=x86_64-unknown-linux -target-feature +sse -fclangir -emit-cir -o %t.cir -Wall -Werror
+// RUN: FileCheck --check-prefix=CIR --input-file=%t.cir %s
+// RUN: %clang_cc1 -x c -flax-vector-conversions=none -ffreestanding %s -triple=x86_64-unknown-linux -target-feature +sse -fclangir -emit-llvm -o %t.ll -Wall -Werror
+// RUN: FileCheck --check-prefixes=LLVM --input-file=%t.ll %s
+
+// RUN: %clang_cc1 -x c++ -flax-vector-conversions=none -ffreestanding %s -triple=x86_64-unknown-linux -target-feature +sse -fno-signed-char -fclangir -emit-cir -o %t.cir -Wall -Werror
+// RUN: FileCheck --check-prefix=CIR --input-file=%t.cir %s
+// RUN: %clang_cc1 -x c++ -flax-vector-conversions=none -ffreestanding %s -triple=x86_64-unknown-linux -target-feature +sse -fclangir -emit-llvm -o %t.ll -Wall -Werror
+// RUN: FileCheck --check-prefixes=LLVM --input-file=%t.ll %s
+
+// RUN: %clang_cc1 -x c -flax-vector-conversions=none -ffreestanding %s -triple=x86_64-unknown-linux -target-feature +sse -emit-llvm -o - -Wall -Werror | FileCheck %s -check-prefix=OGCG
+// RUN: %clang_cc1 -x c++ -flax-vector-conversions=none -ffreestanding %s -triple=x86_64-unknown-linux -target-feature +sse -emit-llvm -o - -Wall -Werror | FileCheck %s -check-prefix=OGCG
+
+
+#include <x86intrin.h>
+
+void test_m_prefetch(void *p) {
+  // CIR-LABEL: test_m_prefetch
+  // LLVM-LABEL: test_m_prefetch
+  // OGCG-LABEL: test_m_prefetch
+  return _m_prefetch(p);
+  // CIR: cir.prefetch read locality(0) %{{.*}} : !cir.ptr<!void>
+  // LLVM: call void @llvm.prefetch.p0(ptr {{.*}}, i32 0, i32 0, i32 1)
+  // OGCG: call void @llvm.prefetch.p0(ptr {{.*}}, i32 0, i32 0, i32 1)
+}
+
+void test_m_prefetch_w(void *p) {
+  // CIR-LABEL: test_m_prefetch_w
+  // LLVM-LABEL: test_m_prefetch_w
+  // OGCG-LABEL: test_m_prefetch_w
+  return _m_prefetchw(p);
+  // CIR: cir.prefetch write locality(0) %{{.*}} : !cir.ptr<!void>
+  // LLVM: call void @llvm.prefetch.p0(ptr {{.*}}, i32 0, i32 0, i32 1)
+  // OGCG: call void @llvm.prefetch.p0(ptr {{.*}}, i32 0, i32 0, i32 1)
+}
diff --git a/clang/test/CIR/CodeGen/X86/sse-builtins.c b/clang/test/CIR/CodeGen/X86/sse-builtins.c
index 3a61018741958..07c586d7c0b8c 100644
--- a/clang/test/CIR/CodeGen/X86/sse-builtins.c
+++ b/clang/test/CIR/CodeGen/X86/sse-builtins.c
@@ -26,3 +26,21 @@ void test_mm_sfence(void) {
   // LLVM: call void @llvm.x86.sse.sfence()
   // OGCG: call void @llvm.x86.sse.sfence()
 }
+
+void test_mm_prefetch(char const* p) {
+  // CIR-LABEL: test_mm_prefetch
+  // LLVM-LABEL: test_mm_prefetch
+  _mm_prefetch(p, 0);
+  // CIR: cir.prefetch read locality(0) %{{.*}} : !cir.ptr<!void>
+  // LLVM: call void @llvm.prefetch.p0(ptr {{.*}}, i32 0, i32 0, i32 1)
+  // OGCG: call void @llvm.prefetch.p0(ptr {{.*}}, i32 0, i32 0, i32 1)
+}
+
+void test_mm_prefetch_local(char const* p) {
+  // CIR-LABEL: test_mm_prefetch_local
+  // LLVM-LABEL: test_mm_prefetch_local
+  _mm_prefetch(p, 3);
+  // CIR: cir.prefetch read locality(3) %{{.*}} : !cir.ptr<!void>
+  // LLVM: call void @llvm.prefetch.p0(ptr {{.*}}, i32 0, i32 3, i32 1)
+  // OGCG: call void @llvm.prefetch.p0(ptr {{.*}}, i32 0, i32 3, i32 1)
+}
diff --git a/clang/test/CIR/CodeGen/X86/sse2-builtins.c b/clang/test/CIR/CodeGen/X86/sse2-builtins.c
index 144ca143fbf15..3429eee093f6a 100644
--- a/clang/test/CIR/CodeGen/X86/sse2-builtins.c
+++ b/clang/test/CIR/CodeGen/X86/sse2-builtins.c
@@ -16,7 +16,6 @@
 
 #include <immintrin.h>
 
-
 void test_mm_clflush(void* A) {
   // CIR-LABEL: test_mm_clflush
   // LLVM-LABEL: test_mm_clflush

}

cir::PrefetchOp::create(builder, location, address, locality, isWrite);
return {};
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I'm not sure if this is correct: Can I somehow convert the PrefetchOp into an mlir::Value and return it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I requested this change from the incubator because I thought we should be using CIR-specific operations when we have them, and in this case the CIR operation is exactly equivalent to the LLVM intrinsic.

I see that our handling of BI__builtin_prefetch returns RValue::get(nullptr) from emitBuiltinExpr, which is what we'd get by returning {} here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd have to add a result to the operation. Incubator has support for case X86::BI_mm_prefetch: which should be somewhat similar and shared here, it's using cir.llvm.intrinsic directly, which has a result. If there are reasons to deviate here they should also be replayed in the incubator.

Copy link
Contributor Author

@HendrikHuebner HendrikHuebner Nov 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just returning {} a.k.a nullptr does not work here I think because the caller assumes the builtin was not emitted and throws NYI. Instead, how about we make it return an optional<mlir::Value>? If the optional is present the builtin was generated, and if the value is nullptr, there is no return value. I think void type values are also a thing in mlir, so maybe we could use that as well? If you prefer that, how would I construct one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is going to work because we return {} for all of the cases where we have emitted an NYI diagnostic and for the default, where we have done nothing and don't intend to do anything.

Copy link
Contributor Author

@HendrikHuebner HendrikHuebner Nov 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced the return type of the caller with std::optional<mlir::Value> to handle this case. I tested it locally and it still emits the NYI message for missing builtins, and as you can see the tests are passing.

@github-actions
Copy link

github-actions bot commented Nov 14, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

}

cir::PrefetchOp::create(builder, location, address, locality, isWrite);
return {};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd have to add a result to the operation. Incubator has support for case X86::BI_mm_prefetch: which should be somewhat similar and shared here, it's using cir.llvm.intrinsic directly, which has a result. If there are reasons to deviate here they should also be replayed in the incubator.

}

cir::PrefetchOp::create(builder, location, address, locality, isWrite);
return {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I requested this change from the incubator because I thought we should be using CIR-specific operations when we have them, and in this case the CIR operation is exactly equivalent to the LLVM intrinsic.

I see that our handling of BI__builtin_prefetch returns RValue::get(nullptr) from emitBuiltinExpr, which is what we'd get by returning {} here.

@github-actions
Copy link

github-actions bot commented Nov 17, 2025

🐧 Linux x64 Test Results

  • 112048 tests passed
  • 4077 tests skipped


// Now see if we can emit a target-specific builtin.
if (mlir::Value v = emitTargetBuiltinExpr(builtinID, e, returnValue)) {
std::optional<mlir::Value> valueOpt =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just working around the fact that emitTargetBuiltinExpr is not returning an RValue already. It's a bit more work but you should change emitTargetBuiltinExpr to return an RValue instead. You can then change evalKind below and use isScalar(), etc methods from RValue.

Also, whenever you need to emit RValue::get(nullptr), please use RValue::getIgnored() instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the code. I would still like to return the optional for emitX86BuiltinExpr, because this keeps the code a little cleaner since we don't have to call RValue::get for each of the cases in the big switch statement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants