RiverDave (Contributor) commented Sep 27, 2025

This PR adds support for address spaces in CIR pointer types by:

  1. Introducing a TargetAddressSpaceAttr to represent target-specific numeric address spaces (a language-specific attribute will be implemented in a separate PR)
  2. Extending the PointerType to include an optional address space parameter
  3. Adding helper methods in CIRBaseBuilder to create pointers with address spaces
  4. Implementing custom parsers and printers for address space attributes
  5. Updating the LLVM lowering to properly handle address spaces when converting CIR to LLVM IR

The implementation allows for creating pointers with specific address spaces, which is necessary for supporting language features like Clang's __attribute__((address_space(N))). Address spaces are preserved through the CIR representation and correctly lowered to LLVM IR.
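
To make the lowering goal concrete, here is a minimal standalone sketch (not code from this PR) of how an optional numeric address space could map onto LLVM IR's opaque pointer spelling. `PtrModel` and `lowerToLLVMPtr` are hypothetical names used only for illustration; the one fact relied on is that LLVM IR prints a pointer in address space N as `ptr addrspace(N)` and a default-space pointer as plain `ptr`.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for a CIR pointer type: a pointee name plus an
// optional numeric target address space (absent means the default space).
struct PtrModel {
  std::string pointee;
  std::optional<unsigned> targetAS;
};

// LLVM IR pointers are opaque, so the pointee is dropped and only the
// address space survives. Address space 0 is the default and is printed
// as plain `ptr`.
std::string lowerToLLVMPtr(const PtrModel &p) {
  unsigned as = p.targetAS.value_or(0);
  if (as == 0)
    return "ptr";
  return "ptr addrspace(" + std::to_string(as) + ")";
}
```

Under these assumptions, a pointer declared with `address_space(1)` would be expected to appear as `ptr addrspace(1)` after lowering, while an unqualified pointer stays `ptr`.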

github-actions bot commented Sep 27, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@RiverDave RiverDave force-pushed the users/riverdave/cir/addrspace-support-for-cir-ptr branch from 93d566f to eb5a41d Compare September 28, 2025 00:08
@RiverDave RiverDave marked this pull request as ready for review September 28, 2025 00:11
@llvmbot llvmbot added the clang and ClangIR labels Sep 28, 2025
llvmbot (Member) commented Sep 28, 2025

@llvm/pr-subscribers-clangir

Author: David Rivera (RiverDave)

Changes

Related: #160386

This PR adds support for address spaces in CIR pointer types, as specified in the incubator, by:

  1. Introducing a new AddressSpace enum in CIR to represent both language-specific and target-specific address spaces
  2. Extending the PointerType to include an optional address space parameter
  3. Adding helper functions to convert between Clang's LangAS and CIR's AddressSpace
  4. Implementing custom parsers and printers for address space values
  5. Adding support for address space casts in the CIR builder
  6. Updating the LLVM lowering to properly handle address spaces
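
Items 1 and 3 above rely on encoding target address spaces as an offset past the last enum case. A simplified standalone model of that arithmetic, assuming the enum layout shown in the patch (names such as `AddressSpaceModel`, `makeTarget`, and `targetValue` are hypothetical, not CIR APIs), might look like this:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative mirror of the enum cases listed in the patch. The type and
// helper names here are hypothetical and not part of CIR.
enum class AddressSpaceModel : std::uint32_t {
  Default = 0,
  OffloadPrivate = 1,
  OffloadLocal = 2,
  OffloadGlobal = 3,
  OffloadConstant = 4,
  OffloadGeneric = 5,
  Target = 6, // must stay last: target<N> is stored as Target + N
};

constexpr std::uint32_t kTargetOffset =
    static_cast<std::uint32_t>(AddressSpaceModel::Target);

// Any value at or beyond the last enum case denotes a target address space.
constexpr bool isTarget(AddressSpaceModel as) {
  return static_cast<std::uint32_t>(as) >= kTargetOffset;
}

// Encode a numeric target address space N as Target + N.
constexpr AddressSpaceModel makeTarget(std::uint32_t n) {
  return static_cast<AddressSpaceModel>(n + kTargetOffset);
}

// Decode the numeric target address space from its encoded form.
constexpr std::uint32_t targetValue(AddressSpaceModel as) {
  assert(isTarget(as) && "expected target address space");
  return static_cast<std::uint32_t>(as) - kTargetOffset;
}

static_assert(!isTarget(AddressSpaceModel::OffloadGeneric),
              "language address spaces sit below the target offset");
static_assert(targetValue(makeTarget(3)) == 3,
              "encoding and decoding must round-trip");
```

The design keeps a single enum for both kinds of address space; the cost is that the `Target` case must remain the last one, which the patch enforces with a `static_assert`.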

Some pending items related to the issue above:

  • Handle addrspace casting/conversions
  • Couple AddressSpaceAttr to global ops
  • Couple AddressSpaceAttr to VPtrs

Patch is 25.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/161028.diff

15 Files Affected:

  • (modified) clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h (+23-2)
  • (modified) clang/include/clang/CIR/Dialect/IR/CIRAttrs.h (+1)
  • (modified) clang/include/clang/CIR/Dialect/IR/CIRAttrs.td (+41)
  • (modified) clang/include/clang/CIR/Dialect/IR/CIREnumAttr.td (+24)
  • (modified) clang/include/clang/CIR/Dialect/IR/CIRTypes.h (+38)
  • (modified) clang/include/clang/CIR/Dialect/IR/CIRTypes.td (+45-17)
  • (modified) clang/lib/CIR/CodeGen/CIRGenExpr.cpp (+2-1)
  • (modified) clang/lib/CIR/CodeGen/CIRGenTypeCache.h (+6)
  • (modified) clang/lib/CIR/CodeGen/CIRGenTypes.cpp (+1-1)
  • (modified) clang/lib/CIR/CodeGen/TargetInfo.h (+2)
  • (modified) clang/lib/CIR/Dialect/IR/CIRAttrs.cpp (+9)
  • (modified) clang/lib/CIR/Dialect/IR/CIRTypes.cpp (+115-18)
  • (modified) clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp (+14-5)
  • (added) clang/test/CIR/IR/invalid-addrspace.cir (+36)
  • (added) clang/test/CIR/address-space.c (+22)
diff --git a/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h b/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
index a3f167e3cde2c..bf4a9b8438982 100644
--- a/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
+++ b/clang/include/clang/CIR/Dialect/Builder/CIRBaseBuilder.h
@@ -129,8 +129,20 @@ class CIRBaseBuilderTy : public mlir::OpBuilder {
     return cir::PointerType::get(ty);
   }
 
-  cir::PointerType getVoidPtrTy() {
-    return getPointerTo(cir::VoidType::get(getContext()));
+  cir::PointerType getPointerTo(mlir::Type ty, cir::AddressSpace as) {
+    return cir::PointerType::get(ty, as);
+  }
+
+  cir::PointerType getPointerTo(mlir::Type ty, clang::LangAS langAS) {
+    return getPointerTo(ty, cir::toCIRAddressSpace(langAS));
+  }
+
+  cir::PointerType getVoidPtrTy(clang::LangAS langAS = clang::LangAS::Default) {
+    return getPointerTo(cir::VoidType::get(getContext()), langAS);
+  }
+
+  cir::PointerType getVoidPtrTy(cir::AddressSpace as) {
+    return getPointerTo(cir::VoidType::get(getContext()), as);
   }
 
   cir::BoolAttr getCIRBoolAttr(bool state) {
@@ -412,6 +424,15 @@ class CIRBaseBuilderTy : public mlir::OpBuilder {
     return createBitcast(src, getPointerTo(newPointeeTy));
   }
 
+  mlir::Value createAddrSpaceCast(mlir::Location loc, mlir::Value src,
+                                  mlir::Type newTy) {
+    return createCast(loc, cir::CastKind::address_space, src, newTy);
+  }
+
+  mlir::Value createAddrSpaceCast(mlir::Value src, mlir::Type newTy) {
+    return createAddrSpaceCast(src.getLoc(), src, newTy);
+  }
+
   //===--------------------------------------------------------------------===//
   // Binary Operators
   //===--------------------------------------------------------------------===//
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRAttrs.h b/clang/include/clang/CIR/Dialect/IR/CIRAttrs.h
index 925a9a87e267f..03a6a97dc8c2e 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIRAttrs.h
+++ b/clang/include/clang/CIR/Dialect/IR/CIRAttrs.h
@@ -15,6 +15,7 @@
 
 #include "mlir/IR/Attributes.h"
 #include "mlir/IR/BuiltinAttributeInterfaces.h"
+#include "clang/Basic/AddressSpaces.h"
 
 #include "clang/CIR/Dialect/IR/CIROpsEnums.h"
 
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td b/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
index f8358de9a1eb9..01dd2106136ab 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIRAttrs.td
@@ -601,6 +601,47 @@ def CIR_VTableAttr : CIR_Attr<"VTable", "vtable", [TypedAttrInterface]> {
   }];
 }
 
+//===----------------------------------------------------------------------===//
+// AddressSpaceAttr
+//===----------------------------------------------------------------------===//
+
+def CIR_AddressSpaceAttr : CIR_EnumAttr<CIR_AddressSpace, "address_space"> {
+  let builders = [AttrBuilder<(ins "clang::LangAS":$langAS), [{
+      return $_get($_ctxt, cir::toCIRAddressSpace(langAS));
+    }]>];
+
+  let assemblyFormat = [{
+    `` custom<AddressSpaceValue>($value)
+  }];
+
+  let defaultValue = "cir::AddressSpace::Default";
+
+  let extraClassDeclaration = [{
+    bool isLang() const;
+    bool isTarget() const;
+    unsigned getTargetValue() const;
+    unsigned getAsUnsignedValue() const;
+  }];
+
+  let extraClassDefinition = [{
+    unsigned $cppClass::getAsUnsignedValue() const {
+      return static_cast<unsigned>(getValue());
+    }
+
+    bool $cppClass::isLang() const {
+      return cir::isLangAddressSpace(getValue());
+    }
+
+    bool $cppClass::isTarget() const {
+      return cir::isTargetAddressSpace(getValue());
+    }
+
+    unsigned $cppClass::getTargetValue() const {
+      return cir::getTargetAddressSpaceValue(getValue());
+    }
+  }];
+}
+
 //===----------------------------------------------------------------------===//
 // ConstComplexAttr
 //===----------------------------------------------------------------------===//
diff --git a/clang/include/clang/CIR/Dialect/IR/CIREnumAttr.td b/clang/include/clang/CIR/Dialect/IR/CIREnumAttr.td
index 98b8a31d2a18a..6566b8e771a75 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIREnumAttr.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIREnumAttr.td
@@ -35,4 +35,28 @@ class CIR_DefaultValuedEnumParameter<EnumAttrInfo info, string value = "">
   let defaultValue = value;
 }
 
+def CIR_AddressSpace
+    : CIR_I32EnumAttr<
+          "AddressSpace", "address space kind",
+          [I32EnumAttrCase<"Default", 0, "default">,
+           I32EnumAttrCase<"OffloadPrivate", 1, "offload_private">,
+           I32EnumAttrCase<"OffloadLocal", 2, "offload_local">,
+           I32EnumAttrCase<"OffloadGlobal", 3, "offload_global">,
+           I32EnumAttrCase<"OffloadConstant", 4, "offload_constant">,
+           I32EnumAttrCase<"OffloadGeneric", 5, "offload_generic">,
+           I32EnumAttrCase<"Target", 6, "target">]> {
+  let description = [{
+    The `address_space` attribute is used to represent address spaces for
+    pointer types in CIR. It provides a unified model on top of `clang::LangAS`
+    and simplifies the representation of address spaces.
+
+    The `value` parameter is an extensible enum, which encodes target address
+    space as an offset to the last language address space. For that reason, the
+    attribute is implemented as custom AddressSpaceAttr, which provides custom
+    printer and parser for the `value` parameter.
+  }];
+
+  let genSpecializedAttr = 0;
+}
+
 #endif // CLANG_CIR_DIALECT_IR_CIRENUMATTR_TD
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRTypes.h b/clang/include/clang/CIR/Dialect/IR/CIRTypes.h
index bfa165cdd945e..6a2b02ce46cd6 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIRTypes.h
+++ b/clang/include/clang/CIR/Dialect/IR/CIRTypes.h
@@ -16,6 +16,8 @@
 #include "mlir/IR/BuiltinAttributes.h"
 #include "mlir/IR/Types.h"
 #include "mlir/Interfaces/DataLayoutInterfaces.h"
+#include "clang/Basic/AddressSpaces.h"
+#include "clang/CIR/Dialect/IR/CIROpsEnums.h"
 #include "clang/CIR/Interfaces/CIRTypeInterfaces.h"
 
 namespace cir {
@@ -35,6 +37,42 @@ bool isValidFundamentalIntWidth(unsigned width);
 /// void, or abstract types.
 bool isSized(mlir::Type ty);
 
+//===----------------------------------------------------------------------===//
+// AddressSpace helpers
+//===----------------------------------------------------------------------===//
+
+cir::AddressSpace toCIRAddressSpace(clang::LangAS langAS);
+
+constexpr unsigned getAsUnsignedValue(cir::AddressSpace as) {
+  return static_cast<unsigned>(as);
+}
+
+inline constexpr unsigned targetAddressSpaceOffset =
+    cir::getMaxEnumValForAddressSpace();
+
+// Target address space is used for target-specific address spaces that are not
+// part of the enum. Its value is represented as an offset from the maximum
+// value of the enum. Make sure that it is always the last enum value.
+static_assert(getAsUnsignedValue(cir::AddressSpace::Target) ==
+                  cir::getMaxEnumValForAddressSpace(),
+              "Target address space must be the last enum value");
+
+constexpr bool isTargetAddressSpace(cir::AddressSpace as) {
+  return getAsUnsignedValue(as) >= cir::getMaxEnumValForAddressSpace();
+}
+
+constexpr bool isLangAddressSpace(cir::AddressSpace as) {
+  return !isTargetAddressSpace(as);
+}
+
+constexpr unsigned getTargetAddressSpaceValue(cir::AddressSpace as) {
+  assert(isTargetAddressSpace(as) && "expected target address space");
+  return getAsUnsignedValue(as) - targetAddressSpaceOffset;
+}
+
+constexpr cir::AddressSpace computeTargetAddressSpace(unsigned v) {
+  return static_cast<cir::AddressSpace>(v + targetAddressSpaceOffset);
+}
 } // namespace cir
 
 //===----------------------------------------------------------------------===//
diff --git a/clang/include/clang/CIR/Dialect/IR/CIRTypes.td b/clang/include/clang/CIR/Dialect/IR/CIRTypes.td
index 4eec34cb299ab..13edfc5143650 100644
--- a/clang/include/clang/CIR/Dialect/IR/CIRTypes.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIRTypes.td
@@ -14,10 +14,12 @@
 #define CLANG_CIR_DIALECT_IR_CIRTYPES_TD
 
 include "clang/CIR/Dialect/IR/CIRDialect.td"
+include "clang/CIR/Dialect/IR/CIREnumAttr.td"
 include "clang/CIR/Dialect/IR/CIRTypeConstraints.td"
 include "clang/CIR/Interfaces/CIRTypeInterfaces.td"
 include "mlir/Interfaces/DataLayoutInterfaces.td"
 include "mlir/IR/AttrTypeBase.td"
+include "mlir/IR/EnumAttr.td"
 
 //===----------------------------------------------------------------------===//
 // CIR Types
@@ -226,31 +228,57 @@ def CIR_PointerType : CIR_Type<"Pointer", "ptr", [
 ]> {
   let summary = "CIR pointer type";
   let description = [{
-    The `!cir.ptr` type represents C and C++ pointer types and C++ reference
-    types, other than pointers-to-members.  The `pointee` type is the type
-    pointed to.
+    The `!cir.ptr` type is a typed pointer type. It is used to represent
+    pointers to objects in C/C++. The type of the pointed-to object is given by
+    the `pointee` parameter. The `addrSpace` parameter is an optional address
+    space attribute that specifies the address space of the pointer. If not
+    specified, the pointer is assumed to be in the default address space.
 
-    TODO(CIR): The address space attribute is not yet implemented.
-  }];
+    The `!cir.ptr` type can point to any type, including fundamental types,
+    records, arrays, vectors, functions, and other pointers. It can also point
+    to incomplete types, such as incomplete records.
 
-  let parameters = (ins "mlir::Type":$pointee);
+    Note: Data-member pointers and method pointers are represented by
+    `!cir.data_member` and `!cir.method` types, respectively not by
+    `!cir.ptr` type.
 
-  let builders = [
-    TypeBuilderWithInferredContext<(ins "mlir::Type":$pointee), [{
-      return $_get(pointee.getContext(), pointee);
-    }]>,
-    TypeBuilder<(ins "mlir::Type":$pointee), [{
-      return $_get($_ctxt, pointee);
-    }]>
-  ];
+    Examples:
 
-  let assemblyFormat = [{
-    `<` $pointee  `>`
+    ```mlir
+    !cir.ptr<!cir.int<u, 8>>
+    !cir.ptr<!cir.float>
+    !cir.ptr<!cir.record<struct "MyStruct">>
+    !cir.ptr<!cir.record<struct "MyStruct">, addrspace(offload_private)>
+    !cir.ptr<!cir.int<u, 8>, addrspace(target<1>)>
+    ```
   }];
 
-  let genVerifyDecl = 1;
+  let parameters = (ins "mlir::Type":$pointee,
+      CIR_DefaultValuedEnumParameter<CIR_AddressSpace,
+                                     "cir::AddressSpace::Default">:$addrSpace);
 
   let skipDefaultBuilders = 1;
+  let builders = [TypeBuilderWithInferredContext<
+                      (ins "mlir::Type":$pointee,
+                          CArg<"cir::AddressSpace",
+                               "cir::AddressSpace::Default">:$addrSpace),
+                      [{
+        return $_get(pointee.getContext(), pointee, addrSpace);
+    }]>,
+                  TypeBuilder<
+                      (ins "mlir::Type":$pointee,
+                          CArg<"cir::AddressSpace",
+                               "cir::AddressSpace::Default">:$addrSpace),
+                      [{
+        return $_get($_ctxt, pointee, addrSpace);
+    }]>];
+
+  let assemblyFormat = [{
+    `<`
+      $pointee
+      ( `,` `addrspace` `(` custom<AddressSpaceValue>($addrSpace)^ `)` )?
+    `>`
+  }];
 
   let extraClassDeclaration = [{
     template <typename ...Types>
diff --git a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
index fa68ad931ba74..44626bbdd1dfa 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
@@ -2053,7 +2053,8 @@ mlir::Value CIRGenFunction::emitAlloca(StringRef name, mlir::Type ty,
   // layout like original CodeGen. The data layout awareness should be done in
   // the lowering pass instead.
   assert(!cir::MissingFeatures::addressSpace());
-  cir::PointerType localVarPtrTy = builder.getPointerTo(ty);
+  cir::PointerType localVarPtrTy =
+      builder.getPointerTo(ty, getCIRAllocaAddressSpace());
   mlir::IntegerAttr alignIntAttr = cgm.getSize(alignment);
 
   mlir::Value addr;
diff --git a/clang/lib/CIR/CodeGen/CIRGenTypeCache.h b/clang/lib/CIR/CodeGen/CIRGenTypeCache.h
index cc3ce09be4f95..b95f0404eb8d9 100644
--- a/clang/lib/CIR/CodeGen/CIRGenTypeCache.h
+++ b/clang/lib/CIR/CodeGen/CIRGenTypeCache.h
@@ -73,6 +73,8 @@ struct CIRGenTypeCache {
   /// The alignment of size_t.
   unsigned char SizeAlignInBytes;
 
+  cir::AddressSpace cirAllocaAddressSpace;
+
   clang::CharUnits getSizeAlign() const {
     return clang::CharUnits::fromQuantity(SizeAlignInBytes);
   }
@@ -80,6 +82,10 @@ struct CIRGenTypeCache {
   clang::CharUnits getPointerAlign() const {
     return clang::CharUnits::fromQuantity(PointerAlignInBytes);
   }
+
+  cir::AddressSpace getCIRAllocaAddressSpace() const {
+    return cirAllocaAddressSpace;
+  }
 };
 
 } // namespace clang::CIRGen
diff --git a/clang/lib/CIR/CodeGen/CIRGenTypes.cpp b/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
index bb24933a22ed7..e65896a9ff109 100644
--- a/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
@@ -417,7 +417,7 @@ mlir::Type CIRGenTypes::convertType(QualType type) {
 
     mlir::Type pointeeType = convertType(elemTy);
 
-    resultType = builder.getPointerTo(pointeeType);
+    resultType = builder.getPointerTo(pointeeType, elemTy.getAddressSpace());
     break;
   }
 
diff --git a/clang/lib/CIR/CodeGen/TargetInfo.h b/clang/lib/CIR/CodeGen/TargetInfo.h
index a5c548aa2c7c4..1c3ba0b9971b3 100644
--- a/clang/lib/CIR/CodeGen/TargetInfo.h
+++ b/clang/lib/CIR/CodeGen/TargetInfo.h
@@ -32,6 +32,8 @@ bool isEmptyFieldForLayout(const ASTContext &context, const FieldDecl *fd);
 /// if the [[no_unique_address]] attribute would have made them empty.
 bool isEmptyRecordForLayout(const ASTContext &context, QualType t);
 
+class CIRGenFunction;
+
 class TargetCIRGenInfo {
   std::unique_ptr<ABIInfo> info;
 
diff --git a/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp b/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
index 95faad6746955..ac3e08c880614 100644
--- a/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
+++ b/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
@@ -43,6 +43,15 @@ parseFloatLiteral(mlir::AsmParser &parser,
                   mlir::FailureOr<llvm::APFloat> &value,
                   cir::FPTypeInterface fpType);
 
+//===----------------------------------------------------------------------===//
+// AddressSpaceAttr
+//===----------------------------------------------------------------------===//
+
+mlir::ParseResult parseAddressSpaceValue(mlir::AsmParser &p,
+                                         cir::AddressSpace &addrSpace);
+
+void printAddressSpaceValue(mlir::AsmPrinter &p, cir::AddressSpace addrSpace);
+
 static mlir::ParseResult parseConstPtr(mlir::AsmParser &parser,
                                        mlir::IntegerAttr &value);
 
diff --git a/clang/lib/CIR/Dialect/IR/CIRTypes.cpp b/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
index 35b4513c5789f..32254fb1e21c1 100644
--- a/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
+++ b/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
@@ -38,6 +38,26 @@ parseFuncTypeParams(mlir::AsmParser &p, llvm::SmallVector<mlir::Type> &params,
 static void printFuncTypeParams(mlir::AsmPrinter &p,
                                 mlir::ArrayRef<mlir::Type> params,
                                 bool isVarArg);
+//===----------------------------------------------------------------------===//
+// CIR Custom Parser/Printer Signatures
+//===----------------------------------------------------------------------===//
+
+static mlir::ParseResult
+parseFuncTypeParams(mlir::AsmParser &p, llvm::SmallVector<mlir::Type> &params,
+                    bool &isVarArg);
+
+static void printFuncTypeParams(mlir::AsmPrinter &p,
+                                mlir::ArrayRef<mlir::Type> params,
+                                bool isVarArg);
+
+//===----------------------------------------------------------------------===//
+// AddressSpace
+//===----------------------------------------------------------------------===//
+
+mlir::ParseResult parseAddressSpaceValue(mlir::AsmParser &p,
+                                         cir::AddressSpace &addrSpace);
+
+void printAddressSpaceValue(mlir::AsmPrinter &p, cir::AddressSpace addrSpace);
 
 //===----------------------------------------------------------------------===//
 // Get autogenerated stuff
@@ -297,6 +317,20 @@ bool RecordType::isLayoutIdentical(const RecordType &other) {
 // Data Layout information for types
 //===----------------------------------------------------------------------===//
 
+llvm::TypeSize
+PointerType::getTypeSizeInBits(const ::mlir::DataLayout &dataLayout,
+                               ::mlir::DataLayoutEntryListRef params) const {
+  // FIXME: improve this in face of address spaces
+  return llvm::TypeSize::getFixed(64);
+}
+
+uint64_t
+PointerType::getABIAlignment(const ::mlir::DataLayout &dataLayout,
+                             ::mlir::DataLayoutEntryListRef params) const {
+  // FIXME: improve this in face of address spaces
+  return 8;
+}
+
 llvm::TypeSize
 RecordType::getTypeSizeInBits(const mlir::DataLayout &dataLayout,
                               mlir::DataLayoutEntryListRef params) const {
@@ -766,30 +800,93 @@ mlir::LogicalResult cir::VectorType::verify(
 }
 
 //===----------------------------------------------------------------------===//
-// PointerType Definitions
-//===----------------------------------------------------------------------===//
-
-llvm::TypeSize
-PointerType::getTypeSizeInBits(const ::mlir::DataLayout &dataLayout,
-                               ::mlir::DataLayoutEntryListRef params) const {
-  // FIXME: improve this in face of address spaces
-  return llvm::TypeSize::getFixed(64);
+// AddressSpace definitions
+//===----------------------------------------------------------------------===//
+
+cir::AddressSpace cir::toCIRAddressSpace(clang::LangAS langAS) {
+  using clang::LangAS;
+  switch (langAS) {
+  case LangAS::Default:
+    return AddressSpace::Default;
+  case LangAS::opencl_global:
+    return AddressSpace::OffloadGlobal;
+  case LangAS::opencl_local:
+  case LangAS::cuda_shared:
+    // Local means local among the work-group (OpenCL) or block (CUDA).
+    // All threads inside the kernel can access local memory.
+    return AddressSpace::OffloadLocal;
+  case LangAS::cuda_device:
+    return AddressSpace::OffloadGlobal;
+  case LangAS::opencl_constant:
+  case LangAS::cuda_constant:
+    return AddressSpace::OffloadConstant;
+  case LangAS::opencl_private:
+    return AddressSpace::OffloadPrivate;
+  case LangAS::opencl_generic:
+    return AddressSpace::OffloadGeneric;
+  case LangAS::opencl_global_device:
+  case LangAS::opencl_global_host:
+  case LangAS::sycl_global:
+  case LangAS::sycl_global_device:
+  case LangAS::sycl_global_host:
+  case LangAS::sycl_local:
+  case LangAS::sycl_private:
+  case LangAS::ptr32_sptr:
+  case LangAS::ptr32_uptr:
+  case LangAS::ptr64:
+  case LangAS::hlsl_groupshared:
+  case LangAS::wasm_funcref:
+    llvm_unreachable("NYI");
+  default:
+    // Target address space offset arithmetics
+    return static_cast<cir::AddressSpace>(clang::toTargetAddressSpace(langAS) +
+                                          cir::getMaxEnumValForAddressSpace());
+  }
 }
 
-uint64_t
-PointerType::getABIAlignment(const ::mlir::DataLayout &dataLayout,
-                             ::mlir::DataLayoutEntryListRef params) const {
-  // FIXME: improve this in face of address spaces
-  return 8;
-}
+mlir::ParseResult parseAddressSpaceValue(mlir::AsmParser &p,
+                                         cir::AddressSpace &addrSpace) {
+  llvm::SMLoc loc = p.getCurrentLocation();
+  mlir::FailureOr<cir::AddressSpace> result =
+      mlir::FieldParser<cir::AddressSpace>::parse(p);
+  if (mlir::failed(result))
+    return p.emitError(loc, "expected address space keyword");
+
+  // Address space is either a target address space or a regular one.
+  // - If it is a target address space, we expect a value to follow in the form
+  // of `<value>`, where value is an integer that represents the target address
+  // space value. This value is kept in the address space enum as an offset
+  // from the maximum address space value, which is defined in
+  // `cir::getMaxEnumValForAddressSpace()`. This allows us to use
+  // the same enum for both regular and target address spaces.
+  // - Otherwise, we just use the parsed value.
+  if (cir::isTargetAddres...
[truncated]
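
The truncated parser comment above describes the textual form: a keyword, followed by `<value>` when the keyword is `target`. As a rough standalone illustration of that round trip (plain C++ strings rather than the MLIR `AsmParser`/`AsmPrinter` used by the patch; `printAddrSpace` and `parseAddrSpace` are hypothetical helpers), one could write:

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical model: values 0..5 are the language address spaces, and any
// value >= 6 encodes target<N> as 6 + N, following the enum in the patch.
constexpr std::uint32_t kTargetBase = 6;
static const char *const kLangNames[] = {
    "default",        "offload_private",  "offload_local",
    "offload_global", "offload_constant", "offload_generic"};

// Print either a language keyword or `target<N>`.
std::string printAddrSpace(std::uint32_t as) {
  if (as < kTargetBase)
    return kLangNames[as];
  return "target<" + std::to_string(as - kTargetBase) + ">";
}

// Parse the textual form back into the encoded value. Returns std::nullopt
// on malformed input. This sketch does not check for integer overflow.
std::optional<std::uint32_t> parseAddrSpace(const std::string &text) {
  for (std::uint32_t i = 0; i < kTargetBase; ++i)
    if (text == kLangNames[i])
      return i;

  const std::string prefix = "target<";
  if (text.size() <= prefix.size() + 1 ||
      text.compare(0, prefix.size(), prefix) != 0 || text.back() != '>')
    return std::nullopt;

  std::uint32_t n = 0;
  for (std::size_t i = prefix.size(); i + 1 < text.size(); ++i) {
    char c = text[i];
    if (c < '0' || c > '9')
      return std::nullopt;
    n = n * 10 + static_cast<std::uint32_t>(c - '0');
  }
  return kTargetBase + n;
}
```

In this model a bare `target` without a bracketed number is rejected, which matches the comment's statement that a value is expected to follow a target address space keyword.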

llvmbot (Member) commented Sep 28, 2025

@llvm/pr-subscribers-clang

Author: David Rivera (RiverDave)

--- a/clang/include/clang/CIR/Dialect/IR/CIRTypes.td
+++ b/clang/include/clang/CIR/Dialect/IR/CIRTypes.td
@@ -14,10 +14,12 @@
 #define CLANG_CIR_DIALECT_IR_CIRTYPES_TD
 
 include "clang/CIR/Dialect/IR/CIRDialect.td"
+include "clang/CIR/Dialect/IR/CIREnumAttr.td"
 include "clang/CIR/Dialect/IR/CIRTypeConstraints.td"
 include "clang/CIR/Interfaces/CIRTypeInterfaces.td"
 include "mlir/Interfaces/DataLayoutInterfaces.td"
 include "mlir/IR/AttrTypeBase.td"
+include "mlir/IR/EnumAttr.td"
 
 //===----------------------------------------------------------------------===//
 // CIR Types
@@ -226,31 +228,57 @@ def CIR_PointerType : CIR_Type<"Pointer", "ptr", [
 ]> {
   let summary = "CIR pointer type";
   let description = [{
-    The `!cir.ptr` type represents C and C++ pointer types and C++ reference
-    types, other than pointers-to-members.  The `pointee` type is the type
-    pointed to.
+    The `!cir.ptr` type is a typed pointer type. It is used to represent
+    pointers to objects in C/C++. The type of the pointed-to object is given by
+    the `pointee` parameter. The `addrSpace` parameter is an optional address
+    space attribute that specifies the address space of the pointer. If not
+    specified, the pointer is assumed to be in the default address space.
 
-    TODO(CIR): The address space attribute is not yet implemented.
-  }];
+    The `!cir.ptr` type can point to any type, including fundamental types,
+    records, arrays, vectors, functions, and other pointers. It can also point
+    to incomplete types, such as incomplete records.
 
-  let parameters = (ins "mlir::Type":$pointee);
+    Note: Data-member pointers and method pointers are represented by
+    `!cir.data_member` and `!cir.method` types, respectively not by
+    `!cir.ptr` type.
 
-  let builders = [
-    TypeBuilderWithInferredContext<(ins "mlir::Type":$pointee), [{
-      return $_get(pointee.getContext(), pointee);
-    }]>,
-    TypeBuilder<(ins "mlir::Type":$pointee), [{
-      return $_get($_ctxt, pointee);
-    }]>
-  ];
+    Examples:
 
-  let assemblyFormat = [{
-    `<` $pointee  `>`
+    ```mlir
+    !cir.ptr<!cir.int<u, 8>>
+    !cir.ptr<!cir.float>
+    !cir.ptr<!cir.record<struct "MyStruct">>
+    !cir.ptr<!cir.record<struct "MyStruct">, addrspace(offload_private)>
+    !cir.ptr<!cir.int<u, 8>, addrspace(target<1>)>
+    ```
   }];
 
-  let genVerifyDecl = 1;
+  let parameters = (ins "mlir::Type":$pointee,
+      CIR_DefaultValuedEnumParameter<CIR_AddressSpace,
+                                     "cir::AddressSpace::Default">:$addrSpace);
 
   let skipDefaultBuilders = 1;
+  let builders = [TypeBuilderWithInferredContext<
+                      (ins "mlir::Type":$pointee,
+                          CArg<"cir::AddressSpace",
+                               "cir::AddressSpace::Default">:$addrSpace),
+                      [{
+        return $_get(pointee.getContext(), pointee, addrSpace);
+    }]>,
+                  TypeBuilder<
+                      (ins "mlir::Type":$pointee,
+                          CArg<"cir::AddressSpace",
+                               "cir::AddressSpace::Default">:$addrSpace),
+                      [{
+        return $_get($_ctxt, pointee, addrSpace);
+    }]>];
+
+  let assemblyFormat = [{
+    `<`
+      $pointee
+      ( `,` `addrspace` `(` custom<AddressSpaceValue>($addrSpace)^ `)` )?
+    `>`
+  }];
 
   let extraClassDeclaration = [{
     template <typename ...Types>
diff --git a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
index fa68ad931ba74..44626bbdd1dfa 100644
--- a/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenExpr.cpp
@@ -2053,7 +2053,8 @@ mlir::Value CIRGenFunction::emitAlloca(StringRef name, mlir::Type ty,
   // layout like original CodeGen. The data layout awareness should be done in
   // the lowering pass instead.
   assert(!cir::MissingFeatures::addressSpace());
-  cir::PointerType localVarPtrTy = builder.getPointerTo(ty);
+  cir::PointerType localVarPtrTy =
+      builder.getPointerTo(ty, getCIRAllocaAddressSpace());
   mlir::IntegerAttr alignIntAttr = cgm.getSize(alignment);
 
   mlir::Value addr;
diff --git a/clang/lib/CIR/CodeGen/CIRGenTypeCache.h b/clang/lib/CIR/CodeGen/CIRGenTypeCache.h
index cc3ce09be4f95..b95f0404eb8d9 100644
--- a/clang/lib/CIR/CodeGen/CIRGenTypeCache.h
+++ b/clang/lib/CIR/CodeGen/CIRGenTypeCache.h
@@ -73,6 +73,8 @@ struct CIRGenTypeCache {
   /// The alignment of size_t.
   unsigned char SizeAlignInBytes;
 
+  cir::AddressSpace cirAllocaAddressSpace;
+
   clang::CharUnits getSizeAlign() const {
     return clang::CharUnits::fromQuantity(SizeAlignInBytes);
   }
@@ -80,6 +82,10 @@ struct CIRGenTypeCache {
   clang::CharUnits getPointerAlign() const {
     return clang::CharUnits::fromQuantity(PointerAlignInBytes);
   }
+
+  cir::AddressSpace getCIRAllocaAddressSpace() const {
+    return cirAllocaAddressSpace;
+  }
 };
 
 } // namespace clang::CIRGen
diff --git a/clang/lib/CIR/CodeGen/CIRGenTypes.cpp b/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
index bb24933a22ed7..e65896a9ff109 100644
--- a/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
+++ b/clang/lib/CIR/CodeGen/CIRGenTypes.cpp
@@ -417,7 +417,7 @@ mlir::Type CIRGenTypes::convertType(QualType type) {
 
     mlir::Type pointeeType = convertType(elemTy);
 
-    resultType = builder.getPointerTo(pointeeType);
+    resultType = builder.getPointerTo(pointeeType, elemTy.getAddressSpace());
     break;
   }
 
diff --git a/clang/lib/CIR/CodeGen/TargetInfo.h b/clang/lib/CIR/CodeGen/TargetInfo.h
index a5c548aa2c7c4..1c3ba0b9971b3 100644
--- a/clang/lib/CIR/CodeGen/TargetInfo.h
+++ b/clang/lib/CIR/CodeGen/TargetInfo.h
@@ -32,6 +32,8 @@ bool isEmptyFieldForLayout(const ASTContext &context, const FieldDecl *fd);
 /// if the [[no_unique_address]] attribute would have made them empty.
 bool isEmptyRecordForLayout(const ASTContext &context, QualType t);
 
+class CIRGenFunction;
+
 class TargetCIRGenInfo {
   std::unique_ptr<ABIInfo> info;
 
diff --git a/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp b/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
index 95faad6746955..ac3e08c880614 100644
--- a/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
+++ b/clang/lib/CIR/Dialect/IR/CIRAttrs.cpp
@@ -43,6 +43,15 @@ parseFloatLiteral(mlir::AsmParser &parser,
                   mlir::FailureOr<llvm::APFloat> &value,
                   cir::FPTypeInterface fpType);
 
+//===----------------------------------------------------------------------===//
+// AddressSpaceAttr
+//===----------------------------------------------------------------------===//
+
+mlir::ParseResult parseAddressSpaceValue(mlir::AsmParser &p,
+                                         cir::AddressSpace &addrSpace);
+
+void printAddressSpaceValue(mlir::AsmPrinter &p, cir::AddressSpace addrSpace);
+
 static mlir::ParseResult parseConstPtr(mlir::AsmParser &parser,
                                        mlir::IntegerAttr &value);
 
diff --git a/clang/lib/CIR/Dialect/IR/CIRTypes.cpp b/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
index 35b4513c5789f..32254fb1e21c1 100644
--- a/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
+++ b/clang/lib/CIR/Dialect/IR/CIRTypes.cpp
@@ -38,6 +38,26 @@ parseFuncTypeParams(mlir::AsmParser &p, llvm::SmallVector<mlir::Type> &params,
 static void printFuncTypeParams(mlir::AsmPrinter &p,
                                 mlir::ArrayRef<mlir::Type> params,
                                 bool isVarArg);
+//===----------------------------------------------------------------------===//
+// CIR Custom Parser/Printer Signatures
+//===----------------------------------------------------------------------===//
+
+static mlir::ParseResult
+parseFuncTypeParams(mlir::AsmParser &p, llvm::SmallVector<mlir::Type> &params,
+                    bool &isVarArg);
+
+static void printFuncTypeParams(mlir::AsmPrinter &p,
+                                mlir::ArrayRef<mlir::Type> params,
+                                bool isVarArg);
+
+//===----------------------------------------------------------------------===//
+// AddressSpace
+//===----------------------------------------------------------------------===//
+
+mlir::ParseResult parseAddressSpaceValue(mlir::AsmParser &p,
+                                         cir::AddressSpace &addrSpace);
+
+void printAddressSpaceValue(mlir::AsmPrinter &p, cir::AddressSpace addrSpace);
 
 //===----------------------------------------------------------------------===//
 // Get autogenerated stuff
@@ -297,6 +317,20 @@ bool RecordType::isLayoutIdentical(const RecordType &other) {
 // Data Layout information for types
 //===----------------------------------------------------------------------===//
 
+llvm::TypeSize
+PointerType::getTypeSizeInBits(const ::mlir::DataLayout &dataLayout,
+                               ::mlir::DataLayoutEntryListRef params) const {
+  // FIXME: improve this in face of address spaces
+  return llvm::TypeSize::getFixed(64);
+}
+
+uint64_t
+PointerType::getABIAlignment(const ::mlir::DataLayout &dataLayout,
+                             ::mlir::DataLayoutEntryListRef params) const {
+  // FIXME: improve this in face of address spaces
+  return 8;
+}
+
 llvm::TypeSize
 RecordType::getTypeSizeInBits(const mlir::DataLayout &dataLayout,
                               mlir::DataLayoutEntryListRef params) const {
@@ -766,30 +800,93 @@ mlir::LogicalResult cir::VectorType::verify(
 }
 
 //===----------------------------------------------------------------------===//
-// PointerType Definitions
-//===----------------------------------------------------------------------===//
-
-llvm::TypeSize
-PointerType::getTypeSizeInBits(const ::mlir::DataLayout &dataLayout,
-                               ::mlir::DataLayoutEntryListRef params) const {
-  // FIXME: improve this in face of address spaces
-  return llvm::TypeSize::getFixed(64);
+// AddressSpace definitions
+//===----------------------------------------------------------------------===//
+
+cir::AddressSpace cir::toCIRAddressSpace(clang::LangAS langAS) {
+  using clang::LangAS;
+  switch (langAS) {
+  case LangAS::Default:
+    return AddressSpace::Default;
+  case LangAS::opencl_global:
+    return AddressSpace::OffloadGlobal;
+  case LangAS::opencl_local:
+  case LangAS::cuda_shared:
+    // Local means local among the work-group (OpenCL) or block (CUDA).
+    // All threads inside the kernel can access local memory.
+    return AddressSpace::OffloadLocal;
+  case LangAS::cuda_device:
+    return AddressSpace::OffloadGlobal;
+  case LangAS::opencl_constant:
+  case LangAS::cuda_constant:
+    return AddressSpace::OffloadConstant;
+  case LangAS::opencl_private:
+    return AddressSpace::OffloadPrivate;
+  case LangAS::opencl_generic:
+    return AddressSpace::OffloadGeneric;
+  case LangAS::opencl_global_device:
+  case LangAS::opencl_global_host:
+  case LangAS::sycl_global:
+  case LangAS::sycl_global_device:
+  case LangAS::sycl_global_host:
+  case LangAS::sycl_local:
+  case LangAS::sycl_private:
+  case LangAS::ptr32_sptr:
+  case LangAS::ptr32_uptr:
+  case LangAS::ptr64:
+  case LangAS::hlsl_groupshared:
+  case LangAS::wasm_funcref:
+    llvm_unreachable("NYI");
+  default:
+    // Target address space offset arithmetics
+    return static_cast<cir::AddressSpace>(clang::toTargetAddressSpace(langAS) +
+                                          cir::getMaxEnumValForAddressSpace());
+  }
 }
 
-uint64_t
-PointerType::getABIAlignment(const ::mlir::DataLayout &dataLayout,
-                             ::mlir::DataLayoutEntryListRef params) const {
-  // FIXME: improve this in face of address spaces
-  return 8;
-}
+mlir::ParseResult parseAddressSpaceValue(mlir::AsmParser &p,
+                                         cir::AddressSpace &addrSpace) {
+  llvm::SMLoc loc = p.getCurrentLocation();
+  mlir::FailureOr<cir::AddressSpace> result =
+      mlir::FieldParser<cir::AddressSpace>::parse(p);
+  if (mlir::failed(result))
+    return p.emitError(loc, "expected address space keyword");
+
+  // Address space is either a target address space or a regular one.
+  // - If it is a target address space, we expect a value to follow in the form
+  // of `<value>`, where value is an integer that represents the target address
+  // space value. This value is kept in the address space enum as an offset
+  // from the maximum address space value, which is defined in
+  // `cir::getMaxEnumValForAddressSpace()`. This allows us to use
+  // the same enum for both regular and target address spaces.
+  // - Otherwise, we just use the parsed value.
+  if (cir::isTargetAddres...
[truncated]
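
The offset encoding that the helpers in the diff above rely on can be sketched in standalone C++. This is only an illustration, assuming a local `AddressSpace` enum that mirrors the `I32EnumAttrCase` list and a hypothetical `kTargetOffset` constant standing in for `cir::getMaxEnumValForAddressSpace()`; it is not the code from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the generated cir::AddressSpace enum. The values mirror the
// I32EnumAttrCase list in the diff, with Target as the last named case.
enum class AddressSpace : std::uint32_t {
  Default = 0,
  OffloadPrivate = 1,
  OffloadLocal = 2,
  OffloadGlobal = 3,
  OffloadConstant = 4,
  OffloadGeneric = 5,
  Target = 6,
};

// Hypothetical constant playing the role of
// cir::getMaxEnumValForAddressSpace() in the diff.
constexpr std::uint32_t kTargetOffset =
    static_cast<std::uint32_t>(AddressSpace::Target);

// Any value at or above the offset is treated as a target address space.
constexpr bool isTargetAddressSpace(AddressSpace as) {
  return static_cast<std::uint32_t>(as) >= kTargetOffset;
}

// Encode target address space n as (n + offset).
constexpr AddressSpace computeTargetAddressSpace(std::uint32_t n) {
  return static_cast<AddressSpace>(n + kTargetOffset);
}

// Decode by subtracting the offset again. The caller is expected to have
// checked isTargetAddressSpace(as) first.
constexpr std::uint32_t getTargetAddressSpaceValue(AddressSpace as) {
  return static_cast<std::uint32_t>(as) - kTargetOffset;
}

static_assert(!isTargetAddressSpace(AddressSpace::OffloadGlobal),
              "language address spaces sit below the offset");
static_assert(computeTargetAddressSpace(0) == AddressSpace::Target,
              "target<0> coincides with the last named enum case");
static_assert(getTargetAddressSpaceValue(computeTargetAddressSpace(3)) == 3,
              "encoding and decoding round-trip");
```

Because the enum has a fixed underlying type, values past the last named case are representable, which is what lets one enum carry both named and numeric address spaces.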

@RiverDave RiverDave changed the title [CIR] Upstream AddressSpaceAttr support for PointerType [CIR] Upstream AddressSpace support for PointerType Sep 28, 2025
@RiverDave RiverDave force-pushed the users/riverdave/cir/addrspace-support-for-cir-ptr branch 2 times, most recently from 42940a5 to 91950f7 Compare September 28, 2025 04:55
Contributor

@xlauko xlauko left a comment

I was changing address spaces recently in llvm/clangir#1923;
please fix it accordingly.

Contributor

@andykaylor andykaylor left a comment

Thanks for working on this!

I have a couple of questions about the "unified" enum that probably require input from @bcardosolopes. Otherwise, mostly nits.

@xlauko
Contributor

xlauko commented Sep 30, 2025

@bcardosolopes @andykaylor In general it might be worthwhile to align more with the ptr dialect and its modeling of address spaces. Maybe even use MemorySpaceAttrInterface directly in our pointer type.

@andykaylor
Contributor

> @bcardosolopes @andykaylor In general it might be worthwhile to align more with the ptr dialect and its modeling of address spaces. Maybe even use MemorySpaceAttrInterface directly in our pointer type.

I definitely like the idea of aligning better with other dialects, but it's not clear to me what that would mean. A few minutes browsing various in-tree dialects left me confused.

I see that there's an LLVM AddressSpaceAttr that uses the MemorySpaceAttrInterface, but I only see this attribute attached to ptr.ptr in tests in the form !ptr.ptr<#llvm.address_space<0>>. If I'm reading this correctly, that's defining an opaque pointer in address space zero. This looks like what we're doing with target<n> address spaces.

But I also see that the GPU and AMDGPU dialects have their own address space attributes, which look a bit closer to what we're doing currently with "language" address spaces. It's not clear to me how we'd transition from CIR to these. GPU has an enum defining global, workgroup, and private. We have global and private, but it's not clear to me which of our other options would map to workgroup. For AMDGPU there are fat_raw_buffer, buffer_rsrc, and fat_structured_buffer. I don't even have a guess there.

@xlauko
Contributor

xlauko commented Sep 30, 2025

> > @bcardosolopes @andykaylor In general it might be worthwhile to align more with the ptr dialect and its modeling of address spaces. Maybe even use MemorySpaceAttrInterface directly in our pointer type.
>
> I definitely like the idea of aligning better with other dialects, but it's not clear to me what that would mean. A few minutes browsing various in-tree dialects left me confused.
>
> I see that there's an LLVM AddressSpaceAttr that uses the MemorySpaceAttrInterface, but I only see this attribute attached to ptr.ptr in tests in the form !ptr.ptr<#llvm.address_space<0>>. If I'm reading this correctly, that's defining an opaque pointer in address space zero. This looks like what we're doing with target<n> address spaces.
>
> But I also see that the GPU and AMDGPU dialects have their own address space attributes, which look a bit closer to what we're doing currently with "language" address spaces. It's not clear to me how we'd transition from CIR to these. GPU has an enum defining global, workgroup, and private. We have global and private, but it's not clear to me which of our other options would map to workgroup. For AMDGPU there are fat_raw_buffer, buffer_rsrc, and fat_structured_buffer. I don't even have a guess there.

The pointer dialect started as an extraction of the pointer representation from the LLVM dialect, since it might be useful for other dialects too. See the RFC.
So, as we are ultimately lowering to LLVM IR, we should be able to easily map to the pointer dialect representation.
I've seen there have been a lot of updates to the dialect recently, so I am uncertain about how mature things are. Maybe @fabianmcg or @joker-eph might have better insight?

@fabianmcg
Contributor

> I've seen there have been a lot of updates to the dialect recently, so I am uncertain about how mature things are. Maybe @fabianmcg or @joker-eph might have better insight?

I'll answer in reverse order: it's not yet mature. I expect it to be mature by the end of the year, though.
However, I would recommend adopting PtrLikeTypeInterface now.

I do plan to push MemorySpaceAttrInterface adoption into more memory space attributes, for example GPU address spaces. However, that's just for interacting with the pointer dialect and its operations.

Therefore, the only case where CIR would benefit from implementing the interface is if at some point you want to convert CIR ops to ptr but preserve CIR memory space semantics. That is, something like:

%v = ptr.load %ptr : !ptr.ptr<#cir.memory_space<f32>> -> f32

> I see that there's an LLVM AddressSpaceAttr that uses the MemorySpaceAttrInterface, but I only see this attribute attached to ptr.ptr in tests in the form !ptr.ptr<#llvm.address_space<0>>. If I'm reading this correctly, that's defining an opaque pointer in address space zero. This looks like what we're doing with target<n> address spaces.

The original plan is to eventually remove !llvm.ptr<n> in favor of !ptr.ptr<#llvm.address_space<n>>, but that transition is not happening soon. Also, the plan is to make the transition almost invisible to users of the type.

But it would mean that in the end cir would end up converting to !ptr.ptr under the hood.

> But I also see that the GPU and AMDGPU dialects have their own address space attributes, which look a bit closer to what we're doing currently with "language" address spaces. It's not clear to me how we'd transition from CIR to these. GPU has an enum defining global, workgroup, and private. We have global and private, but it's not clear to me which of our other options would map to workgroup. For AMDGPU there are fat_raw_buffer, buffer_rsrc, and fat_structured_buffer. I don't even have a guess there.

The type converter can handle that conversion. For example, the #gpu.address_space<global> memory space will get converted to 1 on conversion to LLVM:

@kernel_1(%arg0 : memref<f32, #gpu.address_space<global>>)
// After signature conversion
@kernel_1 (%arg0: !llvm.ptr<1>, %arg1: !llvm.ptr<1>, %arg2: i64)

So, it's perfectly doable to convert some of the memory spaces in cir to gpu memory spaces.

@fabianmcg
Contributor

fabianmcg commented Sep 30, 2025

Finally, leaving a small comment/suggestion on the PR.

I don't think it is recommended to do what this PR is doing with target<n>, as it's abusing the enum storage.

Instead, you likely want 2 address space attributes:

  • !cir.ptr<!s32i, clang_addrspace(offload_private)>
  • !cir.ptr<!s32i, target_addresspace(0)>

Where the second attribute takes an int as a parameter.

It's also worth saying that in general numeric memory spaces are somewhat discouraged in MLIR except when they appear in the LLVM dialect. That's why in ptr you have !ptr.ptr<#ptr.generic_space> instead of !ptr.ptr<0>

So, unless clang is using the numeric representation in the AST for some memory spaces, and there's no way of providing a name for those, I would discourage target<n> altogether.
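
The two-attribute split suggested above can be pictured with a small standalone C++ sketch. Everything here is hypothetical and only meant to illustrate the shape of the idea: `LangAddressSpace`, `TargetAddressSpace`, `AddrSpace`, and `printAddrSpace` are made-up names, and the printed spellings simply copy the two bullets in the comment rather than any syntax the PR ended up with.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical named (language-level) address spaces, using the names from
// the enum in the diff.
enum class LangAddressSpace {
  OffloadPrivate,
  OffloadLocal,
  OffloadGlobal,
  OffloadConstant,
  OffloadGeneric,
};

// Hypothetical numeric (target-specific) address space. The integer is stored
// as a plain parameter instead of being folded into the enum's value range.
struct TargetAddressSpace {
  unsigned value;
};

// A pointer carries no address space (default), a named one, or a numeric one.
using AddrSpace =
    std::variant<std::monostate, LangAddressSpace, TargetAddressSpace>;

inline const char *langName(LangAddressSpace as) {
  switch (as) {
  case LangAddressSpace::OffloadPrivate:  return "offload_private";
  case LangAddressSpace::OffloadLocal:    return "offload_local";
  case LangAddressSpace::OffloadGlobal:   return "offload_global";
  case LangAddressSpace::OffloadConstant: return "offload_constant";
  case LangAddressSpace::OffloadGeneric:  return "offload_generic";
  }
  return "unknown";
}

// Print the address space using the spellings from the review comment.
inline std::string printAddrSpace(const AddrSpace &as) {
  if (const auto *lang = std::get_if<LangAddressSpace>(&as))
    return std::string("clang_addrspace(") + langName(*lang) + ")";
  if (const auto *target = std::get_if<TargetAddressSpace>(&as))
    return "target_addresspace(" + std::to_string(target->value) + ")";
  return ""; // default address space: nothing is printed
}
```

Keeping the two kinds as separate alternatives means the numeric case never has to share storage with the named cases, which is the concern raised about `target<n>`.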

@andykaylor
Contributor

> It's also worth saying that in general numeric memory spaces are somewhat discouraged in MLIR except when they appear in the LLVM dialect. That's why in ptr you have !ptr.ptr<#ptr.generic_space> instead of !ptr.ptr<0>
>
> So, unless clang is using the numeric representation in the AST for some memory spaces, and there's no way of providing a name for those, I would discourage target<n> altogether.

The problem is that clang allows the user to directly provide an address space using __attribute__((address_space(n))) and we have no way of knowing what the value means, but that will correspond directly to an LLVM dialect address space, so we can certainly lower that to !ptr.ptr<#llvm.address_space<n>>.
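
For context, the attribute mentioned above is written in source roughly as follows. This is a minimal sketch, assuming Clang; `ADDR_SPACE`, `Int1`, and `isNull` are hypothetical names, and the macro expands to nothing on other compilers so the snippet still compiles there (without any address space).

```cpp
#include <cassert>

// address_space is a Clang extension. On other compilers this hypothetical
// macro expands to nothing, so the types below fall back to plain int.
#if defined(__clang__)
#define ADDR_SPACE(n) __attribute__((address_space(n)))
#else
#define ADDR_SPACE(n)
#endif

// An int that lives in user-chosen address space 1. The number has no meaning
// to the front end; it is simply carried along with the type.
typedef ADDR_SPACE(1) int Int1;

// Under Clang, the parameter is a pointer into address space 1.
inline bool isNull(Int1 *p) { return p == nullptr; }
```

With Clang, `Int1 *` is a pointer into address space 1, and that number is what the comment above expects to carry through to the LLVM dialect unchanged.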

Contributor

@andykaylor andykaylor left a comment

This looks really good. My remaining comments are mostly housekeeping suggestions.

@RiverDave RiverDave force-pushed the users/riverdave/cir/addrspace-support-for-cir-ptr branch from 6cf211b to 9cd2c54 Compare October 2, 2025 12:41
@RiverDave RiverDave force-pushed the users/riverdave/cir/addrspace-support-for-cir-ptr branch from 9cd2c54 to 533e322 Compare October 2, 2025 12:43
Contributor

@andykaylor andykaylor left a comment

lgtm

@andykaylor
Contributor

The description needs to be updated to reflect the changes made during review.

@RiverDave
Contributor Author

> The description needs to be updated to reflect the changes made during review.

Will do

Member

@bcardosolopes bcardosolopes left a comment

LGTM after updating both title and description to closely match reviews! It would also be nice to backport some of these if it's now deviating from incubator.

@RiverDave RiverDave changed the title [CIR] Upstream AddressSpace support for PointerType [CIR] Implement Target-specific address space handling support for PointerType Oct 3, 2025
@RiverDave RiverDave changed the title [CIR] Implement Target-specific address space handling support for PointerType [CIR] Implement Target-specific address space handling support for PointerType in function parameters Oct 3, 2025
@RiverDave RiverDave changed the title [CIR] Implement Target-specific address space handling support for PointerType in function parameters [CIR] Implement Target-specific address space handling support for PointerType Oct 3, 2025
@RiverDave RiverDave merged commit 3896212 into main Oct 4, 2025
9 checks passed
@RiverDave RiverDave deleted the users/riverdave/cir/addrspace-support-for-cir-ptr branch October 4, 2025 01:40

Labels

clang Clang issues not falling into any other category ClangIR Anything related to the ClangIR project

6 participants