-
Notifications
You must be signed in to change notification settings - Fork 15.3k
Reland [flang][cuda] Allocate derived-type with CUDA component in managed memory #147416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…managed memory Reviewed in (llvm#146797) Similarly to descriptor for device data, put derived type holding device descriptor in managed memory.
Member
|
@llvm/pr-subscribers-flang-fir-hlfir @llvm/pr-subscribers-flang-semantics Author: Valentin Clement (バレンタイン クレメン) (clementval) ChangesReviewed in (#146797) Similarly to descriptor for device data, put derived type holding device descriptor in managed memory. Full diff: https://github.com/llvm/llvm-project/pull/147416.diff 4 Files Affected:
diff --git a/flang/include/flang/Semantics/tools.h b/flang/include/flang/Semantics/tools.h
index ea07128a6d240..fb670528f3ce4 100644
--- a/flang/include/flang/Semantics/tools.h
+++ b/flang/include/flang/Semantics/tools.h
@@ -656,6 +656,8 @@ DirectComponentIterator::const_iterator FindAllocatableOrPointerDirectComponent(
const DerivedTypeSpec &);
PotentialComponentIterator::const_iterator
FindPolymorphicAllocatablePotentialComponent(const DerivedTypeSpec &);
+UltimateComponentIterator::const_iterator
+FindCUDADeviceAllocatableUltimateComponent(const DerivedTypeSpec &);
// The LabelEnforce class (given a set of labels) provides an error message if
// there is a branch to a label which is not in the given set.
diff --git a/flang/lib/Lower/ConvertVariable.cpp b/flang/lib/Lower/ConvertVariable.cpp
index 7ab3c43016bd9..44f534e7d569a 100644
--- a/flang/lib/Lower/ConvertVariable.cpp
+++ b/flang/lib/Lower/ConvertVariable.cpp
@@ -702,6 +702,29 @@ static void instantiateGlobal(Fortran::lower::AbstractConverter &converter,
mapSymbolAttributes(converter, var, symMap, stmtCtx, cast);
}
+bool needCUDAAlloc(const Fortran::semantics::Symbol &sym) {
+ if (Fortran::semantics::IsDummy(sym))
+ return false;
+ if (const auto *details{
+ sym.GetUltimate()
+ .detailsIf<Fortran::semantics::ObjectEntityDetails>()}) {
+ if (details->cudaDataAttr() &&
+ (*details->cudaDataAttr() == Fortran::common::CUDADataAttr::Device ||
+ *details->cudaDataAttr() == Fortran::common::CUDADataAttr::Managed ||
+ *details->cudaDataAttr() == Fortran::common::CUDADataAttr::Unified ||
+ *details->cudaDataAttr() == Fortran::common::CUDADataAttr::Shared ||
+ *details->cudaDataAttr() == Fortran::common::CUDADataAttr::Pinned))
+ return true;
+ const Fortran::semantics::DeclTypeSpec *type{details->type()};
+ const Fortran::semantics::DerivedTypeSpec *derived{type ? type->AsDerived()
+ : nullptr};
+ if (derived)
+ if (FindCUDADeviceAllocatableUltimateComponent(*derived))
+ return true;
+ }
+ return false;
+}
+
//===----------------------------------------------------------------===//
// Local variables instantiation (not for alias)
//===----------------------------------------------------------------===//
@@ -732,7 +755,7 @@ static mlir::Value createNewLocal(Fortran::lower::AbstractConverter &converter,
if (ultimateSymbol.test(Fortran::semantics::Symbol::Flag::CrayPointee))
return builder.create<fir::ZeroOp>(loc, fir::ReferenceType::get(ty));
- if (Fortran::semantics::NeedCUDAAlloc(ultimateSymbol)) {
+ if (needCUDAAlloc(ultimateSymbol)) {
cuf::DataAttributeAttr dataAttr =
Fortran::lower::translateSymbolCUFDataAttribute(builder.getContext(),
ultimateSymbol);
@@ -1087,7 +1110,7 @@ static void instantiateLocal(Fortran::lower::AbstractConverter &converter,
Fortran::lower::defaultInitializeAtRuntime(converter, var.getSymbol(),
symMap);
auto *builder = &converter.getFirOpBuilder();
- if (Fortran::semantics::NeedCUDAAlloc(var.getSymbol()) &&
+ if (needCUDAAlloc(var.getSymbol()) &&
!cuf::isCUDADeviceContext(builder->getRegion())) {
cuf::DataAttributeAttr dataAttr =
Fortran::lower::translateSymbolCUFDataAttribute(builder->getContext(),
diff --git a/flang/lib/Semantics/tools.cpp b/flang/lib/Semantics/tools.cpp
index aed57216f13b7..d27d250b3f11e 100644
--- a/flang/lib/Semantics/tools.cpp
+++ b/flang/lib/Semantics/tools.cpp
@@ -1081,12 +1081,6 @@ const Scope *FindCUDADeviceContext(const Scope *scope) {
});
}
-std::optional<common::CUDADataAttr> GetCUDADataAttr(const Symbol *symbol) {
- const auto *object{
- symbol ? symbol->detailsIf<ObjectEntityDetails>() : nullptr};
- return object ? object->cudaDataAttr() : std::nullopt;
-}
-
bool IsDeviceAllocatable(const Symbol &symbol) {
if (IsAllocatable(symbol)) {
if (const auto *details{
@@ -1133,6 +1127,23 @@ bool CanCUDASymbolBeGlobal(const Symbol &sym) {
return true;
}
+std::optional<common::CUDADataAttr> GetCUDADataAttr(const Symbol *symbol) {
+ const auto *details{
+ symbol ? symbol->detailsIf<ObjectEntityDetails>() : nullptr};
+ if (details) {
+ const Fortran::semantics::DeclTypeSpec *type{details->type()};
+ const Fortran::semantics::DerivedTypeSpec *derived{
+ type ? type->AsDerived() : nullptr};
+ if (derived) {
+ if (FindCUDADeviceAllocatableUltimateComponent(*derived)) {
+ return common::CUDADataAttr::Managed;
+ }
+ }
+ return details->cudaDataAttr();
+ }
+ return std::nullopt;
+}
+
bool IsAccessible(const Symbol &original, const Scope &scope) {
const Symbol &ultimate{original.GetUltimate()};
if (ultimate.attrs().test(Attr::PRIVATE)) {
diff --git a/flang/test/Lower/CUDA/cuda-derived.cuf b/flang/test/Lower/CUDA/cuda-derived.cuf
index d280ac722d08f..96250d88d81c4 100644
--- a/flang/test/Lower/CUDA/cuda-derived.cuf
+++ b/flang/test/Lower/CUDA/cuda-derived.cuf
@@ -7,6 +7,16 @@ module m1
type t1; real, device, allocatable :: a(:); end type
type t2; type(t1) :: b; end type
+contains
+ subroutine sub1()
+ type(ty_device) :: a
+ end subroutine
+
+! CHECK-LABEL: func.func @_QMm1Psub1()
+! CHECK: %[[ALLOC:.*]] = cuf.alloc !fir.type<_QMm1Tty_device{x:!fir.box<!fir.heap<!fir.array<?xi32>>>}> {bindc_name = "a", data_attr = #cuf.cuda<managed>, uniq_name = "_QMm1Fsub1Ea"} -> !fir.ref<!fir.type<_QMm1Tty_device{x:!fir.box<!fir.heap<!fir.array<?xi32>>>}>>
+! CHECK: %[[DECL:.*]]:2 = hlfir.declare %[[ALLOC]] {data_attr = #cuf.cuda<managed>, uniq_name = "_QMm1Fsub1Ea"} : (!fir.ref<!fir.type<_QMm1Tty_device{x:!fir.box<!fir.heap<!fir.array<?xi32>>>}>>) -> (!fir.ref<!fir.type<_QMm1Tty_device{x:!fir.box<!fir.heap<!fir.array<?xi32>>>}>>, !fir.ref<!fir.type<_QMm1Tty_device{x:!fir.box<!fir.heap<!fir.array<?xi32>>>}>>)
+! CHECK: cuf.free %[[DECL]]#0 : !fir.ref<!fir.type<_QMm1Tty_device{x:!fir.box<!fir.heap<!fir.array<?xi32>>>}>> {data_attr = #cuf.cuda<managed>}
+
end module
program main
@@ -16,5 +26,5 @@ program main
end
! CHECK-LABEL: func.func @_QQmain() attributes {fir.bindc_name = "main"}
-! CHECK: %{{.*}} = fir.alloca !fir.type<_QMm1Tty_device{x:!fir.box<!fir.heap<!fir.array<?xi32>>>}> {bindc_name = "a", uniq_name = "_QFEa"}
-! CHECK: %{{.*}} = fir.alloca !fir.type<_QMm1Tt2{b:!fir.type<_QMm1Tt1{a:!fir.box<!fir.heap<!fir.array<?xf32>>>}>}> {bindc_name = "b", uniq_name = "_QFEb"}
+! CHECK: %{{.*}} = cuf.alloc !fir.type<_QMm1Tty_device{x:!fir.box<!fir.heap<!fir.array<?xi32>>>}> {bindc_name = "a", data_attr = #cuf.cuda<managed>, uniq_name = "_QFEa"}
+! CHECK: %{{.*}} = cuf.alloc !fir.type<_QMm1Tt2{b:!fir.type<_QMm1Tt1{a:!fir.box<!fir.heap<!fir.array<?xf32>>>}>}> {bindc_name = "b", data_attr = #cuf.cuda<managed>, uniq_name = "_QFEb"}
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Reviewed in (#146797)
Similarly to descriptor for device data, put derived type holding device descriptor in managed memory.