Skip to content

Conversation

TIFitis
Copy link
Member

@TIFitis TIFitis commented Oct 8, 2025

  • Move automapped Fortran descriptor mappings from map() to has_device_addr in target regions, updating block args and uses accordingly.
  • In libomptarget, detect CFI descriptors during pointer attachment and compute the correct descriptor size to transfer full metadata (including bounds).
  • Resolves lost bounds for automapped Fortran arrays on device; no change for C/C++.

This fixes the offload/test/offloading/fortran/declare-target-automap.f90 test reported broken in #161265.

	- Move automapped Fortran descriptor mappings from map() to has_device_addr in target regions, updating block args and uses accordingly.
	- In libomptarget, detect CFI descriptors during pointer attachment and compute the correct descriptor size to transfer full metadata (including bounds).
	- Resolves lost bounds for automapped Fortran arrays on device; no change for C/C++.

This fixes test offload/test/offloading/fortran/declare-target-automap.f90 reported broken in llvm#161265.
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir offload labels Oct 8, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 8, 2025

@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-offload

Author: Akash Banerjee (TIFitis)

Changes
  • Move automapped Fortran descriptor mappings from map() to has_device_addr in target regions, updating block args and uses accordingly.
  • In libomptarget, detect CFI descriptors during pointer attachment and compute the correct descriptor size to transfer full metadata (including bounds).
  • Resolves lost bounds for automapped Fortran arrays on device; no change for C/C++.

This fixes the offload/test/offloading/fortran/declare-target-automap.f90 test reported broken in #161265.


Full diff: https://github.com/llvm/llvm-project/pull/162501.diff

3 Files Affected:

  • (modified) flang/lib/Optimizer/OpenMP/AutomapToTargetData.cpp (+100)
  • (modified) offload/libomptarget/omptarget.cpp (+32)
  • (modified) offload/test/offloading/fortran/declare-target-automap.f90 (-3)
diff --git a/flang/lib/Optimizer/OpenMP/AutomapToTargetData.cpp b/flang/lib/Optimizer/OpenMP/AutomapToTargetData.cpp
index 8b9991301aae8..2edbef151ea4a 100644
--- a/flang/lib/Optimizer/OpenMP/AutomapToTargetData.cpp
+++ b/flang/lib/Optimizer/OpenMP/AutomapToTargetData.cpp
@@ -18,9 +18,11 @@
 #include "mlir/Dialect/OpenMP/OpenMPInterfaces.h"
 #include "mlir/IR/BuiltinAttributes.h"
 #include "mlir/IR/Operation.h"
+#include "mlir/IR/SymbolTable.h"
 #include "mlir/Pass/Pass.h"
 
 #include "llvm/Frontend/OpenMP/OMPConstants.h"
+#include <algorithm>
 
 namespace flangomp {
 #define GEN_PASS_DEF_AUTOMAPTOTARGETDATAPASS
@@ -154,6 +156,104 @@ class AutomapToTargetDataPass
           addMapInfo(globalOp, loadOp);
       }
     }
+
+    // Move automapped descriptors from map() to has_device_addr in target ops.
+    auto originatesFromAutomapGlobal = [&](mlir::Value varPtr) -> bool {
+      if (auto decl = mlir::dyn_cast_or_null<hlfir::DeclareOp>(
+              varPtr.getDefiningOp())) {
+        if (auto addrOp = mlir::dyn_cast_or_null<fir::AddrOfOp>(
+                decl.getMemref().getDefiningOp())) {
+          if (auto g =
+                  mlir::SymbolTable::lookupNearestSymbolFrom<fir::GlobalOp>(
+                      decl, addrOp.getSymbol()))
+            return automapGlobals.contains(g);
+        }
+      }
+      return false;
+    };
+
+    module.walk([&](mlir::omp::TargetOp target) {
+      // Collect candidates to move: descriptor maps of automapped globals.
+      llvm::SmallVector<mlir::Value> newMapOps;
+      llvm::SmallVector<unsigned> removedIndices;
+      llvm::SmallVector<mlir::Value> movedToHDA;
+      llvm::SmallVector<mlir::BlockArgument> oldMapArgsForMoved;
+
+      auto mapRange = target.getMapVars();
+      newMapOps.reserve(mapRange.size());
+
+      auto argIface = llvm::dyn_cast<mlir::omp::BlockArgOpenMPOpInterface>(
+          target.getOperation());
+      llvm::ArrayRef<mlir::BlockArgument> mapBlockArgs =
+          argIface.getMapBlockArgs();
+
+      for (auto [idx, mapVal] : llvm::enumerate(mapRange)) {
+        auto mapOp =
+            mlir::dyn_cast<mlir::omp::MapInfoOp>(mapVal.getDefiningOp());
+        if (!mapOp) {
+          newMapOps.push_back(mapVal);
+          continue;
+        }
+
+        mlir::Type varTy = fir::unwrapRefType(mapOp.getVarType());
+        bool isDescriptor = mlir::isa<fir::BaseBoxType>(varTy);
+        if (isDescriptor && originatesFromAutomapGlobal(mapOp.getVarPtr())) {
+          movedToHDA.push_back(mapVal);
+          removedIndices.push_back(idx);
+          oldMapArgsForMoved.push_back(mapBlockArgs[idx]);
+        } else {
+          newMapOps.push_back(mapVal);
+        }
+      }
+
+      if (movedToHDA.empty())
+        return;
+
+      // Update map vars to exclude moved entries.
+      mlir::MutableOperandRange mapMutable = target.getMapVarsMutable();
+      mapMutable.assign(newMapOps);
+
+      // Append moved entries to has_device_addr and insert corresponding block
+      // arguments.
+      mlir::MutableOperandRange hdaMutable =
+          target.getHasDeviceAddrVarsMutable();
+      llvm::SmallVector<mlir::Value> newHDA;
+      newHDA.reserve(hdaMutable.size() + movedToHDA.size());
+      llvm::for_each(hdaMutable.getAsOperandRange(),
+                     [&](mlir::Value v) { newHDA.push_back(v); });
+
+      unsigned hdaStart = argIface.getHasDeviceAddrBlockArgsStart();
+      unsigned oldHdaCount = argIface.numHasDeviceAddrBlockArgs();
+      llvm::SmallVector<mlir::BlockArgument> newHDAArgsForMoved;
+      unsigned insertIndex = hdaStart + oldHdaCount;
+      for (mlir::Value v : movedToHDA) {
+        newHDA.push_back(v);
+        target->getRegion(0).insertArgument(insertIndex, v.getType(),
+                                            v.getLoc());
+        // Capture the newly inserted block argument.
+        newHDAArgsForMoved.push_back(
+            target->getRegion(0).getArgument(insertIndex));
+        insertIndex++;
+      }
+      hdaMutable.assign(newHDA);
+
+      // Redirect uses in the region: replace old map block args with the
+      // corresponding new has_device_addr block args.
+      for (auto [oldArg, newArg] :
+           llvm::zip_equal(oldMapArgsForMoved, newHDAArgsForMoved))
+        oldArg.replaceAllUsesWith(newArg);
+
+      // Finally, erase corresponding map block arguments (descending order).
+      unsigned mapStart = argIface.getMapBlockArgsStart();
+      // Convert indices to absolute argument numbers before erasing.
+      llvm::SmallVector<unsigned> absArgNos;
+      absArgNos.reserve(removedIndices.size());
+      for (unsigned idx : removedIndices)
+        absArgNos.push_back(mapStart + idx);
+      std::sort(absArgNos.begin(), absArgNos.end(), std::greater<>());
+      for (unsigned absNo : absArgNos)
+        target->getRegion(0).eraseArgument(absNo);
+    });
   }
 };
 } // namespace
diff --git a/offload/libomptarget/omptarget.cpp b/offload/libomptarget/omptarget.cpp
index a1950cbb62908..02c83b4a51078 100644
--- a/offload/libomptarget/omptarget.cpp
+++ b/offload/libomptarget/omptarget.cpp
@@ -33,6 +33,9 @@
 #include "llvm/Frontend/OpenMP/OMPConstants.h"
 #include "llvm/Object/ObjectFile.h"
 
+#include "flang/ISO_Fortran_binding.h"
+
+#include <algorithm>
 #include <cassert>
 #include <cstdint>
 #include <vector>
@@ -378,6 +381,34 @@ static void *calculateTargetPointeeBase(void *HstPteeBase, void *HstPteeBegin,
   return TgtPteeBase;
 }
 
+// Fortran pointer attachments treated descriptors as plain pointers, so
+// automapped arrays lose their declared bounds on the device. Recognize
+// CFI descriptors to compute their actual size before copying, ensuring the
+// full descriptor (including bounds) is transferred during attachment.
+static int64_t getFortranDescriptorSize(void **HstPtrAddr,
+                                        int64_t ReportedSize) {
+  constexpr int64_t VoidPtrSize = sizeof(void *);
+
+  if (!HstPtrAddr || ReportedSize > VoidPtrSize)
+    return ReportedSize;
+
+  const CFI_cdesc_t *Desc = reinterpret_cast<const CFI_cdesc_t *>(HstPtrAddr);
+
+  if (Desc->version != CFI_VERSION)
+    return ReportedSize;
+
+  if (Desc->rank > CFI_MAX_RANK)
+    return ReportedSize;
+
+  const char *RawDesc = reinterpret_cast<const char *>(Desc);
+  const char *DimsAddr = reinterpret_cast<const char *>(&Desc->dim);
+  size_t HeaderBytes = static_cast<size_t>(DimsAddr - RawDesc);
+  size_t DimsBytes = static_cast<size_t>(Desc->rank) * sizeof(CFI_dim_t);
+  size_t TotalBytes = HeaderBytes + DimsBytes;
+
+  return std::max<int64_t>(ReportedSize, static_cast<int64_t>(TotalBytes));
+}
+
 /// Utility function to perform a pointer attachment operation.
 ///
 /// For something like:
@@ -445,6 +476,7 @@ static int performPointerAttachment(DeviceTy &Device, AsyncInfoTy &AsyncInfo,
          "Need a valid pointer entry to perform pointer-attachment");
 
   constexpr int64_t VoidPtrSize = sizeof(void *);
+  HstPtrSize = getFortranDescriptorSize(HstPtrAddr, HstPtrSize);
   assert(HstPtrSize >= VoidPtrSize && "PointerSize is too small");
 
   void *TgtPteeBase =
diff --git a/offload/test/offloading/fortran/declare-target-automap.f90 b/offload/test/offloading/fortran/declare-target-automap.f90
index b44c0b2815274..b9c2d34c834fa 100644
--- a/offload/test/offloading/fortran/declare-target-automap.f90
+++ b/offload/test/offloading/fortran/declare-target-automap.f90
@@ -1,9 +1,6 @@
 !Offloading test for AUTOMAP modifier in declare target enter
 ! REQUIRES: flang, amdgpu
 
-! FIXME: https://github.com/llvm/llvm-project/issues/161265
-! XFAIL: amdgpu
-
 ! RUN: %libomptarget-compile-fortran-run-and-check-generic
 program automap_program
    use iso_c_binding, only: c_loc

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes Fortran automap handling in OpenMP target regions by addressing lost bounds information for automapped Fortran arrays on the device. The changes ensure that Fortran descriptor metadata, including array bounds, is properly transferred during pointer attachment operations.

Key changes:

  • Move automapped Fortran descriptors from map() to has_device_addr in target regions
  • Add CFI descriptor detection in libomptarget to compute correct descriptor sizes
  • Remove test failure expectations for the now-fixed functionality

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
offload/test/offloading/fortran/declare-target-automap.f90 Removes XFAIL directive as the test now passes
offload/libomptarget/omptarget.cpp Adds CFI descriptor size calculation for proper Fortran metadata transfer
flang/lib/Optimizer/OpenMP/AutomapToTargetData.cpp Implements logic to move automapped descriptors from map to has_device_addr

return TgtPteeBase;
}

// Fortran pointer attachments treated descriptors as plain pointers, so
Copy link
Preview

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected grammar from 'treated' to 'treat' for present tense consistency.

Suggested change
// Fortran pointer attachments treated descriptors as plain pointers, so
// Fortran pointer attachments treat descriptors as plain pointers, so

Copilot uses AI. Check for mistakes.

llvm::zip_equal(oldMapArgsForMoved, newHDAArgsForMoved))
oldArg.replaceAllUsesWith(newArg);

// Finally, erase corresponding map block arguments (descending order).
Copy link
Preview

Copilot AI Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment should clarify why descending order is necessary - to avoid index invalidation when erasing multiple elements from a container.

Suggested change
// Finally, erase corresponding map block arguments (descending order).
// Finally, erase corresponding map block arguments in descending order.
// Descending order is necessary to avoid index invalidation: erasing arguments
// from highest to lowest index ensures that earlier erases do not shift the
// indices of arguments yet to be erased.

Copilot uses AI. Check for mistakes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category offload
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants