Skip to content

Conversation

vzakhari
Copy link
Contributor

@vzakhari vzakhari commented Sep 3, 2025

In order to help LLVM disambiguate accesses to the COMMON
block variables, this patch creates a TBAA sub-tree for each
COMMON block, and the places all variables belonging to this
COMMON block into this sub-tree. The structure looks like this:

  common /blk/ a, b, c

  "global data"
  |
  |- "blk_"
     |
     |- "blk_/bytes_0_to_3"
     |- "blk_/bytes_4_to_7"
     |- "blk_/bytes_8_to_11"

The TBAA tag for "a" is created in "blk_/bytes_0_to_3" root, etc.
The byte spans are created based on the storage information
provided by fir.declare (#155742).

See also: https://discourse.llvm.org/t/rfc-flang-representation-for-objects-inside-physical-storage/88026

In order to help LLVM disambiguate accesses to the COMMON
block variables, this patch creates a TBAA sub-tree for each
COMMON block, and the places all variables belonging to this
COMMON block into this sub-tree. The structure looks like this:
```
  common /blk/ a, b, c

  "global data"
  |
  |- "blk_"
     |
     |- "blk_/bytes_0_to_3"
     |- "blk_/bytes_4_to_7"
     |- "blk_/bytes_8_to_11"
```

The TBAA tag for "a" is created in "blk_/bytes_0_to_3" root, etc.
The byte spans are created based on the `storage` information
provided by `fir.declare` (llvm#155742).

See also: https://discourse.llvm.org/t/rfc-flang-representation-for-objects-inside-physical-storage/88026
@vzakhari vzakhari requested review from tblah and jeanPerier September 3, 2025 00:16
@vzakhari
Copy link
Contributor Author

vzakhari commented Sep 3, 2025

This PR replaces #153918

if (auto boxTy = mlir::dyn_cast_if_present<fir::BaseBoxType>(type))
llvmTy = llvmTypeConverter.convertBoxTypeAsStruct(boxTy, getBoxRank(boxTy));
else
llvmTy = llvmTypeConverter.convertType(type);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type conversion fails for !fir.boxproc members of derived types, e.g. for:

  Type t
    Sequence
    Procedure(),Pointer,NoPass :: sub_ptr
    Integer i
  End Type

I wonder if I should return a pointer type in LLVMTypeConverter::LLVMTypeConverter:

  addConversion([&](BoxProcType boxproc) {
    // TODO: Support for this type will be added later when the Fortran 2003
    // procedure pointer feature is implemented.
    return std::nullopt;
  });

Or, is there a better solution?

This issue appeared in just one test in my local testing, but I cannot merge this without resolving the problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pointer does seem to be what it is lowered to from what I can tell. So I think that should be okay. It would be good to get Jean's thoughts before merging though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantically speaking, fir.boxproc abstraction was meant to allow lowering it to a tuple, which would be a valid and useful abstraction for procedure pointers (but not dummy procedures) to avoid creating executable trampolines when making pointers out internal procedures.

So fir.boxproc has its own codegen to FIR before lowering to LLVM. Hardcoding the type lowering in the LLVMTypeConverter and using that before the fir.boxproc codegen bypasses this completely.

However my point is pedantic in the sense that this is not the codegen chosen/used by flang, and at least for fir.boxproc that may be visible to other compilation units (such as external procedure arguments and derived type members), it is not something the fir.boxproc codegen could decide to change on the fly (it needs to be in sync with semantic offsets).

I feel the ideal would be to reconcile/unify/or link the LLVMTypeConverter and BoxprocTypeRewriter to avoid discrepancies at the FIR level.

I would be fine with just some FIRType helper used in both converter to connect the dots (although the BoxprocTypeRewriter is not using a pointer type, it is using the inner FunctionType and recursively converting the signature, so this may not be direct).

Another solution could be to run the TBAA pass after the BoxedProcedure if that makes any sense.

Last point is that procedure pointers and derived types with procedure pointers are forbidden in COMMON BLOCK and EQUIVALENCE (C8123 and C8110 in F2023). I am not sure if this is enforced in semantics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like other compilers (NVHPC/gfortran/ifx) do support sequence derived types with procedure pointer components inside COMMON blocks, so it looks like it might be some useful extension of the standard.

I think I am going to try to schedule AddAliasTags pass after the BoxedProcedurePass inside createDefaultFIRCodeGenPassPipeline. I hope this okay with you (given that this is one of the alternatives you listed :) )

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really interesting stuff. Only minor comments.

if (auto boxTy = mlir::dyn_cast_if_present<fir::BaseBoxType>(type))
llvmTy = llvmTypeConverter.convertBoxTypeAsStruct(boxTy, getBoxRank(boxTy));
else
llvmTy = llvmTypeConverter.convertType(type);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pointer does seem to be what it is lowered to from what I can tell. So I think that should be okay. It would be good to get Jean's thoughts before merging though.

Comment on lines 266 to 269
const StorageDesc *getStorageDesc(mlir::Operation *op) {
auto it = declToStorageMap.find(op);
return it == declToStorageMap.end() ? nullptr : &it->second;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently only called when processing non-pointer global variables. Populating declToStorageMap looks expensive. I think it would be better to populate it on demand the first time this method is called, so that we don't incur the expense for top level operations that do not refer to any non-pointer global variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable - I will do it!

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Slava!

@@ -22,6 +22,7 @@
namespace fir {
class FIROpsDialect;
class KindMapping;
class LLVMTypeConverter;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why do you need this change, it does not seem to be used in the header.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching it! It is a leftover.

Comment on lines 24 to 25
namedSubtrees.insert(
{name, SubtreeState(context, parentId + '/' + name.str(), parent)});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: could return the insert iterator result here instead of doing a new search afterwards.

if (auto boxTy = mlir::dyn_cast_if_present<fir::BaseBoxType>(type))
llvmTy = llvmTypeConverter.convertBoxTypeAsStruct(boxTy, getBoxRank(boxTy));
else
llvmTy = llvmTypeConverter.convertType(type);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pedantically speaking, fir.boxproc abstraction was meant to allow lowering it to a tuple, which would be a valid and useful abstraction for procedure pointers (but not dummy procedures) to avoid creating executable trampolines when making pointers out internal procedures.

So fir.boxproc has its own codegen to FIR before lowering to LLVM. Hardcoding the type lowering in the LLVMTypeConverter and using that before the fir.boxproc codegen bypasses this completely.

However my point is pedantic in the sense that this is not the codegen chosen/used by flang, and at least for fir.boxproc that may be visible to other compilation units (such as external procedure arguments and derived type members), it is not something the fir.boxproc codegen could decide to change on the fly (it needs to be in sync with semantic offsets).

I feel the ideal would be to reconcile/unify/or link the LLVMTypeConverter and BoxprocTypeRewriter to avoid discrepancies at the FIR level.

I would be fine with just some FIRType helper used in both converter to connect the dots (although the BoxprocTypeRewriter is not using a pointer type, it is using the inner FunctionType and recursively converting the signature, so this may not be direct).

Another solution could be to run the TBAA pass after the BoxedProcedure if that makes any sense.

Last point is that procedure pointers and derived types with procedure pointers are forbidden in COMMON BLOCK and EQUIVALENCE (C8123 and C8110 in F2023). I am not sure if this is enforced in semantics.

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Sep 3, 2025
@llvmbot
Copy link
Member

llvmbot commented Sep 3, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Slava Zakharin (vzakhari)

Changes

In order to help LLVM disambiguate accesses to the COMMON
block variables, this patch creates a TBAA sub-tree for each
COMMON block, and the places all variables belonging to this
COMMON block into this sub-tree. The structure looks like this:

  common /blk/ a, b, c

  "global data"
  |
  |- "blk_"
     |
     |- "blk_/bytes_0_to_3"
     |- "blk_/bytes_4_to_7"
     |- "blk_/bytes_8_to_11"

The TBAA tag for "a" is created in "blk_/bytes_0_to_3" root, etc.
The byte spans are created based on the storage information
provided by fir.declare (#155742).

See also: https://discourse.llvm.org/t/rfc-flang-representation-for-objects-inside-physical-storage/88026


Patch is 85.29 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/156558.diff

17 Files Affected:

  • (modified) flang/include/flang/Optimizer/Analysis/TBAAForest.h (+20-4)
  • (modified) flang/include/flang/Optimizer/Builder/FIRBuilder.h (+6-1)
  • (modified) flang/include/flang/Optimizer/Dialect/FIRType.h (+1)
  • (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+2-1)
  • (modified) flang/lib/Optimizer/Analysis/TBAAForest.cpp (+13-2)
  • (modified) flang/lib/Optimizer/Transforms/AddAliasTags.cpp (+471-20)
  • (modified) flang/test/Transforms/tbaa-derived-with-descriptor.fir (+2)
  • (added) flang/test/Transforms/tbaa-for-common-vars.fir (+436)
  • (added) flang/test/Transforms/tbaa-for-global-equiv-vars.fir (+86)
  • (modified) flang/test/Transforms/tbaa-for-local-vars.fir (+2-1)
  • (modified) flang/test/Transforms/tbaa-local-alloc-threshold.fir (+2)
  • (modified) flang/test/Transforms/tbaa-with-dummy-scope.fir (+4)
  • (modified) flang/test/Transforms/tbaa-with-dummy-scope2.fir (+4)
  • (modified) flang/test/Transforms/tbaa.fir (+11)
  • (modified) flang/test/Transforms/tbaa2.fir (+2)
  • (modified) flang/test/Transforms/tbaa3.fir (+1-1)
  • (modified) flang/test/Transforms/tbaa4.fir (+18-11)
diff --git a/flang/include/flang/Optimizer/Analysis/TBAAForest.h b/flang/include/flang/Optimizer/Analysis/TBAAForest.h
index 4d2281642b43df..b4932594114a1b 100644
--- a/flang/include/flang/Optimizer/Analysis/TBAAForest.h
+++ b/flang/include/flang/Optimizer/Analysis/TBAAForest.h
@@ -46,6 +46,12 @@ struct TBAATree {
 
     mlir::LLVM::TBAATypeDescriptorAttr getRoot() const { return parent; }
 
+    /// For the given name, get or create a subtree in the current
+    /// subtree. For example, this is used for creating subtrees
+    /// inside the "global data" subtree for the COMMON block variables
+    /// belonging to the same COMMON block.
+    SubtreeState &getOrCreateNamedSubtree(mlir::StringAttr name);
+
   private:
     SubtreeState(mlir::MLIRContext *ctx, std::string name,
                  mlir::LLVM::TBAANodeAttr grandParent)
@@ -57,6 +63,9 @@ struct TBAATree {
     const std::string parentId;
     mlir::MLIRContext *const context;
     mlir::LLVM::TBAATypeDescriptorAttr parent;
+    // A map of named sub-trees, e.g. sub-trees of the COMMON blocks
+    // placed under the "global data" root.
+    llvm::DenseMap<mlir::StringAttr, SubtreeState> namedSubtrees;
   };
 
   /// A subtree for POINTER/TARGET variables data.
@@ -131,8 +140,8 @@ class TBAAForrest {
   // responsibility to provide unique name for the scope.
   // If the scope string is empty, returns the TBAA tree for the
   // "root" scope of the given function.
-  inline const TBAATree &getFuncTreeWithScope(mlir::func::FuncOp func,
-                                              llvm::StringRef scope) {
+  inline TBAATree &getMutableFuncTreeWithScope(mlir::func::FuncOp func,
+                                               llvm::StringRef scope) {
     mlir::StringAttr name = func.getSymNameAttr();
     if (!scope.empty())
       name = mlir::StringAttr::get(name.getContext(),
@@ -140,13 +149,20 @@ class TBAAForrest {
     return getFuncTree(name);
   }
 
+  inline const TBAATree &getFuncTreeWithScope(mlir::func::FuncOp func,
+                                              llvm::StringRef scope) {
+    return getMutableFuncTreeWithScope(func, scope);
+  }
+
 private:
-  const TBAATree &getFuncTree(mlir::StringAttr symName) {
+  TBAATree &getFuncTree(mlir::StringAttr symName) {
     if (!separatePerFunction)
       symName = mlir::StringAttr::get(symName.getContext(), "");
     if (!trees.contains(symName))
       trees.insert({symName, TBAATree::buildTree(symName)});
-    return trees.at(symName);
+    auto it = trees.find(symName);
+    assert(it != trees.end());
+    return it->second;
   }
 
   // Should each function use a different tree?
diff --git a/flang/include/flang/Optimizer/Builder/FIRBuilder.h b/flang/include/flang/Optimizer/Builder/FIRBuilder.h
index e3a44f147b4cd9..4b3087ed457880 100644
--- a/flang/include/flang/Optimizer/Builder/FIRBuilder.h
+++ b/flang/include/flang/Optimizer/Builder/FIRBuilder.h
@@ -365,7 +365,12 @@ class FirOpBuilder : public mlir::OpBuilder, public mlir::OpBuilder::Listener {
   // Linkage helpers (inline). The default linkage is external.
   //===--------------------------------------------------------------------===//
 
-  mlir::StringAttr createCommonLinkage() { return getStringAttr("common"); }
+  static mlir::StringAttr createCommonLinkage(mlir::MLIRContext *context) {
+    return mlir::StringAttr::get(context, "common");
+  }
+  mlir::StringAttr createCommonLinkage() {
+    return createCommonLinkage(getContext());
+  }
 
   mlir::StringAttr createInternalLinkage() { return getStringAttr("internal"); }
 
diff --git a/flang/include/flang/Optimizer/Dialect/FIRType.h b/flang/include/flang/Optimizer/Dialect/FIRType.h
index ecab12de55d61b..6188c4460daddf 100644
--- a/flang/include/flang/Optimizer/Dialect/FIRType.h
+++ b/flang/include/flang/Optimizer/Dialect/FIRType.h
@@ -551,6 +551,7 @@ std::optional<std::pair<uint64_t, unsigned short>>
 getTypeSizeAndAlignment(mlir::Location loc, mlir::Type ty,
                         const mlir::DataLayout &dl,
                         const fir::KindMapping &kindMap);
+
 } // namespace fir
 
 #endif // FORTRAN_OPTIMIZER_DIALECT_FIRTYPE_H
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index 54190f09b1ec8d..e3001454cdf19d 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -326,7 +326,8 @@ def AddAliasTags : Pass<"fir-add-alias-tags", "mlir::ModuleOp"> {
     theory, each operation could be considered in prallel, so long as there
     aren't races adding new tags to the mlir context.
   }];
-  let dependentDialects = [ "fir::FIROpsDialect" ];
+  // The pass inserts TBAA attributes from LLVM dialect.
+  let dependentDialects = ["mlir::LLVM::LLVMDialect"];
 }
 
 def SimplifyRegionLite : Pass<"simplify-region-lite", "mlir::ModuleOp"> {
diff --git a/flang/lib/Optimizer/Analysis/TBAAForest.cpp b/flang/lib/Optimizer/Analysis/TBAAForest.cpp
index cce50e0de1bc76..44a0348da3a6f0 100644
--- a/flang/lib/Optimizer/Analysis/TBAAForest.cpp
+++ b/flang/lib/Optimizer/Analysis/TBAAForest.cpp
@@ -11,12 +11,23 @@
 
 mlir::LLVM::TBAATagAttr
 fir::TBAATree::SubtreeState::getTag(llvm::StringRef uniqueName) const {
-  std::string id = (parentId + "/" + uniqueName).str();
+  std::string id = (parentId + '/' + uniqueName).str();
   mlir::LLVM::TBAATypeDescriptorAttr type =
       mlir::LLVM::TBAATypeDescriptorAttr::get(
           context, id, mlir::LLVM::TBAAMemberAttr::get(parent, 0));
   return mlir::LLVM::TBAATagAttr::get(type, type, 0);
-  // return tag;
+}
+
+fir::TBAATree::SubtreeState &
+fir::TBAATree::SubtreeState::getOrCreateNamedSubtree(mlir::StringAttr name) {
+  auto it = namedSubtrees.find(name);
+  if (it != namedSubtrees.end())
+    return it->second;
+
+  return namedSubtrees
+      .insert(
+          {name, SubtreeState(context, parentId + '/' + name.str(), parent)})
+      .first->second;
 }
 
 mlir::LLVM::TBAATagAttr fir::TBAATree::SubtreeState::getTag() const {
diff --git a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
index 85403ad257657e..d87798ee1c1153 100644
--- a/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
+++ b/flang/lib/Optimizer/Transforms/AddAliasTags.cpp
@@ -14,12 +14,17 @@
 
 #include "flang/Optimizer/Analysis/AliasAnalysis.h"
 #include "flang/Optimizer/Analysis/TBAAForest.h"
+#include "flang/Optimizer/Builder/FIRBuilder.h"
 #include "flang/Optimizer/Dialect/FIRDialect.h"
 #include "flang/Optimizer/Dialect/FirAliasTagOpInterface.h"
+#include "flang/Optimizer/Support/DataLayout.h"
+#include "flang/Optimizer/Support/Utils.h"
 #include "flang/Optimizer/Transforms/Passes.h"
+#include "mlir/Dialect/DLTI/DLTI.h"
 #include "mlir/IR/Dominance.h"
 #include "mlir/Pass/Pass.h"
 #include "llvm/ADT/DenseMap.h"
+#include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/Twine.h"
 #include "llvm/Support/CommandLine.h"
@@ -57,12 +62,131 @@ static llvm::cl::opt<unsigned> localAllocsThreshold(
 
 namespace {
 
+// Return the size and alignment (in bytes) for the given type.
+// TODO: this must be combined with DebugTypeGenerator::getFieldSizeAndAlign().
+// We'd better move fir::LLVMTypeConverter out of the FIRCodeGen component.
+static std::pair<std::uint64_t, unsigned short>
+getTypeSizeAndAlignment(mlir::Type type,
+                        fir::LLVMTypeConverter &llvmTypeConverter) {
+  mlir::Type llvmTy;
+  if (auto boxTy = mlir::dyn_cast_if_present<fir::BaseBoxType>(type))
+    llvmTy = llvmTypeConverter.convertBoxTypeAsStruct(boxTy, getBoxRank(boxTy));
+  else
+    llvmTy = llvmTypeConverter.convertType(type);
+
+  const mlir::DataLayout &dataLayout = llvmTypeConverter.getDataLayout();
+  uint64_t byteSize = dataLayout.getTypeSize(llvmTy);
+  unsigned short byteAlign = dataLayout.getTypeABIAlignment(llvmTy);
+  return std::pair{byteSize, byteAlign};
+}
+
+// IntervalTy class describes a range of bytes addressed by a variable
+// within some storage. Zero-sized intervals are not allowed.
+class IntervalTy {
+public:
+  IntervalTy() = delete;
+  IntervalTy(std::uint64_t start, std::size_t size)
+      : start(start), end(start + (size - 1)) {
+    assert(size != 0 && "empty intervals should not be created");
+  }
+  constexpr bool operator<(const IntervalTy &rhs) const {
+    if (start < rhs.start)
+      return true;
+    if (rhs.start < start)
+      return false;
+    return end < rhs.end;
+  }
+  bool overlaps(const IntervalTy &other) const {
+    return end >= other.start && other.end >= start;
+  }
+  bool contains(const IntervalTy &other) const {
+    return start <= other.start && end >= other.end;
+  }
+  void merge(const IntervalTy &other) {
+    start = std::min(start, other.start);
+    end = std::max(end, other.end);
+    assert(start <= end);
+  }
+  void print(llvm::raw_ostream &os) const {
+    os << "[" << start << "," << end << "]";
+  }
+  std::uint64_t getStart() const { return start; }
+  std::uint64_t getEnd() const { return end; }
+
+private:
+  std::uint64_t start;
+  std::uint64_t end;
+};
+
+// IntervalSetTy is an ordered set of IntervalTy entities.
+class IntervalSetTy : public std::set<IntervalTy> {
+public:
+  // Find an interval from the set that contain the given interval.
+  // The complexity is O(log(N)), where N is the size of the set.
+  std::optional<IntervalTy> getContainingInterval(const IntervalTy &interval) {
+    if (empty())
+      return std::nullopt;
+
+    auto it = lower_bound(interval);
+    // The iterator points to the first interval that is not less than
+    // the given interval. The given interval may belong to the one
+    // pointed out by the iterator or to the previous one.
+    //
+    // In the following cases there might be no interval that is not less
+    // than the given interval, e.g.:
+    // Case 1:
+    //   interval: [5,5]
+    //   set: {[4,6]}
+    // Case 2:
+    //   interval: [5,5]
+    //   set: {[4,5]}
+    // We have to look starting from the last interval in the set.
+    if (it == end())
+      --it;
+
+    // The loop must finish in two iterator max.
+    do {
+      if (it->contains(interval))
+        return *it;
+      // If the current interval from the set is less than the given
+      // interval and there is no overlap, we should not look further.
+      if ((!it->overlaps(interval) && *it < interval) || it == begin())
+        break;
+
+      --it;
+    } while (true);
+
+    return std::nullopt;
+  }
+};
+
+// Stream operators for IntervalTy and IntervalSetTy.
+inline llvm::raw_ostream &operator<<(llvm::raw_ostream &os,
+                                     const IntervalTy &interval) {
+  interval.print(os);
+  return os;
+}
+inline llvm::raw_ostream &operator<<(llvm::raw_ostream &os,
+                                     const IntervalSetTy &set) {
+  if (set.empty()) {
+    os << " <empty>";
+    return os;
+  }
+  for (const auto &interval : set)
+    os << ' ' << interval;
+  return os;
+}
+
 /// Shared state per-module
 class PassState {
 public:
-  PassState(mlir::DominanceInfo &domInfo,
+  PassState(mlir::ModuleOp module, const mlir::DataLayout &dl,
+            mlir::DominanceInfo &domInfo,
             std::optional<unsigned> localAllocsThreshold)
-      : domInfo(domInfo), localAllocsThreshold(localAllocsThreshold) {}
+      : domInfo(domInfo), localAllocsThreshold(localAllocsThreshold),
+        symTab(module.getOperation()),
+        llvmTypeConverter(module, /*applyTBAA=*/false,
+                          /*forceUnifiedTBAATree=*/false, dl) {}
   /// memoised call to fir::AliasAnalysis::getSource
   inline const fir::AliasAnalysis::Source &getSource(mlir::Value value) {
     if (!analysisCache.contains(value))
@@ -72,13 +196,14 @@ class PassState {
   }
 
   /// get the per-function TBAATree for this function
-  inline const fir::TBAATree &getFuncTree(mlir::func::FuncOp func) {
-    return forrest[func];
+  inline fir::TBAATree &getMutableFuncTreeWithScope(mlir::func::FuncOp func,
+                                                    fir::DummyScopeOp scope) {
+    auto &scopeMap = scopeNames.at(func);
+    return forrest.getMutableFuncTreeWithScope(func, scopeMap.lookup(scope));
   }
   inline const fir::TBAATree &getFuncTreeWithScope(mlir::func::FuncOp func,
                                                    fir::DummyScopeOp scope) {
-    auto &scopeMap = scopeNames.at(func);
-    return forrest.getFuncTreeWithScope(func, scopeMap.lookup(scope));
+    return getMutableFuncTreeWithScope(func, scope);
   }
 
   void processFunctionScopes(mlir::func::FuncOp func);
@@ -98,8 +223,82 @@ class PassState {
   // attachment.
   bool attachLocalAllocTag();
 
+  // Return fir.global for the given name.
+  fir::GlobalOp getGlobalDefiningOp(mlir::StringAttr name) const {
+    return symTab.lookup<fir::GlobalOp>(name);
+  }
+
+  // Process fir::FortranVariableStorageOpInterface operations within
+  // the given op, and fill in declToStorageMap with the information
+  // about their physical storages and layouts.
+  void collectPhysicalStorageAliasSets(mlir::Operation *op);
+
+  // Return the byte size of the given declaration.
+  std::size_t getDeclarationSize(fir::FortranVariableStorageOpInterface decl) {
+    mlir::Type memType = fir::unwrapRefType(decl.getBase().getType());
+    auto [size, alignment] =
+        getTypeSizeAndAlignment(memType, llvmTypeConverter);
+    return llvm::alignTo(size, alignment);
+  }
+
+  // A StorageDesc specifies an operation that defines a physical storage
+  // and the <offset, size> pair within that physical storage where
+  // a variable resides.
+  struct StorageDesc {
+    StorageDesc() = delete;
+    StorageDesc(mlir::Operation *storageDef, std::uint64_t start,
+                std::size_t size)
+        : storageDef(storageDef), interval(start, size) {}
+
+    // Return a string representing the byte range of the variable within
+    // its storage, e.g. bytes_0_to_0 for a 1-byte variable starting
+    // at offset 0.
+    std::string getByteRangeStr() const {
+      return ("bytes_" + llvm::Twine(interval.getStart()) + "_to_" +
+              llvm::Twine(interval.getEnd()))
+          .str();
+    }
+
+    mlir::Operation *storageDef;
+    IntervalTy interval;
+  };
+
+  // Fills in declToStorageMap on the first invocation.
+  // Returns a storage descriptor for the given op (if registered
+  // in declToStorageMap).
+  const StorageDesc *computeStorageDesc(mlir::Operation *op) {
+    if (!op)
+      return nullptr;
+
+    // TODO: it should be safe to run collectPhysicalStorageAliasSets()
+    // on the parent func.func instead of the module, since the TBAA
+    // tags use different roots per function. This may provide better
+    // results for storages that have members with descriptors
+    // in one function but not the others.
+    if (!declToStorageMapComputed)
+      collectPhysicalStorageAliasSets(op->getParentOfType<mlir::ModuleOp>());
+    return getStorageDesc(op);
+  }
+
+private:
+  const StorageDesc *getStorageDesc(mlir::Operation *op) const {
+    auto it = declToStorageMap.find(op);
+    return it == declToStorageMap.end() ? nullptr : &it->second;
+  }
+
+  StorageDesc &getMutableStorageDesc(mlir::Operation *op) {
+    auto it = declToStorageMap.find(op);
+    assert(it != declToStorageMap.end());
+    return it->second;
+  }
+
 private:
   mlir::DominanceInfo &domInfo;
+  std::optional<unsigned> localAllocsThreshold;
+  // Symbol table cache for the module.
+  mlir::SymbolTable symTab;
+  // Type converter to compute the size of declarations.
+  fir::LLVMTypeConverter llvmTypeConverter;
   fir::AliasAnalysis analysis;
   llvm::DenseMap<mlir::Value, fir::AliasAnalysis::Source> analysisCache;
   fir::TBAAForrest forrest;
@@ -118,7 +317,12 @@ class PassState {
   // member(s), to avoid the cost of isRecordWithDescriptorMember().
   llvm::DenseSet<mlir::Type> typesContainingDescriptors;
 
-  std::optional<unsigned> localAllocsThreshold;
+  // A map between fir::FortranVariableStorageOpInterface operations
+  // and their storage descriptors.
+  llvm::DenseMap<mlir::Operation *, StorageDesc> declToStorageMap;
+  // declToStorageMapComputed is set to true after declToStorageMap
+  // is initialized by collectPhysicalStorageAliasSets().
+  bool declToStorageMapComputed = false;
 };
 
 // Process fir.dummy_scope operations in the given func:
@@ -198,6 +402,202 @@ bool PassState::attachLocalAllocTag() {
   return true;
 }
 
+static mlir::Value getStorageDefinition(mlir::Value storageRef) {
+  while (auto convert =
+             mlir::dyn_cast_or_null<fir::ConvertOp>(storageRef.getDefiningOp()))
+    storageRef = convert.getValue();
+  return storageRef;
+}
+
+void PassState::collectPhysicalStorageAliasSets(mlir::Operation *op) {
+  // A map between fir::FortranVariableStorageOpInterface operations
+  // and the intervals describing their layout within their physical
+  // storages.
+  llvm::DenseMap<mlir::Operation *, IntervalSetTy> memberIntervals;
+  // A map between operations defining physical storages (e.g. fir.global)
+  // and sets of fir::FortranVariableStorageOpInterface operations
+  // declaring their member variables.
+  llvm::DenseMap<mlir::Operation *, llvm::SmallVector<mlir::Operation *, 10>>
+      storageDecls;
+
+  bool seenUnknownStorage = false;
+  bool seenDeclWithDescriptor = false;
+  op->walk([&](fir::FortranVariableStorageOpInterface decl) {
+    mlir::Value storageRef = decl.getStorage();
+    if (!storageRef)
+      return mlir::WalkResult::advance();
+
+    // If we have seen a declaration of a variable containing
+    // a descriptor, and we have not been able to identify
+    // a storage of any variable, then any variable may
+    // potentially overlap with the variable containing
+    // a descriptor. In this case, it is hard to make any
+    // assumptions about any variable with physical
+    // storage. Exit early.
+    if (seenUnknownStorage && seenDeclWithDescriptor)
+      return mlir::WalkResult::interrupt();
+
+    if (typeReferencesDescriptor(decl.getBase().getType()))
+      seenDeclWithDescriptor = true;
+
+    mlir::Operation *storageDef =
+        getStorageDefinition(storageRef).getDefiningOp();
+    // All physical storages that are defined by non-global
+    // objects (e.g. via fir.alloca) indicate an EQUIVALENCE.
+    // Inside an EQUIVALENCE each variable overlaps
+    // with at least one another variable. So all EQUIVALENCE
+    // variables belong to the same alias set, and there is
+    // no reason to investigate them further.
+    // Note that, in general, the storage may be defined by a block
+    // argument.
+    auto addrOfOp = mlir::dyn_cast_or_null<fir::AddrOfOp>(storageDef);
+    if (!storageDef ||
+        (!addrOfOp && !mlir::dyn_cast<fir::AllocaOp>(storageDef))) {
+      seenUnknownStorage = true;
+      return mlir::WalkResult::advance();
+    }
+    if (!addrOfOp)
+      return mlir::WalkResult::advance();
+    fir::GlobalOp globalDef =
+        getGlobalDefiningOp(addrOfOp.getSymbol().getRootReference());
+    std::uint64_t storageOffset = decl.getStorageOffset();
+    std::size_t declSize = getDeclarationSize(decl);
+    LLVM_DEBUG(llvm::dbgs()
+               << "Found variable with storage:\n"
+               << "Declaration: " << decl << "\n"
+               << "Storage: " << (globalDef ? globalDef : nullptr) << "\n"
+               << "Offset: " << storageOffset << "\n"
+               << "Size: " << declSize << "\n");
+    if (!globalDef) {
+      seenUnknownStorage = true;
+      return mlir::WalkResult::advance();
+    }
+    // Zero-sized variables do not need any TBAA tags, because
+    // they cannot be accessed.
+    if (declSize == 0)
+      return mlir::WalkResult::advance();
+
+    declToStorageMap.try_emplace(decl.getOperation(), globalDef.getOperation(),
+                                 storageOffset, declSize);
+    storageDecls.try_emplace(globalDef.getOperation())
+        .first->second.push_back(decl.getOperation());
+
+    auto &set =
+        memberIntervals.try_emplace(globalDef.getOperation()).first->second;
+    set.insert(IntervalTy(storageOffset, declSize));
+    return mlir::WalkResult::advance();
+  });
+
+  // Mark the map as computed before any early exits below.
+  declToStorageMapComputed = true;
+
+  if (seenUnknownStorage && seenDeclWithDescriptor) {
+    declToStorageMap.clear();
+    ret...
[truncated]

@vzakhari
Copy link
Contributor Author

vzakhari commented Sep 3, 2025

I am testing the relocation of AddAliasTags pass, and will do that in a separate PR, if it is successful.

vzakhari added a commit to vzakhari/llvm-project that referenced this pull request Sep 4, 2025
Move AddAliasTags after BoxedProcedurePass to be able to compute
the size of Fortran variables by converting their types to LLVM
types and applying DL size/alignment computations.
This is one of the approaches to unblock llvm#156558.

This change does not affect performance in my testing.
Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update. LGTM but wait for Jean's approval as well.

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

vzakhari added a commit that referenced this pull request Sep 4, 2025
Move AddAliasTags after BoxedProcedurePass to be able to compute
the size of Fortran variables by converting their types to LLVM
types and applying DL size/alignment computations.
This is one of the approaches to unblock #156558.

This change does not affect performance in my testing.
@vzakhari vzakhari merged commit 556ff19 into llvm:main Sep 4, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants