-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[flang][NFC] Characterize allocation based on MemAlloc effect instead of pattern matching #166806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@llvm/pr-subscribers-flang-fir-hlfir Author: Susan Tan (ス-ザン タン) (SusanTan) ChangesFlang alias analysis used to find allocation site by pattern matching allocation ops in mainly FIR dialect. This MR extends the characterization to instead characterize based on whether an op has MemAlloc effect (can check both value-based and op-based). Full diff: https://github.com/llvm/llvm-project/pull/166806.diff 2 Files Affected:
diff --git a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
index 73ddd1ff80126..94f9ec5892c58 100644
--- a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
+++ b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
@@ -22,11 +22,73 @@
#include "llvm/ADT/TypeSwitch.h"
#include "llvm/Support/Casting.h"
#include "llvm/Support/Debug.h"
+#include "mlir/Interfaces/ViewLikeInterface.h"
using namespace mlir;
#define DEBUG_TYPE "fir-alias-analysis"
+//===----------------------------------------------------------------------===//
+// AliasAnalysis: alias helpers
+//===----------------------------------------------------------------------===//
+
+static bool tryClassifyAllocateFromEffects(mlir::Operation *op,
+ mlir::Value candidate, bool allowValueScoped, bool allowOpScoped,
+ mlir::Value &v, mlir::Operation *&defOp,
+ fir::AliasAnalysis::SourceKind &type) {
+ auto iface = llvm::dyn_cast<mlir::MemoryEffectOpInterface>(op);
+ if (!iface)
+ return false;
+
+ llvm::SmallVector<mlir::MemoryEffects::EffectInstance, 4> effects;
+ iface.getEffects(effects);
+
+ if (allowValueScoped) {
+ for (mlir::MemoryEffects::EffectInstance &e : effects) {
+ if (mlir::isa<mlir::MemoryEffects::Allocate>(e.getEffect()) &&
+ e.getValue() && e.getValue() == candidate) {
+ v = candidate;
+ defOp = op;
+ type = fir::AliasAnalysis::SourceKind::Allocate;
+ return true;
+ }
+ }
+ }
+
+ if (!allowOpScoped)
+ return false;
+
+ bool hasOpScopedAlloc = llvm::any_of(
+ effects, [](const mlir::MemoryEffects::EffectInstance &e) {
+ return !e.getValue() &&
+ mlir::isa<mlir::MemoryEffects::Allocate>(e.getEffect());
+ });
+ if (!hasOpScopedAlloc)
+ return false;
+
+ bool opIsViewLike =
+ (bool)mlir::dyn_cast_or_null<mlir::ViewLikeOpInterface>(op);
+ auto isMemoryRefLikeType = [](mlir::Type type) {
+ return fir::isa_ref_type(type) || mlir::isa<mlir::BaseMemRefType>(type) ||
+ mlir::isa<mlir::LLVM::LLVMPointerType>(type);
+ };
+ bool hasMemOperands = llvm::any_of(op->getOperands(), [&](mlir::Value o) {
+ return isMemoryRefLikeType(o.getType());
+ });
+ if (opIsViewLike || hasMemOperands)
+ return false;
+
+ for (mlir::Value res : op->getResults()) {
+ if (res == candidate && isMemoryRefLikeType(res.getType())) {
+ v = candidate;
+ defOp = op;
+ type = fir::AliasAnalysis::SourceKind::Allocate;
+ return true;
+ }
+ }
+ return false;
+}
+
//===----------------------------------------------------------------------===//
// AliasAnalysis: alias
//===----------------------------------------------------------------------===//
@@ -533,30 +595,47 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v,
mlir::SymbolRefAttr global;
Source::Attributes attributes;
mlir::Operation *instantiationPoint{nullptr};
+ // Helper to conservatively classify a candidate value as coming from a
+ // dummy argument or as indirect when no allocation or global can be proven.
+ auto classifyFallbackFrom = [&](mlir::Value candidate) {
+ if (isDummyArgument(candidate)) {
+ defOp = nullptr;
+ v = candidate;
+ } else {
+ type = SourceKind::Indirect;
+ }
+ };
+
+ // Helper to detect memory-ref-like types.
+ auto isMemoryRefLikeType = [](mlir::Type t) {
+ return fir::isa_ref_type(t) || mlir::isa<mlir::BaseMemRefType>(t) ||
+ mlir::isa<mlir::LLVM::LLVMPointerType>(t);
+ };
+
while (defOp && !breakFromLoop) {
ty = defOp->getResultTypes()[0];
+
+ // Effect-based detection (op-scoped heuristic only at this level).
+ if (tryClassifyAllocateFromEffects(defOp, v,
+ /*allowValueScoped=*/false,
+ /*allowOpScoped=*/true,
+ v, defOp, type))
+ break;
+
llvm::TypeSwitch<Operation *>(defOp)
.Case<hlfir::AsExprOp>([&](auto op) {
v = op.getVar();
defOp = v.getDefiningOp();
})
.Case<hlfir::AssociateOp>([&](auto op) {
+ // Do not pattern-match Allocate. Trace through the source.
mlir::Value source = op.getSource();
- if (fir::isa_trivial(source.getType())) {
- // Trivial values will always use distinct temp memory,
- // so we can classify this as Allocate and stop.
- type = SourceKind::Allocate;
- breakFromLoop = true;
- } else {
- // AssociateOp may reuse the expression storage,
- // so we have to trace further.
- v = source;
- defOp = v.getDefiningOp();
- }
+ v = source;
+ defOp = v.getDefiningOp();
})
.Case<fir::AllocaOp, fir::AllocMemOp>([&](auto op) {
- // Unique memory allocation.
- type = SourceKind::Allocate;
+ // Do not pattern-match allocations by op name; rely on memory
+ // effects classification above. Nothing to do here.
breakFromLoop = true;
})
.Case<fir::ConvertOp>([&](auto op) {
@@ -627,18 +706,14 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v,
if (global) {
type = SourceKind::Global;
} else {
- auto def = llvm::cast<mlir::Value>(boxSrc.origin.u);
- // TODO: Add support to fir.allocmem
- if (auto allocOp = def.template getDefiningOp<fir::AllocaOp>()) {
- v = def;
- defOp = v.getDefiningOp();
- type = SourceKind::Allocate;
- } else if (isDummyArgument(def)) {
- defOp = nullptr;
- v = def;
- } else {
- type = SourceKind::Indirect;
- }
+ mlir::Value def = llvm::cast<mlir::Value>(boxSrc.origin.u);
+ bool classified = false;
+ if (auto defDefOp = def.getDefiningOp())
+ classified = tryClassifyAllocateFromEffects(
+ defDefOp, def,
+ /*allowValueScoped=*/true, /*allowOpScoped=*/true,
+ v, defOp, type);
+ if (!classified) classifyFallbackFrom(def);
}
breakFromLoop = true;
return;
diff --git a/flang/test/Analysis/AliasAnalysis/cuf-alloc-source-kind.mlir b/flang/test/Analysis/AliasAnalysis/cuf-alloc-source-kind.mlir
new file mode 100644
index 0000000000000..6a911f8ff25e3
--- /dev/null
+++ b/flang/test/Analysis/AliasAnalysis/cuf-alloc-source-kind.mlir
@@ -0,0 +1,22 @@
+// REQUIRES: asserts
+// RUN: fir-opt %s -pass-pipeline='builtin.module(func.func(test-fir-alias-analysis))' -debug-only=fir-alias-analysis --mlir-disable-threading 2>&1 | FileCheck %s
+
+// Verify that a CUF allocation is recognized as SourceKind::Allocate by
+// fir::AliasAnalysis::getSource.
+
+module {
+ func.func @_QQmain() attributes {fir.bindc_name = "TEST"} {
+ // Allocate two independent device arrays and tag the results; with
+ // op-scoped MemAlloc handling in AA, these should be classified as
+ // Allocate and not alias.
+ %a = cuf.alloc !fir.box<!fir.heap<!fir.array<?xf32>>> {bindc_name = "a1", data_attr = #cuf.cuda<device>, uniq_name = "_QFEa1", test.ptr = "cuf_alloc_a"} -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
+ %b = cuf.alloc !fir.box<!fir.heap<!fir.array<?xf32>>> {bindc_name = "a2", data_attr = #cuf.cuda<device>, uniq_name = "_QFEa2", test.ptr = "cuf_alloc_b"} -> !fir.ref<!fir.box<!fir.heap<!fir.array<?xf32>>>>
+ return
+ }
+}
+
+// CHECK-LABEL: Testing : "_QQmain"
+// Distinct allocations should not alias.
+// CHECK: cuf_alloc_a#0 <-> cuf_alloc_b#0: NoAlias
+
+
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
vzakhari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, Susan!
I have a concern about the "scoped" case (see the comment inlined), otherwise, the approach looks good to me.
|
|
||
| if (!allowOpScoped) | ||
| return false; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we can apply the logic below to operations in general. Operations may have multiple results, so we would not know which result is a new allocation. Even with a single result, I think, an operation can do MemAlloc without returning it as a result. Sorry for being pedantic :)
Can we instead modify cuf.alloc to attach the MemAlloc effect to its result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok yea.... that was what I was originally doing but didn't know if it's ok to modify cuf. I'll add @clementval in for review as well then once im done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it's ok to modify the op. Just make it in a separate PR maybe.
vzakhari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the update! It looks good, though I have a couple of suggestions in the inline comments.
Please also add [NFC] to the PR title.
| mlir::MemoryEffects::Allocate::get(), | ||
| mlir::cast<mlir::OpResult>(getOperation()->getResult(0)), | ||
| mlir::SideEffects::AutomaticAllocationScopeResource::get()); | ||
| // Preserve previous behavior: op-scoped Allocate for passes relying on it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we do not need this here and for AllocMemOp. Did you find any LIT tests that fail without this?
I believe the effect above implies the effect that you are adding below. If it is not the case I am interested to see an example that fails.
|
|
||
| #define DEBUG_TYPE "fir-alias-analysis" | ||
|
|
||
| static bool classifyAllocateFromEffects(mlir::Operation *op, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a header comment.
I would also try to simplify the interface of this function (I am not a fan of returning results through references, if this can be avoided :) ).
I recommend getting rid of arguments v, defOp and type. Instead of returning bool we can return either SourceKind::Unknown or SourceKind::Allocate. Then the callers can update their v and defOp values correspondingly if it returns SourceKind::Allocate. I think the interface will become more clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, thank you!
| def fir_AllocaOp : fir_Op<"alloca", [AttrSizedOperandSegments, | ||
| MemoryEffects<[MemAlloc<AutomaticAllocationScopeResource>]>]> { | ||
| def fir_AllocaOp : fir_Op<"alloca", | ||
| [AttrSizedOperandSegments, DeclareOpInterfaceMethods<MemoryEffectsOpInterface>]> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just FYI: if you can remove code in fir::AllocaOp::getEffects as I suggest below, then you can declare the MemAlloc effect in the operation definition like this:
| [MemAlloc<AutomaticAllocationScopeResource>]>:$res); |
This way you won't need to change the cpp files.
jeanPerier
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing on top of Slava's comments. Thanks!
vzakhari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for updating the tests, Susan! Sorry that you have to deal with it, but I consider your change as an improvement :)
Many of these FIR-lowering tests are not relevant any more, but we still need to keep them working. I suppose those fir.alloca operations that are used by unused [hl]fir.declare will stay, since [hl]fir.declare has side effects. So for HLFIR-lowering your change should only affect fir allocation operations for which we do not create [hl]fir.declare and that are unused, and it is good that the compiler will just remove them now.
| %c0 = arith.constant 0 : i32 | ||
| fir.store %c0 to %1 : !fir.ref<i32> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not understand why we need this. Both %0 and %1 are directly or indirectly used by fir.store, so they should not become dead.
flang/test/Fir/alloc.fir
Outdated
| %2 = fir.alloca !fir.box<none> | ||
| %3 = fir.alloca !fir.box<!fir.array<?xnone>> | ||
| // Add real uses so allocas are not trivially dead. | ||
| func.call @__use_class_none(%0) : (!fir.ref<!fir.class<none>>) -> () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please use fir.call instead?
| fir.call @_QFPsb(%18, %19#0) fastmath<contract> : (!fir.ref<!fir.array<?xf64>>, !fir.ref<i32>) -> () | ||
| hlfir.copy_out %0, %17#1 : (!fir.ref<!fir.box<!fir.heap<!fir.array<?xf64>>>>, i1) -> () | ||
| hlfir.end_associate %19#1, %19#2 : !fir.ref<i32>, i1 | ||
| // Keep %0 live to avoid DCE after inlining when no copy_out is needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I do not get what happens in this case - can you please explain? I think we do not have to insert the fake use here.
| // CHECK: %[[VAL_22:.*]] = fir.box_addr %[[VAL_21:.*]]#0 : (!fir.box<!fir.array<?xf64>>) -> !fir.ref<!fir.array<?xf64>> | ||
| // CHECK: %[[VAL_23:.*]]:3 = hlfir.associate %[[VAL_5:.*]] {adapt.valuebyref} : (i32) -> (!fir.ref<i32>, !fir.ref<i32>, i1) | ||
| // CHECK: fir.call @_QFPsb(%[[VAL_22:.*]], %[[VAL_23:.*]]#0) fastmath<contract> : (!fir.ref<!fir.array<?xf64>>, !fir.ref<i32>) -> () | ||
| // CHECK: hlfir.copy_out %16, %15#1 : (!fir.ref<!fir.box<!fir.array<?xf64>>>, i1) -> () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like some checks are using literal values, so I guess some SSA renumbering could fail it.
| fir.call @_QFPsb(%18, %19#1) fastmath<contract> : (!fir.ref<!fir.array<?xf64>>, !fir.ref<i32>) -> () | ||
| hlfir.copy_out %0, %17#1 to %16 : (!fir.ref<!fir.box<!fir.heap<!fir.array<?xf64>>>>, i1, !fir.box<!fir.array<?xf64>>) -> () | ||
| hlfir.end_associate %19#1, %19#2 : !fir.ref<i32>, i1 | ||
| // Keep %0 live to avoid DCE after inlining. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: no need for a fake use.
| %3:2 = hlfir.copy_in %2#0 to %0 : (!fir.box<!fir.array<*:f32>>, !fir.ref<!fir.box<!fir.heap<!fir.array<*:f32>>>>) -> (!fir.box<!fir.array<*:f32>>, i1) | ||
| fir.call @_QPtakes_contiguous_intentin(%3#0) fastmath<contract> : (!fir.box<!fir.array<*:f32>>) -> () | ||
| hlfir.copy_out %0, %3#1 : (!fir.ref<!fir.box<!fir.heap<!fir.array<*:f32>>>>, i1) -> () | ||
| // Keep %0 live to avoid DCE after inlining. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here: no need for a fake use.
|
|
||
| ! CHECK-LABEL: func @_QMdPlocal_derived( | ||
| subroutine local_derived() | ||
| ! CHECK-DAG: fir.alloca !fir.type<_QMdTc2{ch_array:!fir.array<20x30x!fir.char<1,10>>}> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to keep these checks by adding fake uses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed! please check the most updated commit. for some the llvm diff doesn't show this being outdated
| ! CHECK: cf.cond_br %{{.*}}, ^[[BODY:.*]], ^[[EXIT:.*]] | ||
|
|
||
| ! CHECK: ^[[BODY]]: | ||
| ! CHECK-NEXT: %{{.*}} = fir.alloca !fir.logical<4> {bindc_name = "success", {{.*}}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need some fake use here (e.g. a store into success).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed! please check the most updated commit. for some the llvm diff doesn't show this being outdated
| ! CHECK-DAG: fir.alloca !fir.array<?x?x?xi32>, %{{.*}}, %{{.*}}, %{{.*}} {bindc_name = "a", fir.target, uniq_name = "_QFlisEa"} | ||
| ! CHECK-DAG: fir.alloca !fir.array<?x?xi32>, %{{.*}}, %{{.*}} {bindc_name = "r", uniq_name = "_QFlisEr"} | ||
| ! CHECK-DAG: fir.alloca !fir.array<?x?xi32>, %{{.*}}, %{{.*}} {bindc_name = "s", uniq_name = "_QFlisEs"} | ||
| ! CHECK-DAG: fir.alloca !fir.array<?x?xi32>, %{{.*}}, %{{.*}} {bindc_name = "t", uniq_name = "_QFlisEt"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a store into t to keep it alive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please ignore this comment. This fir.alloca is unused (due to local(t) below), and it does not add much to check for this alloca.
| ! First test is here to have a reference with non polymorphic on both sides. | ||
| ! CHECK-LABEL: func.func @_QMpolymorphic_testPpointer_assign_parent( | ||
| ! CHECK-SAME: %[[ARG0:.*]]: !fir.ref<!fir.type<_QMpolymorphic_testTp2{a:i32,b:i32,c:f32}>> {fir.bindc_name = "p", fir.target}) { | ||
| ! CHECK: %[[TP:.*]] = fir.alloca !fir.box<!fir.ptr<!fir.type<_QMpolymorphic_testTp1{a:i32,b:i32}>>> {bindc_name = "tp", uniq_name = "_QMpolymorphic_testFpointer_assign_parentEtp"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some fake uses here and below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the line below is a fir.alloca of a pointer for "tp" as tp is declared as a pointer. We decided to not change this.
vzakhari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @SusanTan! Looks great!
|
It causes failure in |
Flang alias analysis used to find allocation site by pattern matching allocation ops in mainly FIR dialect. This MR extends the characterization to instead characterize based on whether the result of an op has MemAlloc effect.