Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
59 changes: 45 additions & 14 deletions llvm/lib/Transforms/IPO/AttributorAttributes.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -12583,16 +12583,36 @@ struct AAAddressSpaceImpl : public AAAddressSpace {
}

ChangeStatus updateImpl(Attributor &A) override {
unsigned FlatAS = A.getInfoCache().getFlatAddressSpace().value();
uint32_t OldAddressSpace = AssumedAddressSpace;
auto *AUO = A.getOrCreateAAFor<AAUnderlyingObjects>(getIRPosition(), this,
DepClassTy::REQUIRED);
auto Pred = [&](Value &Obj) {

auto CheckAddressSpace = [&](Value &Obj) {
if (isa<UndefValue>(&Obj))
return true;
// If an argument in flat address space only has addrspace cast uses, and
// those casts are same, then we take the dst addrspace.
if (auto *Arg = dyn_cast<Argument>(&Obj)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it is safe for us to directly say here that, if it is an argument of an entry point function, it is in global address space.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you can say that generically with the information available in a non-target pass. I would also consider this case a frontend bug, and they should emit parameters with the correct address space to begin with. We already do some hacking around clang using flat pointers in the backend

Copy link
Contributor Author

@shiltian shiltian Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, I'd say that as well, but this is exactly what we do in the kernel argument promotion pass. The front end emits things correctly but we did this on purpose in the pass, hoping InferAddressSpacePass to pick it up. I have conservative opinions on whether the pass should do that way. I don't know why we can't just update the argument type to ptr addrspace(1) instead of making this hassle. In OpenCL pre 2.0 we have that requirement anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked to @rampitec that we can't change kernel's signature. We can only recognize the pattern and support it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course you can change the kernel signature. clang should emit IR with addrspace(1) values directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per discussion with @rampitec, in HIP all pointers which can be passed to a kernel are flat.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a source level, yes. But that has no bearing on what IR clang produces. clang can and should emit ptr addrspace(1) pointers with a cast to addrspace(0)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, the AMDGPUPromoteKernelArgumentsPass should not exist at the first place.
Both Flang and Clang (for OpenMP target offloading) still generate generic pointer.
HIP and OpenCL front ends generate AS1 pointer, which is convenient.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it shouldn't. It's a hack for frontends not emitting the nicer IR

if (Arg->getType()->getPointerAddressSpace() == FlatAS) {
unsigned CastAddrSpace = FlatAS;
for (auto *U : Arg->users()) {
auto *ASCI = dyn_cast<AddrSpaceCastInst>(U);
if (!ASCI)
return takeAddressSpace(Obj.getType()->getPointerAddressSpace());
if (CastAddrSpace != FlatAS &&
CastAddrSpace != ASCI->getDestAddressSpace())
return false;
CastAddrSpace = ASCI->getDestAddressSpace();
}
if (CastAddrSpace != FlatAS)
return takeAddressSpace(CastAddrSpace);
}
}
return takeAddressSpace(Obj.getType()->getPointerAddressSpace());
};

if (!AUO->forallUnderlyingObjects(Pred))
auto *AUO = A.getOrCreateAAFor<AAUnderlyingObjects>(getIRPosition(), this,
DepClassTy::REQUIRED);
if (!AUO->forallUnderlyingObjects(CheckAddressSpace))
return indicatePessimisticFixpoint();

return OldAddressSpace == AssumedAddressSpace ? ChangeStatus::UNCHANGED
Expand All @@ -12601,17 +12621,21 @@ struct AAAddressSpaceImpl : public AAAddressSpace {

/// See AbstractAttribute::manifest(...).
ChangeStatus manifest(Attributor &A) override {
if (getAddressSpace() == InvalidAddressSpace ||
getAddressSpace() == getAssociatedType()->getPointerAddressSpace())
unsigned NewAS = getAddressSpace();

if (NewAS == InvalidAddressSpace ||
NewAS == getAssociatedType()->getPointerAddressSpace())
return ChangeStatus::UNCHANGED;

unsigned FlatAS = A.getInfoCache().getFlatAddressSpace().value();

Value *AssociatedValue = &getAssociatedValue();
Value *OriginalValue = peelAddrspacecast(AssociatedValue);
Value *OriginalValue = peelAddrspacecast(AssociatedValue, FlatAS);

PointerType *NewPtrTy =
PointerType::get(getAssociatedType()->getContext(), getAddressSpace());
PointerType::get(getAssociatedType()->getContext(), NewAS);
bool UseOriginalValue =
OriginalValue->getType()->getPointerAddressSpace() == getAddressSpace();
OriginalValue->getType()->getPointerAddressSpace() == NewAS;

bool Changed = false;

Expand Down Expand Up @@ -12671,12 +12695,19 @@ struct AAAddressSpaceImpl : public AAAddressSpace {
return AssumedAddressSpace == AS;
}

static Value *peelAddrspacecast(Value *V) {
if (auto *I = dyn_cast<AddrSpaceCastInst>(V))
return peelAddrspacecast(I->getPointerOperand());
static Value *peelAddrspacecast(Value *V, unsigned FlatAS) {
if (auto *I = dyn_cast<AddrSpaceCastInst>(V)) {
assert(I->getSrcAddressSpace() != FlatAS &&
"there should not be flat AS -> non-flat AS");
return I->getPointerOperand();
}
if (auto *C = dyn_cast<ConstantExpr>(V))
if (C->getOpcode() == Instruction::AddrSpaceCast)
return peelAddrspacecast(C->getOperand(0));
if (C->getOpcode() == Instruction::AddrSpaceCast) {
assert(C->getOperand(0)->getType()->getPointerAddressSpace() !=
FlatAS &&
"there should not be flat AS -> non-flat AS X");
return C->getOperand(0);
}
return V;
}
};
Expand Down
35 changes: 35 additions & 0 deletions llvm/test/CodeGen/AMDGPU/aa-as-infer.ll
Original file line number Diff line number Diff line change
Expand Up @@ -243,3 +243,38 @@ define void @foo(ptr addrspace(3) %val) {
ret void
}

define void @kernel_argument_promotion_pattern_intra_procedure(ptr %p, i32 %val) {
; CHECK-LABEL: define void @kernel_argument_promotion_pattern_intra_procedure(
; CHECK-SAME: ptr [[P:%.*]], i32 [[VAL:%.*]]) #[[ATTR0]] {
; CHECK-NEXT: [[P_CAST_0:%.*]] = addrspacecast ptr [[P]] to ptr addrspace(1)
; CHECK-NEXT: store i32 [[VAL]], ptr addrspace(1) [[P_CAST_0]], align 4
; CHECK-NEXT: ret void
;
%p.cast.0 = addrspacecast ptr %p to ptr addrspace(1)
%p.cast.1 = addrspacecast ptr addrspace(1) %p.cast.0 to ptr
store i32 %val, ptr %p.cast.1
ret void
}

define internal void @use_argument_after_promotion(ptr %p, i32 %val) {
; CHECK-LABEL: define internal void @use_argument_after_promotion(
; CHECK-SAME: ptr [[P:%.*]], i32 [[VAL:%.*]]) #[[ATTR0]] {
; CHECK-NEXT: [[TMP1:%.*]] = addrspacecast ptr [[P]] to ptr addrspace(1)
; CHECK-NEXT: store i32 [[VAL]], ptr addrspace(1) [[TMP1]], align 4
; CHECK-NEXT: ret void
;
store i32 %val, ptr %p
ret void
}

define void @kernel_argument_promotion_pattern_inter_procedure(ptr %p, i32 %val) {
; CHECK-LABEL: define void @kernel_argument_promotion_pattern_inter_procedure(
; CHECK-SAME: ptr [[P:%.*]], i32 [[VAL:%.*]]) #[[ATTR0]] {
; CHECK-NEXT: call void @use_argument_after_promotion(ptr [[P]], i32 [[VAL]])
; CHECK-NEXT: ret void
;
%p.cast.0 = addrspacecast ptr %p to ptr addrspace(1)
%p.cast.1 = addrspacecast ptr addrspace(1) %p.cast.0 to ptr
call void @use_argument_after_promotion(ptr %p.cast.1, i32 %val)
ret void
}
Loading