Commit b2edd95

[Codegen][AMDGPU] Drop backend reverts, emergency RDNA4 lowering fix (iree-org#21906)
The upstream lowering for amdgpu.lds_barrier is incorrect in the face of the backend changes whose reverts this commit drops. The lowering will be fixed in the near future. Meanwhile, stick with gpu.barrier on RDNA4 (incurring the full __syncthreads() semantics and thus potentially introducing unwanted fencing on global memory) until the upstream issue is resolved. This may incur performance penalties in some cases.
1 parent a8e9fe9 commit b2edd95

2 files changed, +9 -4 lines changed


compiler/src/iree/compiler/Codegen/LLVMGPU/ConvertToROCDL.cpp

Lines changed: 8 additions & 3 deletions
@@ -69,8 +69,13 @@ struct ReplaceGPUBarrierWithLDSBarrier
   }
 };
 
-static void populateConvertGPUToAMDGPUPatterns(RewritePatternSet &patterns) {
-  patterns.add<ReplaceGPUBarrierWithLDSBarrier>(patterns.getContext());
+static void populateConvertGPUToAMDGPUPatterns(RewritePatternSet &patterns,
+                                               const amdgpu::Chipset &chipset) {
+  // TODO(kdrewnia): This if statement is an emergency fix for an incorrect
+  // lowering of amdgpu.lds_barrier.
+  if (chipset.majorVersion != 12) {
+    patterns.add<ReplaceGPUBarrierWithLDSBarrier>(patterns.getContext());
+  }
 }
 
 /// Hacky pattern to swap `s_setprio` operations with `amdgpu.mfma` ops.
@@ -266,7 +271,7 @@ struct ConvertToROCDLPass final
           /*allowPackedF16Rtz=*/false, /*chipset=*/*maybeChipset);
       arith::populateCeilFloorDivExpandOpsPatterns(patterns);
       populateSwapSetPrioWithMFMAPatterns(patterns);
-      populateConvertGPUToAMDGPUPatterns(patterns);
+      populateConvertGPUToAMDGPUPatterns(patterns, *maybeChipset);
       populateConvertSharedMemoryAllocOps(patterns);
       populateDropSharedMemoryDeallocOpPatterns(patterns);
       vector::populateVectorToVectorCanonicalizationPatterns(patterns);
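
The chipset gate in the first hunk can be illustrated outside of MLIR with a minimal standalone sketch. Everything here is hypothetical and only models the behavior: `Chipset` is a stand-in for `amdgpu::Chipset` with just the fields needed, `parseChipset` assumes target names of the form `gfx<major><minor><stepping>` (decimal major version, one hex digit each for minor and stepping), and `shouldUseLdsBarrier` is an illustrative name for the `majorVersion != 12` check. It also assumes that RDNA4 targets such as `gfx1201` carry major version 12.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for amdgpu::Chipset; not the real MLIR type.
struct Chipset {
  unsigned majorVersion;
  unsigned minorVersion;
  unsigned steppingVersion;
};

// Converts one lowercase hex digit to its value, or nullopt if invalid.
static std::optional<unsigned> hexDigit(char c) {
  if (c >= '0' && c <= '9') return static_cast<unsigned>(c - '0');
  if (c >= 'a' && c <= 'f') return static_cast<unsigned>(10 + (c - 'a'));
  return std::nullopt;
}

// Parses names such as "gfx942" or "gfx1201" (assumed format): the last two
// characters are minor version and stepping, the digits before are the major.
std::optional<Chipset> parseChipset(const std::string &name) {
  if ((name.size() != 6 && name.size() != 7) || name.compare(0, 3, "gfx") != 0)
    return std::nullopt;
  unsigned major = 0;
  for (std::size_t i = 3; i + 2 < name.size(); ++i) {
    if (name[i] < '0' || name[i] > '9') return std::nullopt;
    major = major * 10 + static_cast<unsigned>(name[i] - '0');
  }
  std::optional<unsigned> minor = hexDigit(name[name.size() - 2]);
  std::optional<unsigned> stepping = hexDigit(name[name.size() - 1]);
  if (!minor || !stepping) return std::nullopt;
  return Chipset{major, *minor, *stepping};
}

// Mirrors the emergency gate: keep gpu.barrier on major version 12 and use
// the LDS-barrier rewrite everywhere else.
bool shouldUseLdsBarrier(const Chipset &chipset) {
  return chipset.majorVersion != 12;
}
```

Under these assumptions, `gfx1201` keeps `gpu.barrier`, while a target such as `gfx942` still gets the `ReplaceGPUBarrierWithLDSBarrier` rewrite.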
