Commit 3b4d632

[AMD] Disable f16/bf16 buffer atomic operations (triton-lang#6090)
Buffer atomics for f16/bf16 dtypes are lowered to ``@llvm.amdgcn.raw.buffer.atomic.fadd.v2f16`` intrinsics, which require the input address to be 4-byte aligned. Disable these cases for now until a proper fix is implemented. Signed-off-by: joviliast <[email protected]>
1 parent 02f101f commit 3b4d632

1 file changed: +6 -2 lines changed

third_party/amd/lib/TritonAMDGPUTransforms/ConvertToBufferOps.cpp

Lines changed: 6 additions & 2 deletions
@@ -313,8 +313,12 @@ struct ConvertTritonAtomicRMWOpToBufferAtomicRMW
     // 4. Buffer atomic RMW does not support FP8 ops
     //    easier to just check what we support
     auto checkType = getElementTypeOrSelf(op.getVal());
-    bool isSupportedType = checkType.isF16() || checkType.isBF16() ||
-                           checkType.isF32() || checkType.isF64() ||
+    // TODO: F16 and BF16 data types are supported by intrinsics with packed
+    // arithmetic on adjacent addresses, requiring the leading address to be
+    // 4-byte aligned. A runtime check should be implemented to enforce this
+    // requirement and ensure fallback to regular atomic operations when
+    // alignment is not met.
+    bool isSupportedType = checkType.isF32() || checkType.isF64() ||
                            checkType.isInteger(32) || checkType.isInteger(64);
     if (!isSupportedType) {
       return rewriter.notifyMatchFailure(op, "RMW with unsupported type");
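
The TODO in this hunk describes a runtime alignment check with a fallback to regular atomics. The following is only a rough sketch of that idea in plain C++, not the fix the commit refers to and not Triton code. All names here (`halfElementAddr`, `isPackedAtomicAligned`, `chooseAtomicPath`, `AtomicPath`) are hypothetical and exist purely for illustration. The sketch assumes 2-byte f16/bf16 elements and a packed operation that covers one 4-byte word.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; none of these exist in Triton.

// Size in bytes of one f16/bf16 element.
constexpr std::uint64_t kHalfBytes = 2;
// The packed v2f16 atomic covers two adjacent halves, i.e. one 4-byte word.
constexpr std::uint64_t kPackedBytes = 4;

// Byte address of element `index` in a half-precision buffer starting at `base`.
constexpr std::uint64_t halfElementAddr(std::uint64_t base, std::uint64_t index) {
  return base + index * kHalfBytes;
}

// True when the address meets the 4-byte alignment the packed intrinsic requires.
constexpr bool isPackedAtomicAligned(std::uint64_t addr) {
  return addr % kPackedBytes == 0;
}

// Which lowering an (assumed) runtime check would select.
enum class AtomicPath { PackedBuffer, Regular };

constexpr AtomicPath chooseAtomicPath(std::uint64_t addr) {
  return isPackedAtomicAligned(addr) ? AtomicPath::PackedBuffer
                                     : AtomicPath::Regular;
}

// With a 4-byte-aligned base, even element indices are aligned and odd ones are not.
static_assert(chooseAtomicPath(halfElementAddr(0x1000, 0)) == AtomicPath::PackedBuffer,
              "even index on an aligned base should take the packed path");
static_assert(chooseAtomicPath(halfElementAddr(0x1000, 1)) == AtomicPath::Regular,
              "odd index on an aligned base should fall back");
```

The sketch shows why a compile-time type check alone is not enough: whether a given f16/bf16 element is 4-byte aligned depends on its runtime offset, so the decision would have to be made per address rather than per type.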
