Skip to content

Conversation

rampitec
Copy link
Collaborator

DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 shall not be claused (we already do
not clause DS instructions) and needs waits before and after.

DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 shall not be claused (we already do
not clause DS instructions) and needs waits before and after.
Copy link
Collaborator Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@rampitec rampitec requested review from changpeng and shiltian August 15, 2025 20:40
@rampitec rampitec marked this pull request as ready for review August 15, 2025 20:40
@llvmbot
Copy link
Member

llvmbot commented Aug 15, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Stanislav Mekhanoshin (rampitec)

Changes

DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 shall not be claused (we already do
not clause DS instructions) and needs waits before and after.


Full diff: https://github.com/llvm/llvm-project/pull/153872.diff

4 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp (+17)
  • (modified) llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h (+1)
  • (modified) llvm/lib/Target/AMDGPU/GCNSubtarget.h (+6)
  • (added) llvm/test/CodeGen/AMDGPU/hazards-gfx1250.mir (+17)
diff --git a/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp b/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
index 1f291ce5c5342..5e297c7540c48 100644
--- a/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
@@ -1202,6 +1202,8 @@ void GCNHazardRecognizer::fixHazards(MachineInstr *MI) {
   fixRequiredExportPriority(MI);
   if (ST.requiresWaitIdleBeforeGetReg())
     fixGetRegWaitIdle(MI);
+  if (ST.hasDsAtomicAsyncBarrierArriveB64PipeBug())
+    fixDsAtomicAsyncBarrierArriveB64(MI);
 }
 
 static bool isVCmpXWritesExec(const SIInstrInfo &TII, const SIRegisterInfo &TRI,
@@ -3451,3 +3453,18 @@ bool GCNHazardRecognizer::fixGetRegWaitIdle(MachineInstr *MI) {
       .addImm(0);
   return true;
 }
+
+bool GCNHazardRecognizer::fixDsAtomicAsyncBarrierArriveB64(MachineInstr *MI) {
+  if (MI->getOpcode() != AMDGPU::DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64)
+    return false;
+
+  const SIInstrInfo *TII = ST.getInstrInfo();
+  BuildMI(*MI->getParent(), MI, MI->getDebugLoc(),
+          TII->get(AMDGPU::S_WAITCNT_DEPCTR))
+      .addImm(0xFFE3);
+  BuildMI(*MI->getParent(), std::next(MI->getIterator()), MI->getDebugLoc(),
+          TII->get(AMDGPU::S_WAITCNT_DEPCTR))
+      .addImm(0xFFE3);
+
+  return true;
+}
diff --git a/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h b/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
index a078f50219c3c..890d5cbd154d6 100644
--- a/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
+++ b/llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
@@ -111,6 +111,7 @@ class GCNHazardRecognizer final : public ScheduleHazardRecognizer {
   bool fixVALUMaskWriteHazard(MachineInstr *MI);
   bool fixRequiredExportPriority(MachineInstr *MI);
   bool fixGetRegWaitIdle(MachineInstr *MI);
+  bool fixDsAtomicAsyncBarrierArriveB64(MachineInstr *MI);
 
   int checkMAIHazards(MachineInstr *MI);
   int checkMAIHazards908(MachineInstr *MI);
diff --git a/llvm/lib/Target/AMDGPU/GCNSubtarget.h b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
index 92de024cc6fcc..436f5c0801fad 100644
--- a/llvm/lib/Target/AMDGPU/GCNSubtarget.h
+++ b/llvm/lib/Target/AMDGPU/GCNSubtarget.h
@@ -1815,6 +1815,12 @@ class GCNSubtarget final : public AMDGPUGenSubtargetInfo,
     // to the same register.
     return false;
   }
+
+  // DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 shall not be claused with anything
+  // and surronded by S_WAIT_ALU(0xFFE3).
+  bool hasDsAtomicAsyncBarrierArriveB64PipeBug() const {
+    return getGeneration() == GFX12;
+  }
 };
 
 class GCNUserSGPRUsageInfo {
diff --git a/llvm/test/CodeGen/AMDGPU/hazards-gfx1250.mir b/llvm/test/CodeGen/AMDGPU/hazards-gfx1250.mir
new file mode 100644
index 0000000000000..f1dbabf1e1a83
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/hazards-gfx1250.mir
@@ -0,0 +1,17 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+# RUN: llc -mtriple=amdgcn -mcpu=gfx1250 -run-pass=post-RA-hazard-rec %s -o - | FileCheck -check-prefix=GCN %s
+
+---
+name: ds_atomic_async_barrier_arrive_b64
+tracksRegLiveness: true
+body: |
+  bb.0:
+    liveins: $vgpr0, $vgpr1
+    ; GCN-LABEL: name: ds_atomic_async_barrier_arrive_b64
+    ; GCN: liveins: $vgpr0, $vgpr1
+    ; GCN-NEXT: {{  $}}
+    ; GCN-NEXT: S_WAITCNT_DEPCTR 65507
+    ; GCN-NEXT: DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 $vgpr1, 0, 0, implicit-def $asynccnt, implicit $asynccnt, implicit $exec
+    ; GCN-NEXT: S_WAITCNT_DEPCTR 65507
+    DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 $vgpr1, 0, 0, implicit-def $asynccnt, implicit $asynccnt, implicit $exec
+...

@rampitec rampitec merged commit 1f25c48 into main Aug 15, 2025
13 checks passed
@rampitec rampitec deleted the users/rampitec/08-15-_amdgpu_mitigate_ds_atomic_async_barrier_arrive_b64_bug branch August 15, 2025 21:17
@llvm-ci
Copy link
Collaborator

llvm-ci commented Aug 15, 2025

LLVM Buildbot has detected a new failure on builder clangd-ubuntu-tsan running on clangd-ubuntu-clang while building llvm at step 5 "build-clangd-clangd-index-server-clangd-indexer".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/134/builds/24508

Here is the relevant piece of the build log for the reference
Step 5 (build-clangd-clangd-index-server-clangd-indexer) failure: build (failure)
0.008 [3338/18/1] Creating /vol/worker/clangd-ubuntu-clang/clangd-ubuntu-tsan/build/NATIVE...
0.009 [3337/18/2] Building CXX object lib/Demangle/CMakeFiles/LLVMDemangle.dir/DLangDemangle.cpp.o
FAILED: lib/Demangle/CMakeFiles/LLVMDemangle.dir/DLangDemangle.cpp.o 
ccache /usr/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/vol/worker/clangd-ubuntu-clang/clangd-ubuntu-tsan/build/lib/Demangle -I/vol/worker/clangd-ubuntu-clang/clangd-ubuntu-tsan/llvm-project/llvm/lib/Demangle -I/vol/worker/clangd-ubuntu-clang/clangd-ubuntu-tsan/build/include -I/vol/worker/clangd-ubuntu-clang/clangd-ubuntu-tsan/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fno-omit-frame-pointer -gline-tables-only -fsanitize=thread -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -std=c++17 -MD -MT lib/Demangle/CMakeFiles/LLVMDemangle.dir/DLangDemangle.cpp.o -MF lib/Demangle/CMakeFiles/LLVMDemangle.dir/DLangDemangle.cpp.o.d -o lib/Demangle/CMakeFiles/LLVMDemangle.dir/DLangDemangle.cpp.o -c /vol/worker/clangd-ubuntu-clang/clangd-ubuntu-tsan/llvm-project/llvm/lib/Demangle/DLangDemangle.cpp
ccache: error: /vol/ccache/ccache.conf: No such file or directory
0.070 [3337/17/3] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/AutoConvert.cpp.o
1.054 [3337/16/4] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/AArch64BuildAttributes.cpp.o
1.237 [3337/15/5] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/ARMBuildAttributes.cpp.o
1.275 [3337/14/6] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/ARMWinEH.cpp.o
1.345 [3337/13/7] Building CXX object lib/Demangle/CMakeFiles/LLVMDemangle.dir/RustDemangle.cpp.o
1.345 [3337/12/8] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/FormattedStream.cpp.o
1.348 [3337/11/9] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Allocator.cpp.o
1.434 [3337/10/10] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/APSInt.cpp.o
1.542 [3337/9/11] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/AArch64AttributeParser.cpp.o
1.605 [3337/8/12] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/FloatingPointMode.cpp.o
1.654 [3337/7/13] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/FoldingSet.cpp.o
2.076 [3337/6/14] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/APFixedPoint.cpp.o
2.401 [3337/5/15] Building CXX object lib/Demangle/CMakeFiles/LLVMDemangle.dir/MicrosoftDemangle.cpp.o
2.637 [3337/4/16] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/ARMAttributeParser.cpp.o
4.181 [3337/3/17] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/APInt.cpp.o
4.349 [3337/2/18] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/APFloat.cpp.o
14.660 [3337/1/19] Configuring NATIVE LLVM...
-- The C compiler identification is Clang 18.1.0
-- The CXX compiler identification is Clang 18.1.0
-- The ASM compiler identification is Clang with GNU-like command-line
-- Found assembler: /usr/bin/clang
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/clang - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/clang++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- bolt project is disabled
-- clang project is enabled
-- clang-tools-extra project is enabled
-- compiler-rt project is disabled
-- cross-project-tests project is disabled
-- libclc project is disabled
-- lld project is disabled
-- lldb project is disabled
-- mlir project is disabled
-- openmp project is disabled
-- polly project is disabled
-- flang project is disabled
-- libc project is disabled
-- Found Python3: /usr/bin/python3.8 (found suitable version "3.8.0", minimum required is "3.0") found components: Interpreter 

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants