[RISCV][LLVM] Enable atomics for 'Zalrsc' #163672

Conversation
@llvm/pr-subscribers-llvm-selectiondag @llvm/pr-subscribers-clang

Author: None (slachowsky)

Changes

The 'A' atomics extension is composed of two subextensions, 'Zaamo' which has atomic memory operation instructions, and 'Zalrsc' which has load-reserved/store-conditional instructions.

For machines where 'Zalrsc' is present, but 'Zaamo' is not, implement and enable atomic memory operations through pseudo expansion. Updates the predication and lowering control to be more precise about which 'Zaamo'/'Zalrsc' feature was truly requisite.

There will be no functional change to subtargets supporting 'A', while allowing 'Zalrsc'-only subtargets to utilize atomics at an increased code footprint.

Patch is 1.02 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/163672.diff

11 Files Affected:
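For illustration only (not part of the patch): on a 'Zalrsc'-only target, an operation that 'Zaamo' would handle with a single AMO instruction (e.g. amoadd.w) is instead expanded into an LR/SC retry loop. A minimal sketch of the 32-bit fetch-and-add expansion, with illustrative register assignments:

# Sketch: atomicrmw add expansion on a 'Zalrsc'-only target.
# a0 = address, a1 = increment, a2 = returned old value, a3 = scratch.
.LBB0_1:
        lr.w    a2, (a0)        # load-reserved the current value
        add     a3, a2, a1      # compute the new value into a scratch register
        sc.w    a3, a3, (a0)    # store-conditional; a3 is nonzero on failure
        bnez    a3, .LBB0_1     # retry if the reservation was lost

Note the loop body keeps only base 'I' instructions between the lr.w and sc.w, matching the forward-progress constraints in the ISA manual (quoted in the review below).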
diff --git a/clang/lib/Basic/Targets/RISCV.cpp b/clang/lib/Basic/Targets/RISCV.cpp
index 04da4e637af51..edfd92fc20bde 100644
--- a/clang/lib/Basic/Targets/RISCV.cpp
+++ b/clang/lib/Basic/Targets/RISCV.cpp
@@ -192,8 +192,10 @@ void RISCVTargetInfo::getTargetDefines(const LangOptions &Opts,
Builder.defineMacro("__riscv_muldiv");
}
- if (ISAInfo->hasExtension("a")) {
+ if (ISAInfo->hasExtension("a"))
Builder.defineMacro("__riscv_atomic");
+
+ if (ISAInfo->hasExtension("a") || ISAInfo->hasExtension("zalrsc")) {
Builder.defineMacro("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_1");
Builder.defineMacro("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_2");
Builder.defineMacro("__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4");
diff --git a/clang/lib/Basic/Targets/RISCV.h b/clang/lib/Basic/Targets/RISCV.h
index d8b0e64c90dd6..3e1442db131a7 100644
--- a/clang/lib/Basic/Targets/RISCV.h
+++ b/clang/lib/Basic/Targets/RISCV.h
@@ -195,7 +195,7 @@ class LLVM_LIBRARY_VISIBILITY RISCV32TargetInfo : public RISCVTargetInfo {
void setMaxAtomicWidth() override {
MaxAtomicPromoteWidth = 128;
- if (ISAInfo->hasExtension("a"))
+ if (ISAInfo->hasExtension("a") || ISAInfo->hasExtension("zalrsc"))
MaxAtomicInlineWidth = 32;
}
};
@@ -225,7 +225,7 @@ class LLVM_LIBRARY_VISIBILITY RISCV64TargetInfo : public RISCVTargetInfo {
void setMaxAtomicWidth() override {
MaxAtomicPromoteWidth = 128;
- if (ISAInfo->hasExtension("a"))
+ if (ISAInfo->hasExtension("a") || ISAInfo->hasExtension("zalrsc"))
MaxAtomicInlineWidth = 64;
}
};
diff --git a/llvm/lib/Target/RISCV/RISCVExpandAtomicPseudoInsts.cpp b/llvm/lib/Target/RISCV/RISCVExpandAtomicPseudoInsts.cpp
index 5dd4bf415a23c..8f033b6b14038 100644
--- a/llvm/lib/Target/RISCV/RISCVExpandAtomicPseudoInsts.cpp
+++ b/llvm/lib/Target/RISCV/RISCVExpandAtomicPseudoInsts.cpp
@@ -109,12 +109,72 @@ bool RISCVExpandAtomicPseudo::expandMI(MachineBasicBlock &MBB,
// expanded instructions for each pseudo is correct in the Size field of the
// tablegen definition for the pseudo.
switch (MBBI->getOpcode()) {
+ case RISCV::PseudoAtomicSwap32:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Xchg, false, 32,
+ NextMBBI);
+ case RISCV::PseudoAtomicSwap64:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Xchg, false, 64,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadAdd32:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Add, false, 32,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadAdd64:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Add, false, 64,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadSub32:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Sub, false, 32,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadSub64:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Sub, false, 64,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadAnd32:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::And, false, 32,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadAnd64:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::And, false, 64,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadOr32:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Or, false, 32,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadOr64:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Or, false, 64,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadXor32:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Xor, false, 32,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadXor64:
+ return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Xor, false, 64,
+ NextMBBI);
case RISCV::PseudoAtomicLoadNand32:
return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Nand, false, 32,
NextMBBI);
case RISCV::PseudoAtomicLoadNand64:
return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Nand, false, 64,
NextMBBI);
+ case RISCV::PseudoAtomicLoadMin32:
+ return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::Min, false, 32,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadMin64:
+ return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::Min, false, 64,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadMax32:
+ return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::Max, false, 32,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadMax64:
+ return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::Max, false, 64,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadUMin32:
+ return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::UMin, false, 32,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadUMin64:
+ return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::UMin, false, 64,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadUMax32:
+ return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::UMax, false, 32,
+ NextMBBI);
+ case RISCV::PseudoAtomicLoadUMax64:
+ return expandAtomicMinMaxOp(MBB, MBBI, AtomicRMWInst::UMax, false, 64,
+ NextMBBI);
case RISCV::PseudoMaskedAtomicSwap32:
return expandAtomicBinOp(MBB, MBBI, AtomicRMWInst::Xchg, true, 32,
NextMBBI);
@@ -264,6 +324,9 @@ static void doAtomicBinOpExpansion(const RISCVInstrInfo *TII, MachineInstr &MI,
Register ScratchReg = MI.getOperand(1).getReg();
Register AddrReg = MI.getOperand(2).getReg();
Register IncrReg = MI.getOperand(3).getReg();
+ bool IsUnsigned = BinOp == AtomicRMWInst::UMin ||
+ BinOp == AtomicRMWInst::UMax;
+ bool Zext = IsUnsigned && STI->is64Bit() && Width == 32;
AtomicOrdering Ordering =
static_cast<AtomicOrdering>(MI.getOperand(4).getImm());
@@ -277,6 +340,36 @@ static void doAtomicBinOpExpansion(const RISCVInstrInfo *TII, MachineInstr &MI,
switch (BinOp) {
default:
llvm_unreachable("Unexpected AtomicRMW BinOp");
+ case AtomicRMWInst::Xchg:
+ BuildMI(LoopMBB, DL, TII->get(RISCV::ADDI), ScratchReg)
+ .addReg(IncrReg)
+ .addImm(0);
+ break;
+ case AtomicRMWInst::Add:
+ BuildMI(LoopMBB, DL, TII->get(RISCV::ADD), ScratchReg)
+ .addReg(DestReg)
+ .addReg(IncrReg);
+ break;
+ case AtomicRMWInst::Sub:
+ BuildMI(LoopMBB, DL, TII->get(RISCV::SUB), ScratchReg)
+ .addReg(DestReg)
+ .addReg(IncrReg);
+ break;
+ case AtomicRMWInst::And:
+ BuildMI(LoopMBB, DL, TII->get(RISCV::AND), ScratchReg)
+ .addReg(DestReg)
+ .addReg(IncrReg);
+ break;
+ case AtomicRMWInst::Or:
+ BuildMI(LoopMBB, DL, TII->get(RISCV::OR), ScratchReg)
+ .addReg(DestReg)
+ .addReg(IncrReg);
+ break;
+ case AtomicRMWInst::Xor:
+ BuildMI(LoopMBB, DL, TII->get(RISCV::XOR), ScratchReg)
+ .addReg(DestReg)
+ .addReg(IncrReg);
+ break;
case AtomicRMWInst::Nand:
BuildMI(LoopMBB, DL, TII->get(RISCV::AND), ScratchReg)
.addReg(DestReg)
@@ -285,6 +378,34 @@ static void doAtomicBinOpExpansion(const RISCVInstrInfo *TII, MachineInstr &MI,
.addReg(ScratchReg)
.addImm(-1);
break;
+ case AtomicRMWInst::Min:
+ BuildMI(LoopMBB, DL, TII->get(RISCV::MIN), ScratchReg)
+ .addReg(DestReg)
+ .addReg(IncrReg);
+ break;
+ case AtomicRMWInst::Max:
+ BuildMI(LoopMBB, DL, TII->get(RISCV::MAX), ScratchReg)
+ .addReg(DestReg)
+ .addReg(IncrReg);
+ break;
+ case AtomicRMWInst::UMin:
+ if (Zext)
+ BuildMI(LoopMBB, DL, TII->get(RISCV::ADD_UW), ScratchReg)
+ .addReg(DestReg)
+ .addReg(RISCV::X0);
+ BuildMI(LoopMBB, DL, TII->get(RISCV::MINU), ScratchReg)
+ .addReg(Zext ? ScratchReg : DestReg)
+ .addReg(IncrReg);
+ break;
+ case AtomicRMWInst::UMax:
+ if (Zext)
+ BuildMI(LoopMBB, DL, TII->get(RISCV::ADD_UW), ScratchReg)
+ .addReg(DestReg)
+ .addReg(RISCV::X0);
+ BuildMI(LoopMBB, DL, TII->get(RISCV::MAXU), ScratchReg)
+ .addReg(Zext ? ScratchReg : DestReg)
+ .addReg(IncrReg);
+ break;
}
BuildMI(LoopMBB, DL, TII->get(getSCForRMW(Ordering, Width, STI)), ScratchReg)
.addReg(ScratchReg)
@@ -433,38 +554,105 @@ static void insertSext(const RISCVInstrInfo *TII, DebugLoc DL,
.addReg(ShamtReg);
}
-bool RISCVExpandAtomicPseudo::expandAtomicMinMaxOp(
- MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
- AtomicRMWInst::BinOp BinOp, bool IsMasked, int Width,
- MachineBasicBlock::iterator &NextMBBI) {
- assert(IsMasked == true &&
- "Should only need to expand masked atomic max/min");
- assert(Width == 32 && "Should never need to expand masked 64-bit operations");
+static void insertZext(const RISCVInstrInfo *TII, DebugLoc DL,
+ MachineBasicBlock *MBB, Register ValReg,
+ Register SrcReg, int64_t Shamt) {
+ BuildMI(MBB, DL, TII->get(RISCV::SLLI), ValReg)
+ .addReg(SrcReg)
+ .addImm(Shamt);
+ BuildMI(MBB, DL, TII->get(RISCV::SRLI), ValReg)
+ .addReg(ValReg)
+ .addImm(Shamt);
+}
- MachineInstr &MI = *MBBI;
- DebugLoc DL = MI.getDebugLoc();
- MachineFunction *MF = MBB.getParent();
- auto LoopHeadMBB = MF->CreateMachineBasicBlock(MBB.getBasicBlock());
- auto LoopIfBodyMBB = MF->CreateMachineBasicBlock(MBB.getBasicBlock());
- auto LoopTailMBB = MF->CreateMachineBasicBlock(MBB.getBasicBlock());
- auto DoneMBB = MF->CreateMachineBasicBlock(MBB.getBasicBlock());
+static void doAtomicMinMaxOpExpansion(const RISCVInstrInfo *TII, MachineInstr &MI,
+ DebugLoc DL, MachineBasicBlock *ThisMBB,
+ MachineBasicBlock *LoopHeadMBB,
+ MachineBasicBlock *LoopIfBodyMBB,
+ MachineBasicBlock *LoopTailMBB,
+ MachineBasicBlock *DoneMBB,
+ AtomicRMWInst::BinOp BinOp, int Width,
+ const RISCVSubtarget *STI) {
+ Register DestReg = MI.getOperand(0).getReg();
+ Register ScratchReg = MI.getOperand(1).getReg();
+ Register AddrReg = MI.getOperand(2).getReg();
+ Register IncrReg = MI.getOperand(3).getReg();
+ bool IsUnsigned = BinOp == AtomicRMWInst::UMin ||
+ BinOp == AtomicRMWInst::UMax;
+ bool Zext = IsUnsigned && STI->is64Bit() && Width == 32;
+ AtomicOrdering Ordering =
+ static_cast<AtomicOrdering>(MI.getOperand(4).getImm());
- // Insert new MBBs.
- MF->insert(++MBB.getIterator(), LoopHeadMBB);
- MF->insert(++LoopHeadMBB->getIterator(), LoopIfBodyMBB);
- MF->insert(++LoopIfBodyMBB->getIterator(), LoopTailMBB);
- MF->insert(++LoopTailMBB->getIterator(), DoneMBB);
+ // .loophead:
+ // lr.[w|d] dest, (addr)
+ // mv scratch, dest
+ // ifnochangeneeded scratch, incr, .looptail
+ BuildMI(LoopHeadMBB, DL, TII->get(getLRForRMW(Ordering, Width, STI)), DestReg)
+ .addReg(AddrReg);
+ if (Zext)
+ insertZext(TII, DL, LoopHeadMBB, ScratchReg, DestReg, 32);
+ else
+ BuildMI(LoopHeadMBB, DL, TII->get(RISCV::ADDI), ScratchReg)
+ .addReg(DestReg)
+ .addImm(0);
+ switch (BinOp) {
+ default:
+ llvm_unreachable("Unexpected AtomicRMW BinOp");
+ case AtomicRMWInst::Max: {
+ BuildMI(LoopHeadMBB, DL, TII->get(RISCV::BGE))
+ .addReg(ScratchReg)
+ .addReg(IncrReg)
+ .addMBB(LoopTailMBB);
+ break;
+ }
+ case AtomicRMWInst::Min: {
+ BuildMI(LoopHeadMBB, DL, TII->get(RISCV::BGE))
+ .addReg(IncrReg)
+ .addReg(ScratchReg)
+ .addMBB(LoopTailMBB);
+ break;
+ }
+ case AtomicRMWInst::UMax:
+ BuildMI(LoopHeadMBB, DL, TII->get(RISCV::BGEU))
+ .addReg(ScratchReg)
+ .addReg(IncrReg)
+ .addMBB(LoopTailMBB);
+ break;
+ case AtomicRMWInst::UMin:
+ BuildMI(LoopHeadMBB, DL, TII->get(RISCV::BGEU))
+ .addReg(IncrReg)
+ .addReg(ScratchReg)
+ .addMBB(LoopTailMBB);
+ break;
+ }
- // Set up successors and transfer remaining instructions to DoneMBB.
- LoopHeadMBB->addSuccessor(LoopIfBodyMBB);
- LoopHeadMBB->addSuccessor(LoopTailMBB);
- LoopIfBodyMBB->addSuccessor(LoopTailMBB);
- LoopTailMBB->addSuccessor(LoopHeadMBB);
- LoopTailMBB->addSuccessor(DoneMBB);
- DoneMBB->splice(DoneMBB->end(), &MBB, MI, MBB.end());
- DoneMBB->transferSuccessors(&MBB);
- MBB.addSuccessor(LoopHeadMBB);
+ // .loopifbody:
+ // mv scratch, incr
+ BuildMI(LoopIfBodyMBB, DL, TII->get(RISCV::ADDI), ScratchReg)
+ .addReg(IncrReg)
+ .addImm(0);
+
+ // .looptail:
+ // sc.[w|d] scratch, scratch, (addr)
+ // bnez scratch, loop
+ BuildMI(LoopTailMBB, DL, TII->get(getSCForRMW(Ordering, Width, STI)), ScratchReg)
+ .addReg(ScratchReg)
+ .addReg(AddrReg);
+ BuildMI(LoopTailMBB, DL, TII->get(RISCV::BNE))
+ .addReg(ScratchReg)
+ .addReg(RISCV::X0)
+ .addMBB(LoopHeadMBB);
+}
+static void doMaskedAtomicMinMaxOpExpansion(const RISCVInstrInfo *TII, MachineInstr &MI,
+ DebugLoc DL, MachineBasicBlock *ThisMBB,
+ MachineBasicBlock *LoopHeadMBB,
+ MachineBasicBlock *LoopIfBodyMBB,
+ MachineBasicBlock *LoopTailMBB,
+ MachineBasicBlock *DoneMBB,
+ AtomicRMWInst::BinOp BinOp, int Width,
+ const RISCVSubtarget *STI) {
+ assert(Width == 32 && "Should never need to expand masked 64-bit operations");
Register DestReg = MI.getOperand(0).getReg();
Register Scratch1Reg = MI.getOperand(1).getReg();
Register Scratch2Reg = MI.getOperand(2).getReg();
@@ -541,6 +729,48 @@ bool RISCVExpandAtomicPseudo::expandAtomicMinMaxOp(
.addReg(Scratch1Reg)
.addReg(RISCV::X0)
.addMBB(LoopHeadMBB);
+}
+
+bool RISCVExpandAtomicPseudo::expandAtomicMinMaxOp(
+ MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+ AtomicRMWInst::BinOp BinOp, bool IsMasked, int Width,
+ MachineBasicBlock::iterator &NextMBBI) {
+
+ // Prefer expansion using MIN/MAX/MINU/MAXU instructions if available
+ if (STI->hasStdExtZbb() && !IsMasked)
+ return expandAtomicBinOp(MBB, MBBI, BinOp, IsMasked, Width, NextMBBI);
+
+ MachineInstr &MI = *MBBI;
+ DebugLoc DL = MI.getDebugLoc();
+ MachineFunction *MF = MBB.getParent();
+ auto LoopHeadMBB = MF->CreateMachineBasicBlock(MBB.getBasicBlock());
+ auto LoopIfBodyMBB = MF->CreateMachineBasicBlock(MBB.getBasicBlock());
+ auto LoopTailMBB = MF->CreateMachineBasicBlock(MBB.getBasicBlock());
+ auto DoneMBB = MF->CreateMachineBasicBlock(MBB.getBasicBlock());
+
+ // Insert new MBBs.
+ MF->insert(++MBB.getIterator(), LoopHeadMBB);
+ MF->insert(++LoopHeadMBB->getIterator(), LoopIfBodyMBB);
+ MF->insert(++LoopIfBodyMBB->getIterator(), LoopTailMBB);
+ MF->insert(++LoopTailMBB->getIterator(), DoneMBB);
+
+ // Set up successors and transfer remaining instructions to DoneMBB.
+ LoopHeadMBB->addSuccessor(LoopIfBodyMBB);
+ LoopHeadMBB->addSuccessor(LoopTailMBB);
+ LoopIfBodyMBB->addSuccessor(LoopTailMBB);
+ LoopTailMBB->addSuccessor(LoopHeadMBB);
+ LoopTailMBB->addSuccessor(DoneMBB);
+ DoneMBB->splice(DoneMBB->end(), &MBB, MI, MBB.end());
+ DoneMBB->transferSuccessors(&MBB);
+ MBB.addSuccessor(LoopHeadMBB);
+
+ if (!IsMasked)
+ doAtomicMinMaxOpExpansion(TII, MI, DL, &MBB, LoopHeadMBB, LoopIfBodyMBB,
+ LoopTailMBB, DoneMBB, BinOp, Width, STI);
+ else
+ doMaskedAtomicMinMaxOpExpansion(TII, MI, DL, &MBB, LoopHeadMBB,
+ LoopIfBodyMBB, LoopTailMBB, DoneMBB, BinOp,
+ Width, STI);
NextMBBI = MBB.end();
MI.eraseFromParent();
diff --git a/llvm/lib/Target/RISCV/RISCVFeatures.td b/llvm/lib/Target/RISCV/RISCVFeatures.td
index 5ceb477069188..f5c12998a8baf 100644
--- a/llvm/lib/Target/RISCV/RISCVFeatures.td
+++ b/llvm/lib/Target/RISCV/RISCVFeatures.td
@@ -218,6 +218,7 @@ def HasStdExtZaamo
: Predicate<"Subtarget->hasStdExtZaamo()">,
AssemblerPredicate<(any_of FeatureStdExtZaamo),
"'Zaamo' (Atomic Memory Operations)">;
+def NoStdExtZaamo : Predicate<"!Subtarget->hasStdExtZaamo()">;
def FeatureStdExtZalrsc
: RISCVExtension<1, 0, "Load-Reserved/Store-Conditional">;
@@ -1861,7 +1862,7 @@ def FeatureForcedAtomics : SubtargetFeature<
"forced-atomics", "HasForcedAtomics", "true",
"Assume that lock-free native-width atomics are available">;
def HasAtomicLdSt
- : Predicate<"Subtarget->hasStdExtA() || Subtarget->hasForcedAtomics()">;
+ : Predicate<"Subtarget->hasStdExtZalrsc() || Subtarget->hasForcedAtomics()">;
def FeatureTaggedGlobals : SubtargetFeature<"tagged-globals",
"AllowTaggedGlobals",
diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 7123a2d706787..1e1b4b785059f 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -688,7 +688,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
else if (Subtarget.hasStdExtZicbop())
setOperationAction(ISD::PREFETCH, MVT::Other, Legal);
- if (Subtarget.hasStdExtA()) {
+ if (Subtarget.hasStdExtZalrsc()) {
setMaxAtomicSizeInBitsSupported(Subtarget.getXLen());
if (Subtarget.hasStdExtZabha() && Subtarget.hasStdExtZacas())
setMinCmpXchgSizeInBits(8);
@@ -1558,7 +1558,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
}
}
- if (Subtarget.hasStdExtA())
+ if (Subtarget.hasStdExtZaamo())
setOperationAction(ISD::ATOMIC_LOAD_SUB, XLenVT, Expand);
if (Subtarget.hasForcedAtomics()) {
@@ -21876,7 +21876,7 @@ unsigned RISCVTargetLowering::ComputeNumSignBitsForTargetNode(
// result is then sign extended to XLEN. With +A, the minimum width is
// 32 for both 64 and 32.
assert(getMinCmpXchgSizeInBits() == 32);
- assert(Subtarget.hasStdExtA());
+ assert(Subtarget.hasStdExtZalrsc());
return Op.getValueSizeInBits() - 31;
}
break;
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfoA.td b/llvm/lib/Target/RISCV/RISCVInstrInfoA.td
index 571d72f904ca7..8b88b2ab89941 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfoA.td
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfoA.td
@@ -308,7 +308,67 @@ class PseudoMaskedAMOMinMaxPat<Intrinsic intrin, Pseudo AMOInst>
(AMOInst GPR:$addr, GPR:$incr, GPR:$mask, GPR:$shiftamt,
timm:$ordering)>;
-let Predicates = [HasStdExtA] in {
+let Predicates = [HasStdExtZalrsc, NoStdExtZaamo] in {
+
+let Size = 16 in {
+def PseudoAtomicSwap32 : PseudoAMO;
+def PseudoAtomicLoadAdd32 : PseudoAMO;
+def PseudoAtomicLoadSub32 : PseudoAMO;
+def PseudoAtomicLoadAnd32 : PseudoAMO;
+def PseudoAtomicLoadOr32 : PseudoAMO;
+def PseudoAtomicLoadXor32 : PseudoAMO;
+} // Size = 16
+let Size = 24 in {
+def PseudoAtomicLoadMax32 : PseudoAMO;
+def PseudoAtomicLoadMin32 : PseudoAMO;
+} // Size = 24
+let Size = 28 in {
+def PseudoAtomicLoadUMax32 : PseudoAMO;
+def PseudoAtomicLoadUMin32 : PseudoAMO;
+} // Size = 28
+
+defm : PseudoAMOPat<"atomic_swap_i32", PseudoAtomicSwap32>;
+defm : PseudoAMOPat<"atomic_load_add_i32", PseudoAtomicLoadAdd32>;
+defm : PseudoAMOPat<"atomic_load_sub_i32", PseudoAtomicLoadSub32>;
+defm : PseudoAMOPat<"atomic_load_and_i32", PseudoAtomicLoadAnd32>;
+defm : PseudoAMOPat<"atomic_load_or_i32", PseudoAtomicLoadOr32>;
+defm : PseudoAMOPat<"atomic_load_xor_i32", PseudoAtomicLoadXor32>;
+defm : PseudoAMOPat<"atomic_load_max_i32", PseudoAtomicLoadMax32>;
+defm : PseudoAMOPat<"atomic_load_min_i32", PseudoAtomicLoadMin32>;
+defm : PseudoAMOPat<"atomic_load_umax_i32", PseudoAtomicLoadUMax32>;
+defm : PseudoAMOPat<"atomic_load_umin_i32", PseudoAtomicLoadUMin32>;
+} // Predicates = [HasStdExtZalrsc, NoStdExtZaamo]
+
+let Predicates = [HasStdExtZalrsc, NoStdExtZaamo, IsRV64] in {
+
+let Size = 16 in {
+def PseudoAtomicSwap64 : PseudoAMO;
...
[truncated]
      break;
    case AtomicRMWInst::UMin:
      if (Zext)
        BuildMI(LoopMBB, DL, TII->get(RISCV::ADD_UW), ScratchReg)
ADD_UW belongs to Zba, not Zbb, which I don't think you checked for.

add.uw, min(u), and max(u) aren't allowed to be in an LR/SC loop by my reading of the ISA manual:

"The dynamic code executed between the LR and SC instructions can only contain instructions from the base ''I'' instruction set, excluding loads, stores, backward jumps, taken backward branches, JALR, FENCE, and SYSTEM instructions. If the ''C'' extension is supported, then compressed forms of the aforementioned ''I'' instructions are also permitted."
I missed that. We can drop the middle patch which implements the 'Zbb' enhancement, unless there is interest in keeping this path for machines which exceed the specification.
Let's drop it from this PR. If there is interest we can make another PR and add some way of enabling it.
Done.
Force-pushed from 7deaf9f to 1c06bd8.
; RV64I-ZALRSC-NEXT: .LBB165_1: # =>This Inner Loop Header: Depth=1
; RV64I-ZALRSC-NEXT: lr.w a2, (a0)
; RV64I-ZALRSC-NEXT: mv a3, a2
; RV64I-ZALRSC-NEXT: bge a3, a1, .LBB165_3
The upper 32 bits of a1 are garbage.
; RV64I-ZALRSC-NEXT: lr.w a2, (a0)
; RV64I-ZALRSC-NEXT: slli a3, a2, 32
; RV64I-ZALRSC-NEXT: srli a3, a3, 32
; RV64I-ZALRSC-NEXT: bgeu a3, a1, .LBB175_3
The upper 32 bits of a1 are garbage.
This is unfortunate.

The type legalizer handles the 32-bit arguments properly for the cmpxchg case because TLI.getExtendForAtomicCmpSwapArg() controls the argument promotion in DAGTypeLegalizer::PromoteIntRes_AtomicCmpSwap(), and the RISCV TLI overrides this virtual to return SIGN_EXTEND for correct behavior.

No such control was plumbed for DAGTypeLegalizer::PromoteIntRes_Atomic1(), used for the atomicrmw <op> cases, so we always get any-extended arguments.

We could add another virtual TLI function to control this more explicitly for anyone else that might want it. Or some early RISCV DAGCombine to adjust atomicrmw {min,max,umin,umax} arguments and avoid the default legalizer behavior. Or solve this in the pseudo expansion, but that is not ideal because I believe it would require a second scratch register.

Thoughts?
Reference to the same issue that plagued cmpxchg many years ago: 616289e
Resolved in additional commit: f8ac954
Let me know if you dislike this approach, or want to squash this into the first commit.
Is there an abridged history of the RISC-V "A" spec anywhere? I feel reasonably out-of-touch with how it has developed (and been split into sub-extensions).
clang/lib/Basic/Targets/RISCV.h (outdated)
     MaxAtomicPromoteWidth = 128;
-    if (ISAInfo->hasExtension("a"))
+    if (ISAInfo->hasExtension("a") || ISAInfo->hasExtension("zalrsc"))
Shouldn't 'a' imply 'zalrsc'?
It does, and we could drop the redundant "a" clause. I could be in the wrong here coding-style wise, but my thinking was that "a" for atomic is well known, and "zalrsc" is not, so there is some benefit to readability / grep-ability by keeping both.
I'd prefer the frontend to adhere to the actual ISA split in the specs, and remove the "a" check.
How about just adding two lines of comment explaining why zalrsc and not "a" is checked here? Might be a bit redundant (the commit message will explain the change), but very effective for readability and grep-ability.
done
I added these two extensions at the MC level but haven't gotten a chance to support them in CodeGen. :-)
Force-pushed from 289c3e4 to f8ac954.
Force-pushed from b21459d to 7283821.
Force-pushed from 53d6db2 to 33a5aa1.
clang/lib/Basic/Targets/RISCV.cpp (outdated)
-    if (ISAInfo->hasExtension("a")) {
+    // The "a" extension is composed of "zalrsc" and "zaamo"
+    if (ISAInfo->hasExtension("zalrsc") && ISAInfo->hasExtension("zaamo"))
I think this should stay "a". I put up #163890 to make Zalrsc+Zaamo imply A.
Done.
rebased on top of #163890
Force-pushed from 33a5aa1 to d867abd.
The 'A' atomics extension is composed of two subextensions, 'Zaamo' which has atomic memory operation instructions, and 'Zalrsc' which has load-reserved/store-conditional instructions. For machines where 'Zalrsc' is present, but 'Zaamo' is not, implement and enable atomic memory operations through pseudo expansion. Updates the predication and lowering control to be more precise about which 'Zaamo'/'Zalrsc' feature was truly requisite. There will be no functional change to subtargets supporting 'A', while allowing 'Zalrsc'-only subtargets to utilize atomics at an increased code footprint.
Similar to the argument sign-extend control provided by getExtendForAtomicCmpSwapArg for `cmpxchg`, this change adds getExtendForAtomicRMWArg for `atomicrmw <op>`. This mechanism is used to correct the RISCV code generation when using atomic pseudo expansions that perform sub-word comparisons inside an LR/SC block and expect a correctly sign-extended input argument. A shorter instruction sequence is possible for umin/umax by leveraging the fact that two sign-extended arguments still order correctly during an unsigned comparison.
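For a rough idea of the effect (a sketch, not the exact emitted sequence; register assignments are illustrative): with the 32-bit argument sign-extended on RV64, the min/max loop can compare the lr.w result (itself sign-extended) directly, with no zero-extension inside the loop:

# Sketch: RV64 expansion of a 32-bit atomicrmw min after this change.
# a0 = address, a1 = incoming value (sign-extended by the legalizer).
.LBB0_1:
        lr.w    a2, (a0)        # lr.w sign-extends the loaded 32-bit value
        mv      a3, a2          # scratch starts as the old value
        bge     a1, a3, .LBB0_3 # if incoming >= old, the old value is the min
        mv      a3, a1          # otherwise the incoming value is the new min
.LBB0_3:
        sc.w    a3, a3, (a0)    # try to commit; nonzero a3 means failure
        bnez    a3, .LBB0_1     # retry if the reservation was lost

For umin/umax the same shape applies with bgeu, since two sign-extended values still order correctly under an unsigned comparison.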
'Zalrsc'-only subtargets are atomics-capable; ensure the RISCV target exposes this correctly.
Force-pushed from f0da61d to a676c2d.