Skip to content

Conversation

@rcorcs
Copy link
Contributor

@rcorcs rcorcs commented Mar 19, 2025

This patch fixes the following two issues with the createCmpJE for AArch64:

  1. Avoids overwriting the value of the input register RegNo by use XZR as the destination register.
    subs xzr, RegNo, #Imm
    which is equivalent to a simple
    cmp RegNo, #Imm
  2. The immediate operand to the Bcc instruction must be EQ instead of #Imm.

This patch also adds a new function for createCmpJNE and unit tests for the both createCmpJE and createCmpJNE for X86 and AArch64.

@llvmbot
Copy link
Member

llvmbot commented Mar 19, 2025

@llvm/pr-subscribers-bolt

Author: Rodrigo Rocha (rcorcs)

Changes

This patch fixes the following two issues with the createCmpJE for AArch64:

  1. Avoids overwriting the value of the input register RegNo by use XZR as the destination register.
    subs xzr, RegNo, #Imm
    which is equivalent to a simple
    cmp RegNo, #Imm
  2. The immediate operand to the Bcc instruction must be EQ instead of #Imm.

This patch also ds a new function for createCmpJNE and unit tests for the both createCmpJE and createCmpJNE for X86 and AArch64.


Full diff: https://github.com/llvm/llvm-project/pull/131949.diff

4 Files Affected:

  • (modified) bolt/include/bolt/Core/MCPlusBuilder.h (+10)
  • (modified) bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp (+33-1)
  • (modified) bolt/lib/Target/X86/X86MCPlusBuilder.cpp (+14)
  • (modified) bolt/unittests/Core/MCPlusBuilder.cpp (+88)
diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h
index 1d45a314a17b6..a9d79112b641a 100644
--- a/bolt/include/bolt/Core/MCPlusBuilder.h
+++ b/bolt/include/bolt/Core/MCPlusBuilder.h
@@ -1714,6 +1714,16 @@ class MCPlusBuilder {
     return {};
   }
 
+  /// Create a sequence of instructions to compare contents of a register
+  /// \p RegNo to immediate \Imm and jump to \p Target if they are different.
+  virtual InstructionListType createCmpJNE(MCPhysReg RegNo, int64_t Imm,
+                                          const MCSymbol *Target,
+                                          MCContext *Ctx) const {
+    llvm_unreachable("not implemented");
+    return {};
+  }
+
+
   /// Creates inline memcpy instruction. If \p ReturnEnd is true, then return
   /// (dest + n) instead of dest.
   virtual InstructionListType createInlineMemcpy(bool ReturnEnd) const {
diff --git a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
index 685b2279e5afb..bb627ce2586b0 100644
--- a/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+++ b/bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
@@ -1321,17 +1321,49 @@ class AArch64MCPlusBuilder : public MCPlusBuilder {
 
   int getUncondBranchEncodingSize() const override { return 28; }
 
+  //This helper function creates the snippet of code
+  //that compares a register RegNo with an immedaite Imm,
+  //and jumps to Target if they are equal.
+  //cmp RegNo, #Imm
+  //b.eq Target
+  //where cmp is an for subs, which results in the code below:
+  //subs xzr, RegNo, #Imm
+  //b.eq Target.
   InstructionListType createCmpJE(MCPhysReg RegNo, int64_t Imm,
                                   const MCSymbol *Target,
                                   MCContext *Ctx) const override {
     InstructionListType Code;
     Code.emplace_back(MCInstBuilder(AArch64::SUBSXri)
-                          .addReg(RegNo)
+                          .addReg(AArch64::XZR)
                           .addReg(RegNo)
                           .addImm(Imm)
                           .addImm(0));
     Code.emplace_back(MCInstBuilder(AArch64::Bcc)
+                          .addImm(AArch64CC::EQ)
+                          .addExpr(MCSymbolRefExpr::create(
+                              Target, MCSymbolRefExpr::VK_None, *Ctx)));
+    return Code;
+  }
+
+  //This helper function creates the snippet of code
+  //that compares a register RegNo with an immedaite Imm,
+  //and jumps to Target if they are not equal.
+  //cmp RegNo, #Imm
+  //b.ne Target
+  //where cmp is an for subs, which results in the code below:
+  //subs xzr, RegNo, #Imm
+  //b.ne Target.
+  InstructionListType createCmpJNE(MCPhysReg RegNo, int64_t Imm,
+                                  const MCSymbol *Target,
+                                  MCContext *Ctx) const override {
+    InstructionListType Code;
+    Code.emplace_back(MCInstBuilder(AArch64::SUBSXri)
+                          .addReg(AArch64::XZR)
+                          .addReg(RegNo)
                           .addImm(Imm)
+                          .addImm(0));
+    Code.emplace_back(MCInstBuilder(AArch64::Bcc)
+                          .addImm(AArch64CC::NE)
                           .addExpr(MCSymbolRefExpr::create(
                               Target, MCSymbolRefExpr::VK_None, *Ctx)));
     return Code;
diff --git a/bolt/lib/Target/X86/X86MCPlusBuilder.cpp b/bolt/lib/Target/X86/X86MCPlusBuilder.cpp
index 465533ee71f2b..bc827e6fd6f1e 100644
--- a/bolt/lib/Target/X86/X86MCPlusBuilder.cpp
+++ b/bolt/lib/Target/X86/X86MCPlusBuilder.cpp
@@ -2436,6 +2436,20 @@ class X86MCPlusBuilder : public MCPlusBuilder {
     return Code;
   }
 
+  InstructionListType createCmpJNE(MCPhysReg RegNo, int64_t Imm,
+                                  const MCSymbol *Target,
+                                  MCContext *Ctx) const override {
+    InstructionListType Code;
+    Code.emplace_back(MCInstBuilder(X86::CMP64ri8)
+                          .addReg(RegNo)
+                          .addImm(Imm));
+    Code.emplace_back(MCInstBuilder(X86::JCC_1)
+                          .addExpr(MCSymbolRefExpr::create(
+                              Target, MCSymbolRefExpr::VK_None, *Ctx))
+                          .addImm(X86::COND_NE));
+    return Code;
+  }
+
   std::optional<Relocation>
   createRelocation(const MCFixup &Fixup,
                    const MCAsmBackend &MAB) const override {
diff --git a/bolt/unittests/Core/MCPlusBuilder.cpp b/bolt/unittests/Core/MCPlusBuilder.cpp
index d367eb07f7767..5cd9ee2a6ff76 100644
--- a/bolt/unittests/Core/MCPlusBuilder.cpp
+++ b/bolt/unittests/Core/MCPlusBuilder.cpp
@@ -107,6 +107,52 @@ TEST_P(MCPlusBuilderTester, AliasSmallerX0) {
   testRegAliases(Triple::aarch64, AArch64::X0, AliasesX0, AliasesX0Count, true);
 }
 
+TEST_P(MCPlusBuilderTester, AArch64_CmpJE) {
+  if (GetParam() != Triple::aarch64)
+    GTEST_SKIP();
+  BinaryFunction *BF = BC->createInjectedBinaryFunction("BF", true);
+  std::unique_ptr<BinaryBasicBlock> BB = BF->createBasicBlock();
+
+  InstructionListType Instrs = BC->MIB->createCmpJE(AArch64::X0, 2, BB->getLabel(), BC->Ctx.get());
+  BB->addInstructions(Instrs.begin(), Instrs.end());
+  BB->addSuccessor(BB.get());
+  
+  auto II = BB->begin();
+  ASSERT_EQ(II->getOpcode(), AArch64::SUBSXri);
+  ASSERT_EQ(II->getOperand(0).getReg(), AArch64::XZR);
+  ASSERT_EQ(II->getOperand(1).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(2).getImm(), 2);
+  ASSERT_EQ(II->getOperand(3).getImm(), 0);
+  II++;
+  ASSERT_EQ(II->getOpcode(), AArch64::Bcc);
+  ASSERT_EQ(II->getOperand(0).getImm(), AArch64CC::EQ);
+  const MCSymbol *Label = BC->MIB->getTargetSymbol(*II,1);
+  ASSERT_EQ(Label, BB->getLabel());
+}
+
+TEST_P(MCPlusBuilderTester, AArch64_CmpJNE) {
+  if (GetParam() != Triple::aarch64)
+    GTEST_SKIP();
+  BinaryFunction *BF = BC->createInjectedBinaryFunction("BF", true);
+  std::unique_ptr<BinaryBasicBlock> BB = BF->createBasicBlock();
+  
+  InstructionListType Instrs = BC->MIB->createCmpJNE(AArch64::X0, 2, BB->getLabel(), BC->Ctx.get());
+  BB->addInstructions(Instrs.begin(), Instrs.end());
+  BB->addSuccessor(BB.get());
+  
+  auto II = BB->begin();
+  ASSERT_EQ(II->getOpcode(), AArch64::SUBSXri);
+  ASSERT_EQ(II->getOperand(0).getReg(), AArch64::XZR);
+  ASSERT_EQ(II->getOperand(1).getReg(), AArch64::X0);
+  ASSERT_EQ(II->getOperand(2).getImm(), 2);
+  ASSERT_EQ(II->getOperand(3).getImm(), 0);
+  II++;
+  ASSERT_EQ(II->getOpcode(), AArch64::Bcc);
+  ASSERT_EQ(II->getOperand(0).getImm(), AArch64CC::NE);
+  const MCSymbol *Label = BC->MIB->getTargetSymbol(*II,1);
+  ASSERT_EQ(Label, BB->getLabel());
+}
+
 #endif // AARCH64_AVAILABLE
 
 #ifdef X86_AVAILABLE
@@ -143,6 +189,48 @@ TEST_P(MCPlusBuilderTester, ReplaceRegWithImm) {
   ASSERT_EQ(II->getOperand(1).getImm(), 1);
 }
 
+TEST_P(MCPlusBuilderTester, X86_CmpJE) {
+  if (GetParam() != Triple::x86_64)
+    GTEST_SKIP();
+  BinaryFunction *BF = BC->createInjectedBinaryFunction("BF", true);
+  std::unique_ptr<BinaryBasicBlock> BB = BF->createBasicBlock();
+
+  InstructionListType Instrs = BC->MIB->createCmpJE(X86::EAX, 2, BB->getLabel(), BC->Ctx.get());
+  BB->addInstructions(Instrs.begin(), Instrs.end());
+  BB->addSuccessor(BB.get());
+
+  auto II = BB->begin();
+  ASSERT_EQ(II->getOpcode(), X86::CMP64ri8);
+  ASSERT_EQ(II->getOperand(0).getReg(), X86::EAX);
+  ASSERT_EQ(II->getOperand(1).getImm(), 2);
+  II++;
+  ASSERT_EQ(II->getOpcode(), X86::JCC_1);
+  const MCSymbol *Label = BC->MIB->getTargetSymbol(*II,0);
+  ASSERT_EQ(Label, BB->getLabel());
+  ASSERT_EQ(II->getOperand(1).getImm(), X86::COND_E);
+}
+
+TEST_P(MCPlusBuilderTester, X86_CmpJNE) {
+  if (GetParam() != Triple::x86_64)
+    GTEST_SKIP();
+  BinaryFunction *BF = BC->createInjectedBinaryFunction("BF", true);
+  std::unique_ptr<BinaryBasicBlock> BB = BF->createBasicBlock();
+
+  InstructionListType Instrs = BC->MIB->createCmpJNE(X86::EAX, 2, BB->getLabel(), BC->Ctx.get());
+  BB->addInstructions(Instrs.begin(), Instrs.end());
+  BB->addSuccessor(BB.get());
+
+  auto II = BB->begin();
+  ASSERT_EQ(II->getOpcode(), X86::CMP64ri8);
+  ASSERT_EQ(II->getOperand(0).getReg(), X86::EAX);
+  ASSERT_EQ(II->getOperand(1).getImm(), 2);
+  II++;
+  ASSERT_EQ(II->getOpcode(), X86::JCC_1);
+  const MCSymbol *Label = BC->MIB->getTargetSymbol(*II,0);
+  ASSERT_EQ(Label, BB->getLabel());
+  ASSERT_EQ(II->getOperand(1).getImm(), X86::COND_NE);
+}
+
 #endif // X86_AVAILABLE
 
 TEST_P(MCPlusBuilderTester, Annotation) {

@github-actions
Copy link

github-actions bot commented Mar 19, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@maksfb maksfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution!

The change to createCmpJE() modifies the code generated by createInstrumentedIndCallHandlerEntryBB(). Was the code broken before? Particularly, what was the result of passing Imm to Bcc? We can use instrumentation as a test case for this change in addition to the unit test.

What is the reason for introducing createCmpJNE()?

@rcorcs
Copy link
Contributor Author

rcorcs commented Mar 24, 2025

Hi @maksfb , thank you for your comments.

  1. createInstrumentedIndCallHandlerEntryBB() was not broken before because by chance it was comparing to zero which is exactly the code that represents AArch64CC::EQ. However, it would be broken for immediate values other than zero because it would also change the conditional operation of the branch.

Below is a code snippet from AArch64MCPlusBuilder.cpp with the call to createCmpJE().
createInstrumentedIndCallHandlerEntryBB_AArch64

Below is a code snippet from llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h where AArch64CC::EQ is defined.
AArch64CC

However, note in the example below, that the AArch64::Bcc instruction expects a CondCode as its immediate operand. Below we can see an example extracted from AArch64InstrInfo.cpp.
BccUseExample

While the immediate value passed to createCmpJE() must be used only in the comparison instruction, such as in cmp RegNo, #Imm.

  1. The reason for introducing createCmpJNE() is because if I want to implement a multiversioning strategy with some checks, such as in the example below, I need to be able to jump if not equal as well.
UsecaseExample

The use case for this is a work in progress.

Copy link
Contributor

@maksfb maksfb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks for detailed explanation. Please format the title to align with LLVM guidelines before committing. Namely, change to imperative form, drop trailing period, and limit the length.

@rcorcs rcorcs changed the title [BOLT] Compare and Jump (CmpJE and CmpJNE) generation in MCPlusBuilder for both X86 and AArch64. [BOLT] Correct generation of compare and jump (CmpJE and CmpJNE) in MCPlusBuilder for both X86 and AArch64. Apr 4, 2025
@rcorcs
Copy link
Contributor Author

rcorcs commented Apr 4, 2025

@maksfb thanks for approving this PR. I don't have commit/merge permission. Could you please help me merge this PR?

@maksfb maksfb changed the title [BOLT] Correct generation of compare and jump (CmpJE and CmpJNE) in MCPlusBuilder for both X86 and AArch64. [BOLT] Handle generation of compare and jump sequences Apr 4, 2025
@maksfb maksfb merged commit b989171 into llvm:main Apr 4, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants