[llvm-exegesis] Adding in llvm-exegesis for Aarch64 for handling FPR8/16/32 and FPCR setReg warning #130595

lakshayk-nv · 2025-03-10T12:56:28Z

The current AArch64 implementation in llvm-exegesis only supports the GPR32, GPR64, FPR64/128, PPR16 and ZPR128 register classes, so for opcode variants that use any other register class llvm-exegesis emits the warning "setRegTo is not implemented, results will be unreliable". This patch handles the remaining floating-point register classes (FPR8, FPR16 and FPR32) and FPCR, initializing each register with an appropriate base instruction. It also adds test cases for the four newly supported register classes.
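To make the dispatch easier to follow, here is a minimal, self-contained sketch of the idea. It is not the patch itself: `ToyInst`, `RegKind`, `zeroingOpcode`, `loadFPZero` and `initSequence` are hypothetical stand-ins for `MCInst` and the helpers in `Target.cpp`, and only the opcode names (`FMOVH0`, `FMOVS0`, `MOVID`, `MOVi64imm`, `MSR`) and the `X8` scratch register are taken from the diff. FPR8 is left out because its handling changed during review.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for MCInst: an opcode name plus printable operands.
struct ToyInst {
  std::string Opcode;
  std::vector<std::string> Operands;
};

// Hypothetical tags for the register kinds discussed here. FPR64 was already
// supported and is included only for contrast.
enum class RegKind { FPR16, FPR32, FPR64, FPCR };

// Width -> zeroing opcode, using the opcode names that appear in the diff.
inline std::string zeroingOpcode(unsigned Bits) {
  switch (Bits) {
  case 16: return "FMOVH0";
  case 32: return "FMOVS0";
  case 64: return "MOVID";
  default: return ""; // widths outside this sketch
  }
}

// Only the 64-bit (and wider) opcodes get an explicit immediate operand,
// mirroring the `RegBitWidth >= 64` check in the diff.
inline ToyInst loadFPZero(const std::string &Reg, unsigned Bits,
                          uint64_t Value) {
  assert(Value == 0 && "Expected initialisation value 0");
  ToyInst Inst{zeroingOpcode(Bits), {Reg}};
  if (Bits >= 64)
    Inst.Operands.push_back("#" + std::to_string(Value));
  return Inst;
}

// One instruction for the FP classes, two for FPCR: the value is first
// materialised in a scratch GPR and then written with MSR.
inline std::vector<ToyInst> initSequence(RegKind Kind,
                                         const std::string &Reg) {
  switch (Kind) {
  case RegKind::FPR16: return {loadFPZero(Reg, 16, 0)};
  case RegKind::FPR32: return {loadFPZero(Reg, 32, 0)};
  case RegKind::FPR64: return {loadFPZero(Reg, 64, 0)};
  case RegKind::FPCR: {
    ToyInst LoadImm{"MOVi64imm", {"X8", "#0"}};
    ToyInst MoveToFPCR{"MSR", {"FPCR", "X8"}};
    return {LoadImm, MoveToFPCR};
  }
  }
  return {};
}
```

In this model `initSequence(RegKind::FPR32, "S0")` yields a single `FMOVS0` with no immediate operand, while `initSequence(RegKind::FPCR, "FPCR")` yields the two-instruction sequence.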

llvmbot · 2025-03-10T12:57:06Z

@llvm/pr-subscribers-tools-llvm-exegesis

Author: None (lakshayk-nv)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/130595.diff

2 Files Affected:

(modified) llvm/test/tools/llvm-exegesis/AArch64/setReg_init_check.s (+37)
(modified) llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp (+51-7)

diff --git a/llvm/test/tools/llvm-exegesis/AArch64/setReg_init_check.s b/llvm/test/tools/llvm-exegesis/AArch64/setReg_init_check.s
index da28388c96094..a5ae91e6ebb3a 100644
--- a/llvm/test/tools/llvm-exegesis/AArch64/setReg_init_check.s
+++ b/llvm/test/tools/llvm-exegesis/AArch64/setReg_init_check.s
@@ -37,3 +37,40 @@ RUN: FileCheck %s --check-prefix=FPR64-ASM < %t.s
 FPR64-ASM:          <foo>:
 FPR64-ASM:          movi d{{[0-9]+}}, #0000000000000000
 FPR64-ASM-NEXT:     addv h{{[0-9]+}}, v{{[0-9]+}}.4h
+
+## FPR32 Register Class Initialization Testcase
+RUN: llvm-exegesis -mcpu=neoverse-v2 -mode=latency --dump-object-to-disk=%d --opcode-name=FABSSr --benchmark-phase=assemble-measured-code 2>&1
+RUN: llvm-objdump -d %d > %t.s
+RUN: FileCheck %s --check-prefix=FPR32-ASM < %t.s
+FPR32-ASM:         <foo>:
+FPR32-ASM:         movi d{{[0-9]+}}, #0000000000000000
+FPR32-ASM-NEXT:    fabs s{{[0-9]+}}, s{{[0-9]+}}
+
+
+## FPR16 Register Class Initialization Testcase
+RUN: llvm-exegesis -mcpu=neoverse-v2 -mode=latency --dump-object-to-disk=%d --opcode-name=FABSHr --benchmark-phase=assemble-measured-code 2>&1
+RUN: llvm-objdump -d %d > %t.s
+RUN: FileCheck %s --check-prefix=FPR16-ASM < %t.s
+FPR16-ASM:         <foo>:
+FPR16-ASM:         movi d{{[0-9]+}}, #0000000000000000
+FPR16-ASM-NEXT:    fabs h{{[0-9]+}}, h{{[0-9]+}}
+
+## FPR8 Register Class Initialization Testcase
+RUN: llvm-exegesis -mcpu=neoverse-v2 -mode=latency --dump-object-to-disk=%d --opcode-name=SQABSv1i8 --benchmark-phase=assemble-measured-code 2>&1
+RUN: llvm-objdump -d %d > %t.s
+RUN: FileCheck %s --check-prefix=FPR8-ASM < %t.s
+FPR8-ASM:         <foo>:
+FPR8-ASM:         mov     w{{[0-9]+}}, #0x0
+FPR8-ASM-NEXT:    fmov    h{{[0-9]+}}, w{{[0-9]+}}
+FPR8-ASM-NEXT:    sqabs   b{{[0-9]+}}, b{{[0-9]+}}
+
+
+## FPCR Register Class Initialization Testcase
+RUN: llvm-exegesis -mcpu=neoverse-v2 -mode=latency --dump-object-to-disk=%d --opcode-name=BFCVT --benchmark-phase=assemble-measured-code 2>&1
+RUN: llvm-objdump -d %d > %t.s
+RUN: FileCheck %s --check-prefix=FPCR-ASM < %t.s
+FPCR-ASM:         <foo>:
+FPCR-ASM:         movi d{{[0-9]+}}, #0000000000000000
+FPCR-ASM-NEXT:    mov     x{{[0-9]+}}, #0x0
+FPCR-ASM-NEXT:    msr     FPCR, x{{[0-9]+}}
+FPCR-ASM-NEXT:    bfcvt   h{{[0-9]+}}, s{{[0-9]+}}
diff --git a/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp b/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
index ed36cb2f75d5b..026dd240708ab 100644
--- a/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
+++ b/llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp
@@ -28,8 +28,8 @@ static unsigned getLoadImmediateOpcode(unsigned RegBitWidth) {
 // Generates instruction to load an immediate value into a register.
 static MCInst loadImmediate(MCRegister Reg, unsigned RegBitWidth,
                             const APInt &Value) {
-  assert (Value.getBitWidth() <= RegBitWidth &&
-          "Value must fit in the Register");
+  assert(Value.getBitWidth() <= RegBitWidth &&
+         "Value must fit in the Register");
   return MCInstBuilder(getLoadImmediateOpcode(RegBitWidth))
       .addReg(Reg)
       .addImm(Value.getZExtValue());
@@ -53,9 +53,44 @@ static MCInst loadPPRImmediate(MCRegister Reg, unsigned RegBitWidth,
       .addImm(31); // All lanes true for 16 bits
 }
 
+// Generates instructions to load an immediate value into an FPCR register.
+static std::vector<MCInst>
+loadFPCRImmediate(MCRegister Reg, unsigned RegBitWidth, const APInt &Value) {
+  MCRegister TempReg = AArch64::X8;
+  MCInst LoadImm = MCInstBuilder(AArch64::MOVi64imm).addReg(TempReg).addImm(0);
+  MCInst MoveToFPCR =
+      MCInstBuilder(AArch64::MSR).addImm(AArch64SysReg::FPCR).addReg(TempReg);
+  return {LoadImm, MoveToFPCR};
+}
+
+// Generates instructions to load an immediate value into an FPR8 register.
+static std::vector<MCInst>
+loadFP8Immediate(MCRegister Reg, unsigned RegBitWidth, const APInt &Value) {
+  assert(Value.getBitWidth() <= 8 && "Value must fit in 8 bits");
+
+  // Use a temporary general-purpose register (W8) to hold the 8-bit value
+  MCRegister TempReg = AArch64::W8;
+
+  // Load the 8-bit value into a general-purpose register (W8)
+  MCInst LoadImm = MCInstBuilder(AArch64::MOVi32imm)
+                       .addReg(TempReg)
+                       .addImm(Value.getZExtValue());
+
+  // Move the value from the general-purpose register to the FPR16 register
+  // Convert the FPR8 register to an FPR16 register
+  MCRegister FPR16Reg = Reg + (AArch64::H0 - AArch64::B0);
+  MCInst MoveToFPR =
+      MCInstBuilder(AArch64::FMOVWHr).addReg(FPR16Reg).addReg(TempReg);
+  return {LoadImm, MoveToFPR};
+}
+
 // Fetch base-instruction to load an FP immediate value into a register.
 static unsigned getLoadFPImmediateOpcode(unsigned RegBitWidth) {
   switch (RegBitWidth) {
+  case 16:
+    return AArch64::FMOVH0; //FMOVHi;
+  case 32:
+    return AArch64::FMOVS0; //FMOVSi;
   case 64:
     return AArch64::MOVID; //FMOVDi;
   case 128:
@@ -67,11 +102,12 @@ static unsigned getLoadFPImmediateOpcode(unsigned RegBitWidth) {
 // Generates instruction to load an FP immediate value into a register.
 static MCInst loadFPImmediate(MCRegister Reg, unsigned RegBitWidth,
                               const APInt &Value) {
-  assert(Value.getZExtValue() == 0 &&
-         "Expected initialisation value 0");
-  return MCInstBuilder(getLoadFPImmediateOpcode(RegBitWidth))
-      .addReg(Reg)
-      .addImm(Value.getZExtValue());
+  assert(Value.getZExtValue() == 0 && "Expected initialisation value 0");
+  MCInst Instructions =
+      MCInstBuilder(getLoadFPImmediateOpcode(RegBitWidth)).addReg(Reg);
+  if (RegBitWidth >= 64)
+    Instructions.addOperand(MCOperand::createImm(Value.getZExtValue()));
+  return Instructions;
 }
 
 #include "AArch64GenExegesis.inc"
@@ -92,12 +128,20 @@ class ExegesisAArch64Target : public ExegesisTarget {
       return {loadImmediate(Reg, 64, Value)};
     if (AArch64::PPRRegClass.contains(Reg))
       return {loadPPRImmediate(Reg, 16, Value)};
+    if (AArch64::FPR8RegClass.contains(Reg))
+      return loadFP8Immediate(Reg, 8, Value);
+    if (AArch64::FPR16RegClass.contains(Reg))
+      return {loadFPImmediate(Reg, 16, Value)};
+    if (AArch64::FPR32RegClass.contains(Reg))
+      return {loadFPImmediate(Reg, 32, Value)};
     if (AArch64::FPR64RegClass.contains(Reg))
       return {loadFPImmediate(Reg, 64, Value)};
     if (AArch64::FPR128RegClass.contains(Reg))
       return {loadFPImmediate(Reg, 128, Value)};
     if (AArch64::ZPRRegClass.contains(Reg))
       return {loadZPRImmediate(Reg, 128, Value)};
+    if (Reg == AArch64::FPCR)
+      return {loadFPCRImmediate(Reg, 32, Value)};
 
     errs() << "setRegTo is not implemented, results will be unreliable\n";
     return {};

github-actions · 2025-03-10T13:00:57Z

✅ With the latest revision this PR passed the C/C++ code formatter.

sjoerdmeijer · 2025-03-10T13:41:40Z

llvm/test/tools/llvm-exegesis/AArch64/setReg_init_check.s

+RUN: FileCheck %s --check-prefix=FPR8-ASM < %t.s
+FPR8-ASM:         <foo>:
+FPR8-ASM:         mov     w{{[0-9]+}}, #0x0
+FPR8-ASM-NEXT:    fmov    h{{[0-9]+}}, w{{[0-9]+}}


You can directly use:

fmov h, #imm

No need to move 0 to a GPR first and then to a FPR?

There is no instruction or pseudo-instruction that loads an immediate directly into an FPR8 register, so a direct fmov is not possible. Instead, the value is moved to a GPR first and then from the GPR to the FPR.

Okay, fair enough about movi above, but why are we not also using that here then?

Yes, it makes sense to use movi here, as @davemgreen suggested too.
I have updated the FPR8 code: it now returns loadFPImmediate(Reg - AArch64::B0 + AArch64::D0, 64, Value), which generates movi $reg, #0.

sjoerdmeijer · 2025-03-10T13:44:39Z

llvm/test/tools/llvm-exegesis/AArch64/setReg_init_check.s

+RUN: FileCheck %s --check-prefix=FPCR-ASM < %t.s
+FPCR-ASM:         <foo>:
+FPCR-ASM:         movi d{{[0-9]+}}, #0000000000000000
+FPCR-ASM-NEXT:    mov     x{{[0-9]+}}, #0x0


You're using x9 as a "hard-coded" temp register (in the loadFPCRImmediate), so to test that, you'll need to match that here and not any value.

davemgreen · 2025-03-10T20:52:10Z

llvm/tools/llvm-exegesis/lib/AArch64/Target.cpp

    if (AArch64::PPRRegClass.contains(Reg))
      return {loadPPRImmediate(Reg, 16, Value)};
+    if (AArch64::FPR8RegClass.contains(Reg))
+      return loadFP8Immediate(Reg, 8, Value);


Could this use return {loadFPImmediate(Reg - AArch64::B0 + AArch64::D0, 64, Value.zext(64))}?

Sure, it is better to use movi for FPR8 too.
Updated.
Thanks for the suggestion.
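The remapping agreed on in this thread rests on two points: `Reg - AArch64::B0 + AArch64::D0` maps Bn to Dn only if the B and D register numbers form parallel contiguous ranges, and zeroing Dn also zeroes Bn because Bn is the low 8 bits of the same vector register. A minimal sketch of the index arithmetic follows. The `ToyReg` enum and its numeric values are hypothetical stand-ins for the generated AArch64 register enum, and `zext64` only imitates what `Value.zext(64)` does to an 8-bit value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical register numbering: two contiguous banks of 32 registers.
// The real values come from the generated AArch64 register enum; the
// contiguity assumed here is what the index arithmetic relies on.
enum ToyReg : unsigned {
  NoRegister = 0,
  B0 = 1,       // B0..B31 occupy 1..32
  D0 = B0 + 32, // D0..D31 occupy 33..64
};

inline bool isFPR8(unsigned Reg) { return Reg >= B0 && Reg < B0 + 32; }

// Maps Bn to Dn by keeping the offset within the bank.
inline unsigned widenFPR8ToFPR64(unsigned Reg) {
  assert(isFPR8(Reg) && "expected a B register");
  return Reg - B0 + D0;
}

// Zero-extends an 8-bit initial value to 64 bits, as Value.zext(64) would.
inline uint64_t zext64(uint8_t Value) { return Value; }
```

With this numbering, `widenFPR8ToFPR64(B0 + 5)` returns `D0 + 5`, so the existing 64-bit initialization path can be reused for the 8-bit class.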

sjoerdmeijer · 2025-03-11T12:08:23Z

llvm/test/tools/llvm-exegesis/AArch64/setReg_init_check.s

+RUN: llvm-objdump -d %d > %t.s
+RUN: FileCheck %s --check-prefix=FPR8-ASM < %t.s
+FPR8-ASM:         <foo>:
+FPR8-ASM:         movi d{{[0-9]+}}, #0000000000000000


Nit: align the d-register with b-register of the next instruction below.

Done, Thanks!

sjoerdmeijer

LGTM.

Let's wait a day with merging this in case Dave has more comments.

sjoerdmeijer · 2025-03-11T12:09:17Z

llvm/test/tools/llvm-exegesis/AArch64/setReg_init_check.s

+RUN: llvm-objdump -d %d > %t.s
+RUN: FileCheck %s --check-prefix=FPCR-ASM < %t.s
+FPCR-ASM:         <foo>:
+FPCR-ASM:         movi d{{[0-9]+}}, #0000000000000000


davemgreen · 2025-03-11T12:48:00Z

Thanks for the update. Looks OK to me.

llvmbot added the tools:llvm-exegesis label Mar 10, 2025

lakshayk-nv changed the title ~~[llvm-exegesis] Adding support for FPR8/16/32 and FPCR for setReg warning~~ [llvm-exegesis] Adding in llvm-exegesis for Aarch64 for handling FPR8/16/32 and FPCR setReg warning Mar 10, 2025

Modified: clang formatter changes

62e8e36

sjoerdmeijer requested review from boomanaiden154, davemgreen and sjoerdmeijer March 10, 2025 13:13

sjoerdmeijer reviewed Mar 10, 2025

davemgreen reviewed Mar 10, 2025

[llvm-exegesis] Simplifying the FPR8 setReg implementation and corresponding test cases.

b684c97

sjoerdmeijer reviewed Mar 11, 2025

sjoerdmeijer approved these changes Mar 11, 2025

[llvm-exegesis] Style nits for testcase.

fae356f

sjoerdmeijer merged commit 9cc477b into llvm:main Mar 11, 2025
6 of 10 checks passed
