Skip to content
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
REQUIRES: aarch64-registered-target
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REQUIRES/RUN/CHECK lines should still be in comments even if this file isn't loaded anywhere (or at least I prefer it that way to be consistent with other tests).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, Thanks!


## Check for skipping of illegal instruction errors (AUT and LDGM)
RUN: llvm-exegesis -mcpu=neoverse-v2 -mode=latency --dump-object-to-disk=%d --min-instructions=1 --opcode-name=AUTIA --benchmark-phase=assemble-measured-code 2>&1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This never actually pipes anything into FileCheck, so you're not actually checking anything?

Also, --dump-object-to-disk here seems unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, Removed unnecessary flags.
Thanks!

CHECK: AUTIA: Unsupported opcode: isPointerAuth/isUncheckedAccess

RUN: llvm-exegesis -mcpu=neoverse-v2 -mode=latency --dump-object-to-disk=%d --min-instructions=1 --opcode-name=LDGM --benchmark-phase=assemble-measured-code 2>&1
CHECK: LDGM: Unsupported opcode: isPointerAuth/isUncheckedAccess
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: newline at end of file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

10 changes: 10 additions & 0 deletions llvm/tools/llvm-exegesis/lib/Target.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,14 @@ const ExegesisTarget *ExegesisTarget::lookup(Triple TT) {
return nullptr;
}

static bool isPointerAuthOpcode(unsigned Opcode) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another nitpick, I wanted to suggest that we create one isUnsupportedOpcode() function that could use these new helper functions, but then realised that is function getIgnoredOpcodeReasonOrNull(). We might as well use that function then, here would be my suggestion and as you see I also split up the first couple of if-statements as I don't like error messages that say "it can be this or that":

    const char *
    ExegesisTarget::getIgnoredOpcodeReasonOrNull(const LLVMState &State,
                                             unsigned Opcode) const {
        const MCInstrDesc &InstrDesc = State.getIC().getInstr(Opcode).Description;

        if (InstrDesc.isPseudo())
          return "Unsupported opcode: isPseudo";
        if ( InstrDesc.usesCustomInsertionHook())
          return "Unsupported opcode: usesCustomInserter";
        if (InstrDesc.isBranch())
          return "Unsupported opcode: isBranch";
        if (InstrDesc.isIndirectBranch())
          return "Unsupported opcode: isIndirectBranch";
        if (InstrDesc.isCall())
          return "Unsupported opcode: isCall";
        if (InstrDesc.isReturn())
          return "Unsupported opcode: isReturn";

         switch (Opcode) {
         default:
           return nullptr;

         // FIXME: Pointer Authentication instructions.
         // We would like to measure these instructions, but they can behave differently on
         // different platforms, and maybe the snippets need to look different for these instructions,
         // so let's disable this for now. 
         case AArch64::AUTDA:
         case AArch64::AUTDB:
         case AArch64::AUTDZA:
         case AArch64::AUTDZB:
         case AArch64::AUTIA:
         case AArch64::AUTIA1716:
         case AArch64::AUTIASP:
         case AArch64::AUTIAZ:
         case AArch64::AUTIB:
         case AArch64::AUTIB1716:
         case AArch64::AUTIBSP:
         case AArch64::AUTIBZ:
         case AArch64::AUTIZA:
         case AArch64::AUTIZB:
           return ""Unsupported opcode: isPointerAuth";

         // Load tag multiple instruction
         case AArch64::LDGM:
           return "Unsupported opcode: load tag multiple";
       }
       return nullptr;
    }

return (Opcode >= 1648 && Opcode <= 1667); // AUT instruction class range
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These numbers are not stable and could change as new instructions are added - you will need to check the opcodes directly (AArch64::AUT etc).

I would expect they were useful instructions to check the throughput of though, if there was a way to do that.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These numbers are not stable and could change as new instructions are added - you will need to check the opcodes directly (AArch64::AUT etc).

I would expect they were useful instructions to check the throughput of though, if there was a way to do that.

Agree on the usefulness, but was thinking to leave this for another day and mark them as unsupported for now. The problem is that the snippet might need to look a bit different, as some of these need to come in pairs. If used in isolation, you might run into a useless "Illegal instruction" kernel/linux error. This confused me as that made it look like the pauth extension wasn't enabled in the kernel, but turned out be a user error, e.g. checking a pointer before it was signed, that kind of stuff. So the idea is to worry about this later, avoid crashing snippets, and figure out what to do with this later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And fully agreed on checking the opcodes by the way, this needs to be a switch statement on all the pauth opcodes.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More generally, I am now wondering how this is supposed to work for optional extensions that may not need to be implemented on the platform where llvm-exegesis is run. I guess it needs to query the host for the supported extension, is anything like that already in place?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which ones fail for you? I tried AUTDA and it seemed to do OK when I tried it. My understanding was that they corrupted bits in the return address, which only faults when they are used. FEAT_FPAC might enable more faulting of the instructions directly.

(In general, this being a developer tool, I wasn't sure of the benefit or downside of marking an instruction as unsupported as opposed to just trying it and seeing if it worked. Disabling it has the downside that no-one can use it even if it could work. But I can see how that might make analyzing many instructions more.. messy).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @davemgreen , this is on Grace, which has the all the required extensions (and kernel modules).
I am able to successfully run the code example from the pointer auth learning path. I see pauth instructions and everything runs fine.

Here's is a small reproducer that runs fine:

extern "C" void kernel();
asm(R"(
.global kernel
kernel:
    paciasp
    autiasp
    ret
)");
int main(void) {
  kernel();
  return 0;
}

If you modify this, removes the signing and only authenticate this:

asm(R"(
.global kernel
kernel:
    autiasp
    ret
)");

Then you'll get an illegal instruction exception. At least, on Grace this is the observed behaviour. That's why I wrote these instructions need to come in pairs, but you're saying that the behaviour can be different on different platforms. Hmmmm, not ideal. But I think we do have a problem reliably testing these things, and on top of all of this, I don't think this is the right snippet to test it latency/throughput:

 0000000000000000 <foo>:
 0: f81f0ff6      str     x22, [sp, #-0x10]!
 4: d280000a      mov     x10, #0x0               // =0
 8: d2800016      mov     x22, #0x0               // =0. 
 c: dac11aca      autda   x10, x22
10: f84107f6      ldr     x22, [sp], #0x10
14: d65f03c0      ret

So, what do you think about disabling for now?
Or would you like to see a Grace specific workaround to disable this for now?

Copy link
Collaborator

@sjoerdmeijer sjoerdmeijer Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry forgot to add this: I don't know what that FEAT_FPAC things is.
I don't think it is advertised in the linux kernel features, maybe it is a bit of a different name, but don't think I see something that looks like it. Maybe I am missing something because I didn't research it too long. But if this is extension for raising the exception, then it looks like Grace has that like you wrote.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of it is that when the AUT instructions were originally added they purely modified the values of registers, corrupting the top bits which then cause the return to fault when the register is used. So on a machine like Grace I which does not have FEAT_FPAC they work OK. It seems to produce a file that looks like this, and produces a sensible looking latency result similar to the optimization guide:

       0:       f81f0ff4        str     x20, [sp, #-16]!                        
       4:       d280000d        mov     x13, #0x0                       // #0   
       8:       d2800014        mov     x20, #0x0                       // #0   
       c:       dac11a8d        autda   x13, x20                                
      10:       dac11a8d        autda   x13, x20                                
...
    9c44:       dac11a8d        autda   x13, x20               
    9c48:       dac11a8d        autda   x13, x20               
    9c4c:       f84107f4        ldr     x20, [sp], #16         
    9c50:       d65f03c0        ret                            

AUTIASP also works fine for me, as x30 (LR) is spilled and reloaded around the snippet.

FEAT_FPAC changes the instructions so that they fault immediately when the check fails. So a cpu with that feature (#132368 ;)) might then hit the segfault you are seeing.

So maybe we can check that the current CPU has FEAT_FPAC, and disable them only then?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or - perhaps Grace does have FEAT_FPAC (but we don't mark it as such in LLVM) and the system I am running on doesn't fault due to some way it is setup? I would imagine there was some way to disable the faulting at at least the OS level.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So maybe we can check that the current CPU has FEAT_FPAC, and disable them only then?

Okidoki, thanks, that indeed seems the best thing to do.

Don't know if it's easy to query a CPU for features in llvm exegesis, but I guess that's the first thing to investigate now for @lakshayk-nv.

}

static bool isUncheckedAccessOpcode(unsigned Opcode) {
return Opcode == 4694; // LDGM instruction
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here regarding the opcode number.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}

const char *
ExegesisTarget::getIgnoredOpcodeReasonOrNull(const LLVMState &State,
unsigned Opcode) const {
Expand All @@ -45,6 +53,8 @@ ExegesisTarget::getIgnoredOpcodeReasonOrNull(const LLVMState &State,
return "Unsupported opcode: isBranch/isIndirectBranch";
if (InstrDesc.isCall() || InstrDesc.isReturn())
return "Unsupported opcode: isCall/isReturn";
if (isPointerAuthOpcode(Opcode) || isUncheckedAccessOpcode(Opcode))
return "Unsupported opcode: isPointerAuth/isUncheckedAccess";
return nullptr;
}

Expand Down
Loading