@yonghong-song

Two new insns are added to the BPF instruction set:
  gotol_or_nop
    encoding: gotol encoding with src_reg = 1
  nop_or_gotol
    encoding: gotol encoding with src_reg = 3

Basically, src_reg 'bit_0 == 1' marks the insn as a gotol_or_nop/nop_or_gotol insn. src_reg 'bit_1' indicates whether the insn itself acts as a 'goto' (bit_1 == 0) or as a 'nop' (bit_1 == 1).
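
As a rough illustration of the encoding described above, the following C sketch classifies a raw little-endian instruction by its src_reg bits. The names `ja_kind` and `classify_ja` are made up for this example and are not part of LLVM, libbpf, or the kernel. It assumes the little-endian byte layout seen in the disassembly in this description, where byte 1 carries dst_reg in the low nibble and src_reg in the high nibble.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a BPF_JMP32|BPF_JA insn by src_reg,
 * following the encoding proposed in this PR. */
enum ja_kind {
    JA_PLAIN_GOTOL,   /* src_reg == 0: ordinary gotol */
    JA_GOTOL_OR_NOP,  /* src_reg == 1: bit_0 set, bit_1 clear, acts as goto */
    JA_NOP_OR_GOTOL,  /* src_reg == 3: bit_0 set, bit_1 set, acts as nop */
    JA_UNKNOWN        /* not a gotol, or an src_reg value not defined here */
};

#define BPF_JMP32_JA 0x06  /* BPF_JMP32 (0x06) | BPF_JA (0x00) | BPF_K (0x00) */

/* insn points at the 8 raw bytes of one little-endian BPF instruction. */
static enum ja_kind classify_ja(const uint8_t insn[8])
{
    assert(insn != NULL);

    if (insn[0] != BPF_JMP32_JA)
        return JA_UNKNOWN;

    /* Little-endian layout: low nibble = dst_reg, high nibble = src_reg. */
    uint8_t src_reg = insn[1] >> 4;

    switch (src_reg) {
    case 0:  return JA_PLAIN_GOTOL;
    case 1:  return JA_GOTOL_OR_NOP;
    case 3:  return JA_NOP_OR_GOTOL;
    default: return JA_UNKNOWN;
    }
}
```

Fed the bytes `06 10 00 00 04 00 00 00` from the example in this description, the sketch reports gotol_or_nop; with `06 30 ...` it reports nop_or_gotol.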

The two insns are intended to support kernel static-key-like transformations, where the insn can be either a nop or a ja.

The following is an example where two labels,
static_key_loc_1 and static_key_loc_2, identify the locations of particular gotol_or_nop/nop_or_gotol insns.

User space could then do
  static_key_enable("static_key_loc_1")
libbpf can validate that the label "static_key_loc_1" indeed
corresponds to a gotol_or_nop/nop_or_gotol insn, and it
can translate 'static_key_enable("static_key_loc_1")'
into something like the bpf syscall command 'static_key_enable prog, insn offset 1',
and the kernel will do the proper adjustment.
The same applies to static_key_disable, static_key_enabled, etc.
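
The description does not say what the kernel's "proper adjustment" would be, and the kernel side of the design is not settled here. Purely as a sketch, and assuming the adjustment amounts to flipping src_reg bit_1 of the insn in place, it could look like the C below. `ja_set_nop` is a hypothetical helper, not a kernel or libbpf function, and the mapping from static_key_enable/disable to "goto" or "nop" is deliberately left open.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (assumption, not an existing API): make a
 * gotol_or_nop/nop_or_gotol insn behave as a nop (nop == true) or as a
 * goto (nop == false) by setting or clearing src_reg bit_1.
 * insn points at the 8 raw bytes of one little-endian BPF instruction.
 * Returns false if the insn is not one of the two new insns. */
static bool ja_set_nop(uint8_t insn[8], bool nop)
{
    uint8_t src_reg = insn[1] >> 4;

    /* Opcode must be BPF_JMP32|BPF_JA (0x06) and src_reg bit_0 must be set. */
    if (insn[0] != 0x06 || (src_reg & 0x1) == 0)
        return false;

    if (nop)
        src_reg |= 0x2;             /* bit_1 = 1: behaves as nop */
    else
        src_reg &= (uint8_t)~0x2;   /* bit_1 = 0: behaves as goto */

    /* Write src_reg back into the high nibble, keeping dst_reg untouched. */
    insn[1] = (uint8_t)((src_reg << 4) | (insn[1] & 0x0f));
    return true;
}
```

Rejecting plain gotol insns mirrors the validation step that libbpf is described as performing on the label before issuing the syscall.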

$ cat t.c
int bar(void);
int foo1(int arg1)
{
        int a = arg1, b;

        asm volatile goto ("r0 = 0; \
                            static_key_loc_1: \
                            gotol_or_nop %l[label]; \
                            r2 = 2; \
                            r3 = 3; \
                           "::
                            : "r0", "r2", "r3"
                            :label);
        a = bar();
label:
        b = 20 * a;
        return b;
}
int foo2(int arg1)
{
        int a = arg1, b;

        asm volatile goto ("r0 = 0; \
                            static_key_loc_2: \
                            nop_or_gotol %l[label]; \
                            r2 = 2; \
                            r3 = 3; \
                           "::
                            : "r0", "r2", "r3"
                            :label);
        a = bar();
label:
        b = 20 * a;
        return b;
}
$ clang --target=bpf -O2 -g -c t.c
$ llvm-objdump -S t.o
t.o:    file format elf64-bpf

Disassembly of section .text:

0000000000000000 <foo1>:
;       asm volatile goto ("r0 = 0; \
       0:       b7 00 00 00 00 00 00 00 r0 = 0x0

0000000000000008 <static_key_loc_1>:
       1:       06 10 00 00 04 00 00 00 gotol_or_nop +0x4 <LBB0_2>
       2:       b7 02 00 00 02 00 00 00 r2 = 0x2
       3:       b7 03 00 00 03 00 00 00 r3 = 0x3
;       a = bar();
       4:       85 10 00 00 ff ff ff ff call -0x1
       5:       bf 01 00 00 00 00 00 00 r1 = r0

0000000000000030 <LBB0_2>:
;       b = 20 * a;
       6:       27 01 00 00 14 00 00 00 r1 *= 0x14
;       return b;
       7:       bf 10 00 00 00 00 00 00 r0 = r1
       8:       95 00 00 00 00 00 00 00 exit

0000000000000048 <foo2>:
;       asm volatile goto ("r0 = 0; \
       9:       b7 00 00 00 00 00 00 00 r0 = 0x0

0000000000000050 <static_key_loc_2>:
      10:       06 30 00 00 04 00 00 00 nop_or_gotol +0x4 <LBB1_2>
      11:       b7 02 00 00 02 00 00 00 r2 = 0x2
      12:       b7 03 00 00 03 00 00 00 r3 = 0x3
;       a = bar();
      13:       85 10 00 00 ff ff ff ff call -0x1
      14:       bf 01 00 00 00 00 00 00 r1 = r0

0000000000000078 <LBB1_2>:
;       b = 20 * a;
      15:       27 01 00 00 14 00 00 00 r1 *= 0x14
;       return b;
      16:       bf 10 00 00 00 00 00 00 r0 = r1
      17:       95 00 00 00 00 00 00 00 exit

@llvmbot llvmbot added the llvm:mc Machine (object) code label Dec 12, 2023

llvmbot commented Dec 12, 2023

@llvm/pr-subscribers-mc

Author: None (yonghong-song)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/75110.diff

5 Files Affected:

  • (modified) llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp (+4)
  • (modified) llvm/lib/Target/BPF/BPFInstrInfo.td (+28)
  • (modified) llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp (+2-1)
  • (modified) llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp (+2-1)
  • (added) llvm/test/MC/BPF/branch-or-nop.s (+11)
diff --git a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
index 90697c6645be2..0422f6b0ee92e 100644
--- a/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
+++ b/llvm/lib/Target/BPF/AsmParser/BPFAsmParser.cpp
@@ -231,6 +231,8 @@ struct BPFOperand : public MCParsedAsmOperand {
         .Case("call", true)
         .Case("goto", true)
         .Case("gotol", true)
+        .Case("gotol_or_nop", true)
+        .Case("nop_or_gotol", true)
         .Case("*", true)
         .Case("exit", true)
         .Case("lock", true)
@@ -259,6 +261,8 @@ struct BPFOperand : public MCParsedAsmOperand {
         .Case("bswap64", true)
         .Case("goto", true)
         .Case("gotol", true)
+        .Case("gotol_or_nop", true)
+        .Case("nop_or_gotol", true)
         .Case("ll", true)
         .Case("skb", true)
         .Case("s", true)
diff --git a/llvm/lib/Target/BPF/BPFInstrInfo.td b/llvm/lib/Target/BPF/BPFInstrInfo.td
index 7d443a3449014..779c8c586371d 100644
--- a/llvm/lib/Target/BPF/BPFInstrInfo.td
+++ b/llvm/lib/Target/BPF/BPFInstrInfo.td
@@ -604,6 +604,32 @@ class BRANCH_LONG<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
   let BPFClass = BPF_JMP32;
 }
 
+class BRANCH_OR_NOP<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
+    : TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
+                   (outs),
+                   (ins brtarget:$BrDst),
+                   !strconcat(OpcodeStr, " $BrDst"),
+                   Pattern> {
+  bits<32> BrDst;
+
+  let Inst{55-52} = 1;
+  let Inst{31-0} = BrDst;
+  let BPFClass = BPF_JMP32;
+}
+
+class NOP_OR_BRANCH<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>
+    : TYPE_ALU_JMP<Opc.Value, BPF_K.Value,
+                   (outs),
+                   (ins brtarget:$BrDst),
+                   !strconcat(OpcodeStr, " $BrDst"),
+                   Pattern> {
+  bits<32> BrDst;
+
+  let Inst{55-52} = 3;
+  let Inst{31-0} = BrDst;
+  let BPFClass = BPF_JMP32;
+}
+
 class CALL<string OpcodeStr>
     : TYPE_ALU_JMP<BPF_CALL.Value, BPF_K.Value,
                    (outs),
@@ -632,6 +658,8 @@ class CALLX<string OpcodeStr>
 let isBranch = 1, isTerminator = 1, hasDelaySlot=0, isBarrier = 1 in {
   def JMP : BRANCH<BPF_JA, "goto", [(br bb:$BrDst)]>;
   def JMPL : BRANCH_LONG<BPF_JA, "gotol", []>;
+  def JMPL_OR_NOP : BRANCH_OR_NOP<BPF_JA, "gotol_or_nop", []>;
+  def NOP_OR_JMPL : NOP_OR_BRANCH<BPF_JA, "nop_or_gotol", []>;
 }
 
 // Jump and link
diff --git a/llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp b/llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp
index c266538bec736..fc5eccac96454 100644
--- a/llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp
+++ b/llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp
@@ -103,7 +103,8 @@ void BPFInstPrinter::printBrTargetOperand(const MCInst *MI, unsigned OpNo,
                                        raw_ostream &O) {
   const MCOperand &Op = MI->getOperand(OpNo);
   if (Op.isImm()) {
-    if (MI->getOpcode() == BPF::JMPL) {
+    if (MI->getOpcode() == BPF::JMPL || MI->getOpcode() == BPF::JMPL_OR_NOP ||
+        MI->getOpcode() == BPF::NOP_OR_JMPL) {
       int32_t Imm = Op.getImm();
       O << ((Imm >= 0) ? "+" : "") << formatImm(Imm);
     } else {
diff --git a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
index b807d6904004d..bdae69960e24a 100644
--- a/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
+++ b/llvm/lib/Target/BPF/MCTargetDesc/BPFMCCodeEmitter.cpp
@@ -96,7 +96,8 @@ unsigned BPFMCCodeEmitter::getMachineOpValue(const MCInst &MI,
     Fixups.push_back(MCFixup::create(0, Expr, FK_PCRel_4));
   else if (MI.getOpcode() == BPF::LD_imm64)
     Fixups.push_back(MCFixup::create(0, Expr, FK_SecRel_8));
-  else if (MI.getOpcode() == BPF::JMPL)
+  else if (MI.getOpcode() == BPF::JMPL || MI.getOpcode() == BPF::JMPL_OR_NOP ||
+           MI.getOpcode() == BPF::NOP_OR_JMPL)
     Fixups.push_back(MCFixup::create(0, Expr, (MCFixupKind)BPF::FK_BPF_PCRel_4));
   else
     // bb label
diff --git a/llvm/test/MC/BPF/branch-or-nop.s b/llvm/test/MC/BPF/branch-or-nop.s
new file mode 100644
index 0000000000000..e71803b14a191
--- /dev/null
+++ b/llvm/test/MC/BPF/branch-or-nop.s
@@ -0,0 +1,11 @@
+# RUN: llvm-mc -triple bpfel < %s
+
+dst:
+
+# CHECK: gotol_or_nop dst                        # encoding: [0x06'A',0x01'A',A,A,0x00,0x00,0x00,0x00]
+# CHECK:                                 #   fixup A - offset: 0, value: dst, kind: FK_BPF_PCRel_4
+gotol_or_nop dst
+
+# CHECK: nop_or_gotol dst                        # encoding: [0x06'A',0x03'A',A,A,0x00,0x00,0x00,0x00]
+# CHECK:                                 #   fixup A - offset: 0, value: dst, kind: FK_BPF_PCRel_4
+nop_or_gotol dst

inclyc added a commit to yonghong-song/llvm-project that referenced this pull request Dec 12, 2023

@inclyc inclyc left a comment

I read the thread on https://lore.kernel.org/bpf/[email protected]/T/#mf55ee30dacc6e67aee6712ae3a117059ed06a84b

It looks like there is no test for this, and I just pushed a fixup commit to your branch.

let BPFClass = BPF_JMP32;
}

class BRANCH_OR_NOP<BPFJumpOp Opc, string OpcodeStr, list<dag> Pattern>

These two insns are just special cases of JMPL. How about extending (inheriting from) BRANCH_LONG and overriding the Inst{55-52} field?

> these two insns are just special cases for JMPL, how about just extending(inherit) BRANCH_LONG and override Inst{55-52} fields?

Thanks, @inclyc. Good suggestion, and thanks for adding the test! I have marked the patch as RFC. I will wait until the design in the kernel is settled, then rework the code as you suggested above and add tests.

@yonghong-song changed the title from "[BPF] Add support for asm gotol_or_nop and nop_or_gotol insns" to "[RFC][BPF] Add support for asm gotol_or_nop and nop_or_gotol insns" Dec 12, 2023
.Case("bswap64", true)
.Case("goto", true)
.Case("gotol", true)
.Case("gotol_or_nop", true)

I'm skeptical about this: is gotol_or_nop really a ValidIdInMiddle? I don't think we introduced any asm syntax like:

if ... gotol_or_nop

> skeptical about this, is gotol_or_nop ValidIdInMiddle? I think we did not introduce any asm syntax like:
>
> if ... gotol_or_nop

Thanks! Will remove it later.

Alastair Robertson reported a huge compile-time increase
without -g for the bpf target compared to x86 ([1]). In my setup,
with '-O0', compiling a large basic block takes 0.19s for x86
while the bpf target takes 2.46s. The function contributing most
to the compile time is eliminateFrameIndex().

The long compile time without -g is caused by commit
  05de2e4 ("[bpf] error when BPF stack size exceeds 512 bytes")
The compiler tries to get a debug loc by iterating over all insns
in the basic block; the debug loc is used when the compiler warns
about a stack size larger than 512 bytes. This iteration happens
even without -g, which causes an unnecessary compile-time increase.

To fix the issue, move the related code to the point where the
compiler is about to warn about a stack limit violation. This fixes
the compile-time regression; on my system, the compile time
is reduced from 2.46s to 0.35s.

  [1] bpftrace/bpftrace#3257
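
The shape of the fix described in this commit message can be sketched in C as follows. This is not the LLVM code (which is C++ inside eliminateFrameIndex()); the struct, function names, and return-value convention are invented for illustration. The point is only that the linear scan for a debug location runs on the warning path instead of on every call.

```c
#include <stddef.h>

#define STACK_LIMIT 512  /* BPF stack size limit in bytes */

/* Hypothetical stand-in for a machine instruction. */
struct insn {
    int has_debug_loc;  /* nonzero if this insn carries a debug location */
    int line;           /* source line of that location */
};

static size_t scan_count;  /* how many times the basic block was scanned */

/* Linear scan for the first insn carrying a debug location; -1 if none. */
static int find_debug_line(const struct insn *bb, size_t n)
{
    scan_count++;
    for (size_t i = 0; i < n; i++)
        if (bb[i].has_debug_loc)
            return bb[i].line;
    return -1;
}

/* Called once per frame-index operand. Returns -2 when no warning is
 * needed, otherwise the line to report (or -1 if no debug loc exists). */
static int check_stack_offset(const struct insn *bb, size_t n, int offset)
{
    if (offset <= STACK_LIMIT)
        return -2;                  /* common path: no scan at all */
    return find_debug_line(bb, n);  /* rare path: scan only when warning */
}
```

With the scan hoisted out of the common path, a large basic block with many in-limit frame indices is never walked for debug locations.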
Labels

backend:BPF llvm:mc Machine (object) code