
Commit b54a8e1

Authored and committed by Alexei Starovoitov
Merge branch 'bpf-indirect-jumps'
Anton Protopopov says:

====================
BPF indirect jumps

This patchset implements a new type of map, the instruction array, and
uses it to build support for indirect branches in BPF (on x86). (The
same map will later be used to provide support for indirect calls and
static keys.) See [1], [2] for more context.

Short table of contents:

  * Patches 1-6 implement the new map of type BPF_MAP_TYPE_INSN_ARRAY
    and corresponding selftests. This map can be used to track the
    "original -> xlated -> jitted" mapping for a given program.

  * Patches 7-12 implement support for indirect jumps on x86, add
    libbpf support for LLVM-compiled programs containing indirect
    jumps, and add selftests.

The jump table support was merged into LLVM and can now be enabled with
-mcpu=v4, see [3]. The __BPF_FEATURE_GOTOX macro can be used to check
whether the compiler supports the feature. See individual patches for
more implementation details.

v10 -> v11 (this series):
  * rearranged patches and split the libbpf patch so that the first 6
    patches implementing instruction arrays can be applied independently
  * instruction arrays:
    * move the [fake] aux->used_maps assignment into this patch
  * indirect jumps:
    * call clear_insn_aux_data before bpf_remove_insns (AI)
  * libbpf:
    * remove the relocations check after the new LLVM is released (Eduard, Yonghong)
    * fix an index printed in pr_warn (AI)
  * selftests:
    * protect programs triggered by nanosleep from fake runs (Eduard)
    * patch verifier_gotox to not emit .rel.jumptables

v9 -> v10 (https://lore.kernel.org/bpf/[email protected]/T/#t):
  * Three bugs were noticed by AI in v9 (two old, one introduced by v9):
    * [new] insn_array_alloc_size could overflow u32, switched to u64 (AI)
    * map_ptr should be compared in regsafe for PTR_TO_INSN (AI)
    * duplicate elements were copied in jt_from_map (AI)
  * added a selftest in verifier_gotox with a jump table containing
    non-unique entries

v8 -> v9 (https://lore.kernel.org/bpf/[email protected]/T/#t):
  * instruction arrays:
    * remove the size restriction of 256 elements
    * add comments about addrs usage, old and new (Alexei)
  * libbpf:
    * properly prefix warnings (Andrii)
    * cast j[t] to long long for printf and some other minor cleanups (Andrii)
  * selftests:
    * use __BPF_FEATURE_GOTOX in selftests and skip tests if it's not set (Eduard)
    * fix a typo in a selftest assembly (AI)

v7 -> v8 (https://lore.kernel.org/bpf/[email protected]/T/#u):
  * instruction arrays:
    * simplify the bpf_prog_update_insn_ptrs function (Eduard)
    * remove a semicolon after a function definition (AI)
  * libbpf:
    * add a proper error path in the libbpf patch (AI)
    * re-re-factor create_jt_map & find_subprog_idx (Eduard)
  * selftests:
    * verifier_gotox: add a test for a jump table pointing outside of a subprog (Eduard)
    * use test__skip instead of just running an empty test
    * split tests in bpf_gotox into subtests for convenience
  * random:
    * drop the docs commit for now

v6 -> v7 (https://lore.kernel.org/bpf/[email protected]/T/#t):
  * rebased and dropped already merged commits
  * instruction arrays:
    * use jit_data to find mappings from insn to jit (Alexei)
    * alloc `ips` as part of the main allocation (Eduard)
    * the `jitted_ip` member wasn't actually used (Eduard)
    * remove the bpf_insn_ptr structure, which is not needed for this patch
  * indirect jumps, kernel:
    * fix a memory leak in `create_jt` (AI)
    * use proper reg+8*ereg in `its_static_thunk` (AI)
    * some minor cleanups (Eduard)
  * indirect jumps, libbpf:
    * refactor the `jt_adjust_off()` piece (Eduard)
    * move "JUMPTABLES_SEC" into libbpf_internal.h (Eduard)
    * remove an unnecessary if (Eduard)
  * verifier_gotox: add tests to verify that `gotox rX` works with all registers

v5 -> v6 (https://lore.kernel.org/bpf/[email protected]/T/#u):
  * instruction arrays:
    * better document `struct bpf_insn_array_value` (Eduard)
    * remove a condition in `bpf_insn_array_adjust_after_remove` (Eduard)
    * make userspace see original, xlated, and jitted indexes (+original) (Eduard)
  * indirect jumps, kernel:
    * reject writes to the map
    * reject unaligned ops
    * add a check that `w` is not outside the program in check_cfg for `gotox` (Eduard)
    * do not introduce an unneeded `bpf_find_containing_subprog_idx`
    * simplify error processing for `bpf_find_containing_subprog` (Eduard)
    * add `insn_state |= DISCOVERED` when it's discovered (Eduard)
    * support SUB operations on PTR_TO_INSN (Eduard)
    * make `gotox_tmp_buf` a bpf_iarray and use a helper to relocate it (Eduard)
    * rename fields of `bpf_iarray` to be more generic (Eduard)
    * re-implement `visit_gotox_insn` as a loop (Eduard)
    * some minor cleanups (Eduard)
  * libbpf:
    * `struct reloc_desc`: add a comment about the `union` (Eduard)
    * rename parameters of `{create,add}_jt_map` (and one other place in the code) to `sym_off` (Eduard)
    * `create_jt_map`: check that size/off are 8-byte aligned (Eduard)
  * selftests:
    * instruction array selftests:
      * only run tests on x86_64
      * write a more generic function to test things to reduce code (Eduard)
      * errno wasn't used in checks, so don't reset it (Eduard)
      * print `i`, `xlated_off` and `map_out[i]` here (Eduard)
    * added `verifier_gotox` selftests which do not depend on LLVM
    * disabled `bpf_gotox` tests by default
  * other changes:
    * remove an extra function in bpf disasm (Eduard)
    * some minor cleanups in the insn_successors patch (Eduard)
    * update documentation in `Documentation/bpf/linux-notes.rst` about jumps, now it is supported :)

v3 -> v4 -> v5 (https://lore.kernel.org/bpf/[email protected]/):
  * [v4 -> v5] rebased on top of the latest bpf-next/master
  * instruction arrays:
    * add copyright (Alexei)
    * remove mutexes, add frozen back (Alexei)
    * set up 1:1 prog-map correspondence using atomic_xchg
    * do not copy/paste array_map_get_next_key, add a common helper (Alexei)
    * misc minor code cleanups (Alexei)
  * indirect jumps, kernel side:
    * remove jt_allocated, just check if insn is gotox (Eduard)
    * use copy_register_state instead of individual copies (Eduard)
    * in push_stack, is_speculative should be inherited (Eduard)
    * a few cleanups for insn_successors, including omitting the error path (Eduard)
    * check if reserved fields are used when considering a `gotox` instruction (Eduard)
    * size and alignment of reads from insn_array should be 8 (Eduard)
    * put the buffer for sorting in subfun info and realloc to grow as needed (Eduard)
    * properly do `jump_point` / `prune_point` from `push_gotox_edge` (Eduard)
    * use range_within to check states (Eduard)
    * some minor cleanups and fix commit message (Eduard)
  * indirect jumps, libbpf side:
    * close map_fd in some error paths in create_jt_map (Andrii)
    * maps for jump tables were actually not closed at all, fix this (Andrii)
    * rename the map from `jt` to `.jumptables` (Andrii)
    * use `errstr` in an error message (Andrii)
    * rephrase an error message to look more standard (Andrii)
    * misc other minor renames and cleanups (Andrii)
  * selftests:
    * add the frozen selftest back
    * add a selftest for two jumps loading the same table
  * some other changes:
    * rebase and split insn_successor changes into a separate patch
    * use PTR_ERR_OR_ZERO in the push_stack patch (Eduard)
    * indirect jumps on x86: properly re-read *pprog (Eduard)

v2 -> v3 (https://lore.kernel.org/bpf/[email protected]/):
  * fix build failure when CONFIG_BPF_SYSCALL is not set (kbuild-bot)
  * reformat bpftool help messages (Quentin)

v1 -> v2 (https://lore.kernel.org/bpf/[email protected]/):
  * push_stack changes:
    * sanitize_speculative_path should just return int (Eduard)
    * return the code from sanitize_speculative_path, not EFAULT (Eduard)
    * when BPF_COMPLEXITY_LIMIT_JMP_SEQ is reached, return E2BIG (Eduard)
  * indirect jumps:
    * omit support for .imm=fd in gotox, as we're not using it for now (Eduard)
    * struct jt -> struct bpf_iarray (Eduard)
    * insn_successors: rewrite the interface to just return a pointer (Eduard)
    * remove min_index/max_index, use umin_value/umax_value instead (Alexei, Eduard)
    * move the emit_indirect_jump args change to the previous patch (Eduard)
    * add a comment to map_mem_size() (Eduard)
    * use verifier_bug for some error cases in check_indirect_jump (Eduard)
    * clear_insn_aux_data: use start,len instead of start,end (Eduard)
    * make regs[insn->dst_reg].type = PTR_TO_INSN part of check_mem_access (Eduard)
  * constant blinding changes:
    * make the subprog_start adjustment more readable (Eduard)
    * do not set subprog len, it is already set (Eduard)
  * libbpf:
    * remove the check that relocations from .rodata are ok (Anton)
    * do not freeze the map, it is not necessary anymore (Anton)
    * rename goto_x -> gotox everywhere (Anton)
    * use u64 when parsing LLVM jump tables (Eduard)
    * split the patch in two due to the spaces->tabs change (Eduard)
    * split bpftool changes into the bpftool patch (Andrii)
    * make sym_size a union with ext_idx (Andrii)
    * properly copy/free the jumptables_data section from elf (Andrii)
    * a few cosmetic changes around create_jt_map (Andrii)
    * fix some comments + rewrite the patch description (Andrii)
    * inline bpf_prog__append_subprog_offsets (Andrii)
    * subprog_sec_offst -> subprog_sec_off (Andrii)
    * !strcmp -> strcmp() == 0 (Andrii)
    * make some function names more readable (Andrii)
    * allocate the table of subfunc offsets via libbpf_reallocarray (Andrii)
  * selftests:
    * squash insn_array* tests together (Anton)
    * fix build warnings (kernel test robot)

RFC -> v1 (https://lore.kernel.org/bpf/[email protected]/):
  * I've tried to address all the comments provided by Alexei and Eduard
    on the RFC. The most important ones are listed below.
  * One big change: move from the older LLVM version [5] to the newer
    one [4]. Now LLVM generates jump tables as symbols in the new
    special section ".jumptables". Another part of this change is that
    libbpf no longer tries to link a map load with goto *rX, as 1) this
    is absolutely not reliable and 2) for some use cases it is
    impossible (namely, when more than one jump table can be used in
    the same gotox instruction).
  * Added insn_successors() support (Alexei, Eduard). This includes
    getting rid of the ugly bpf_insn_set_iter_xlated_offset() interface
    (Eduard).
  * Removed the hack for the unreachable instruction, as the new LLVM,
    thanks to Eduard, doesn't generate it.
  * Set mem_size for direct map access properly instead of hacking
    around it; removed the off>0 check (Alexei)
  * Do not allocate new memory for min_index/max_index (Alexei, Eduard)
  * Information required during check_cfg is now cached to be reused
    later (Alexei + general logic for supporting multiple JTs per jump)
  * Properly compare registers in regsafe (Alexei, Eduard)
  * Remove support for JMP32 (Eduard)
  * Better checks in adjust_ptr_min_max_vals (Eduard)
  * More selftests were added (but there's still room for more) which
    directly use gotox (Alexei)
  * More checks and verbose messages added
  * "unique pointers" are no more in the map

Links:
  1. https://lpc.events/event/18/contributions/1941/
  2. https://lwn.net/Articles/1017439/
  3. llvm/llvm-project#149715
  4. llvm/llvm-project#149715 (comment)
  6. rfc: https://lore.kernel.org/bpf/[email protected]/
====================

Link: https://patch.msgid.link/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
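As a concrete illustration of the feature described above: a hedged
sketch of what a jump-table candidate looks like on the BPF C side.
The __BPF_FEATURE_GOTOX guard and the -mcpu=v4 requirement come from
the cover letter; the section name, program name, and switch body are
purely illustrative, and whether LLVM actually lowers a given switch
to a ".jumptables" entry plus a gotox is a compiler decision.

/* Illustrative only: a dense switch that a jump-table-capable LLVM
 * (clang -O2 -target bpf -mcpu=v4) may lower to a ".jumptables"
 * symbol plus a gotox instruction.
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

SEC("tp/syscalls/sys_enter_nanosleep")
int classify(void *ctx)
{
#ifdef __BPF_FEATURE_GOTOX        /* set by compilers that support gotox */
	__u32 op = bpf_get_prandom_u32() & 7;

	switch (op) {             /* dense cases: a jump-table candidate */
	case 0: return 100;
	case 1: return 101;
	case 2: return 102;
	case 3: return 103;
	case 4: return 104;
	case 5: return 105;
	case 6: return 106;
	default: return 0;
	}
#else
	return 0;                 /* old compiler: no indirect jumps */
#endif
}

char LICENSE[] SEC("license") = "GPL";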
2 parents 4cb4897 + ac4d838 commit b54a8e1

26 files changed: +2776, -20 lines

arch/x86/net/bpf_jit_comp.c (34 additions, 8 deletions)

@@ -660,24 +660,38 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 
 #define EMIT_LFENCE()	EMIT3(0x0F, 0xAE, 0xE8)
 
-static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip)
+static void __emit_indirect_jump(u8 **pprog, int reg, bool ereg)
 {
 	u8 *prog = *pprog;
 
+	if (ereg)
+		EMIT1(0x41);
+
+	EMIT2(0xFF, 0xE0 + reg);
+
+	*pprog = prog;
+}
+
+static void emit_indirect_jump(u8 **pprog, int bpf_reg, u8 *ip)
+{
+	u8 *prog = *pprog;
+	int reg = reg2hex[bpf_reg];
+	bool ereg = is_ereg(bpf_reg);
+
 	if (cpu_feature_enabled(X86_FEATURE_INDIRECT_THUNK_ITS)) {
 		OPTIMIZER_HIDE_VAR(reg);
-		emit_jump(&prog, its_static_thunk(reg), ip);
+		emit_jump(&prog, its_static_thunk(reg + 8*ereg), ip);
 	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE_LFENCE)) {
 		EMIT_LFENCE();
-		EMIT2(0xFF, 0xE0 + reg);
+		__emit_indirect_jump(&prog, reg, ereg);
 	} else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) {
 		OPTIMIZER_HIDE_VAR(reg);
 		if (cpu_feature_enabled(X86_FEATURE_CALL_DEPTH))
-			emit_jump(&prog, &__x86_indirect_jump_thunk_array[reg], ip);
+			emit_jump(&prog, &__x86_indirect_jump_thunk_array[reg + 8*ereg], ip);
 		else
-			emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip);
+			emit_jump(&prog, &__x86_indirect_thunk_array[reg + 8*ereg], ip);
 	} else {
-		EMIT2(0xFF, 0xE0 + reg); /* jmp *%\reg */
+		__emit_indirect_jump(&prog, reg, ereg);
 		if (IS_ENABLED(CONFIG_MITIGATION_RETPOLINE) || IS_ENABLED(CONFIG_MITIGATION_SLS))
 			EMIT1(0xCC); /* int3 */
 	}
@@ -797,7 +811,7 @@ static void emit_bpf_tail_call_indirect(struct bpf_prog *bpf_prog,
 	 * rdi == ctx (1st arg)
 	 * rcx == prog->bpf_func + X86_TAIL_CALL_OFFSET
 	 */
-	emit_indirect_jump(&prog, 1 /* rcx */, ip + (prog - start));
+	emit_indirect_jump(&prog, BPF_REG_4 /* R4 -> rcx */, ip + (prog - start));
 
 	/* out: */
 	ctx->tail_call_indirect_label = prog - start;
@@ -2614,6 +2628,9 @@ st:			if (is_imm8(insn->off))
 
 			break;
 
+		case BPF_JMP | BPF_JA | BPF_X:
+			emit_indirect_jump(&prog, insn->dst_reg, image + addrs[i - 1]);
+			break;
 		case BPF_JMP | BPF_JA:
 		case BPF_JMP32 | BPF_JA:
 			if (BPF_CLASS(insn->code) == BPF_JMP) {
@@ -3543,7 +3560,7 @@ static int emit_bpf_dispatcher(u8 **pprog, int a, int b, s64 *progs, u8 *image,
 	if (err)
 		return err;
 
-	emit_indirect_jump(&prog, 2 /* rdx */, image + (prog - buf));
+	emit_indirect_jump(&prog, BPF_REG_3 /* R3 -> rdx */, image + (prog - buf));
 
 	*pprog = prog;
 	return 0;
@@ -3827,6 +3844,15 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
 		jit_data->header = header;
 		jit_data->rw_header = rw_header;
 	}
+
+	/*
+	 * The bpf_prog_update_insn_ptrs function expects addrs to
+	 * point to the first byte of the jitted instruction (unlike
+	 * the bpf_prog_fill_jited_linfo below, which, for historical
+	 * reasons, expects to point to the next instruction)
+	 */
+	bpf_prog_update_insn_ptrs(prog, addrs, image);
+
 	/*
 	 * ctx.prog_offset is used when CFI preambles put code *before*
 	 * the function. See emit_cfi(). For FineIBT specifically this code
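For readers unfamiliar with the reg + 8*ereg indexing above: the
retpoline/ITS thunk arrays hold one entry per full 4-bit x86-64
register number, so extended registers (r8-r15, where is_ereg() is
true) land 8 slots further in. In the non-thunk path the same
distinction shows up as a REX.B prefix byte. Below is a small
userspace sketch, not kernel code, mirroring the bytes that
__emit_indirect_jump() emits; the helper name is made up.

/* "jmp *%reg" is opcode FF /4; ModRM = 0xE0 + low 3 bits of the
 * register; r8-r15 additionally need a REX.B prefix (0x41).
 */
#include <stdio.h>

static int emit_indirect_jump_bytes(unsigned char *buf, int reg, int ereg)
{
	int n = 0;

	if (ereg)
		buf[n++] = 0x41;	/* REX.B: select r8..r15 */
	buf[n++] = 0xFF;		/* opcode: JMP r/m64 */
	buf[n++] = 0xE0 + reg;		/* ModRM: mod=11, /4, rm=reg */
	return n;
}

int main(void)
{
	unsigned char buf[3];
	int i, n;

	n = emit_indirect_jump_bytes(buf, 1, 0);	/* jmp *%rcx -> FF E1 */
	for (i = 0; i < n; i++)
		printf("%02X ", buf[i]);
	printf("\n");

	n = emit_indirect_jump_bytes(buf, 1, 1);	/* jmp *%r9 -> 41 FF E1 */
	for (i = 0; i < n; i++)
		printf("%02X ", buf[i]);
	printf("\n");
	return 0;
}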

include/linux/bpf.h (16 additions, 0 deletions)

@@ -1001,6 +1001,7 @@ enum bpf_reg_type {
 	PTR_TO_ARENA,
 	PTR_TO_BUF,		 /* reg points to a read/write buffer */
 	PTR_TO_FUNC,		 /* reg points to a bpf program function */
+	PTR_TO_INSN,		 /* reg points to a bpf program instruction */
 	CONST_PTR_TO_DYNPTR,	 /* reg points to a const struct bpf_dynptr */
 	__BPF_REG_TYPE_MAX,
 
@@ -3797,4 +3798,19 @@ int bpf_prog_get_file_line(struct bpf_prog *prog, unsigned long ip, const char *
 			   const char **linep, int *nump);
 struct bpf_prog *bpf_prog_find_from_stack(void);
 
+int bpf_insn_array_init(struct bpf_map *map, const struct bpf_prog *prog);
+int bpf_insn_array_ready(struct bpf_map *map);
+void bpf_insn_array_release(struct bpf_map *map);
+void bpf_insn_array_adjust(struct bpf_map *map, u32 off, u32 len);
+void bpf_insn_array_adjust_after_remove(struct bpf_map *map, u32 off, u32 len);
+
+#ifdef CONFIG_BPF_SYSCALL
+void bpf_prog_update_insn_ptrs(struct bpf_prog *prog, u32 *offsets, void *image);
+#else
+static inline void
+bpf_prog_update_insn_ptrs(struct bpf_prog *prog, u32 *offsets, void *image)
+{
+}
+#endif
+
 #endif /* _LINUX_BPF_H */
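The adjust helpers declared above are what keep the map's xlated
offsets in sync while the verifier patches the program. A conceptual
sketch of my reading of that API (an assumption, not the kernel
implementation): when one instruction at `off` is patched into `len`
instructions, every tracked offset behind the patch point shifts by
len - 1.

/* Conceptual model only -- the real bpf_insn_array_adjust() lives in
 * kernel/bpf/bpf_insn_array.c and operates on the map's values.
 */
static void insn_offsets_adjust(__u32 *xlated_off, __u32 n_entries,
				__u32 off, __u32 len)
{
	__u32 i;

	for (i = 0; i < n_entries; i++)
		if (xlated_off[i] > off)	/* entries past the patch point */
			xlated_off[i] += len - 1;
}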

include/linux/bpf_types.h (1 addition, 0 deletions)

@@ -133,6 +133,7 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_RINGBUF, ringbuf_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_BLOOM_FILTER, bloom_filter_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_USER_RINGBUF, user_ringbuf_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARENA, arena_map_ops)
+BPF_MAP_TYPE(BPF_MAP_TYPE_INSN_ARRAY, insn_array_map_ops)
 
 BPF_LINK_TYPE(BPF_LINK_TYPE_RAW_TRACEPOINT, raw_tracepoint)
 BPF_LINK_TYPE(BPF_LINK_TYPE_TRACING, tracing)

include/linux/bpf_verifier.h (11 additions, 0 deletions)

@@ -527,6 +527,7 @@ struct bpf_insn_aux_data {
 		struct {
 			u32 map_index;	/* index into used_maps[] */
 			u32 map_off;	/* offset from value base address */
+			struct bpf_iarray *jt; /* jump table for gotox instruction */
 		};
 		struct {
 			enum bpf_reg_type reg_type;	/* type of pseudo_btf_id */
@@ -754,8 +755,10 @@ struct bpf_verifier_env {
 	struct list_head free_list;	/* list of struct bpf_verifier_state_list */
 	struct bpf_map *used_maps[MAX_USED_MAPS]; /* array of map's used by eBPF program */
 	struct btf_mod_pair used_btfs[MAX_USED_BTFS]; /* array of BTF's used by BPF program */
+	struct bpf_map *insn_array_maps[MAX_USED_MAPS]; /* array of INSN_ARRAY map's to be relocated */
 	u32 used_map_cnt;		/* number of used maps */
 	u32 used_btf_cnt;		/* number of used BTF objects */
+	u32 insn_array_map_cnt;		/* number of used maps of type BPF_MAP_TYPE_INSN_ARRAY */
 	u32 id_gen;			/* used to generate unique reg IDs */
 	u32 hidden_subprog_cnt;		/* number of hidden subprogs */
 	int exception_callback_subprog;
@@ -838,6 +841,7 @@ struct bpf_verifier_env {
 	struct bpf_scc_info **scc_info;
 	u32 scc_cnt;
 	struct bpf_iarray *succ;
+	struct bpf_iarray *gotox_tmp_buf;
 };
 
 static inline struct bpf_func_info_aux *subprog_aux(struct bpf_verifier_env *env, int subprog)
@@ -1048,6 +1052,13 @@ static inline bool bpf_stack_narrow_access_ok(int off, int fill_size, int spill_
 	return !(off % BPF_REG_SIZE);
 }
 
+static inline bool insn_is_gotox(struct bpf_insn *insn)
+{
+	return BPF_CLASS(insn->code) == BPF_JMP &&
+	       BPF_OP(insn->code) == BPF_JA &&
+	       BPF_SRC(insn->code) == BPF_X;
+}
+
 const char *reg_type_str(struct bpf_verifier_env *env, enum bpf_reg_type type);
 const char *dynptr_type_str(enum bpf_dynptr_type type);
 const char *iter_type_str(const struct btf *btf, u32 btf_id);
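The insn_is_gotox() predicate added above matches exactly the encoding
the x86 JIT handles in its new BPF_JMP | BPF_JA | BPF_X case. A small
userspace sketch using only the uapi header, showing what does and
does not match (the test values are illustrative):

#include <stdio.h>
#include <linux/bpf.h>	/* struct bpf_insn, BPF_JMP/BPF_JA/BPF_X macros */

static int insn_is_gotox(const struct bpf_insn *insn)
{
	return BPF_CLASS(insn->code) == BPF_JMP &&
	       BPF_OP(insn->code) == BPF_JA &&
	       BPF_SRC(insn->code) == BPF_X;
}

int main(void)
{
	/* gotox r2: jump to the address held in register R2 */
	struct bpf_insn gotox = {
		.code	 = BPF_JMP | BPF_JA | BPF_X,
		.dst_reg = 2,	/* BPF_REG_2 */
	};
	/* plain "goto +off": same class/op, but a BPF_K source */
	struct bpf_insn ja = { .code = BPF_JMP | BPF_JA };

	printf("gotox? %d\n", insn_is_gotox(&gotox));	/* 1 */
	printf("gotox? %d\n", insn_is_gotox(&ja));	/* 0 */
	return 0;
}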

include/uapi/linux/bpf.h (21 additions, 0 deletions)

@@ -1026,6 +1026,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_USER_RINGBUF,
 	BPF_MAP_TYPE_CGRP_STORAGE,
 	BPF_MAP_TYPE_ARENA,
+	BPF_MAP_TYPE_INSN_ARRAY,
 	__MAX_BPF_MAP_TYPE
 };
 
@@ -7649,4 +7650,24 @@ enum bpf_kfunc_flags {
 	BPF_F_PAD_ZEROS = (1ULL << 0),
 };
 
+/*
+ * Values of a BPF_MAP_TYPE_INSN_ARRAY entry must be of this type.
+ *
+ * Before the map is used the orig_off field should point to an
+ * instruction inside the program being loaded. The other fields
+ * must be set to 0.
+ *
+ * After the program is loaded, the xlated_off will be adjusted
+ * by the verifier to point to the index of the original instruction
+ * in the xlated program. If the instruction is deleted, it will
+ * be set to (u32)-1. The jitted_off will be set to the corresponding
+ * offset in the jitted image of the program.
+ */
+struct bpf_insn_array_value {
+	__u32 orig_off;
+	__u32 xlated_off;
+	__u32 jitted_off;
+	__u32 :32;
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
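To make the uapi comment above concrete, here is a hedged sketch of
preparing such a map from userspace with libbpf's generic wrappers.
bpf_map_create() and bpf_map_update_elem() are existing libbpf APIs;
the map name, the helper itself, and how the returned fd is later tied
to the program at load time are illustrative assumptions, and the code
requires headers/libbpf new enough to know the new map type.

#include <unistd.h>
#include <bpf/bpf.h>
#include <linux/bpf.h>

/* Sketch: fill an INSN_ARRAY map before program load. Per the uapi
 * comment above, userspace sets only orig_off; the verifier and JIT
 * fill in xlated_off and jitted_off once the program is loaded.
 */
static int make_insn_array(const __u32 *orig_offs, __u32 n)
{
	struct bpf_insn_array_value val;
	__u32 i;
	int fd;

	fd = bpf_map_create(BPF_MAP_TYPE_INSN_ARRAY, "insn_arr",
			    sizeof(__u32), sizeof(val), n, NULL);
	if (fd < 0)
		return fd;

	for (i = 0; i < n; i++) {
		val = (struct bpf_insn_array_value) {
			.orig_off = orig_offs[i], /* insn index in original prog */
		};
		if (bpf_map_update_elem(fd, &i, &val, BPF_ANY)) {
			close(fd);
			return -1;
		}
	}
	return fd;	/* to be associated with the program at load time */
}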

kernel/bpf/Makefile (1 addition, 1 deletion)

@@ -9,7 +9,7 @@ CFLAGS_core.o += -Wno-override-init $(cflags-nogcse-yy)
 obj-$(CONFIG_BPF_SYSCALL) += syscall.o verifier.o inode.o helpers.o tnum.o log.o token.o liveness.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_iter.o map_iter.o task_iter.o prog_iter.o link_iter.o
 obj-$(CONFIG_BPF_SYSCALL) += hashtab.o arraymap.o percpu_freelist.o bpf_lru_list.o lpm_trie.o map_in_map.o bloom_filter.o
-obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o
+obj-$(CONFIG_BPF_SYSCALL) += local_storage.o queue_stack_maps.o ringbuf.o bpf_insn_array.o
 obj-$(CONFIG_BPF_SYSCALL) += bpf_local_storage.o bpf_task_storage.o
 obj-${CONFIG_BPF_LSM} += bpf_inode_storage.o
 obj-$(CONFIG_BPF_SYSCALL) += disasm.o mprog.o
