Commit 886bf91

Daniel Borkmann says:

====================
pull-request: bpf 2024-06-05

We've added 8 non-merge commits during the last 6 day(s) which contain
a total of 9 files changed, 34 insertions(+), 35 deletions(-).

The main changes are:

1) Fix a potential use-after-free in bpf_link_free when the link uses
   dealloc_deferred to free the link object but later still tests for
   presence of link->ops->dealloc, from Cong Wang.

2) Fix BPF test infra to set the run context for rawtp test_run
   callback where syzbot reported a crash, from Jiri Olsa.

3) Fix bpf_session_cookie BTF_ID in the special_kfunc_set list to
   exclude it for the case of !CONFIG_FPROBE, also from Jiri Olsa.

4) Fix a Coverity static analysis report to not close() a link_fd of -1
   in the multi-uprobe feature detector, from Andrii Nakryiko.

5) Revert support for redirect to any xsk socket bound to the same umem
   as it can result in corrupted ring state which can lead to a crash
   when flushing rings. A different approach will be pursued for
   bpf-next to address it safely, from Magnus Karlsson.

6) Fix inet_csk_accept prototype in test_sk_storage_tracing.c which
   caused BPF CI failure after the last tree fast forwarding, from
   Andrii Nakryiko.

7) Fix a coccicheck warning in BPF devmap that iterator variable cannot
   be NULL, from Thorsten Blum.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  Revert "xsk: Document ability to redirect to any socket bound to the same umem"
  Revert "xsk: Support redirect to any socket bound to the same umem"
  bpf: Set run context for rawtp test_run callback
  bpf: Fix a potential use-after-free in bpf_link_free()
  bpf, devmap: Remove unnecessary if check in for loop
  libbpf: don't close(-1) in multi-uprobe feature detector
  bpf: Fix bpf_session_cookie BTF_ID in special_kfunc_set list
  selftests/bpf: fix inet_csk_accept prototype in test_sk_storage_tracing.c
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 parents 323a359 + 03e38d3 commit 886bf91

File tree

9 files changed, 34 insertions(+), 35 deletions(-)

Documentation/networking/af_xdp.rst

Lines changed: 14 additions & 19 deletions

@@ -329,24 +329,23 @@ XDP_SHARED_UMEM option and provide the initial socket's fd in the
 sxdp_shared_umem_fd field as you registered the UMEM on that
 socket. These two sockets will now share one and the same UMEM.
 
-In this case, it is possible to use the NIC's packet steering
-capabilities to steer the packets to the right queue. This is not
-possible in the previous example as there is only one queue shared
-among sockets, so the NIC cannot do this steering as it can only steer
-between queues.
-
-In libxdp (or libbpf prior to version 1.0), you need to use the
-xsk_socket__create_shared() API as it takes a reference to a FILL ring
-and a COMPLETION ring that will be created for you and bound to the
-shared UMEM. You can use this function for all the sockets you create,
-or you can use it for the second and following ones and use
-xsk_socket__create() for the first one. Both methods yield the same
-result.
+There is no need to supply an XDP program like the one in the previous
+case where sockets were bound to the same queue id and
+device. Instead, use the NIC's packet steering capabilities to steer
+the packets to the right queue. In the previous example, there is only
+one queue shared among sockets, so the NIC cannot do this steering. It
+can only steer between queues.
+
+In libbpf, you need to use the xsk_socket__create_shared() API as it
+takes a reference to a FILL ring and a COMPLETION ring that will be
+created for you and bound to the shared UMEM. You can use this
+function for all the sockets you create, or you can use it for the
+second and following ones and use xsk_socket__create() for the first
+one. Both methods yield the same result.
 
 Note that a UMEM can be shared between sockets on the same queue id
 and device, as well as between queues on the same device and between
-devices at the same time. It is also possible to redirect to any
-socket as long as it is bound to the same umem with XDP_SHARED_UMEM.
+devices at the same time.
 
 XDP_USE_NEED_WAKEUP bind flag
 -----------------------------
@@ -823,10 +822,6 @@ A: The short answer is no, that is not supported at the moment. The
    switch, or other distribution mechanism, in your NIC to direct
    traffic to the correct queue id and socket.
 
-   Note that if you are using the XDP_SHARED_UMEM option, it is
-   possible to switch traffic between any socket bound to the same
-   umem.
-
 Q: My packets are sometimes corrupted. What is wrong?
 
 A: Care has to be taken not to feed the same buffer in the UMEM into

kernel/bpf/devmap.c

Lines changed: 0 additions & 3 deletions

@@ -760,9 +760,6 @@ int dev_map_redirect_multi(struct net_device *dev, struct sk_buff *skb,
 	for (i = 0; i < dtab->n_buckets; i++) {
 		head = dev_map_index_hash(dtab, i);
 		hlist_for_each_entry_safe(dst, next, head, index_hlist) {
-			if (!dst)
-				continue;
-
 			if (is_ifindex_excluded(excluded_devices, num_excluded,
 						dst->dev->ifindex))
 				continue;

kernel/bpf/syscall.c

Lines changed: 6 additions & 5 deletions

@@ -2998,6 +2998,7 @@ static int bpf_obj_get(const union bpf_attr *attr)
 void bpf_link_init(struct bpf_link *link, enum bpf_link_type type,
 		   const struct bpf_link_ops *ops, struct bpf_prog *prog)
 {
+	WARN_ON(ops->dealloc && ops->dealloc_deferred);
 	atomic64_set(&link->refcnt, 1);
 	link->type = type;
 	link->id = 0;
@@ -3056,16 +3057,17 @@ static void bpf_link_defer_dealloc_mult_rcu_gp(struct rcu_head *rcu)
 /* bpf_link_free is guaranteed to be called from process context */
 static void bpf_link_free(struct bpf_link *link)
 {
+	const struct bpf_link_ops *ops = link->ops;
 	bool sleepable = false;
 
 	bpf_link_free_id(link->id);
 	if (link->prog) {
 		sleepable = link->prog->sleepable;
 		/* detach BPF program, clean up used resources */
-		link->ops->release(link);
+		ops->release(link);
 		bpf_prog_put(link->prog);
 	}
-	if (link->ops->dealloc_deferred) {
+	if (ops->dealloc_deferred) {
 		/* schedule BPF link deallocation; if underlying BPF program
 		 * is sleepable, we need to first wait for RCU tasks trace
 		 * sync, then go through "classic" RCU grace period
@@ -3074,9 +3076,8 @@ static void bpf_link_free(struct bpf_link *link)
 			call_rcu_tasks_trace(&link->rcu, bpf_link_defer_dealloc_mult_rcu_gp);
 		else
 			call_rcu(&link->rcu, bpf_link_defer_dealloc_rcu_gp);
-	}
-	if (link->ops->dealloc)
-		link->ops->dealloc(link);
+	} else if (ops->dealloc)
+		ops->dealloc(link);
 }
 
 static void bpf_link_put_deferred(struct work_struct *work)

kernel/bpf/verifier.c

Lines changed: 4 additions & 0 deletions

@@ -11128,7 +11128,11 @@ BTF_ID(func, bpf_iter_css_task_new)
 #else
 BTF_ID_UNUSED
 #endif
+#ifdef CONFIG_BPF_EVENTS
 BTF_ID(func, bpf_session_cookie)
+#else
+BTF_ID_UNUSED
+#endif
 
 static bool is_kfunc_ret_null(struct bpf_kfunc_call_arg_meta *meta)
 {

kernel/trace/bpf_trace.c

Lines changed: 0 additions & 2 deletions

@@ -3517,7 +3517,6 @@ static u64 bpf_uprobe_multi_entry_ip(struct bpf_run_ctx *ctx)
 }
 #endif /* CONFIG_UPROBES */
 
-#ifdef CONFIG_FPROBE
 __bpf_kfunc_start_defs();
 
 __bpf_kfunc bool bpf_session_is_return(void)
@@ -3566,4 +3565,3 @@ static int __init bpf_kprobe_multi_kfuncs_init(void)
 }
 
 late_initcall(bpf_kprobe_multi_kfuncs_init);
-#endif

net/bpf/test_run.c

Lines changed: 6 additions & 0 deletions

@@ -727,10 +727,16 @@ static void
 __bpf_prog_test_run_raw_tp(void *data)
 {
 	struct bpf_raw_tp_test_run_info *info = data;
+	struct bpf_trace_run_ctx run_ctx = {};
+	struct bpf_run_ctx *old_run_ctx;
+
+	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
 
 	rcu_read_lock();
 	info->retval = bpf_prog_run(info->prog, info->ctx);
 	rcu_read_unlock();
+
+	bpf_reset_run_ctx(old_run_ctx);
 }
 
 int bpf_prog_test_run_raw_tp(struct bpf_prog *prog,

net/xdp/xsk.c

Lines changed: 1 addition & 4 deletions

@@ -313,13 +313,10 @@ static bool xsk_is_bound(struct xdp_sock *xs)
 
 static int xsk_rcv_check(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
 {
-	struct net_device *dev = xdp->rxq->dev;
-	u32 qid = xdp->rxq->queue_index;
-
 	if (!xsk_is_bound(xs))
 		return -ENXIO;
 
-	if (!dev->_rx[qid].pool || xs->umem != dev->_rx[qid].pool->umem)
+	if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index)
 		return -EINVAL;
 
 	if (len > xsk_pool_get_rx_frame_size(xs->pool) && !xs->sg) {

tools/lib/bpf/features.c

Lines changed: 2 additions & 1 deletion

@@ -393,7 +393,8 @@ static int probe_uprobe_multi_link(int token_fd)
 	err = -errno; /* close() can clobber errno */
 
 	if (link_fd >= 0 || err != -EBADF) {
-		close(link_fd);
+		if (link_fd >= 0)
+			close(link_fd);
 		close(prog_fd);
 		return 0;
 	}

tools/testing/selftests/bpf/progs/test_sk_storage_tracing.c

Lines changed: 1 addition & 1 deletion

@@ -84,7 +84,7 @@ int BPF_PROG(trace_tcp_connect, struct sock *sk)
 }
 
 SEC("fexit/inet_csk_accept")
-int BPF_PROG(inet_csk_accept, struct sock *sk, int flags, int *err, bool kern,
+int BPF_PROG(inet_csk_accept, struct sock *sk, struct proto_accept_arg *arg,
 	     struct sock *accepted_sk)
 {
 	set_task_info(accepted_sk);
