Commit 8d6df2c

idosch authored and gregkh committed
nexthop: Make nexthop bucket dump more efficient
commit f10d3d9 upstream.

rtm_dump_nexthop_bucket_nh() is used to dump nexthop buckets belonging to a specific resilient nexthop group. The function returns a positive return code (the skb length) upon both success and failure.

The above behavior is problematic. When a complete nexthop bucket dump is requested, the function that walks the different nexthops treats the non-zero return code as an error. This causes buckets belonging to different resilient nexthop groups to be dumped using different buffers even if they can all fit in the same buffer:

 # ip link add name dummy1 up type dummy
 # ip nexthop add id 1 dev dummy1
 # ip nexthop add id 10 group 1 type resilient buckets 1
 # ip nexthop add id 20 group 1 type resilient buckets 1
 # strace -e recvmsg -s 0 ip nexthop bucket
 [...]
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[...], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 64
 id 10 index 0 idle_time 10.27 nhid 1
 [...]
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[...], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 64
 id 20 index 0 idle_time 6.44 nhid 1
 [...]

Fix by only returning a non-zero return code when an error occurred and restarting the dump from the bucket index we failed to fill in. This allows buckets belonging to different resilient nexthop groups to be dumped using the same buffer:

 # ip link add name dummy1 up type dummy
 # ip nexthop add id 1 dev dummy1
 # ip nexthop add id 10 group 1 type resilient buckets 1
 # ip nexthop add id 20 group 1 type resilient buckets 1
 # strace -e recvmsg -s 0 ip nexthop bucket
 [...]
 recvmsg(3, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[...], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 128
 id 10 index 0 idle_time 30.21 nhid 1
 id 20 index 0 idle_time 26.7 nhid 1
 [...]

While this change is more of a performance improvement change than an actual bug fix, it is a prerequisite for a subsequent patch that does fix a bug.

Fixes: 8a1bbab ("nexthop: Add netlink handlers for bucket dump")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Petr Machata <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
1 parent 0b10d8d commit 8d6df2c

1 file changed, +5 -11 lines changed
net/ipv4/nexthop.c

Lines changed: 5 additions & 11 deletions
@@ -3363,25 +3363,19 @@ static int rtm_dump_nexthop_bucket_nh(struct sk_buff *skb,
 		    dd->filter.res_bucket_nh_id != nhge->nh->id)
 			continue;
 
+		dd->ctx->bucket_index = bucket_index;
 		err = nh_fill_res_bucket(skb, nh, bucket, bucket_index,
 					 RTM_NEWNEXTHOPBUCKET, portid,
 					 cb->nlh->nlmsg_seq, NLM_F_MULTI,
 					 cb->extack);
-		if (err < 0) {
-			if (likely(skb->len))
-				goto out;
-			goto out_err;
-		}
+		if (err)
+			return err;
 	}
 
 	dd->ctx->done_nh_idx = dd->ctx->nh.idx + 1;
-	bucket_index = 0;
+	dd->ctx->bucket_index = 0;
 
-out:
-	err = skb->len;
-out_err:
-	dd->ctx->bucket_index = bucket_index;
-	return err;
+	return 0;
 }
 
 static int rtm_dump_nexthop_bucket_cb(struct sk_buff *skb,
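
The effect of the return-code change can be sketched with a small userspace model. This is not kernel code: `struct buf`, `struct dump_ctx`, `fill_bucket()`, `dump_group_old()`, `dump_group_new()` and `count_buffers()` are hypothetical names invented for illustration, and the walker is assumed to stop filling the current buffer on any non-zero return, as the commit message describes. Each bucket message is assumed to be 64 bytes, matching the strace output.

```c
#include <errno.h>

#define MSG_SIZE 64 /* assumed size of one bucket message */

struct buf { int len; int cap; };                        /* stands in for the skb */
struct dump_ctx { int done_groups; int bucket_index; };  /* stands in for dd->ctx */

typedef int (*dump_group_fn)(struct buf *b, struct dump_ctx *ctx, int nbuckets);

static int fill_bucket(struct buf *b)
{
	if (b->len + MSG_SIZE > b->cap)
		return -EMSGSIZE;
	b->len += MSG_SIZE;
	return 0;
}

/* Old convention: a positive length is returned on success and on partial fill. */
int dump_group_old(struct buf *b, struct dump_ctx *ctx, int nbuckets)
{
	int bucket_index = ctx->bucket_index;
	int err;

	for (; bucket_index < nbuckets; bucket_index++) {
		err = fill_bucket(b);
		if (err < 0) {
			if (b->len)
				goto out;
			goto out_err;
		}
	}
	ctx->done_groups++;
	bucket_index = 0;
out:
	err = b->len;
out_err:
	ctx->bucket_index = bucket_index;
	return err;
}

/* New convention: zero on success, non-zero only when a bucket did not fit. */
int dump_group_new(struct buf *b, struct dump_ctx *ctx, int nbuckets)
{
	int bucket_index, err;

	for (bucket_index = ctx->bucket_index; bucket_index < nbuckets; bucket_index++) {
		ctx->bucket_index = bucket_index; /* resume point if the fill fails */
		err = fill_bucket(b);
		if (err)
			return err;
	}
	ctx->done_groups++;
	ctx->bucket_index = 0;
	return 0;
}

/* Model walker: returns how many buffers the full dump needs, or -1 if stuck. */
int count_buffers(dump_group_fn dump_group, int ngroups, int nbuckets, int cap)
{
	struct dump_ctx ctx = { 0, 0 };
	int buffers = 0;

	while (ctx.done_groups < ngroups) {
		struct buf b = { 0, cap };

		/* Keep adding groups to this buffer until a non-zero return. */
		while (ctx.done_groups < ngroups) {
			if (dump_group(&b, &ctx, nbuckets))
				break;
		}
		if (b.len == 0)
			return -1; /* a single message does not fit at all */
		buffers++;
	}
	return buffers;
}
```

In this model, two groups with one bucket each need two buffers under the old convention and one buffer under the new one, which corresponds to the two 64-byte reads versus the single 128-byte read in the commit message. A group whose buckets do not all fit is resumed in the next buffer from the saved `bucket_index`.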