Commit c0ae035

ethtool: rss: initial RSS_SET (indirection table handling)
Add initial support for RSS_SET, for now only operations on the indirection table are supported.

Unlike the ioctl don't check if at least one parameter is being changed. This is how other ethtool-nl ops behave, so pick the ethtool-nl consistency vs copying ioctl behavior.

There are two special cases here:
 1) resetting the table to defaults;
 2) support for tables of different size.

For (1) I use an empty Netlink attribute (array of size 0).

(2) may require some background. AFAICT a lot of modern devices allow allocating RSS tables of different sizes. mlx5 can upsize its tables, bnxt has some "table size calculation", and Intel folks asked about RSS table sizing in context of resource allocation in the past. The ethtool IOCTL API has a concept of table size, but right now the user is expected to provide a table exactly the size the device requests. Some drivers may change the table size at runtime (in response to queue count changes) but the user is not in control of this. What's not great is that all RSS contexts share the same table size. For example a device with 128 queues enabled, 16 RSS contexts 8 queues in each will likely have 256 entry tables for each of the 16 contexts, while 32 would be more than enough given each context only has 8 queues. To address this the Netlink API should avoid enforcing table size at the uAPI level, and should allow the user to express the min table size they expect.

To fully solve (2) we will need more driver plumbing but at the uAPI level this patch allows the user to specify a table size smaller than what the device advertises. The device table size must be a multiple of the user requested table size. We then replicate the user-provided table to fill the full device size table. This addresses the "allow the user to express the min table size" objective, while not enforcing any fixed size. From Netlink perspective .get_rxfh_indir_size() is now de facto the "max" table size supported by the device.

We may choose to support table replication in ethtool, too, when we actually plumb this thru the device APIs.

Initially I was considering moving full pattern generation to the kernel (which queues to use, at which frequency and what min sequence length). I don't think this complexity would buy us much and most if not all devices have pow-2 table sizes, which simplifies the replication a lot.

Reviewed-by: Gal Pressman <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
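The replication rule in the commit message can be sketched in plain userspace C. This is only a model under stated assumptions: `replicate_indir()` and `replicate_example()` are hypothetical names that do not exist in the kernel, and the 32-entry pattern is an illustration of the "16 contexts, 8 queues each, 256-entry tables" case rather than anything the patch generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the replication rule; the function name
 * and signature are illustrative and are not part of the kernel.
 *
 * Returns 0 on success, -1 if the user table is empty or the device table
 * size is not a multiple of the user table size.
 */
static int replicate_indir(const uint32_t *user, size_t user_size,
			   uint32_t *dev, size_t dev_size)
{
	size_t i;

	if (!user_size || dev_size % user_size)
		return -1;

	/* Repeat the user pattern until the device table is full. */
	for (i = 0; i < dev_size; i++)
		dev[i] = user[i % user_size];

	return 0;
}

/* A context with 8 queues on a device that advertises 256 entries. */
static void replicate_example(void)
{
	uint32_t user[32], dev[256];
	size_t i;

	/* A 32-entry table spreading traffic over queues 0..7. */
	for (i = 0; i < 32; i++)
		user[i] = (uint32_t)(i % 8);

	assert(replicate_indir(user, 32, dev, 256) == 0);
	assert(dev[255] == 7);	/* 255 % 32 = 31, and 31 % 8 = 7 */

	/* 256 is not a multiple of 24, so such a table is refused. */
	assert(replicate_indir(user, 24, dev, 256) == -1);
}
```

Because the device size has to be a multiple of the user size, every queue keeps the same share of entries after replication, which is why power-of-2 table sizes make this simple.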
1 parent 870bc1a commit c0ae035

6 files changed, +242 -1 lines changed

Documentation/netlink/specs/ethtool.yaml

Lines changed: 12 additions & 0 deletions
@@ -2643,6 +2643,18 @@ operations:
           attributes:
             - header
             - events
+    -
+      name: rss-set
+      doc: Set RSS params.
+
+      attribute-set: rss
+
+      do:
+        request:
+          attributes:
+            - header
+            - context
+            - indir
     -
       name: rss-ntf
       doc: |

Documentation/networking/ethtool-netlink.rst

Lines changed: 25 additions & 1 deletion
@@ -239,6 +239,7 @@ Userspace to kernel:
   ``ETHTOOL_MSG_PHY_GET``               get Ethernet PHY information
   ``ETHTOOL_MSG_TSCONFIG_GET``          get hw timestamping configuration
   ``ETHTOOL_MSG_TSCONFIG_SET``          set hw timestamping configuration
+  ``ETHTOOL_MSG_RSS_SET``               set RSS settings
   ===================================== =================================

 Kernel to userspace:
@@ -292,6 +293,7 @@ Kernel to userspace:
   ``ETHTOOL_MSG_TSCONFIG_GET_REPLY``       hw timestamping configuration
   ``ETHTOOL_MSG_TSCONFIG_SET_REPLY``       new hw timestamping configuration
   ``ETHTOOL_MSG_PSE_NTF``                  PSE events notification
+  ``ETHTOOL_MSG_RSS_NTF``                  RSS settings notification
   ======================================== =================================

 ``GET`` requests are sent by userspace applications to retrieve device
@@ -1989,6 +1991,28 @@ hfunc. Current supported options are symmetric-xor and symmetric-or-xor.
 ETHTOOL_A_RSS_FLOW_HASH carries per-flow type bitmask of which header
 fields are included in the hash calculation.

+RSS_SET
+=======
+
+Request contents:
+
+=====================================  ======  ==============================
+  ``ETHTOOL_A_RSS_HEADER``             nested  request header
+  ``ETHTOOL_A_RSS_CONTEXT``            u32     context number
+  ``ETHTOOL_A_RSS_INDIR``              binary  Indir table bytes
+=====================================  ======  ==============================
+
+``ETHTOOL_A_RSS_INDIR`` is the minimal RSS table the user expects. Kernel and
+the device driver may replicate the table if its smaller than smallest table
+size supported by the device. For example if user requests ``[0, 1]`` but the
+device needs at least 8 entries - the real table in use will end up being
+``[0, 1, 0, 1, 0, 1, 0, 1]``. Most devices require the table size to be power
+of 2, so tables which size is not a power of 2 will likely be rejected.
+Using table of size 0 will reset the indirection table to the default.
+
+Note that, at present, only a subset of RSS configuration can be accomplished
+over Netlink.
+
 PLCA_GET_CFG
 ============

@@ -2455,7 +2479,7 @@ are netlink only.
   ``ETHTOOL_GRXNTUPLE``               n/a
   ``ETHTOOL_GSSET_INFO``              ``ETHTOOL_MSG_STRSET_GET``
   ``ETHTOOL_GRXFHINDIR``              ``ETHTOOL_MSG_RSS_GET``
-  ``ETHTOOL_SRXFHINDIR``              n/a
+  ``ETHTOOL_SRXFHINDIR``              ``ETHTOOL_MSG_RSS_SET``
   ``ETHTOOL_GFEATURES``               ``ETHTOOL_MSG_FEATURES_GET``
   ``ETHTOOL_SFEATURES``               ``ETHTOOL_MSG_FEATURES_SET``
   ``ETHTOOL_GCHANNELS``               ``ETHTOOL_MSG_CHANNELS_GET``

include/uapi/linux/ethtool_netlink_generated.h

Lines changed: 1 addition & 0 deletions
@@ -840,6 +840,7 @@ enum {
 	ETHTOOL_MSG_PHY_GET,
 	ETHTOOL_MSG_TSCONFIG_GET,
 	ETHTOOL_MSG_TSCONFIG_SET,
+	ETHTOOL_MSG_RSS_SET,

 	__ETHTOOL_MSG_USER_CNT,
 	ETHTOOL_MSG_USER_MAX = (__ETHTOOL_MSG_USER_CNT - 1)

net/ethtool/netlink.c

Lines changed: 8 additions & 0 deletions
@@ -405,6 +405,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_PSE_GET] = &ethnl_pse_request_ops,
 	[ETHTOOL_MSG_PSE_SET] = &ethnl_pse_request_ops,
 	[ETHTOOL_MSG_RSS_GET] = &ethnl_rss_request_ops,
+	[ETHTOOL_MSG_RSS_SET] = &ethnl_rss_request_ops,
 	[ETHTOOL_MSG_PLCA_GET_CFG] = &ethnl_plca_cfg_request_ops,
 	[ETHTOOL_MSG_PLCA_SET_CFG] = &ethnl_plca_cfg_request_ops,
 	[ETHTOOL_MSG_PLCA_GET_STATUS] = &ethnl_plca_status_request_ops,
@@ -1504,6 +1505,13 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.policy = ethnl_tsconfig_set_policy,
 		.maxattr = ARRAY_SIZE(ethnl_tsconfig_set_policy) - 1,
 	},
+	{
+		.cmd = ETHTOOL_MSG_RSS_SET,
+		.flags = GENL_UNS_ADMIN_PERM,
+		.doit = ethnl_default_set_doit,
+		.policy = ethnl_rss_set_policy,
+		.maxattr = ARRAY_SIZE(ethnl_rss_set_policy) - 1,
+	},
 };

 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {

net/ethtool/netlink.h

Lines changed: 1 addition & 0 deletions
@@ -484,6 +484,7 @@ extern const struct nla_policy ethnl_module_set_policy[ETHTOOL_A_MODULE_POWER_MO
 extern const struct nla_policy ethnl_pse_get_policy[ETHTOOL_A_PSE_HEADER + 1];
 extern const struct nla_policy ethnl_pse_set_policy[ETHTOOL_A_PSE_MAX + 1];
 extern const struct nla_policy ethnl_rss_get_policy[ETHTOOL_A_RSS_START_CONTEXT + 1];
+extern const struct nla_policy ethnl_rss_set_policy[ETHTOOL_A_RSS_START_CONTEXT + 1];
 extern const struct nla_policy ethnl_plca_get_cfg_policy[ETHTOOL_A_PLCA_HEADER + 1];
 extern const struct nla_policy ethnl_plca_set_cfg_policy[ETHTOOL_A_PLCA_MAX + 1];
 extern const struct nla_policy ethnl_plca_get_status_policy[ETHTOOL_A_PLCA_HEADER + 1];

net/ethtool/rss.c

Lines changed: 195 additions & 0 deletions
@@ -218,6 +218,10 @@ rss_prepare(const struct rss_req_info *request, struct net_device *dev,
 {
 	rss_prepare_flow_hash(request, dev, data, info);

+	/* Coming from RSS_SET, driver may only have flow_hash_fields ops */
+	if (!dev->ethtool_ops->get_rxfh)
+		return 0;
+
 	if (request->rss_context)
 		return rss_prepare_ctx(request, dev, data, info);
 	return rss_prepare_get(request, dev, data, info);
@@ -466,6 +470,193 @@ void ethtool_rss_notify(struct net_device *dev, u32 rss_context)
 	ethnl_notify(dev, ETHTOOL_MSG_RSS_NTF, &req_info.base);
 }

+/* RSS_SET */
+
+const struct nla_policy ethnl_rss_set_policy[ETHTOOL_A_RSS_START_CONTEXT + 1] = {
+	[ETHTOOL_A_RSS_HEADER] = NLA_POLICY_NESTED(ethnl_header_policy),
+	[ETHTOOL_A_RSS_CONTEXT] = { .type = NLA_U32, },
+	[ETHTOOL_A_RSS_INDIR] = { .type = NLA_BINARY, },
+};
+
+static int
+ethnl_rss_set_validate(struct ethnl_req_info *req_info, struct genl_info *info)
+{
+	const struct ethtool_ops *ops = req_info->dev->ethtool_ops;
+	struct rss_req_info *request = RSS_REQINFO(req_info);
+	struct nlattr **tb = info->attrs;
+	struct nlattr *bad_attr = NULL;
+
+	if (request->rss_context && !ops->create_rxfh_context)
+		bad_attr = bad_attr ?: tb[ETHTOOL_A_RSS_CONTEXT];
+
+	if (bad_attr) {
+		NL_SET_BAD_ATTR(info->extack, bad_attr);
+		return -EOPNOTSUPP;
+	}
+
+	return 1;
+}
+
+static int
+rss_set_prep_indir(struct net_device *dev, struct genl_info *info,
+		   struct rss_reply_data *data, struct ethtool_rxfh_param *rxfh,
+		   bool *reset, bool *mod)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	struct netlink_ext_ack *extack = info->extack;
+	struct nlattr **tb = info->attrs;
+	struct ethtool_rxnfc rx_rings;
+	size_t alloc_size;
+	u32 user_size;
+	int i, err;
+
+	if (!tb[ETHTOOL_A_RSS_INDIR])
+		return 0;
+	if (!data->indir_size || !ops->get_rxnfc)
+		return -EOPNOTSUPP;
+
+	rx_rings.cmd = ETHTOOL_GRXRINGS;
+	err = ops->get_rxnfc(dev, &rx_rings, NULL);
+	if (err)
+		return err;
+
+	if (nla_len(tb[ETHTOOL_A_RSS_INDIR]) % 4) {
+		NL_SET_BAD_ATTR(info->extack, tb[ETHTOOL_A_RSS_INDIR]);
+		return -EINVAL;
+	}
+	user_size = nla_len(tb[ETHTOOL_A_RSS_INDIR]) / 4;
+	if (!user_size) {
+		if (rxfh->rss_context) {
+			NL_SET_ERR_MSG_ATTR(extack, tb[ETHTOOL_A_RSS_INDIR],
+					    "can't reset table for a context");
+			return -EINVAL;
+		}
+		*reset = true;
+	} else if (data->indir_size % user_size) {
+		NL_SET_ERR_MSG_ATTR_FMT(extack, tb[ETHTOOL_A_RSS_INDIR],
+					"size (%d) mismatch with device indir table (%d)",
+					user_size, data->indir_size);
+		return -EINVAL;
+	}
+
+	rxfh->indir_size = data->indir_size;
+	alloc_size = array_size(data->indir_size, sizeof(rxfh->indir[0]));
+	rxfh->indir = kzalloc(alloc_size, GFP_KERNEL);
+	if (!rxfh->indir)
+		return -ENOMEM;
+
+	nla_memcpy(rxfh->indir, tb[ETHTOOL_A_RSS_INDIR], alloc_size);
+	for (i = 0; i < user_size; i++) {
+		if (rxfh->indir[i] < rx_rings.data)
+			continue;
+
+		NL_SET_ERR_MSG_ATTR_FMT(extack, tb[ETHTOOL_A_RSS_INDIR],
+					"entry %d: queue out of range (%d)",
+					i, rxfh->indir[i]);
+		err = -EINVAL;
+		goto err_free;
+	}
+
+	if (user_size) {
+		/* Replicate the user-provided table to fill the device table */
+		for (i = user_size; i < data->indir_size; i++)
+			rxfh->indir[i] = rxfh->indir[i % user_size];
+	} else {
+		for (i = 0; i < data->indir_size; i++)
+			rxfh->indir[i] =
+				ethtool_rxfh_indir_default(i, rx_rings.data);
+	}
+
+	*mod |= memcmp(rxfh->indir, data->indir_table, data->indir_size);
+
+	return 0;
+
+err_free:
+	kfree(rxfh->indir);
+	rxfh->indir = NULL;
+	return err;
+}
+
+static void
+rss_set_ctx_update(struct ethtool_rxfh_context *ctx, struct nlattr **tb,
+		   struct rss_reply_data *data, struct ethtool_rxfh_param *rxfh)
+{
+	int i;
+
+	if (rxfh->indir) {
+		for (i = 0; i < data->indir_size; i++)
+			ethtool_rxfh_context_indir(ctx)[i] = rxfh->indir[i];
+		ctx->indir_configured = !!nla_len(tb[ETHTOOL_A_RSS_INDIR]);
+	}
+}
+
+static int
+ethnl_rss_set(struct ethnl_req_info *req_info, struct genl_info *info)
+{
+	struct rss_req_info *request = RSS_REQINFO(req_info);
+	struct ethtool_rxfh_context *ctx = NULL;
+	struct net_device *dev = req_info->dev;
+	struct ethtool_rxfh_param rxfh = {};
+	bool indir_reset = false, indir_mod;
+	struct nlattr **tb = info->attrs;
+	struct rss_reply_data data = {};
+	const struct ethtool_ops *ops;
+	bool mod = false;
+	int ret;
+
+	ops = dev->ethtool_ops;
+	data.base.dev = dev;
+
+	ret = rss_prepare(request, dev, &data, info);
+	if (ret)
+		return ret;
+
+	rxfh.rss_context = request->rss_context;
+
+	ret = rss_set_prep_indir(dev, info, &data, &rxfh, &indir_reset, &mod);
+	if (ret)
+		goto exit_clean_data;
+	indir_mod = !!tb[ETHTOOL_A_RSS_INDIR];
+
+	rxfh.hfunc = ETH_RSS_HASH_NO_CHANGE;
+	rxfh.input_xfrm = RXH_XFRM_NO_CHANGE;
+
+	mutex_lock(&dev->ethtool->rss_lock);
+	if (request->rss_context) {
+		ctx = xa_load(&dev->ethtool->rss_ctx, request->rss_context);
+		if (!ctx) {
+			ret = -ENOENT;
+			goto exit_unlock;
+		}
+	}
+
+	if (!mod)
+		ret = 0; /* nothing to tell the driver */
+	else if (!ops->set_rxfh)
+		ret = -EOPNOTSUPP;
+	else if (!rxfh.rss_context)
+		ret = ops->set_rxfh(dev, &rxfh, info->extack);
+	else
+		ret = ops->modify_rxfh_context(dev, ctx, &rxfh, info->extack);
+	if (ret)
+		goto exit_unlock;
+
+	if (ctx)
+		rss_set_ctx_update(ctx, tb, &data, &rxfh);
+	else if (indir_reset)
+		dev->priv_flags &= ~IFF_RXFH_CONFIGURED;
+	else if (indir_mod)
+		dev->priv_flags |= IFF_RXFH_CONFIGURED;
+
+exit_unlock:
+	mutex_unlock(&dev->ethtool->rss_lock);
+	kfree(rxfh.indir);
+exit_clean_data:
+	rss_cleanup_data(&data.base);
+
+	return ret ?: mod;
+}
+
 const struct ethnl_request_ops ethnl_rss_request_ops = {
 	.request_cmd = ETHTOOL_MSG_RSS_GET,
 	.reply_cmd = ETHTOOL_MSG_RSS_GET_REPLY,
@@ -478,4 +669,8 @@ const struct ethnl_request_ops ethnl_rss_request_ops = {
 	.reply_size = rss_reply_size,
 	.fill_reply = rss_fill_reply,
 	.cleanup_data = rss_cleanup_data,
+
+	.set_validate = ethnl_rss_set_validate,
+	.set = ethnl_rss_set,
+	.set_ntf_cmd = ETHTOOL_MSG_RSS_NTF,
 };
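
For readers who want to reason about the request checks outside the kernel, the sketch below models them in userspace C. It is a simplification under stated assumptions, not the kernel implementation: `check_indir_attr()`, `indir_default()` and `reset_example()` are hypothetical names, extack reporting is left out, and the ring count is passed in directly instead of being queried from the driver. The only piece taken as given is the default-table formula, `index % n_rx_rings`, which is what `ethtool_rxfh_indir_default()` computes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same formula as the kernel's ethtool_rxfh_indir_default() helper. */
static uint32_t indir_default(uint32_t index, uint32_t n_rx_rings)
{
	return index % n_rx_rings;
}

/*
 * Hypothetical model of the request checks. attr_len is the attribute
 * payload length in bytes, dev_size the device table size in entries.
 * Returns 0 on success and sets *reset when the table should go back to
 * the default; otherwise returns a negative errno.
 */
static int check_indir_attr(const uint32_t *entries, size_t attr_len,
			    size_t dev_size, uint32_t n_rx_rings,
			    uint32_t rss_context, bool *reset)
{
	size_t user_size, i;

	*reset = false;
	if (!dev_size)
		return -EOPNOTSUPP;	/* device has no indirection table */
	if (attr_len % sizeof(uint32_t))
		return -EINVAL;		/* payload must be whole u32 entries */

	user_size = attr_len / sizeof(uint32_t);
	if (!user_size) {
		if (rss_context)
			return -EINVAL;	/* reset only for the default context */
		*reset = true;
		return 0;
	}
	if (dev_size % user_size)
		return -EINVAL;		/* device size must be a multiple */

	for (i = 0; i < user_size; i++)
		if (entries[i] >= n_rx_rings)
			return -EINVAL;	/* queue out of range */

	return 0;
}

/* Size 0 on the default context resets the table to index % n_rx_rings. */
static void reset_example(void)
{
	uint32_t dev[8];
	bool reset;
	size_t i;

	assert(check_indir_attr(NULL, 0, 8, 3, 0, &reset) == 0 && reset);
	for (i = 0; i < 8; i++)
		dev[i] = indir_default((uint32_t)i, 3);
	assert(dev[3] == 0 && dev[5] == 2);	/* table is 0 1 2 0 1 2 0 1 */
}
```

The order of the checks follows the patch: the byte length is validated first, an empty attribute is treated as a reset request, and only then are the size relationship and per-entry queue numbers examined.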
