Commit fbc1449

Merge tag 'mlx5-updates-2023-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5-updates-2023-04-20

1) Dragos improves RX page pool, and provides some fixes to his previous
   series:
 1.1) Fix releasing page_pool for striding RQ and legacy RQ nonlinear case
 1.2) Hook NAPIs to page pools to gain more performance.

2) From Roi, some cleanups to TC and eswitch modules.

3) Maher migrates vnic diagnostic counters reporting from debugfs to a
   dedicated devlink health reporter

Maher Says:
===========
net/mlx5: Expose vnic diagnostic counters using devlink

Currently, vnic diagnostic counters are exposed through the following
debugfs:

$ ls /sys/kernel/debug/mlx5/0000:08:00.0/esw/vf_0/vnic_diag/
cq_overrun
quota_exceeded_command
total_q_under_processor_handle
invalid_command
send_queue_priority_update_flow
nic_receive_steering_discard

The current design does not allow the hypervisor to view the diagnostic
counters of its VFs, in case the VFs get bound to a VM. In other words,
the counters are not exposed for representor interfaces.
Furthermore, the debugfs design is inconvenient future-wise, in case more
counters need to be reported by the driver in the future.

As these counters pertain to vNIC health, it is more appropriate to
utilize the devlink health reporter to expose them.

Thus, this patchset includes the following changes:

* Drop the current vnic diagnostic counters debugfs interface.
* Add a vnic devlink health reporter for PFs/VFs core devices, which
  when diagnosed will dump vnic diagnostic counter values that are
  queried from FW.
* Add a vnic devlink health reporter for the representor interface, which
  serves the same purpose listed in the previous point, in addition to
  allowing the hypervisor to view its VFs diagnostic counters, even when
  the VFs are bound to external VMs.

Example of devlink health reporter usage is:
$ devlink health diagnose pci/0000:08:00.0 reporter vnic
 vNIC env counters:
    total_error_queues: 0 send_queue_priority_update_flow: 0
    comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
    invalid_command: 0 quota_exceeded_command: 0
    nic_receive_steering_discard: 0
===========

4) SW steering fixes and improvements

Yevgeny Kliteynik Says:
=======================
These short patch series are just small fixes / improvements for
SW steering:

 - Patch 1: Fix dumping of legacy modify_hdr in debug dump to
   align to what is expected by parser
 - Patch 2: Have separate threshold for ICM sync per ICM type
 - Patch 3: Add more info to the steering debug dump - Linux
   version and device name
 - Patch 4: Keep track of number of buddies that are currently
   in use per domain per buddy type
=======================

* tag 'mlx5-updates-2023-04-20' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Update op_mode to op_mod for port selection
  net/mlx5: E-Switch, Remove unused mlx5_esw_offloads_vport_metadata_set()
  net/mlx5: E-Switch, Remove redundant dev arg from mlx5_esw_vport_alloc()
  net/mlx5: Include linux/pci.h for pci_msix_can_alloc_dyn()
  net/mlx5e: RX, Hook NAPIs to page pools
  net/mlx5e: RX, Fix XDP_TX page release for legacy rq nonlinear case
  net/mlx5e: RX, Fix releasing page_pool pages twice for striding RQ
  net/mlx5e: Add vnic devlink health reporter to representors
  net/mlx5: Add vnic devlink health reporter to PFs/VFs
  Revert "net/mlx5: Expose vnic diagnostic counters for eswitch managed vports"
  Revert "net/mlx5: Expose steering dropped packets counter"
  net/mlx5: DR, Add memory statistics for domain object
  net/mlx5: DR, Add more info in domain dbg dump
  net/mlx5: DR, Calculate sync threshold of each pool according to its type
  net/mlx5: DR, Fix dumping of legacy modify_hdr in debug dump
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
2 parents 9a82cdc + f9c895a commit fbc1449

20 files changed: +297 -273 lines changed

Documentation/networking/device_drivers/ethernet/mellanox/mlx5/devlink.rst

Lines changed: 33 additions & 0 deletions
@@ -257,3 +257,36 @@ User commands examples:
     $ devlink health dump show pci/0000:82:00.1 reporter fw_fatal
 
 NOTE: This command can run only on PF.
+
+vnic reporter
+-------------
+The vnic reporter implements only the `diagnose` callback.
+It is responsible for querying the vnic diagnostic counters from fw and displaying
+them in realtime.
+
+Description of the vnic counters:
+total_q_under_processor_handle: number of queues in an error state due to
+an async error or errored command.
+send_queue_priority_update_flow: number of QP/SQ priority/SL update
+events.
+cq_overrun: number of times CQ entered an error state due to an
+overflow.
+async_eq_overrun: number of times an EQ mapped to async events was
+overrun.
+comp_eq_overrun: number of times an EQ mapped to completion events was
+overrun.
+quota_exceeded_command: number of commands issued and failed due to quota
+exceeded.
+invalid_command: number of commands issued and failed dues to any reason
+other than quota exceeded.
+nic_receive_steering_discard: number of packets that completed RX flow
+steering but were discarded due to a mismatch in flow table.
+
+User commands examples:
+- Diagnose PF/VF vnic counters
+        $ devlink health diagnose pci/0000:82:00.1 reporter vnic
+- Diagnose representor vnic counters (performed by supplying devlink port of the
+  representor, which can be obtained via devlink port command)
+        $ devlink health diagnose pci/0000:82:00.1/65537 reporter vnic
+
+NOTE: This command can run over all interfaces such as PF/VF and representor ports.

drivers/net/ethernet/mellanox/mlx5/core/Makefile

Lines changed: 2 additions & 2 deletions
@@ -16,7 +16,7 @@ mlx5_core-y := main.o cmd.o debugfs.o fw.o eq.o uar.o pagealloc.o \
 		transobj.o vport.o sriov.o fs_cmd.o fs_core.o pci_irq.o \
 		fs_counters.o fs_ft_pool.o rl.o lag/debugfs.o lag/lag.o dev.o events.o wq.o lib/gid.o \
 		lib/devcom.o lib/pci_vsc.o lib/dm.o lib/fs_ttc.o diag/fs_tracepoint.o \
-		diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o \
+		diag/fw_tracer.o diag/crdump.o devlink.o diag/rsc_dump.o diag/reporter_vnic.o \
 		fw_reset.o qos.o lib/tout.o lib/aso.o
 
 #
@@ -69,7 +69,7 @@ mlx5_core-$(CONFIG_MLX5_TC_SAMPLE) += en/tc/sample.o
 #
 mlx5_core-$(CONFIG_MLX5_ESWITCH) += eswitch.o eswitch_offloads.o eswitch_offloads_termtbl.o \
 				    ecpf.o rdma.o esw/legacy.o \
-				    esw/debugfs.o esw/devlink_port.o esw/vporttbl.o esw/qos.o
+				    esw/devlink_port.o esw/vporttbl.o esw/qos.o
 
 mlx5_core-$(CONFIG_MLX5_ESWITCH) += esw/acl/helper.o \
 				      esw/acl/egress_lgcy.o esw/acl/egress_ofld.o \
Lines changed: 125 additions & 0 deletions
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. */
+
+#include "reporter_vnic.h"
+#include "devlink.h"
+
+#define VNIC_ENV_GET64(vnic_env_stats, c) \
+	MLX5_GET64(query_vnic_env_out, (vnic_env_stats)->query_vnic_env_out, \
+		 vport_env.c)
+
+struct mlx5_vnic_diag_stats {
+	__be64 query_vnic_env_out[MLX5_ST_SZ_QW(query_vnic_env_out)];
+};
+
+int mlx5_reporter_vnic_diagnose_counters(struct mlx5_core_dev *dev,
+					 struct devlink_fmsg *fmsg,
+					 u16 vport_num, bool other_vport)
+{
+	u32 in[MLX5_ST_SZ_DW(query_vnic_env_in)] = {};
+	struct mlx5_vnic_diag_stats vnic;
+	int err;
+
+	MLX5_SET(query_vnic_env_in, in, opcode, MLX5_CMD_OP_QUERY_VNIC_ENV);
+	MLX5_SET(query_vnic_env_in, in, vport_number, vport_num);
+	MLX5_SET(query_vnic_env_in, in, other_vport, !!other_vport);
+
+	err = mlx5_cmd_exec_inout(dev, query_vnic_env, in, &vnic.query_vnic_env_out);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_pair_nest_start(fmsg, "vNIC env counters");
+	if (err)
+		return err;
+
+	err = devlink_fmsg_obj_nest_start(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_u64_pair_put(fmsg, "total_error_queues",
+					VNIC_ENV_GET64(&vnic, total_error_queues));
+	if (err)
+		return err;
+
+	err = devlink_fmsg_u64_pair_put(fmsg, "send_queue_priority_update_flow",
+					VNIC_ENV_GET64(&vnic, send_queue_priority_update_flow));
+	if (err)
+		return err;
+
+	err = devlink_fmsg_u64_pair_put(fmsg, "comp_eq_overrun",
+					VNIC_ENV_GET64(&vnic, comp_eq_overrun));
+	if (err)
+		return err;
+
+	err = devlink_fmsg_u64_pair_put(fmsg, "async_eq_overrun",
+					VNIC_ENV_GET64(&vnic, async_eq_overrun));
+	if (err)
+		return err;
+
+	err = devlink_fmsg_u64_pair_put(fmsg, "cq_overrun",
+					VNIC_ENV_GET64(&vnic, cq_overrun));
+	if (err)
+		return err;
+
+	err = devlink_fmsg_u64_pair_put(fmsg, "invalid_command",
+					VNIC_ENV_GET64(&vnic, invalid_command));
+	if (err)
+		return err;
+
+	err = devlink_fmsg_u64_pair_put(fmsg, "quota_exceeded_command",
+					VNIC_ENV_GET64(&vnic, quota_exceeded_command));
+	if (err)
+		return err;
+
+	err = devlink_fmsg_u64_pair_put(fmsg, "nic_receive_steering_discard",
+					VNIC_ENV_GET64(&vnic, nic_receive_steering_discard));
+	if (err)
+		return err;
+
+	err = devlink_fmsg_obj_nest_end(fmsg);
+	if (err)
+		return err;
+
+	err = devlink_fmsg_pair_nest_end(fmsg);
+	if (err)
+		return err;
+
+	return 0;
+}
+
+static int mlx5_reporter_vnic_diagnose(struct devlink_health_reporter *reporter,
+				       struct devlink_fmsg *fmsg,
+				       struct netlink_ext_ack *extack)
+{
+	struct mlx5_core_dev *dev = devlink_health_reporter_priv(reporter);
+
+	return mlx5_reporter_vnic_diagnose_counters(dev, fmsg, 0, false);
+}
+
+static const struct devlink_health_reporter_ops mlx5_reporter_vnic_ops = {
+	.name = "vnic",
+	.diagnose = mlx5_reporter_vnic_diagnose,
+};
+
+void mlx5_reporter_vnic_create(struct mlx5_core_dev *dev)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+	struct devlink *devlink = priv_to_devlink(dev);
+
+	health->vnic_reporter =
+		devlink_health_reporter_create(devlink,
+					       &mlx5_reporter_vnic_ops,
+					       0, dev);
+	if (IS_ERR(health->vnic_reporter))
+		mlx5_core_warn(dev,
+			       "Failed to create vnic reporter, err = %ld\n",
+			       PTR_ERR(health->vnic_reporter));
+}
+
+void mlx5_reporter_vnic_destroy(struct mlx5_core_dev *dev)
+{
+	struct mlx5_core_health *health = &dev->priv.health;
+
+	if (!IS_ERR_OR_NULL(health->vnic_reporter))
+		devlink_health_reporter_destroy(health->vnic_reporter);
+}
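Reviewer note: the query output is declared as an array of `__be64`, and `VNIC_ENV_GET64` goes through `MLX5_GET64`, which reads a 64-bit big-endian field from the firmware mailbox and converts it to host byte order. The decoding step can be sketched in plain userspace C as follows. The helper name `get_be64` and the idea of addressing fields by qword index are assumptions made for this illustration; the driver derives the offset from its interface layout macros instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the commit): decode the big-endian
 * 64-bit value stored at qword position `qword_index` of a raw buffer.
 * Assembling the value byte by byte gives the same result on little-
 * and big-endian hosts.
 */
static uint64_t get_be64(const uint8_t *buf, size_t qword_index)
{
	const uint8_t *p;
	uint64_t value = 0;
	size_t i;

	assert(buf != NULL);
	p = buf + qword_index * 8;

	for (i = 0; i < 8; i++)
		value = (value << 8) | p[i];	/* most significant byte first */

	return value;
}
```

A byte-wise decode avoids both alignment and host-endianness concerns, which is why it is a convenient model when inspecting a captured mailbox dump outside the kernel.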
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+ * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES.
+ */
+#ifndef __MLX5_REPORTER_VNIC_H
+#define __MLX5_REPORTER_VNIC_H
+
+#include "mlx5_core.h"
+
+void mlx5_reporter_vnic_create(struct mlx5_core_dev *dev);
+void mlx5_reporter_vnic_destroy(struct mlx5_core_dev *dev);
+
+int mlx5_reporter_vnic_diagnose_counters(struct mlx5_core_dev *dev,
+					 struct devlink_fmsg *fmsg,
+					 u16 vport_num, bool other_vport);
+
+#endif /* __MLX5_REPORTER_VNIC_H */
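Reviewer note: `mlx5_reporter_vnic_diagnose_counters()` emits eight name/value pairs with the same put-then-check-error pattern each time. One possible alternative shape, which is not what the commit does, is a descriptor table walked in a loop. The sketch below models that in userspace C under stated assumptions: `struct vnic_counters`, `struct toy_fmsg`, and `toy_fmsg_u64_pair_put()` are hypothetical stand-ins for the decoded firmware output and for `devlink_fmsg_u64_pair_put()`, and the text output format is invented for the demo.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical decoded snapshot; field names mirror the reported keys. */
struct vnic_counters {
	uint64_t total_error_queues;
	uint64_t send_queue_priority_update_flow;
	uint64_t comp_eq_overrun;
	uint64_t async_eq_overrun;
	uint64_t cq_overrun;
	uint64_t invalid_command;
	uint64_t quota_exceeded_command;
	uint64_t nic_receive_steering_discard;
};

struct counter_desc {
	const char *name;	/* key shown to the user */
	size_t offset;		/* position of the value inside the snapshot */
};

#define VNIC_COUNTER(field) { #field, offsetof(struct vnic_counters, field) }

/* Same order as the calls in the diff. */
static const struct counter_desc vnic_counter_descs[] = {
	VNIC_COUNTER(total_error_queues),
	VNIC_COUNTER(send_queue_priority_update_flow),
	VNIC_COUNTER(comp_eq_overrun),
	VNIC_COUNTER(async_eq_overrun),
	VNIC_COUNTER(cq_overrun),
	VNIC_COUNTER(invalid_command),
	VNIC_COUNTER(quota_exceeded_command),
	VNIC_COUNTER(nic_receive_steering_discard),
};

/* Toy message buffer standing in for struct devlink_fmsg. */
struct toy_fmsg {
	char buf[512];
	size_t len;
};

/* Append "name: value\n"; returns 0 on success, -1 if it does not fit. */
static int toy_fmsg_u64_pair_put(struct toy_fmsg *msg, const char *name,
				 uint64_t value)
{
	size_t room = sizeof(msg->buf) - msg->len;
	int n = snprintf(msg->buf + msg->len, room, "%s: %llu\n", name,
			 (unsigned long long)value);

	if (n < 0 || (size_t)n >= room)
		return -1;
	msg->len += (size_t)n;
	return 0;
}

/* Walk the table and stop at the first error, like the open-coded chain. */
static int emit_vnic_counters(struct toy_fmsg *msg,
			      const struct vnic_counters *counters)
{
	size_t n = sizeof(vnic_counter_descs) / sizeof(vnic_counter_descs[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t value;
		int err;

		memcpy(&value,
		       (const char *)counters + vnic_counter_descs[i].offset,
		       sizeof(value));
		err = toy_fmsg_u64_pair_put(msg, vnic_counter_descs[i].name,
					    value);
		if (err)
			return err;
	}
	return 0;
}
```

The trade-off is the usual one: the table keeps names and fields in one place and makes adding a counter a one-line change, while the open-coded version in the commit is easier to grep and step through.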

drivers/net/ethernet/mellanox/mlx5/core/en_main.c

Lines changed: 1 addition & 0 deletions
@@ -857,6 +857,7 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params,
 		pp_params.pool_size = pool_size;
 		pp_params.nid = node;
 		pp_params.dev = rq->pdev;
+		pp_params.napi = rq->cq.napi;
 		pp_params.dma_dir = rq->buff.map_dir;
 		pp_params.max_len = PAGE_SIZE;
 
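Reviewer note: the commit message only says that hooking NAPIs to page pools gains performance. My understanding, stated here as an assumption rather than something this page shows, is that once the pool knows which NAPI instance consumes it, pages freed from that same NAPI context can be recycled into a lockless per-pool cache instead of going through a shared, synchronized return path. The toy model below illustrates only that decision; `struct toy_pool`, `toy_pool_put()`, the integer context ids, and the cache size are all hypothetical and do not correspond to the kernel's page_pool API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SIZE 4	/* hypothetical size of the owner-only cache */

struct toy_pool {
	int owner_napi;			/* id of the hooked NAPI context, -1 if none */
	void *cache[TOY_CACHE_SIZE];	/* fast path: touched by the owner only */
	size_t cache_count;
	size_t shared_returns;		/* pages that took the slower shared path */
};

/*
 * Return a page to the pool. The fast path is taken only when the caller
 * runs in the NAPI context the pool is hooked to and the cache has room.
 */
static void toy_pool_put(struct toy_pool *pool, void *page, int current_napi)
{
	assert(pool != NULL);

	if (pool->owner_napi >= 0 && current_napi == pool->owner_napi &&
	    pool->cache_count < TOY_CACHE_SIZE) {
		pool->cache[pool->cache_count++] = page;
		return;
	}

	/* Stand-in for the synchronized return path (ring, locks, atomics). */
	pool->shared_returns++;
}
```

In this model a pool created without a hooked context (`owner_napi == -1`) can never take the fast path, which is the situation the one-line `pp_params.napi` assignment is meant to avoid.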

drivers/net/ethernet/mellanox/mlx5/core/en_rep.c

Lines changed: 50 additions & 2 deletions
@@ -53,6 +53,7 @@
 #include "lib/vxlan.h"
 #define CREATE_TRACE_POINTS
 #include "diag/en_rep_tracepoint.h"
+#include "diag/reporter_vnic.h"
 #include "en_accel/ipsec.h"
 #include "en/tc/int_port.h"
 #include "en/ptp.h"
@@ -1294,6 +1295,50 @@ static unsigned int mlx5e_ul_rep_stats_grps_num(struct mlx5e_priv *priv)
 	return ARRAY_SIZE(mlx5e_ul_rep_stats_grps);
 }
 
+static int
+mlx5e_rep_vnic_reporter_diagnose(struct devlink_health_reporter *reporter,
+				 struct devlink_fmsg *fmsg,
+				 struct netlink_ext_ack *extack)
+{
+	struct mlx5e_rep_priv *rpriv = devlink_health_reporter_priv(reporter);
+	struct mlx5_eswitch_rep *rep = rpriv->rep;
+
+	return mlx5_reporter_vnic_diagnose_counters(rep->esw->dev, fmsg,
+						    rep->vport, true);
+}
+
+static const struct devlink_health_reporter_ops mlx5_rep_vnic_reporter_ops = {
+	.name = "vnic",
+	.diagnose = mlx5e_rep_vnic_reporter_diagnose,
+};
+
+static void mlx5e_rep_vnic_reporter_create(struct mlx5e_priv *priv,
+					   struct devlink_port *dl_port)
+{
+	struct mlx5e_rep_priv *rpriv = priv->ppriv;
+	struct devlink_health_reporter *reporter;
+
+	reporter = devl_port_health_reporter_create(dl_port,
+						    &mlx5_rep_vnic_reporter_ops,
+						    0, rpriv);
+	if (IS_ERR(reporter)) {
+		mlx5_core_err(priv->mdev,
+			      "Failed to create representor vnic reporter, err = %ld\n",
+			      PTR_ERR(reporter));
+		return;
+	}
+
+	rpriv->rep_vnic_reporter = reporter;
+}
+
+static void mlx5e_rep_vnic_reporter_destroy(struct mlx5e_priv *priv)
+{
+	struct mlx5e_rep_priv *rpriv = priv->ppriv;
+
+	if (!IS_ERR_OR_NULL(rpriv->rep_vnic_reporter))
+		devl_health_reporter_destroy(rpriv->rep_vnic_reporter);
+}
+
 static const struct mlx5e_profile mlx5e_rep_profile = {
 	.init = mlx5e_init_rep,
 	.cleanup = mlx5e_cleanup_rep,
@@ -1394,8 +1439,10 @@ mlx5e_vport_vf_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
 
 	dl_port = mlx5_esw_offloads_devlink_port(dev->priv.eswitch,
 						 rpriv->rep->vport);
-	if (dl_port)
+	if (dl_port) {
 		SET_NETDEV_DEVLINK_PORT(netdev, dl_port);
+		mlx5e_rep_vnic_reporter_create(priv, dl_port);
+	}
 
 	err = register_netdev(netdev);
 	if (err) {
@@ -1408,8 +1455,8 @@ mlx5e_vport_vf_rep_load(struct mlx5_core_dev *dev, struct mlx5_eswitch_rep *rep)
 	return 0;
 
 err_detach_netdev:
+	mlx5e_rep_vnic_reporter_destroy(priv);
 	mlx5e_detach_netdev(netdev_priv(netdev));
-
 err_cleanup_profile:
 	priv->profile->cleanup(priv);
 
@@ -1458,6 +1505,7 @@ mlx5e_vport_rep_unload(struct mlx5_eswitch_rep *rep)
 	}
 
 	unregister_netdev(netdev);
+	mlx5e_rep_vnic_reporter_destroy(priv);
 	mlx5e_detach_netdev(priv);
 	priv->profile->cleanup(priv);
 	mlx5e_destroy_netdev(priv);
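Reviewer note: both destroy helpers in this commit guard with `IS_ERR_OR_NULL()`. That covers the two failure shapes visible in the diff: the core-device path stores the creation result even when it is an error pointer, while the representor path assigns `rep_vnic_reporter` only on success, so the field presumably stays NULL on failure (this assumes `rpriv` is zero-initialised, which this page does not show). The userspace model below reproduces the error-pointer convention to make the guard concrete. The lowercase helpers and the `toy_` types are stand-ins written for this sketch, not the kernel's definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ERRNO 4095	/* errno values live in the top 4095 addresses */

/* Encode a negative errno as a pointer value. */
static inline void *toy_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the negative errno from an error pointer. */
static inline long toy_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True when the pointer value falls in the reserved error range. */
static inline int toy_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

static inline int toy_is_err_or_null(const void *ptr)
{
	return !ptr || toy_is_err(ptr);
}

struct toy_reporter {
	int unused;
};

struct toy_rep_priv {
	struct toy_reporter *rep_vnic_reporter;
};

static int toy_destroy_calls;	/* counts real destroy invocations */

static void toy_reporter_destroy(struct toy_reporter *reporter)
{
	(void)reporter;
	toy_destroy_calls++;
}

/* Mirrors the guard in the diff: destroy only a valid reporter. */
static void toy_rep_vnic_reporter_destroy(struct toy_rep_priv *rpriv)
{
	if (!toy_is_err_or_null(rpriv->rep_vnic_reporter))
		toy_reporter_destroy(rpriv->rep_vnic_reporter);
}
```

Because the guard tolerates both NULL and error pointers, the destroy helper can be called unconditionally from the error label and from the unload path, including when no devlink port was found and the reporter was never created.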

drivers/net/ethernet/mellanox/mlx5/core/en_rep.h

Lines changed: 1 addition & 0 deletions
@@ -118,6 +118,7 @@ struct mlx5e_rep_priv {
 	struct rtnl_link_stats64 prev_vf_vport_stats;
 	struct mlx5_flow_handle *send_to_vport_meta_rule;
 	struct rhashtable tc_ht;
+	struct devlink_health_reporter *rep_vnic_reporter;
 };
 
 static inline

drivers/net/ethernet/mellanox/mlx5/core/en_rx.c

Lines changed: 8 additions & 3 deletions
@@ -861,6 +861,11 @@ static void mlx5e_dealloc_rx_mpwqe(struct mlx5e_rq *rq, u16 ix)
 	struct mlx5e_mpw_info *wi = mlx5e_get_mpw_info(rq, ix);
 	/* This function is called on rq/netdev close. */
 	mlx5e_free_rx_mpwqe(rq, wi);
+
+	/* Avoid a second release of the wqe pages: dealloc is called also
+	 * for missing wqes on an already flushed RQ.
+	 */
+	bitmap_fill(wi->skip_release_bitmap, rq->mpwqe.pages_per_wqe);
 }
 
 INDIRECT_CALLABLE_SCOPE bool mlx5e_post_rx_wqes(struct mlx5e_rq *rq)
@@ -1741,10 +1746,10 @@ mlx5e_skb_from_cqe_nonlinear(struct mlx5e_rq *rq, struct mlx5e_wqe_frag_info *wi
 	prog = rcu_dereference(rq->xdp_prog);
 	if (prog && mlx5e_xdp_handle(rq, prog, &mxbuf)) {
 		if (test_bit(MLX5E_RQ_FLAG_XDP_XMIT, rq->flags)) {
-			int i;
+			struct mlx5e_wqe_frag_info *pwi;
 
-			for (i = wi - head_wi; i < rq->wqe.info.num_frags; i++)
-				mlx5e_put_rx_frag(rq, &head_wi[i]);
+			for (pwi = head_wi; pwi < wi; pwi++)
+				pwi->flags |= BIT(MLX5E_WQE_FRAG_SKIP_RELEASE);
 		}
 		return NULL; /* page/packet was consumed by XDP */
 	}
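Reviewer note: the first hunk makes the striding-RQ dealloc idempotent by filling `skip_release_bitmap` after the pages are freed, so a later dealloc of the same WQE releases nothing. The sketch below models that idea in userspace C. It assumes, without this page showing it, that the free routine skips every page whose bit is set; the `toy_` names, the fixed page count, and the use of a plain reference counter per page are all invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGES_PER_WQE 8	/* hypothetical; the driver computes this per RQ */

struct toy_wqe_info {
	int page_refs[TOY_PAGES_PER_WQE];	/* references this WQE holds per page */
	uint32_t skip_release;			/* bit i set: do not release page i */
};

/* Release every page whose skip bit is clear. */
static void toy_free_wqe(struct toy_wqe_info *wi)
{
	unsigned int i;

	for (i = 0; i < TOY_PAGES_PER_WQE; i++) {
		if (wi->skip_release & (1u << i))
			continue;
		assert(wi->page_refs[i] > 0);	/* would be a double release */
		wi->page_refs[i]--;
	}
}

/*
 * Free once, then mark all pages as "skip" so that a repeated dealloc of
 * the same WQE becomes a no-op instead of a second release.
 */
static void toy_dealloc_wqe(struct toy_wqe_info *wi)
{
	toy_free_wqe(wi);
	wi->skip_release = (1u << TOY_PAGES_PER_WQE) - 1;
}
```

The second hunk applies the same ownership idea to the legacy RQ: instead of putting fragments immediately after XDP consumed them, it flags the fragments with `MLX5E_WQE_FRAG_SKIP_RELEASE` so the regular release path leaves them alone.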
