Commit 288e21b

Merge branch 'for-next/perf' into for-next/core
* for-next/perf:
  drivers/perf: arm_spe: Fix consistency of SYS_PMSCR_EL1.CX
  perf: RISC-V: Add of_node_put() when breaking out of for_each_of_cpu_node()
  docs: perf: Include hns3-pmu.rst in toctree to fix 'htmldocs' WARNING
  drivers/perf: hisi: add driver for HNS3 PMU
  drivers/perf: hisi: Add description for HNS3 PMU driver
  drivers/perf: riscv_pmu_sbi: perf format
  perf/arm-cci: Use the bitmap API to allocate bitmaps
  drivers/perf: riscv_pmu: Add riscv pmu pm notifier
  perf: hisi: Extract hisi_pmu_init
  perf/marvell_cn10k: Fix TAD PMU register offset
  perf/marvell_cn10k: Remove useless license text when SPDX-License-Identifier is already used
  arm64: cpufeature: Allow different PMU versions in ID_DFR0_EL1
  perf/arm-cci: fix typo in comment
  drivers/perf:Directly use ida_alloc()/free()
  drivers/perf: Directly use ida_alloc()/free()
2 parents c436500 + 92f2b8b commit 288e21b

23 files changed, +1993 -105 lines changed
Lines changed: 136 additions & 0 deletions
@@ -0,0 +1,136 @@
+======================================
+HNS3 Performance Monitoring Unit (PMU)
+======================================
+
+The HNS3 (HiSilicon network system 3) Performance Monitoring Unit (PMU) is an
+End Point device that collects performance statistics of the HiSilicon SoC NIC.
+On Hip09, each SICL (Super I/O cluster) has one PMU device.
+
+The HNS3 PMU supports collection of performance statistics such as bandwidth,
+latency, packet rate and interrupt rate.
+
+Each HNS3 PMU supports 8 hardware events.
+
+HNS3 PMU driver
+===============
+
+The HNS3 PMU driver registers a perf PMU named after its SICL id::
+
+  /sys/devices/hns3_pmu_sicl_<sicl_id>
+
+The PMU driver describes the available events, filter modes, format,
+identifier and cpumask in sysfs.
+
+The "events" directory describes the event code of all supported events
+shown in perf list.
+
+The "filtermode" directory describes the supported filter modes of each
+event.
+
+The "format" directory describes all formats of the config (events) and
+config1 (filter options) fields of the perf_event_attr structure.
+
+The "identifier" file shows the version of the PMU hardware device.
+
+The "bdf_min" and "bdf_max" files show the supported BDF range of each
+PMU device.
+
+The "hw_clk_freq" file shows the hardware clock frequency of each PMU
+device.
+
+Example usage of checking event code and subevent code::
+
+  $# cat /sys/devices/hns3_pmu_sicl_0/events/dly_tx_normal_to_mac_time
+  config=0x00204
+  $# cat /sys/devices/hns3_pmu_sicl_0/events/dly_tx_normal_to_mac_packet_num
+  config=0x10204
+
+Each performance statistic is exposed as a pair of events; userspace reads
+both values and computes the actual performance data from them.
+
+Bits 0~15 of config (here 0x0204) are the hardware event code. Two events
+with the same value in bits 0~15 of config form an
+event pair. Bit 16 of config selects counter 0 or
+counter 1 of the hardware event.
+
+After reading both values of an event pair in userspace, the actual
+performance data is computed as::
+
+  counter 0 / counter 1
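
Review note: the config layout and the pair formula described above can be sketched in userspace C as follows. This is only an illustration of the documented bit layout, not code from the driver. All function names are hypothetical, the requirement that the two configs of a pair differ in bit 16 is an assumption drawn from the example values, and returning 0.0 when counter 1 is zero is a choice made here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 0~15 of config: the hardware event code (0x0204 in the example). */
static inline uint16_t hns3_event_code(uint32_t config)
{
	return (uint16_t)(config & 0xFFFF);
}

/* Bit 16 of config: selects counter 0 or counter 1 of the hardware event. */
static inline unsigned int hns3_counter_index(uint32_t config)
{
	return (config >> 16) & 0x1;
}

/*
 * Two configs are treated as an event pair when they share the event code.
 * Assumption: the two members also select different counters (bit 16).
 */
static inline bool hns3_is_event_pair(uint32_t a, uint32_t b)
{
	return hns3_event_code(a) == hns3_event_code(b) &&
	       hns3_counter_index(a) != hns3_counter_index(b);
}

/* counter 0 / counter 1; yields 0.0 when counter 1 is zero (our choice). */
static inline double hns3_pair_ratio(uint64_t counter0, uint64_t counter1)
{
	return counter1 ? (double)counter0 / (double)counter1 : 0.0;
}
```

With the two sysfs values shown above, `hns3_is_event_pair(0x00204, 0x10204)` holds, and the two counts read by `perf stat` would be passed to `hns3_pair_ratio()`.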
+
+Example usage of checking supported filter mode::
+
+  $# cat /sys/devices/hns3_pmu_sicl_0/filtermode/bw_ssu_rpu_byte_num
+  filter mode supported: global/port/port-tc/func/func-queue/
+
+Example usage of perf::
+
+  $# perf list
+  hns3_pmu_sicl_0/bw_ssu_rpu_byte_num/ [kernel PMU event]
+  hns3_pmu_sicl_0/bw_ssu_rpu_time/ [kernel PMU event]
+  ------------------------------------------
+
+  $# perf stat -g -e hns3_pmu_sicl_0/bw_ssu_rpu_byte_num,global=1/ -e hns3_pmu_sicl_0/bw_ssu_rpu_time,global=1/ -I 1000
+  or
+  $# perf stat -g -e hns3_pmu_sicl_0/config=0x00002,global=1/ -e hns3_pmu_sicl_0/config=0x10002,global=1/ -I 1000
+
+
+Filter modes
+--------------
+
+1. global mode
+The PMU collects performance statistics for all HNS3 PCIe functions of the IO DIE.
+Setting the "global" filter option to 1 enables this mode.
+Example usage of perf::
+
+  $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,global=1/ -I 1000
+
+2. port mode
+The PMU collects performance statistics of one whole physical port. The port id
+is the same as the mac id. The "tc" filter option must be set to 0xF in this mode;
+here tc stands for traffic class.
+
+Example usage of perf::
+
+  $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,port=0,tc=0xF/ -I 1000
+
+3. port-tc mode
+The PMU collects performance statistics of one tc of a physical port. The port id
+is the same as the mac id. The "tc" filter option must be set to 0 ~ 7 in this
+mode.
+Example usage of perf::
+
+  $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,port=0,tc=0/ -I 1000
+
+4. func mode
+The PMU collects performance statistics of one PF/VF. The function id is the BDF of
+the PF/VF; its conversion formula is::
+
+  func = (bus << 8) + (device << 3) + (function)
+
+for example:
+  BDF         func
+  35:00.0    0x3500
+  35:00.1    0x3501
+  35:01.0    0x3508
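
Review note: the BDF conversion formula above can be written as a small C helper. This is a sketch for illustration only. The function name is hypothetical, and masking device to 5 bits and function to 3 bits is added here on the basis of the standard PCI field widths, not taken from the document.

```c
#include <stdint.h>

/*
 * func = (bus << 8) + (device << 3) + (function)
 * device and function are masked to their PCI field widths (5 and 3 bits)
 * so that out-of-range inputs cannot spill into neighbouring fields.
 */
static inline uint16_t hns3_bdf_to_func(uint8_t bus, uint8_t device,
					uint8_t function)
{
	return (uint16_t)(((uint16_t)bus << 8) +
			  ((uint16_t)(device & 0x1F) << 3) +
			  (uint16_t)(function & 0x7));
}
```

The three rows of the example table correspond to `hns3_bdf_to_func(0x35, 0, 0)`, `hns3_bdf_to_func(0x35, 0, 1)` and `hns3_bdf_to_func(0x35, 1, 0)`; the result is the value passed as the "bdf" filter option.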
+
+In this mode, the "queue" filter option must be set to 0xFFFF.
+Example usage of perf::
+
+  $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,bdf=0x3500,queue=0xFFFF/ -I 1000
+
+5. func-queue mode
+The PMU collects performance statistics of one queue of a PF/VF. The function id
+is the BDF of the PF/VF; the "queue" filter option must be set to the exact queue
+id of the function.
+Example usage of perf::
+
+  $# perf stat -a -e hns3_pmu_sicl_0/config=0x1020F,bdf=0x3500,queue=0/ -I 1000
+
+6. func-intr mode
+The PMU collects performance statistics of one interrupt of a PF/VF. The function
+id is the BDF of the PF/VF; the "intr" filter option must be set to the exact
+interrupt id of the function.
+Example usage of perf::
+
+  $# perf stat -a -e hns3_pmu_sicl_0/config=0x00301,bdf=0x3500,intr=0/ -I 1000
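
Review note: the per-mode constraints on the filter options listed above can be summarised as a small userspace check. This is a hedged sketch of the rules as the text states them, not the validation the driver performs. The enum, the function name and the argument layout are hypothetical, and treating 0xFFFF as invalid in func-queue mode is an assumption (the text only says the exact queue id must be given).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the six filter modes described in the text. */
enum hns3_filter_mode {
	HNS3_MODE_GLOBAL,
	HNS3_MODE_PORT,
	HNS3_MODE_PORT_TC,
	HNS3_MODE_FUNC,
	HNS3_MODE_FUNC_QUEUE,
	HNS3_MODE_FUNC_INTR,
};

/* Check the "global", "tc" and "queue" options against the stated rules. */
static inline bool hns3_filter_opts_ok(enum hns3_filter_mode mode,
				       uint32_t global, uint32_t tc,
				       uint32_t queue)
{
	switch (mode) {
	case HNS3_MODE_GLOBAL:
		return global == 1;	/* "global" set to 1 */
	case HNS3_MODE_PORT:
		return tc == 0xF;	/* whole port: tc must be 0xF */
	case HNS3_MODE_PORT_TC:
		return tc <= 7;		/* one traffic class: tc in 0 ~ 7 */
	case HNS3_MODE_FUNC:
		return queue == 0xFFFF;	/* whole PF/VF: queue must be 0xFFFF */
	case HNS3_MODE_FUNC_QUEUE:
		return queue != 0xFFFF;	/* assumption: an exact queue id */
	case HNS3_MODE_FUNC_INTR:
		return true;		/* no further constraint is stated */
	}
	return false;
}
```

For instance, `tc=0xF` passes the port-mode check but fails the port-tc check, which matches the two perf examples for modes 2 and 3.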

Documentation/admin-guide/perf/index.rst

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ Performance monitor support

    hisi-pmu
    hisi-pcie-pmu
+   hns3-pmu
    imx-ddr
    qcom_l2_pmu
    qcom_l3_pmu

MAINTAINERS

Lines changed: 6 additions & 0 deletions
@@ -8944,6 +8944,12 @@ F: Documentation/admin-guide/perf/hisi-pcie-pmu.rst
 F:	Documentation/admin-guide/perf/hisi-pmu.rst
 F:	drivers/perf/hisilicon

+HISILICON HNS3 PMU DRIVER
+M:	Guangbin Huang <[email protected]>
+S:	Supported
+F:	Documentation/admin-guide/perf/hns3-pmu.rst
+F:	drivers/perf/hisilicon/hns3_pmu.c
+
 HISILICON QM AND ZIP Controller DRIVER
 M:	Zhou Wang <[email protected]>

arch/arm64/kernel/cpufeature.c

Lines changed: 1 addition & 1 deletion
@@ -562,7 +562,7 @@ static const struct arm64_ftr_bits ftr_id_pfr2[] = {

 static const struct arm64_ftr_bits ftr_id_dfr0[] = {
 	/* [31:28] TraceFilt */
-	S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_PERFMON_SHIFT, 4, 0xf),
+	S_ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_EXACT, ID_DFR0_PERFMON_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_MPROFDBG_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_MMAPTRC_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_DFR0_COPTRC_SHIFT, 4, 0),

drivers/perf/arm-cci.c

Lines changed: 5 additions & 6 deletions
@@ -1139,7 +1139,7 @@ static void cci_pmu_start(struct perf_event *event, int pmu_flags)

 	/*
 	 * To handle interrupt latency, we always reprogram the period
-	 * regardlesss of PERF_EF_RELOAD.
+	 * regardless of PERF_EF_RELOAD.
 	 */
 	if (pmu_flags & PERF_EF_RELOAD)
 		WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
@@ -1261,7 +1261,7 @@ static int validate_group(struct perf_event *event)
 		 */
 		.used_mask = mask,
 	};
-	memset(mask, 0, BITS_TO_LONGS(cci_pmu->num_cntrs) * sizeof(unsigned long));
+	bitmap_zero(mask, cci_pmu->num_cntrs);

 	if (!validate_event(event->pmu, &fake_pmu, leader))
 		return -EINVAL;
@@ -1629,10 +1629,9 @@ static struct cci_pmu *cci_pmu_alloc(struct device *dev)
 					     GFP_KERNEL);
 	if (!cci_pmu->hw_events.events)
 		return ERR_PTR(-ENOMEM);
-	cci_pmu->hw_events.used_mask = devm_kcalloc(dev,
-						BITS_TO_LONGS(CCI_PMU_MAX_HW_CNTRS(model)),
-						sizeof(*cci_pmu->hw_events.used_mask),
-						GFP_KERNEL);
+	cci_pmu->hw_events.used_mask = devm_bitmap_zalloc(dev,
+							  CCI_PMU_MAX_HW_CNTRS(model),
+							  GFP_KERNEL);
 	if (!cci_pmu->hw_events.used_mask)
 		return ERR_PTR(-ENOMEM);

drivers/perf/arm-ccn.c

Lines changed: 3 additions & 3 deletions
@@ -1250,7 +1250,7 @@ static int arm_ccn_pmu_init(struct arm_ccn *ccn)
 	ccn->dt.cmp_mask[CCN_IDX_MASK_OPCODE].h = ~(0x1f << 9);

 	/* Get a convenient /sys/event_source/devices/ name */
-	ccn->dt.id = ida_simple_get(&arm_ccn_pmu_ida, 0, 0, GFP_KERNEL);
+	ccn->dt.id = ida_alloc(&arm_ccn_pmu_ida, GFP_KERNEL);
 	if (ccn->dt.id == 0) {
 		name = "ccn";
 	} else {
@@ -1312,7 +1312,7 @@ static int arm_ccn_pmu_init(struct arm_ccn *ccn)
 					    &ccn->dt.node);
 error_set_affinity:
 error_choose_name:
-	ida_simple_remove(&arm_ccn_pmu_ida, ccn->dt.id);
+	ida_free(&arm_ccn_pmu_ida, ccn->dt.id);
 	for (i = 0; i < ccn->num_xps; i++)
 		writel(0, ccn->xp[i].base + CCN_XP_DT_CONTROL);
 	writel(0, ccn->dt.base + CCN_DT_PMCR);
@@ -1329,7 +1329,7 @@ static void arm_ccn_pmu_cleanup(struct arm_ccn *ccn)
 		writel(0, ccn->xp[i].base + CCN_XP_DT_CONTROL);
 	writel(0, ccn->dt.base + CCN_DT_PMCR);
 	perf_pmu_unregister(&ccn->dt.pmu);
-	ida_simple_remove(&arm_ccn_pmu_ida, ccn->dt.id);
+	ida_free(&arm_ccn_pmu_ida, ccn->dt.id);
 }

 static int arm_ccn_for_each_valid_region(struct arm_ccn *ccn,

drivers/perf/arm_spe_pmu.c

Lines changed: 20 additions & 2 deletions
@@ -39,6 +39,24 @@
 #include <asm/mmu.h>
 #include <asm/sysreg.h>

+/*
+ * Cache if the event is allowed to trace Context information.
+ * This allows us to perform the check, i.e, perfmon_capable(),
+ * in the context of the event owner, once, during the event_init().
+ */
+#define SPE_PMU_HW_FLAGS_CX BIT(0)
+
+static void set_spe_event_has_cx(struct perf_event *event)
+{
+	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
+		event->hw.flags |= SPE_PMU_HW_FLAGS_CX;
+}
+
+static bool get_spe_event_has_cx(struct perf_event *event)
+{
+	return !!(event->hw.flags & SPE_PMU_HW_FLAGS_CX);
+}
+
 #define ARM_SPE_BUF_PAD_BYTE 0

 struct arm_spe_pmu_buf {
@@ -272,7 +290,7 @@ static u64 arm_spe_event_to_pmscr(struct perf_event *event)
 	if (!attr->exclude_kernel)
 		reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);

-	if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
+	if (get_spe_event_has_cx(event))
 		reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);

 	return reg;
@@ -709,10 +727,10 @@ static int arm_spe_pmu_event_init(struct perf_event *event)
 	    !(spe_pmu->features & SPE_PMU_FEAT_FILT_LAT))
 		return -EOPNOTSUPP;

+	set_spe_event_has_cx(event);
 	reg = arm_spe_event_to_pmscr(event);
 	if (!perfmon_capable() &&
 	    (reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) |
-		    BIT(SYS_PMSCR_EL1_CX_SHIFT) |
 		    BIT(SYS_PMSCR_EL1_PCT_SHIFT))))
 		return -EACCES;

drivers/perf/fsl_imx8_ddr_perf.c

Lines changed: 3 additions & 3 deletions
@@ -611,7 +611,7 @@ static int ddr_perf_init(struct ddr_pmu *pmu, void __iomem *base,
 		.dev = dev,
 	};

-	pmu->id = ida_simple_get(&ddr_ida, 0, 0, GFP_KERNEL);
+	pmu->id = ida_alloc(&ddr_ida, GFP_KERNEL);
 	return pmu->id;
 }

@@ -765,7 +765,7 @@ static int ddr_perf_probe(struct platform_device *pdev)
 cpuhp_instance_err:
 	cpuhp_remove_multi_state(pmu->cpuhp_state);
 cpuhp_state_err:
-	ida_simple_remove(&ddr_ida, pmu->id);
+	ida_free(&ddr_ida, pmu->id);
 	dev_warn(&pdev->dev, "i.MX8 DDR Perf PMU failed (%d), disabled\n", ret);
 	return ret;
 }
@@ -779,7 +779,7 @@ static int ddr_perf_remove(struct platform_device *pdev)

 	perf_pmu_unregister(&pmu->pmu);

-	ida_simple_remove(&ddr_ida, pmu->id);
+	ida_free(&ddr_ida, pmu->id);
 	return 0;
 }

drivers/perf/hisilicon/Kconfig

Lines changed: 10 additions & 0 deletions
@@ -14,3 +14,13 @@ config HISI_PCIE_PMU
 	  RCiEP devices.
 	  Adds the PCIe PMU into perf events system for monitoring latency,
 	  bandwidth etc.
+
+config HNS3_PMU
+	tristate "HNS3 PERF PMU"
+	depends on ARM64 || COMPILE_TEST
+	depends on PCI
+	help
+	  Provide support for HNS3 performance monitoring unit (PMU) RCiEP
+	  devices.
+	  Adds the HNS3 PMU into perf events system for monitoring latency,
+	  bandwidth etc.

drivers/perf/hisilicon/Makefile

Lines changed: 1 addition & 0 deletions
@@ -4,3 +4,4 @@ obj-$(CONFIG_HISI_PMU) += hisi_uncore_pmu.o hisi_uncore_l3c_pmu.o \
 			  hisi_uncore_pa_pmu.o hisi_uncore_cpa_pmu.o

 obj-$(CONFIG_HISI_PCIE_PMU) += hisi_pcie_pmu.o
+obj-$(CONFIG_HNS3_PMU) += hns3_pmu.o
