
Commit cae4061

axiqia authored and willdeacon committed
docs: perf: Add description for Synopsys DesignWare PCIe PMU driver
Alibaba's T-Head Yitian 710 SoC includes Synopsys' DesignWare Core PCIe
controller, which implements a PMU for performance and functional debugging
to facilitate system maintenance. Document it to provide guidance on how to
use it.

Signed-off-by: Shuai Xue <[email protected]>
Reviewed-by: Baolin Wang <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Reviewed-by: Yicong Yang <[email protected]>
Tested-by: Ilkka Koskinen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
1 parent eb183b2 commit cae4061


2 files changed, +95 -0 lines changed

Lines changed: 94 additions & 0 deletions
@@ -0,0 +1,94 @@
1+
======================================================================
2+
Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU)
3+
======================================================================
4+
5+
DesignWare Cores (DWC) PCIe PMU
6+
===============================
7+
8+
The PMU is a PCIe configuration space register block provided by each PCIe Root
9+
Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error
10+
injection, and Statistics).
11+
12+
As the name indicates, the RAS D.E.S capability supports system-level
13+
debugging, AER error injection, and collection of statistics. To facilitate
14+
collection of statistics, the Synopsys DesignWare Cores PCIe controller
15+
provides the following two features:
16+
17+
- one 64-bit counter for Time Based Analysis (RX/TX data throughput and
18+
time spent in each low-power LTSSM state) and
19+
- one 32-bit counter for Event Counting (error and non-error events for
20+
a specified lane)
21+
22+
Note: There is no interrupt for counter overflow.
23+
24+
Time Based Analysis
25+
-------------------
26+
27+
Using this feature, you can obtain the RX/TX data
28+
throughput and the time the controller spends in each low-power LTSSM state.
29+
The PMU measures data in two categories:
30+
31+
- Group#0: Percentage of time the controller stays in LTSSM states.
32+
- Group#1: Amount of data processed (Units of 16 bytes).
33+
34+
Lane Event Counters
35+
-------------------
36+
37+
Using this feature, you can obtain error and non-error information for a
38+
specific lane of the controller. The PMU event is selected by all of:
39+
40+
- Group i
41+
- Event j within the Group i
42+
- Lane k
43+
44+
Some of the events only exist for specific configurations.
45+
46+
DesignWare Cores (DWC) PCIe PMU Driver
47+
=======================================
48+
49+
This driver adds a PMU device for each PCIe Root Port, named after the BDF of
50+
the Root Port. For example,
51+
52+
30:03.0 PCI bridge: Device 1ded:8000 (rev 01)
53+
54+
the PMU device name for this Root Port is dwc_rootport_3018.
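The hexadecimal suffix in the example is consistent with the usual 16-bit PCI device ID packing (bus in bits 15:8, device in bits 7:3, function in bits 2:0). The following is a minimal sketch of that derivation; the helper name `dwc_pmu_name` is hypothetical, and the packing is an assumption inferred from the example above rather than something this document states.

```python
def dwc_pmu_name(bdf: str) -> str:
    """Build the PMU device name for a Root Port given as '[domain:]bus:dev.fn'.

    All fields are parsed as hexadecimal, as printed by lspci.
    """
    # Ignore an optional leading PCI domain such as '0000:'.
    *_, bus_str, devfn = bdf.split(":")
    dev_str, fn_str = devfn.split(".")
    bus, dev, fn = int(bus_str, 16), int(dev_str, 16), int(fn_str, 16)
    # Assumed packing: bus in bits 15:8, device in bits 7:3, function in bits 2:0.
    devid = (bus << 8) | (dev << 3) | fn
    return f"dwc_rootport_{devid:x}"


print(dwc_pmu_name("30:03.0"))  # dwc_rootport_3018
```

For the Root Port shown above, bus 0x30 contributes 0x3000 and device 0x03 contributes 0x18, which gives 0x3018.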
55+
56+
The DWC PCIe PMU driver registers a perf PMU driver, which provides
57+
description of available events and configuration options in sysfs, see
58+
/sys/bus/event_source/devices/dwc_rootport_{bdf}.
59+
60+
The "format" directory describes format of the config fields of the
61+
perf_event_attr structure. The "events" directory provides configuration
62+
templates for all documented events. For example,
63+
"Rx_PCIe_TLP_Data_Payload" is an equivalent of "eventid=0x22,type=0x1".
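As a rough illustration of how the "events" directory maps names to templates, the sketch below walks the sysfs tree and collects each event file's contents. The function name `list_dwc_pmu_events` and its `sysfs_root` parameter are hypothetical; only the `/sys/bus/event_source/devices/dwc_rootport_{bdf}` layout comes from the text above. On a machine without this PMU it simply returns an empty dictionary.

```python
from pathlib import Path


def list_dwc_pmu_events(sysfs_root="/sys/bus/event_source/devices"):
    """Return {pmu_name: {event_name: template}} for each dwc_rootport_* PMU."""
    result = {}
    for pmu_dir in sorted(Path(sysfs_root).glob("dwc_rootport_*")):
        events_dir = pmu_dir / "events"
        if not events_dir.is_dir():
            continue
        # Each file under events/ holds a template such as "eventid=0x22,type=0x1".
        result[pmu_dir.name] = {
            f.name: f.read_text().strip()
            for f in sorted(events_dir.iterdir())
            if f.is_file()
        }
    return result


if __name__ == "__main__":
    for pmu, events in list_dwc_pmu_events().items():
        for name, template in events.items():
            print(f"{pmu}/{name}/ -> {template}")
```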
64+
65+
The "perf list" command lists the available events from sysfs, e.g.::
66+
67+
$# perf list | grep dwc_rootport
68+
<...>
69+
dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event]
70+
<...>
71+
dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event]
72+
73+
Time Based Analysis Event Usage
74+
-------------------------------
75+
76+
Example usage of counting PCIe RX TLP data payload (Units of bytes)::
77+
78+
$# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/
79+
80+
The average RX/TX bandwidth can be calculated using the following formula:
81+
82+
PCIe RX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window
83+
PCIe TX Bandwidth = Tx_PCIe_TLP_Data_Payload / Measure_Time_Window
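The formula can be applied directly to a counter value reported by "perf stat". The sketch below assumes the count is already in bytes, as stated for the example above; the function name and the sample numbers are hypothetical and not measured data.

```python
def average_bandwidth(payload_bytes: int, window_seconds: float) -> float:
    """Average bandwidth in bytes per second over the measurement window."""
    if window_seconds <= 0:
        raise ValueError("measurement window must be positive")
    return payload_bytes / window_seconds


# Hypothetical reading: 1,000,000,000 payload bytes counted over a 2-second window.
rx_bw = average_bandwidth(1_000_000_000, 2.0)
print(f"{rx_bw / 1e6:.1f} MB/s")  # 500.0 MB/s
```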
84+
85+
Lane Event Usage
86+
----------------
87+
88+
Each lane has the same event set. To avoid generating a list of hundreds
89+
of events, the user needs to specify the lane ID explicitly, e.g.::
90+
91+
$# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/
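To cover several lanes, the same command can be generated once per lane. The sketch below only builds the argument lists and does not run perf; the helper name `perf_stat_commands` is hypothetical. It emits one command per lane instead of combining lanes in a single invocation, an assumption based on the earlier statement that there is a single 32-bit counter for event counting.

```python
def perf_stat_commands(pmu: str, event: str, lanes):
    """Build one 'perf stat' argument list per lane for a lane event."""
    commands = []
    for lane in lanes:
        # Same form as the documented example, with the lane ID filled in.
        spec = f"{pmu}/{event},lane={lane}/"
        commands.append(["perf", "stat", "-a", "-e", spec])
    return commands


for cmd in perf_stat_commands("dwc_rootport_3018", "rx_memory_read", range(4)):
    print(" ".join(cmd))
```

Each list can be passed to a process runner such as `subprocess.run` on a system where the PMU is present.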
92+
93+
The driver does not support sampling, so "perf record" will not
94+
work. Per-task (without "-a") perf sessions are not supported.

Documentation/admin-guide/perf/index.rst

Lines changed: 1 addition & 0 deletions
@@ -19,6 +19,7 @@ Performance monitor support
19 19
arm_dsu_pmu
20 20
thunderx2-pmu
21 21
alibaba_pmu
22+
dwc_pcie_pmu
22 23
nvidia-pmu
23 24
meson-ddr-pmu
24 25
cxl
