|
| 1 | +====================================================================== |
| 2 | +Synopsys DesignWare Cores (DWC) PCIe Performance Monitoring Unit (PMU) |
| 3 | +====================================================================== |
| 4 | + |
| 5 | +DesignWare Cores (DWC) PCIe PMU |
| 6 | +=============================== |
| 7 | + |
| 8 | +The PMU is a PCIe configuration space register block provided by each PCIe Root |
| 9 | +Port in a Vendor-Specific Extended Capability named RAS D.E.S (Debug, Error |
| 10 | +injection, and Statistics). |
| 11 | + |
| 12 | +As the name indicates, the RAS DES capability supports system-level |
| 13 | +debugging, AER error injection, and collection of statistics. To facilitate |
| 14 | +collection of statistics, the Synopsys DesignWare Cores PCIe controller |
| 15 | +provides the following two features: |
| 16 | + |
| 17 | +- one 64-bit counter for Time Based Analysis (RX/TX data throughput and |
| 18 | + time spent in each low-power LTSSM state) and |
| 19 | +- one 32-bit counter for Event Counting (error and non-error events for |
| 20 | + a specified lane) |
| 21 | + |
| 22 | +Note: There is no interrupt for counter overflow. |
| 23 | + |
| 24 | +Time Based Analysis |
| 25 | +------------------- |
| 26 | + |
| 27 | +Using this feature you can obtain information regarding RX/TX data |
| 28 | +throughput and time spent in each low-power LTSSM state by the controller. |
| 29 | +The PMU measures data in two categories: |
| 30 | + |
| 31 | +- Group#0: Percentage of time the controller stays in LTSSM states. |
| 32 | +- Group#1: Amount of data processed (Units of 16 bytes). |
| 33 | + |
| 34 | +Lane Event Counters |
| 35 | +------------------- |
| 36 | + |
| 37 | +Using this feature you can obtain error and non-error information for a |
| 38 | +specific lane of the controller. A PMU event is selected by all of: |
| 39 | + |
| 40 | +- Group i |
| 41 | +- Event j within the Group i |
| 42 | +- Lane k |
| 43 | + |
| 44 | +Some of the events only exist for specific configurations. |
| 45 | + |
| 46 | +DesignWare Cores (DWC) PCIe PMU Driver |
| 47 | +======================================= |
| 48 | + |
| 49 | +This driver adds a PMU device for each PCIe Root Port, named after the BDF of |
| 50 | +the Root Port. For example, |
| 51 | + |
| 52 | + 30:03.0 PCI bridge: Device 1ded:8000 (rev 01) |
| 53 | + |
| 54 | +the PMU device name for this Root Port is dwc_rootport_3018. |
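The name in this example is consistent with packing the bus number and the devfn byte (device << 3 | function) into one hexadecimal value: 30:03.0 gives bus 0x30 and devfn 0x18, hence 3018. Below is a minimal Python sketch of that arithmetic. It assumes this encoding holds in general, that the value is printed without zero padding, and that the PCI domain is ignored. The helper name `pmu_name_from_bdf` is hypothetical and not part of the driver.

```python
def pmu_name_from_bdf(bdf: str) -> str:
    """Build a dwc_rootport_* name from a 'bus:dev.fn' string (hex fields)."""
    bus_str, rest = bdf.split(":")
    dev_str, fn_str = rest.split(".")
    bus, dev, fn = int(bus_str, 16), int(dev_str, 16), int(fn_str, 16)
    # Standard PCI devfn packing: 5-bit device number, 3-bit function number.
    devfn = (dev << 3) | fn
    # Assumption: bus in the high byte, devfn in the low byte, unpadded hex.
    return f"dwc_rootport_{(bus << 8) | devfn:x}"


print(pmu_name_from_bdf("30:03.0"))  # dwc_rootport_3018
```

The resulting string is the directory name to look for under /sys/bus/event_source/devices/.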
| 55 | + |
| 56 | +The DWC PCIe PMU driver registers a perf PMU driver, which provides a |
| 57 | +description of the available events and configuration options in sysfs; see |
| 58 | +/sys/bus/event_source/devices/dwc_rootport_{bdf}. |
| 59 | + |
| 60 | +The "format" directory describes the format of the config fields of the |
| 61 | +perf_event_attr structure. The "events" directory provides configuration |
| 62 | +templates for all documented events. For example, |
| 63 | +"Rx_PCIe_TLP_Data_Payload" is equivalent to "eventid=0x22,type=0x1". |
| 64 | + |
| 65 | +The "perf list" command lists the available events from sysfs, e.g.:: |
| 66 | + |
| 67 | + $# perf list | grep dwc_rootport |
| 68 | + <...> |
| 69 | + dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ [Kernel PMU event] |
| 70 | + <...> |
| 71 | + dwc_rootport_3018/rx_memory_read,lane=?/ [Kernel PMU event] |
| 72 | + |
| 73 | +Time Based Analysis Event Usage |
| 74 | +------------------------------- |
| 75 | + |
| 76 | +Example of counting PCIe RX TLP data payload (in units of bytes):: |
| 77 | + |
| 78 | + $# perf stat -a -e dwc_rootport_3018/Rx_PCIe_TLP_Data_Payload/ |
| 79 | + |
| 80 | +The average RX/TX bandwidth can be calculated using the following formulas:: |
| 81 | + |
| 82 | + PCIe RX Bandwidth = Rx_PCIe_TLP_Data_Payload / Measure_Time_Window |
| 83 | + PCIe TX Bandwidth = Tx_PCIe_TLP_Data_Payload / Measure_Time_Window |
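As a worked instance of the formula, the following Python sketch divides a payload count by the measurement window. It assumes the counter value reported by "perf stat" is already in bytes, as the usage example states, and that the window is the elapsed time reported for the same run. The function name and the numbers are illustrative only.

```python
def average_bandwidth(payload_bytes: int, window_seconds: float) -> float:
    """Average bandwidth in bytes per second over the measurement window."""
    if window_seconds <= 0:
        raise ValueError("measurement window must be positive")
    return payload_bytes / window_seconds


# Hypothetical reading: 2,000,000,000 payload bytes counted over 4 seconds.
rx = average_bandwidth(2_000_000_000, 4.0)
print(f"{rx / 1e6:.1f} MB/s")  # 500.0 MB/s
```

The same function applies to Tx_PCIe_TLP_Data_Payload for the TX direction.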
| 84 | + |
| 85 | +Lane Event Usage |
| 86 | +---------------- |
| 87 | + |
| 88 | +Each lane has the same event set. To avoid generating a list of hundreds |
| 89 | +of events, the user needs to specify the lane ID explicitly, e.g.:: |
| 90 | + |
| 91 | + $# perf stat -a -e dwc_rootport_3018/rx_memory_read,lane=4/ |
| 92 | + |
| 93 | +The driver does not support sampling, therefore "perf record" will not |
| 94 | +work. Per-task (without "-a") perf sessions are not supported. |