Commit 00e4db5

Merge tag 'perf-tools-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:

"New features:

 - Introduce controlling how 'perf stat' and 'perf record' work via a control file descriptor, allowing starting with events configured but disabled until commands are received via the control file descriptor. This allows, for instance, tools such as Intel VTune to make further use of perf as its Linux platform driver.

 - Improve 'perf record' to register in a perf.data file header the clockid used, to help later correlate things like syslog files and perf events recorded.

 - Add basic syscall and find_next_bit benchmarks to 'perf bench'.

 - Allow using computed metrics in calculating other metrics. For instance:

     {
         .metric_expr = "l2_rqsts.demand_data_rd_hit + l2_rqsts.pf_hit + l2_rqsts.rfo_hit",
         .metric_name = "DCache_L2_All_Hits",
     },
     {
         .metric_expr = "max(l2_rqsts.all_demand_data_rd - l2_rqsts.demand_data_rd_hit, 0) + l2_rqsts.pf_miss + l2_rqsts.rfo_miss",
         .metric_name = "DCache_L2_All_Miss",
     },
     {
         .metric_expr = "dcache_l2_all_hits + dcache_l2_all_miss",
         .metric_name = "DCache_L2_All",
     }

 - Add support for 'd_ratio', '>' and '<' operators to the expression resolver used in calculating metrics in 'perf stat'.

Support for new kernel features:

 - Support TEXT_POKE and KSYMBOL_TYPE_OOL perf metadata events to cope with things like ftrace and trampolines, i.e. changes in the kernel text that get in the way of properly decoding Intel PT hardware traces, for instance.

Intel PT:

 - Add various knobs to reduce the volume of Intel PT traces by reducing the level of detail, such as decoding just some types of packets (e.g., FUP/TIP, PSB+), and also filtering by time range.

 - Add new itrace options (log flags to the 'd' option, error flags to the 'e' one, etc.) controlling how Intel PT is transformed into perf events, and document some missing options (e.g., how to synthesize callchains).

BPF:

 - Properly report BPF errors when parsing events.

 - Do not set up side-band events if LIBBPF is not linked, fixing a segfault.

Libraries:

 - Improvements to the libtraceevent plugin mechanism.

 - Improve libtraceevent support for KVM trace events SVM exit reasons.

 - Add libtraceevent plugins for decoding syscalls/sys_enter_futex and for tlb_flush.

 - Ensure sample_period is set for libpfm4 events in 'perf test'.

 - Fix up libperf namespacing, to make sure what is in libperf has the perf_ namespace while what is now only in tools/perf/ doesn't use that prefix.

Arch specific:

 - Improve the testing of vendor events and metrics in 'perf test'.

 - Allow no ARM CoreSight hardware tracer sink to be specified on the command line.

 - Fix arm_spe_x recording when mixed with other perf events.

 - Add s390 idle functions 'psw_idle' and 'psw_idle_exit' to the list of idle symbols.

 - List kernel supplied event aliases for arm64 in 'perf list'.

 - Add support for extended register capability in PowerPC 9 and 10.

 - Add nest IMC power9 metric events.

Miscellaneous:

 - No need to set up sample_regs_intr/sample_regs_user for dummy events.

 - Update various copies of kernel headers, some causing perf to handle new syscalls, MSRs, etc.

 - Improve usage of flex and yacc, enabling warnings and addressing the fallout.

 - Add missing '--output' option to 'perf kmem' so that it can pass it along to 'perf record'.

 - 'perf probe' fixes related to adding multiple probes on the same address for the same event.

 - Make 'perf probe' warn if the target function is a GNU indirect function.

 - Remove //anon mmap events from 'perf inject jit' to fix supporting both using ELF files for generated functions and the perf-PID.map approaches"

* tag 'perf-tools-2020-08-10' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (144 commits)
  perf record: Skip side-band event setup if HAVE_LIBBPF_SUPPORT is not set
  perf tools powerpc: Add support for extended regs in power10
  perf tools powerpc: Add support for extended register capability
  tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
  tools arch x86: Sync asm/cpufeatures.h with the kernel sources
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  tools headers UAPI: update linux/in.h copy
  tools headers API: Update close_range affected files
  perf script: Add 'tod' field to display time of day
  perf script: Change the 'enum perf_output_field' enumerators to be 64 bits
  perf data: Add support to store time of day in CTF data conversion
  perf tools: Move clockid_res_ns under clock struct
  perf header: Store clock references for -k/--clockid option
  perf tools: Add clockid_name function
  perf clockid: Move parse_clockid() to new clockid object
  tools lib traceevent: Handle possible strdup() error in tep_add_plugin_path() API
  libtraceevent: Fixed description of tep_add_plugin_path() API
  libtraceevent: Fixed type in PRINT_FMT_STING
  libtraceevent: Fixed broken indentation in parse_ip4_print_args()
  libtraceevent: Improve error handling of tep_plugin_add_option() API
  ...
2 parents ed3854f + 1101c87 commit 00e4db5

127 files changed

+5164
-1091
lines changed

tools/arch/powerpc/include/uapi/asm/perf_regs.h

Lines changed: 19 additions & 1 deletion

@@ -48,6 +48,24 @@ enum perf_event_powerpc_regs {
 	PERF_REG_POWERPC_DSISR,
 	PERF_REG_POWERPC_SIER,
 	PERF_REG_POWERPC_MMCRA,
-	PERF_REG_POWERPC_MAX,
+	/* Extended registers */
+	PERF_REG_POWERPC_MMCR0,
+	PERF_REG_POWERPC_MMCR1,
+	PERF_REG_POWERPC_MMCR2,
+	PERF_REG_POWERPC_MMCR3,
+	PERF_REG_POWERPC_SIER2,
+	PERF_REG_POWERPC_SIER3,
+	/* Max regs without the extended regs */
+	PERF_REG_POWERPC_MAX = PERF_REG_POWERPC_MMCRA + 1,
 };
+
+#define PERF_REG_PMU_MASK ((1ULL << PERF_REG_POWERPC_MAX) - 1)
+
+/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
+#define PERF_REG_PMU_MASK_300 (((1ULL << (PERF_REG_POWERPC_MMCR2 + 1)) - 1) - PERF_REG_PMU_MASK)
+/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_31 */
+#define PERF_REG_PMU_MASK_31 (((1ULL << (PERF_REG_POWERPC_SIER3 + 1)) - 1) - PERF_REG_PMU_MASK)
+
+#define PERF_REG_MAX_ISA_300 (PERF_REG_POWERPC_MMCR2 + 1)
+#define PERF_REG_MAX_ISA_31 (PERF_REG_POWERPC_SIER3 + 1)
 #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */
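
The mask arithmetic in this header can be checked with a small standalone sketch. It is not part of the commit: the `DEMO_` names are hypothetical, the enum is trimmed to its tail, and the numeric position of the last basic register (44) is an assumption made only so the example compiles on its own. It shows that each extended mask covers exactly the extended registers available on that ISA level and none of the basic ones.

```c
#include <stdint.h>

/* Hypothetical, trimmed stand-in for enum perf_event_powerpc_regs. */
enum demo_powerpc_regs {
	DEMO_REG_MMCRA = 44,	/* assumed index of the last basic register */
	/* Extended registers */
	DEMO_REG_MMCR0,
	DEMO_REG_MMCR1,
	DEMO_REG_MMCR2,
	DEMO_REG_MMCR3,
	DEMO_REG_SIER2,
	DEMO_REG_SIER3,
	/* Max regs without the extended regs */
	DEMO_REG_MAX = DEMO_REG_MMCRA + 1,
};

/* Same shape as the macros in the hunk above, with DEMO_ names. */
#define DEMO_PMU_MASK		((1ULL << DEMO_REG_MAX) - 1)
#define DEMO_PMU_MASK_300	(((1ULL << (DEMO_REG_MMCR2 + 1)) - 1) - DEMO_PMU_MASK)
#define DEMO_PMU_MASK_31	(((1ULL << (DEMO_REG_SIER3 + 1)) - 1) - DEMO_PMU_MASK)

/* Return 1 if register index 'reg' is covered by the basic or extended mask. */
static int demo_reg_allowed(int reg, uint64_t extended_mask)
{
	uint64_t allowed = DEMO_PMU_MASK | extended_mask;

	return (int)((allowed >> reg) & 1);
}
```

With these definitions, `demo_reg_allowed(DEMO_REG_MMCR3, DEMO_PMU_MASK_300)` is 0 while `demo_reg_allowed(DEMO_REG_MMCR3, DEMO_PMU_MASK_31)` is 1, which mirrors the split between the two mask macros.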

tools/arch/x86/include/asm/cpufeatures.h

Lines changed: 4 additions & 0 deletions

@@ -96,6 +96,7 @@
 #define X86_FEATURE_SYSCALL32 ( 3*32+14) /* "" syscall in IA32 userspace */
 #define X86_FEATURE_SYSENTER32 ( 3*32+15) /* "" sysenter in IA32 userspace */
 #define X86_FEATURE_REP_GOOD ( 3*32+16) /* REP microcode works well */
+/* free ( 3*32+17) */
 #define X86_FEATURE_LFENCE_RDTSC ( 3*32+18) /* "" LFENCE synchronizes RDTSC */
 #define X86_FEATURE_ACC_POWER ( 3*32+19) /* AMD Accumulated Power Mechanism */
 #define X86_FEATURE_NOPL ( 3*32+20) /* The NOPL (0F 1F) instructions */
@@ -107,6 +108,7 @@
 #define X86_FEATURE_EXTD_APICID ( 3*32+26) /* Extended APICID (8 bits) */
 #define X86_FEATURE_AMD_DCM ( 3*32+27) /* AMD multi-node processor */
 #define X86_FEATURE_APERFMPERF ( 3*32+28) /* P-State hardware coordination feedback capability (APERF/MPERF MSRs) */
+/* free ( 3*32+29) */
 #define X86_FEATURE_NONSTOP_TSC_S3 ( 3*32+30) /* TSC doesn't stop in S3 state */
 #define X86_FEATURE_TSC_KNOWN_FREQ ( 3*32+31) /* TSC has known frequency */

@@ -365,7 +367,9 @@
 #define X86_FEATURE_SRBDS_CTRL (18*32+ 9) /* "" SRBDS mitigation MSR available */
 #define X86_FEATURE_MD_CLEAR (18*32+10) /* VERW clears CPU buffers */
 #define X86_FEATURE_TSX_FORCE_ABORT (18*32+13) /* "" TSX_FORCE_ABORT */
+#define X86_FEATURE_SERIALIZE (18*32+14) /* SERIALIZE instruction */
 #define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
+#define X86_FEATURE_ARCH_LBR (18*32+19) /* Intel ARCH LBR */
 #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
 #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
 #define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */
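
Each feature constant above is written as `word*32 + bit`, so the word index and bit position can be recovered with a division and a remainder. The sketch below illustrates that encoding for the two newly added flags. The `DEMO_` names, the array size, and the helper functions are hypothetical and are not the kernel's own accessors.

```c
#include <stdint.h>

/* Copies of the two values added by the hunk above, under DEMO_ names. */
#define DEMO_FEATURE_SERIALIZE	(18*32+14)
#define DEMO_FEATURE_ARCH_LBR	(18*32+19)

#define DEMO_NCAPINTS 20	/* hypothetical size, large enough to hold word 18 */

/* Index of the 32-bit capability word that holds 'feature'. */
static unsigned int demo_feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of 'feature' inside its capability word. */
static unsigned int demo_feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Return 1 if 'feature' is set in the capability array, else 0. */
static int demo_has_feature(const uint32_t caps[DEMO_NCAPINTS], unsigned int feature)
{
	return (int)((caps[demo_feature_word(feature)] >> demo_feature_bit(feature)) & 1u);
}
```

For example, `DEMO_FEATURE_ARCH_LBR` decodes to word 18, bit 19, so setting `caps[18] = 1u << 19` makes `demo_has_feature(caps, DEMO_FEATURE_ARCH_LBR)` return 1.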

tools/arch/x86/include/asm/msr-index.h

Lines changed: 23 additions & 3 deletions

@@ -149,6 +149,10 @@

 #define MSR_LBR_SELECT 0x000001c8
 #define MSR_LBR_TOS 0x000001c9
+
+#define MSR_IA32_POWER_CTL 0x000001fc
+#define MSR_IA32_POWER_CTL_BIT_EE 19
+
 #define MSR_LBR_NHM_FROM 0x00000680
 #define MSR_LBR_NHM_TO 0x000006c0
 #define MSR_LBR_CORE_FROM 0x00000040
@@ -158,7 +162,23 @@
 #define LBR_INFO_MISPRED BIT_ULL(63)
 #define LBR_INFO_IN_TX BIT_ULL(62)
 #define LBR_INFO_ABORT BIT_ULL(61)
+#define LBR_INFO_CYC_CNT_VALID BIT_ULL(60)
 #define LBR_INFO_CYCLES 0xffff
+#define LBR_INFO_BR_TYPE_OFFSET 56
+#define LBR_INFO_BR_TYPE (0xfull << LBR_INFO_BR_TYPE_OFFSET)
+
+#define MSR_ARCH_LBR_CTL 0x000014ce
+#define ARCH_LBR_CTL_LBREN BIT(0)
+#define ARCH_LBR_CTL_CPL_OFFSET 1
+#define ARCH_LBR_CTL_CPL (0x3ull << ARCH_LBR_CTL_CPL_OFFSET)
+#define ARCH_LBR_CTL_STACK_OFFSET 3
+#define ARCH_LBR_CTL_STACK (0x1ull << ARCH_LBR_CTL_STACK_OFFSET)
+#define ARCH_LBR_CTL_FILTER_OFFSET 16
+#define ARCH_LBR_CTL_FILTER (0x7full << ARCH_LBR_CTL_FILTER_OFFSET)
+#define MSR_ARCH_LBR_DEPTH 0x000014cf
+#define MSR_ARCH_LBR_FROM_0 0x00001500
+#define MSR_ARCH_LBR_TO_0 0x00001600
+#define MSR_ARCH_LBR_INFO_0 0x00001200

 #define MSR_IA32_PEBS_ENABLE 0x000003f1
 #define MSR_PEBS_DATA_CFG 0x000003f2
@@ -253,8 +273,6 @@

 #define MSR_PEBS_FRONTEND 0x000003f7

-#define MSR_IA32_POWER_CTL 0x000001fc
-
 #define MSR_IA32_MC0_CTL 0x00000400
 #define MSR_IA32_MC0_STATUS 0x00000401
 #define MSR_IA32_MC0_ADDR 0x00000402
@@ -418,7 +436,6 @@
 #define MSR_AMD64_PATCH_LEVEL 0x0000008b
 #define MSR_AMD64_TSC_RATIO 0xc0000104
 #define MSR_AMD64_NB_CFG 0xc001001f
-#define MSR_AMD64_CPUID_FN_1 0xc0011004
 #define MSR_AMD64_PATCH_LOADER 0xc0010020
 #define MSR_AMD_PERF_CTL 0xc0010062
 #define MSR_AMD_PERF_STATUS 0xc0010063
@@ -427,6 +444,7 @@
 #define MSR_AMD64_OSVW_STATUS 0xc0010141
 #define MSR_AMD_PPIN_CTL 0xc00102f0
 #define MSR_AMD_PPIN 0xc00102f1
+#define MSR_AMD64_CPUID_FN_1 0xc0011004
 #define MSR_AMD64_LS_CFG 0xc0011020
 #define MSR_AMD64_DC_CFG 0xc0011022
 #define MSR_AMD64_BU_CFG2 0xc001102a
@@ -466,6 +484,8 @@
 #define MSR_F16H_DR0_ADDR_MASK 0xc0011027

 /* Fam 15h MSRs */
+#define MSR_F15H_CU_PWR_ACCUMULATOR 0xc001007a
+#define MSR_F15H_CU_MAX_PWR_ACCUMULATOR 0xc001007b
 #define MSR_F15H_PERF_CTL 0xc0010200
 #define MSR_F15H_PERF_CTL0 MSR_F15H_PERF_CTL
 #define MSR_F15H_PERF_CTL1 (MSR_F15H_PERF_CTL + 2)
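
The new `LBR_INFO_*` definitions describe bit fields inside a 64-bit LBR info value. A minimal sketch of pulling those fields out follows, using copies of the macros from the hunk above. `DEMO_BIT_ULL` is a local stand-in for the kernel's `BIT_ULL()`, the `demo_` helpers are hypothetical, and treating the cycle count as unusable when `LBR_INFO_CYC_CNT_VALID` is clear is an assumption drawn from the macro name rather than from this diff.

```c
#include <stdint.h>

#define DEMO_BIT_ULL(n)		(1ULL << (n))	/* stand-in for the kernel's BIT_ULL() */

/* Copies of the definitions shown in the hunk above. */
#define LBR_INFO_MISPRED	DEMO_BIT_ULL(63)
#define LBR_INFO_CYC_CNT_VALID	DEMO_BIT_ULL(60)
#define LBR_INFO_CYCLES		0xffff
#define LBR_INFO_BR_TYPE_OFFSET	56
#define LBR_INFO_BR_TYPE	(0xfull << LBR_INFO_BR_TYPE_OFFSET)

/* Extract the 4-bit branch type field (bits 56..59). */
static unsigned int demo_lbr_br_type(uint64_t info)
{
	return (unsigned int)((info & LBR_INFO_BR_TYPE) >> LBR_INFO_BR_TYPE_OFFSET);
}

/* Return the cycle count, or -1 when the valid bit is clear (assumed meaning). */
static int demo_lbr_cycles(uint64_t info)
{
	if (!(info & LBR_INFO_CYC_CNT_VALID))
		return -1;
	return (int)(info & LBR_INFO_CYCLES);
}

/* Return 1 if the branch was recorded as mispredicted. */
static int demo_lbr_mispredicted(uint64_t info)
{
	return (info & LBR_INFO_MISPRED) != 0;
}
```

The offset-plus-mask pairs for `ARCH_LBR_CTL_*` follow the same pattern, so a control value could be composed by shifting each field by its `_OFFSET` and OR-ing the results.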

tools/build/Makefile.feature

Lines changed: 1 addition & 1 deletion

@@ -8,7 +8,7 @@ endif

 feature_check = $(eval $(feature_check_code))
 define feature_check_code
-  feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CFLAGS="$(EXTRA_CFLAGS) $(FEATURE_CHECK_CFLAGS-$(1))" CXXFLAGS="$(EXTRA_CXXFLAGS) $(FEATURE_CHECK_CXXFLAGS-$(1))" LDFLAGS="$(LDFLAGS) $(FEATURE_CHECK_LDFLAGS-$(1))" -C $(feature_dir) $(OUTPUT_FEATURES)test-$1.bin >/dev/null 2>/dev/null && echo 1 || echo 0)
+  feature-$(1) := $(shell $(MAKE) OUTPUT=$(OUTPUT_FEATURES) CC=$(CC) CXX=$(CXX) CFLAGS="$(EXTRA_CFLAGS) $(FEATURE_CHECK_CFLAGS-$(1))" CXXFLAGS="$(EXTRA_CXXFLAGS) $(FEATURE_CHECK_CXXFLAGS-$(1))" LDFLAGS="$(LDFLAGS) $(FEATURE_CHECK_LDFLAGS-$(1))" -C $(feature_dir) $(OUTPUT_FEATURES)test-$1.bin >/dev/null 2>/dev/null && echo 1 || echo 0)
 endef

 feature_set = $(eval $(feature_set_code))

tools/build/feature/Makefile

Lines changed: 0 additions & 2 deletions

@@ -74,8 +74,6 @@ FILES= \

 FILES := $(addprefix $(OUTPUT),$(FILES))

-CC ?= $(CROSS_COMPILE)gcc
-CXX ?= $(CROSS_COMPILE)g++
 PKG_CONFIG ?= $(CROSS_COMPILE)pkg-config
 LLVM_CONFIG ?= llvm-config
 CLANG ?= clang

tools/include/uapi/asm-generic/unistd.h

Lines changed: 2 additions & 0 deletions

@@ -850,6 +850,8 @@ __SYSCALL(__NR_pidfd_open, sys_pidfd_open)
 #define __NR_clone3 435
 __SYSCALL(__NR_clone3, sys_clone3)
 #endif
+#define __NR_close_range 436
+__SYSCALL(__NR_close_range, sys_close_range)

 #define __NR_openat2 437
 __SYSCALL(__NR_openat2, sys_openat2)

tools/include/uapi/drm/i915_drm.h

Lines changed: 2 additions & 2 deletions

@@ -55,7 +55,7 @@ extern "C" {
  * cause the related events to not be seen.
  *
  * I915_RESET_UEVENT - Event is generated just before an attempt to reset the
- * the GPU. The value supplied with the event is always 1. NOTE: Disable
+ * GPU. The value supplied with the event is always 1. NOTE: Disable
  * reset via module parameter will cause this event to not be seen.
  */
 #define I915_L3_PARITY_UEVENT "L3_PARITY_ERROR"
@@ -1934,7 +1934,7 @@ enum drm_i915_perf_property_id {

 	/**
 	 * The value specifies which set of OA unit metrics should be
-	 * be configured, defining the contents of any OA unit reports.
+	 * configured, defining the contents of any OA unit reports.
 	 *
 	 * This property is available in perf revision 1.
 	 */

tools/include/uapi/linux/in.h

Lines changed: 1 addition & 0 deletions

@@ -123,6 +123,7 @@ struct in_addr {
 #define IP_CHECKSUM 23
 #define IP_BIND_ADDRESS_NO_PORT 24
 #define IP_RECVFRAGSIZE 25
+#define IP_RECVERR_RFC4884 26

 /* IP_MTU_DISCOVER values */
 #define IP_PMTUDISC_DONT 0 /* Never send DF frames */

tools/include/uapi/linux/perf_event.h

Lines changed: 25 additions & 1 deletion

@@ -383,7 +383,8 @@ struct perf_event_attr {
 				bpf_event : 1, /* include bpf events */
 				aux_output : 1, /* generate AUX records instead of events */
 				cgroup : 1, /* include cgroup events */
-				__reserved_1 : 31;
+				text_poke : 1, /* include text poke events */
+				__reserved_1 : 30;

 	union {
 		__u32 wakeup_events; /* wakeup every n events */
@@ -1041,12 +1042,35 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_CGROUP = 19,

+	/*
+	 * Records changes to kernel text i.e. self-modified code. 'old_len' is
+	 * the number of old bytes, 'new_len' is the number of new bytes. Either
+	 * 'old_len' or 'new_len' may be zero to indicate, for example, the
+	 * addition or removal of a trampoline. 'bytes' contains the old bytes
+	 * followed immediately by the new bytes.
+	 *
+	 * struct {
+	 *	struct perf_event_header header;
+	 *	u64 addr;
+	 *	u16 old_len;
+	 *	u16 new_len;
+	 *	u8 bytes[];
+	 *	struct sample_id sample_id;
+	 * };
+	 */
+	PERF_RECORD_TEXT_POKE = 20,
+
 	PERF_RECORD_MAX, /* non-ABI */
 };

 enum perf_record_ksymbol_type {
 	PERF_RECORD_KSYMBOL_TYPE_UNKNOWN = 0,
 	PERF_RECORD_KSYMBOL_TYPE_BPF = 1,
+	/*
+	 * Out of line code such as kprobe-replaced instructions or optimized
+	 * kprobes or ftrace trampolines.
+	 */
+	PERF_RECORD_KSYMBOL_TYPE_OOL = 2,
 	PERF_RECORD_KSYMBOL_TYPE_MAX /* non-ABI */
 };
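
The comment on `PERF_RECORD_TEXT_POKE` says that `bytes` holds the old bytes followed immediately by the new bytes. A minimal sketch of splitting such a payload is shown below. It is not perf's own parser: the `demo_` names are hypothetical, the input is assumed to be the part of the record that follows `struct perf_event_header`, fields are assumed to be in host byte order, and the trailing `sample_id` (plus any padding) is ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical decoded view of a text poke payload. */
struct demo_text_poke_view {
	uint64_t addr;
	uint16_t old_len;
	uint16_t new_len;
	const uint8_t *old_bytes;	/* old_len bytes */
	const uint8_t *new_bytes;	/* new_len bytes, right after the old ones */
};

/*
 * Parse 'payload' (record contents after the header). Returns 0 on success,
 * or -1 if the buffer is too short for the fixed fields or the byte arrays.
 */
static int demo_parse_text_poke(const uint8_t *payload, size_t len,
				struct demo_text_poke_view *out)
{
	const size_t fixed = sizeof(uint64_t) + 2 * sizeof(uint16_t);

	if (len < fixed)
		return -1;

	/* memcpy avoids unaligned loads from the raw buffer. */
	memcpy(&out->addr, payload, sizeof(out->addr));
	memcpy(&out->old_len, payload + 8, sizeof(out->old_len));
	memcpy(&out->new_len, payload + 10, sizeof(out->new_len));

	if (len - fixed < (size_t)out->old_len + out->new_len)
		return -1;

	out->old_bytes = payload + fixed;
	out->new_bytes = out->old_bytes + out->old_len;
	return 0;
}
```

A view with `old_len == 0` would correspond to newly added text and one with `new_len == 0` to removed text, matching the trampoline example in the header comment.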

tools/lib/api/fd/array.c

Lines changed: 14 additions & 9 deletions

@@ -8,6 +8,7 @@
 #include <poll.h>
 #include <stdlib.h>
 #include <unistd.h>
+#include <string.h>

 void fdarray__init(struct fdarray *fda, int nr_autogrow)
 {
@@ -19,7 +20,7 @@ void fdarray__init(struct fdarray *fda, int nr_autogrow)

 int fdarray__grow(struct fdarray *fda, int nr)
 {
-	void *priv;
+	struct priv *priv;
 	int nr_alloc = fda->nr_alloc + nr;
 	size_t psize = sizeof(fda->priv[0]) * nr_alloc;
 	size_t size = sizeof(struct pollfd) * nr_alloc;
@@ -34,6 +35,9 @@ int fdarray__grow(struct fdarray *fda, int nr)
 		return -ENOMEM;
 	}

+	memset(&entries[fda->nr_alloc], 0, sizeof(struct pollfd) * nr);
+	memset(&priv[fda->nr_alloc], 0, sizeof(fda->priv[0]) * nr);
+
 	fda->nr_alloc = nr_alloc;
 	fda->entries = entries;
 	fda->priv = priv;
@@ -69,7 +73,7 @@ void fdarray__delete(struct fdarray *fda)
 	free(fda);
 }

-int fdarray__add(struct fdarray *fda, int fd, short revents)
+int fdarray__add(struct fdarray *fda, int fd, short revents, enum fdarray_flags flags)
 {
 	int pos = fda->nr;

@@ -79,6 +83,7 @@ int fdarray__add(struct fdarray *fda, int fd, short revents)

 	fda->entries[fda->nr].fd = fd;
 	fda->entries[fda->nr].events = revents;
+	fda->priv[fda->nr].flags = flags;
 	fda->nr++;
 	return pos;
 }
@@ -93,22 +98,22 @@ int fdarray__filter(struct fdarray *fda, short revents,
 		return 0;

 	for (fd = 0; fd < fda->nr; ++fd) {
+		if (!fda->entries[fd].events)
+			continue;
+
 		if (fda->entries[fd].revents & revents) {
 			if (entry_destructor)
 				entry_destructor(fda, fd, arg);

+			fda->entries[fd].revents = fda->entries[fd].events = 0;
 			continue;
 		}

-		if (fd != nr) {
-			fda->entries[nr] = fda->entries[fd];
-			fda->priv[nr] = fda->priv[fd];
-		}
-
-		++nr;
+		if (!(fda->priv[fd].flags & fdarray_flag__nonfilterable))
+			++nr;
 	}

-	return nr;
+	return fda->nr = nr;
+	return nr;
 }

 int fdarray__poll(struct fdarray *fda, int timeout)
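
The behavioural change in `fdarray__filter()` is easier to see in isolation: matching entries are no longer compacted out of the array but are disabled in place by zeroing `events`, and entries flagged as non-filterable are left out of the returned count. The sketch below models that logic on a fixed-size array. It is a simplified stand-in, not the library code: the `demo_` types and names are hypothetical, there is no destructor callback, and there is no dynamic growth.

```c
#include <poll.h>

enum demo_fdarray_flags {
	demo_flag__default	 = 0,
	demo_flag__nonfilterable = 1,	/* models fdarray_flag__nonfilterable */
};

#define DEMO_MAX_FDS 8

struct demo_fdarray {
	int nr;
	struct pollfd entries[DEMO_MAX_FDS];
	enum demo_fdarray_flags flags[DEMO_MAX_FDS];
};

/* Append an fd; returns its position, or -1 if the fixed array is full. */
static int demo_fdarray_add(struct demo_fdarray *fda, int fd, short events,
			    enum demo_fdarray_flags flags)
{
	int pos = fda->nr;

	if (pos == DEMO_MAX_FDS)
		return -1;

	fda->entries[pos].fd = fd;
	fda->entries[pos].events = events;
	fda->entries[pos].revents = 0;
	fda->flags[pos] = flags;
	fda->nr++;
	return pos;
}

/*
 * Disable entries whose revents match 'revents' and return how many
 * filterable entries are still active. Positions in the array never move.
 */
static int demo_fdarray_filter(struct demo_fdarray *fda, short revents)
{
	int fd, nr = 0;

	for (fd = 0; fd < fda->nr; ++fd) {
		if (!fda->entries[fd].events)
			continue;	/* already disabled by an earlier call */

		if (fda->entries[fd].revents & revents) {
			fda->entries[fd].revents = fda->entries[fd].events = 0;
			continue;
		}

		if (!(fda->flags[fd] & demo_flag__nonfilterable))
			++nr;
	}

	return nr;
}
```

Because positions stay stable, an index returned by the add function remains valid after filtering, which the old compacting version did not guarantee.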
