Commit 48a577d

Merge tag 'perf-tools-for-v6.0-2022-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:

 - Introduce 'perf lock contention' subtool, using new lock contention
   tracepoints and using BPF for in-kernel aggregation and then
   userspace processing using the perf tooling infrastructure for
   resolving symbols, target specification, etc.

   Since the new lock contention tracepoints don't provide lock names,
   get up to 8 stack traces and display the first non-lock function
   symbol name as a caller:

     $ perf lock report -F acquired,contended,avg_wait,wait_total

                     Name   acquired  contended     avg wait    total wait

      update_blocked_a...         40         40      3.61 us     144.45 us
      kernfs_fop_open+...          5          5      3.64 us      18.18 us
       _nohz_idle_balance          3          3      2.65 us       7.95 us
      tick_do_update_j...          1          1      6.04 us       6.04 us
       ep_scan_ready_list          1          1      3.93 us       3.93 us

   Supports the usual 'perf record' + 'perf report' workflow as well as
   a BCC/bpftrace like mode where you start the tool and then press
   control+C to get results:

     $ sudo perf lock contention -b
     ^C
      contended   total wait     max wait     avg wait         type   caller

             42    192.67 us     13.64 us      4.59 us     spinlock   queue_work_on+0x20
             23     85.54 us     10.28 us      3.72 us     spinlock   worker_thread+0x14a
              6     13.92 us      6.51 us      2.32 us        mutex   kernfs_iop_permission+0x30
              3     11.59 us     10.04 us      3.86 us        mutex   kernfs_dop_revalidate+0x3c
              1      7.52 us      7.52 us      7.52 us     spinlock   kthread+0x115
              1      7.24 us      7.24 us      7.24 us     rwlock:W   sys_epoll_wait+0x148
              2      7.08 us      3.99 us      3.54 us     spinlock   delayed_work_timer_fn+0x1b
              1      6.41 us      6.41 us      6.41 us     spinlock   idle_balance+0xa06
              2      2.50 us      1.83 us      1.25 us        mutex   kernfs_iop_lookup+0x2f
              1      1.71 us      1.71 us      1.71 us        mutex   kernfs_iop_getattr+0x2c
     ...

 - Add new 'perf kwork' tool to trace time properties of kernel work
   (such as softirq, and workqueue), uses eBPF skeletons to collect info
   in kernel space, aggregating data that then gets processed by the
   userspace tool, e.g.:

     # perf kwork report

      Kwork Name      | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
     ---------------------------------------------------------------------------------------------------
      nvme0q5:130     | 004 |      1.101 ms |    49 |    0.051 ms |    26035.056403 s |  26035.056455 s |
      amdgpu:162      | 002 |      0.176 ms |     9 |    0.046 ms |    26035.268020 s |  26035.268066 s |
      nvme0q24:149    | 023 |      0.161 ms |    55 |    0.009 ms |    26035.655280 s |  26035.655288 s |
      nvme0q20:145    | 019 |      0.090 ms |    33 |    0.014 ms |    26035.939018 s |  26035.939032 s |
      nvme0q31:156    | 030 |      0.075 ms |    21 |    0.010 ms |    26035.052237 s |  26035.052247 s |
      nvme0q8:133     | 007 |      0.062 ms |    12 |    0.021 ms |    26035.416840 s |  26035.416861 s |
      nvme0q6:131     | 005 |      0.054 ms |    22 |    0.010 ms |    26035.199919 s |  26035.199929 s |
      nvme0q19:144    | 018 |      0.052 ms |    14 |    0.010 ms |    26035.110615 s |  26035.110625 s |
      nvme0q7:132     | 006 |      0.049 ms |    13 |    0.007 ms |    26035.125180 s |  26035.125187 s |
      nvme0q18:143    | 017 |      0.033 ms |    14 |    0.007 ms |    26035.169698 s |  26035.169705 s |
      nvme0q17:142    | 016 |      0.013 ms |     1 |    0.013 ms |    26035.565147 s |  26035.565160 s |
      enp5s0-rx-0:164 | 006 |      0.004 ms |     4 |    0.002 ms |    26035.928882 s |  26035.928884 s |
      enp5s0-tx-0:166 | 008 |      0.003 ms |     3 |    0.002 ms |    26035.870923 s |  26035.870925 s |
     ---------------------------------------------------------------------------------------------------

   See commit log messages for more examples with extra options to limit
   the events time window, etc.

 - Add support for new AMD IBS (Instruction Based Sampling) features:

   With the DataSrc extensions, the source of data can be decoded among:
     - Local L3 or other L1/L2 in CCX.
     - A peer cache in a near CCX.
     - Data returned from DRAM.
     - A peer cache in a far CCX.
     - DRAM address map with "long latency" bit set.
     - Data returned from MMIO/Config/PCI/APIC.
     - Extension Memory (S-Link, GenZ, etc - identified by the CS target
       and/or address map at DF's choice).
     - Peer Agent Memory.

 - Support hardware tracing with Intel PT on guest machines, combining
   the traces with the ones in the host machine.

 - Add a "-m" option to 'perf buildid-list' to show kernel and modules
   build-ids, to display all of the information needed to do external
   symbolization of kernel stack traces, such as those collected by
   bpf_get_stackid().

 - Add arch TSC frequency information to perf.data file headers.

 - Handle changes in the binutils disassembler function signatures in
   perf, bpftool and bpf_jit_disasm (Acked by the bpftool maintainer).

 - Fix building the perf perl binding with the newest gcc in distros
   such as fedora rawhide, where some new warnings were breaking the
   build as perf uses -Werror.

 - Add 'perf test' entry for branch stack sampling.

 - Add ARM SPE system wide 'perf test' entry.

 - Add user space counter reading tests to 'perf test'.

 - Build with python3 by default, if available.

 - Add python converter script for the vendor JSON event files.

 - Update vendor JSON files for most Intel cores.

 - Add vendor JSON File for Intel meteorlake.

 - Add Arm Cortex-A78C and X1C JSON vendor event files.

 - Add workaround to symbol address reading from ELF files without phdr,
   falling back to the previous equation.

 - Convert legacy map definition to BTF-defined in the perf BPF script
   test.

 - Rework prologue generation code to stop using libbpf deprecated APIs.

 - Add default hybrid events for 'perf stat' on x86.

 - Add topdown metrics in the default 'perf stat' on the hybrid machines
   (big/little cores).

 - Prefer sampled CPU when exporting JSON in 'perf data convert'.

 - Fix ('perf stat CSV output linter') and ("Check branch stack
   sampling") 'perf test' entries on s390.

* tag 'perf-tools-for-v6.0-2022-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (169 commits)
  perf stat: Refactor __run_perf_stat() common code
  perf lock: Print the number of lost entries for BPF
  perf lock: Add --map-nr-entries option
  perf lock: Introduce struct lock_contention
  perf scripting python: Do not build fail on deprecation warnings
  genelf: Use HAVE_LIBCRYPTO_SUPPORT, not the never defined HAVE_LIBCRYPTO
  perf build: Suppress openssl v3 deprecation warnings in libcrypto feature test
  perf parse-events: Break out tracepoint and printing
  perf parse-events: Don't #define YY_EXTRA_TYPE
  tools bpftool: Don't display disassembler-four-args feature test
  tools bpftool: Fix compilation error with new binutils
  tools bpf_jit_disasm: Don't display disassembler-four-args feature test
  tools bpf_jit_disasm: Fix compilation error with new binutils
  tools perf: Fix compilation error with new binutils
  tools include: add dis-asm-compat.h to handle version differences
  tools build: Don't display disassembler-four-args feature test
  tools build: Add feature test for init_disassemble_info API changes
  perf test: Add ARM SPE system wide test
  perf tools: Rework prologue generation code
  perf bpf: Convert legacy map definition to BTF-defined
  ...

2 parents 033a944 + bb8bc52 commit 48a577d

390 files changed

+98334
-11922
lines changed

tools/arch/x86/include/asm/amd-ibs.h

Lines changed: 10 additions & 6 deletions
@@ -29,7 +29,10 @@ union ibs_fetch_ctl {
 			rand_en:1,	/* 57: random tagging enable */
 			fetch_l2_miss:1,/* 58: L2 miss for sampled fetch
 					 *      (needs IbsFetchComp) */
-			reserved:5;	/* 59-63: reserved */
+			l3_miss_only:1,	/* 59: Collect L3 miss samples only */
+			fetch_oc_miss:1,/* 60: Op cache miss for the sampled fetch */
+			fetch_l3_miss:1,/* 61: L3 cache miss for the sampled fetch */
+			reserved:2;	/* 62-63: reserved */
 	};
 };
 
@@ -38,14 +41,14 @@ union ibs_op_ctl {
 	__u64 val;
 	struct {
 		__u64	opmaxcnt:16,	/* 0-15: periodic op max. count */
-			reserved0:1,	/* 16: reserved */
+			l3_miss_only:1,	/* 16: Collect L3 miss samples only */
 			op_en:1,	/* 17: op sampling enable */
 			op_val:1,	/* 18: op sample valid */
 			cnt_ctl:1,	/* 19: periodic op counter control */
 			opmaxcnt_ext:7,	/* 20-26: upper 7 bits of periodic op maximum count */
-			reserved1:5,	/* 27-31: reserved */
+			reserved0:5,	/* 27-31: reserved */
 			opcurcnt:27,	/* 32-58: periodic op counter current count */
-			reserved2:5;	/* 59-63: reserved */
+			reserved1:5;	/* 59-63: reserved */
 	};
 };
 
@@ -71,11 +74,12 @@ union ibs_op_data {
 union ibs_op_data2 {
 	__u64 val;
 	struct {
-		__u64	data_src:3,	/* 0-2: data source */
+		__u64	data_src_lo:3,	/* 0-2: data source low */
 			reserved0:1,	/* 3: reserved */
 			rmt_node:1,	/* 4: destination node */
 			cache_hit_st:1,	/* 5: cache hit state */
-			reserved1:57;	/* 5-63: reserved */
+			data_src_hi:2,	/* 6-7: data source high */
+			reserved1:56;	/* 8-63: reserved */
 	};
 };

tools/bpf/Makefile

Lines changed: 5 additions & 2 deletions
@@ -34,8 +34,8 @@ else
 endif
 
 FEATURE_USER = .bpf
-FEATURE_TESTS = libbfd disassembler-four-args
-FEATURE_DISPLAY = libbfd disassembler-four-args
+FEATURE_TESTS = libbfd disassembler-four-args disassembler-init-styled
+FEATURE_DISPLAY = libbfd
 
 check_feat := 1
 NON_CHECK_FEAT_TARGETS := clean bpftool_clean runqslower_clean resolve_btfids_clean
@@ -56,6 +56,9 @@ endif
 ifeq ($(feature-disassembler-four-args), 1)
 CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE
 endif
+ifeq ($(feature-disassembler-init-styled), 1)
+CFLAGS += -DDISASM_INIT_STYLED
+endif
 
 $(OUTPUT)%.yacc.c: $(srctree)/tools/bpf/%.y
 	$(QUIET_BISON)$(YACC) -o $@ -d $<

tools/bpf/bpf_jit_disasm.c

Lines changed: 4 additions & 1 deletion
@@ -28,6 +28,7 @@
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <limits.h>
+#include <tools/dis-asm-compat.h>
 
 #define CMD_ACTION_SIZE_BUFFER 10
 #define CMD_ACTION_READ_ALL 3
@@ -64,7 +65,9 @@ static void get_asm_insns(uint8_t *image, size_t len, int opcodes)
 	assert(bfdf);
 	assert(bfd_check_format(bfdf, bfd_object));
 
-	init_disassemble_info(&info, stdout, (fprintf_ftype) fprintf);
+	init_disassemble_info_compat(&info, stdout,
+				     (fprintf_ftype) fprintf,
+				     fprintf_styled);
 	info.arch = bfd_get_arch(bfdf);
 	info.mach = bfd_get_mach(bfdf);
 	info.buffer = image;

tools/bpf/bpftool/Makefile

Lines changed: 6 additions & 2 deletions
@@ -93,8 +93,9 @@ INSTALL ?= install
 RM ?= rm -f
 
 FEATURE_USER = .bpftool
-FEATURE_TESTS = libbfd disassembler-four-args libcap clang-bpf-co-re
-FEATURE_DISPLAY = libbfd disassembler-four-args libcap clang-bpf-co-re
+FEATURE_TESTS = libbfd disassembler-four-args disassembler-init-styled libcap \
+	clang-bpf-co-re
+FEATURE_DISPLAY = libbfd libcap clang-bpf-co-re
 
 check_feat := 1
 NON_CHECK_FEAT_TARGETS := clean uninstall doc doc-clean doc-install doc-uninstall
@@ -115,6 +116,9 @@ endif
 ifeq ($(feature-disassembler-four-args), 1)
 CFLAGS += -DDISASM_FOUR_ARGS_SIGNATURE
 endif
+ifeq ($(feature-disassembler-init-styled), 1)
+CFLAGS += -DDISASM_INIT_STYLED
+endif
 
 LIBS = $(LIBBPF) -lelf -lz
 LIBS_BOOTSTRAP = $(LIBBPF_BOOTSTRAP) -lelf -lz

tools/bpf/bpftool/jit_disasm.c

Lines changed: 34 additions & 8 deletions
@@ -24,6 +24,7 @@
 #include <sys/stat.h>
 #include <limits.h>
 #include <bpf/libbpf.h>
+#include <tools/dis-asm-compat.h>
 
 #include "json_writer.h"
 #include "main.h"
@@ -39,15 +40,12 @@ static void get_exec_path(char *tpath, size_t size)
 }
 
 static int oper_count;
-static int fprintf_json(void *out, const char *fmt, ...)
+static int printf_json(void *out, const char *fmt, va_list ap)
 {
-	va_list ap;
 	char *s;
 	int err;
 
-	va_start(ap, fmt);
 	err = vasprintf(&s, fmt, ap);
-	va_end(ap);
 	if (err < 0)
 		return -1;
 
@@ -73,6 +71,32 @@ static int fprintf_json(void *out, const char *fmt, ...)
 	return 0;
 }
 
+static int fprintf_json(void *out, const char *fmt, ...)
+{
+	va_list ap;
+	int r;
+
+	va_start(ap, fmt);
+	r = printf_json(out, fmt, ap);
+	va_end(ap);
+
+	return r;
+}
+
+static int fprintf_json_styled(void *out,
+			       enum disassembler_style style __maybe_unused,
+			       const char *fmt, ...)
+{
+	va_list ap;
+	int r;
+
+	va_start(ap, fmt);
+	r = printf_json(out, fmt, ap);
+	va_end(ap);
+
+	return r;
+}
+
 void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 		       const char *arch, const char *disassembler_options,
 		       const struct btf *btf,
@@ -99,11 +123,13 @@ void disasm_print_insn(unsigned char *image, ssize_t len, int opcodes,
 	assert(bfd_check_format(bfdf, bfd_object));
 
 	if (json_output)
-		init_disassemble_info(&info, stdout,
-				      (fprintf_ftype) fprintf_json);
+		init_disassemble_info_compat(&info, stdout,
+					     (fprintf_ftype) fprintf_json,
+					     fprintf_json_styled);
 	else
-		init_disassemble_info(&info, stdout,
-				      (fprintf_ftype) fprintf);
+		init_disassemble_info_compat(&info, stdout,
+					     (fprintf_ftype) fprintf,
+					     fprintf_styled);
 
 	/* Update architecture info for offload. */
 	if (arch) {

tools/build/Makefile.feature

Lines changed: 2 additions & 2 deletions
@@ -70,6 +70,7 @@ FEATURE_TESTS_BASIC := \
         libaio \
         libzstd \
         disassembler-four-args \
+        disassembler-init-styled \
         file-handle
 
 # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
@@ -134,8 +135,7 @@ FEATURE_DISPLAY ?= \
          get_cpuid \
          bpf \
          libaio \
-         libzstd \
-         disassembler-four-args
+         libzstd
 
 # Set FEATURE_CHECK_(C|LD)FLAGS-all for all FEATURE_TESTS features.
 # If in the future we need per-feature checks/flags for features not

tools/build/feature/Makefile

Lines changed: 4 additions & 0 deletions
@@ -18,6 +18,7 @@ FILES= \
          test-libbfd.bin \
          test-libbfd-buildid.bin \
          test-disassembler-four-args.bin \
+         test-disassembler-init-styled.bin \
          test-reallocarray.bin \
          test-libbfd-liberty.bin \
          test-libbfd-liberty-z.bin \
@@ -248,6 +249,9 @@ $(OUTPUT)test-libbfd-buildid.bin:
 $(OUTPUT)test-disassembler-four-args.bin:
 	$(BUILD) -DPACKAGE='"perf"' -lbfd -lopcodes
 
+$(OUTPUT)test-disassembler-init-styled.bin:
+	$(BUILD) -DPACKAGE='"perf"' -lbfd -lopcodes
+
 $(OUTPUT)test-reallocarray.bin:
 	$(BUILD)

tools/build/feature/test-all.c

Lines changed: 4 additions & 0 deletions
@@ -166,6 +166,10 @@
 # include "test-disassembler-four-args.c"
 #undef main
 
+#define main main_test_disassembler_init_styled
+# include "test-disassembler-init-styled.c"
+#undef main
+
 #define main main_test_libzstd
 # include "test-libzstd.c"
 #undef main

tools/build/feature/test-disassembler-init-styled.c

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdio.h>
+#include <dis-asm.h>
+
+int main(void)
+{
+	struct disassemble_info info;
+
+	init_disassemble_info(&info, stdout,
+			      NULL, NULL);
+
+	return 0;
+}

tools/build/feature/test-libcrypto.c

Lines changed: 6 additions & 0 deletions
@@ -2,6 +2,12 @@
 #include <openssl/sha.h>
 #include <openssl/md5.h>
 
+/*
+ * The MD5_* API have been deprecated since OpenSSL 3.0, which causes the
+ * feature test to fail silently. This is a workaround.
+ */
+#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
+
 int main(void)
 {
 	MD5_CTX context;
