
Commit 174e719

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more perf updates from Thomas Gleixner:
 "A rather large set of perf updates:

  Kernel:

   - Fix various initialization issues

   - Prevent creating [ku]probes for not CAP_SYS_ADMIN users

  Tooling:

   - Show only failing syscalls with 'perf trace --failure' (Arnaldo
     Carvalho de Melo)

     e.g: See what 'openat' syscalls are failing:

        # perf trace --failure -e openat
         762.323 ( 0.007 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video2) = -1 ENOENT No such file or directory
         <SNIP N /dev/videoN open attempts... sigh, where is that improvised camera lid?!? >
         790.228 ( 0.008 ms): VideoCapture/4566 openat(dfd: CWD, filename: /dev/video63) = -1 ENOENT No such file or directory
        ^C#

   - Show information about the event (freq, nr_samples, total
     period/nr_events) in the annotate --tui and --stdio2 'perf
     annotate' output, similar to the first line in the 'perf report
     --tui', but just for the samples for the annotated symbol
     (Arnaldo Carvalho de Melo)

   - Introduce 'perf version --build-options' to show what features
     were linked, also available as the shorter alias 'perf -vv'
     (Jin Yao)

   - Add a "dso_size" sort order (Kim Phillips)

   - Remove redundant ')' in the tracepoint output in 'perf trace'
     (Changbin Du)

   - Synchronize x86's cpufeatures.h, no effect on tools (Arnaldo
     Carvalho de Melo)

   - Show group details on the title line in the annotate browser and
     'perf annotate --stdio2' output, so that the per-event columns can
     have headers (Arnaldo Carvalho de Melo)

   - Fixup vertical line separating metrics from instructions and
     cleaning unused lines at the bottom, both in the annotate TUI
     browser (Arnaldo Carvalho de Melo)

   - Remove duplicated 'samples' in lost samples warning in
     'perf report' (Arnaldo Carvalho de Melo)

   - Synchronize i915_drm.h, silencing the perf build process,
     automagically adding support for the new DRM_I915_QUERY ioctl
     (Arnaldo Carvalho de Melo)

   - Make auxtrace_queues__add_buffer() allocate struct buffer, from a
     patchkit already applied (Adrian Hunter)

   - Fix the --stdio2/TUI annotate output to include group details, be
     it for a recorded '{a,b,f}' explicit event group or when forcing
     group display using 'perf report --group' for a set of events not
     recorded as a group (Arnaldo Carvalho de Melo)

   - Fix display artifacts in the ui browser (base class for the
     annotate and main report/top TUI browser) related to the extra
     title lines work (Arnaldo Carvalho de Melo)

   - perf auxtrace refactorings, leftovers from a previously partially
     processed patchset (Adrian Hunter)

   - Fix the builtin clang build (Sandipan Das, Arnaldo Carvalho de
     Melo)

   - Synchronize i915_drm.h, silencing a perf build warning and in the
     process automagically adding support for a new ioctl command
     (Arnaldo Carvalho de Melo)

   - Fix a strncpy issue in uprobe tracing"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  perf/core: Need CAP_SYS_ADMIN to create k/uprobe with perf_event_open()
  tracing/uprobe_event: Fix strncpy corner case
  perf/core: Fix perf_uprobe_init()
  perf/core: Fix perf_kprobe_init()
  perf/core: Fix use-after-free in uprobe_perf_close()
  perf tests clang: Fix function name for clang IR test
  perf clang: Add support for recent clang versions
  perf tools: Fix perf builds with clang support
  perf tools: No need to include namespaces.h in util.h
  perf hists browser: Remove leftover from row returned from refresh
  perf hists browser: Show extra_title_lines in the 'D' debug hotkey
  perf auxtrace: Make auxtrace_queues__add_buffer() do CPU filtering
  tools headers uapi: Synchronize i915_drm.h
  perf report: Remove duplicated 'samples' in lost samples warning
  perf ui browser: Fixup cleaning unused lines at the bottom
  perf annotate browser: Fixup vertical line separating metrics from instructions
  perf annotate: Show group details on the title line
  perf auxtrace: Make auxtrace_queues__add_buffer() allocate struct buffer
  perf/x86/intel: Move regs->flags EXACT bit init
  perf trace: Remove redundant ')'
  ...
2 parents 19ca90d + 32e6e96 commit 174e719

33 files changed: 629 additions & 187 deletions

arch/x86/events/intel/ds.c

Lines changed: 24 additions & 10 deletions
@@ -1153,7 +1153,6 @@ static void setup_pebs_sample_data(struct perf_event *event,
 	if (pebs == NULL)
 		return;
 
-	regs->flags &= ~PERF_EFLAGS_EXACT;
 	sample_type = event->attr.sample_type;
 	dsrc = sample_type & PERF_SAMPLE_DATA_SRC;
 
@@ -1197,7 +1196,13 @@ static void setup_pebs_sample_data(struct perf_event *event,
 	 * and PMI.
 	 */
 	*regs = *iregs;
-	regs->flags = pebs->flags;
+
+	/*
+	 * Initialize regs_>flags from PEBS,
+	 * Clear exact bit (which uses x86 EFLAGS Reserved bit 3),
+	 * i.e., do not rely on it being zero:
+	 */
+	regs->flags = pebs->flags & ~PERF_EFLAGS_EXACT;
 
 	if (sample_type & PERF_SAMPLE_REGS_INTR) {
 		regs->ax = pebs->ax;
@@ -1217,10 +1222,6 @@ static void setup_pebs_sample_data(struct perf_event *event,
 			regs->sp = pebs->sp;
 		}
 
-		/*
-		 * Preserve PERF_EFLAGS_VM from set_linear_ip().
-		 */
-		regs->flags = pebs->flags | (regs->flags & PERF_EFLAGS_VM);
 #ifndef CONFIG_X86_32
 		regs->r8 = pebs->r8;
 		regs->r9 = pebs->r9;
@@ -1234,20 +1235,33 @@ static void setup_pebs_sample_data(struct perf_event *event,
 	}
 
 	if (event->attr.precise_ip > 1) {
-		/* Haswell and later have the eventing IP, so use it: */
+		/*
+		 * Haswell and later processors have an 'eventing IP'
+		 * (real IP) which fixes the off-by-1 skid in hardware.
+		 * Use it when precise_ip >= 2 :
+		 */
 		if (x86_pmu.intel_cap.pebs_format >= 2) {
 			set_linear_ip(regs, pebs->real_ip);
 			regs->flags |= PERF_EFLAGS_EXACT;
 		} else {
-			/* Otherwise use PEBS off-by-1 IP: */
+			/* Otherwise, use PEBS off-by-1 IP: */
 			set_linear_ip(regs, pebs->ip);
 
-			/* ... and try to fix it up using the LBR entries: */
+			/*
+			 * With precise_ip >= 2, try to fix up the off-by-1 IP
+			 * using the LBR. If successful, the fixup function
+			 * corrects regs->ip and calls set_linear_ip() on regs:
+			 */
 			if (intel_pmu_pebs_fixup_ip(regs))
 				regs->flags |= PERF_EFLAGS_EXACT;
 		}
-	} else
+	} else {
+		/*
+		 * When precise_ip == 1, return the PEBS off-by-1 IP,
+		 * no fixup attempted:
+		 */
 		set_linear_ip(regs, pebs->ip);
+	}
 
 
 	if ((sample_type & (PERF_SAMPLE_ADDR | PERF_SAMPLE_PHYS_ADDR)) &&
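
The decision tree that the new comments describe can be read in isolation. Below is a minimal userspace sketch of it, not the kernel code: `sketch_pebs`, `sketch_regs`, `sketch_pick_ip()` and `SKETCH_EFLAGS_EXACT` are hypothetical names, the exact flag is assumed to be bit 3 as the comment in the hunk states, and the `lbr_fixup_ok` parameter merely stands in for the return value of `intel_pmu_pebs_fixup_ip()` (the real function also rewrites the IP, which is not modelled here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the exact flag uses EFLAGS reserved bit 3, per the hunk's comment. */
#define SKETCH_EFLAGS_EXACT ((uint64_t)1 << 3)

struct sketch_pebs { uint64_t flags, ip, real_ip; };	/* hypothetical PEBS subset */
struct sketch_regs { uint64_t flags, ip; };		/* hypothetical pt_regs subset */

static void sketch_pick_ip(struct sketch_regs *regs,
			   const struct sketch_pebs *pebs,
			   int precise_ip, int pebs_format, bool lbr_fixup_ok)
{
	assert(regs != NULL && pebs != NULL);

	/* Start from the hardware flags with the exact bit cleared. */
	regs->flags = pebs->flags & ~SKETCH_EFLAGS_EXACT;

	if (precise_ip > 1) {
		if (pebs_format >= 2) {
			/* Eventing IP available: no skid, mark as exact. */
			regs->ip = pebs->real_ip;
			regs->flags |= SKETCH_EFLAGS_EXACT;
		} else {
			/* Off-by-1 IP; exact only if the LBR fixup worked. */
			regs->ip = pebs->ip;
			if (lbr_fixup_ok)
				regs->flags |= SKETCH_EFLAGS_EXACT;
		}
	} else {
		/* precise_ip == 1: off-by-1 IP, no fixup attempted. */
		regs->ip = pebs->ip;
	}
}
```

The point of the reordering in the patch is visible here: the exact bit is cleared once, at the place where the flags are first copied, so every later branch only ever sets it.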

kernel/events/core.c

Lines changed: 14 additions & 0 deletions
@@ -4447,6 +4447,9 @@ static void _free_event(struct perf_event *event)
 	if (event->ctx)
 		put_ctx(event->ctx);
 
+	if (event->hw.target)
+		put_task_struct(event->hw.target);
+
 	exclusive_event_destroy(event);
 	module_put(event->pmu->module);
 
@@ -8397,6 +8400,10 @@ static int perf_kprobe_event_init(struct perf_event *event)
 
 	if (event->attr.type != perf_kprobe.type)
 		return -ENOENT;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
 	/*
 	 * no branch sampling for probe events
 	 */
@@ -8434,6 +8441,10 @@ static int perf_uprobe_event_init(struct perf_event *event)
 
 	if (event->attr.type != perf_uprobe.type)
 		return -ENOENT;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
 	/*
 	 * no branch sampling for probe events
 	 */
@@ -9955,6 +9966,7 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 		 * and we cannot use the ctx information because we need the
 		 * pmu before we get a ctx.
 		 */
+		get_task_struct(task);
 		event->hw.target = task;
 	}
 
@@ -10070,6 +10082,8 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
 		perf_detach_cgroup(event);
 	if (event->ns)
 		put_pid_ns(event->ns);
+	if (event->hw.target)
+		put_task_struct(event->hw.target);
 	kfree(event);
 
 	return ERR_PTR(err);
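
The three reference-counting hunks in this file follow one rule: whoever stores the task pointer takes a reference, and every path that discards the event drops it. A minimal sketch of that pairing follows, using a plain integer counter instead of the kernel's atomic one; `sketch_task`, `sketch_event` and the helper functions are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_task { int usage; };			/* hypothetical refcounted object */
struct sketch_event { struct sketch_task *target; };	/* hypothetical event */

static void sketch_get_task(struct sketch_task *t)
{
	t->usage++;
}

static void sketch_put_task(struct sketch_task *t)
{
	assert(t->usage > 0);	/* a put without a matching get is a bug */
	t->usage--;
}

/* Storing the pointer pins the task, as in the perf_event_alloc() hunk. */
static void sketch_event_attach(struct sketch_event *ev, struct sketch_task *t)
{
	ev->target = NULL;
	if (t != NULL) {
		sketch_get_task(t);
		ev->target = t;
	}
}

/* Both the error path and the normal free path must drop the reference. */
static void sketch_event_free(struct sketch_event *ev)
{
	if (ev->target != NULL) {
		sketch_put_task(ev->target);
		ev->target = NULL;
	}
}
```

Without the reference taken at attach time, the task could be freed while the event still points at it, which is the use-after-free named in the commit list.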

kernel/trace/trace_event_perf.c

Lines changed: 4 additions & 0 deletions
@@ -252,6 +252,8 @@ int perf_kprobe_init(struct perf_event *p_event, bool is_retprobe)
 		ret = strncpy_from_user(
 			func, u64_to_user_ptr(p_event->attr.kprobe_func),
 			KSYM_NAME_LEN);
+		if (ret == KSYM_NAME_LEN)
+			ret = -E2BIG;
 		if (ret < 0)
 			goto out;
 
@@ -300,6 +302,8 @@ int perf_uprobe_init(struct perf_event *p_event, bool is_retprobe)
 		return -ENOMEM;
 	ret = strncpy_from_user(
 		path, u64_to_user_ptr(p_event->attr.uprobe_path), PATH_MAX);
+	if (ret == PATH_MAX)
+		return -E2BIG;
 	if (ret < 0)
 		goto out;
 	if (path[0] == '\0') {
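
Both hunks rely on the same convention: a bounded string copy that returns the full buffer size found no terminating NUL within the limit, so the input is too long. The sketch below imitates that convention in userspace; `sketch_strncpy_from()` is a hypothetical stand-in for `strncpy_from_user()` (it copies from ordinary memory and cannot fault), and `sketch_copy_name()` is likewise an illustrative name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in: copy at most count bytes and return
 * the string length (excluding the NUL), or count if no NUL was seen.
 */
static long sketch_strncpy_from(char *dst, const char *src, long count)
{
	long i;

	assert(dst != NULL && src != NULL && count >= 0);
	for (i = 0; i < count; i++) {
		dst[i] = src[i];
		if (src[i] == '\0')
			return i;
	}
	return count;
}

/* Reject names that fill the whole buffer, as the hunks above do. */
static long sketch_copy_name(char *dst, const char *src, long size)
{
	long ret = sketch_strncpy_from(dst, src, size);

	if (ret == size)
		return -E2BIG;	/* no room left for the terminator */
	return ret;
}
```

One difference between the two hunks is worth noticing: the kprobe variant sets `ret` and falls through to `goto out`, while the uprobe variant returns directly. Whether the direct return skips cleanup depends on code outside the lines shown.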

kernel/trace/trace_uprobe.c

Lines changed: 2 additions & 0 deletions
@@ -151,6 +151,8 @@ static void FETCH_FUNC_NAME(memory, string)(struct pt_regs *regs,
 		return;
 
 	ret = strncpy_from_user(dst, src, maxlen);
+	if (ret == maxlen)
+		dst[--ret] = '\0';
 
 	if (ret < 0) {	/* Failed to fetch string */
 		((u8 *)get_rloc_data(dest))[0] = '\0';
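
Unlike the perf init paths, the uprobe fetch function does not reject an over-long string; it truncates it and forces a terminator into the last byte. A minimal sketch of that behaviour, assuming ordinary memory instead of user memory and using the hypothetical name `sketch_fetch_string()`:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy up to maxlen bytes; if no NUL was found within maxlen, overwrite
 * the last copied byte with a NUL so the result is always terminated.
 * Returns the length of the stored string, excluding the terminator.
 */
static long sketch_fetch_string(char *dst, const char *src, long maxlen)
{
	long ret = 0;

	assert(dst != NULL && src != NULL && maxlen > 0);
	while (ret < maxlen) {
		dst[ret] = src[ret];
		if (src[ret] == '\0')
			break;
		ret++;
	}
	if (ret == maxlen)
		dst[--ret] = '\0';	/* same corner case as the hunk above */
	return ret;
}
```

The pre-decrement matters: it both lands on the final valid index and makes the returned length agree with the truncated string.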

tools/arch/x86/include/asm/cpufeatures.h

Lines changed: 2 additions & 0 deletions
@@ -316,6 +316,7 @@
 #define X86_FEATURE_VPCLMULQDQ (16*32+10) /* Carry-Less Multiplication Double Quadword */
 #define X86_FEATURE_AVX512_VNNI (16*32+11) /* Vector Neural Network Instructions */
 #define X86_FEATURE_AVX512_BITALG (16*32+12) /* Support for VPOPCNT[B,W] and VPSHUF-BITQMB instructions */
+#define X86_FEATURE_TME (16*32+13) /* Intel Total Memory Encryption */
 #define X86_FEATURE_AVX512_VPOPCNTDQ (16*32+14) /* POPCNT for vectors of DW/QW */
 #define X86_FEATURE_LA57 (16*32+16) /* 5-level page tables */
 #define X86_FEATURE_RDPID (16*32+22) /* RDPID instruction */
@@ -328,6 +329,7 @@
 /* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
 #define X86_FEATURE_AVX512_4VNNIW (18*32+ 2) /* AVX-512 Neural Network Instructions */
 #define X86_FEATURE_AVX512_4FMAPS (18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
 #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
 #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
 #define X86_FEATURE_ARCH_CAPABILITIES (18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */
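
Each feature constant packs a 32-bit capability word index and a bit position as `word*32 + bit`. The sketch below decodes that encoding and tests a bit in a caller-supplied word array; the `SKETCH_` macros copy the two values added in this diff, while the `caps` array layout and the helper names are assumptions made for illustration, not the kernel's own accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the two added lines above. */
#define SKETCH_FEATURE_TME	(16*32+13)
#define SKETCH_FEATURE_PCONFIG	(18*32+18)

static unsigned int sketch_feature_word(unsigned int feature)
{
	return feature / 32;	/* which 32-bit capability word */
}

static unsigned int sketch_feature_bit(unsigned int feature)
{
	return feature % 32;	/* which bit inside that word */
}

/* Hypothetical lookup in an array of nwords capability words. */
static int sketch_has_feature(const uint32_t *caps, unsigned int nwords,
			      unsigned int feature)
{
	unsigned int word = sketch_feature_word(feature);

	assert(caps != NULL);
	if (word >= nwords)
		return 0;
	return (caps[word] >> sketch_feature_bit(feature)) & 1;
}
```

For example, `SKETCH_FEATURE_TME` decodes to word 16, bit 13, matching the `(16*32+13)` spelling in the header.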

tools/include/tools/config.h

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _TOOLS_CONFIG_H
+#define _TOOLS_CONFIG_H
+
+/* Subset of include/linux/kconfig.h */
+
+#define __ARG_PLACEHOLDER_1 0,
+#define __take_second_arg(__ignored, val, ...) val
+
+/*
+ * Helper macros to use CONFIG_ options in C/CPP expressions. Note that
+ * these only work with boolean and tristate options.
+ */
+
+/*
+ * Getting something that works in C and CPP for an arg that may or may
+ * not be defined is tricky. Here, if we have "#define CONFIG_BOOGER 1"
+ * we match on the placeholder define, insert the "0," for arg1 and generate
+ * the triplet (0, 1, 0). Then the last step cherry picks the 2nd arg (a one).
+ * When CONFIG_BOOGER is not defined, we generate a (... 1, 0) pair, and when
+ * the last step cherry picks the 2nd arg, we get a zero.
+ */
+#define __is_defined(x) ___is_defined(x)
+#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val)
+#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)
+
+/*
+ * IS_BUILTIN(CONFIG_FOO) evaluates to 1 if CONFIG_FOO is set to 'y', 0
+ * otherwise. For boolean options, this is equivalent to
+ * IS_ENABLED(CONFIG_FOO).
+ */
+#define IS_BUILTIN(option) __is_defined(option)
+
+#endif /* _TOOLS_CONFIG_H */
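
The placeholder trick described in the comment can be exercised directly. The sketch below repeats the macros from the new header and applies them to two made-up option names, `CONFIG_SKETCH_ON` (defined to 1) and `CONFIG_SKETCH_OFF` (left undefined); both names and the helper `sketch_count_builtin()` are hypothetical. The undefined case calls a variadic macro with nothing for `...`, which GCC and Clang accept by default but a strictly pedantic pre-C23 build may warn about.

```c
#include <assert.h>

/* Macros repeated from the header above so the example stands alone. */
#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val
#define __is_defined(x) ___is_defined(x)
#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val)
#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)
#define IS_BUILTIN(option) __is_defined(option)

/* Hypothetical options: one enabled, one deliberately left undefined. */
#define CONFIG_SKETCH_ON 1

/*
 * Enabled:   __ARG_PLACEHOLDER_1 expands to "0,", giving (0, 1, 0) -> 1.
 * Undefined: the pasted token is junk, giving (junk 1, 0)          -> 0.
 */
static int sketch_count_builtin(void)
{
	return IS_BUILTIN(CONFIG_SKETCH_ON) + IS_BUILTIN(CONFIG_SKETCH_OFF);
}
```

Because the result is an ordinary integer constant expression, it can be used in `if` conditions as well as in `#if` directives, which is the reason the helper exists.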

tools/include/uapi/drm/i915_drm.h

Lines changed: 108 additions & 4 deletions
@@ -318,6 +318,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_PERF_OPEN 0x36
 #define DRM_I915_PERF_ADD_CONFIG 0x37
 #define DRM_I915_PERF_REMOVE_CONFIG 0x38
+#define DRM_I915_QUERY 0x39
 
 #define DRM_IOCTL_I915_INIT DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
 #define DRM_IOCTL_I915_FLUSH DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -375,6 +376,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_PERF_OPEN DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
 #define DRM_IOCTL_I915_PERF_ADD_CONFIG DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
 #define DRM_IOCTL_I915_PERF_REMOVE_CONFIG DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
+#define DRM_IOCTL_I915_QUERY DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_QUERY, struct drm_i915_query)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1358,7 +1360,9 @@ struct drm_intel_overlay_attrs {
  * active on a given plane.
  */
 
-#define I915_SET_COLORKEY_NONE (1<<0) /* disable color key matching */
+#define I915_SET_COLORKEY_NONE (1<<0) /* Deprecated. Instead set
+					* flags==0 to disable colorkeying.
+					*/
 #define I915_SET_COLORKEY_DESTINATION (1<<1)
 #define I915_SET_COLORKEY_SOURCE (1<<2)
 struct drm_intel_sprite_colorkey {
@@ -1604,15 +1608,115 @@ struct drm_i915_perf_oa_config {
 	__u32 n_flex_regs;
 
 	/*
-	 * These fields are pointers to tuples of u32 values (register
-	 * address, value). For example the expected length of the buffer
-	 * pointed by mux_regs_ptr is (2 * sizeof(u32) * n_mux_regs).
+	 * These fields are pointers to tuples of u32 values (register address,
+	 * value). For example the expected length of the buffer pointed by
+	 * mux_regs_ptr is (2 * sizeof(u32) * n_mux_regs).
 	 */
 	__u64 mux_regs_ptr;
 	__u64 boolean_regs_ptr;
 	__u64 flex_regs_ptr;
 };
 
+struct drm_i915_query_item {
+	__u64 query_id;
+#define DRM_I915_QUERY_TOPOLOGY_INFO 1
+
+	/*
+	 * When set to zero by userspace, this is filled with the size of the
+	 * data to be written at the data_ptr pointer. The kernel sets this
+	 * value to a negative value to signal an error on a particular query
+	 * item.
+	 */
+	__s32 length;
+
+	/*
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u32 flags;
+
+	/*
+	 * Data will be written at the location pointed by data_ptr when the
+	 * value of length matches the length of the data to be written by the
+	 * kernel.
+	 */
+	__u64 data_ptr;
+};
+
+struct drm_i915_query {
+	__u32 num_items;
+
+	/*
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u32 flags;
+
+	/*
+	 * This points to an array of num_items drm_i915_query_item structures.
+	 */
+	__u64 items_ptr;
+};
+
+/*
+ * Data written by the kernel with query DRM_I915_QUERY_TOPOLOGY_INFO :
+ *
+ * data: contains the 3 pieces of information :
+ *
+ * - the slice mask with one bit per slice telling whether a slice is
+ *   available. The availability of slice X can be queried with the following
+ *   formula :
+ *
+ *           (data[X / 8] >> (X % 8)) & 1
+ *
+ * - the subslice mask for each slice with one bit per subslice telling
+ *   whether a subslice is available. The availability of subslice Y in slice
+ *   X can be queried with the following formula :
+ *
+ *           (data[subslice_offset +
+ *                 X * subslice_stride +
+ *                 Y / 8] >> (Y % 8)) & 1
+ *
+ * - the EU mask for each subslice in each slice with one bit per EU telling
+ *   whether an EU is available. The availability of EU Z in subslice Y in
+ *   slice X can be queried with the following formula :
+ *
+ *           (data[eu_offset +
+ *                 (X * max_subslices + Y) * eu_stride +
+ *                 Z / 8] >> (Z % 8)) & 1
+ */
+struct drm_i915_query_topology_info {
+	/*
+	 * Unused for now. Must be cleared to zero.
+	 */
+	__u16 flags;
+
+	__u16 max_slices;
+	__u16 max_subslices;
+	__u16 max_eus_per_subslice;
+
+	/*
+	 * Offset in data[] at which the subslice masks are stored.
+	 */
+	__u16 subslice_offset;
+
+	/*
+	 * Stride at which each of the subslice masks for each slice are
+	 * stored.
+	 */
+	__u16 subslice_stride;
+
+	/*
+	 * Offset in data[] at which the EU masks are stored.
+	 */
+	__u16 eu_offset;
+
+	/*
+	 * Stride at which each of the EU masks for each subslice are stored.
+	 */
+	__u16 eu_stride;
+
+	__u8 data[];
+};
+
 #if defined(__cplusplus)
 }
 #endif
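
The three lookup formulas in the topology comment translate directly into helper functions. The sketch below applies them to a plain byte buffer; `sketch_topology` is a hypothetical mirror of the relevant header fields (it keeps a pointer to the mask bytes instead of a flexible array member), and none of the helper names come from the header. Obtaining the buffer through the `DRM_IOCTL_I915_QUERY` ioctl is outside the scope of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the fields the formulas need. */
struct sketch_topology {
	uint16_t max_subslices;
	uint16_t subslice_offset, subslice_stride;
	uint16_t eu_offset, eu_stride;
	const uint8_t *data;	/* mask bytes as returned by the query */
};

/* Slice X: (data[X / 8] >> (X % 8)) & 1 */
static int sketch_slice_available(const struct sketch_topology *t, unsigned int x)
{
	assert(t != NULL && t->data != NULL);
	return (t->data[x / 8] >> (x % 8)) & 1;
}

/* Subslice Y of slice X. */
static int sketch_subslice_available(const struct sketch_topology *t,
				     unsigned int x, unsigned int y)
{
	assert(t != NULL && t->data != NULL);
	return (t->data[t->subslice_offset +
			x * t->subslice_stride +
			y / 8] >> (y % 8)) & 1;
}

/* EU Z of subslice Y of slice X. */
static int sketch_eu_available(const struct sketch_topology *t,
			       unsigned int x, unsigned int y, unsigned int z)
{
	assert(t != NULL && t->data != NULL);
	return (t->data[t->eu_offset +
			(x * t->max_subslices + y) * t->eu_stride +
			z / 8] >> (z % 8)) & 1;
}
```

A caller would fill `sketch_topology` from the values the kernel reports and must make sure the indices stay within the returned length; the helpers themselves do no bounds checking.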

tools/perf/Documentation/perf-report.txt

Lines changed: 1 addition & 0 deletions
@@ -80,6 +80,7 @@ OPTIONS
 	- comm: command (name) of the task which can be read via /proc/<pid>/comm
 	- pid: command and tid of the task
 	- dso: name of library or module executed at the time of sample
+	- dso_size: size of library or module executed at the time of sample
 	- symbol: name of function executed at the time of sample
 	- symbol_size: size of function executed at the time of sample
 	- parent: name of function matched to the parent regex filter. Unmatched

tools/perf/Documentation/perf-trace.txt

Lines changed: 3 additions & 0 deletions
@@ -117,6 +117,9 @@ the thread executes on the designated CPUs. Default is to monitor all CPUs.
 --sched::
 	Accrue thread runtime and provide a summary at the end of the session.
 
+--failure::
+	Show only syscalls that failed, i.e. that returned < 0.
+
 -i::
 --input::
 	Process events from a given perf data file.
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+perf-version(1)
+===============
+
+NAME
+----
+perf-version - display the version of perf binary
+
+SYNOPSIS
+--------
+'perf version' [--build-options]
+
+DESCRIPTION
+-----------
+With no options given, the 'perf version' prints the perf version
+on the standard output.
+
+If the option '--build-options' is given, then the status of
+compiled-in libraries are printed on the standard output.
+
+OPTIONS
+-------
+--build-options::
+	Prints the status of compiled-in libraries on the
+	standard output.
