Commit 78bdb4e

Write docs
1 parent 6fc53d4 commit 78bdb4e

1 file changed

+56
-0
lines changed

src/LinuxPerf.jl

Lines changed: 56 additions & 0 deletions
@@ -656,6 +656,62 @@ scaledcount(counter::Counter) = counter.value * (counter.enabled / counter.runni
@pstats [options] expr

Run `expr` and gather its performance statistics.

This macro counts occurrences of events such as CPU cycles, branch prediction
misses, and page faults. The list of supported events can be shown by calling
the `LinuxPerf.list` function.

Because the performance monitoring units (PMUs) in a CPU core have limited
resources, not all events can be measured simultaneously. In that case,
several groups of events are multiplexed within a single measurement.
If the running time is extremely short, some event groups may not be measured
at all.

The result is shown in a table. Each row consists of four columns: an event
group indicator, an event name, a scaled count, and a running rate. A comment
may follow these columns after a hash (#) character.
1. The event group, indicated by a bracket, is a set of events that are
   measured simultaneously so that their counts can be meaningfully
   compared.
2. The event name is the conventional name of the measured event.
3. The scaled count is the number of occurrences of the event, scaled by the
   reciprocal of the running rate.
4. The running rate is the ratio of the time the event was running to the
   time it was enabled.
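
The relation between the scaled count and the running rate can be sketched as follows. This is an illustration only: `ToyCounter` and `runningrate` are hypothetical names, and the field names are assumed from the `scaledcount` definition visible in the hunk header of this diff.

```julia
# Hypothetical stand-in for a counter reading (not a LinuxPerf type).
struct ToyCounter
    value::UInt64    # raw number of events counted
    enabled::UInt64  # time the counter was enabled
    running::UInt64  # time the counter was actually counting
end

# Running rate: fraction of the enabled time during which the event was counted.
runningrate(c::ToyCounter) = c.running / c.enabled

# Scaled count: raw count multiplied by the reciprocal of the running rate.
scaledcount(c::ToyCounter) = c.value * (c.enabled / c.running)

# A counter that was multiplexed out for half of the enabled time.
c = ToyCounter(1_000_000, 10_000_000, 5_000_000)
println(runningrate(c))  # 0.5
println(scaledcount(c))  # 2.0e6
```

In this sketch, a counter that never ran has `running == 0`, so the scaled count is not meaningful; that corresponds to an event group that was not measured at all.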

The macro accepts options. If a string is passed, it is interpreted as a
comma-separated list of event names to measure. Events enclosed in a pair of
parentheses form an event group.
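
The event-list syntax can be illustrated with a small parser. This is a sketch under stated assumptions, not LinuxPerf's own parser: `parsegroups` is a hypothetical helper, and treating an ungrouped event as a one-element group is an assumption made for the example.

```julia
# Hypothetical helper: split "(a,b),c" into [["a", "b"], ["c"]].
function parsegroups(spec::AbstractString)
    groups = Vector{Vector{String}}()
    current = String[]
    name = IOBuffer()
    ingroup = false

    # Finish the event name collected so far (if any) and add it to `current`.
    function finishname!()
        s = strip(String(take!(name)))
        isempty(s) || push!(current, s)
        return nothing
    end
    # Close the group collected so far (if any) and add it to `groups`.
    function finishgroup!()
        isempty(current) || push!(groups, copy(current))
        empty!(current)
        return nothing
    end

    for ch in spec
        if ch == '('
            ingroup && error("nested groups are not supported")
            ingroup = true
        elseif ch == ')'
            ingroup || error("unmatched ')'")
            finishname!()
            finishgroup!()
            ingroup = false
        elseif ch == ','
            finishname!()
            # Outside parentheses, a comma ends a single-event group (assumption).
            ingroup || finishgroup!()
        else
            write(name, ch)
        end
    end
    ingroup && error("unmatched '('")
    finishname!()
    finishgroup!()
    return groups
end

spec = "(cpu-cycles,instructions,branch-instructions,branch-misses),page-faults"
println(parsegroups(spec))
```

For this `spec`, the sketch returns two groups: one with the four parenthesized events and one containing only `page-faults`.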

# Examples

```
julia> xs = randn(1_000_000);

julia> sort(xs[1:9]); # compile

julia> @pstats sort(xs)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
┌ cpu-cycles               2.57e+08   48.6%  #  3.8 cycles per ns
│ stalled-cycles-frontend  1.10e+07   48.6%  #  4.3% of cycles
└ stalled-cycles-backend   2.48e+06   48.6%  #  1.0% of cycles
┌ instructions             1.84e+08   51.4%  #  0.7 insns per cycle
│ branch-instructions      3.73e+07   51.4%  # 20.2% of instructions
└ branch-misses            7.92e+06   51.4%  # 21.2% of branch instructions
┌ task-clock               6.75e+07  100.0%
│ context-switches         0.00e+00  100.0%
│ cpu-migrations           0.00e+00  100.0%
└ page-faults              1.95e+03  100.0%
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

julia> @pstats "(cpu-cycles,instructions,branch-instructions,branch-misses),page-faults" sort(xs)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
┌ cpu-cycles               2.61e+08  100.0%  #  3.9 cycles per ns
│ instructions             1.80e+08  100.0%  #  0.7 insns per cycle
│ branch-instructions      3.64e+07  100.0%  # 20.2% of instructions
└ branch-misses            8.32e+06  100.0%  # 22.8% of branch instructions
╶ page-faults              0.00e+00  100.0%
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
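
The comments after the hash character appear to be ratios of the scaled counts. The following sketch recomputes three of them from the first table; it assumes that `task-clock` is reported in nanoseconds, and the variable names are illustrative. Because the table shows rounded counts, the recomputed values only agree with the comments to within that rounding.

```julia
# Scaled counts copied from the first example table (rounded as displayed).
cycles        = 2.57e8
instructions  = 1.84e8
branches      = 3.73e7
branch_misses = 7.92e6
task_clock    = 6.75e7  # assumed to be nanoseconds

cycles_per_ns   = cycles / task_clock             # compare with "3.8 cycles per ns"
insns_per_cycle = instructions / cycles           # compare with "0.7 insns per cycle"
miss_percent    = 100 * branch_misses / branches  # compare with "21.2% of branch instructions"

println(round(cycles_per_ns, digits=1))
println(round(insns_per_cycle, digits=1))
println(round(miss_percent, digits=1))
```

Note that `cpu-cycles` and `task-clock` belong to different event groups with different running rates, so their ratio relies on the scaled counts being good estimates.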
"""
macro pstats(args...)
    if isempty(args)
