Add a field to display the content the raw_data of a synthesized event.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-22-git-send-email-adrian.hunter@intel.com
[ Resolved conflict with 106dacd86f ("perf script: Support -F brstackoff,dso") ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implementing a new --smi-cost mode in perf stat to measure SMI cost.
During the measurement, the /sys/device/cpu/freeze_on_smi will be set.
The measurement can be done with one counter (unhalted core cycles), and
two free running MSR counters (IA32_APERF and SMI_COUNT).
In practice, the percentages of SMI core cycles should be more useful
than absolute value. So the output will be the percentage of SMI core
cycles and SMI#. metric_only will be set by default.
SMI cycles% = (aperf - unhalted core cycles) / aperf
Here is an example output.
Performance counter stats for 'sudo echo ':
SMI cycles% SMI#
0.1% 1
0.010858678 seconds time elapsed
Users who wants to get the actual value can apply additional
--no-metric-only.
Signed-off-by: Kan Liang <Kan.liang@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Elliott <elliott@hpe.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1495825538-5230-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -D/--graph-depth option is to set max graph depth. The following
example traces max 2-depth of page fault handler.
$ sudo perf ftrace -G __do_page_fault -D 2 -- hello
...
0) | __do_page_fault() {
0) 0.063 us | down_read_trylock();
0) 0.251 us | find_vma();
0) 5.374 us | handle_mm_fault();
0) 0.054 us | up_read();
0) 7.463 us | }
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170618142302.25390-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The idea here is to make AutoFDO easier in cloud environment with ASLR.
It's easiest to show how this is useful by example. I built a small test
akin to "while(1) { do_nothing(); }" where the do_nothing function is
loaded from a dso:
$ cat burncpu.cpp
#include <dlfcn.h>
int main() {
void* handle = dlopen("./dso.so", RTLD_LAZY);
if (!handle) return -1;
typedef void (*fp)();
fp do_nothing = (fp) dlsym(handle, "do_nothing");
while(1) {
do_nothing();
}
}
$ cat dso.cpp
extern "C" void do_nothing() {}
$ cat build.sh
#!/bin/bash
g++ -shared dso.cpp -o dso.so
g++ burncpu.cpp -o burncpu -ldl
I sampled the execution of this program with perf record -b.
Using the existing "brstack,dso", we get absolute addresses that are
affected by ASLR, and could be different on different hosts. The address
does not uniquely identify a branch/target in the binary:
$ perf script -F brstack,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
0x7f967139b6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0
Using the existing "brstacksym,dso" is a little better, because the
symbol plus offset and dso name *does* uniquely identify a branch/target
in the binary. Ultimately, however, AutoFDO wants a simple offset into
the binary, so we'd have to undo all the work perf did to symbolize in
the first place:
$ perf script -F brstacksym,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
do_nothing+0x5(/tmp/burncpu/dso.so)/main+0x44(/tmp/burncpu/exe)/P/-/-/0
With the new "brstackoff,dso" we get what we need: a simple offset into a
specific dso/binary that uniquely identifies a branch/target:
$ perf script -F brstackoff,dso | sed 's/\/0 /\/0\n/g' | grep burncpu | grep dso.so | head -n 1
0x6aa(/tmp/burncpu/dso.so)/0x4006b1(/tmp/burncpu/exe)/P/-/-/0
Signed-off-by: Mark Santaniello <marksan@fb.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170619163825.2012979-2-marksan@fb.com
[ Updated documentation about 'brstackoff' using text from above ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With 'perf script' it is common that we just want to add or remove a field.
Currently this requires figuring out the long list of default fields and
specifying them first, and then adding/removing the new field.
This patch adds a new + - syntax to merely add or remove fields,
that allows more succint and clearer command lines
For example to remove the comm field from PMU samples:
Previously
$ perf script -F tid,cpu,time,event,sym,ip,dso,period | head -1
swapper 0 [000] 504345.383126: 1 cycles: ffffffff90060c66 native_write_msr ([kernel.kallsyms])
with the new syntax
perf script -F -comm | head -1
0 [000] 504345.383126: 1 cycles: ffffffff90060c66 native_write_msr ([kernel.kallsyms])
The new syntax cannot be mixed with normal overriding.
v2: Fix example in description. Use tid vs pid. No functional changes.
v3: Don't skip initialization when user specified explicit type.
v4: Rebase. Remove empty line.
Committer testing:
# perf record -a usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.748 MB perf.data (14 samples) ]
Without a explicit field list specified via -F, defaults to:
# perf script | head -2
perf 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
swapper 0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
#
Which is equivalent to:
# perf script -F comm,tid,cpu,time,period,event,ip,sym,dso | head -2
perf 6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
swapper 0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
#
So if we want to remove the comm, as in your original example, we would have to
figure out the default field list and remove ' comm' from it:
# perf script -F tid,cpu,time,period,event,ip,sym,dso | head -2
6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
#
With your patch this becomes simpler, one can remove fields by prefixing them
with '-':
# perf script -F -comm | head -2
6338 [000] 18467.058607: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
0 [001] 18467.058617: 1 cycles: ffffffff89060c36 native_write_msr (/lib/modules/4.11.0-rc8+/build/vmlinux)
#
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Link: http://lkml.kernel.org/r/20170602154810.15875-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Few shell command examples in perf-script-python.txt has few nitpicks
include:
- tools/perf/scripts/python directory listing command is unnecessarily
repeated.
- few examples contain additional information in command prompt
unnecessarily and inconsistently.
This commit fixes them to enhance readability of the document.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Fixes: cff68e5822 ("perf/scripts: Add perf-trace-python Documentation")
Link: http://lkml.kernel.org/r/20170530111827.21732-4-sj38.park@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Default function signature of trace_unhandled() got changed to include a
field dict, but its documentation, perf-script-python.txt has not been
updated. Fix it.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pierre Tardy <tardyp@gmail.com>
Fixes: c02514850d ("perf scripts python: Give field dict to unhandled callback")
Link: http://lkml.kernel.org/r/20170530111827.21732-6-sj38.park@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This commit fixes wrong code snippets for trace_begin() and trace_end()
function example definition.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Fixes: cff68e5822 ("perf/scripts: Add perf-trace-python Documentation")
Link: http://lkml.kernel.org/r/20170530111827.21732-5-sj38.park@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
An example in perf-probe documentation for pattern of function name
based probe addition is not providing example command for that case.
This commit fixes the example to give appropriate example command.
Signed-off-by: SeongJae Park <sj38.park@gmail.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Fixes: ee391de876 ("perf probe: Update perf probe document")
Link: http://lkml.kernel.org/r/20170507103642.30560-1-sj38.park@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Mostly in the documentation.
Signed-off-by: Kim Phillips <kim.phillips@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170503131350.cebeecd8bd0f2968417626ab@arm.com
[ Fix spelling of "parameter" in one of the spell-checked lines ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf trace supports --no-syscalls option but it's not listed in the man
page. (Though, I see an example using --no-syscalls in EXAMPLES
section.)
Committer note:
The --no-syscalls option tells 'perf trace' not to automagically ask for
raw_syscalls:sys_{enter,exit} to then format it in a strace like way.
This become more used as 'perf trace' got support for arbitrary events,
such as tracepoints, so more and more we use:
# perf trace --no-syscalls -e nmi:*
0.000 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 36649 handled: 1)
0.019 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 2907 handled: 0)
0.676 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 9401 handled: 1)
0.680 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 288 handled: 0)
0.701 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 4977 handled: 1)
0.703 nmi:nmi_handler:nmi_cpu_backtrace_handler() delta_ns: 67 handled: 0)
0.736 nmi:nmi_handler:perf_event_nmi_handler() delta_ns: 8549 handled: 1)
^C#
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1492063332-5745-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a minimal description of pipe's data format.
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170410201432.24807-4-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Often it is interesting to know how costly a given source line is in
total. Previously, one had to build these sums manually based on all
addresses that pointed to the same source line. This patch introduces
srcline as a sort key, which will do the aggregation for us.
Paired with the recent addition of showing inline frames, this makes
perf report much more useful for many C++ work loads.
The following shows the new feature in action. First, let's show the
status quo output when we sort by address. The result contains many hist
entries that generate the same output:
~~~~~~~~~~~~~~~~
$ perf report --stdio --inline -g address
# Children Self Command Shared Object Symbol
# ........ ........ ............ ................... .........................................
#
99.89% 35.34% cpp-inlining cpp-inlining [.] main
|
|--64.55%--main complex:655
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/complex:664 (inline)
| |
| |--60.31%--hypot +20
| | |
| | |--8.52%--__hypot_finite +273
| | |
| | |--7.32%--__hypot_finite +411
...
--35.34%--_start +4194346
__libc_start_main +241
|
|--6.65%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
|
|--2.70%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
|
|--1.69%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
...
~~~~~~~~~~~~~~~~
With this patch and `-g srcline` we instead get the following output:
~~~~~~~~~~~~~~~~
$ perf report --stdio --inline -g srcline
# Children Self Command Shared Object Symbol
# ........ ........ ............ ................... .........................................
#
99.89% 35.34% cpp-inlining cpp-inlining [.] main
|
|--64.55%--main complex:655
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/complex:664 (inline)
| |
| |--64.02%--hypot
| | |
| | --59.81%--__hypot_finite
| |
| --0.53%--cabs
|
--35.34%--_start
__libc_start_main
|
|--12.48%--main random.tcc:3326
| /home/milian/projects/kdab/rnd/hotspot/tests/test-clients/cpp-inlining/main.cpp:39 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1809 (inline)
| /usr/include/c++/6.3.1/bits/random.h:1818 (inline)
| /usr/include/c++/6.3.1/bits/random.h:185 (inline)
...
~~~~~~~~~~~~~~~~
Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20170318214928.9047-1-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It takes some time to look for inline stack for callgraph addresses. So
it provides new option "--inline" to let user decide if enable this
feature.
--inline:
If a callgraph address belongs to an inlined function, the inline stack
will be printed. Each entry is the inline function name or file/line.
Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Link: http://lkml.kernel.org/r/1490474069-15823-4-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 40218daea1 ("perf list: Show SDT and pre-cached events") added
sdt support in perf list, but it missed to update documentation.
Show sdt option in man perf-list.
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20170327025538.1753-1-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the printing of perf expressions and internal events to a new
clearer --details flag, instead of lumping it together with other debug
options in --debug. This makes it clearer to use.
Before
perf list --debug
...
unc_m_power_critical_throttle_cycles
[Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.
after
perf list --details
...
unc_m_power_critical_throttle_cycles
[Cycles all ranks are in critical thermal throttle. Unit: uncore_imc]
uncore_imc_2/event=0x86/ MetricName: power_critical_throttle_cycles % MetricExpr: (unc_m_power_critical_throttle_cycles / unc_m_clockticks) * 100.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/r/20170320201711.14142-14-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The uncore PMU has a lot of duplicated PMUs for different subsystems.
When expanding an uncore alias we usually end up with a large
number of identically named aliases, which makes perf stat
output difficult to read.
Automatically sum them up in perf stat, unless --no-merge is specified.
This can be default because only the uncores generally have duplicated
aliases. Other PMUs have unique names.
Before:
% perf stat --no-merge -a -e unc_c_llc_lookup.any sleep 1
Performance counter stats for 'system wide':
694,976 Bytes unc_c_llc_lookup.any
706,304 Bytes unc_c_llc_lookup.any
956,608 Bytes unc_c_llc_lookup.any
782,720 Bytes unc_c_llc_lookup.any
605,696 Bytes unc_c_llc_lookup.any
442,816 Bytes unc_c_llc_lookup.any
659,328 Bytes unc_c_llc_lookup.any
509,312 Bytes unc_c_llc_lookup.any
263,936 Bytes unc_c_llc_lookup.any
592,448 Bytes unc_c_llc_lookup.any
672,448 Bytes unc_c_llc_lookup.any
608,640 Bytes unc_c_llc_lookup.any
641,024 Bytes unc_c_llc_lookup.any
856,896 Bytes unc_c_llc_lookup.any
808,832 Bytes unc_c_llc_lookup.any
684,864 Bytes unc_c_llc_lookup.any
710,464 Bytes unc_c_llc_lookup.any
538,304 Bytes unc_c_llc_lookup.any
1.002577660 seconds time elapsed
After:
% perf stat -a -e unc_c_llc_lookup.any sleep 1
Performance counter stats for 'system wide':
2,685,120 Bytes unc_c_llc_lookup.any
1.002648032 seconds time elapsed
v2: Split collect_aliases. Rename alias flag.
v3: Make sure unsupported/not counted is always printed.
v4: Factor out callback change into separate patch.
v5: Move check for bad results here
Move merged check into collect_data
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170320201711.14142-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Description of --no-aggr in perf-stat man page is outdated. --no-aggr
can also be used while profiling specific set of cpus. For ex,
$ perf stat -e cycles,instructions -C 1-2 --no-aggr -- sleep 1
Performance counter stats for 'CPU(s) 1-2':
CPU1 5,94,92,795 cycles
CPU2 2,69,72,403 cycles
CPU1 2,02,08,327 instructions # 0.34 insn per cycle
CPU2 73,17,123 instructions # 0.12 insn per cycle
1.000989132 seconds time elapsed
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1490013438-5713-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement printing instruction sequences as hex dump for branch stacks.
This relies on the x86 instruction decoder used by the PT decoder to
find the lengths of instructions to dump them individually.
This is good enough for pattern matching.
This allows to study hot paths for individual samples, together with
branch misprediction and cycle count / IPC information if available (on
Skylake systems).
% perf record -b ...
% perf script -F brstackinsn
...
read_hpet+67:
ffffffff9905b843 insn: 74 ea # PRED
ffffffff9905b82f insn: 85 c9
ffffffff9905b831 insn: 74 12
ffffffff9905b833 insn: f3 90
ffffffff9905b835 insn: 48 8b 0f
ffffffff9905b838 insn: 48 89 ca
ffffffff9905b83b insn: 48 c1 ea 20
ffffffff9905b83f insn: 39 f2
ffffffff9905b841 insn: 89 d0
ffffffff9905b843 insn: 74 ea # PRED
Only works when no special branch filters are specified.
Occasionally the path does not reach up to the sample IP, as the LBRs
may be frozen before executing a final jump. In this case we print a
special message.
The instruction dumper piggy backs on the existing infrastructure from
the IP PT decoder.
An earlier iteration of this patch relied on a disassembler, but this
version only uses the existing instruction decoder.
Committer note:
Added hint about how to get suitable perf.data files for use with
'-F brstackinsm':
$ perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (8 samples) ]
$
$ perf script -F brstackinsn
Display of branch stack assembler requested, but non all-branch filter set
Hint: run 'perf record -b ...'
$
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Link: http://lkml.kernel.org/r/20170223234634.583-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This patch introduces a cgroup identifier entry field in perf report to
identify or distinguish data of different cgroups. It uses the device
number and inode number of cgroup namespace, included in perf data with
the new PERF_RECORD_NAMESPACES event, as cgroup identifier.
With the assumption that each container is created with it's own cgroup
namespace, this allows assessment/analysis of multiple containers at
once.
A simple test for this would be to clone a few processes passing
SIGCHILD & CLONE_NEWCROUP flags to each of them, execute shell and run
different workloads on each of those contexts, while running perf
record command with --namespaces option.
Shown below is the output of perf report, sorted with cgroup identifier,
on perf.data generated with the above test scenario, clearly indicating
one context's considerable use of kernel memory in comparison with
others:
$ perf report -s cgroup_id,sample --stdio
#
# Total Lost Samples: 0
#
# Samples: 5K of event 'kmem:kmalloc'
# Event count (approx.): 5965
#
# Overhead cgroup id (dev/inode) Samples
# ........ ..................... ............
#
81.27% 3/0xeffffffb 4848
16.24% 3/0xf00000d0 969
1.16% 3/0xf00000ce 69
0.82% 3/0xf00000cf 49
0.50% 0/0x0 30
While this is a start, there is further scope of improving this. For
example, instead of cgroup namespace's device and inode numbers, dev
and inode numbers of some or all namespaces may be used to distinguish
which processes are running in a given container context.
Also, scripts to map device and inode info to containers sounds
plausible for better tracing of containers.
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891933338.25309.756882900782042645.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce a new option to record PERF_RECORD_NAMESPACES events emitted
by the kernel when fork, clone, setns or unshare are invoked. And update
perf-record documentation with the new option to record namespace
events.
Committer notes:
Combined it with a later patch to allow printing it via 'perf report -D'
and be able to test the feature introduced in this patch. Had to move
here also perf_ns__name(), that was introduced in another later patch.
Also used PRIu64 and PRIx64 to fix the build in some enfironments wrt:
util/event.c:1129:39: error: format '%lx' expects argument of type 'long unsigned int', but argument 6 has type 'long long unsigned int' [-Werror=format=]
ret += fprintf(fp, "%u/%s: %lu/0x%lx%s", idx
^
Testing it:
# perf record --namespaces -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.083 MB perf.data (423 samples) ]
#
# perf report -D
<SNIP>
3 2028902078892 0x115140 [0xa0]: PERF_RECORD_NAMESPACES 14783/14783 - nr_namespaces: 7
[0/net: 3/0xf0000081, 1/uts: 3/0xeffffffe, 2/ipc: 3/0xefffffff, 3/pid: 3/0xeffffffc,
4/user: 3/0xeffffffd, 5/mnt: 3/0xf0000000, 6/cgroup: 3/0xeffffffb]
0x1151e0 [0x30]: event: 9
.
. ... raw event: size 48 bytes
. 0000: 09 00 00 00 02 00 30 00 c4 71 82 68 0c 7f 00 00 ......0..q.h....
. 0010: a9 39 00 00 a9 39 00 00 94 28 fe 63 d8 01 00 00 .9...9...(.c....
. 0020: 03 00 00 00 00 00 00 00 ce c4 02 00 00 00 00 00 ................
<SNIP>
NAMESPACES events: 1
<SNIP>
#
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
Cc: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/148891930386.25309.18412039920746995488.stgit@hbathini.in.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 2f3f9bcf00 ("perf tools: Add +field argument support for
--field option") by Jiri Olsa <jolsa@kernel.org> introduced +field style
argument support for --field option.
This is useful but not updated documentation. This add a little
description there.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170313083252.23644-1-changbin.du@intel.com
[ Slightly improved the phrase structure ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -a/--all-cpus and -C/--cpu option is for controlling tracing cpus.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170224011251.14946-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The -p (--pid) option enables to trace existing process by its pid.
Committer notes:
Testing it:
Using the function_graph tracer on a process that is just waiting for user
input and thus will make 'perf ftrace' sit there waiting for that, then press
any key on that mutt session and see what happens:
# perf ftrace -t function_graph -p `pidof mutt` | head -40
2) 1.038 us | switch_mm_irqs_off();
------------------------------------------
2) <idle>-0 => mutt-3595
------------------------------------------
2) | finish_task_switch() {
2) | smp_irq_work_interrupt() {
2) | irq_enter() {
2) 0.180 us | rcu_irq_enter();
2) 1.248 us | }
2) | __wake_up() {
2) 0.126 us | _raw_spin_lock_irqsave();
2) | __wake_up_common() {
2) | pollwake() {
2) | default_wake_function() {
2) | try_to_wake_up() {
2) 0.662 us | _raw_spin_lock_irqsave();
2) | select_task_rq_fair() {
2) 1.719 us | effective_load.isra.41();
2) 1.343 us | effective_load.isra.41();
2) | select_idle_sibling() {
2) 0.331 us | idle_cpu();
2) 1.458 us | }
2) 8.350 us | }
2) 0.200 us | _raw_spin_lock();
2) | ttwu_do_activate() {
2) | activate_task() {
2) 0.136 us | update_rq_clock.part.77();
2) | enqueue_task_fair() {
2) | enqueue_entity() {
2) 0.146 us | update_curr();
2) 0.330 us | account_entity_enqueue();
2) 0.280 us | update_cfs_shares();
2) 0.321 us | place_entity();
2) 0.206 us | __enqueue_entity();
2) 6.926 us | }
2) | enqueue_entity() {
2) 0.105 us | update_curr();
2) 0.175 us | account_entity_enqueue();
2) 0.531 us | update_cfs_shares();
#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170224011251.14946-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add new sort key 'symbol_size' to allow user to sort by symbol size, or
(more usefully) display the symbol size using --fields=...,symbol_size.
Committer note:
Testing it together with the recently added -q, to remove the headers,
and using the '+' sign with -s, to add the symbol_size sort order to
the default, which is '-s/--sort comm,dso,symbol':
# perf report -q -s +symbol_size | head -10
10.39% swapper [kernel.vmlinux] [k] intel_idle 270
3.45% swapper [kernel.vmlinux] [k] update_blocked_averages 1546
2.61% swapper [kernel.vmlinux] [k] update_load_avg 1292
2.36% swapper [kernel.vmlinux] [k] update_cfs_shares 240
1.83% swapper [kernel.vmlinux] [k] __hrtimer_run_queues 606
1.74% swapper [kernel.vmlinux] [k] update_cfs_rq_load_avg. 1187
1.66% swapper [kernel.vmlinux] [k] apic_timer_interrupt 152
1.60% CPU 0/KVM [kvm] [k] kvm_set_msr_common 3046
1.60% gnome-shell libglib-2.0.so.0 [.] g_slist_find 37
1.46% gnome-termina libglib-2.0.so.0 [.] g_hash_table_lookup 370
#
Signed-off-by: Charles Baylis <charles.baylis@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Maxim Kuvyrkov <maxim.kuvyrkov@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1487943176-13840-1-git-send-email-charles.baylis@linaro.org
[ Use symbol__size(), remove needless %lld + (long long) casting ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix typos and add the following to the scripts/spelling.txt:
an user||a user
an userspace||a userspace
I also added "userspace" to the list since it is a common word in Linux.
I found some instances for "an userfaultfd", but I did not add it to the
list. I felt it is endless to find words that start with "user" such as
"userland" etc., so must draw a line somewhere.
Link: http://lkml.kernel.org/r/1481573103-11329-4-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The -q/--quiet option is to suppress any message. Sometimes users just
want to see the numbers and it can be used for that case.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: kernel-team@lge.com
Link: http://lkml.kernel.org/r/20170217081742.17417-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Running 'perf record' with no target (-a, -p, -t, etc) will now collect
system wide data.
Commiter notes:
Testing it:
[root@jouet ~]# perf record
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.351 MB perf.data (366 samples) ]
#
is equivalent to:
# perf record -a
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.411 MB perf.data (978 samples) ]
#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170217170018.GA15389@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Boris asked for default -a option in case we monitor only uncore events.
While implementing that I thought it might be actually useful to make it
overall default.
Running 'perf stat' will now collect system wide data.
Committer note:
Testing it:
# perf stat
^C
Performance counter stats for 'system wide':
3571.559178 cpu-clock (msec) # 4.000 CPUs utilized
3,346 context-switches # 0.937 K/sec
277 cpu-migrations # 0.078 K/sec
57,271 page-faults # 0.016 M/sec
4,535,633,835 cycles # 1.270 GHz
6,389,736,516 instructions # 1.41 insn per cycle
1,541,293,875 branches # 431.547 M/sec
14,526,396 branch-misses # 0.94% of all branches
0.892950118 seconds time elapsed
#
Requested-and-Acked-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170217170034.GB15389@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The "delta-abs" compute method will show most changed entries on top.
So users can easily see how much effect between the data. Note that it
also changes the default of -o option to 1 in order to apply the compute
method. To see original-style (sorted by baseline) use -o 0 option.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170210161856.18422-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The diff.compute config variable is to set the default compute method of
perf diff command (-c option). Possible values 'delta' (default),
'delta-abs', 'ratio' and 'wdiff'.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20170210073614.24584-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In many cases, I need to look at differences between two data so I often
used the -o option to sort the result base on the difference first.
It'd be nice to have a config option to set it by default.
The diff.order config option is to set the default value of -o/--order
option.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Taeung Song <treeze.taeung@gmail.com>
Link: http://lkml.kernel.org/r/20170210073614.24584-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It seems to be the most used argument for -c option so far. In the
beginning when you want to have the overall process report, so it makes
sense to make it the default one.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Don Zickus <dzickus@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1484904032-11040-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The --state option is to show task state when switched out. The state
is printed as a single character like in the /proc but I added 'I' for
idle state rather than 'R'.
$ perf sched timehist --state | head
Samples do not have callchains.
time cpu task name wait time sch delay run time state
[tid/pid] (msec) (msec) (msec)
-------- --- ----------------------- -------- ------------------ -----
1.753791 [3] <idle> 0.000 0.000 0.000 I
1.753834 [1] perf[27469] 0.000 0.000 0.000 S
1.753904 [3] perf[27470] 0.000 0.006 0.112 S
1.753914 [1] <idle> 0.000 0.000 0.079 I
1.753915 [3] migration/3[23] 0.000 0.002 0.011 S
1.754287 [2] <idle> 0.000 0.000 0.000 I
1.754335 [2] transmission[1773/1739] 0.000 0.004 0.047 S
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170113104523.31212-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The "--dump-raw-script" is not a valid option, replace it with the valid
one, "--dump-raw-trace"
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 133dc4c39c ("perf: Rename 'perf trace' to 'perf script'")
LPU-Reference: 728644547.14560155.1484320012612.JavaMail.zimbra@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It's now possible to specify the threshold time for perf.data like:
$ perf record --switch-output=30s ...
Once it's reached, the current data are dumped in to the
perf.data.<timestamp> file and session does on.
$ perf record --switch-output=30s ...
[ perf record: dump data: Woken up 44 times ]
[ perf record: Dump perf.data.2017010213043746 ]
...
The time is expected to be a number with appended unit
character - s/m/h/d.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Wang Nan <wangnan0@huawei.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1483955520-29063-7-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>