Performance counters are special hardware registers available on most modern CPUs. These registers count certain types of hardware events, such as instructions executed, cache misses suffered, or branches mispredicted, without slowing down the kernel or applications. They can also trigger an interrupt when a threshold number of events has been reached, and can thus be used to profile the code that runs on that CPU.
The Linux Performance Counter subsystem provides rich abstractions over these hardware capabilities. It provides per-task, per-CPU and per-workload counters and counter groups, with sampling capabilities on top of those, and more.
It also provides abstractions for 'software events', such as minor/major page faults, task migrations, task context switches and tracepoints.
There is a new tool ('perf') that makes full use of this new kernel subsystem. It can be used to optimize, validate and measure applications, workloads or the full system.
'perf' is hosted in the upstream kernel repository and can be found under: tools/perf/
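Perf handles breakpoints requested from several sources (ptrace, kgdb, ftrace and the perf syscall). These requests go through a core breakpoint API built on perf events, which takes care of register scheduling, thread/CPU attachment, etc.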
    ptrace       kgdb      ftrace   perf syscall
      \          |          /         /
       \         |         /         /
                                    /
        Core breakpoint API        /
                                  /
                 |               /
                 |              /
        Breakpoints perf events
This is why, to make full use of perf, you have to enable all of these modules, such as Ftrace, in the kernel configuration.
Currently, the perf tool cannot be cross-compiled because of the libraries it depends on. For now, it can be built natively on an OMAP board running Ubuntu. Before compiling it, install the libelf library, which is needed to build and run perf.
    # apt-get install libelf-dev
    # make
    # make install
Furthermore, the following flags have to be enabled in the kernel configuration:
* PERF_EVENTS
* PERF_COUNTERS
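Whether these flags are set can be checked from the kernel's configuration file. The following is only a minimal sketch: it assumes the flags appear with the usual `CONFIG_` prefix, that an option counts as enabled when set to `y` or `m`, and that the running kernel exposes its configuration at /proc/config.gz (which requires CONFIG_IKCONFIG_PROC). The helper names are made up for illustration.

```python
import gzip
from pathlib import Path

# Flag names from the list above, with the CONFIG_ prefix assumed.
REQUIRED = ("CONFIG_PERF_EVENTS", "CONFIG_PERF_COUNTERS")


def enabled_options(config_text):
    """Return the set of options set to 'y' or 'm' in a kernel config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and value in ("y", "m"):
            enabled.add(name)
    return enabled


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not enabled, in the given order."""
    on = enabled_options(config_text)
    return [opt for opt in required if opt not in on]


def read_running_config(path="/proc/config.gz"):
    """Read the running kernel's config (assumes CONFIG_IKCONFIG_PROC)."""
    return gzip.decompress(Path(path).read_bytes()).decode()


# Hypothetical config excerpt used only to demonstrate the check.
sample = """\
CONFIG_PERF_EVENTS=y
# CONFIG_PERF_COUNTERS is not set
CONFIG_FTRACE=y
"""
print(missing_options(sample))
```

On a real board, `missing_options(read_running_config())` would report which of the two flags still need to be enabled, provided /proc/config.gz exists there.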
Once you have installed 'perf' on your system, the simplest way to start profiling a userspace program is to use the "perf record" and "perf report" commands as follows:
    $ perf record -f -- git gc
    Counting objects: 1283571, done.
    Compressing objects: 100% (206724/206724), done.
    Writing objects: 100% (1283571/1283571), done.
    Total 1283571 (delta 1070675), reused 1281443 (delta 1068566)
    [ perf record: Captured and wrote 31.054 MB perf.data (~1356768 samples) ]

    $ perf report --sort comm,dso,symbol | head -10
    # Samples: 1355726
    #
    # Overhead          Command                            Shared Object  Symbol
    # ........  ...............  .......................................  ......
    #
        31.53%              git  /usr/bin/git                             [.] 0x0000000009804f
        13.41%        git-prune  /usr/bin/git-prune                       [.] 0x000000000ad06d
        10.05%              git  /lib/tls/i686/cmov/libc-2.8.90.so        [.] _nl_make_l10nflist
         5.36%        git-prune  /usr/lib/libz.so.1.2.3.3                 [.] 0x00000000009d51
         4.48%              git  /lib/tls/i686/cmov/libc-2.8.90.so        [.] memcpy
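The report is plain text, so it is easy to post-process. As an illustration, the sketch below sums the overhead per command from the rows shown above. It assumes the column layout of this particular report (overhead, command, shared object, symbol) and that command names contain no spaces; the function name is made up for the example.

```python
import re
from collections import defaultdict

# One data row: overhead percentage, command, shared object, symbol part.
ROW = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s+(.*)$")


def overhead_by_command(report_text):
    """Sum the overhead percentages of all rows, grouped by command."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        if line.lstrip().startswith("#"):  # header and separator lines
            continue
        match = ROW.match(line)
        if match:
            totals[match.group(2)] += float(match.group(1))
    return dict(totals)


# The five rows from the report above.
sample = """\
# Samples: 1355726
#
    31.53%  git        /usr/bin/git                       [.] 0x0000000009804f
    13.41%  git-prune  /usr/bin/git-prune                 [.] 0x000000000ad06d
    10.05%  git        /lib/tls/i686/cmov/libc-2.8.90.so  [.] _nl_make_l10nflist
     5.36%  git-prune  /usr/lib/libz.so.1.2.3.3           [.] 0x00000000009d51
     4.48%  git        /lib/tls/i686/cmov/libc-2.8.90.so  [.] memcpy
"""
totals = overhead_by_command(sample)
for comm, pct in sorted(totals.items(), key=lambda kv: -kv[1]):
    print("%-12s %6.2f%%" % (comm, pct))
```

For these five rows, git accounts for about 46.06% of the samples and git-prune for about 18.77%. Similar grouping can also be obtained from perf itself by changing the `--sort` keys.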
    # perf list
    [...]
    kmem:kmalloc                             [Tracepoint event]
    kmem:kmem_cache_alloc                    [Tracepoint event]
    kmem:kmalloc_node                        [Tracepoint event]
    kmem:kmem_cache_alloc_node               [Tracepoint event]
    kmem:kfree                               [Tracepoint event]
    kmem:kmem_cache_free                     [Tracepoint event]
    kmem:mm_page_free_direct                 [Tracepoint event]
    kmem:mm_pagevec_free                     [Tracepoint event]
    kmem:mm_page_alloc                       [Tracepoint event]
    kmem:mm_page_alloc_zone_locked           [Tracepoint event]
    kmem:mm_page_pcpu_drain                  [Tracepoint event]
Then any (or all) of the above event sources can be activated and measured. For example the page alloc/free properties of a 'hackbench run' are:
    # perf stat -e kmem:mm_page_pcpu_drain -e kmem:mm_page_alloc -e kmem:mm_pagevec_free -e kmem:mm_page_free_direct ./hackbench 10
    Time: 0.575

    Performance counter stats for './hackbench 10':

              13857  kmem:mm_page_pcpu_drain
              27576  kmem:mm_page_alloc
               6025  kmem:mm_pagevec_free
              20934  kmem:mm_page_free_direct

        0.613972165  seconds time elapsed
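The counts can be turned into derived figures with a few lines of scripting. The sketch below parses the output shown above and computes a page allocation rate. It assumes the text layout of this particular run (a count followed by a `subsystem:event` name, and a "seconds time elapsed" line); the function names are illustrative only.

```python
import re

# "<count>  <subsystem:event>", optionally followed by more text.
COUNT = re.compile(r"^\s*(\d+)\s+(\S+:\S+)")
# "<seconds>  seconds time elapsed"
ELAPSED = re.compile(r"^\s*(\d+\.\d+)\s+seconds time elapsed")


def parse_counts(stat_text):
    """Map each event name to its integer count."""
    counts = {}
    for line in stat_text.splitlines():
        match = COUNT.match(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def parse_elapsed(stat_text):
    """Return the elapsed time in seconds, or None if it is not present."""
    for line in stat_text.splitlines():
        match = ELAPSED.match(line)
        if match:
            return float(match.group(1))
    return None


# Output of the perf stat run above.
sample = """\
Time: 0.575

Performance counter stats for './hackbench 10':

          13857  kmem:mm_page_pcpu_drain
          27576  kmem:mm_page_alloc
           6025  kmem:mm_pagevec_free
          20934  kmem:mm_page_free_direct

    0.613972165  seconds time elapsed
"""
counts = parse_counts(sample)
elapsed = parse_elapsed(sample)
print("page allocations per second: %.0f" % (counts["kmem:mm_page_alloc"] / elapsed))
```

Matching on the `subsystem:event` form keeps the "Time:" line printed by hackbench and the elapsed-time line out of the counts.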
You can observe the statistical properties as well, by using the 'repeat the workload N times' feature of perf stat:
    # perf stat --repeat 5 -e kmem:mm_page_pcpu_drain -e kmem:mm_page_alloc -e kmem:mm_pagevec_free -e kmem:mm_page_free_direct ./hackbench 10
    Time: 0.627
    Time: 0.644
    Time: 0.564
    Time: 0.559
    Time: 0.626

    Performance counter stats for './hackbench 10' (5 runs):

              12920  kmem:mm_page_pcpu_drain   ( +-   3.359% )
              25035  kmem:mm_page_alloc        ( +-   3.783% )
               6104  kmem:mm_pagevec_free      ( +-   0.934% )
              18376  kmem:mm_page_free_direct  ( +-   4.941% )

        0.643954516  seconds time elapsed   ( +-   2.363% )
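To make the `+-` column concrete, the sketch below computes a mean and a relative spread for the five "Time:" values that hackbench printed above. The text does not state the formula perf uses, so the standard error of the mean relative to the mean is used here as an assumption. The input is hackbench's own timing rather than perf's elapsed time, so the result is not expected to match the 2.363% reported above.

```python
import math
import statistics


def summarize(samples):
    """Return (mean, relative standard error of the mean in percent)."""
    mean = statistics.mean(samples)
    # Sample standard deviation divided by sqrt(n): standard error of the mean.
    stderr = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, 100.0 * stderr / mean


# The five "Time:" values printed by hackbench in the run above.
times = [0.627, 0.644, 0.564, 0.559, 0.626]
mean, rel = summarize(times)
print("%.3f s ( +- %.3f%% )" % (mean, rel))
```

The mean of these five runs is 0.604 seconds. A larger `--repeat` count generally shrinks the relative spread, which helps when comparing two kernels or two builds of a workload.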
Support was recently added for using Perl and Python scripts with the perf tool. Interpreters for both Perl and Python can be embedded into the perf executable, which allows the raw perf trace data stream to be processed in either of those languages.
Several example scripts are provided with perf, and they can be listed from perf itself:
    # perf trace -l
    List of available trace scripts:
      syscall-counts [comm]            system-wide syscall counts
      syscall-counts-by-pid [comm]     system-wide syscall counts, by pid
      failed-syscalls-by-pid [comm]    system-wide failed syscalls, by pid
      workqueue-stats                  workqueue stats (ins/exe/create/destroy)
      check-perf-trace                 useless but exhaustive test script
      failed-syscalls [comm]           system-wide failed syscalls
      wakeup-latency                   system-wide min/max/avg wakeup latency
      rw-by-file <comm>                r/w activity for a program, by file
      rw-by-pid                        system-wide r/w activity
This list is a mix of Perl and Python scripts that live in the tools/perf/scripts/{perl,python} directories.
The installed scripts can be used as follows:
    # perf trace record failed-syscalls
    ^C[ perf record: Woken up 11 times to write data ]
    [ perf record: Captured and wrote 1.939 MB perf.data (~84709 samples) ]

    # perf trace report failed-syscalls
    perf trace started with Perl script
      /root/libexec/perf-core/scripts/perl/failed-syscalls.pl

    failed syscalls, by comm:

    comm                    # errors
    --------------------  ----------
    firefox                     1721
    claws-mail                   149
    konsole                       99
    X                             77
    emacs                         56
    [...]

    failed syscalls, by syscall:

    syscall                           # errors
    ------------------------------  ----------
    sys_read                              2042
    sys_futex                              130
    sys_mmap_pgoff                          71
    sys_access                              33
    sys_stat64                               5
    sys_inotify_add_watch                    4
    [...]
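A script of this kind is essentially a set of handler functions that perf calls for each event. The following is a rough Python sketch of the "by comm" half of the report above, not the shipped failed-syscalls script (which, as the output shows, is written in Perl). It assumes the trace was recorded with the raw_syscalls:sys_exit tracepoint and that handlers receive the argument list used by perf versions of this era; newer perf releases pass additional arguments, so the signature would need adjusting there.

```python
from collections import defaultdict

# comm -> number of syscalls that returned a negative value
failed_by_comm = defaultdict(int)


def trace_begin():
    """Called once by perf before the first event is processed."""
    failed_by_comm.clear()


def raw_syscalls__sys_exit(event_name, context, common_cpu, common_secs,
                           common_nsecs, common_pid, common_comm, id, ret):
    """Called for every raw_syscalls:sys_exit event (assumed signature)."""
    if ret < 0:
        failed_by_comm[common_comm] += 1


def sorted_failures():
    """Return (comm, errors) pairs, most errors first. Helper for this sketch."""
    return sorted(failed_by_comm.items(), key=lambda kv: (-kv[1], kv[0]))


def trace_end():
    """Called once by perf after the last event; prints the summary table."""
    print("failed syscalls, by comm:\n")
    print("%-20s  %10s" % ("comm", "# errors"))
    print("%-20s  %10s" % ("-" * 20, "-" * 10))
    for comm, errors in sorted_failures():
        print("%-20s  %10d" % (comm, errors))
```

The handlers can also be called by hand to check the logic without perf, for example `raw_syscalls__sys_exit("raw_syscalls__sys_exit", None, 0, 0, 0, 1234, "firefox", 3, -2)` followed by `trace_end()`. In later perf releases this scripting front end is named `perf script` rather than `perf trace`.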