Using 'perf' to profile your program in Fedora

February 25, 2013

Installing perf

If perf is not already on your system, run this command:

$ sudo yum install perf

On my system that installed: perf-3.7.9-101.fc17.x86_64

Running your application to collect data

One of the nice things about perf is that you don't need to compile your executable in any special way. As far as I know, perf works no matter how you built it. To get summary statistics for a run of your application, do this:

$ perf stat ./bigint_test

 Performance counter stats for './bigint_test':

      11461.887097 task-clock                #    1.015 CPUs utilized
               155 context-switches          #    0.014 K/sec
                 0 cpu-migrations            #    0.000 K/sec
               335 page-faults               #    0.029 K/sec
   <not supported> cycles
                 0 stalled-cycles-frontend   #    0.00% frontend cycles idle
   <not supported> stalled-cycles-backend
   <not supported> instructions
   <not supported> branches
   <not supported> branch-misses

      11.290034097 seconds time elapsed
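
In output like this, a counter that perf could not read is marked `<not supported>`. If you save the text and want to pick those lines out automatically, a small script can do it. The following is only a sketch in Python and is not part of perf: the function name `unsupported_events` is made up for illustration, and the sample text is simply the output above pasted into a string.

```python
import re

# Sample text: the counter lines from the perf stat run above.
PERF_STAT_OUTPUT = """\
11461.887097 task-clock # 1.015 CPUs utilized
155 context-switches # 0.014 K/sec
0 cpu-migrations # 0.000 K/sec
335 page-faults # 0.029 K/sec
<not supported> cycles
0 stalled-cycles-frontend # 0.00% frontend cycles idle
<not supported> stalled-cycles-backend
<not supported> instructions
<not supported> branches
<not supported> branch-misses
"""


def unsupported_events(stat_text):
    """Return the event names whose value column reads '<not supported>'."""
    events = []
    for line in stat_text.splitlines():
        # The value column comes first, followed by the event name.
        match = re.match(r"\s*<not supported>\s+(\S+)", line)
        if match:
            events.append(match.group(1))
    return events


if __name__ == "__main__":
    for name in unsupported_events(PERF_STAT_OUTPUT):
        print(name)
```

On the sample above this lists cycles, stalled-cycles-backend, instructions, branches and branch-misses, which are exactly the counters missing from my run.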

I'm running Fedora in a VM and have not enabled all of the performance pass-through options, so several of the hardware counters show up as <not supported>. My roommate is working on computing very large prime numbers, and this kind of data is hugely valuable to him. I don't use this output much myself. I prefer to run perf this way:

$ perf record ./bigint_test

[ perf record: Woken up 5 times to write data ]

[ perf record: Captured and wrote 1.351 MB perf.data (~59033 samples) ]

In my VM I do my development in a directory mounted from the host OS, which lets me share files easily between the host OS and the guest OS (Fedora). One drawback is that the files in that mounted directory are not owned by my user. perf complains if I try to examine a perf.data file that I don't own, so I move the file to my home directory and examine it there.

$ mv perf.data ~

$ cd

$ perf report

    61.98%  bigint_test  bigint_test    [.] BigInt::validate(char const*, int) const
    10.18%  bigint_test  bigint_test    [.] BigInt::import(int)
     3.92%  bigint_test  bigint_test    [.] BigInt::isDivisibleBy(int) const
     2.86%  bigint_test  bigint_test    [.] BigInt::compare(BigInt const&) const
     2.49%  bigint_test  bigint_test    [.] BigInt::BigInt(int)
     2.28%  bigint_test  bigint_test    [.] BigInt::extendBuffer(unsigned int)
     2.00%  bigint_test  bigint_test    [.] BigInt::length() const
     1.60%  bigint_test  bigint_test    [.] testDivisibility(unsigned int)
     1.58%  bigint_test  bigint_test    [.] BigInt::~BigInt()
     1.49%  bigint_test  libc-2.15.so   [.] __memcpy_ssse3_back
     1.38%  bigint_test  bigint_test    [.] BigInt::subtractStrings(char*, char const*)
     1.26%  bigint_test  bigint_test    [.] BigInt::addStrings(char*, char const*)
     0.84%  bigint_test  bigint_test    [.] BigInt::isZero() const
     0.70%  bigint_test  bigint_test    [.] BigInt::operator=(BigInt const&)
    [...]

Now I have some really useful information. BigInt::validate() is taking up 61.98% of the time. It is a debug-only routine that is used only in this test executable and does not appear in my development executables, so I can ignore it. But I see that BigInt::import() is also taking a lot of CPU time (10.18%), so I should look at how to optimize that routine. Each time I try an optimization, I should re-run perf and compare the numbers. Things that I expect to be optimizations often make no difference or actually cost me time.
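
Since BigInt::validate() will not be in the real executables, it can help to ask what share each remaining routine has once validate is taken out of the picture. The sketch below does that arithmetic in Python for the top few entries of the report. It is only an illustration: the function name `share_without` is made up, the percentages are copied by hand from the output above, and it assumes that removing validate would leave the other routines' costs unchanged, which may not hold exactly.

```python
# Percentages copied from the top of the perf report output above.
REPORT = {
    "BigInt::validate": 61.98,
    "BigInt::import": 10.18,
    "BigInt::isDivisibleBy": 3.92,
    "BigInt::compare": 2.86,
}


def share_without(report, ignored):
    """Rescale each remaining entry as a percentage of what is left
    after dropping the `ignored` routine."""
    remaining = 100.0 - report[ignored]
    return {
        name: 100.0 * pct / remaining
        for name, pct in report.items()
        if name != ignored
    }


if __name__ == "__main__":
    rescaled = share_without(REPORT, "BigInt::validate")
    for name, pct in rescaled.items():
        print(f"{name}: {pct:.1f}%")
```

With validate excluded, BigInt::import() comes to roughly 27% of what remains (10.18 / 38.02), which makes it an even clearer first target than the raw 10.18% suggests.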
