Using 'perf' to profile your program in Fedora

Installing perf

If perf is not already on your system, run this command:

  $ sudo yum install perf

On my system, that installed perf-3.7.9-101.fc17.x86_64.
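Newer Fedora releases replaced yum with dnf, so the exact install command depends on the release you are running. A minimal sketch, assuming a POSIX shell: the helper name perf_install_hint is made up for this post, and it only prints the command to run rather than installing anything.

```shell
#!/bin/sh
# Hypothetical helper: report whether perf is present and, if not,
# print an install command for whichever package manager is found.
perf_install_hint() {
    if command -v perf >/dev/null 2>&1; then
        # perf --version prints the installed perf version
        echo "perf already installed: $(perf --version)"
    elif command -v dnf >/dev/null 2>&1; then
        echo "sudo dnf install perf"
    elif command -v yum >/dev/null 2>&1; then
        echo "sudo yum install perf"
    else
        echo "no yum or dnf found; use your distribution's package manager"
    fi
}

perf_install_hint
```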

Running your application to collect data

One of the nice things about perf is that you don't need to compile your executable in any special way. As far as I know, perf works no matter how you built the executable. To profile your application, do this:

  $ perf stat ./bigint_test

   Performance counter stats for './bigint_test':

        11461.887097 task-clock                #    1.015 CPUs utilized        
                 155 context-switches              #    0.014 K/sec                
                   0 cpu-migrations                  #    0.000 K/sec                
                 335 page-faults                      #    0.029 K/sec                
     <not supported> cycles
                   0 stalled-cycles-frontend      #    0.00% frontend cycles idle  
     <not supported> stalled-cycles-backend
     <not supported> instructions
     <not supported> branches
     <not supported> branch-misses

        11.290034097 seconds time elapsed
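The rows marked <not supported> are hardware counters that this machine does not expose. The rows that did print (task-clock, context-switches, cpu-migrations, page-faults) are software events, and perf stat can be asked for just those with -e. A sketch, assuming perf is on the PATH; the helper name stat_sw is made up for illustration.

```shell
#!/bin/sh
# Software events are counted by the kernel itself, so they do not
# depend on hardware performance counters being available.
SW_EVENTS="task-clock,context-switches,cpu-migrations,page-faults"

# Hypothetical helper: run perf stat on a command, software events only.
stat_sw() {
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found" >&2
        return 127
    fi
    perf stat -e "$SW_EVENTS" "$@"
}

# Only run the example if the test program from this post is present.
if [ -x ./bigint_test ]; then
    stat_sw ./bigint_test
fi
```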

I'm running Fedora in a VM and have not enabled all of the performance pass-through options, which is why several counters above show up as <not supported>. My roommate is working on computing very large prime numbers, and this data is hugely valuable for him. I don't use this output from perf much myself. I prefer to run perf this way:

  $ perf record ./bigint_test
  [ perf record: Woken up 5 times to write data ]
  [ perf record: Captured and wrote 1.351 MB (~59033 samples) ]
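By default, perf record writes its samples to a file named perf.data in the current directory, and perf report reads that same file. Both commands accept an explicit path (-o for record, -i for report), which helps when the current directory is not a good place for the file. A sketch, with a made-up file location and made-up helper names:

```shell
#!/bin/sh
# Hypothetical location; any directory owned by your user should work.
PERF_DATA="$HOME/bigint_test.perf.data"

# Record samples for a command, writing to $PERF_DATA
# instead of the default ./perf.data.
record_to_home() {
    perf record -o "$PERF_DATA" "$@"
}

# Open the report for the file written by record_to_home.
report_from_home() {
    perf report -i "$PERF_DATA" "$@"
}

# Usage:
#   record_to_home ./bigint_test
#   report_from_home
```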

In my VM I'm doing my development in a directory mounted from the host OS. This allows me to share files easily between the host OS and the guest OS (Fedora). One drawback is that files in that mounted directory are not owned by my user. perf complains if I try to examine a data file I don't own, so I move the perf.data file that perf record wrote to my home directory and examine it there.

  $ mv perf.data ~
  $ cd
  $ perf report
   61.98%  bigint_test  bigint_test        [.] BigInt::validate(char const*, int) const
   10.18%  bigint_test  bigint_test        [.] BigInt::import(int)
    3.92%  bigint_test  bigint_test        [.] BigInt::isDivisibleBy(int) const
    2.86%  bigint_test  bigint_test        [.] BigInt::compare(BigInt const&) const
    2.49%  bigint_test  bigint_test        [.] BigInt::BigInt(int)
    2.28%  bigint_test  bigint_test        [.] BigInt::extendBuffer(unsigned int)
    2.00%  bigint_test  bigint_test        [.] BigInt::length() const
    1.60%  bigint_test  bigint_test        [.] testDivisibility(unsigned int)
    1.58%  bigint_test  bigint_test        [.] BigInt::~BigInt()
    1.49%  bigint_test       [.] __memcpy_ssse3_back
    1.38%  bigint_test  bigint_test        [.] BigInt::subtractStrings(char*, char const*)
    1.26%  bigint_test  bigint_test        [.] BigInt::addStrings(char*, char const*)
    0.84%  bigint_test  bigint_test        [.] BigInt::isZero() const
    0.70%  bigint_test  bigint_test        [.] BigInt::operator=(BigInt const&)

Now I have some really useful information. BigInt::validate() is taking up 61.98% of the time. This is a debug-only routine that is only used in this test executable and does not appear in my development executables, so I can ignore it. But I see that BigInt::import() is next, at 10.18% of the samples. I should go look at how I can optimize that routine. And each time I try an optimization, I should re-run the performance metrics. Things that I think will be optimizations often make no difference or actually cost me time.
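One way to make that re-measuring routine is to keep one data file per attempt and let perf diff compare two of them. A rough sketch under some assumptions: the labels, file names, and helper names below are invented for this post, and the data files are kept in the home directory so ownership is not an issue.

```shell
#!/bin/sh
# Record one labelled run, e.g. "before" or "after".
profile_run() {
    label=$1
    shift
    perf record -o "$HOME/perf-$label.data" "$@"
}

# Print the top of the plain-text report for a labelled run.
top_symbols() {
    perf report --stdio -i "$HOME/perf-$1.data" | head -n 30
}

# Show how each symbol's share changed between two labelled runs.
compare_runs() {
    perf diff "$HOME/perf-$1.data" "$HOME/perf-$2.data"
}

# Usage:
#   profile_run before ./bigint_test
#   (change BigInt::import, rebuild)
#   profile_run after ./bigint_test
#   compare_runs before after
```

If the share for BigInt::import() does not drop between the two runs, the change was probably not an optimization after all.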
