c++ performance profiling

What should I check: cpu time or wall time?


I have two algorithms to do the same task. To examine their performance, what should I check: cpu time or wall time? I think it is cpu time, right?

I am also parallelizing my code. To check how well the parallelization performs, what should I check: cpu time or wall time? I think it is wall time, right?

Assume I have achieved ideal parallelism using multiple threads. I think the cpu time for 1 thread will be the same as for 8 threads, and the wall time for 1 thread will be 8 times longer than for 8 threads. Is that right?

Also, is there an easy way to check those times?


Solution

  • The answer depends on what you're really trying to measure.

    If you have a couple of small code sequences where each runs on a single CPU (i.e., it's basically single-threaded) and you want to know which is faster, you probably want CPU time. This will tell you the time taken to execute that code, without counting other things like I/O, task switches, time spent on other processes, interrupt handling, etc. [Note: although it attempts to ignore other factors, you'll still usually get the most accurate results with the system otherwise as quiescent as possible.]
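
    As a rough sketch of how you might read both times around a piece of code, the snippet below uses std::clock() for CPU time and std::chrono::steady_clock for wall time. The names Timing and time_once are hypothetical helpers made up for this illustration. It also assumes a POSIX-like system, where std::clock() reports processor time for the whole process; some implementations (notably MSVC) return elapsed wall time from std::clock() instead.

    ```cpp
    #include <chrono>
    #include <ctime>
    #include <thread>

    // Hypothetical helper type: both timings for one run, in seconds.
    struct Timing {
        double cpu_seconds;
        double wall_seconds;
    };

    // Runs `work` once and reports CPU time and wall time for that call.
    // Assumption: std::clock() measures processor time used by the process
    // (true on POSIX systems; MSVC's std::clock() measures wall time).
    template <typename F>
    Timing time_once(F&& work) {
        const std::clock_t cpu_start = std::clock();
        const auto wall_start = std::chrono::steady_clock::now();

        work();

        const std::clock_t cpu_end = std::clock();
        const auto wall_end = std::chrono::steady_clock::now();

        return Timing{
            static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC,
            std::chrono::duration<double>(wall_end - wall_start).count()
        };
    }

    // Illustrative workload that waits without computing anything,
    // standing in for I/O or other time spent off the CPU.
    inline void wait_100ms() {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
    ```

    Calling time_once(wait_100ms) on a POSIX system would typically give a wall time of about 0.1 s and a CPU time close to zero, which is exactly the waiting that CPU time leaves out.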

    If you're writing multi-threaded code and want to measure how well you're distributing your code across processors/cores, you'll probably measure both CPU time and wall time, and compare the two. If, for example, you have 4 cores available, your ideal would be that the wall time is 1/4th the CPU time.
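
    One way to make that comparison concrete is to divide CPU time by wall time, which gives a rough estimate of how many cores were kept busy. The sketch below is only an illustration: spin, ParallelTiming and time_parallel are hypothetical names, the workload is a placeholder loop, and it again assumes that std::clock() returns the CPU time summed over all threads of the process (as on POSIX systems, but not with MSVC).

    ```cpp
    #include <chrono>
    #include <ctime>
    #include <thread>
    #include <vector>

    // Placeholder CPU-bound workload (hypothetical); volatile keeps the
    // compiler from optimizing the loop away.
    inline void spin(long iterations) {
        volatile double sink = 0.0;
        for (long i = 0; i < iterations; ++i) {
            sink = sink + i * 0.5;
        }
    }

    // Hypothetical result type: timings in seconds plus their ratio.
    struct ParallelTiming {
        double cpu_seconds;
        double wall_seconds;

        // Approximate number of cores kept busy: CPU time / wall time.
        double cores_used() const {
            return wall_seconds > 0.0 ? cpu_seconds / wall_seconds : 0.0;
        }
    };

    // Runs `spin` on `thread_count` threads and times the whole batch.
    inline ParallelTiming time_parallel(unsigned thread_count,
                                        long iterations_per_thread) {
        const std::clock_t cpu_start = std::clock();
        const auto wall_start = std::chrono::steady_clock::now();

        std::vector<std::thread> workers;
        workers.reserve(thread_count);
        for (unsigned i = 0; i < thread_count; ++i) {
            workers.emplace_back(spin, iterations_per_thread);
        }
        for (auto& worker : workers) {
            worker.join();
        }

        const std::clock_t cpu_end = std::clock();
        const auto wall_end = std::chrono::steady_clock::now();

        return ParallelTiming{
            static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC,
            std::chrono::duration<double>(wall_end - wall_start).count()
        };
    }
    ```

    With 4 threads on 4 otherwise idle cores, cores_used() close to 4 would correspond to the ideal case above; a value well below the thread count suggests the threads spent time waiting (on locks, I/O, or each other) rather than computing.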

    So, for multithreaded code you'll often end up doing things in two phases: first you look at the time to execute using one thread, using CPU time. You optimize to get that to a (reasonable) minimum. Then in the second phase, you compare wall time to CPU time, to try to use multiple cores efficiently. Since changing one often affects the other, you may well iterate through the two a number of times (and often compromise between the two to some degree).

    Just as a really general rule of thumb, you tend to use CPU time to measure microscopic benchmarks of individual bits of code, and wall time for larger (system-level) benchmarks. In other words, when you want to measure how fast one piece of code runs, and nothing else, CPU time generally makes the most sense. When you want to include the effects of things like disk I/O time, caching, etc., then you're a lot more likely to care about wall time.