I am familiar with two approaches, but both of them have their limitations.
The first is to use the RDTSC instruction. The problem, however, is that it doesn't count the cycles of my program in isolation, so it is sensitive to noise from concurrent processes.
The second option is to use the clock library function. I thought this approach was reliable, since I expected it to count cycles for my program only (which is what I intend to achieve). However, it turns out that in my case it measures elapsed time and then multiplies it by CLOCKS_PER_SEC. This is not only unreliable but also wrong, since CLOCKS_PER_SEC is set to 1,000,000, which does not correspond to the actual frequency of my processor.
Given the limitations of these approaches, is there a better, more reliable alternative that produces consistent results?
A lot here depends on how large an amount of time you're trying to measure.
RDTSC can be (almost) 100% reliable when used correctly. It is, however, of use primarily for measuring truly microscopic pieces of code. If you want to measure two sequences of, say, a few dozen or so instructions apiece, there's probably nothing else that can do the job nearly as well.
Using it correctly is somewhat challenging though. Generally speaking, to get good measurements you want to do at least the following:

- Pin the thread to a single core, so you never compare timestamps taken on different cores.
- Serialize execution around each reading (e.g., CPUID or fences, or use RDTSCP) so out-of-order execution doesn't let the measured code drift across the measurement boundaries.
- Run at elevated priority, to reduce the chance of being preempted mid-measurement.
- Measure the overhead of the timing instructions themselves, and subtract it out.
- Repeat the measurement many times and keep the minimum, since interference can only ever add cycles, never remove them.
- Know your CPU: on modern processors the TSC ticks at a constant rate regardless of frequency scaling, so it counts reference cycles rather than actual core cycles.
If, on the other hand, you're trying to measure something that takes anywhere from, say, 100 ms on up, RDTSC is pointless. It's like trying to measure the distance between cities with a micrometer. For this, it's generally best to assure that the code in question takes (at least) the better part of a second or so. clock isn't particularly precise, but for a length of time on this general order, the fact that it might only be accurate to, say, 10 ms or so, is more or less irrelevant.