I was trying to measure the time taken by a for loop using std::chrono, but it reports 0 nanoseconds even when I make the loop take longer by increasing the bound. This is the code:
#pragma pack(1) // don't align; let's let it take longer
struct Foo{
int x;
char c;
int z;
} ;
void take_time()
{
Foo f;
auto t1 = std::chrono::system_clock::now();
register int c = 0;
int x=0,y=0,z=1;
for (c=0;c<10000;c++){ // even if I put 1000000000 it will take 0 nanosec !!!!!
f.z = x+y;
f.z += z-x+y;
}
std::cout<<"\ntaken time : "<< std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::system_clock::now()-t1).count()<<std::endl;
}
output:
taken time : 0
But when I increase the loop bound to a very large value, it suddenly takes forever! With c<100000000 it still takes 0 nanoseconds, but if I add one more '0' on the right it takes forever!
the answer:
As WhiZTiM said, the compiler removes the loop because it does nothing useful (thanks, gcc). But we really don't want that to happen when we are testing two algorithms to see which one is faster across different compilers (not just this one). To prevent it, we can insert an empty asm statement, asm(""), anywhere in the loop. That tells the compiler there are some low-level operations it can't optimize away! Alternatively, we can mark any variable used in the loop as volatile, which prevents the compiler from doing any optimization involving that variable. Thanks everyone, I hope this helps.
First of all, using uninitialized variables is a sin.
The optimizer definitely figured out that the loop was useless (really, what would the values of x, y, and z be in the loop?) and that its result wasn't used (no side effects), so it removed the loop from the generated code.
void take_time()
{
Foo f;
auto t1 = std::chrono::system_clock::now();
register int c = 0;
int x,y,z;
///// Result not used
for (c=0;c<10000;c++){ // even if I put 1000000000 it will take 0 nanosec !!!!!
f.z = x+y;
f.z += z-x+y;
}
/// We can discard the above
std::cout<<"\ntaken time : "<< std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::system_clock::now()-t1).count()<<std::endl;
}
BTW, the register keyword there is deprecated.
For GCC and Clang, there's a way to "scare" the optimizer out of optimizing away the use of certain variables. I use this function:
template<typename T>
void scareTheOptimizer(T& x){
asm volatile("" :: "p"((volatile void*)&x) : "memory");
}
So, when you call it in your loop, you should now see a nonzero timing.
void take_time()
{
Foo f;
auto t1 = std::chrono::system_clock::now();
int c = 0;
int x=0,y=0,z=1;
for (c=0;c<10000;c++){
f.z = x+y;
scareTheOptimizer(f.z); /// <---- Added Here
f.z += z-x+y;
}
std::cout<<"\ntaken time : "<< std::chrono::duration_cast<std::chrono::nanoseconds>(std::chrono::system_clock::now()-t1).count()<<std::endl;
}
See it Live On Coliru