Suppose I have a std::tuple:
std::tuple<int,int,int,int> t = {1,2,3,4};
and I want to use std::tie purely for readability, like this:
int a, b, c, d; // in real context these names would be meaningful
std::tie(a, b, c, d) = t;
versus just using std::get<0>(t), etc.
Would GCC optimize the memory use of this tuple, or would it allocate additional space for the a, b, c, d variables?
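For reference, the two access styles being compared can be written side by side. This is only a minimal sketch: the helper names sum_with_tie and sum_with_get are hypothetical and do not appear in the question, and element access is written with std::get<I>(t), the standard accessor for std::tuple.

```cpp
#include <tuple>

// Hypothetical helper: unpack with std::tie, then work with named locals.
int sum_with_tie(const std::tuple<int, int, int, int>& t)
{
    int a, b, c, d;            // in real code these names would be meaningful
    std::tie(a, b, c, d) = t;  // copies each element into a, b, c, d
    return a + b + c + d;
}

// Hypothetical helper: read each element directly by index.
int sum_with_get(const std::tuple<int, int, int, int>& t)
{
    return std::get<0>(t) + std::get<1>(t) + std::get<2>(t) + std::get<3>(t);
}
```

Both helpers compute the same value; whether they also compile to the same machine code is exactly the question being asked.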
In this case I don't see any reason why not: under the as-if rule, the compiler only has to emulate the observable behavior of the program. A quick experiment using godbolt:
#include <tuple>
#include <cstdio>
void func(int x1, int x2, int x3, int x4)
{
    std::tuple<int, int, int, int> t{x1, x2, x3, x4};
    int a, b, c, d; // in real context these names would be meaningful
    std::tie(a, b, c, d) = t;
    printf("%d %d %d %d\n", a, b, c, d);
}
shows that gcc does indeed optimize it away:
func(int, int, int, int):
movl %ecx, %r8d
xorl %eax, %eax
movl %edx, %ecx
movl %esi, %edx
movl %edi, %esi
movl $.LC0, %edi
jmp printf
On the other hand, if you take the address of t and print it, we now have observable behavior that relies on t existing:
printf( "%p\n", static_cast<void*>(&t) );
and we can see that gcc no longer optimizes t away:
movl %esi, 12(%rsp)
leaq 16(%rsp), %rsi
movd 12(%rsp), %xmm1
movl %edi, 12(%rsp)
movl $.LC0, %edi
movd 12(%rsp), %xmm2
movl %ecx, 12(%rsp)
movd 12(%rsp), %xmm0
movl %edx, 12(%rsp)
movd 12(%rsp), %xmm3
punpckldq %xmm2, %xmm1
punpckldq %xmm3, %xmm0
punpcklqdq %xmm1, %xmm0
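For completeness, the modified function might look like the sketch below. It assumes the address-printing line is simply added to the body of func ahead of the existing printf; the name func_with_address is hypothetical, and the exact assembly you get will depend on the compiler version and flags.

```cpp
#include <cstdio>
#include <tuple>

// Hypothetical name; mirrors func from the first experiment with one extra line.
void func_with_address(int x1, int x2, int x3, int x4)
{
    std::tuple<int, int, int, int> t{x1, x2, x3, x4};

    // Printing the address makes the existence of t observable, so the
    // compiler can no longer treat the tuple as a purely notional object.
    // Assumption: this line is placed before the original printf.
    std::printf("%p\n", static_cast<void*>(&t));

    int a, b, c, d; // in real context these names would be meaningful
    std::tie(a, b, c, d) = t;
    std::printf("%d %d %d %d\n", a, b, c, d);
}
```

Compiling this alongside the original func makes it easy to compare the two listings directly.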
At the end of the day, you need to look at what the compiler generates and profile your code; in more complicated cases it may surprise you. Just because the compiler is allowed to perform a certain optimization does not mean it will. I have looked at more complicated cases where the compiler does not do what I would expect with std::tuple. godbolt is a very helpful tool here: I cannot count how many assumptions about optimizations I used to hold that were upended by plugging simple examples into godbolt.
Note: I typically use printf in these examples because iostreams generates a lot of code that gets in the way of the example.