When a C function has to return multiple values, there are a few ways to go about it.
Right now I'm interested in the relative efficiency of two of those methods:
a) bundle the values in a struct foo. Populate a local foo, and return that.
b) pass pointers to be populated.
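To make the two options concrete, here is a minimal sketch of the same function written both ways. The names (struct divmod, divmod_by_value, divmod_by_pointer) are made up for illustration and are not taken from the legacy code.

```c
#include <assert.h>

/* Hypothetical example type: quotient and remainder bundled together. */
struct divmod {
    long quot;
    long rem;
};

/* Method (a): populate a local struct and return it by value. */
struct divmod divmod_by_value(long n, long d)
{
    assert(d != 0);
    struct divmod r;
    r.quot = n / d;
    r.rem = n % d;
    return r;
}

/* Method (b): the caller passes pointers to be populated. */
void divmod_by_pointer(long n, long d, long *quot, long *rem)
{
    assert(d != 0);
    *quot = n / d;
    *rem = n % d;
}
```

A caller of (a) writes `struct divmod r = divmod_by_value(17, 5);`, while a caller of (b) declares `long q, m;` first and then calls `divmod_by_pointer(17, 5, &q, &m);`.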
(I'm working on some legacy code that has a mix of the two.)
For the purposes of this post:
Obviously inlining would make the question moot.
Can the different methods affect the compiler's ability to inline?
If not inlined, will there be a performance difference between the two methods?
Can the placement of the pointer-to-return-value parameters in the argument list have an effect, either on the compiler's ability to inline or on non-inlined performance?
Edited (a) for clarity.
On Linux / x86-64, a struct of exactly two words (e.g. two pointers, two intptr_t-s or two long-s) is returned in two registers. This is a lot faster than e.g. malloc-ing it, and might be faster than writing into a two-word struct allocated on the call stack by the caller. (That stack slot is likely to sit in some fast CPU cache anyway; remember that on recent processors a cache miss can cost a hundred nanoseconds or more, i.e. the time needed for at least a hundred register-to-register integer additions.)
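As a rough way to check the two-register claim on your own machine, here is a small sketch. The struct span type and the skip_spaces function are hypothetical examples, and `__attribute__((noinline))` is a GCC/Clang extension used here only so that the call survives optimization and can be inspected.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical two-word struct: a pointer plus a length. */
struct span {
    const char *ptr;
    intptr_t len;
};

/* Holds on Linux / x86-64 (16 bytes); other platforms may differ. */
static_assert(sizeof(struct span) == 2 * sizeof(void *),
              "struct span should be exactly two words");

/* noinline (GCC/Clang extension) keeps a real call to look at. */
__attribute__((noinline))
struct span skip_spaces(const char *s, intptr_t len)
{
    struct span out = { s, len };
    /* Advance past leading spaces, shrinking the length to match. */
    while (out.len > 0 && *out.ptr == ' ') {
        out.ptr++;
        out.len--;
    }
    return out;
}
```

Compiling this with `gcc -O2 -S` and reading the generated assembly should show the two fields coming back in RAX and RDX, which is what the System V x86-64 calling convention specifies for a struct like this one. No hidden pointer to a caller-provided buffer is needed.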
But inlining a function is not always faster. You could also use partial evaluation techniques or C++ code generation.
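With a recent GCC compiler, consider also compiling all C or C++ files, and linking them, with link-time optimization (e.g. -flto -O2 at both the compile and the link step), which lets the compiler inline across translation units.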