c++, optimization, c++11, return-by-reference, return-by-value

value semantics vs output params with large data structures


2013 Keynote: Chandler Carruth: Optimizing the Emergent Structures of C++

Can anyone put this in the context of this answer: https://stackoverflow.com/a/14229152

I hear that repeated over and over but, well, for me a function returning something is a source. Output parameters by reference take that characteristic out of a function, and removing such a hard-coded characteristic from a function allows one to manage, outside of it, how the output will be stored/reused.

My question is, even in the context of that SO answer, is there a way to tell, by restructuring the code in some other equivalent way, "ok now see, value semantics done this way doesn't lose to the output-param version", or were Chandler's comments specific to some contrived situations? I have even seen Andrei Alexandrescu arguing this in a talk, saying that you can't escape using by-ref output parameters for better performance.

For another take on Andrei's comments, see Eric Niebler: Out Parameters, Move Semantics, and Stateful Algorithms.
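
To make the comparison concrete, here is a minimal sketch of the two styles I'm comparing (Widget, make_widget and fill_widget are made-up names, purely for illustration): a function that is a "source" returning by value, and an output parameter the caller owns and can reuse:

    #include <string>
    #include <vector>

    struct Widget {
        std::vector<int> data;
        std::string name;
    };

    // Value semantics: the function is a "source"; with C++11 move semantics
    // and RVO/NRVO the returned object is moved or elided, not deep-copied.
    Widget make_widget() {
        Widget w;
        w.data.resize(1024);
        return w;
    }

    // Output parameter: the caller owns the object and decides how its storage
    // is allocated and reused across calls.
    void fill_widget(Widget& out) {
        out.data.resize(1024);   // may reuse capacity left over from a previous call
    }

    int main() {
        Widget a = make_widget();   // a fresh object per call

        Widget b;
        for (int i = 0; i < 10; ++i)
            fill_widget(b);         // the same buffers get reused each iteration
    }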


Solution

  • It's either an overstatement, a generalization, a joke, or Chandler's idea of "Perfectly Reasonable Performance" (using modern C++ toolchains/libs) is unacceptable for my programs.

    I find it a fairly narrow scope of optimizations. Penalties exist beyond that scope which cannot be ignored, due to the actual complexities and designs found in real programs - the heap allocations in the getline example were one such case (see the first sketch at the end of this answer). The specific optimizations may or may not always be applicable to the program in question, despite your attempts to reduce them. Real-world structures will reference memory which may alias. You can reduce that, but it's not practical to believe that you can eliminate aliasing (from an optimizer's perspective).

    Of course, RBV (return by value) can be a great thing - it's just not appropriate for all cases. Even the link you referenced pointed out how one can avoid a ton of allocations/frees. Real programs and the data structures found in them are far more complex.

    Later in the talk, he goes on to criticize the use of member functions (ref: S::compute()); the second sketch at the end of this answer shows the kind of transformation involved. Sure, there is a point to take away, but is it really reasonable to avoid using these language features entirely because it makes the optimizer's job easier? No. Will it always result in more readable programs? No. Will these code transformations always result in measurably faster programs? No. Are the changes required to transform your codebase worth the time you invest? Sometimes. Can you take away some points and make better-informed decisions which impact some of your existing or future codebase? Yes.

    Sometimes it helps to break down how exactly your program would execute, or what it would look like in C.

    The optimizer will not solve all performance problems, and you should not rewrite programs on the assumption that the programs you are dealing with are "completely brain dead and broken designs", nor should you believe that using RBV will always result in "Perfectly Reasonable Performance". You can utilize new language features and make the optimizer's job easier, but although there is much to gain there, there are often more important optimizations to invest your time in.

    It's fine to consider the proposed changes; ideally you would measure their impact on real-world execution times and on your source code before adopting these suggestions.

    For your example: even copying/assigning large structures by value can have significant costs. Beyond the cost of running constructors and destructors (along with their associated creation/cleanup of the resources they acquire and own, as pointed out in the link you reference), even something as simple as avoiding unnecessary structure copies can save you a ton of CPU time if you use references (where appropriate) - the final sketch below illustrates this. The structure copy may be as simple as a memcpy. These aren't contrived problems; they appear in actual programs, and their cost can increase greatly with your program's complexity. Is reducing aliasing of some memory and other optimizations worth the costs, and does it result in "Perfectly Reasonable Performance"? Not always.
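
    First, a hedged sketch of the getline point above (a simple character-counting loop; not the exact code from the talk): the return-by-value form constructs a fresh string - and typically a fresh heap allocation - per line, while the out-parameter form lets one string's capacity be reused across iterations.

        #include <cstddef>
        #include <iostream>
        #include <string>

        // A getline-like helper with value semantics: each call returns a fresh
        // string, which usually means a fresh heap allocation for every long line.
        std::string getline_by_value(std::istream& in) {
            std::string line;
            std::getline(in, line);
            return line;    // moved/elided, but its buffer was still newly allocated
        }

        std::size_t count_by_value(std::istream& in) {
            std::size_t chars = 0;
            while (in) {
                std::string line = getline_by_value(in);   // new buffer each pass
                chars += line.size();
            }
            return chars;
        }

        // The out-parameter style std::getline actually has: one string lives
        // outside the loop and its capacity is reused on every iteration.
        std::size_t count_with_reuse(std::istream& in) {
            std::size_t chars = 0;
            std::string line;
            while (std::getline(in, line))
                chars += line.size();
            return chars;
        }

        int main() {
            std::cout << count_with_reuse(std::cin) << " characters\n";
        }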
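
    Second, a hedged sketch of the member-function point (S and the bodies are made up; this is not the code from the talk): the member form works on state reached through `this`, while the free-function form takes its inputs as explicit parameters and returns its result by value, so the data flow is visible to both the reader and the optimizer.

        #include <vector>

        // Member-function form: inputs and outputs live behind `this`, so the
        // computation's data flow is implicit in the object's state.
        struct S {
            std::vector<int> data;
            long total = 0;

            void compute() {
                total = 0;
                for (int v : data)
                    total += v;
            }
        };

        // Free-function form: explicit input, result returned by value; nothing
        // is reached through a pointer to the enclosing object.
        long sum(const std::vector<int>& data) {
            long total = 0;    // a local the compiler can freely keep in a register
            for (int v : data)
                total += v;
            return total;
        }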
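
    Finally, the large-structure point (BigConfig and its size are invented for illustration): passing by value copies the entire structure on every call - at best a memcpy, at worst constructor/destructor and resource churn - while passing by const reference copies nothing.

        #include <array>
        #include <cstddef>

        // A hypothetical large, trivially copyable structure (~32 KB).
        struct BigConfig {
            std::array<double, 4096> table;
        };

        // Pass by value: every call copies the full 32 KB (at best a memcpy; for
        // non-trivial members, constructors and destructors would run as well).
        double lookup_by_value(BigConfig cfg, std::size_t i) {
            return cfg.table[i];
        }

        // Pass by const reference: no copy; the function reads the caller's object.
        double lookup_by_ref(const BigConfig& cfg, std::size_t i) {
            return cfg.table[i];
        }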