c++c++11move-semanticsrvalue-referenceref-qualifier

How to improve the efficiency of "str1 + str2 + str3 + ..." in C++14?


std::string Concatenate(const std::string& s1,
                        const std::string& s2,
                        const std::string& s3,
                        const std::string& s4,
                        const std::string& s5)
{
    return s1 + s2 + s3 + s4 + s5;
}

By default, return s1 + s2 + s3 + s4 + s5; may be equivalent to the following code:

auto t1 = s1 + s2; // Allocation 1
auto t2 = t1 + s3; // Allocation 2
auto t3 = t2 + s4; // Allocation 3

return t3 + s5; // Allocation 4

Is there an elegant way to reduce the allocation times to 1? I mean keeping return s1 + s2 + s3 + s4 + s5; not changed, but the efficiency is improved automatically. If it is possible, it can also avoid the programmer misusing std::string::operator +.

Does ref-qualifier member functions help?


Solution

  • The premise of the question that:

    s1 + s2 + s3 + s4 + s5 + ... + sn
    

    will require n allocations is incorrect.

    Instead it will require O(Log(n)) allocations. The first s1 + s1 will generate a temporary. Subsequently a temporary (rvalue) will be the left argument to all subsequent + operations. The standard specifies that when the lhs of a string + is an rvalue, that the implementation simply append to that temporary and move it out:

    operator+(basic_string<charT,traits,Allocator>&& lhs,
              const basic_string<charT,traits,Allocator>& rhs);
    
    Returns: std::move(lhs.append(rhs))
    

    The standard also specifies that the capacity of the string will grow geometrically (a factor between 1.5 and 2 is common). So on every allocation, capacity will grow geometrically, and that capacity is propagated down the chain of + operations. More specifically, the original code:

    s = s1 + s2 + s3 + s4 + s5 + ... + sn;
    

    is actually equivalent to:

    s = s1 + s2;
    s += s3;
    s += s4;
    s += s5;
    // ...
    s += sn;
    

    When geometric capacity growth is combined with the short string optimization, the value of "pre-reserving" the correct capacity is limited. I would only bother doing that if such code actually shows up as a hot spot in your performance testing.