c++ string dynamic-memory-allocation string-concatenation theory

std::string concatenation in C++


What is the principle behind std::string concatenation in C++? How does it work with respect to memory allocation?

I found out while exploring a leetcode card that in Java:
"concatenation works by first allocating enough space for the new string, copy the contents from the old string and append to the new string".
And unlike Java, "there is no noticeable performance impact in C++".

I thought it would be the same in C++ because of how dynamic arrays work. As I remember, to increase the capacity of a dynamic array, the program first allocates a new block of the desired size and then copies all the elements of the old array into it.


Solution

  • There are two operators in C++ that can be used for std::string concatenation:

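    The binary operator, as in std::string result = a + b, behaves roughly like:

    std::string result;                   // new, empty string
    result.reserve(a.size() + b.size());  // at most one allocation up front
    result.append(a);                     // copy a
    result.append(b);                     // copy b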
    

    The compound assignment operator a += b is equivalent to:

    a.append(b);
    

    Both of these may allocate memory unless the destination string already has enough capacity. That can be because of an earlier allocation, such as a reserve() call, or because of the small string optimization, which stores short strings (typically up to about 15 bytes, depending on the standard library implementation) directly inside the std::string object instead of in a separate heap block.
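To make the capacity check concrete, here is a minimal sketch. The helper names fits_without_allocation and empty_capacity are made up for illustration and are not part of the standard library. The value returned by empty_capacity is implementation-specific, and the standard does not require the small string optimization at all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: true if `extra` more bytes fit into the storage that
// `s` already owns, so an append of that size should not need to allocate.
bool fits_without_allocation(const std::string& s, std::size_t extra) {
    assert(s.size() <= s.capacity());  // invariant of std::string
    return extra <= s.capacity() - s.size();
}

// Hypothetical helper: capacity of an empty string. On implementations with
// the small string optimization this is the size of the inline buffer.
std::size_t empty_capacity() {
    return std::string().capacity();
}
```

For example, after s.reserve(100) on an empty string, fits_without_allocation(s, 50) is true, because reserve() guarantees a capacity of at least 100.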

    The significant difference between Java and C++ is the += operator (or the .append() call), which modifies the string in place. Java strings are immutable, so this is not available there. When an in-place append outgrows the current capacity, C++ implementations usually allocate more space than immediately needed, so that not every call results in a new memory allocation. On the calls where the capacity already suffices, the existing part of the string does not need to be copied, only the appended part.
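The over-allocation on append can be observed with a small experiment. This is only a sketch: count_capacity_changes is a hypothetical helper, and it treats a change in capacity() as a proxy for a reallocation. The exact growth policy is not specified by the standard, so the count differs between standard library implementations.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: appends n characters one at a time and counts how
// often capacity() changes, which is used here as a proxy for a reallocation.
std::size_t count_capacity_changes(std::size_t n) {
    std::string s;
    std::size_t changes = 0;
    std::size_t last = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s += 'x';  // in-place append of a single character
        if (s.capacity() != last) {
            ++changes;
            last = s.capacity();
        }
    }
    assert(s.size() == n);
    return changes;
}
```

On the common implementations, which grow the buffer geometrically, count_capacity_changes(100000) comes out as a few dozen at most rather than 100000, which is why repeated += is much cheaper than building a fresh string every time.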

    The claim that "there is no noticeable performance impact in C++" is not true if you have a lot of strings or they are long. Unlike Java, where StringBuilder gives a speed advantage, C++ does not have a direct equivalent. Calling .reserve() ahead of time will give roughly equivalent performance. For best performance, you would reorganize the code to avoid the copying required for appending altogether, such as by using a rope data structure.
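The .reserve() approach mentioned above might look like the following sketch. The function name concat_reserved is an assumption chosen for this example. It sums the sizes first, reserves once, and then appends every part, so the appends should not need to grow the buffer again.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: concatenates all parts after a single reserve() call.
std::string concat_reserved(const std::vector<std::string>& parts) {
    std::size_t total = 0;
    for (const std::string& p : parts) {
        total += p.size();  // first pass: compute the final length
    }

    std::string result;
    result.reserve(total);  // at most one allocation for the whole result
    for (const std::string& p : parts) {
        result += p;        // second pass: append in place
    }
    assert(result.size() == total);
    return result;
}
```

Each character is still copied once into the result. Avoiding that copy as well is what a rope or a similar structure would be for.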