c++ string vector reserved

How to reserve a vector of strings, if string size is variable?


I want to add many strings to a vector, and from what I've found, calling reserve() first is more efficient. For a vector of ints this makes sense: an int is typically 4 bytes, so calling reserve(10) clearly reserves 40 bytes. I know the number of strings, which is about 60000. Should I call vector.reserve(60000)? How would the compiler know the size of my strings, given that it doesn't know whether they are of length 5 or 500?


Solution

  • The compiler doesn't know the size of your strings; it knows the size of a std::string object, and that size does not depend on the length of the text the string holds. That is because, most of the time [1], std::string allocates its characters on the heap, so the object itself only holds a pointer plus bookkeeping such as the length and capacity.

    This also means that when you reserve the vector, you don't yet reserve memory for the strings' contents. This is, however, not always a problem. std::strings come from somewhere: if the strings you receive are the return value of a function (i.e., you have them by value), then the memory for each string is already allocated (in the return value). In that case, handing the string over to the vector with std::move(), or with std::swap() against an existing element, transfers that buffer instead of copying it, which speeds up populating the vector.
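    As a rough sketch of the by-value case described above: the producer function make_line() and the helper collect() are hypothetical names invented for this illustration, and the sketch uses std::move() rather than std::swap() to hand each returned string to the vector.

    ```cpp
    #include <cstddef>
    #include <string>
    #include <utility>
    #include <vector>

    // Hypothetical producer: stands in for whatever function returns a string by value.
    std::string make_line(std::size_t i) {
        return "line number " + std::to_string(i) + " with some payload text";
    }

    // Reserves room for the std::string objects once, then moves each
    // returned string into the vector instead of copying its characters.
    std::vector<std::string> collect(std::size_t count) {
        std::vector<std::string> v;
        v.reserve(count);                 // space for `count` string objects, no characters yet
        for (std::size_t i = 0; i < count; ++i) {
            std::string s = make_line(i); // the string's own buffer is allocated here
            v.push_back(std::move(s));    // typically hands the buffer over without copying it
        }
        return v;
    }
    ```

    Calling collect(60000) from main() would fill the vector without reallocating it along the way. Writing v.push_back(make_line(i)) directly moves as well, because the return value is a temporary.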

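    If, however, you populate the vector by passing references to its elements to a function, then the callee performs the operations that trigger the allocation. In this case, you'd likely want to loop over the vector and reserve each string. Note that reserve() on the vector creates no elements, so the vector has to be given its size (constructed with a count, or resize()d) before the loop has anything to visit:

    #include <string>
    #include <vector>

    int main() {
        std::vector<std::string> v(60000); // expected number of strings, all empty
        for (auto& s : v) {
            s.reserve(500); // expected/max. size of each string
        }
    }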
    

    [1] A specific std::string implementation may keep a small, fixed-size buffer inside the object for short strings (the small-string optimization) and allocate on the heap only when the string is longer than that.
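
The points above can be checked with a small sketch. The names string_object_size, reserve_adds_no_elements() and stored_inline() are made up for this illustration, and the result of stored_inline() for short strings depends on the standard library in use, so treat it as a probe rather than a guarantee.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// sizeof(std::string) is a compile-time constant: it is the same for an
// empty string and for one holding 500 characters.
constexpr std::size_t string_object_size = sizeof(std::string);

// Returns true if reserve() raised the capacity but created no elements.
bool reserve_adds_no_elements(std::size_t n) {
    std::vector<std::string> v;
    v.reserve(n);
    return v.empty() && v.capacity() >= n;
}

// Returns true if the characters live inside the string object itself
// (small buffer) rather than in a separate heap allocation.
bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    std::less<const char*> before; // total order, valid even for unrelated pointers
    return !before(s.data(), obj_begin) && before(s.data(), obj_end);
}
```

On common implementations string_object_size is a few dozen bytes, so v.reserve(60000) sets aside 60000 times that amount and nothing for the characters. A 500-character string cannot fit in the object, so stored_inline() reports false for it; whether a 5-character string is stored inline depends on the implementation.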