c++stringmemory-reallocation

Going ones through string to count length takes longer time, than moving string a couple of times?


I wrote the following two functions. In the second function, I used reserve() so that there is no memory reallocation, but unfortunately the second function is slower than the first.

I used release mode and this CPU profiler in Visual Studio to count time. In the second function, reallocation takes place 33 times. So my question is: Really? Going one length string to count length takes longer time, than moving this string 33 times?

string commpres2(string str)
{
    string strOut;
    int count = 0;
    for (int i = 0; i < str.length(); ++i)
    {
        ++count;
        if (i < str.length() - 1)
        {
            if (str[i + 1] != str[i])
            {
                strOut += str[i];
                strOut += to_string(count);
                count = 0;
            }
        }
        else
        {
            strOut += str[i] + to_string(count);
        }
    }
    return strOut.length() < str.length() ? strOut : str;
}

string commpres3(string str)
{
    int compressedLength = 0;
    int countConsecutive = 0;
    for (int i = 0; i < str.length(); ++i)
    {
        ++countConsecutive;
        if (i + 1 >= str.length() || str[i] != str[i + 1]) 
        {
            compressedLength += 1 + 
                to_string(countConsecutive).length();
            countConsecutive = 0;
        }
    }
    if (compressedLength >= str.length())
        return str;
    string strOut;
    strOut.reserve(compressedLength);
    int count = 0;
    for (int i = 0; i < str.length(); ++i)
    {
        ++count;
        if (i < str.length() - 1)
        {
            if (str[i + 1] != str[i])
            {
                strOut += str[i];
                strOut += to_string(count);
                count = 0;
            }
        }
        else
        {
            strOut += str[i] + to_string(count);
        }
    }
    return strOut;
}

int main()
{
    string str = "aabcccccaaa";

    //str.size ~ 11000000;
    for (int i = 0; i < 20; ++i)
        str += str;
    commpres2(str); //107ms //30,32% CPU
    commpres3(str); //147ms //42,58% CPU
}

Solution

  • The 2nd function is doing more work than the 1st function, so of course it is going to take longer. Profiling the code should have shown you exactly where the code is spending its time. For instance, the 1st function loops through the str at most 1 time, but the 2nd function may loop through the same str 2 times, which by definition takes longer.

    And you haven't eliminated all memory allocations from the 2nd function, either. to_string() allocates memory, and you are calling it many times before and after calling reserve(). Eliminating all of the to_string() allocations is fairly simple, using std::snprintf() into a local buffer and then std::string::append() to add that buffer to your output std::string.

    You could forgo all of the pre-calculating and just reserve() the full str length even if you don't end up using all of that memory. You are not going to use up more than the original str length in the worse case scenario (no compression possible at all):

    inline int to_buffer(size_t number, char *buf, size_t bufsize)
    {
        return snprintf(buf, bufsize, "%zu", number);
    }
    
    string commpres3(const string &str)
    {
        string::size_type strLen = str.length();
        string strOut;
        strOut.reserve(strLen);
        size_t count = 0;
        char buf[25];
        for (string::size_type i = 0; i < strLen; ++i)
        {
            ++count;
            if (i < strLen - 1)
            {
                if (str[i + 1] != str[i])
                {
                    strOut += str[i];
                    strOut.append(buf, to_buffer(count, buf, sizeof(buf)));
                    count = 0;
                }
            }
            else
            {
                strOut += str[i];
                strOut.append(buf, to_buffer(count, buf, sizeof(buf)));
            }
            if (strOut.length() >= strLen)
                return str;
        }
        return strOut;
    }
    

    Or, if you must pre-calculate, you can replace the 1st set of to_string() calls with something else that returns the needed length without allocating memory dynamically (see this for ideas). When calculating the size to reserve, you don't need to actually convert an integer 123 to an allocated string "123" to know that it would take up 3 chars.

    inline int to_buffer(size_t number, char *buf, size_t bufsize)
    {
        return snprintf(buf, bufsize, "%zu", number);
    }
    
    inline int to_buffer_length(size_t number)
    {
        return to_buffer(number, nullptr, 0);
    }
    
    string commpres3(const string &str)
    {
        string::size_type strLen = str.length();
        string::size_type compressedLength = 0;
        size_t countConsecutive = 0;
        for (string::size_type i = 0; i < strLen; ++i)
        {
            ++countConsecutive;
            if (i < (strLen - 1))
            {
                if (str[i + 1] != str[i])
                {
                    strOut += 1 + to_buffer_length(countConsecutive);
                    countConsecutive = 0;
                }
            }
            else
            {
                strOut += 1 + to_buffer_length(countConsecutive);
            }
        }
        if (compressedLength >= strLen)
            return str;
        string strOut;
        strOut.reserve(compressedLength);
        size_t count = 0;
        char buf[25];
        for (string::size_type i = 0; i < strLen; ++i)
        {
            ++count;
            if (i < strLen - 1)
            {
                if (str[i + 1] != str[i])
                {
                    strOut += str[i];
                    strOut.append(buf, to_buffer(count, buf, sizeof(buf)));
                    count = 0;
                }
            }
            else
            {
                strOut += str[i];
                strOut.append(buf, to_buffer(count, buf, sizeof(buf)));
            }
        }
        return strOut;
    }