Tags: c++, performance, error-handling, c++20, stringstream

Boolean testing vs exception in stringstream


I want to short-circuit when an I/O operation on a std::istringstream/std::ostringstream fails, to avoid unnecessarily calling the rest of any chained <</>> operators on the already failed (x)stringstream object. Some of my overloaded <</>> operators do a bit of non-trivial formatting/parsing, so calling them on a failed stream object will waste their CPU cycles.

The resulting ostringstream (oss in the below code) is then inserted into a std::ofstream in one piece. I've used std::cout instead for demonstration.

Two approaches come to my mind. Below is a very minimal example.

  1. testing with the bool operator:
#include <bitset>
#include <sstream>
#include <iostream>
#include <cstdlib>


int main()
{
    constexpr auto delimiter { ';' };

    std::ostringstream oss;

    const bool success = oss << "Hi" &&
                         oss << delimiter &&
                         oss << 8.02 &&
                         oss << delimiter &&
                         oss << false &&
                         oss << delimiter &&
                         oss << std::bitset<10> { 0b01010 } &&
                         oss << delimiter &&
                         oss << '\n';

    if ( success ) std::cout << oss.view(); // Hi;8.02;0;0000001010;
    else return EXIT_FAILURE;
}
  2. throwing an exception:
#include <bitset>
#include <sstream>
#include <iostream>
#include <cstdlib>


int main()
{
    constexpr auto delimiter { ';' };

    std::ostringstream oss;
    oss.exceptions( std::ios::failbit | std::ios::badbit );

    try
    {
        oss << "Hi"
            << delimiter
            << 8.02
            << delimiter
            << false
            << delimiter
            << std::bitset<10> { 0b01010 }
            << delimiter
            << '\n';
    }
    catch ( const std::ios_base::failure& e )
    {
        return EXIT_FAILURE;
    }

    std::cout << oss.view(); // Hi;8.02;0;0000001010;
}

Obviously, the compiler generates different code for the two approaches (with the exception-based approach having noticeably less code), though I'm not sure which one can be faster in the above use case.

Which one might be a better solution? I must insert/extract ~20 values to/from the stream objects. It could be a gain in efficiency if the right choice were made.


Solution

  • Since the question "what is faster" is unanswerable in general, I'll try to explain what happens in each case. Still, I'd like to underline strongly that this answer should be taken as an academic consideration, and you should not introduce changes in code based only on what you read here. Profile your code, find the bottlenecks, fix them, and when in doubt about possible fixes, benchmark them (or implement both and profile again). Don't ever optimise based on a hunch that something may be slow.

    The question that you have to answer is "what am I trying to prevent?". If you are trying to prevent the body of a built-in operator<< from running when the stream is in a failed state, that's already done for you, as Pete noted in the comments. That means nothing will try to convert a numeric value to its string representation, no string will be copied into the stream, and no operation on the output or on files will happen. If you overload operator<< for your own types, however, and these overloads do significant work before handing the results to a built-in operator, that work will still happen. Example:

    std::ostream& operator<<(std::ostream& os, const MyType1& t) {
        return os << t.a << ' ' << t.b << ", " << t.c; // almost nothing is done if !os
    }
    
    std::ostream& operator<<(std::ostream& os, const MyType2& t) {
        return os << t.toString(); // `toString` has to be evaluated fully before os is checked for fail
    }
    

    If, however, you want to prevent evaluation of an operand to operator<< (e.g., construction of a temporary object), that's much more problematic. Your first approach does prevent construction of any object like that std::bitset<10> after oss goes into a failed state. This comes at the price of adding multiple jumps to the assembly, and jumps aren't good for performance either. Your second approach is only guaranteed to do the same in C++17 and higher, due to the new order-of-evaluation rules that add sequencing to operator<<. Before C++17, it's unspecified whether the remaining objects are created or not.

    So, to sum up:

    1. Approach 1 does prevent evaluation of operands once the stream failed, so it does prevent extra object constructions or calling toString() in above example. The price is adding more branches in assembly code, which generally isn't the best for performance (compiler may or may not optimise to assume it's unlikely for stream to fail).
    2. Since C++17, approach 2 also prevents evaluation of operands once the stream has failed. With failbit and badbit in the exception mask, the failing operation itself throws (setstate() raises the exception as soon as a masked bit is set), and the C++17 rules sequence each operand after the insertion to its left, so no later operand is evaluated. The price is adding exceptions to the code, which are usually very expensive when they are actually thrown.
    3. Before C++17, approach 2 gives no guarantees about evaluation of arguments. The compiler may choose to evaluate all operands before ever doing anything with oss if it decides that would be faster.

    To show it on an example, let's assume this output failed somehow:

        const bool success = oss << "Hi" &&
                             oss << delimiter && // this one failed
                             oss << std::bitset<10> { 0b01010 } &&
                             oss << delimiter &&
                             oss << std::string(13, 'n') &&
                             oss << '\n';


    Approach 1 will not create the std::bitset or the std::string object. In C++17 and later, approach 2 (the same operands chained with plain <<) throws from the failed insertion, so neither object is created. Before C++17, approach 2 depends on the compiler: either object may or may not have been created by the time the exception is thrown.