
What are the evaluation order guarantees introduced by C++17?


What are the implications of the evaluation order guarantees voted into C++17 (P0145) for typical C++ code?

What does it change about things like the following?

i = 1;
f(i++, i)

and

std::cout << f() << f() << f();

or

f(g(), h(), j());

Solution

  • Some common cases where the evaluation order was previously unspecified are now specified and valid in C++17. Some behaviour that was undefined is now merely unspecified.

    i = 1;
    f(i++, i)
    

    was undefined, but it is now unspecified. Specifically, what is not specified is the order in which the arguments to f are evaluated relative to one another: i++ might be evaluated before i, or vice versa. The same compiler may even evaluate a second, identical call in a different order.

    However, the evaluation of each argument is required to execute completely, with all side-effects, before the execution of any other argument. So you might get f(1, 1) (second argument evaluated first) or f(1, 2) (first argument evaluated first). But you will never get f(2, 2) or anything else of that nature.

    std::cout << f() << f() << f();
    

    was unspecified, but in C++17 the operands of << are evaluated left to right (this also applies to the overloaded stream operator), so the calls to f are made in the order written and the result of the first call is streamed first (examples below).

    f(g(), h(), j());
    

    still has unspecified evaluation order of g, h, and j. Note that for getf()(g(),h(),j()), the rules state that getf() will be evaluated before g, h, j.

    Also note the following example from the proposal text:

    std::string s = "but I have heard it works even if you don't believe in it";
    s.replace(0, 4, "").replace(s.find("even"), 4, "only")
     .replace(s.find(" don't"), 6, "");
    

    The example comes from The C++ Programming Language, 4th edition, by Stroustrup. It used to be unspecified behaviour, but with C++17 it works as expected. There were similar issues with resumable functions (.then( . . . )).

    As another example, consider the following:

    #include <iostream>
    #include <string>
    #include <vector>
    #include <cassert>
    
    struct Speaker{
        int i =0;
        Speaker(std::vector<std::string> words) :words(words) {}
        std::vector<std::string> words;
        std::string operator()(){
            assert(words.size()>0);
            if(i==words.size()) i=0;
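            // Works in every standard: read i, then increment it in a separate statement.
            auto word = words[i] + (i+1==words.size()?"\n":",");
            ++i;
            return word;
            // Still undefined behaviour in C++17, because the operands of + are unsequenced:
            // return words[i++] + (i==words.size()?"\n":",");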
    
        }
    };
    
    int main() {
        auto spk = Speaker{{"All", "Work", "and", "no", "play"}};
        std::cout << spk() << spk() << spk() << spk() << spk() ;
    }
    

    With C++14 and earlier we may (and, with some compilers, do) get results such as

    play
    no,and,Work,All,
    

    instead of

    All,Work,and,no,play
    

    Note that the above is in effect the same as

    (((((std::cout << spk()) << spk()) << spk()) << spk()) << spk()) ;
    

    Even so, before C++17 there was no guarantee that the calls were evaluated left to right, so the result of the first call was not guaranteed to come first in the stream.

    References: From the accepted proposal:

    Postfix expressions are evaluated from left to right. This includes functions calls and member selection expressions.

    Assignment expressions are evaluated from right to left. This includes compound assignments.

    Operands to shift operators are evaluated from left to right. In summary, the following expressions are evaluated in the order a, then b, then c, then d:

    1. a.b
    2. a->b
    3. a->*b
    4. a(b1, b2, b3)
    5. b @= a
    6. a[b]
    7. a << b
    8. a >> b

    Furthermore, we suggest the following additional rule: the order of evaluation of an expression involving an overloaded operator is determined by the order associated with the corresponding built-in operator, not the rules for function calls.

    Edit note: My original answer misinterpreted a(b1, b2, b3). The order of b1, b2, b3 is still unspecified. (thank you @KABoissonneault, all commenters.)

    However (as @Yakk points out), and this is important: even when b1, b2, b3 are non-trivial expressions, each of them is completely evaluated and bound to its function parameter before evaluation of any other argument begins. The standard puts it like this:

    §5.2.2 - Function call 5.2.2.4:

    . . . The postfix-expression is sequenced before each expression in the expression-list and any default argument. Every value computation and side effect associated with the initialization of a parameter, and the initialization itself, is sequenced before every value computation and side effect associated with the initialization of any subsequent parameter.

    However, one of these new sentences is missing from the GitHub draft:

    Every value computation and side effect associated with the initialization of a parameter, and the initialization itself, is sequenced before every value computation and side effect associated with the initialization of any subsequent parameter.

    The example is there. It solves a decades-old problem (as explained by Herb Sutter) with exception safety, where things like

    f(std::unique_ptr<A> a, std::unique_ptr<B> b);
    
    f(get_raw_a(), get_raw_a());
    

    would leak if one of the get_raw_a() calls threw before the other raw pointer had been bound to its smart pointer parameter.

    As pointed out by T.C., the example is flawed, since unique_ptr construction from a raw pointer is explicit, which prevents this from compiling.

    Also note this classical question (tagged C, not C++):

    int x=0;
    x++ + ++x;
    

    is still undefined.