Is optimization applied to single-line functions?

I don't like to repeat myself in code, but I also don't want to lose performance on simple functions. Suppose a class has operator+ and a function Add with the same functionality (the former as a handy way of using the class in expressions, the latter as an "explicit" way to do the same):

struct Obj {
   Obj operator+(float);
   Obj Add(float);
   /* some other state and behaviour */
};

Obj AddDetails(Obj const& a, float b) {
   return Obj(a.float_val + b, a.some_other_stuff);
}

Obj Obj::operator+(float b) {
   return AddDetails(*this, b);
}

Obj Obj::Add(float b) {
   return AddDetails(*this, b);
}

To make changes easier, both functions are implemented by calling an auxiliary function. As written, any call to the operator therefore makes two calls, which is not really pleasant.

But is the compiler smart enough to eliminate such double calls?

I tested with simple classes (containing built-in types and pointers), and the optimizer simply doesn't compute anything that isn't needed. But how does it behave in large systems (especially with hot calls)?

If this is where RVO takes place, does it also work for longer chains of calls (3-4), folding them into one call?

P.S. Yes, yes, premature optimization is the root of all evil, but I still want an answer.

Solution

Overall

Yes. See the instructions clang generated at https://godbolt.org/z/VB23-W, line 21:

   movsd   xmm0, qword ptr [rsp]   # xmm0 = mem[0],zero
   addsd   xmm0, qword ptr [rip + .LCPI3_0]

It just applies the code of AddDetails directly instead of even calling your operator+. This is called inlining, and it worked even for this chain of value-returning calls.
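To see the difference inlining makes, one option is to compare a normal helper chain against one where the innermost helper is explicitly kept out of line. The following is only a sketch: Value, AddImpl, AddMid, AddTop and the Use... functions are made-up names, and [[gnu::noinline]] is a GCC/Clang extension (other compilers may ignore it with a warning). Paste it into Compiler Explorer with -O2 and compare the two entry points.

```cpp
// Sketch for comparing generated code with -O2; all names are hypothetical.
struct Value {
    double v;
};

// Ordinary helper chain: the optimizer is free to inline all three levels.
inline Value AddImpl(Value const& a, float b) { return Value{a.v + b}; }
inline Value AddMid(Value const& a, float b)  { return AddImpl(a, b); }
inline Value AddTop(Value const& a, float b)  { return AddMid(a, b); }

// Same work, but the innermost helper must not be inlined (GCC/Clang
// extension), so a real call is expected to remain in the output.
[[gnu::noinline]] Value AddImplNoInline(Value const& a, float b) {
    return Value{a.v + b};
}
Value AddTopNoInline(Value const& a, float b) { return AddImplNoInline(a, b); }

// Externally visible entry points, so the compiler has to emit code for
// an arbitrary argument instead of folding everything to a constant.
double UseInlined(double x)    { return AddTop(Value{x}, 1.5f).v; }
double UseNotInlined(double x) { return AddTopNoInline(Value{x}, 1.5f).v; }
```

Both entry points compute the same result; only the generated code should differ. With optimisation enabled, UseInlined would typically contain just the addition, while UseNotInlined keeps a call to AddImplNoInline.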

Details

Not only RVO can be applied to single-line functions, but every other optimisation as well, including inlining; see https://godbolt.org/z/miX3u1 and https://godbolt.org/z/tNaSW.

Here you can see that gcc and clang heavily optimise even code that is not declared inline (https://godbolt.org/z/8Wf3oR):

#include <iostream>

struct Obj {
    Obj(double val) : float_val(val) {}
    Obj operator+(float b) {
        return AddDetails(*this, b);
    }
    Obj Add(float b) {
        return AddDetails(*this, b);
    }
    double val() const {
        return float_val;
    }
private:
    double float_val{0};
    static inline Obj AddDetails(Obj const& a, float b);
};

Obj Obj::AddDetails(Obj const& a, float b) {
    return Obj(a.float_val + b);
}


int main() {
    Obj foo{32};
    Obj bar{foo + 1337};
    std::cout << bar.val() << "\n";
}

Even without inlining, no extra constructor calls can be seen with:

#include <iostream>

struct Obj {
    Obj(double val) : float_val(val) {}
    Obj operator+(float);
    Obj Add(float);
    double val() const {
        return float_val;
    }
private:
    double float_val{0};
    static Obj AddDetails(Obj const& a, float b);
};

Obj Obj::AddDetails(Obj const& a, float b) {
    return Obj(a.float_val + b);
}

Obj Obj::operator+(float b) {
    return AddDetails(*this, b);
}

Obj Obj::Add(float b) {
    return AddDetails(*this, b);
}

int main() {
    Obj foo{32};
    Obj bar{foo + 1337};
    std::cout << bar.val() << "\n";
}
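If you would rather count constructor calls than read assembly, a small instrumented type can do it. This is a sketch under assumptions: Tracked, Counts, g_counts, Level1..Level3 and CountChain are made-up names, and the chain is three levels deep to mirror the "3-4 calls" case from the question. Since C++17 the elision of these copies is guaranteed for returned prvalues; before C++17 it is permitted and gcc and clang perform it by default.

```cpp
// Counters for each kind of constructor call (hypothetical helper type).
struct Counts {
    int value_ctor = 0;
    int copy_ctor = 0;
    int move_ctor = 0;
};
Counts g_counts;

// A type that records how it was constructed.
struct Tracked {
    double v;
    explicit Tracked(double x) : v(x) { ++g_counts.value_ctor; }
    Tracked(Tracked const& o) : v(o.v) { ++g_counts.copy_ctor; }
    Tracked(Tracked&& o) noexcept : v(o.v) { ++g_counts.move_ctor; }
};

// Three levels of forwarding, each returning by value.
Tracked Level3(Tracked const& a, float b) { return Tracked(a.v + b); }
Tracked Level2(Tracked const& a, float b) { return Level3(a, b); }
Tracked Level1(Tracked const& a, float b) { return Level2(a, b); }

// Runs the chain once and reports which constructors were called.
Counts CountChain() {
    g_counts = Counts{};
    Tracked start(32.0);                      // first value construction
    Tracked result = Level1(start, 1337.0f);  // second one, inside Level3
    (void)result;
    return g_counts;
}
```

Calling CountChain() from main and printing the three counters should show only the two value constructions and no copies or moves, no matter how many forwarding levels sit in between.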

However, some of the optimisation happens because the compiler knows the value won't change, so let's change main to:

int main() {
    double d{};
    std::cin >> d;
    Obj foo{d};
    Obj bar{foo + 1337};
    std::cout << bar.val() << "\n";
}

But you can still see the optimisations with both compilers: https://godbolt.org/z/M2jaSH and https://godbolt.org/z/OyQfJI
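Another way to keep the compiler from folding everything into a constant, without going through std::cin, is to put the computation into a function with external linkage that takes the input as a parameter. The sketch below assumes a cut-down class Num and two entry points AddViaOperator and AddViaMethod; these names are mine, not from the question.

```cpp
// Cut-down stand-in for the class from the question (hypothetical name).
struct Num {
    explicit Num(double val) : value(val) {}
    Num operator+(float b) const { return AddDetails(*this, b); }
    Num Add(float b) const { return AddDetails(*this, b); }
    double val() const { return value; }
private:
    double value;
    // Shared implementation used by both operator+ and Add.
    static Num AddDetails(Num const& a, float b) { return Num(a.value + b); }
};

// External linkage and a runtime parameter: the compiler cannot know x,
// so it has to emit code that handles any input.
double AddViaOperator(double x) { return (Num(x) + 1337.0f).val(); }
double AddViaMethod(double x)   { return Num(x).Add(1337.0f).val(); }
```

With -O2 I would expect both functions to compile to essentially the same short sequence around a single addition, but check the output of your own compiler rather than taking that for granted.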