c++ eigen compiler-optimization eigen3

Eigen library: Return expression instead of the vector itself


Is it possible to return some kind of expression, which then can be further used and simplified, instead of a vector itself?

I have a function Vector3<T> f(Vector3<T> const&, Vector3<T> const&); and I apply it in an expression of the form g = f(a1, b1).cwiseProduct(t1) + f(a2, b2).cwiseProduct(t2). I wonder if this can be optimized. It seems to be unnecessary that a vector is created when f returns. It might be more efficient to return an "expression" instead which then can be optimized for the evaluation of g.


Solution

  • Writing up my comment as it might be clearer as an answer. If you return auto rather than Vector3<T> (this needs C++14 return type deduction), then the expression type is retained.

    E.g.

    template <typename T>
    auto f(Vector3<T> const& x, Vector3<T> const& y)
    {
        return x + y;
    }
    
    template <typename T>
    auto use_f(Vector3<T> const& a1, Vector3<T> const& b1, Vector3<T> const& t1,
               Vector3<T> const& a2, Vector3<T> const& b2, Vector3<T> const& t2)
    {
        auto f1 = f(a1, b1);
        auto f2 = f(a2, b2);
        auto g = f1.cwiseProduct(t1) + f2.cwiseProduct(t2);
        return g;
    }
    

    Types of f1 & f2:

    Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double,double>,Eigen::Matrix<double,3,1,0,3,1>const,Eigen::Matrix<double,3,1,0,3,1>const>
    Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double,double>,Eigen::Matrix<double,3,1,0,3,1>const,Eigen::Matrix<double,3,1,0,3,1>const>
    

    (This is the output on Visual C++; the type names might be printed a little differently on gcc/clang.)

    Thanks to @Homer512, it should be noted that this approach is risky: if f creates temporary or local objects and the returned expression refers to them, then returning auto effectively returns a reference to a local variable of f that has already gone out of scope. In the Eigen docs, under "C++11 and the auto keyword", the problematic example given is

    auto z = ((x+y).eval()).transpose();
    // use or return z
    

    The problem is that eval() returns a temporary object, which is then referenced by the Transpose<> expression. However, this temporary is deleted right after the first line.

    The docs further note that the issue can be fixed by calling eval() on the whole expression; however, in our case this is equivalent to returning the Vector3<T>, which is what we were trying to avoid.