Is it possible to return some kind of expression, which then can be further used and simplified, instead of a vector itself?
I have a function Vector3<T> f(Vector3<T> const&, Vector3<T> const&); and I apply it in an expression of the form g = f(a1, b1).cwiseProduct(t1) + f(a2, b2).cwiseProduct(t2). I wonder if this can be optimized. It seems unnecessary that a vector is created when f returns. It might be more efficient to return an "expression" instead, which can then be optimized for the evaluation of g.
Writing up my comment as it might be clearer as an answer. If you return auto rather than Vector3<T>, then the expression type is retained.
For example:
// Returns the lazy expression for x + y rather than an evaluated vector.
template <typename T>
auto f(Vector3<T> const& x, Vector3<T> const& y)
{
    return x + y;
}

template <typename T>
auto use_f(Vector3<T> const& a1, Vector3<T> const& b1, Vector3<T> const& t1,
           Vector3<T> const& a2, Vector3<T> const& b2, Vector3<T> const& t2)
{
    auto f1 = f(a1, b1);
    auto f2 = f(a2, b2);
    // g is still an expression; it refers to the six arguments, so they must outlive it.
    auto g = f1.cwiseProduct(t1) + f2.cwiseProduct(t2);
    return g;
}
Types of f1 and f2:
Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double,double>,Eigen::Matrix<double,3,1,0,3,1>const,Eigen::Matrix<double,3,1,0,3,1>const>
Eigen::CwiseBinaryOp<Eigen::internal::scalar_sum_op<double,double>,Eigen::Matrix<double,3,1,0,3,1>const,Eigen::Matrix<double,3,1,0,3,1>const>
(This is the output on Visual C++; it might be a little different on gcc/clang.)
As @Homer512 pointed out, this approach is risky: if temporary objects are created in f, the expression returned via auto will effectively hold a reference to a local variable in f that has already gone out of scope. In the docs, under "C++11 and the auto keyword", the problematic example given is
auto z = ((x+y).eval()).transpose();
// use or return z
The problem is that eval() returns a temporary object which is then referenced by the Transpose<> expression. However, this temporary is deleted right after the first line.
The docs further note that the issue can be fixed by calling eval() on the whole expression; however, this is equivalent in our case to returning the Vector3<T>, which is what we were trying to avoid.