
When is it recommended NOT to use Eigen::Ref for parameters?


I'm currently writing a lot of functions that accept blocks and expressions as input. I generally find it much easier to work with Refs, as they are simple, lightweight, and it's also easy to ensure that the incoming expression fulfills a certain shape (for example, a vector).

At the same time, I imagine that there must be some kind of disadvantage, as otherwise the approach of passing arguments using the template MatrixBase<Derived> would not exist. Still, I could not find any post discussing this topic.

So I ask here: what are the practical disadvantages of using Refs instead of templated function parameters?


Solution

  • There are two types of Ref commonly in use, the mutable Ref<Matrix<…>> for output or in-out parameters and the immutable Ref<const Matrix<…>> for input parameters. Those actually behave a bit differently.

    Inner stride

    Both guarantee that the input matrix or vector has an inner stride of 1 at compile time, meaning that elements, at least in one column, are adjacent to one another. That allows vectorization.

    The way this is achieved, however, is different. The mutable version simply fails to compile if the referenced block has a different stride. The immutable version will instead possibly create a temporary copy. This can have performance and correctness implications.

    Consider this:

    
    double sum(const Eigen::Ref<const Eigen::VectorXd>& in)
    { return in.sum(); }
    
    double foo()
    {
        Eigen::VectorXcd x = …;
        return sum(x.real());
    }
    

    x.real() creates a view into the complex-valued matrix. Since complex values are stored in an interleaved format of real and imaginary components, this view has an inner stride of 2. Therefore the constructor of the Ref object will allocate a VectorXd internally and copy the values into it.

    The same happens here:

    double foo()
    {
        Eigen::MatrixXd x = …;
        return sum(x.row(0));
    }
    

    but it would not happen for x.col(0) since Eigen uses a column-major format by default.

    If you instead wrote the same function using a MatrixBase<Derived>, the sum itself would be specialized for fixed or variable inner stride of the x.real() or x.row() expressions. This would prevent vectorization (well, maybe partial vectorization for x.real().sum() could still be achieved), but also avoid the copy.

    For most use cases, losing vectorization is probably preferable to creating a copy, but this would need testing.

    Arbitrary expressions

    A temporary copy will also be created when the input is an arbitrary expression. For example here:

    double baz()
    {
        Eigen::VectorXd x = …;
        return sum(x * 2.);
    }
    

    Here a temporary vector holding the value of x * 2. has to be created, since the transformation itself cannot be passed to the sum function. If sum had accepted a MatrixBase<Derived>, the code would instead be specialized for the expression object and would run as fast and efficiently as (x * 2.).sum().

    (Partially) fixed dimensions

    In general, when you use Ref, neither Eigen nor the compiler can use information on the specific type or expression used as a parameter. Another example would be passing a Vector4d to the sum function. Here the information is lost that there are always 4 entries, meaning the summation loop could be unrolled.

    In some places, Eigen uses specialized code paths if it is known at compile time that a type has a fixed size in at least one dimension, for example computing the inverse of a small matrix or doing a matrix multiplication with a small matrix or vector on one side.

    Outer stride

    When you use a plain matrix or the column-block of a plain matrix (e.g. matrix.middleCols(start, n)) in an expression, Eigen can normally use the information that there is no gap from one column to the next. For simple scalar operations, instead of having a double loop for(col = 0; col < cols; ++col) for(row = 0; row < rows; ++row), the code will be optimized into a single loop for(i = 0; i < rows * cols; ++i). This can improve vectorization, especially for matrices with few rows and many columns.

    This optimization is not possible with Ref since the outer stride may be higher than the row count.

    Alignment

    A minor performance issue is that Ref does not guarantee alignment of the start address. Eigen will assume that the content is misaligned. This is mostly an issue if you compile without AVX extensions because SSE can only fold memory loads and stores into arithmetic operations if they are guaranteed to be aligned. Very old SSE-only hardware also used to be very slow for unaligned memory load/store instructions even when they happened to be aligned at runtime.

    See Alignment and SSE strange behaviour for details.

    Dangling pointers

    As far as correctness goes, these temporary copies can cause issues if you decide to keep a Ref object (or a pointer to the content of a Ref) around for longer than the function call itself. My code used to contain a function like this:

    
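    // Map takes the alignment option as its second template argument; the stride type comes third.
    using ConstMapType = Eigen::Map<const Eigen::MatrixXd, Eigen::Unaligned, Eigen::OuterStride<>>;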
    
    // Never do this!
    ConstMapType block_to_map(const Eigen::Ref<const Eigen::MatrixXd>& block)
    {
        return ConstMapType(block.data(), block.rows(), block.cols(),
                            Eigen::OuterStride<>(block.outerStride()));
    }
    

    If you forget how it works and accidentally call this with some expression that causes a temporary copy, the data() pointer will be dangling at the end of the function call.

    Summary

    Unfortunately, you have to assess the performance impact on a case-by-case basis. Will you mostly pass plain matrices/vectors or sub-blocks of them? Are they dynamically sized, or can the fixed size be encoded in the Ref, e.g. Ref<const Matrix4Xd>? If so, the overhead will probably be negligible.