performancematlabeigen

Performance Issues with EIGEN vs Matlab with elementwise operations


I am rewriting some Code that is written in Matlab to C++ with the usage of EIGEN (in Windows 10, Visual Studio 2022). In one particular case I am having a huge performance drop with EIGEN which I am not able to fix.

A minimal example is the following in Matlab:

r     = [1,2,3,4,5];
init  = ones(9, 1);
M     = zeros(9, 5);

tic
for i = 1:1E6
    M = repmat(init, [1, length(r)]);
    M = r.*M + init;
end
toc

>> Elapsed time is 0.605376 seconds.

In words: I am first replicating the column vector init to fill the matrix M. Then I am multiplying a row vector r elementwise to M and add init to it.

My C++ Version looks like this:

RowVector<double, 5> r = { 1.0, 2.0, 3.0, 4.0, 5.0 };
Vector<double, 9> init = {};
init.setOnes();
Matrix<double, 9, 5> M = {};

std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
for (int i = 0; i < 1E6; i++) {
    M = init.replicate(1, r.size());

    M.array().rowwise() *= r.array();
    M.array().colwise() += init.array();
}
std::chrono::steady_clock::time_point terminate = std::chrono::steady_clock::now();

std::cout << "Duration:" << std::chrono::duration_cast<std::chrono::milliseconds>(terminate - start).count() << "[ms]" << std::endl;

>> Duration:13130[ms]

Obsiously Matlabs Version is way smarter than my C++ Version. What am I doing wrong? I guess it could have something to do with MKL which is used by Matlab. I installed MKL as well and it doesn't seem to have a performance impact at all in this case.


Solution

  • In your code,

    M = repmat(init, [1, length(r)]);
    M = r.*M + init;
    

    r.*M is identical to r.*init. Adding one init to that is the same as incrementing r by one first. So, your MATLAB code is the same as

    M = (r + 1) .* init;
    

    This version is about twice as fast. repmat is hardly ever useful anymore since MATLAB introduced implicit singleton expansion (a.k.a. broadcasting).

    The right way to time code that is very fast like this is using timeit:

    r = [1, 2, 3, 4, 5];
    init = ones(9, 1);
    timeit(@() func1(r, init))
    timeit(@() func2(r, init))
    
    function M = func1(r, init)
    M = repmat(init, [1, length(r)]);
    M = r .* M + init;
    end
    
    function M = func2(r, init)
    M = (r + 1) .* init;
    end
    

    Anyway, I am fairly sure that replicating this logic in Eigen will also produce much faster code. You will need to use .array().

    On arrays, we can also use the operators *=, /=, * and / to perform coefficient-wise multiplication and division column-wise or row-wise. These operators are not available on matrices because it is not clear what they would do. [From "Broadcasting"]

    But if your Eigen code took 13 seconds to run, then you surely are using a debug build. You should never time a debug build (especially of C++ code), they turn off most optimizations. With Eigen, optimization can easily make a 10x or even a 100x difference.