How can I use MKL.h library to do matrix-vector multiplication in C++, does anybody have any good source for begineers to learn about it?

I have "oneMKL" integrated with Visual Studio 2022 and I would like to know how it performs for Matrix-vector multiplications, and compare it with a sequential implementation:

E.g. If I have this sequential matrix-vector multiplication in C++, how can I do the same for the same matrix-vector with MKL #include "mkl.h"?

#include <math.h>
#include <iostream>
using namespace std;
 

#define MAX 3
 
float matA[3][3]={{1.1,2.2,3.3},{4.4,5.5,6.6},{7.7,8.8,9.9}};
float matB[3]={1,2,3};
float matC[3];


float sequencial_multiplication(float matA[MAX][MAX],float matB[MAX]){
    for (int i = 0; i < MAX; i++)
      for (int j = 0; j < MAX; j++)
        matC[i] += matA[i][j] * matB[j];

    for(int i=0;i<MAX;i++)
        cout<<matC[i]<<endl;

    return 0;
}


int main()
{
    sequencial_multiplication(matA,matB);  
}

Solution

First, let me lower your expectations. MKL is not a library to be used casually, it needs a lot of boilerplate in addition to what is required by using BLAS (the historic linear algebra interface). Also, it is not the best option to operate small matrices, like 3x3, or 4x4, for that it might be better to write them manually if nothing else to avoid the non-inlined BLAS (or MKL) function. I hope your real application is to multiply big matrices.

If it looks complicated and convoluted is because it is. There is a lot of history there, starting from the fact that the original BLAS was written in Fortran and still has to work with Fortran to this day. There is no real BLAS or MKL for C++, the interface is defined for C.

Using C++, you can take advantage of the many BLAS wrappers written for BLAS, such as Boost.UBLAS or Multi. (See below for more).

But here I am giving you the recipe, without getting too much into the details to answer your question precisely.

This is the full program, I left your code for reference:

// my_gemv.cpp
#include <iostream>

extern "C" {
void  sgemv_(const char& trans, int32_t  const& nr, int32_t  const& nc, float const& a, float const* A, int32_t  const& lda, float const* X, int32_t  const& incx, float const& beta, float* Y, int32_t const& incy);
}

#define MAX 3

void matrix_vector_multiplication_with_addition(float matA[MAX][MAX], float vecB[MAX], float vecC[MAX]) {
    for(int i = 0; i != MAX; i++)
        for(int j = 0; j != MAX; j++)
            vecC[i] += matA[i][j] * vecB[j];
}

void blas_matrix_vector_multiplication_with_addition(float matA[MAX][MAX], float vecB[MAX], float vecC[MAX]) {
    sgemv_('T', MAX, MAX, 1.0, &matA[0][0], MAX, &vecB[0], 1, 1.0, &vecC[0], 1);
}

int main() {
    float matA[3][3] = {
        {1.1, 2.2, 3.3},
        {4.4, 5.5, 6.6},
        {7.7, 8.8, 9.9},
    };
    float vecB[3] = {1.0, 2.0, 3.0};
    float vecC[3] = {0.0, 0.0, 0.0};

    matrix_vector_multiplication_with_addition(matA, vecB, vecC);

    std::cout <<"vecC[] = {" << vecC[0] <<", "<< vecC[1] <<", "<< vecC[2] <<"}"<<std::endl;

    float mkl_vecC[3] = {0.0, 0.0, 0.0};
    blas_matrix_vector_multiplication_with_addition(matA, vecB, mkl_vecC);

    std::cout <<"vecC[] = {" << mkl_vecC[0] <<", "<< mkl_vecC[1] <<", "<< mkl_vecC[2] <<"}"<<std::endl;
}

You might ask, where is MKL here? It is not really there, I am using the fact that MKL uses a standard ABI interface for the linear algebra operators. Why the extern "C"? Because you are using an interface defined for another language.

If you try to produce an executable it will not be able to "link" because someone needs to provide the sgemv_ "symbol". Why the _? Because, well, Fortran. I am digressing.

The next steps may be different in Windows but maybe you can translate:

c++ my_gemv.cpp -L/opt/intel/oneapi/mkl/2023.0.0/lib/intel64 -lmkl_rt -o my_gemv

and you need to tell the executable where to find the library (again!)

export LD_LIBRARY_PATH=/opt/intel/oneapi/mkl/2023.0.0/lib/intel64
./my_gemv

or perhaps in Windows, you are better off linking a static version of MKL. I don't know.

If everything goes well, the program will print this (manual loop and MKL will give the same answer):

vecC[] = {15.4, 35.2, 55}
vecC[] = {15.4, 35.2, 55}

What's next? The answer I gave is really just the start, it will also free you from really needing MKL, and you can use other BLAS implementations.

You can use the header files from mkl, or from cblas, etc. This can give you more convenience but it can lock you up on using MKL or extensions that are not open source, etc.

Don't like all the command line stuff? You will need a build system, for example, CMake https://cmake.org/cmake/help/latest/module/FindBLAS.html

Don't like the C syntax? Use a C++ BLAS wrapper, they usually can be a link to any implementation of BLAS, including MKL.

As I said above you can use a C++ wrapper to use BLAS (or BLAS in MKL) sanely, this is the example using my library Multi. As you can see the library offers several modes, one is to "refer" to your c-arrays to use the library and another is to use the arrays from the library.

Compile it with:

c++ -Ipath_to_Multi my_gemv_multi.cpp -L/opt/intel/oneapi/mkl/2023.0.0/lib/intel64 -lmkl_rt

// my_gemv_multi.cpp
#include <multi/adaptors/blas/gemv.hpp>
#include <multi/array.hpp>

int main() {
    float matA[3][3] = {
        {1.1, 2.2, 3.3},
        {4.4, 5.5, 6.6},
        {7.7, 8.8, 9.9},
    };
    float vecB[3] = {1.0, 2.0, 3.0};
    float vecC[3] = {0.0, 0.0, 0.0};

    namespace multi = boost::multi;

    {  // make references to c-arrays
        multi::array_ref<float, 2> A{matA};
        multi::array_ref<float, 1> B{vecB};
        multi::array_ref<float, 1> C{vecC};

        multi::blas::gemv(1.0, A, B, 0.0, C);  // C is output
    }
    {  // make references to c-arrays
        auto const& A = multi::ref(matA);  // deduce element type and dimensionality
        auto const& B = multi::ref(vecB);
        auto&&   Cref = multi::ref(vecC);

        multi::blas::gemv(1.0, A, B, 0.0, Cref);  // vecC holds the result
    }
    {  // one-liner
        multi::blas::gemv(1.0, multi::ref(matA), multi::ref(vecB), 0.0, multi::ref(vecC));  // vecC holds the result
    }
    {  //one-liner with output assignment syntax
        multi::ref(vecC) = multi::blas::gemv(1.0, multi::ref(matA), multi::ref(vecB));
    }
    {  // using the library, not references to c-arrays
        multi::array<float, 2> A = {
            {1.1, 2.2, 3.3},
            {4.4, 5.5, 6.6},
            {7.7, 8.8, 9.9},
        };
        multi::array<float, 1> B = {1.0, 2.0, 3.0};
 
        multi::array<float, 1> C = multi::blas::gemv(1.0, A, B);  // create (allocate) the result in C
    }
    {
        multi::array<float, 2> A = {
            {1.1, 2.2, 3.3},
            {4.4, 5.5, 6.6},
            {7.7, 8.8, 9.9},
        };
        multi::array<float, 1> B = {1.0, 2.0, 3.0};

        auto C =+ multi::blas::gemv(1.0, A, B);  // create (allocate) the result in C
    }
}