c++matrixlapacklapacke

LAPACK zgemm op(A) dimension


In this link from netlib it specifies M as:

On entry, M specifies the number of rows of the matrix op( A ) and of the matrix C. M must be at least zero. Unchanged on exit.

So if I want to use a 3x10 matrix as A but I want to use it's conjugate for zgemm (TRANSA = 'C') what should I enter as M? 3 or 10?

Also when I used other LAPACK routines I entered 2D matrices as 1D like A[3*3] instead of A[3][3] and upon calling the routine I just used A for the matrix, Can I do the same with non-square matrices? A[3*10] instead of A[3][10]?

I code in C++.


Solution

  • A/ Naming convention/clarification

    Before giving an answer and for a better clarity It is important to have this fact in mind:

    whereas

    Comments:

    To avoid such ambiguity I generally use:

    B/ Answers

    B.1/ gemm

    void cblas_zgemm(CBLAS_LAYOUT layout,
                     CBLAS_TRANSPOSE opA,
                     CBLAS_TRANSPOSE opB,
                     const int M, <-------------- I_Size of op(A) 
                     const int N, <-------------- J_Size of op(B)
                     const int K, <-------------- J_Size of op(A)
                     const void* alpha,
                     const void* A,
                     const int lda,
                     const void* B,
                     const int ldb,
                     const void* beta,
                     void* C,
                     const int ldc);
    

    In verbs if TRANSA = 'T' you must take the dimensions of the transposed A matrix.

    The implementation to call cblas_zgemm may look like:

    const Size_t opA_I_size = (opA == CblasNoTrans) ? A.I_size() : A.J_size();
    const Size_t opA_J_size = (opA == CblasNoTrans) ? A.J_size() : A.I_size();
    
    const Size_t opB_I_size = (opB == CblasNoTrans) ? B.I_size() : B.J_size();
    const Size_t opB_J_size = (opB == CblasNoTrans) ? B.J_size() : B.I_size();
    
    cblas_zgemm(CblasColMajor,
                opA,
                opB,
                opA_I_size,
                opB_J_size,
                opA_J_size,
                alpha,
                A.data(),
                A.ld(),
                B.data(),
                B.ld(),
                beta,
                C.data(),
                C.ld());
    

    B.2/ Memory layout

    You must also take memory layout into account. There are two possibilities :

    (where ld is the leading dimension)

    Updates:

    B.3/ Matrix dimensions

    If your matrices will always be as small as 3x10 blas/lapack is not a good choice (for perfomance). Considere using a library like Eigen or Blaz.