I have three matrices A,B and C:
And the matrix-matrix product for general matrices:
void cblas_dgemm(const enum CBLAS_ORDER __Order, const enum CBLAS_TRANSPOSE __TransA, const enum CBLAS_TRANSPOSE __TransB, const int __M, const int __N, const int __K, const double __alpha, const double *__A, const int __lda, const double *__B, const int __ldb, const double __beta, double *__C, const int __ldc);
To use cblas_dgemm I need to know the leading dimension. It is clear to me that for the full matrix A (or its transpose) we have M=5, N=4, lda=4.
For submatrix C, I think I have to pass &A[5] and set M=3, N=2, ldc=4. But I have no idea how this could work for the red submatrix B with M=4, N=2. Can someone explain this to me? Thanks a lot.
This article pretty much nails it: https://petewarden.com/2015/10/25/an-engineers-guide-to-gemm/
BLAS routines look complex because they are designed to be both very flexible and extremely fast. Both goals are served when a routine can operate directly on a submatrix of a larger matrix, and oftentimes the routines can do more than you need. The xGEMM family is an outstanding example: you can compute A * B, but also A * B + c*C, and so on.
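To make the role of the leading dimensions in that operation concrete, here is a minimal sketch of what a GEMM computes, written as a naive triple loop for column-major storage. The function name `naive_dgemm_colmajor` is hypothetical and not part of any BLAS library, and the loop does not reflect how an optimised BLAS is implemented; it only shows where `lda`, `ldb` and `ldc` enter the index arithmetic.

```c
#include <stddef.h>

/* Hypothetical reference routine, not part of any BLAS library.
 * Computes C := alpha * A * B + beta * C for column-major storage,
 * where A is M x K, B is K x N and C is M x N. lda, ldb and ldc are the
 * distances (in elements) between consecutive columns in memory. */
void naive_dgemm_colmajor(int M, int N, int K,
                          double alpha, const double *A, int lda,
                          const double *B, int ldb,
                          double beta, double *C, int ldc)
{
    for (int j = 0; j < N; ++j) {
        for (int i = 0; i < M; ++i) {
            double acc = 0.0;
            for (int k = 0; k < K; ++k) {
                /* Element (i, k) of A and element (k, j) of B. */
                acc += A[i + (size_t)k * lda] * B[k + (size_t)j * ldb];
            }
            C[i + (size_t)j * ldc] = alpha * acc + beta * C[i + (size_t)j * ldc];
        }
    }
}
```

Because every access goes through a base pointer and a leading dimension, the same routine works unchanged when A, B or C point into the middle of a larger array.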
In your cases above (column-major):
A: M=5, N=4, LDA = 5
B: M=4, N=2, LDB = 10
C: M=3, N=2, LDC = 5
And you are correct in that your first entry in C is &C[6]
In other words, the leading dimension is usually the column length of the enclosing matrix if you are column-major, and the row length of the enclosing matrix if you are row-major.
In case B, it's a little trickier, as you have to jump over two columns of 5 each, i.e. 10, when going from one column of the submatrix to the next.
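The effect of the leading dimension can be sketched with a small indexing helper. The names `view_at_colmajor` and `fill_parent_5x4` are hypothetical, and the 5 x 4 parent filled with the value 10*row + col is an assumption made purely for illustration; the exact position of the B submatrix in the original picture is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical helper: read element (i, j) of a column-major view that
 * starts at `first` and whose consecutive columns are `ld` elements apart. */
double view_at_colmajor(const double *first, int ld, int i, int j)
{
    return first[i + (size_t)j * ld];
}

/* Fill a 5 x 4 column-major parent so that entry (r, c) holds 10*r + c.
 * P must have room for 20 doubles. */
void fill_parent_5x4(double *P)
{
    for (int c = 0; c < 4; ++c)
        for (int r = 0; r < 5; ++r)
            P[r + 5 * c] = 10.0 * r + c;
}
```

With `ld = 5` the view steps through adjacent parent columns, so `view_at_colmajor(P, 5, 3, 1)` reads parent entry (3, 1). With `ld = 10` each step skips a parent column, so `view_at_colmajor(P, 10, 3, 1)` reads parent entry (3, 2) instead.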
All BLAS wants to know is (column-major / row-major):
pointers: (&A[0], &B[0], &C[6] / &A[0], &B[0], &C[5])
M: (5, 4, 3 / 4, 2, 2)
N: (4, 2, 2 / 5, 4, 3)
ldx: (5, 10, 5)
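The two different start offsets for C follow from the storage order. A small sketch, assuming the 5 x 4 parent from the question (leading dimension 5 in column-major, 4 in row-major) and assuming C starts at zero-based row 1, column 1, which is what the offsets 6 and 5 imply; the helper names are hypothetical:

```c
#include <stddef.h>

/* Hypothetical helpers: offset of the zero-based element (row, col) inside
 * a parent array with leading dimension ld. */
size_t offset_colmajor(int row, int col, int ld)
{
    return (size_t)row + (size_t)col * ld;   /* walk down a column first */
}

size_t offset_rowmajor(int row, int col, int ld)
{
    return (size_t)row * ld + (size_t)col;   /* walk along a row first */
}
```

Under those assumptions `offset_colmajor(1, 1, 5)` gives 6 and `offset_rowmajor(1, 1, 4)` gives 5, so both pointers refer to the same matrix element.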