I am performing a Cholesky factorization with Intel-MKL, which uses ScaLAPACK. I distributed the matrix based on this example, where the matrix is distributed in blocks of equal size (i.e. Nb x Mb). I tried to give every block its own size, depending on which process it belongs to, so that I can experiment more and maybe get better performance.
See this question for a better understanding of what I mean. I won't post my code, since it's too big (yes, even the minimal example is too big, I checked), and the distribution itself seems to work well. However, ScaLAPACK seems to assume that the matrix is distributed in blocks of equal size?
For example, I am using this:
int nrows = numroc_(&N, &Nb, &myrow, &iZERO, &procrows);
int ncols = numroc_(&M, &Mb, &mycol, &iZERO, &proccols);
where (taken from the manual):
NB (global input) INTEGER Block size, size of the blocks the distributed matrix is split into.
So, does ScaLAPACK allow distributed matrices with non-equal block sizes?
If I print information like this, for an 8x8 matrix:
std::cout << Nb << " " << Mb << " " << nrows << " " << ncols << " " << myid << std::endl;
I am getting this:
3 3 5 5 0
1 1 4 4 1
1 1 4 4 2
1 1 4 4 3
and, by just swapping the block sizes of the first two processes, this:
1 1 4 4 0
3 3 5 3 1
1 1 4 4 2
1 1 4 4 3
which doesn't make sense for an 8x8 matrix.
As answered here, the answer is no: you cannot have blocks of different sizes.
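To see why the numbers above stop adding up, here is a minimal sketch that mirrors the arithmetic of the NUMROC tool routine in plain C++. The names `numroc_sketch` and `total_local` are hypothetical helpers introduced only for illustration, and the 2x2 process grid is an assumption inferred from the four processes in the output. Each process computes its local extent on the premise that every other process uses the same block size, so mixing block sizes makes the local counts along one grid dimension no longer sum to the global size.

```cpp
#include <cstddef>
#include <vector>

// Sketch of the arithmetic performed by ScaLAPACK's NUMROC (not the library call itself).
// n: global extent, nb: block size, iproc: this process's grid coordinate,
// isrcproc: coordinate that owns the first block, nprocs: processes along this dimension.
int numroc_sketch(int n, int nb, int iproc, int isrcproc, int nprocs) {
    int mydist = (nprocs + iproc - isrcproc) % nprocs;  // distance from the source process
    int nblocks = n / nb;                               // number of whole blocks
    int count = (nblocks / nprocs) * nb;                // blocks every process is sure to get
    int extrablks = nblocks % nprocs;                   // leftover whole blocks
    if (mydist < extrablks) {
        count += nb;                                    // this process gets one extra whole block
    } else if (mydist == extrablks) {
        count += n % nb;                                // this process gets the trailing partial block
    }
    return count;
}

// Hypothetical helper: sum the local extents along one grid dimension when
// process i is (incorrectly or correctly) given block size nb_per_proc[i].
int total_local(int n, const std::vector<int>& nb_per_proc, int nprocs) {
    int total = 0;
    for (int p = 0; p < nprocs; ++p) {
        total += numroc_sketch(n, nb_per_proc[static_cast<std::size_t>(p)], p, 0, nprocs);
    }
    return total;
}
```

With N = 8 and two process rows, a uniform block size of 3 gives local row counts 5 and 3, which sum to 8. Giving the first process row a block size of 3 and the second a block size of 1 gives 5 and 4, which sum to 9, so the local pieces can no longer describe an 8x8 matrix.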