I have a 40000-by-80000 matrix from which I'm obtaining the number of "clusters" (groups of adjacent elements that share the same value) and then calculating the size of each of those clusters. Here is the relevant chunk of code:
FRAGMENTSIZESCLASS = cell(NumberOfClasses,1); %We store the per-class results in a cell array
for class=1:NumberOfClasses
%-First we create a binary image for each class-%
BWclass = foto==class;
%-Second we calculate the number of connected components (fragments)-%
L = bwlabeln(BWclass); %returns a label matrix, L, containing labels for the connected components in BWclass
clear BWclass
NumberFragments = max(L(:));
%-Third we calculate the size of each fragment-%
FragmentSize=zeros(NumberFragments,1);
for f=1:NumberFragments % potential improvement: using parfor while sharing the memory between workers
FragmentSize(f,1) = sum(L(:) == f);
end
FRAGMENTSIZESCLASS{class}=FragmentSize;
clear L
end
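(For reference, the inner counting loop above can also be collapsed into a single pass over L with accumarray, which avoids looping over fragments at all; this is just a sketch reusing the variable names from the code above:

% Count the occurrences of each positive label in one pass;
% nonzeros(L) drops the background (label 0) before counting.
FragmentSize = accumarray(nonzeros(L), 1);

The question about parfor below still stands for the labelling step itself.)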
The problem is that the matrix L is so large that, if I use a parfor loop, it becomes a broadcast variable; the memory use is then multiplied by the number of workers and I run out of memory.
Any ideas on how to sort this out? I've seen this file: https://ch.mathworks.com/matlabcentral/fileexchange/28572-sharedmatrix but it is not a straightforward solution, and even though I have 24 cores it would still take a lot of time.
Cheers!
Here is a picture showing the time it takes as a function of image size when using the code I posted in the question vs. using bwconncomp as suggested by @bla:
Instead of bwlabeln, use the built-in function bwconncomp, for example:
...
s = bwconncomp(BWclass);
FragmentSize = cellfun(@numel, s.PixelIdxList); % size of each connected component
...
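Putting @bla's suggestion together with the class loop from the question, a minimal sketch (reusing the question's variable names; PixelIdxList holds, for each component, the linear indices of its pixels) could look like:

FRAGMENTSIZESCLASS = cell(NumberOfClasses,1);
for class = 1:NumberOfClasses
    BWclass = foto == class;  % binary image for this class
    s = bwconncomp(BWclass);  % connected components without building a full label matrix
    % numel of each PixelIdxList cell is that fragment's size in pixels
    FRAGMENTSIZESCLASS{class} = cellfun(@numel, s.PixelIdxList);
end

Because bwconncomp never materializes the label matrix L or scans it once per fragment, it avoids both the memory blow-up and the O(fragments x pixels) counting cost.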