I am working in MATLAB with the Parallel Computing Toolbox.
Task
I have a vector v and 4 cores. I want to split the vector across the cores (so each core handles 1/4 of the vector, assuming length(v) is divisible by 4) and apply a function f() to each part.
So for core 1: f1 = f(v that belongs to part 1)
and for core 2: f2 = f(v that belongs to part 2)
and so on.
Then I want to gather the results so that afterwards I have f = "one vector containing all elements of f1, then all elements of f2, etc." on the main core (the root, if you wish; MATLAB may call this the "client", but I am not sure).
Attempt
spmd
v_dist = codistributed( v ); %split v onto cores
lpv = getLocalPart( v_dist ); %this core's part ("my part")
f1 = f( lpv ); %apply f to my part of v
%I want to piece back together the outputs?
f_tmp = codistributed( zeros(length(f1) * 4, 1) );
%get my part of the container where I want to put the output
f_tmp_lp = getLocalPart( f_tmp );
%now actually put my part of the output here:
f_tmp_lp = f1;
%and then finally piece back together my part into
f_tmp = codistributed.build( f_tmp_lp, getCodistributor( f_tmp ) );
end
%we should gather the output on the client?
f = gather( f_tmp );
And?
This does not work as expected. I do get the right size of f, but what seems to happen is that "lpv" is the same piece on every core. I am not sure that this is the actual issue, though.
Help?
I have not done much parallel programming in MATLAB. How would I accomplish my task?
I think your code is pretty close, but I don't think you need f_tmp. Here's an example:
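v = 1:10;
spmd
    v_dist = codistributed(v);   % split v across the workers
    lpv = getLocalPart(v_dist);  % this worker's part of v
    f1 = sqrt(lpv);              % apply f (here sqrt) to the local part
    % reassemble the results using the same distribution scheme as v_dist
    v2 = codistributed.build(f1, getCodistributor(v_dist));
end
% gather collects the full result on the client
assert(isequal(gather(v2), sqrt(v)));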
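The example above reuses the codistributor of v_dist, so it relies on f returning an output of the same size as the slice it was given. As a possible alternative when that does not hold, here is a minimal sketch that skips codistributed.build and concatenates the per-worker results on the client instead. It assumes a parallel pool is already open, that v is a row vector (so the slices are row vectors and horizontal concatenation applies), and that sqrt again stands in for your f.

```matlab
v = 1:10;
spmd
    v_dist = codistributed(v);    % split v across the workers
    lpv = getLocalPart(v_dist);   % this worker's slice of v
    f1 = sqrt(lpv);               % sqrt is a stand-in for f (assumption)
end
% Outside spmd, f1 is a Composite: f1{k} holds the value from worker k.
f = [f1{:}];                      % concatenate the slices in worker order
assert(isequal(f, sqrt(v)));
```

For a column vector you would presumably want vertcat(f1{:}) instead of the horizontal concatenation.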