I've used spmd to run two pieces of code simultaneously. The computer I'm using has a processor with 8 cores, so I assumed the communication overhead would be close to zero.
I compared the running time of this spmd block with the same code outside of spmd, using tic and toc.
When I run the code, the parallel version takes more time than the sequential version.
Any idea why that is?
Here is sample code showing what I mean:
tic;
spmd
    if labindex == 1
        gamma = (alpha*beta);
    end
    if labindex == 2
        for t = 1:T
            for i1 = 1:n
                for j1 = 1:n
                    kesi(i1,j1,t) = (alpha(i1,t) + phi(j1,t));
                end
            end
        end
    end
end
t_spmd = toc;
tic;
gamma2 = (alpha * beta);
for t = 1:T
    for i1 = 1:n
        for j1 = 1:n
            kesi2(i1,j1,t) = (alpha(i1,t) + phi(j1,t));
        end
    end
end
t_seq = toc;
disp('t spmd : '); disp(t_spmd);
disp('t seq : '); disp(t_seq);
There are two reasons here. Firstly, your use of if labindex == 2 means that the main body of the spmd block is being executed by only a single worker - there's no parallelism here.
Secondly, it's important to remember that (by default) parallel pool workers run in single computational thread mode. Therefore, when using local workers, you can only expect speedup when the body of your parallel construct cannot be implicitly multi-threaded by MATLAB.
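If you do want spmd to share the loop, every worker has to take part of the iteration range rather than one worker taking all of it. The following is only a minimal sketch, assuming an open parallel pool with no more than T workers; the names my_t and my_kesi are made up for illustration and the sizes are arbitrary:

```matlab
T = 10;
n = 7;
alpha = rand(n, T);
phi = rand(n, T);

spmd
    % Each worker takes every numlabs-th slice, starting at its own index.
    my_t = labindex:numlabs:T;
    my_kesi = zeros(n, n, numel(my_t));
    for k = 1:numel(my_t)
        for i1 = 1:n
            for j1 = 1:n
                my_kesi(i1, j1, k) = alpha(i1, my_t(k)) + phi(j1, my_t(k));
            end
        end
    end
end

% Outside spmd, my_t and my_kesi are Composite objects (one entry per worker),
% so the pieces are reassembled on the client.
kesi = zeros(n, n, T);
for w = 1:numel(my_kesi)
    kesi(:, :, my_t{w}) = my_kesi{w};
end
```

For arrays this small, the cost of sending alpha and phi to the workers and bringing the results back will probably still outweigh any gain, so this is unlikely to beat the sequential loop.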
Finally, in this particular case, you're much better off using bsxfun (or implicit expansion in R2016b or later), like so:
T = 10;
n = 7;
alpha = rand(n, T);
phi = rand(n, T);
alpha_r = reshape(alpha, n, 1, T);
phi_r = reshape(phi, 1, n, T);
% In R2016b or later:
kesi = alpha_r + phi_r;
% In R2016a or earlier:
kesi = bsxfun(@plus, alpha_r, phi_r);
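
To check that the reshaped version really matches the original triple loop, you can compare the two directly. This is a small sketch that reuses the same sizes as above; kesi_loop and kesi_vec are names chosen here for illustration:

```matlab
T = 10;
n = 7;
alpha = rand(n, T);
phi = rand(n, T);

% Reference result from the original triple loop (preallocated here).
kesi_loop = zeros(n, n, T);
for t = 1:T
    for i1 = 1:n
        for j1 = 1:n
            kesi_loop(i1, j1, t) = alpha(i1, t) + phi(j1, t);
        end
    end
end

% Vectorized result: alpha varies along dimension 1, phi along dimension 2.
kesi_vec = bsxfun(@plus, reshape(alpha, n, 1, T), reshape(phi, 1, n, T));

% Each element is a single addition of the same two operands, so the
% results should match exactly.
assert(isequal(kesi_loop, kesi_vec));
```

The reshape calls put the row index of alpha on the first dimension and the row index of phi on the second, which is what lets bsxfun expand them into an n-by-n-by-T array.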