matlab, nested-loops, parfor

Parallelize nested loops in Matlab


I'm trying to speed up the simulation of some panel data in Matlab. I simulate first over individuals (loop index ii from 1 to Nsim) and then, for each individual, over age (loop index jj from 1 to JJ). The code is slow because every iteration of the inner loop performs a bilinear interpolation.

Since the iterations of the outer loop are independent, I tried to use parfor on the outer loop (the one indexed by ii), but I get the error message "the parfor cannot run due to the way the variable h_sim is used". Could someone explain why this happens and how to solve the problem, if possible? Any help is greatly appreciated!

a_sim = zeros(Nsim,JJ);
h_sim = zeros(Nsim,JJ);
% Find point on a_grid corresponding to zero assets
aa0 = find_loc(a_grid,0.0);
% Zero housing
hh0 = 1;
a_sim(:,1) = a_grid(aa0);
h_sim(:,1) = h_grid(hh0);
parfor ii=1:Nsim % parfor rejects this loop (see the error message above)
    for jj=1:JJ-1
        z_c = z_sim_ind(ii,jj);
        apol_interp = griddedInterpolant({a_grid,h_grid},apol(:,:,z_c,jj));
        hpol_interp = griddedInterpolant({a_grid,h_grid},hpol(:,:,z_c,jj));
        a_sim(ii,jj+1) = apol_interp(a_sim(ii,jj),h_sim(ii,jj));
        h_sim(ii,jj+1) = hpol_interp(a_sim(ii,jj),h_sim(ii,jj));
    end
end

Solution

  • I think @Ben Voigt's suggestion was correct. To spell it out, do something like this:

    parfor ii=1:Nsim
        a_sim_row = a_sim(ii,:);
        h_sim_row = h_sim(ii,:);
        for jj=1:JJ-1
            z_c = z_sim_ind(ii,jj);
            apol_interp = griddedInterpolant({a_grid,h_grid},apol(:,:,z_c,jj));
            hpol_interp = griddedInterpolant({a_grid,h_grid},hpol(:,:,z_c,jj));
            a_sim_row(jj+1) = apol_interp(a_sim_row(jj),h_sim_row(jj));
            h_sim_row(jj+1) = hpol_interp(a_sim_row(jj),h_sim_row(jj));
        end
        a_sim(ii,:) = a_sim_row;
        h_sim(ii,:) = h_sim_row;
    end
    

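    This is a fairly standard parfor pattern for working around the limitation: extract a whole slice, do whatever is needed on it, then put the whole slice back. In the original loop, a_sim and h_sim are each indexed in two different ways inside the parfor body, as (ii,jj) and as (ii,jj+1), so parfor cannot classify them as sliced variables. In other words, parfor cannot spot that what you're doing is order-independent as far as the outer loop is concerned. With the row copies, each matrix is only ever accessed as (ii,:), which parfor accepts.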
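
To check the slice pattern in isolation, here is a minimal self-contained sketch. The grid sizes, the random apol/hpol arrays and z_sim_ind are made-up stand-ins for the real model (assumptions, not the asker's data), and z_row is an extra row copy added here only for illustration. The script runs the plain serial loops first, then the parfor version, and compares the two. If Parallel Computing Toolbox is not installed, parfor simply runs serially.

```matlab
% Hypothetical toy inputs, chosen only so the sketch runs on its own
Nsim = 8; JJ = 5; Nz = 2;
a_grid = linspace(0,1,4)';
h_grid = linspace(0,1,3)';
rng(0);
apol = rand(numel(a_grid), numel(h_grid), Nz, JJ);  % values in (0,1), so they stay on the grid
hpol = rand(numel(a_grid), numel(h_grid), Nz, JJ);
z_sim_ind = randi(Nz, Nsim, JJ);

% Serial reference: same recursion as in the question, with ordinary for loops
a_ref = zeros(Nsim,JJ);
h_ref = zeros(Nsim,JJ);
for ii = 1:Nsim
    for jj = 1:JJ-1
        z_c = z_sim_ind(ii,jj);
        Fa = griddedInterpolant({a_grid,h_grid}, apol(:,:,z_c,jj));
        Fh = griddedInterpolant({a_grid,h_grid}, hpol(:,:,z_c,jj));
        a_ref(ii,jj+1) = Fa(a_ref(ii,jj), h_ref(ii,jj));
        h_ref(ii,jj+1) = Fh(a_ref(ii,jj), h_ref(ii,jj));
    end
end

% parfor version: copy row ii out, work on the local copies, copy row ii back
a_sim = zeros(Nsim,JJ);
h_sim = zeros(Nsim,JJ);
parfor ii = 1:Nsim
    a_row = a_sim(ii,:);
    h_row = h_sim(ii,:);
    z_row = z_sim_ind(ii,:);   % local copy, so the inner loop only indexes temporaries
    for jj = 1:JJ-1
        Fa = griddedInterpolant({a_grid,h_grid}, apol(:,:,z_row(jj),jj));
        Fh = griddedInterpolant({a_grid,h_grid}, hpol(:,:,z_row(jj),jj));
        a_row(jj+1) = Fa(a_row(jj), h_row(jj));
        h_row(jj+1) = Fh(a_row(jj), h_row(jj));
    end
    a_sim(ii,:) = a_row;       % the only write to a_sim, always indexed as (ii,:)
    h_sim(ii,:) = h_row;
end

% Both versions should agree up to floating-point noise
assert(max(abs(a_sim(:) - a_ref(:))) < 1e-12);
assert(max(abs(h_sim(:) - h_ref(:))) < 1e-12);
```

Inside the parfor body, a_sim and h_sim appear only as (ii,:), once on the right-hand side and once on the left. The dependence on the previous age lives entirely in a_row and h_row, which are temporaries private to each iteration.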