matlab parallel-processing embarrassingly-parallel

Matlab dfeval overhead


I have an embarrassingly parallel job that requires no communication between the workers. I'm trying to use the dfeval function, but the overhead seems to be enormous. To get started, I'm trying to run the example from the documentation.

>> matlabpool open
Starting matlabpool using the 'local' configuration ... connected to 8 labs.
>> sched = findResource('scheduler','type','local')
sched =
Local Scheduler Information
===========================

                      Type : local
             ClusterOsType : pc
               ClusterSize : 8
              DataLocation : C:\Users\~\AppData\Roaming\MathWorks\MATLAB\local_scheduler_data\R2010a
       HasSharedFilesystem : true

- Assigned Jobs

           Number Pending  : 0
           Number Queued   : 0
           Number Running  : 1
           Number Finished : 8

- Local Specific Properties

         ClusterMatlabRoot : C:\Program Files\MATLAB\R2010a
>> matlabpool close force local
Sending a stop signal to all the labs ... stopped.
Did not find any pre-existing parallel jobs created by matlabpool.

>> sched = findResource('scheduler','type','local')
sched =
Local Scheduler Information
===========================

                      Type : local
             ClusterOsType : pc
               ClusterSize : 8
              DataLocation : C:\Users\~\AppData\Roaming\MathWorks\MATLAB\local_scheduler_data\R2010a
       HasSharedFilesystem : true

- Assigned Jobs

           Number Pending  : 0
           Number Queued   : 0
           Number Running  : 0
           Number Finished : 8

- Local Specific Properties

         ClusterMatlabRoot : C:\Program Files\MATLAB\R2010a
>> tic;y = dfeval(@rand,{1 2 3},'Configuration', 'local');toc
Elapsed time is 4.442944 seconds.

Running the same command again produces similar timings. So my questions are:

  1. Why do I need to run matlabpool close force local to get the Number Running to zero, given that I run matlabpool open in a fresh instance?
  2. Is nearly five seconds of overhead really necessary for such a trivial example, especially given that the MATLAB workers have already been started?

Solution

  • The DFEVAL function is a wrapper around submitting a job with one or more tasks to a given scheduler, in your case the 'local' scheduler. With the 'local' scheduler, each new task runs in a fresh MATLAB worker session, which is why you see the roughly 4.5 second overhead: that's the time taken to launch the worker, work out what to do, do it, and then quit.

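    The reason that you need the number of running jobs to be zero is that the local scheduler can only run a restricted number of workers (8 here, per ClusterSize). An open MATLABPOOL counts as a running job and occupies workers, so the DFEVAL tasks have to wait for free workers until the pool is closed.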

    In general, PARFOR with MATLABPOOL is an easier combination to use than DFEVAL. Also, when you open a MATLABPOOL, the workers are launched and ready, so the overhead of PARFOR is much less (but still not zero as the body of the loop needs to be sent to the workers).
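
    As a rough illustration of the PARFOR route, here is a minimal sketch of the same computation as the dfeval example (rand(1), rand(2), rand(3)). It assumes the R2010a-era Parallel Computing Toolbox and the 'local' configuration; the variable names sizes and y are only for illustration, and the timing you get will depend on your machine.

    ```matlab
    % Sketch only: assumes Parallel Computing Toolbox (R2010a-era API)
    % and the 'local' configuration from the question.
    matlabpool open local          % worker start-up cost is paid once, here

    sizes = {1 2 3};               % same inputs as the dfeval example
    y = cell(size(sizes));         % preallocate the sliced output variable

    tic;
    parfor k = 1:numel(sizes)
        y{k} = rand(sizes{k});     % runs on a worker that is already running
    end
    toc

    matlabpool close               % release the workers when finished
    ```

    Because the pool stays open between loops, repeated PARFOR calls reuse the same workers, whereas each dfeval call launches fresh worker sessions for its tasks.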