java, mpi, compss, pycompss

COMPSs - Nodes already filled error


After submitting a COMPSs application, I received the following error message and the application was not executed.

MPI_CMD=mpirun -timestamp-output -n 1 -H s00r0
/apps/COMPSs/1.3/Runtime/scripts/user/runcompss
--project=/tmp/1668183.tmpdir/project_1458303603.xml
--resources=/tmp/1668183.tmpdir/resources_1458303603.xml
--uuid=2ed20e6a-9f02-49ff-a71c-e071ce35dacc
/apps/FILESPACE/pycompssfile arg1 arg2 : -n 1 -H s00r0
/apps/COMPSs/1.3/Runtime/scripts/system/adaptors/nio/persistent_worker_starter.sh
/apps/INTEL/mkl/lib/intel64 null
/home/myhome/kmeans_python/src/ true
/tmp/1668183.tmpdir 4 5 5 s00r0-ib0 43001 43000 true 1
/apps/COMPSs/1.3/Runtime/scripts/system/2ed20e6a-9f02-49ff-a71c-e071ce35dacc : -n 1 -H s00r0
/apps/COMPSs/1.3/Runtime/scripts/system/adaptors/nio/persistent_worker_starter.sh
/apps/INTEL/mkl/lib/intel64 null
/home/myhome/kmeans_python/src/ true
/tmp/1668183.tmpdir 4 5 5 s00r0-ib0 43001 43000 true 2
/apps/COMPSs/1.3/Runtime/scripts/system/2ed20e6a-9f02-49ff-a71c-e071ce35dacc

--------------------------------------------------------------------------
All nodes which are allocated for this job are already filled.
--------------------------------------------------------------------------

I am using COMPSs 1.3.

Why is this happening?


Solution

  • You are trying to run the master and a worker in the same node. In clusters, COMPSs 1.3 with the NIO adaptor (the default option) uses mpirun to spawn the master and worker processes on the different nodes of the cluster, and the mpirun installed in the cluster does not allow placing these extra processes on an already filled node.

    The options to solve it are the following (an example submission for each option is sketched at the end of this solution):

    1. Do not specify --tasks_in_master= in the enqueue_compss command, so that the master node does not also host a worker.
    2. Execute with the GAT adaptor (--comm=integratedtoolkit.gat.master.GATAdaptor), which does not rely on mpirun but has more overhead.

    The next COMPSs release will use the spawn commands available in the cluster resource managers (such as blaunch or srun), which should solve this issue.
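
    For reference, a minimal sketch of the two options for the PyCOMPSs application from the log above is shown below. The --num_nodes, --exec_time, --lang and --tasks_in_master values are illustrative assumptions, not taken from the job above; check enqueue_compss --help on your installation for the exact option names and defaults.

      # Option 1: omit --tasks_in_master= so no worker process is placed on the master node
      enqueue_compss --num_nodes=3 --exec_time=30 --lang=python \
          /apps/FILESPACE/pycompssfile arg1 arg2

      # Option 2: keep tasks in the master node but switch to the GAT adaptor,
      # which does not spawn the workers through mpirun (at the cost of more overhead)
      enqueue_compss --num_nodes=3 --exec_time=30 --lang=python \
          --comm=integratedtoolkit.gat.master.GATAdaptor \
          --tasks_in_master=4 \
          /apps/FILESPACE/pycompssfile arg1 arg2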