I have a slurm batch script, and I'm running Intel MPI.
I want to run two different MPI codes on the same set of nodes with different process placement configurations.
I'm running two MPI codes, one with -np 8 and the other with -np 2. For the -np 8 case, I want the first mpiexec to place ranks [0, 1, 2, 3] on node0 and ranks [4, 5, 6, 7] on node1.
For the -np 2 case, I want to place rank [0] on node0 and rank [1] on node1.
I've tried -ppn, -perhost, I_MPI_PERHOST, and all the other options in Intel's process placement documentation, but none of them work.
My scripts inherit options directly from SBATCH instead of honoring local settings. Please don't suggest srun; there's an issue between MPI and srun on my system. I'm unable to run MPI with srun across multiple nodes (PMI error), whereas mpiexec works directly on multiple nodes.
Is there any way I can achieve the above task? Here's the SBATCH script I'm currently using.
#!/bin/bash
#SBATCH -p small
#SBATCH -N 2
#SBATCH --exclusive
#SBATCH --time=01:00:00
#SBATCH --error=err.out
#SBATCH --output=out.out
#SBATCH --ntasks=10
module load compiler/intel/2018.2.199
module load apps/ucx/ucx_1.13.1
source /opt/ohpc/pub/apps/intel/2018_2/compilers_and_libraries_2018.2.199/linux/mpi/intel64/bin/mpivars.sh intel64
export I_MPI_FALLBACK=disable
#--> First mpiexec
mpiexec.hydra -n 8 ./hello.out &
#--> Second mpiexec
mpiexec.hydra -n 2 ./world.out &
wait
Here my first mpiexec runs ranks [0,1,2,3,4] on node0 and ranks [5,6,7] on node1, whereas my second mpiexec runs ranks [0,1] on node0.
I want my first mpiexec to run ranks [0,1,2,3] on node0 and ranks [4,5,6,7] on node1, and my second mpiexec to run rank [0] on node0 and rank [1] on node1.
I can't use srun; I can only use mpiexec. Is there any way to apply local settings to each mpiexec? Any suggestions will be helpful.
The issue was with the version of Intel MPI; upgrading to the latest version fixed it.
Credit to Gilles Gouaillardet for the solution; please refer to the comments for the discussion.
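For reference, with a recent Intel MPI release the placement described above can be requested per-mpiexec with -ppn, which lays ranks out in consecutive blocks of ppn per host across the allocated nodes. A minimal sketch, assuming the Slurm allocation provides two nodes (the hostnames node0/node1 are illustrative, and the module/source lines from the original script would still precede this):

```shell
#!/bin/bash
# Sketch only: assumes a recent Intel MPI and a 2-node Slurm allocation.
# -ppn N places N consecutive ranks per node before moving to the next.

# First job: 8 ranks, 4 per node -> ranks 0-3 on node0, ranks 4-7 on node1
mpiexec.hydra -ppn 4 -n 8 ./hello.out &

# Second job: 2 ranks, 1 per node -> rank 0 on node0, rank 1 on node1
mpiexec.hydra -ppn 1 -n 2 ./world.out &

wait
```

Each mpiexec carries its own -ppn, so the two jobs no longer share a single inherited placement from SBATCH.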