How can I pass the Intel MPI flag -print-rank-map to the srun command, or set it through an environment variable in a batch script that is submitted to a SLURM system with the sbatch command?
Using export I_MPI_DEBUG=4, together with knowledge of which core IDs belong to which sockets, gives you this information. For example, I can get the mapping between sockets and core IDs from lscpu:
[auser@login3 ~]$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 72
On-line CPU(s) list: 0-71
Thread(s) per core: 2
Core(s) per socket: 18
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2695 v4 @ 2.10GHz
Stepping: 1
CPU MHz: 3091.353
CPU max MHz: 3300.0000
CPU min MHz: 1200.0000
BogoMIPS: 4199.86
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 46080K
NUMA node0 CPU(s): 0-17,36-53
NUMA node1 CPU(s): 18-35,54-71
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
As this is an Intel Broadwell CPU, the NUMA regions correspond to sockets:
NUMA node0 CPU(s): 0-17,36-53
NUMA node1 CPU(s): 18-35,54-71
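If you would rather not read the CPU lists by eye, the two NUMA lines can be turned into a lookup table. The following is only a minimal sketch: the function names expand_cpu_list and numa_map_from_lscpu are made up for illustration, and it assumes the lscpu text has the "NUMA nodeN CPU(s):" layout shown above.

```python
import re

def expand_cpu_list(spec):
    """Expand an lscpu-style CPU list such as '0-17,36-53' into a sorted list of IDs."""
    cpus = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return sorted(cpus)

def numa_map_from_lscpu(text):
    """Return {cpu_id: numa_node} built from the 'NUMA nodeN CPU(s):' lines."""
    mapping = {}
    # 'NUMA node(s): 2' is skipped because 'node' is not followed by a digit there.
    for match in re.finditer(r"NUMA node(\d+) CPU\(s\):\s*([\d,\-]+)", text):
        node = int(match.group(1))
        for cpu in expand_cpu_list(match.group(2)):
            mapping[cpu] = node
    return mapping

# Excerpt copied from the lscpu output above.
LSCPU_EXCERPT = """\
NUMA node(s): 2
NUMA node0 CPU(s): 0-17,36-53
NUMA node1 CPU(s): 18-35,54-71
"""

cpu_to_node = numa_map_from_lscpu(LSCPU_EXCERPT)
print(cpu_to_node[17], cpu_to_node[18], cpu_to_node[36])
```

On a machine where NUMA regions do not coincide with sockets, this table would describe NUMA nodes rather than sockets, so treat the result accordingly.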
Setting export I_MPI_DEBUG=4 gives the following type of information. Comparing the Pin cpu column with the NUMA CPU lists above, I can work out that ranks 0-17 are bound to socket 0 and ranks 18-35 are bound to socket 1.
[0] MPI startup(): Intel(R) MPI Library, Version 2019 Update 9 Build 20200923 (id: abd58e492)
[0] MPI startup(): Copyright (C) 2003-2020 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.10.1-impi
[0] MPI startup(): libfabric provider: verbs;ofi_rxm
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 2685376 r1i7n14 0
[0] MPI startup(): 1 2685377 r1i7n14 1
[0] MPI startup(): 2 2685378 r1i7n14 2
[0] MPI startup(): 3 2685379 r1i7n14 3
[0] MPI startup(): 4 2685380 r1i7n14 4
[0] MPI startup(): 5 2685381 r1i7n14 5
[0] MPI startup(): 6 2685382 r1i7n14 6
[0] MPI startup(): 7 2685383 r1i7n14 7
[0] MPI startup(): 8 2685384 r1i7n14 8
[0] MPI startup(): 9 2685385 r1i7n14 9
[0] MPI startup(): 10 2685386 r1i7n14 10
[0] MPI startup(): 11 2685387 r1i7n14 11
[0] MPI startup(): 12 2685388 r1i7n14 12
[0] MPI startup(): 13 2685389 r1i7n14 13
[0] MPI startup(): 14 2685390 r1i7n14 14
[0] MPI startup(): 15 2685391 r1i7n14 15
[0] MPI startup(): 16 2685392 r1i7n14 16
[0] MPI startup(): 17 2685393 r1i7n14 17
[0] MPI startup(): 18 2685394 r1i7n14 18
[0] MPI startup(): 19 2685395 r1i7n14 19
[0] MPI startup(): 20 2685396 r1i7n14 20
[0] MPI startup(): 21 2685397 r1i7n14 21
[0] MPI startup(): 22 2685398 r1i7n14 22
[0] MPI startup(): 23 2685399 r1i7n14 23
[0] MPI startup(): 24 2685400 r1i7n14 24
[0] MPI startup(): 25 2685401 r1i7n14 25
[0] MPI startup(): 26 2685402 r1i7n14 26
[0] MPI startup(): 27 2685403 r1i7n14 27
[0] MPI startup(): 28 2685404 r1i7n14 28
[0] MPI startup(): 29 2685405 r1i7n14 29
[0] MPI startup(): 30 2685406 r1i7n14 30
[0] MPI startup(): 31 2685407 r1i7n14 31
[0] MPI startup(): 32 2685408 r1i7n14 32
[0] MPI startup(): 33 2685409 r1i7n14 33
[0] MPI startup(): 34 2685410 r1i7n14 34
[0] MPI startup(): 35 2685411 r1i7n14 35
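The rank-to-socket reasoning can also be automated. The sketch below is illustrative only: ranks_by_socket, ROW and SOCKET_CPUS are hypothetical names, the socket layout is hard-coded from the lscpu output above, and it assumes each row of the pinning table ends in a single CPU ID as in this log. If the "Pin cpu" column in your output contains a list or range of CPUs, the pattern would need adjusting.

```python
import re
from collections import defaultdict

# Socket layout copied from the lscpu output above
# (NUMA node == socket on this particular machine).
SOCKET_CPUS = {
    0: set(range(0, 18)) | set(range(36, 54)),
    1: set(range(18, 36)) | set(range(54, 72)),
}

# Matches rows such as "[0] MPI startup(): 18 2685394 r1i7n14 18":
# rank, pid, node name, pinned CPU ID.
ROW = re.compile(r"MPI startup\(\):\s+(\d+)\s+(\d+)\s+(\S+)\s+(\d+)\s*$")

def ranks_by_socket(log_text):
    """Group MPI ranks by (node name, socket) using the pinned CPU ID."""
    result = defaultdict(list)
    for line in log_text.splitlines():
        m = ROW.search(line)
        if not m:
            continue  # banner lines and the table header are ignored
        rank, node_name, cpu = int(m.group(1)), m.group(3), int(m.group(4))
        for socket, cpus in SOCKET_CPUS.items():
            if cpu in cpus:
                result[(node_name, socket)].append(rank)
    return dict(result)

# A few rows taken from the log above.
SAMPLE = """\
[0] MPI startup(): Rank Pid Node name Pin cpu
[0] MPI startup(): 0 2685376 r1i7n14 0
[0] MPI startup(): 17 2685393 r1i7n14 17
[0] MPI startup(): 18 2685394 r1i7n14 18
[0] MPI startup(): 35 2685411 r1i7n14 35
"""

print(ranks_by_socket(SAMPLE))
# {('r1i7n14', 0): [0, 17], ('r1i7n14', 1): [18, 35]}
```

Feeding the full job output to ranks_by_socket should reproduce the split described above, with ranks 0-17 on socket 0 and ranks 18-35 on socket 1.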