linux mpi ulimit torque ofed

How can I increase the OpenFabrics memory limit for Torque jobs?


When I run an MPI job over InfiniBand, I get the following warning. We use Torque as our resource manager.

--------------------------------------------------------------------------
WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory.  This can cause MPI jobs to
run with erratic performance, hang, and/or crash.

This may be caused by your OpenFabrics vendor limiting the amount of
physical memory that can be registered.  You should investigate the
relevant Linux kernel module parameters that control how much physical
memory can be registered, and increase them to allow registering all
physical memory on your machine.

See this Open MPI FAQ item for more information on these Linux kernel module
parameters:

http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

Local host:              host1
Registerable memory:     65536 MiB
Total memory:            196598 MiB

Your MPI job will continue, but may be behave poorly and/or hang.

--------------------------------------------------------------------------

I've read the FAQ item linked in the warning message. What I've done so far is:

  1. Appended options mlx4_core log_num_mtt=20 log_mtts_per_seg=4 to /etc/modprobe.d/mlx4_en.conf.
  2. Made sure the following lines are present in /etc/security/limits.conf:
    • * soft memlock unlimited
    • * hard memlock unlimited
  3. Appended session required pam_limits.so to /etc/pam.d/sshd.
  4. Made sure ulimit -c unlimited is uncommented in /etc/init.d/pbs_mom.

Can anyone help me find out what I'm missing?


Solution

  • Your mlx4_core parameters allow registration of only 2^20 * 2^4 * 4 KiB = 64 GiB (that is, 2^log_num_mtt * 2^log_mtts_per_seg * page size). With 192 GiB of physical memory per node, and given the recommendation that registerable memory be at least twice the physical memory, you should set log_num_mtt to 23. That raises the limit to 2^23 * 2^4 * 4 KiB = 512 GiB, the smallest power of two greater than or equal to twice the amount of RAM (384 GiB). Be sure to reboot the node(s), or unload and then reload the kernel module, for the change to take effect.

    You should also submit a simple Torque job script that executes ulimit -l to check the locked-memory limit that jobs actually see; it should report unlimited. Note that ulimit -c unlimited removes the limit on the size of core dump files, not the limit on the amount of locked memory. The locked-memory limit is the one controlled by ulimit -l.
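
Both checks above can be sketched as one small Torque job script. This is only an illustration under some assumptions: the job name, the resource request, and the helper function max_reg_gib are made up for this example, the page size is assumed to be 4 KiB, and the /sys/module/mlx4_core/parameters files are only present while the mlx4_core module is loaded.

```shell
#!/bin/bash
#PBS -N memlock-check
#PBS -l nodes=1:ppn=1
#PBS -j oe

# Locked-memory limit as seen by a process started through pbs_mom.
# The desired value is "unlimited".
echo "max locked memory: $(ulimit -l)"

# Hypothetical helper: registerable memory in GiB, computed as
#   2^log_num_mtt * 2^log_mtts_per_seg * page size
# Arguments: log_num_mtt, log_mtts_per_seg, page size in bytes (default 4096).
max_reg_gib() {
    local page_size="${3:-4096}"
    echo $(( ((1 << $1) * (1 << $2) * page_size) >> 30 ))
}

# Report the live module parameters if mlx4_core is loaded on this node.
params=/sys/module/mlx4_core/parameters
if [ -r "$params/log_num_mtt" ] && [ -r "$params/log_mtts_per_seg" ]; then
    mtt=$(cat "$params/log_num_mtt")
    seg=$(cat "$params/log_mtts_per_seg")
    echo "current registerable memory: $(max_reg_gib "$mtt" "$seg") GiB"
fi

# The two settings discussed above, for comparison.
echo "log_num_mtt=20, log_mtts_per_seg=4: $(max_reg_gib 20 4) GiB"
echo "log_num_mtt=23, log_mtts_per_seg=4: $(max_reg_gib 23 4) GiB"
```

Submit it with qsub and read the job's output file. If the first line shows a number instead of unlimited, the limit is still being applied to processes started by pbs_mom, regardless of what an interactive SSH session reports.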