c++ multithreading setrlimit

setrlimit() not affecting spawned std::threads


I am working on a pipeline that loads and transforms many images at once (1440), so the memory footprint is quite heavy. I therefore tried to implement a memory management system based on setrlimit(). However, the limit does not seem to affect the spawned threads (std::thread): calls to getrlimit() inside the thread functions show that they ignore it, and eventually my program gets killed. Here is the code I use to set the limit:

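#include <cstdint>
#include <cstdio>
#include <iostream>

#include <sys/resource.h>

void setMemoryLimit(std::uint64_t bytes)
{
    // Read the current limits first so that rlim_max holds a valid value
    struct rlimit limit;
    if(getrlimit(RLIMIT_AS, &limit) != 0)
    {
        std::perror("WARNING: current memory limit couldn't be read");
        return;
    }

    if(bytes <= limit.rlim_max)
    {
        limit.rlim_cur = bytes;
        std::cout << "New memory limit: " << limit.rlim_cur << " bytes" << std::endl;
    }
    else
    {
        // The soft limit cannot exceed the hard limit, so clamp to rlim_max
        limit.rlim_cur = limit.rlim_max;
        std::cout << "WARNING: Memory limit couldn't be set to " << bytes << " bytes" << std::endl;
        std::cout << "New memory limit: " << limit.rlim_cur << " bytes" << std::endl;
    }

    // perror() appends ": <error text>" itself, so the message has no trailing colon
    if(setrlimit(RLIMIT_AS, &limit) != 0)
        std::perror("WARNING: memory limit couldn't be set");

    // included for debugging purposes
    struct rlimit tmp;
    getrlimit(RLIMIT_AS, &tmp);
    std::cout << "Tmp limit: " << tmp.rlim_cur << " bytes" << std::endl; // prints the correct limit
}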

I'm using Linux. The man page states that setrlimit affects the whole process so I'm kind of clueless why the threads don't seem to be affected.

Edit: By the way, the function above is called at the very beginning of main().


Solution

  • The problem was quite hard to find because it had two entirely independent causes:

    1. My executable was compiled with -fomit-frame-pointer. With that flag, the threads report the limit as reset to its maximum value. See the following example:

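      /* rlimit.cpp */
      #include <iostream>
      #include <thread>
      #include <vector>
      
      #include <sys/resource.h>
      
      class A
      {
          public:
              // Prints the soft RLIMIT_AS limit as seen from the calling thread
              void foo()
              {
                  struct rlimit limit;
                  getrlimit(RLIMIT_AS, &limit);
                  std::cout << "Limit: " << limit.rlim_cur << std::endl;
              }
      };
      
      int main()
      {
          // Note: only rlim_cur is assigned here. rlim_max is left uninitialized
          // and the return value of setrlimit() is not checked, so the call may
          // fail silently. The line printed below shows the requested value,
          // not the limit that is actually in effect.
          struct rlimit limit;
          limit.rlim_cur = 500 * 1024 * 1024;
          setrlimit(RLIMIT_AS, &limit);
          std::cout << "Limit: " << limit.rlim_cur << std::endl;
      
          A a; // declared outside the loop so it outlives every thread using it
          std::vector<std::thread> t;
      
          for(int i = 0; i < 5; i++)
              t.push_back(std::thread(&A::foo, &a));
      
          // std::thread is not copyable, so iterate by reference
          for(auto& thread : t)
              thread.join();
      
          return 0;
      }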
      

      Outputs:

      > g++ -std=c++11 -pthread -fomit-frame-pointer rlimit.cpp -o limit
      > ./limit
      Limit: 524288000
      Limit: 18446744073709551615
      Limit: 18446744073709551615
      Limit: 18446744073709551615
      Limit: 18446744073709551615
      Limit: 18446744073709551615
      
      > g++ -std=c++11 -pthread rlimit.cpp -o limit
      > ./limit
      Limit: 524288000
      Limit: 524288000
      Limit: 524288000
      Limit: 524288000
      Limit: 524288000
      Limit: 524288000
      
    2. For the image processing part I work with OpenCL. Apparently NVIDIA's implementation calls setrlimit() itself and raises the soft limit (rlim_cur) to rlim_max.
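
One way to guard against both causes is sketched below. It is not the original code: capAddressSpace() and softLimitSeenByThread() are hypothetical helper names, and the sketch assumes Linux and an unprivileged process. The struct is filled by getrlimit() before it is modified, so rlim_max is never left uninitialized, and the return value of setrlimit() is checked. The hard limit is lowered together with the soft limit, because an unprivileged process can raise its soft limit only up to the hard limit and cannot raise the hard limit again; that should keep code running later in the same process from pushing the soft limit back up. Whether a particular OpenCL implementation copes with the lowered hard limit has not been verified here.

```cpp
#include <sys/resource.h>

#include <cstdint>
#include <cstdio>
#include <thread>

// Hypothetical helper: caps RLIMIT_AS (soft and hard) for the whole process.
// Returns true on success, false if either system call fails.
bool capAddressSpace(std::uint64_t bytes)
{
    struct rlimit limit;
    if (getrlimit(RLIMIT_AS, &limit) != 0) {   // fills rlim_cur AND rlim_max
        std::perror("getrlimit");
        return false;
    }

    rlim_t wanted = static_cast<rlim_t>(bytes);
    if (wanted > limit.rlim_max)
        wanted = limit.rlim_max;               // cannot exceed the existing hard limit

    limit.rlim_cur = wanted;
    limit.rlim_max = wanted;                   // lowering this is irreversible without privileges

    if (setrlimit(RLIMIT_AS, &limit) != 0) {
        std::perror("setrlimit");
        return false;
    }
    return true;
}

// Hypothetical helper: reads the soft limit from inside a freshly spawned
// thread, to confirm that threads observe the process-wide value.
rlim_t softLimitSeenByThread()
{
    struct rlimit seen{};
    std::thread worker([&seen] { getrlimit(RLIMIT_AS, &seen); });
    worker.join();
    return seen.rlim_cur;
}
```

Calling capAddressSpace() at the very beginning of main(), and softLimitSeenByThread() again after any third-party library has been initialized, shows whether the limit is still in place. Because the hard limit is lowered as well, the process itself cannot raise the cap later, so the value should be chosen with that in mind.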