
setrlimit isn't reliable?


I'm trying to use setrlimit() to cap the amount of time a process takes. However, it doesn't seem to work when I do certain operations like printf().

Here is a test program illustrating the problem:

#include <sys/resource.h>
#include <stdio.h>

int main(void) {
    int i;
    struct rlimit limit;
    limit.rlim_cur = 3;
    limit.rlim_max = 3; // send SIGKILL after 3 seconds 
    setrlimit(RLIMIT_CPU, &limit);

    // doesn't get killed
    for(i=0; i<1000000; i++)
        printf("%d",i);

    return 0;
}

However, if I replace the for loop with a different routine, such as a naive Fibonacci:

int fib(int n) {
    if(n<=1) return 1;
    return fib(n-1)+fib(n-2);
}
int main(void) {
    ...
    fib(100);
    ...
}

It works perfectly. What's going on here? Is setrlimit() simply unreliable?


Solution

  • The CPU limit is a limit on CPU seconds rather than elapsed time. CPU seconds measure how long the process has actually been executing on the CPU (user plus system time), which does not necessarily match the elapsed wall-clock time.

    When you do the fib call, you hammer the CPU so that elapsed and CPU time are close (most of the process time is spent using the CPU). That's not the case when printing, since most of the time there is spent waiting on I/O.

    So what's happening in your particular case is that the rlimit is set but you're just not using your three seconds of CPU time before the process finishes.

    Changing the main as follows causes the signal to be delivered on my system:

    int main(void) {
        int i;
        struct rlimit limit;
        limit.rlim_cur = 3;
        limit.rlim_max = 3; // hard limit: SIGKILL after 3 seconds of CPU time
        setrlimit(RLIMIT_CPU, &limit);
    
        while (1) {                      // Run "forever".
            for(i=0; i<100000; i++) {
                printf("%d\n",i);
            }
            fib(30);                     // some CPU-intensive work.
        }
    
        return 0;
    }
    

    When you time that under Linux, you see:

    : (much looping).
    52670
    52671
    52672
    52673
    52674
    Killed
    
    real   0m18.719s
    user   0m0.944s
    sys    0m2.416s
    

    In that case, it took almost 19 seconds of elapsed time, but the CPU was in use for only 3.36 seconds (user + sys), just over the three-second limit.