
What do the user CPU time and system CPU time in getrusage(RUSAGE_THREAD, &r_usage) measure exactly?


I am trying to determine how long the current thread has been executing so far, and I am trying to use getrusage(RUSAGE_THREAD, &r_usage); for it. Here are my confusions:

1- Will the time returned by this function include the time the thread spent being blocked (e.g. on a condition variable) or scheduled out?

2- How about the time the thread spent being blocked for other reasons, e.g. blocked on I/O?

3- Can I increase the accuracy so that getrusage(RUSAGE_THREAD, &r_usage); returns the time in nanoseconds?

Many thanks!


Solution

  • No, in general the blocked time won't be assigned to user or kernel CPU time (which is what rusage measures). The rusage timer is basically a "wall clock timer" that is started and stopped by the OS: when the process is scheduled in, the OS notes the time, and when it is descheduled it stops the timer (and similarly for the user/kernel split on entry to and exit from kernel routines). The sum of all those segments is CPU time.
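    To see this behavior directly, here's a minimal sketch (Linux-specific, since RUSAGE_THREAD is a GNU extension that needs _GNU_SOURCE): the thread sleeps for 200 ms of wall time, but essentially none of it shows up as thread CPU time:

    ```c
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <time.h>
    #include <sys/time.h>
    #include <sys/resource.h>

    /* Convert a struct timeval to seconds as a double. */
    static double tv_sec(struct timeval tv) {
        return tv.tv_sec + tv.tv_usec / 1e6;
    }

    int main(void) {
        struct rusage before, after;
        getrusage(RUSAGE_THREAD, &before);

        /* Sleep: the thread is blocked, so no CPU time should accrue. */
        struct timespec ts = { .tv_sec = 0, .tv_nsec = 200 * 1000 * 1000 };
        nanosleep(&ts, NULL);

        getrusage(RUSAGE_THREAD, &after);
        double cpu = (tv_sec(after.ru_utime) - tv_sec(before.ru_utime))
                   + (tv_sec(after.ru_stime) - tv_sec(before.ru_stime));
        printf("CPU time accrued while sleeping 200 ms: %.6f s\n", cpu);
        /* Expect cpu to be near zero, far less than the 0.2 s of wall time. */
        return 0;
    }
    ```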

    In some cases, like IO, the kernel might be doing real work rather than waiting, and that time may be assigned to your process.

    If you want higher precision for CPU time you should look into performance counters. On Linux you can use the perf_events subsystem to access these counters in a virtualized way, or use a library like PAPI that wraps access to this subsystem; perhaps the easiest way to get started is to use something lightweight like libpfc, which provides straightforward counter access.
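    That said, if what you're after is just finer reporting units rather than hardware-counter accuracy (an assumption on my part), the per-thread CPU clock exposed through clock_gettime reports in a struct timespec with nanosecond units (actual granularity depends on the kernel and hardware):

    ```c
    #include <stdio.h>
    #include <time.h>

    int main(void) {
        struct timespec ts;
        /* CLOCK_THREAD_CPUTIME_ID: CPU time consumed by the calling thread. */
        if (clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts) != 0) {
            perror("clock_gettime");
            return 1;
        }
        double t1 = ts.tv_sec + ts.tv_nsec / 1e9;

        /* Burn some CPU so the per-thread clock visibly advances. */
        volatile unsigned long x = 0;
        for (unsigned long i = 0; i < 50 * 1000 * 1000; i++) x += i;

        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
        double t2 = ts.tv_sec + ts.tv_nsec / 1e9;
        printf("thread CPU time advanced by %.9f s\n", t2 - t1);
        return 0;
    }
    ```

    (On older glibc you may need to link with -lrt for clock_gettime.)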

    You also ask:

    Moreover, how can I include the time spent on I/O?

    This is a tough question because of the asynchronous and cached nature of IO on modern systems. It is hard to pin down exactly how much time was spent and how to allocate it (e.g., if one process brings a page from disk into the cache and 10 other processes subsequently access it, how do you divide up the IO time?). One thing you could do is look into the /proc/pid/ counters and status entries, which probably include a blocking-on-IO indicator. Indeed, top can show processes in this state, so you could likely get this info from /proc/$pid. Note that you would have to sample this filesystem from some other thread to make that work, I think.
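    As a starting point, here's a minimal sketch that dumps the per-process IO counters the kernel exposes in /proc/self/io (availability of some fields depends on the kernel being built with IO accounting; rchar/wchar count all read()/write() traffic, while read_bytes/write_bytes count actual storage IO):

    ```c
    #include <stdio.h>

    int main(void) {
        FILE *f = fopen("/proc/self/io", "r");
        if (!f) { perror("fopen"); return 1; }

        /* Print each "name: value" counter line as-is. */
        char line[128];
        int nlines = 0;
        while (fgets(line, sizeof line, f)) {
            fputs(line, stdout);
            nlines++;
        }
        fclose(f);
        return 0;
    }
    ```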

    Alternately, you could try to instrument your IO calls at the application level. Finally, you could use something like ptrace, or the newer ftrace on Linux, to instrument the IO calls at the kernel level, and you can probably filter this by process. Newer kernels apparently have "per-process IO accounting", but I couldn't quickly dig up a good link. The source for iotop would probably have the calls you need, though.

    Finally, it all depends on what you mean by "the time the current thread has been executing for so far": if you want to include IO, perhaps you just want the wall-clock time since the thread started? Then you can use clock_gettime and friends, which offer nanosecond resolution (nominally; the call itself takes at least a dozen nanoseconds, so you aren't going to accurately measure things that take 1 or 2 nanoseconds).
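    For example, a sketch measuring per-thread wall time with CLOCK_MONOTONIC; note that the 100 ms spent blocked in usleep is included, unlike with rusage:

    ```c
    #include <stdio.h>
    #include <time.h>
    #include <pthread.h>
    #include <unistd.h>

    static double g_elapsed;  /* wall time measured inside the thread */

    static void *worker(void *arg) {
        (void)arg;
        struct timespec start, now;
        clock_gettime(CLOCK_MONOTONIC, &start);

        usleep(100 * 1000);  /* blocked, but wall time keeps running */

        clock_gettime(CLOCK_MONOTONIC, &now);
        g_elapsed = (now.tv_sec - start.tv_sec)
                  + (now.tv_nsec - start.tv_nsec) / 1e9;
        printf("wall time in thread: %.3f s\n", g_elapsed);
        return NULL;
    }

    int main(void) {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL);
        pthread_join(t, NULL);
        return 0;
    }
    ```

    Compile with -pthread.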