
Measuring peak disk use of a process


I am trying to benchmark a tool I'm developing in terms of time, memory, and disk use. I know /usr/bin/time gives me basically what I want for the first two, but for disk use I concluded that I would have to roll my own bash script that periodically reads the write_bytes value from /proc/<my_pid>/io. Based on this script, here's what I came up with:

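#!/bin/bash
# Run the given command in the background and poll its I/O counters.
"$@" &
pid=$!
maxdisk=0
# kill -0 sends no signal; it only tests whether the process still exists.
while kill -0 "$pid" 2>/dev/null; do
    sleep 0.05
    # write_bytes is a cumulative count of bytes the process caused to be written to storage.
    disk=$(awk '/^write_bytes:/ {print $2}' "/proc/$pid/io" 2>/dev/null)
    # The process may have exited between the liveness check and the read.
    [ -n "$disk" ] || continue
    disk=$((disk / 1024))   # bytes -> KB
    if [ "$disk" -gt "$maxdisk" ]; then
        maxdisk=$disk
        echo "disk: $disk"
    fi
done
wait "$pid"
ret=$?
echo "maximal disk used: $maxdisk KB"
# Propagate the exit status of the benchmarked command.
exit "$ret"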

Unfortunately, I am running into two problems:

How can I resolve these problems?


Solution

  • You may like to have a look at filetop from BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more:

    tools/filetop: File reads and writes by filename and process. Top for files.

    This script works by tracing the vfs_read() and vfs_write() functions using kernel dynamic tracing, which instruments explicit read and write calls. If files are read or written using another means (eg, via mmap()), then they will not be visible using this tool.

    Brendan Gregg gives good talks and demos about Linux performance tools; they are quite instructive.
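
As a rough illustration of the suggestion above, here is a minimal sketch of how filetop might be pointed at the process being benchmarked. It assumes BCC is installed and that you can run it as root. The helper names find_filetop and watch_writes are hypothetical, and the candidate binary names (filetop, filetop-bpfcc, /usr/share/bcc/tools/filetop) are assumptions about how distributions commonly package the tool; check the tool's -h output for the flags your version supports.

```shell
#!/bin/sh
# Sketch only: locate BCC's filetop and watch one process's file writes.

# Print the first filetop candidate that is available; fail if none is.
# The candidate names are assumptions about common packaging.
find_filetop() {
    for candidate in filetop filetop-bpfcc /usr/share/bcc/tools/filetop; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage: watch_writes PID [INTERVAL_SECONDS] [COUNT]
watch_writes() {
    pid=$1
    interval=${2:-1}
    count=${3:-10}
    tool=$(find_filetop) || {
        echo "filetop not found; install the BCC tools first" >&2
        return 1
    }
    # -C: do not clear the screen between refreshes
    # -s wbytes: sort rows by bytes written
    # -p PID: restrict tracing to the given process
    sudo "$tool" -C -s wbytes -p "$pid" "$interval" "$count"
}
```

For example, after starting the tool under test in the background, `watch_writes "$!" 1 30` would print per-file write activity once a second for 30 seconds. Writes made through mmap() will not show up, for the reason quoted above.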