tl;dr: I want to see/get the filenames that the reported block numbers on dmesg are part of, when vm.block_dump=1
example dmesg: bash(13515): READ block 5434824 on xvda3 (32 sectors)
When eg. sudo sysctl -w vm.block_dump=1
or eg. echo '1' | sudo tee /proc/sys/vm/block_dump
then "Linux reports all disk read and write operations that take place, and all block dirtyings done to files. [...] The output of block_dump is written to the kernel output, and it can be retrieved using "dmesg". When you use block_dump and your kernel logging level also includes kernel debugging messages, you probably want to turn off klogd, otherwise the output of block_dump will be logged, causing disk activity that is not normally there." (quote from here)
The "block dirtyings" are not an issue, eg.
[ 3140.559675] systemd-journal(291): dirtied inode 399135 (system.journal) on xvda3
I can see its filename like this:
$ echo -e 'open /dev/xvda3\n ncheck 399135' | sudo debugfs -f -
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: ncheck 399135
Inode Pathname
399135 /var/log/journal/12c5e521101c444594b96b53751551a8/system.journal
The problem is with "Linux reports all disk read and write operations that take place" (to quote from the above) as they are reported in blocks, eg.
[ 3140.376827] kworker/u24:3(21616): WRITE block 11037768 on xvda3 (8 sectors)
[ 3140.724725] bash(13515): READ block 5434824 on xvda3 (32 sectors)
[ 3140.725483] date(13515): READ block 5434896 on xvda3 (160 sectors)
[ 3140.728946] sed(13519): READ block 5143680 on xvda3 (32 sectors)
[ 3140.736022] sleep(13522): READ block 5379184 on xvda3 (24 sectors)
[ 3140.804803] qubes-gui(522): READ block 5179952 on xvda3 (16 sectors)
[ 3140.806519] Xorg(599): READ block 7420192 on xvda3 (176 sectors)
[ 3140.810348] InputThread(613): READ block 7418560 on xvda3 (112 sectors)
[ 3140.815866] at-spi2-registr(812): READ block 5654512 on xvda3 (8 sectors)
[ 3140.816860] xdg-desktop-por(888): READ block 5795168 on xvda3 (8 sectors)
[ 3140.818716] gnome-terminal-(865): READ block 5804672 on xvda3 (16 sectors)
[ 3141.064524] sed(13531): READ block 3446048 on xvda3 (16 sectors)
[ 3141.130808] systemd(571): READ block 4744136 on xvda3 (184 sectors)
The kernel code responsible for showing these kind of messages can be seen here: https://elixir.bootlin.com/linux/v4.18.5/source/block/blk-core.c#L2542
None of those blocks yield any inode numbers, using this:
$ echo -e 'open /dev/xvda3\n icheck 11037768' |sudo debugfs -f -
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: icheck 11037768
Block Inode number
11037768 <block not found>
Instead of <block not found>
above I should get an inode number which I can then use with the previous echo to see the file name.
What I tried: just in case the block number is sector size(512 bytes) multiple and I know that ext4 block size is 4096 bytes(sudo blockdev --getbsz /dev/xvda
), I also tried(using the result of multiplication for): 11037768*2, 11037768*4 and 11037768*8 with the same result: <block not found>
What am I missing here? Are those blocks pointing to directory entries or something that's not a file name? Even so, shouldn't debugfs still yield an inode number?
Is there a better way to get to the file name?
EDIT: The block number that icheck
(inside debugfs
) expects are numbers like what debugfs's stat
reports under EXTENTS
, for example any number in range 2172716-2172721
seen below as:
$ sudo debugfs -R "stat /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py" /dev/xvda3
debugfs 1.44.2 (14-May-2018)
Inode: 550529 Type: regular Mode: 0644 Flags: 0x80000
Generation: 1781055959 Version: 0x00000000:00000001
User: 0 Group: 0 Project: 0 Size: 22179
File ACL: 0
Links: 1 Blockcount: 48
Fragment: Address: 0 Number: 0 Size: 0
ctime: 0x5b6ec29d:1b2e0200 -- Sat Aug 11 13:03:57 2018
atime: 0x5b33b5a9:00000000 -- Wed Jun 27 18:04:57 2018
mtime: 0x5b33b5a9:00000000 -- Wed Jun 27 18:04:57 2018
crtime: 0x5b6ec29d:1af0f900 -- Sat Aug 11 13:03:57 2018
Size of extra inode fields: 32
EXTENTS:
(0-5):2172716-2172721
Another way:
$ filefrag -s -v /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py
Filesystem type is: ef53
File size of /usr/lib/python2.7/site-packages/salt/modules/zonecfg.py is 22179 (6 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 5: 2172716.. 2172721: 6: last,eof
/usr/lib/python2.7/site-packages/salt/modules/zonecfg.py: 1 extent found
Now the question remains: How do those device block numbers (the block numbers reported for xvda3 in dmesg) relate to these physical offsets ?
EDIT2: I've just confirmed that these physical offset numbers are the same as block device numbers (just, apparently not the same as the ones reported on dmesg); the following shows me the last block of the above file and I can confirm it being the same as when viewing the file with vim
:
$ sudo dd bs=4096 skip=2172721 count=1 if=/dev/xvda3 | hexdump -C
I've tested this under a 4.18.5 kernel inside a Qubes OS R4.0 Fedora 28 AppVM. (I can recompile a custom kernel with custom .config
/patches if necessary - suggestions welcome)
The block numbers reported on dmesg when vm.block_dump=1
are sector numbers - 512
bytes per block - but the block numbers that the debugfs
command expects for its icheck
command are usually 4096
bytes per block for ext4, so all that is required is a division by 8
.
While block size is usually 4096 bytes, you have to be sure, so get the block size like this:
$ sudo blockdev --getbsz /dev/xvda3
4096
While this value is usually 4096, it can be different in certain cases like eg. mkfs.ext4
with -b
can create an ext4 filesystem with a different block size; vfat has 512 byte block size.
To get the divider for the division, from the block size, all in one go (in bash):
$ echo $(( `sudo blockdev --getbsz /dev/xvda3` / 512 ))
8
Let's use this line as input:
[ 3140.736022] sleep(13522): READ block 5379184 on xvda3 (24 sectors)
Get the block number for icheck
by dividing the above by 8
:
$ bc -l
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
5379184/8
672398.00000000000000000000
I used bc -l
to make sure that I didn't typo the number(if it end with .00000000000000000000
I most likely didn't)
Get the path name by using debugfs
:
$ sudo debugfs
debugfs 1.44.2 (14-May-2018)
debugfs: open /dev/xvda3
debugfs: icheck 672398
Block Inode number
672398 134571
debugfs: ncheck 134571
Inode Pathname
134571 /usr/bin/sleep
debugfs: close
debugfs: quit
I am unsure if the block numbers reported in dmesg can ever be multiples of values other than 512 bytes per sector. For example, are they still 512 bytes per block if the underlying disk has 4096 bytes per sector physically? If I were to guess, I'd say it's safe to assume they're always 512 bytes per block.
Here's the script version of the above:
#!/bin/bash
#./showblock rev.02 rewritten for question/answer from: https://stackoverflow.com/q/52058914/10239615
#----
bytes_per_sector=512 #assumed that dmesg block numbers are 512 bytes each (ie. 512 bytes per sector; aka block size is 512)!
#----
#use `sudo` only when not already root
if test "`id -u`" != "0"; then
sudo='sudo'
else
sudo=''
fi
if ! test "$#" -ge "2"; then
echo "Usage: $0 <device> <dmesgblocknumber> [dmesgblocknumber ...]"
echo "Examples:"
echo "$0 /dev/xvda3 5379184"
echo "$0 /dev/xvda3 5379184 5129952 7420192"
exit 1
fi
within_exit() {
echo -e "\nSkipped current instruction within on_exit()"
}
on_exit() {
#trap - EXIT SIGINT SIGQUIT SIGHUP #will exit by skipping the rest of all instrunction from on_exit() eg. if C-c
trap within_exit EXIT SIGINT SIGQUIT SIGHUP #skip only current instruction from on_exit() eg. when C-c is pressed
#echo "first sleep"
#sleep 10
#echo "second sleep"
#sleep 10
if test "${#remaining_args[@]}" -gt 0; then
echo -n "WARNING: There are '${#remaining_args[@]}' remaining args not processed, they are: " >&2
for i in `seq 0 1 "$(( "${#remaining_args[@]}" - 1 ))"`; do #seq is part of coreutils package
echo -n "'${remaining_args[${i}]}' " >&2
done
echo >&2
fi
}
trap on_exit EXIT SIGINT SIGQUIT SIGHUP
dev="$1"
shift 1
if test -z "$dev" -o ! -b "$dev"; then
echo "Bad device name or not a device: '$dev'" >&2
exit 1
fi
blocksize="`$sudo blockdev --getbsz "$dev"`"
if test "${blocksize:-0}" -le "0"; then #handles empty arg too
echo "Failed getting block size for '$dev', got '$blocksize'" >&2
exit 1
fi
#TODO: check and fail if not a multiplier
divider="$(( $blocksize / $bytes_per_sector ))"
if ! test "${divider:-0}" -gt "0"; then
echo "Failed computing divider from: '$blocksize' / '$bytes_per_sector', got '$divider'" >&2
exit 1
fi
# for each passed-in dmesg block number do
while test "$#" -gt "0"; do
dmesgblock="$1"
shift
remaining_args=("$@") #for on_exit() above
echo '--------'
echo "Passed-in dmesg block($bytes_per_sector bytes per block) number: '$dmesgblock'"
#have to handle the case when $dmesgblock is empty and when it's negative eg. "-1" so using a default value(of 0) when unset in the below 'if' block will help not break the 'test' expecting an integer while also allowing negative numbers ("0$dmesgblock" would otherwise yield "0-1" a non-integer):
if test "${dmesgblock:-0}" -le "0"; then
echo "Bad passed-in dmesg block number: '$dmesgblock'" >&2
exit 1
fi
#TODO: check and fail(or warn? nah, it should be fail!) if not a multiplier (eg. use modullo? is it "%" ?)
block=$(( $dmesgblock / 8 ))
if ! test "${block:--1}" -ge "0"; then
echo "Badly computed device block number: '$block'" >&2
exit 1
fi
echo "Actual block number(of $blocksize bytes per block): $block"
inode="$(echo "open ${dev}"$'\n'"icheck ${block}"$'\n'"close" | $sudo debugfs -f - 2>/dev/null | tail -n2|head -1|cut -f2 -d$'\t')"
if test "<block not found>" == "$inode"; then
echo "No inode was found for the provided dmesg block number '$dmesgblock' which mapped to dev block number '$block'" >&2
exit 1
else
#assuming number TODO: check for this!
echo "Found inode: $inode"
fpath="$(echo "open ${dev}"$'\n'"ncheck ${inode}"$'\n'"close" | $sudo debugfs -f - 2>/dev/null | tail -n2|head -1|cut -f2- -d$'\t')"
#fpath always begins with '/' right?
if test "$fpath" != "Pathname"; then
echo "Found path : $fpath"
else
echo "not found"
fi
fi
done
Sample usage:
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 "" -1 "" 1200 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: ''
Bad passed-in dmesg block number: ''
WARNING: There are '3' remaining args not processed, they are: '-1' '' '1200'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 -1 "" "" 1200 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: '-1'
Bad passed-in dmesg block number: '-1'
WARNING: There are '3' remaining args not processed, they are: '' '' '1200'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 0 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: '0'
Bad passed-in dmesg block number: '0'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 3 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
--------
Passed-in dmesg block(512 bytes per block) number: '3'
Actual block number(of 4096 bytes per block): 0
No inode was found for the provided dmesg block number '3' which mapped to dev block number '0'
exit code: 1
[user@dev01-w-s-f-fdr28 ~]$ ./showblock /dev/xvda3 5379184 5129952 ; echo "exit code: $?"
--------
Passed-in dmesg block(512 bytes per block) number: '5379184'
Actual block number(of 4096 bytes per block): 672398
Found inode: 134571
Found path : /usr/bin/sleep
--------
Passed-in dmesg block(512 bytes per block) number: '5129952'
Actual block number(of 4096 bytes per block): 641244
Found inode: 141497
Found path : /usr/lib64/libIlmImf-2_2.so.22.0.0
exit code: 0
Here's a script to process an entire dmesg log:
#!/bin/bash
#./showallblocks rev.01 rewritten for question/answer from: https://stackoverflow.com/q/52058914/10239615
if test "`id -u`" != "0"; then
sudo='sudo'
else
sudo=''
fi
dmesglog="$1"
if test -z "$dmesglog"; then
echo "Usage: '$0' <dmesglogfile>"
echo "Examples:"
echo "sudo dmesg > dmesg1.log && '$0' dmesg1.log"
echo "'$0' <(sudo dmesg)"
#Note: '$0' used for the case when $0 has spaces or other things in its path names, and user wants to copy paste, for whatever reason, the output of the above into the command line.
exit 1
fi
#(optional) Stop logging if already in progress:
$sudo sysctl -w vm.block_dump=0
#Using the answer from here(thanks to glenn jackman): https://unix.stackexchange.com/a/467377/306023
#grep --color=never -E -- 'READ block [0-9]+ on xvda3' "$dmesglog" |
#cat "$dmesglog" |
$sudo perl -pe '
if (! /READ block [0-9]+ on [A-Za-z0-9]+ .*$/) {
s{.*}{}s
}
s{(READ block) (\d+) (on) ([A-Za-z0-9]+) ([^\$]*)\n$}
{join " ",$1, $2, $3, $4, $5, qx(./showblock "/dev/$4" "$2" | grep -F -- "Found path :" | cut -f4- -d" ")}es
' -- "$dmesglog"
#Note: the output of qx(...) above is purposely allowed to have trailing newline! (I did wonder if purposely is correct here or it should be purposefully, https://www.merriam-webster.com/words-at-play/purposely-purposefully-usage )
#To find out what "}es"(above) is, see perlre modifiers: https://perldoc.perl.org/perlre.html#Modifiers
#FIXME: noobish try to exclude lines not matching the lines that need to be replaced, from output! used 'if' above
Sample output:
[user@dev01-w-s-f-fdr28 ~]$ ./showallblocks sample.dmesg.log
vm.block_dump = 0
[ 6031.953619] sysctl(20774): READ block 5285528 on xvda3 (32 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.954317] sysctl(20774): READ block 5285768 on xvda3 (8 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.954598] sysctl(20774): READ block 5285648 on xvda3 (120 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.954617] sysctl(20774): READ block 5285776 on xvda3 (24 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.955482] sysctl(20774): READ block 5285560 on xvda3 (88 sectors) /usr/lib64/libgpg-error.so.0.24.2
[ 6031.955699] sysctl(20774): READ block 4473568 on xvda3 (8 sectors) /usr/lib64/libuuid.so.1.3.0
[ 6031.955730] sysctl(20774): READ block 4473584 on xvda3 (16 sectors) /usr/lib64/libuuid.so.1.3.0
[ 6031.955787] sysctl(20774): READ block 4749496 on xvda3 (224 sectors) /usr/lib64/libblkid.so.1.1.0
[ 6030.638492] bash(20757): READ block 4543096 on xvda3 (32 sectors) /usr/bin/gawk
[ 6030.639133] awk(20757): READ block 4544280 on xvda3 (176 sectors) /usr/bin/gawk