I can use `od` when I want to dump the contents of a non-textual file to a terminal (or a text file) as human-readable values. It lets me peer into files whose elements are of various types: signed or unsigned integers, floating point, or printable ASCII. (You can also have the data printed in various bases like hexadecimal or octal, hence the name, but that's not what I care about.)
The limitation is that the input file is assumed to have a single, uniform data type. But what if this is not the case? What if I have triplets of, say, a single-byte unsigned value, then a 4-byte floating-point element, and then a 2-byte signed integer element, i.e. in `od` terms, `u1,f4,d2`?
I would like to see a sequence of triplets of numbers of these types printed for me, with any reasonable convention of line-breaking and field-delimitation. Suppose I want to specify my struct/tuple format as above, i.e. comma-separated, `od`-style; but I'm flexible on the specifics of this.
Can I use the shell and common command-line tools to achieve this relatively painlessly?
The `od` command will accumulate multiple formats with a single `-t` option (e.g., `-t u1f4d2` in your case) and output a line for each type requested. Since you have multiples of the same type, adding them to the `-t` option only adds redundant information, so we can just use the representative types. Attempting to generate some data like you describe, you get something like the following, with a line of output for each requested type:
% echo "128 255 12 3.7 -12" | perl -ne 'print pack("CCCfs", split)' | od -An -tu1f4d2
128 255 12 205 204 108 64 244 255 // u1
-1.4784717e+08 -6.0981913e+31 3.57e-43 // f4
-128 -13044 27852 -3008 255 // d2
Unfortunately, it seems that `od` applies each requested type to the whole line of input, starting from its first byte. In your example, the three unsigned bytes cause the floating-point value following them not to start on a word (32-bit) boundary, so `od` can't decode the float correctly.
However, if your data packing matches word boundaries, then you can get pretty close. By inserting an additional unsigned byte after your triple:
% echo "128 255 12 255 3.7 -12" | perl -ne 'print pack("CCCCfs", split)' | od -An -tu1f4d2
128 255 12 255 205 204 108 64 244 255
-1.8741855e+38 3.7 9.1819e-41 // we get the correct float
-128 -244 -13107 16492 -12 // and signed short
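With this scenario, we can get close to what you ask with some more shell magic (note that the `paste` delimiter list is two spaces followed by a newline, so that each group of three lines is joined into one):
% echo "128 255 12 255 3.7 -12" | perl -ne 'print pack("CCCCfs", split)' | od -An -tu1f4d2 | paste -sd '  \n' | awk '{ print $1, $2, $3, $12, $18 }'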
128 255 12 3.7 -12
Decoding that command pipeline a bit:
Command | Description |
---|---|
`echo "128 255 12 255 3.7 -12"` | Create some data in the form requested (four unsigned bytes, a float, and a signed short) |
`perl -ne 'print pack("CCCCfs", split)'` | Write them as binary |
`od -An -tu1f4d2` | Decode the binary. `od` writes a line of output for each type requested: one decoded as unsigned bytes, one decoded as floats, and one decoded as signed shorts |
`paste -sd '  \n'` | Combine the three lines together (the delimiter list is two spaces followed by a newline) |
`awk '{ print $1, $2, $3, $12, $18 }'` | Print the selected fields from the space-separated output |
`awk` is just one option for isolating the fields you're looking for.
If you need to do this for multiple structures of the same size, you can use a combination of `od`'s `-N` (number of bytes to read) and `-w` (number of bytes to print per output line) options, with the limitation that the number of bytes read must be evenly divisible by the width and be a multiple of the word size (e.g., 32 bits). Alternatively, you might use a loop in a shell script that combines `-j <n>` (have `od` skip the first n bytes of the file) with the `-N` option.