How can I get the total size of the uncompressed file data in a .tar.gz file from the command line?
This sums the content sizes of all files in the archive, without writing anything to disk:
$ tar tzvf archive.tar.gz | sed 's/ \+/ /g' | cut -f3 -d' ' | sed '2,$s/^/+ /' | paste -sd' ' | bc
The output is given in bytes.
Explanation: tar tzvf lists the files in the archive in verbose format, like ls -l. The first sed squeezes runs of spaces into a single space so that cut can pick out the third field, which is the file size. The second sed puts a + in front of every size except the first, and paste joins the lines into one, giving a sum expression that is then evaluated by bc.
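The same sum can be sketched with a single awk call instead of the sed/cut/paste/bc chain. This is only an illustration, not part of the original answer: the function name tar_total_bytes is made up here, and it assumes GNU tar's verbose listing, where the size is the third whitespace-separated field (other tar implementations may place it elsewhere, and an owner name containing spaces would shift the fields).

```shell
# Hypothetical helper: print the summed content size, in bytes, of all
# entries in a gzip-compressed tar archive.
# Assumes GNU tar's "tar -tv" layout: permissions, owner/group, size, ...
tar_total_bytes() {
    tar -tzvf "$1" |
        awk '{ sum += $3 } END { printf "%.0f\n", sum }'
}
```

Usage would look like: tar_total_bytes archive.tar.gz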
Note that this doesn't include metadata, so the disk space taken up by the files when you extract them is going to be larger - potentially many times larger if you have a lot of very small files.
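One reason for the difference is that filesystems typically allocate space in fixed-size blocks, so even a 1-byte file can occupy a whole block. As a rough sketch of that effect, the variant below rounds every size up to a multiple of a block size before summing. The function name tar_disk_estimate and the 4096-byte default are assumptions for illustration; real usage depends on the filesystem (some store tiny files inline, and directories and other metadata are not counted here), and the field position again assumes GNU tar.

```shell
# Hypothetical helper: estimate on-disk usage after extraction by rounding
# each entry's size up to a whole number of blocks.
# $1 = archive, $2 = assumed block size in bytes (default 4096).
tar_disk_estimate() {
    archive=$1
    block=${2:-4096}
    tar -tzvf "$archive" |
        awk -v bs="$block" '
            { sum += int(($3 + bs - 1) / bs) * bs }   # round up to block multiple
            END { printf "%.0f\n", sum }
        '
}
```

For an archive of many tiny files, comparing tar_disk_estimate archive.tar.gz with the plain byte sum gives a feel for how large the gap can get.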