gzip tar

Check the total content size of a tar gz file


How can I get the total size of the uncompressed file data in a .tar.gz file from the command line?


Solution

  • This will sum the total content size of the extracted files:

    $ tar tzvf archive.tar.gz | sed 's/ \+/ /g' | cut -f3 -d' ' | sed '2,$s/^/+ /' | paste -sd' ' | bc
    

    The output is given in bytes.

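    Explanation: tar tzvf lists the files in the archive in a verbose format similar to ls -l. The first sed squeezes each run of spaces into a single space (\+ is a GNU sed extension), and cut then picks the third space-separated field, which is the file size in GNU tar's listing. The second sed puts a + in front of every size except the first, and paste joins the lines into a single sum expression that is then evaluated by bc.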

    Note that this counts only the file contents, not filesystem metadata or the rounding of each file up to whole blocks, so the disk space taken up by the files when you extract them is going to be larger - potentially many times larger if you have a lot of very small files.
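
As a rough sketch of the same idea, the pipeline above can be collapsed into a single awk call, and the note about disk usage can be approximated by rounding each file up to a whole block. Both function names are hypothetical, the size is assumed to be the third field (as in GNU tar's verbose listing; other tar implementations may place it elsewhere), and the 4096-byte default block size is an assumption about the target filesystem. The estimate still ignores inode and directory overhead.

```shell
# Hypothetical helper: print the summed content size, in bytes, of all
# members of a .tar.gz archive.
# Assumes GNU tar's verbose listing, where the size is the third field.
tar_content_size() {
    tar tzvf "$1" | awk '{ total += $3 } END { printf "%.0f\n", total }'
}

# Hypothetical helper: rough on-disk estimate, rounding every member up
# to a multiple of the block size (second argument, assumed 4096 bytes).
# Ignores inode and directory overhead, so it is only a lower-bound sketch.
tar_disk_estimate() {
    tar tzvf "$1" | awk -v bs="${2:-4096}" '
        { total += int(($3 + bs - 1) / bs) * bs }
        END { printf "%.0f\n", total }'
}
```

Usage would look like `tar_content_size archive.tar.gz` or `tar_disk_estimate archive.tar.gz 512`; as with the original command, the output is in bytes.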