I am trying to read the contents of a compressed archive using the command tar tf abc.tar.xz
. Because the file is 1 TB in size, this takes a lot of time. I am not very familiar with bash scripting. I have tried other commands as well, such as zcat 3532642.tar.gz | more
and tar tf 3532642.tar.xz | grep --regex="folder1/folder2/folder3/folder4/"
and
tar tvf 3532642.tar.xz --to-command \
'grep --label="$TAR_FILENAME" -H folder1/folder2/folder3/folder4/ ; true'
But I don't find much difference among them in the time they take to list the archive's contents.
Does anyone know how I can process such a huge compressed file in the minimum time? Any help would be appreciated!
As rrauenza mentions, since pigz may not work for the xz format, there is a similar tool, pixz, for parallel, indexed xz compression/decompression.
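One way to hook a parallel decompressor into the listing step is GNU tar's -I/--use-compress-program option. A minimal sketch (the demo builds a tiny .tar.xz and lists it through plain xz; with pixz installed, substitute pixz for xz to use all cores):

```shell
# Demo: create a small .tar.xz, then list it through an external
# decompressor via GNU tar's -I/--use-compress-program hook.
printf 'data\n' > f.txt
tar -cJf demo.tar.xz f.txt

# With pixz installed, substitute "pixz" for "xz" here to decompress
# in parallel on all cores:
tar --use-compress-program=xz -tf demo.tar.xz
# prints: f.txt
```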
From the man page it is evident that pigz compresses/decompresses using threads to make use of multiple processors and cores.
Similar to pigz, this command also provides an option to specify the number of threads to run in parallel across cores for maximum performance:
-p --processes n
Allow up to n processes (default is the number of online processors)
Alternatively, you can get the number of cores manually with the command getconf _NPROCESSORS_ONLN and pass that value to -p.
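As a sketch, the core count can be derived and passed to pixz explicitly (this assumes pixz is installed and reuses the archive name from the question):

```shell
# Determine the number of online processors to use as the thread count.
NPROC=$(getconf _NPROCESSORS_ONLN)
echo "using $NPROC threads"

# Decompress on all cores and pipe the tar stream to tar for listing.
# -d: decompress, -p: number of parallel processes.
pixz -d -p "$NPROC" < 3532642.tar.xz | tar t
```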
More details, including how to download and install, are on the pixz GitHub page.
(or)
Going with a tar-only solution, this works only if the file name inside the archive is known in advance:
tar -zxOf <tar-archive> <file-name-inside-tar>
Note that -z applies to gzip archives (.tar.gz); for .tar.xz archives, use -J instead. The options are as follows:
-f, --file=ARCHIVE
use archive file or device ARCHIVE
-z, --gzip
filter the archive through gzip
-x, --extract, --get
extract files from an archive
-O, --to-stdout
extract files to standard output
It may not be as fast as pixz, but it nevertheless does the job.
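For example, with a gzip-compressed archive (the demo below builds a tiny one; in practice substitute the real archive and member path):

```shell
# Demo: build a small .tar.gz, then stream a single known member to
# stdout without extracting anything else to disk.
printf 'sample data\n' > member.txt
tar -czf demo.tar.gz member.txt

# -z: gzip filter, -x: extract, -O: to stdout, -f: archive file.
# Note the archive name follows -f, then the member name.
tar -zxOf demo.tar.gz member.txt
# prints: sample data
```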