bash command-line gzip tar csplit

Pipe output to zipped tar after csplit


So, I have the following situation:

A program which produces a large (must be zipped) set of outputs, as follows:

line00
line01
...
line0N
.
line10
line11
...
line1M
.
...

I generate this content and zip it with:

./my_cmd | gzip -9 > output.gz

What I would like to do is, in pseudo code:

./my_cmd \
| csplit --prefix=foo - '/^\.$/+1' '{*}' \  # <-- this will just create files
| tar -cf ??? \                 # <-- don't know how to link the files to tar
| gzip -9 > output.tar.gz

Ideally, nothing unzipped ever gets on the hard drive.

In summary: my objective is to end up with a set of files on the hard drive, split at the delimiter and compressed, without any intermediate uncompressed read/write step.

If I can't do this with tar/gzip/csplit, then maybe something else?
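
For what it's worth, something along these lines would also be acceptable. It is just a sketch (the two-digit numbering scheme is made up): awk splits the stream on the delimiter and pipes each chunk straight into its own gzip, so nothing uncompressed ever touches the disk, at the cost of getting separate .gz files instead of one tar.gz:

./my_cmd | awk -v pre=foo '
    /^\.$/ {                                        # delimiter line
        if (cmd != "") { print | cmd; close(cmd) }  # keep the "." in the chunk, like csplit +1
        cmd = ""
        next
    }
    {
        if (cmd == "")                              # first line of a new chunk: open a gzip writer
            cmd = sprintf("gzip -9 > %s%02d.gz", pre, n++)   # sketch: two-digit numbering is arbitrary
        print | cmd                                 # stream the line into gzip
    }
    END { if (cmd != "") close(cmd) }
'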


Solution

  • Tar can handle the compression itself.

    ./my_cmd | csplit --prefix=foo - '/^\.$/+1' '{*}'   # writes the foo?? files
    
    printf "%s\n" foo[0-9][0-9] | tar czf output.tar.gz -T -
    rm -f foo[0-9][0-9]  # clean up the temps     
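
    If you're on GNU tar, a small variant (just a sketch) lets tar delete the
    temp files itself once they are in the archive, so the separate rm goes away:

    printf "%s\n" foo[0-9][0-9] | tar czf output.tar.gz --remove-files -T -   # GNU tar: removes each foo?? after adding it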
    

    If that's just not good enough, and you REALLY need that -9 compression,

    printf "%s\n" foo[0-9][0-9] |
        tar cf - -T -           |
        gzip -9 > output.tar.gz
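
    With a reasonably recent GNU tar you can also keep that as a single command by
    telling tar which compressor to run (a sketch; older versions only accept a bare
    program name here):

    printf "%s\n" foo[0-9][0-9] |
        tar --use-compress-program='gzip -9' -cf output.tar.gz -T -   # GNU tar; assumes a version that accepts compressor arguments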
    

    Then you should be able to extract individual files from the archive and handle them one at a time.

    tar xvOf output.tar.gz foo00 | wc -l
    

    That lets you keep the archive compressed, but pull out chunks to work on without writing them to disk.
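
    And if you want to step through every chunk that way, something like this sketch
    works (it re-reads the archive once per member, which is usually fine here):

    tar tzf output.tar.gz | while IFS= read -r member; do
        tar xzOf output.tar.gz "$member" | wc -l   # swap wc -l for the real per-chunk work
    done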