shell, logging, command-line, zip

Find, unzip and grep the content of multiple files in one step/command


I first asked a related question here: "Unzip a file and then display it in the console in one step". The answer works and helped me a lot (please read it).

Now I have a second problem. I do not have a single zipped log file but many of them in different folders, which I need to find first. The files all have the same name. For example:

/somedir/server1/log.gz
/somedir/server2/log.gz
/somedir/server3/log.gz

and so on...

What I need is a way to:

  1. find all the files like: find /somedir/server* -type f -name log.gz
  2. unzip the files like: gunzip -c log.gz
  3. use grep on the content of the files

Important! All of this must be done in one step. I cannot first store the extracted files in the filesystem because it is read-only. I need to connect the commands with pipes, so that the output of one command becomes the input of the next.

Previously the log files were in text format (.txt), so I did not have to unzip them first. In that case it was easy, e.g.: find /somedir/server* -type f -name log.txt | xargs grep "term"

Now I have to deal with zipped files. That means that after I find the files, I first need to unzip them and then send the contents to grep. With one file I do: gunzip -c /somedir/server1/log.gz | grep term But for multiple files I don't know how to do it. For example, how do I pass the output of find to gunzip and then to grep?

Also, if there is another way or a "best practice" for doing this, it is welcome :)


Solution

  • find lets you invoke a command on the files it finds:

    find /somedir/server* -type f -name log.gz -exec gunzip -c '{}' + | grep ...
    

    From the man page:

    -exec command {} +

    This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending each selected file name at the end; the total number of invocations of the command will be much less than the number of matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of {} is allowed within the command, and (when find is being invoked from a shell) it should be quoted (for example, '{}') to protect it from interpretation by shells. The command is executed in the starting directory. If any invocation with the + form returns a non-zero value as exit status, then find returns a non-zero exit status. If find encounters an error, this can sometimes cause an immediate exit, so some pending commands may not be run at all. This variant of -exec always returns true.
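One thing to be aware of with the pipeline above: gunzip -c writes the contents of all matched files to a single stream, so grep cannot report which server's log a match came from. If that matters, a minimal sketch along the following lines may help. It still writes nothing to disk. The function name search_gz_logs and its argument order are made up for illustration and are not part of the original answer.

```shell
#!/bin/sh
# search_gz_logs is a hypothetical helper name, not part of the original answer.
# Usage: search_gz_logs TERM DIR...
# Example call with the paths from the question:
#   search_gz_logs "term" /somedir/server*
search_gz_logs() {
    term=$1
    shift
    # find hands batches of matching files to a small inline script.
    find "$@" -type f -name log.gz -exec sh -c '
        term=$1
        shift
        for f do
            # Decompress to a pipe (nothing is written to disk), filter,
            # then prefix each matching line with the file it came from.
            gunzip -c "$f" | grep -e "$term" |
            while IFS= read -r line; do
                printf "%s:%s\n" "$f" "$line"
            done
        done
    ' sh "$term" {} +
}
```

Each matching line is printed as path:line, similar to what grep prints when given several plain-text files. Where the zgrep wrapper that ships with gzip is installed, passing it to -exec in place of gunzip -c is probably a shorter route to a similar result, but that is an assumption about the target system.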