linux bash shell zip unzip

How can I recursively unzip files that match a pattern in a group of nested ZIP files?


I have a bunch of zip files, with several levels of zip files inside them.

I only want to extract files that match this pattern: "_dnbr6.tif". They can be located at different levels of the zip/directory hierarchy.

One extra hint is that the files I'm looking for are located in directories or zip files (inside other zip files) with names like "fire_bundle". The letter case of that name varies between zip files and directories.

How can I do it without extracting all the files of each of the nested zip files?

I guess the idea is to:

I still don't know how to do the recursive part, so any help is appreciated.


Solution

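  • Here is a script that recursively searches all the .zip files nested within the starting .zip file. At each level it extracts only the nested .zip files and any file whose name ends in "_dnbr6.tif", then deletes each nested .zip file once it has been processed.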

    #!/bin/bash
    
    targetFile="_dnbr6.tif"
    sandboxDir="sandbox"
    
    usage () {
      cat <<EOF
    Usage: $(basename "$0") zipfile
    
    This script extracts all the target files (*$targetFile) from the provided
    .zip file, including any that are embedded in .zip files nested within it.
    It puts all the files in a sandbox directory named $sandboxDir. This directory
    must not already exist.
    EOF
    }
    
    if [ -z "$1" ]; then
      usage
      exit 1
    fi
    
    if [ -d $sandboxDir ]; then
      echo "Error, the sandbox directory $sandboxDir already exists."
      echo
      usage
      exit 1
    fi
    
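    # Resolve the archive path before changing directory, so that both relative
    # and absolute paths work (realpath is part of GNU coreutils).
    zipPath=$(realpath "$1")
    mkdir "$sandboxDir"
    cd "$sandboxDir" || exit 1
    startDir=$(pwd)
    
    # Extract the initial .zip file looking for either the target file or additional .zip files.
    unzip -q "$zipPath" "*.zip" "*$targetFile" 2>&1 | grep -F -v "filename not matched"   # Suppress the "filename not matched" message.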
    
    # Loop until there aren't any .zip files anymore.
    while true; do
      # Read NUL-delimited names so that paths containing spaces stay intact
      # (mapfile -d requires bash 4.4 or later).
      mapfile -d '' -t zipFiles < <(find . -type f -name '*.zip' -print0)
      [ "${#zipFiles[@]}" -eq 0 ] && break
      #
      # For each .zip file, extract any .zip files and the target file.
      for zipFile in "${zipFiles[@]}"; do
        cd "$(dirname "$zipFile")" || exit 1
        unzip -q "$startDir/$zipFile" "*.zip" "*$targetFile" 2>&1 | grep -F -v "filename not matched"
        rm "$startDir/$zipFile"
        cd "$startDir" || exit 1
      done
    done
    
    echo "Files matching '*$targetFile' were extracted here:"
    cd ..
    # Match on the suffix, the same way the unzip patterns above do.
    find "$sandboxDir" -name "*$targetFile"
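
Since the question mentions that letter case varies between archives, it may help to inspect an archive's entry list before extracting anything. The following is only a rough sketch, not part of the script above: the function names (`lower`, `is_target`, `is_nested_zip`, `list_wanted`) are made up for illustration, and it assumes Info-ZIP's `unzip`, whose `-Z1` mode prints one entry name per line without extracting.

```shell
#!/bin/bash
# Hypothetical helper names; they do not appear in the script above.

# Print the argument in lowercase so comparisons ignore letter case.
lower () {
  printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Succeed if the entry name ends in the target suffix, in any letter case.
is_target () {
  case $(lower "$1") in
    *_dnbr6.tif) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed if the entry is itself a .zip archive, in any letter case.
is_nested_zip () {
  case $(lower "$1") in
    *.zip) return 0 ;;
    *) return 1 ;;
  esac
}

# List the entries of one archive that are worth extracting.
# "unzip -Z1" prints entry names only; nothing is written to disk.
list_wanted () {
  unzip -Z1 "$1" | while IFS= read -r entry; do
    if is_target "$entry" || is_nested_zip "$entry"; then
      printf '%s\n' "$entry"
    fi
  done
}
```

Calling `list_wanted some.zip` would then show which entries deserve extraction at that level. Nested archives still have to be extracted before they can be listed, because `unzip` needs a file on disk. If only the extraction patterns need to ignore case, adding `-C` to the `unzip` calls in the script should have a similar effect, and `find -iname` would be the matching change for the `find` calls.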