windows powershell merge gzip tar

How to untar multiple files with extensions .tar.gz.aa, .tar.gz.ab, ... in Windows?


How do I untar multiple files with extensions .tar.gz.aa, .tar.gz.ab, ... through .tar.gz.an, each around 10 GB, in Windows?

I've tried the following commands in PowerShell (with admin rights):

cat <name>.tar.gz.aa | tar xzvf -

cat : Exception of type 'System.OutOfMemoryException' was thrown.
At line:1 char:1
+ cat <name>.tar.gz.aa | tar xzvf –
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Get-Content], OutOfMemoryException
    + FullyQualifiedErrorId : System.OutOfMemoryException,Microsoft.PowerShell.Commands.GetContentCommand

cat *.tar.gz.* | zcat | tar xvf -

zcat : The term 'zcat' is not recognized as the name of a cmdlet, function, script file, or operable program. Check
the spelling of the name, or if a path was included, verify that the path is correct and try again.
At line:1 char:18
+ cat *.tar.gz.* | zcat | tar xvf -
+                  ~~~~
    + CategoryInfo          : ObjectNotFound: (zcat:String) [], CommandNotFoundException
    + FullyQualifiedErrorId : CommandNotFoundException

Thanks in advance! I'd also be happy to hear of any solutions for Linux, in case anyone else is facing the same difficulty.


Solution

  • In PowerShell, cat is an alias for Get-Content, so you are reading the contents of a single file and then trying to pipe the parsed content to tar. That is what causes the OutOfMemoryException. Get-Content is designed to read ASCII and Unicode text files, not binary files, and certainly not 10 GB ones. Even with enough memory available, I don't know how well Get-Content would perform on single files that large.

    Instead, pass the file path directly to tar, adding any other arguments you need (for example, to control the output directory):

    tar xvzf "$name.tar.gz.aa"
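
    The .aa through .an suffixes look like the output of the Unix split utility. That is an assumption, since the question does not say how the files were made. If it holds, the files are pieces of one archive rather than separate archives: only the first piece starts with a gzip header, so tar cannot fully extract any single piece on its own, and the pieces have to be joined in name order first. Since the question also asks about Linux, here is a minimal sketch for a POSIX shell (Linux, WSL, or Git Bash), where cat copies bytes unchanged instead of parsing text. The function name extract_split_archive and its prefix argument are hypothetical, chosen for illustration:

    ```shell
    # Sketch: join split pieces in name order and extract the combined stream.
    # Assumes the pieces came from `split` and share a common prefix.
    # extract_split_archive is a hypothetical helper name.
    extract_split_archive() {
      prefix=$1   # e.g. "name.tar.gz." including the trailing dot
      # The glob expands in sorted order (aa, ab, ..., an), so cat emits
      # the pieces in the order split wrote them; tar reads from stdin (-).
      cat "$prefix"a[a-n] | tar xzvf -
    }
    ```

    Called as extract_split_archive name.tar.gz. (note the trailing dot), this extracts into the current directory without first writing the joined archive to disk.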
    

    If each file is a complete archive in its own right, you can extract them all in one go with a loop (with some helpful output and result checking). This code also runs in PowerShell Core, so it should work on Linux as well:

    Push-Location $pathToFolderWithGzips
    
    try {
      ( Get-ChildItem -File *.tar.gz.a[a-n] ).FullName | ForEach-Object {
        Write-Host "Extracting $_"
        tar xzf $_
      
        if( $LASTEXITCODE -ne 0 ) {
          Write-Warning "tar returned $LASTEXITCODE"
        }
      }
    } finally {
      Pop-Location
    }
    

    Let's break this down: