python google-cloud-storage 7zip

How to run a one-off Python unzipping script (19 GB .7z archive) on Google Cloud?


I need to extract files from the archive on a one-off basis. The Python code that can handle the task is very simple:

from py7zr import unpack_7zarchive
import shutil

shutil.register_unpack_format('7zip', ['.7z'], unpack_7zarchive)
path = 'gs://my_bucket/my_archive.7z'
shutil.unpack_archive(path, '')

I need it to run in the cloud because the archive is huge (19 GB).

There are many unzipping solutions listed here, but all of the Python ones run as Cloud Functions, which have a 16 GB memory limit.


Solution

  • Posting as a community wiki as per @JohnHanley's comment:

    The file is too big for Cloud Shell (max total space 5 GB). The only Google service that I recommend for your use case is Compute Engine.
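
    As a rough illustration of that recommendation, here is a minimal sketch of what the one-off script could look like when run on a Compute Engine VM. It assumes the VM has the google-cloud-storage and py7zr packages installed, a service account that can read the bucket, and a disk with enough free space for both the 19 GB archive and its extracted contents. The function names and the work directory are hypothetical. shutil.unpack_archive only works on local filesystem paths, so the sketch copies the object to local disk first instead of passing the gs:// path directly.

```python
from pathlib import Path


def parse_gs_uri(uri: str) -> tuple[str, str]:
    """Split 'gs://bucket/path/to/object' into (bucket, object_name)."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a gs:// URI: {uri!r}")
    bucket, _, name = uri[len("gs://"):].partition("/")
    if not bucket or not name:
        raise ValueError(f"missing bucket or object name: {uri!r}")
    return bucket, name


def download_and_extract(uri: str, work_dir: str) -> Path:
    """Download a .7z object to local disk and extract it (hypothetical helper)."""
    # Third-party imports are kept local so parse_gs_uri works without them.
    from google.cloud import storage  # pip install google-cloud-storage
    import py7zr                      # pip install py7zr

    work = Path(work_dir)
    work.mkdir(parents=True, exist_ok=True)
    bucket_name, blob_name = parse_gs_uri(uri)
    local_archive = work / Path(blob_name).name
    out_dir = work / "extracted"

    # Copy the archive from the bucket to the VM's disk.
    client = storage.Client()
    client.bucket(bucket_name).blob(blob_name).download_to_filename(str(local_archive))

    # Extract from the local copy.
    with py7zr.SevenZipFile(str(local_archive), mode="r") as archive:
        archive.extractall(path=str(out_dir))
    return out_dir


# Example call on the VM (the work directory is an assumed mount point):
# download_and_extract("gs://my_bucket/my_archive.7z", "/mnt/disks/work")
```

    If the extracted files need to end up back in the bucket, they would have to be uploaded in a separate step; the sketch stops at local extraction.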