python, google-cloud-platform, google-cloud-storage

Walking a directory tree inside a Google Cloud Platform bucket in Python


For directories on a local machine, the os.walk() function is commonly used for walking a directory tree in Python.

Google has a Python module (google.cloud.storage) for uploading to and downloading from a GCP bucket in a locally-run Python script.

I need a way to walk directory trees in a GCP bucket. I browsed through the classes in the google.cloud Python module, but could not find anything. Is there a way to perform something similar to os.walk() on directories inside a GCP bucket?


Solution

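  • The GCS library has no direct equivalent of os.walk(). GCS uses a flat namespace, so a "directory" is just a shared prefix in object names. Listing objects by prefix returns everything under that path, at any depth, which is usually close enough: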

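    from google.cloud import storage
    
    bucket_name = "my-bucket"  # placeholder: replace with your bucket's name
    bucket = storage.Client().get_bucket(bucket_name)
    # Yields every object whose name starts with "dir1/", at any depth.
    for blob in bucket.list_blobs(prefix="dir1/"):
        print(blob.name)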
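
If you want the (dirpath, dirnames, filenames) tuples that os.walk() produces, one option is to group the flat object names yourself. The following is only a minimal sketch, not part of the GCS library: walk_blob_names is a hypothetical helper name, and it assumes object names use "/" as the separator and that names ending in "/" are empty folder placeholders.

```python
from collections import defaultdict


def walk_blob_names(names, top=""):
    """Yield (dirpath, dirnames, filenames) tuples, top-down, from flat object names.

    Hypothetical helper: groups names by their "/"-separated prefixes.
    """
    top = top.strip("/")
    subdirs = defaultdict(set)   # directory path -> names of its child directories
    files = defaultdict(list)    # directory path -> names of the files directly in it

    for name in names:
        parent, _, leaf = name.rpartition("/")
        if leaf:  # an empty leaf means the name ended in "/" (folder placeholder)
            files[parent].append(leaf)
        # Register each ancestor directory with its parent, stopping at `top`.
        while parent and parent != top:
            grandparent, _, dirname = parent.rpartition("/")
            subdirs[grandparent].add(dirname)
            parent = grandparent

    # Depth-first traversal starting at `top`, in sorted order.
    stack = [top]
    while stack:
        current = stack.pop()
        dirnames = sorted(subdirs.get(current, ()))
        yield current, dirnames, sorted(files.get(current, ()))
        for d in reversed(dirnames):
            stack.append(f"{current}/{d}" if current else d)


# Example with hard-coded names, so it runs without a bucket.
example_names = [
    "dir1/a.txt",
    "dir1/sub/b.txt",
    "dir1/sub/deep/c.txt",
    "dir1/other/",
]
for dirpath, dirnames, filenames in walk_blob_names(example_names, top="dir1/"):
    print(dirpath, dirnames, filenames)
```

With a real bucket, the names could come from the listing above, for example walk_blob_names((blob.name for blob in bucket.list_blobs(prefix="dir1/")), top="dir1/"). Note that this reads the whole listing before yielding anything, so it may be slow for very large prefixes.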