I start with
client = storage.Client()
bucket = client.get_bucket(BUCKET_NAME)
# what's next? I need something like client.list_folders(path)
I know how to:
- list all the blobs (including blobs in sub-sub-sub-folders, of any depth) with bucket.list_blobs()
- list all the blobs recursively under a given folder with bucket.list_blobs(prefix=<path to subfolder>)

But what if my file-system-like structure has 100 top-level folders, each holding thousands of files? Is there an efficient way to get only those 100 top-level folder names without listing all the blobs inside them?
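For completeness, the prefix-based listing I mentioned can be sketched like this (a sketch only; the function name and arguments are mine, and it needs the google-cloud-storage package plus application credentials to actually run):

```python
def list_blobs_under(bucket_name, prefix):
    # Requires the google-cloud-storage package and application credentials,
    # so the import is kept local to the function.
    from google.cloud import storage

    client = storage.Client()
    # The prefix filter is applied server-side, but the listing still
    # returns every blob under that prefix, at any depth.
    return [blob.name for blob in client.list_blobs(bucket_name, prefix=prefix)]
```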
Google Cloud Storage does not actually have folders or subdirectories; the namespace is flat, and the client library just creates the illusion of a hierarchical file tree from the "/" characters in object names. So there is no folder object to list, only blobs whose names share a prefix.
I used this simple code:

from google.cloud import storage

storage_client = storage.Client()
# list_blobs takes a bucket name (or Bucket object), not a project name
blobs = storage_client.list_blobs('my-bucket')

res = set()  # a set makes the membership check O(1)
for blob in blobs:
    res.add(blob.name.split('/', 1)[0])
print(res)

Note that this still iterates over every blob in the bucket; the deduplication only happens client-side.
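For large buckets there is also a server-side option: list_blobs accepts a delimiter argument, and with delimiter="/" the service collapses everything below the first "/" into prefixes, so only the top-level names come back instead of every blob. A sketch of both halves, where the pure helper mimics the prefix computation the server performs and the bucket name is a placeholder:

```python
def top_level_prefixes(names, delimiter="/"):
    """Mimic what GCS returns in ``blobs.prefixes`` for a delimiter listing."""
    prefixes = set()
    for name in names:
        head, sep, _ = name.partition(delimiter)
        if sep:  # only names that actually contain the delimiter become prefixes
            prefixes.add(head + delimiter)
    return prefixes


def list_top_level_folders(bucket_name):
    # Requires the google-cloud-storage package and application credentials.
    from google.cloud import storage

    client = storage.Client()
    blobs = client.list_blobs(bucket_name, delimiter="/")
    for _ in blobs:  # prefixes are only populated as the pages are consumed
        pass
    return blobs.prefixes  # e.g. {"folder1/", "folder2/", ...}
```

With delimiter set, the grouping happens on the server, so you avoid transferring metadata for every object just to recover 100 folder names.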