Google Cloud Storage streaming upload from Python generator


I have a Python generator that will yield a large and unknown amount of byte data. I'd like to stream the output to GCS, without buffering to a file on disk first.

While I'm sure this is possible (e.g., I can create a subprocess of gsutil cp - <...> and just write my bytes into its stdin), I'm not sure what the recommended/supported way is, and the documentation only gives an example of uploading a local file.

How should I do this right?


Solution

  • The BlobWriter class makes this a bit easier:

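    from google.cloud import storage
    from google.cloud.storage.fileio import BlobWriter

    storage_client = storage.Client()
    bucket = storage_client.bucket('my_bucket')
    blob = bucket.blob('my_object')
    writer = BlobWriter(blob)

    # Data is buffered in memory and sent in chunks; nothing is written to disk.
    for d in your_generator:
        writer.write(d)

    # Closing the writer finalizes the upload.
    writer.close()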
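
A variant of the same idea, as a rough sketch: blob.open("wb") returns a BlobWriter and can be used as a context manager, so the writer is closed when the block exits. The helper names stream_chunks and upload_generator are hypothetical, not part of the library, and upload_generator assumes default credentials and an existing bucket. The write loop is kept separate so it can be tried against an in-memory buffer first.

```python
import io
from typing import BinaryIO, Iterable


def stream_chunks(chunks: Iterable[bytes], sink: BinaryIO) -> int:
    """Write each chunk to sink as it is produced; return total bytes written."""
    total = 0
    for chunk in chunks:
        sink.write(chunk)
        total += len(chunk)
    return total


def upload_generator(bucket_name: str, object_name: str,
                     chunks: Iterable[bytes]) -> int:
    """Hypothetical helper: stream chunks into a GCS object."""
    # Imported lazily so the local check below runs without the library.
    from google.cloud import storage

    blob = storage.Client().bucket(bucket_name).blob(object_name)
    # blob.open("wb") returns a BlobWriter; leaving the with-block closes it.
    with blob.open("wb") as writer:
        return stream_chunks(chunks, writer)


# Local check: an in-memory buffer stands in for the BlobWriter.
buffer = io.BytesIO()
written = stream_chunks((b"abc" for _ in range(3)), buffer)
```

With real credentials, the call would look like upload_generator('my_bucket', 'my_object', your_generator).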