python google-cloud-storage libcloud

Object metadata keys are lowercased when uploading to GCS with Apache Libcloud


I'm using Apache Libcloud to upload files to a Google Cloud Storage bucket together with object metadata.

In the process, the keys in my metadata dict are being lowercased. I'm not sure whether this is due to Cloud Storage or whether this happens in Libcloud.

The issue can be reproduced following the example from the Libcloud docs:

from libcloud.storage.types import Provider
from libcloud.storage.providers import get_driver

cls = get_driver(Provider.GOOGLE_STORAGE)
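driver = cls('SA-EMAIL', './SA.json')  # provide service account credentials here

# Bucket to upload into ('my-bucket' is a placeholder name)
container = driver.get_container(container_name='my-bucket')

FILE_PATH = '/home/user/file'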

extra = {'meta_data': {'camelCase': 'foo'}}

# Upload with metadata
with open(FILE_PATH, 'rb') as iterator:
    obj = driver.upload_object_via_stream(iterator=iterator,
                                          container=container,
                                          object_name='file',
                                          extra=extra)

The file uploads successfully, but in the resulting metadata camelCase has been turned into camelcase.
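
To check this without reading the console by eye, the metadata that was sent can be compared with what was stored. The following is only a minimal sketch: `lowercased_keys` is a hypothetical helper (not part of Libcloud), and the `received` dict is hard-coded here to mirror what I see after the upload.

```python
def lowercased_keys(sent, received):
    """Return the keys from `sent` that came back only in lowercased form."""
    changed = {}
    for key in sent:
        lowered = key.lower()
        # A key counts as altered when its original spelling is missing
        # but its lowercase form is present in the stored metadata.
        if key not in received and lowered in received:
            changed[key] = lowered
    return changed


sent = {'camelCase': 'foo'}
received = {'camelcase': 'foo'}  # stored metadata as observed after the upload
print(lowercased_keys(sent, received))  # prints {'camelCase': 'camelcase'}
```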

I don't think GCS disallows camelCase keys in object metadata, since it's possible to edit the metadata manually to use such keys.

I went through Libcloud's source code, but I don't see any explicit lowercasing going on. Any pointers on how to upload camelcased metadata with libcloud are most welcome.


Solution

  • I also checked the library and couldn't see anything obvious. Opening a new issue with the Libcloud project would be a good start.

    On the Google Cloud Storage side, as you verified yourself, camelCase keys are accepted. I was able to edit a file's metadata successfully using the code from their public docs (though I couldn't find a way to do it with libcloud itself):

    from google.cloud import storage
    
    
    def set_blob_metadata(bucket_name, blob_name):
        """Set a blob's metadata."""
        # bucket_name = 'your-bucket-name'
        # blob_name = 'your-object-name'
    
        storage_client = storage.Client()
        bucket = storage_client.bucket(bucket_name)
        blob = bucket.get_blob(blob_name)
        metadata = {'camelCase': 'foo', 'NaMe': 'TeSt'}
        blob.metadata = metadata
        blob.patch()
    
        print("The metadata for the blob {} is {}".format(blob.name, blob.metadata))
    

    So I believe this could be a good workaround in your case if you can't get it to work with libcloud. Note that the Cloud Storage client libraries authenticate through environment variables by default (typically GOOGLE_APPLICATION_CREDENTIALS), so follow the authentication setup described in their docs.

    Addition by question author: As hinted at in the comments, metadata can be added to a blob before uploading a file as follows:

    from google.cloud import storage
    gcs = storage.Client()
    bucket = gcs.get_bucket('my-bucket')
    blob = bucket.blob('document')
    blob.metadata = {'camelCase': 'foobar'}
    # Use a context manager so the file handle is closed after the upload
    with open('/path/to/document', 'rb') as f:
        blob.upload_from_file(f)
    

    This sets the metadata without having to patch an existing blob, and provides an effective workaround for the issue with libcloud.
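
If the workaround is needed in several places, it could be wrapped in a small function. This is only a sketch under assumptions: `upload_with_metadata` is a hypothetical helper name, `bucket` is expected to be a `google.cloud.storage` bucket object, and the bucket name and paths in the usage comment are placeholders.

```python
def upload_with_metadata(bucket, object_name, file_path, metadata):
    """Create a blob, attach metadata to it, then upload the file."""
    blob = bucket.blob(object_name)
    # Copy so later changes to the caller's dict do not affect the blob.
    blob.metadata = dict(metadata)
    blob.upload_from_filename(file_path)
    return blob


# Usage (requires google-cloud-storage and valid credentials):
# from google.cloud import storage
# bucket = storage.Client().bucket('my-bucket')
# upload_with_metadata(bucket, 'document', '/path/to/document',
#                      {'camelCase': 'foobar'})
```

Passing the bucket in, rather than creating the client inside the function, keeps authentication separate from the upload and makes the helper easy to exercise with a stand-in object.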