Tags: django, amazon-web-services, amazon-s3, filefield

How to use Django FileField with dynamic Amazon S3 bucket?


I have a Django model with a FileField, and a default storage backed by an Amazon S3 bucket (via the excellent django-storages).

My problem is not how to upload files to a dynamic folder path (as covered in many other answers). My problem is deeper and twofold:

Any idea?

(Django 1.11, Python 3).


Solution

  • It turns out it is not that difficult. The code below has not been tested much, though, so please do not copy-paste it without checking!

    I have created a custom FileField subclass:

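    from django.db import models
    from storages.backends.s3boto3 import S3Boto3StorageFile

    # DynamicS3BucketFileDescriptor (shown further down) must be defined before this class.
    class DynamicS3BucketFileField(models.FileField):
        attr_class = S3Boto3StorageFile
        descriptor_class = DynamicS3BucketFileDescriptor

        def pre_save(self, model_instance, add):
            # Return the current value as-is, without calling file.save().
            return getattr(model_instance, self.attname)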
    

    Note that the attr_class is specifically using the S3Boto3StorageFile class (a File subclass provided by django-storages).

    The pre_save override has a single goal: to avoid the internal file.save call that would attempt to re-upload the file.
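
    To make the effect of that override concrete, here is a minimal, framework-free sketch. FakeFieldFile, BaseField and NoUploadField are hypothetical stand-ins (not Django classes); BaseField.pre_save only mimics the rough shape of Django's FileField.pre_save, which saves the file when it is not yet committed.

```python
class FakeFieldFile:
    """Hypothetical stand-in for a file attached to a model field."""

    def __init__(self, name, committed):
        self.name = name
        self._committed = committed
        self.uploads = 0  # counts how many times save() was called

    def save(self):
        # In a real storage backend this is where the upload would happen.
        self.uploads += 1
        self._committed = True


class BaseField:
    """Mimics the rough shape of FileField.pre_save (assumption, simplified)."""

    attname = "file"

    def pre_save(self, model_instance, add):
        file = getattr(model_instance, self.attname)
        if file and not file._committed:
            file.save()
        return file


class NoUploadField(BaseField):
    """Same idea as the override above: hand back the value untouched."""

    def pre_save(self, model_instance, add):
        return getattr(model_instance, self.attname)


class Record:
    """Hypothetical model instance holding one uncommitted file."""

    def __init__(self):
        self.file = FakeFieldFile("raw/a.fits", committed=False)


uploaded = BaseField().pre_save(Record(), add=True)
skipped = NoUploadField().pre_save(Record(), add=True)
print(uploaded.uploads, skipped.uploads)
```

    With the base behaviour the uncommitted file is saved once; with the override it is returned untouched, which is what the field above relies on.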

    The magic happens inside the FileDescriptor subclass:

    from django.db.models.fields.files import FileDescriptor
    from django.utils import six  # available in Django 1.11; removed in Django 3.0

    class DynamicS3BucketFileDescriptor(FileDescriptor):
        def __get__(self, instance, cls=None):
            if instance is None:
                return self

            # Copied from FileDescriptor.
            if self.field.name in instance.__dict__:
                file = instance.__dict__[self.field.name]
            else:
                instance.refresh_from_db(fields=[self.field.name])
                file = getattr(instance, self.field.name)

            # Turn a callable storage into a Storage instance.
            # Caveat: the field object is shared by every instance of the model,
            # so after this assignment the storage resolved for the first
            # instance is reused for all later ones in the same process.
            if callable(self.field.storage):
                self.field.storage = self.field.storage(instance)

            # The file can be a string here (depending on when/how we access the field).
            if isinstance(file, six.string_types):
                # Instantiate the file following the S3Boto3StorageFile constructor.
                file = self.field.attr_class(file, 'rb', self.field.storage)
                # Cache it on the instance, as FileDescriptor does (see the final 'return').
                instance.__dict__[self.field.name] = file

            # Copied from FileDescriptor. The difference here is that these 3
            # attributes are always set, unconditionally.
            file.instance = instance
            file.field = self.field
            file.storage = self.field.storage
            # Also attach a handy url attribute to the file.
            file.url = self.field.storage.url(file.name)

            return instance.__dict__[self.field.name]

    

    The code above adapts some of FileDescriptor's internal code to my case. Note the if callable(self.field.storage): check, explained below.

    The key line is: file = self.field.attr_class(file, 'rb', self.field.storage), which creates a valid S3Boto3StorageFile instance when the current value of the field is not one already (sometimes it is a file, sometimes a plain string; that is part of how FileDescriptor works).
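
    The descriptor mechanics can be tried without Django or S3. The following is only a sketch under stated assumptions: FakeStorage, FakeFile, DynamicFileDescriptor and Media are hypothetical stand-ins, and the URL format is invented for illustration. It shows a descriptor that lazily wraps a stored string into a file object and resolves the storage from the owning instance.

```python
class FakeStorage:
    """Hypothetical stand-in for a storage backend bound to one bucket."""

    def __init__(self, bucket):
        self.bucket = bucket

    def url(self, name):
        # Invented URL layout, for illustration only.
        return "https://{}.s3.amazonaws.com/{}".format(self.bucket, name)


class FakeFile:
    """Hypothetical stand-in with the same (name, mode, storage) constructor shape."""

    def __init__(self, name, mode, storage):
        self.name = name
        self.mode = mode
        self.storage = storage


class DynamicFileDescriptor:
    def __init__(self, name, storage):
        self.name = name
        self.storage = storage  # a storage object, or a callable taking the instance

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        value = instance.__dict__.get(self.name)
        if value is None:
            return None
        # Resolve the storage for this particular instance.
        storage = self.storage(instance) if callable(self.storage) else self.storage
        if isinstance(value, str):
            # Wrap the raw string and cache the wrapper on the instance.
            value = FakeFile(value, "rb", storage)
            instance.__dict__[self.name] = value
        value.storage = storage
        value.url = storage.url(value.name)
        return value

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value


class Media:
    file = DynamicFileDescriptor("file", lambda instance: FakeStorage(instance.bucket))

    def __init__(self, bucket, file):
        self.bucket = bucket
        self.file = file


a = Media("bucket-a", "raw/a.fits")
b = Media("bucket-b", "raw/b.fits")
print(a.file.url)
print(b.file.url)
```

    Note one design choice in this sketch: the storage is resolved into a local variable on every access instead of being written back onto the shared descriptor, so two instances can point at different buckets within the same process.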

    Now, the dynamic part is quite simple. When declaring the field, you can pass a function as the storage option. Like this:

    class MyMedia(models.Model):
        class Meta:
            app_label = 'appname'
    
        mediaset = models.ForeignKey(Mediaset, on_delete=models.CASCADE, related_name='media_files')
        file = DynamicS3BucketFileField(null=True, blank=True, storage=get_fits_file_storage)
    

    The function get_fits_file_storage is then called with a single argument: the MyMedia instance. Hence, I can use any property of that object to return the right storage. In my case that is mediaset, which contains a key that allows me to retrieve an object holding the S3 credentials, with which I can build an S3Boto3Storage instance (another class provided by django-storages).

    Specifically:

    def get_fits_file_storage(instance):
        name = instance.mediaset.archive_storage_name
        return instance.mediaset.archive.bucket_keys.get(name=name).get_storage()
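
    For readers who want to see the lookup chain in isolation, here is a small sketch that runs without Django. Everything except get_fits_file_storage itself is a hypothetical stand-in: BucketKey, BucketKeySet and FakeStorage only imitate the shape of the objects described above, and the bucket names and keys are made up. In the real project get_storage() would presumably build an S3Boto3Storage from the stored credentials.

```python
from collections import namedtuple
from types import SimpleNamespace

# Hypothetical stand-in for the storage object returned by get_storage().
FakeStorage = namedtuple("FakeStorage", ["bucket_name", "access_key"])


class BucketKey:
    """Hypothetical record holding the credentials for one bucket."""

    def __init__(self, name, bucket_name, access_key, secret_key):
        self.name = name
        self.bucket_name = bucket_name
        self.access_key = access_key
        self.secret_key = secret_key

    def get_storage(self):
        # Stand-in: a real implementation would build a storage backend here.
        return FakeStorage(self.bucket_name, self.access_key)


class BucketKeySet:
    """Tiny imitation of a related manager that supports get(name=...)."""

    def __init__(self, keys):
        self._by_name = {key.name: key for key in keys}

    def get(self, name):
        return self._by_name[name]


def get_fits_file_storage(instance):
    name = instance.mediaset.archive_storage_name
    return instance.mediaset.archive.bucket_keys.get(name=name).get_storage()


archive = SimpleNamespace(bucket_keys=BucketKeySet([
    BucketKey("raw", "obs-raw", "example-key-raw", "example-secret-raw"),
    BucketKey("reduced", "obs-reduced", "example-key-reduced", "example-secret-reduced"),
]))
raw_media = SimpleNamespace(
    mediaset=SimpleNamespace(archive=archive, archive_storage_name="raw"))
reduced_media = SimpleNamespace(
    mediaset=SimpleNamespace(archive=archive, archive_storage_name="reduced"))

print(get_fits_file_storage(raw_media).bucket_name)
print(get_fits_file_storage(reduced_media).bucket_name)
```

    Two media objects that belong to different mediasets end up with different storages, which is the whole point of passing a function rather than a fixed storage.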
    

    Et voilà!