django, natural-key, generic-foreign-key

Is it possible to use a natural key for a GenericForeignKey in Django?


I have the following:

target_content_type = models.ForeignKey(ContentType, related_name='target_content_type')
target_object_id = models.PositiveIntegerField()
target = generic.GenericForeignKey('target_content_type', 'target_object_id')

I would like dumpdata --natural to emit a natural key for this relation. Is this possible? If not, is there an alternative strategy that would not tie me to target's primary key?


Solution

  • TL;DR - Currently there is no sane way of doing so, short of creating a custom Serializer / Deserializer pair.

    The problem with models that have generic relations is that Django doesn't see target as a field at all, only target_content_type and target_object_id, and it tries to serialize and deserialize them individually.

    The classes responsible for serializing and deserializing Django models are in the modules django.core.serializers.base and django.core.serializers.python. All the others (xml, json and yaml) extend one of them (and python extends base). The field serialization is done like this (irrelevant lines omitted):

        for obj in queryset:
            for field in concrete_model._meta.local_fields:
                if field.rel is None:
                    self.handle_field(obj, field)
                else:
                    self.handle_fk_field(obj, field)
    

    Here's the first complication: the foreign key to ContentType is handled fine, with natural keys as expected. But the PositiveIntegerField is handled by handle_field, which is implemented like this:

        def handle_field(self, obj, field):
            value = field._get_val_from_obj(obj)
            # Protected types (i.e., primitives like None, numbers, dates,
            # and Decimals) are passed through as is. All other values are
            # converted to string first.
            if is_protected_type(value):
                self._current[field.name] = value
            else:
                self._current[field.name] = field.value_to_string(obj)
    

    In other words, the only customization point available here (subclassing PositiveIntegerField and defining a custom value_to_string) has no effect: an integer is a protected type, so the serializer passes it through and never calls value_to_string. Changing the data type of target_object_id to something other than an integer would probably break many other things, so that's not an option.
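
    To make the dispatch above concrete, here is a minimal sketch that runs without Django. Everything in it is a simplified stand-in written for illustration: the PROTECTED_TYPES tuple only approximates what is_protected_type checks, and NaturalKeyIntegerField, Serializer and Row are hypothetical names, not Django classes. It shows that a custom value_to_string on an integer-valued field is never reached.

```python
import datetime
import decimal

# Assumption: a rough approximation of the types Django treats as "protected".
PROTECTED_TYPES = (type(None), int, float, decimal.Decimal,
                   datetime.datetime, datetime.date, datetime.time)


def is_protected_type(value):
    return isinstance(value, PROTECTED_TYPES)


class NaturalKeyIntegerField:
    """Hypothetical field that would like to emit a natural key."""

    def __init__(self, name):
        self.name = name
        self.value_to_string_calls = 0

    def _get_val_from_obj(self, obj):
        return getattr(obj, self.name)

    def value_to_string(self, obj):
        # Count calls so we can see whether the serializer ever gets here.
        self.value_to_string_calls += 1
        return "natural-key-for-%s" % self._get_val_from_obj(obj)


class Serializer:
    """Stand-in that mirrors only the handle_field logic quoted above."""

    def __init__(self):
        self._current = {}

    def handle_field(self, obj, field):
        value = field._get_val_from_obj(obj)
        if is_protected_type(value):
            self._current[field.name] = value
        else:
            self._current[field.name] = field.value_to_string(obj)


class Row:
    target_object_id = 42


field = NaturalKeyIntegerField("target_object_id")
serializer = Serializer()
serializer.handle_field(Row(), field)
```

    After this runs, serializer._current holds the raw integer and field.value_to_string_calls is still 0, which is why overriding value_to_string alone cannot help.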

    We could define our custom handle_field to emit natural keys in this case, but then comes the second complication: the deserialization is done like this:

        for (field_name, field_value) in six.iteritems(d["fields"]):
            field = Model._meta.get_field(field_name)
            ...
                data[field.name] = field.to_python(field_value)
    

    Even if we customized the to_python method, it acts on the field_value alone, out of the context of the object. That's not a problem with integers, since the value is interpreted as a primary key no matter which model it belongs to. But to deserialize a natural key, we first need to know which model that key belongs to, and that information isn't available unless we have a reference to the object (and the target_content_type field has already been deserialized).
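
    The context problem can be illustrated with a small sketch that does not use Django at all. REGISTRY, to_python and resolve_target are hypothetical names invented for this example; the only thing taken from Django is the idea that a ContentType natural key is an (app_label, model) pair. The field-level function sees only the value, while the object-level function can read the content type first and then resolve the key.

```python
# Hypothetical in-memory "tables": (app_label, model) -> {natural key: pk}.
REGISTRY = {
    ("auth", "user"): {("alice",): 1, ("bob",): 2},
    ("blog", "post"): {("alice",): 7},
}


def to_python(field_value):
    """Field-level hook: it only sees the value, so it cannot pick a model."""
    if isinstance(field_value, int):
        return field_value
    raise ValueError("cannot resolve a natural key without knowing the model")


def resolve_target(fields):
    """Object-level hook: read the content type first, then resolve the key."""
    content_type = tuple(fields["target_content_type"])
    raw = fields["target_object_id"]
    if isinstance(raw, int):
        # Plain primary keys need no context.
        return raw
    return REGISTRY[content_type][tuple(raw)]


user_pk = resolve_target({
    "target_content_type": ["auth", "user"],
    "target_object_id": ["alice"],
})
post_pk = resolve_target({
    "target_content_type": ["blog", "post"],
    "target_object_id": ["alice"],
})
```

    The same natural key ("alice",) resolves to different primary keys depending on the content type, which is exactly the information a per-field to_python never receives.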

    As you can see, it's not an impossible task - supporting natural keys in generic relations - but to accomplish that a lot of things would need to be changed in the serialization and deserialization code. The steps necessary, then (if anyone feels up to the task) are: