I have the following:
target_content_type = models.ForeignKey(ContentType, related_name='target_content_type')
target_object_id = models.PositiveIntegerField()
target = generic.GenericForeignKey('target_content_type', 'target_object_id')
I would like dumpdata --natural to emit a natural key for this relation. Is this possible? If not, is there an alternative strategy that would not tie me to target's primary key?
TL;DR - Currently there is no sane way of doing so, short of creating a custom Serializer / Deserializer pair.
The problem with models that have generic relations is that Django doesn't see target as a field at all, only target_content_type and target_object_id, and it tries to serialize and deserialize them individually.
The classes responsible for serializing and deserializing Django models are in the modules django.core.serializers.base and django.core.serializers.python. All the others (xml, json and yaml) extend one of them (and python extends base). The field serialization is done like this (irrelevant lines omitted):
    for obj in queryset:
        for field in concrete_model._meta.local_fields:
            if field.rel is None:
                self.handle_field(obj, field)
            else:
                self.handle_fk_field(obj, field)
Here's the first complication: the foreign key to ContentType is handled OK, with natural keys as we expected. But the PositiveIntegerField is handled by handle_field, which is implemented like this:
    def handle_field(self, obj, field):
        value = field._get_val_from_obj(obj)
        # Protected types (i.e., primitives like None, numbers, dates,
        # and Decimals) are passed through as is. All other values are
        # converted to string first.
        if is_protected_type(value):
            self._current[field.name] = value
        else:
            self._current[field.name] = field.value_to_string(obj)
That is, the only possibility for customization here (subclassing PositiveIntegerField and defining a custom value_to_string) will have no effect: an integer is a protected type, so the serializer passes it through and never calls value_to_string. Changing the data type of target_object_id to something other than an integer would probably break many other things, so it's not an option.
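To make the dead end concrete, here is a minimal, Django-free sketch of the same branching. FakeField and the simplified is_protected_type are stand-ins invented for illustration (the real check lives in django.utils.encoding, and the real value_to_string receives the object rather than the value); only the if/else mirrors the handle_field snippet above.

```python
import datetime
import decimal

# Simplified stand-in for Django's is_protected_type (assumption: the real
# check covers roughly these primitive types).
_PROTECTED = (type(None), int, float, decimal.Decimal,
              datetime.datetime, datetime.date, datetime.time)


def is_protected_type(value):
    return isinstance(value, _PROTECTED)


class FakeField:
    """Hypothetical field whose value_to_string would emit a natural key."""
    name = "target_object_id"

    def __init__(self):
        self.value_to_string_calls = 0

    def value_to_string(self, value):
        self.value_to_string_calls += 1
        return "natural-key-for-%s" % value


def handle_field(current, field, value):
    # Same branching as the serializer's handle_field shown above.
    if is_protected_type(value):
        current[field.name] = value
    else:
        current[field.name] = field.value_to_string(value)


field = FakeField()
serialized = {}
handle_field(serialized, field, 42)
```

Because 42 is an integer, the first branch is taken and the custom value_to_string is never invoked, which is why subclassing the field alone cannot change the output.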
We could define our custom handle_field to emit natural keys in this case, but then comes the second complication: the deserialization is done like this:
    for (field_name, field_value) in six.iteritems(d["fields"]):
        field = Model._meta.get_field(field_name)
        ...
        data[field.name] = field.to_python(field_value)
Even if we customized the to_python method, it acts on the field_value alone, out of the context of the object. That's not a problem when using integers, since the value will be interpreted as the model's primary key no matter what model it is. But to deserialize a natural key, we first need to know which model that key belongs to, and that information isn't available unless we have a reference to the object (and the target_content_type field has already been deserialized).
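The ambiguity can be illustrated with a small, Django-free sketch. The in-memory tables, the model labels and both functions below are hypothetical and exist only to show why a natural key cannot be resolved from the value alone, while an integer primary key can.

```python
# Hypothetical in-memory "tables" keyed by natural key; not Django models.
ARTICLES = {("django-tips",): 7}
AUTHORS = {("django-tips",): 3}
TABLES = {"article": ARTICLES, "author": AUTHORS}


def to_python(field_value):
    # Mirrors the limitation described above: only the raw value is
    # available. An integer primary key can be returned as is, but a
    # natural key has no model to be looked up in.
    if isinstance(field_value, int):
        return field_value
    raise LookupError(
        "cannot resolve %r without knowing the model" % (field_value,))


def decode_with_context(field_value, model_label):
    # Once the content type is known, the natural key maps to exactly
    # one primary key.
    return TABLES[model_label][tuple(field_value)]
```

Here the same natural key, ("django-tips",), resolves to a different primary key depending on the model, so the decoder has to be told the content type before it can do anything useful.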
As you can see, it's not an impossible task - supporting natural keys in generic relations - but to accomplish that a lot of things would need to be changed in the serialization and deserialization code. The steps necessary, then (if anyone feels up to the task) are:
1. Define a new Field extending PositiveIntegerField, with methods to encode/decode an object - calling the referenced models' natural_key and get_by_natural_key;
2. Override the serializer's handle_field to call the encoder if present;
3. Override the deserializer so that it calls the decoder, passing not only the field_value but also a reference to the decoded ContentType.
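The overall shape of these steps can be sketched without Django. Everything below is hypothetical: Tag stands in for a model with natural-key support, MODELS plays the role of the ContentType lookup, and encode/decode are illustrative helpers rather than anything the serialization framework provides.

```python
class Tag:
    """Hypothetical model stand-in exposing natural-key helpers."""
    _by_slug = {}

    def __init__(self, pk, slug):
        self.pk = pk
        self.slug = slug
        Tag._by_slug[slug] = self

    def natural_key(self):
        return (self.slug,)

    @classmethod
    def get_by_natural_key(cls, slug):
        return cls._by_slug[slug]


# Hypothetical registry playing the role of ContentType lookups.
MODELS = {("app", "tag"): Tag}


def encode(target, content_type_key):
    # Serializer side: emit the target's natural key instead of its pk.
    return {
        "target_content_type": list(content_type_key),
        "target_object_id": list(target.natural_key()),
    }


def decode(fields):
    # Deserializer side: resolve the content type first, then use it to
    # turn the natural key back into a primary key.
    model = MODELS[tuple(fields["target_content_type"])]
    target = model.get_by_natural_key(*fields["target_object_id"])
    return model, target.pk
```

For example, encode(Tag(5, "python"), ("app", "tag")) produces a dictionary with no primary key in it, and decode on that dictionary recovers the model and the pk. The ordering inside decode is the important part: the content type has to be resolved before the object id can be interpreted.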