
How to maintain a table of all proxy models in Django?


I have a model A and want to make subclasses of it.

class A(models.Model):
    # on_delete is required by Django; CASCADE here is only a placeholder choice
    type = models.ForeignKey("Type", on_delete=models.CASCADE)
    data = models.JSONField()

    def compute(self):
        pass

class B(A):
    def compute(self):
        df = self.go_get_data()
        self.data = self.process(df)

class C(A):
    def compute(self):
        df = self.go_get_other_data()
        self.data = self.process_another_way(df)

# ... other subclasses of A

B and C should not have their own tables, so I decided to use the proxy attribute of Meta. However, I want there to be a table of all the implemented proxies. In particular, I want to keep a record of the name and description of each subclass. For example, for B, the name would be "B" and the description would be the docstring for B. So I made another model:

class Type(models.Model):
    # The name of the class (max_length is an arbitrary choice)
    name = models.CharField(max_length=255)
    # The docstring of the class
    desc = models.TextField()
    # A unique identifier, different from the Django ID,
    # that allows for smoothly changing the name of the class
    identifier = models.IntegerField(unique=True)

Now, I want it so when I create an A, I can only choose between the different subclasses of A. Hence the Type table should always be up-to-date. For example, if I want to unit-test the behavior of B, I'll need to use the corresponding Type instance to create an instance of B, so that Type instance already needs to be in the database.

Looking through the Django documentation, I see two ways to achieve this: fixtures and data migrations. Fixtures aren't dynamic enough for my use case, since the attributes literally come from the code. That leaves me with data migrations.

I tried writing one, that goes something like this:

from django.db import migrations

def update_results(apps, schema_editor):
    A = apps.get_model("app", "A")
    Type = apps.get_model("app", "Type")
    subclasses = get_all_subclasses(A)
    for cls in subclasses:
        id = cls.get_identifier()
        Type.objects.update_or_create(
            identifier=id,
            defaults=dict(name=cls.__name__, desc=cls.__doc__)
        )

class Migration(migrations.Migration):

    operations = [
        migrations.RunPython(update_results)
    ]
    
    # ... other stuff

The problem is, I don't see how to store the identifier within the class, so that the Django Model instance can recover it. So far, here is what I have tried:

I have tried using the fairly new __init_subclass__ construct of Python. So my code now looks like:

class A(models.Model):

    def __init_subclass__(cls, identifier=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if identifier is None:
            raise ValueError(f"{cls.__name__} must be declared with identifier=...")
        cls.identifier = identifier
        Type.objects.update_or_create(
            identifier=identifier,
            defaults=dict(name=cls.__name__, desc=cls.__doc__)
        )
    
    # ... the rest of A

# The identifier should never change, so that even if the
# name of the class changes, we still know which subclass is referred to
class B(A, identifier=3):

    # ... the rest of B

But this update_or_create fails when the database is new (e.g. during unit tests), because the Type table does not exist. When I have this problem in development (we're still in early stages so deleting the DB is still sensible), I have to go comment out the update_or_create in __init_subclass__. I can then migrate and put it back in.

Of course, this solution is also not great because __init_subclass__ is run way more than necessary. Ideally this machinery would only happen at migration.
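
To make the timing issue concrete, here is a minimal plain-Python sketch (no Django involved; the names Base, Child and the calls list are hypothetical stand-ins). It suggests why the database call fails on a fresh database: __init_subclass__ fires as soon as the class statement is executed, i.e. at import time, before any migration could have created the Type table.

```python
calls = []  # hypothetical log standing in for the database write


class Base:
    def __init_subclass__(cls, identifier=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if identifier is None:
            raise ValueError(f"{cls.__name__} must be declared with identifier=...")
        cls.identifier = identifier
        # In the question, Type.objects.update_or_create(...) ran at this point.
        calls.append((identifier, cls.__name__))


class Child(Base, identifier=3):
    """Docstring that would be used as the description."""


# No instance of Child has been created, yet the hook has already run once.
print(calls)
```

Every process that imports the module (tests, the shell, management commands) would trigger the hook again, which is the "run way more than necessary" part.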

So there you have it! I hope the problem statement makes sense.

Thanks for reading this far and I look forward to hearing from you; even if you have other things to do, I wish you a good rest of your day :)


Solution

  • With a little help from Django-expert friends, I solved this with the post_migrate signal. I removed the update_or_create in __init_subclass__, and in project/app/apps.py I added:

    from django.apps import AppConfig
    from django.db.models.signals import post_migrate
    
    
    def get_all_subclasses(cls):
        """Get all subclasses of a class, recursively.
    
        Used to get a list of all the implemented As.
        """
        all_subclasses = []
    
        for subclass in cls.__subclasses__():
            all_subclasses.append(subclass)
            all_subclasses.extend(get_all_subclasses(subclass))
    
        return all_subclasses
    
    
    def update_As(sender=None, **kwargs):
        """Get a list of all implemented As and write them in the database.
    
        More precisely, each model is used to instantiate a Type, which will be used to identify As.
        """
        from app.models import A, Type
    
        subclasses = get_all_subclasses(A)
        for cls in subclasses:
            id = cls.identifier
            Type.objects.update_or_create(identifier=id, defaults=dict(name=cls.__name__, desc=cls.__doc__))
    
    
    class MyAppConfig(AppConfig):
        default_auto_field = "django.db.models.BigAutoField"
        name = "app"
    
        def ready(self):
            post_migrate.connect(update_As, sender=self)
    

    Hope this is helpful for future Django coders in need!
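
    For completeness, here is a rough plain-Python sketch of how the pieces fit together once the database call is removed from __init_subclass__. It is not the actual Django code: A is a plain class rather than a model, a dict stands in for the Type table, and sync_types is a hypothetical stand-in for update_As that mimics update_or_create keyed on identifier. The docstrings on B and C are made up for illustration.

```python
class A:
    """Plain-Python stand-in for the Django model A (assumption)."""

    def __init_subclass__(cls, identifier=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if identifier is None:
            raise ValueError(f"{cls.__name__} must be declared with identifier=...")
        cls.identifier = identifier  # only record it; no database access here


def get_all_subclasses(cls):
    """Collect all subclasses of cls, recursively."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(get_all_subclasses(sub))
    return found


def sync_types(table):
    """Mimic update_or_create keyed on identifier, with a dict as the table."""
    for cls in get_all_subclasses(A):
        table[cls.identifier] = {"name": cls.__name__, "desc": cls.__doc__}
    return table


class B(A, identifier=3):
    """Fetches data and processes it."""


class C(A, identifier=4):
    """Fetches other data and processes it another way."""


table = sync_types({})

# Simulate renaming the class in code: the identifier stays the same,
# so the existing row is updated instead of a new one being added.
B.__name__ = "BRenamed"
sync_types(table)
```

    Because rows are keyed on identifier rather than on the class name, a rename only updates the existing entry, which is the behaviour the identifier field was meant to provide.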