djangoinheritancedjango-modelsmultiple-inheritancedjango-inheritance

Django's MutiTable Vs. Abstract Inheritance


While there is general consensus that multi-table inheritance isn't a very good idea in the long term (Jacobian, Others), am wondering if in some use cases the "extra joins" created by django during querying might be worth it.

My issue is having a Single Source of Truth in the database. Say, for Person Objects who are identified using an Identity Number and Identity Type. E.g. ID Number 222, Type Passport.

class Person(models.Model):
    identity_number = models.CharField(max_length=20)
    identity_type = models.IntegerField()

class Student(Person):
    student_number = models.CharField(max_length=20)

class Employee(Person):
    employee_number = models.CharField(max_length=20)

In abstract inheritance, any subclass model of person e.g. Student, Parent, Supervisor, Employee etc inheriting from a Person Abstract Class will have identity_number & identity_type stored in their respective tables

In multi-table inheritance, since they all share the same table, I can be sure that if I create a unique constraint on both columns in the Person Model then no duplicates will exist in the database.

In the abstract inheritance, to keep out duplicates in the database, one would have to build extra validation logic into the application thus also slightly degrading performance meaning it cancels out the "extra join" that django has to do with a concrete inheritance?


Solution

  • It's a mistake to think about your data modeling in object-oriented terms at all. It's an abstraction that fits poorly to relational databases, by hiding some very important details that can massively affect performance (as pointed out in the articles) or correctness (as you've pointed out above).

    A traditional SQL approach to your example would offer two possibilities:

    1. Having a Person table with the IDs and then Student, etc. with foreign keys back to it.
    2. Having a single table for everything, with some additional fields to distinguish the different kinds of person.

    Now, if your evaluation led you to prefer 1, you might notice that in Django this could be accomplished by using a concrete inheritance model (it's the same as what you describe above). In that case, by all means, use inheritance if you'd find the resulting access patterns in Django more elegant.

    So I'm not saying you shouldn't use inheritance, I'm saying you should only look at it after you've modeled your data from the SQL perspective. If you did that in the example above, you would never even consider splitting everything into separate tables—which has all the problems you noted—as suggested by the abstract inheritance model.