mysql laravel database-design database-schema polymorphic-associations

Polymorphic relationships vs separate tables per type

I am working on a database which has some types (e.g. User, Appointment, Task etc.) which can have zero or more Notes associated with each type.

The possible solutions I have come across for implementing these relationships are:

Polymorphic relationship
Separate table per type

Polymorphic Relationship

Suggested by many as being the easiest solution to implement and seemingly the most common implementation for frameworks that follow the Active Record pattern, I would add a table whose data is morphable:

My notable_type would allow me to distinguish between the type (User, Appointment, Task) the Note relates to, whilst the notable_id would allow me to obtain the individual type record in the related type table.

PROS:

Easy to scale, more models can be easily associated with the polymorphic class
Limits table bloat
Results in one class that can be used by many other classes (DRY)

CONS

More types can make querying more difficult and expensive as the data grows
Cannot have a foreign key
Lack of data consistency

Separate Table per Type

Alternatively I could create a table for each type which is responsible for the Notes associated with that type only. The type_id foreign key would allow me to quickly obtain the individual type record.

Deemed by many online as a code smell, many articles advocate avoiding the polymorphic relationship in favour of an alternative (here and here for example).

PROS:

Allows us to use foreign keys effectively
Efficient data querying
Maintains data consistency

CONS:

Increases table bloat as each type requires a separate table
Results in multiple classes, each representing the separate type_notes table

Thoughts

The polymorphic relationship is certainly the simpler of the two options to implement, but the lack of foreign key constraints and therefore potential for consistency issues feels wrong.

A table per notes relationship (user_notes, task_notes etc.) with foreign keys seems the correct way (in keeping with design patterns) but could result in a lot of tables (addition of other types that can have notes or addition of types similar to notes [e.g. events]).

It feels like my choice is either simplified table structure but forgo foreign keys and increased query overhead, or increase the number of tables with the same structure but simplify queries and allow for foreign keys.

Given my scenario which of the above would be more appropriate, or is there an alternative I should consider?

Solution

What is "table bloat"? Are you concerned about having too many tables? Many real-world databases I've worked on have between 100 and 200 tables, because that's what it takes.

If you're concerned with adding multiple tables, then why do you have separate tables for User, Appointment, and Task? If you had a multi-valued attribute for User, for example for multiple phone numbers per user, would you create a separate table for phones, or would you try to combine them all into the user table somehow? Or have a polymorphic "things that belong to other things" table for user phones, appointment invitees, and task milestones?

Answer: No, you'd create a Phone table, and use it to reference only the User table. If Appointments have invitees, that gets its own table (probably a many-to-many between appointments and users). If tasks have milestones, that gets its own table too.

The correct thing to do is to model your database tables like you would model object types in your application. You might like to read a book like SQL and Relational Theory: How to Write Accurate SQL Code 3rd Edition by C. J. Date to learn more about how tables are analogous to types.

You already know instinctively that the fact that you can't create a foreign key is a red flag. A foreign key must reference exactly one parent table. This should be a clue that it's not valid relational database design to make a polymorphic foreign key. Once you start thinking of tables and their attributes as concrete types (like described in SQL and Relational Theory), this will become obvious.

If you must create one notes table, you could make it reference one table called "Notable" which is like a superclass of User, Appointment, and Task. Then each of those three tables would also reference a primary key of Notable. This mimics the object-oriented structure of polymorphism, where you can have a class Note have a reference to an object by its superclass type.

But IMHO, that's more complex than it needs to be. I would just create separate tables for UserNotes, AppointmentNotes, and TaskNotes. I'm not troubled by having three more tables, and it makes your code more clear and maintainable.