sqldata-modelinghierarchical-dataadjacency-listmptt

Managing hierarchies in SQL: MPTT/nested sets vs adjacency lists vs storing paths


For a while now I've been wrestling with how best to handle hierarchies in SQL. Frustrated by the limitations of adjacency lists and the complexity of MPTT/nested sets, I began thinking about simply storing key paths instead, as a simple node_key/node_key/... string. I decided to compile the pros and cons of the three techniques:

Number of calls required to create/delete/move a node:

Number of calls required to get a tree:

Number of calls required to get path to a node / ancestry:

Number of calls required to get number of subnodes:

Number of calls required to get depth of node:

DB fields required:

Conclusion

The stored path technique uses the same or less calls than the other techniques in every use case except one. By this analysis, storing paths is a clear winner. Not to mention, it's a lot simpler to implement, human readable, etc.

So the question is, shouldn't stored paths be considered a stronger technique than MPTT? Why are stored paths not a more commonly used technique, and why would you not use them over MPTT in a given instance?

Also, if you think this analysis is incomplete please let me know.

UPDATE:

Here are at least 2 things MPTT can do out of the box that a stored path solution won't:

  1. Allows calculation of subnode count for each node without any additional queries (mentioned above).
  2. Imposes an order on nodes at a given level. The other solutions are unordered.

Solution

  • You might also consider the Closure Table design I describe in my answer to What is the most efficient/elegant way to parse a flat table into a tree?

    Calls required to create/delete/move a node:

    Calls required to get a tree:

    Calls required to get path to a node / ancestry:

    Calls required to get number of subnodes:

    Calls required to get depth of node:

    DB fields required:

    There are a couple of other considerations:

    Supports unlimited depth:

    Supports referential integrity:

    I also cover Closure Table in my presentation Models for Hierarchical Data with SQL and PHP, and my book, SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.