
MySQL - Best method to handle this hierarchical data?


This is a followup to:
MySQL - Is it possible to get all sub-items in a hierarchy?

I have an arbitrary-depth adjacency list model table (and I am at the point where I can still convert it into a nested set model).

I read the MySQL material on how to use a nested set model, but it seemed to get increasingly complex, even for basic operations such as inserting, updating and deleting.

Another blog post shows how to use a trigger system with the adjacency list model to maintain an ancestor table that relates each object to all of its ancestors.


Right now I need to be able to return a list of all children of a given node so that I can change or delete them. The hierarchical structure won't change often once created, but there will be a large number of these structures.

The three methods I see are:

  1. Create a stored procedure that runs a recursive query and returns all children.

  2. Convert to the nested set model, which would mean dealing with its complexities and possibly creating stored procedures to add, edit and delete nodes.

  3. Create the ancestor table described above and maintain it with insert/delete triggers.

If there are other methods I'm not exploring, please let me know and I'll update this list.
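For reference, option 3 could look roughly like the sketch below. It is only an illustration under assumptions: the table names `node` and `ancestor` and the trigger name are made up, and it runs against SQLite through Python's `sqlite3` module so that it is self-contained. MySQL trigger syntax differs, so the DDL would need adapting. Only the insert case is handled; deleting or moving nodes would need further triggers.

```python
import sqlite3

# Hypothetical schema: `node` is the adjacency list, `ancestor` is kept in
# sync by an AFTER INSERT trigger (SQLite syntax, not MySQL syntax).
SCHEMA = """
CREATE TABLE node (
    id     INTEGER PRIMARY KEY,
    parent INTEGER REFERENCES node(id)
);
CREATE TABLE ancestor (
    node_id     INTEGER NOT NULL,
    ancestor_id INTEGER NOT NULL,
    PRIMARY KEY (node_id, ancestor_id)
);
CREATE TRIGGER node_after_insert AFTER INSERT ON node
WHEN NEW.parent IS NOT NULL
BEGIN
    -- The direct parent is an ancestor of the new node...
    INSERT INTO ancestor (node_id, ancestor_id) VALUES (NEW.id, NEW.parent);
    -- ...and so is every ancestor of that parent.
    INSERT INTO ancestor (node_id, ancestor_id)
        SELECT NEW.id, ancestor_id FROM ancestor WHERE node_id = NEW.parent;
END;
"""


def all_children(conn, node_id):
    """Return the ids of every node below node_id, at any depth."""
    rows = conn.execute(
        "SELECT node_id FROM ancestor WHERE ancestor_id = ? ORDER BY node_id",
        (node_id,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
# Parents must be inserted before their children for the trigger to work.
conn.executemany(
    "INSERT INTO node (id, parent) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2), (5, 4)],
)
print(all_children(conn, 1))
```

With this layout, fetching all children is a single indexed lookup, at the cost of extra rows in `ancestor` and extra work on every insert.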


Solution

  • Quassnoi has run some performance tests on the nested sets model and the adjacency list model and documented the results and recommendations in his blog post Adjacency list vs. nested sets: MySQL. Here is the conclusion from his article:

    In MySQL, the nested sets model should be preferred if the updates to the hierarchical structure are infrequent and it is affordable to lock the table for the duration of an update (which can take minutes on a long table).

    This implies creating the table using MyISAM storage engine, creating the bounding box of a GEOMETRY type as described above, indexing it with a SPATIAL index and persisting the level in the table.

    If the updates to the table are frequent or it is unaffordable to lock the table for the long period of time implied by an update, then the adjacency list model should be used to store the hierarchical data.

    This requires creating a function to query the table.

    The rest of the article shows how to define the table and implement the queries, and gives performance measurements. The use of the spatial index is a clever idea for improving the performance of the nested set model, and it might be new to you.
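To make the nested set side of that trade-off concrete, here is a rough sketch of the "all children" query against plain integer `lft`/`rgt` columns. It does not use the GEOMETRY/SPATIAL index technique from the article. The table name `nested` and the sample tree are assumptions, and SQLite (through Python's `sqlite3`) is used only so the sketch runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nested (id INTEGER PRIMARY KEY, lft INTEGER, rgt INTEGER)"
)
# Sample tree (hypothetical data): 1 is the root, 2 and 3 are its children,
# and 4 is a child of 2. Each node's interval encloses its descendants.
conn.executemany(
    "INSERT INTO nested (id, lft, rgt) VALUES (?, ?, ?)",
    [(1, 1, 8), (2, 2, 5), (4, 3, 4), (3, 6, 7)],
)


def all_children(conn, node_id):
    """Return ids of all descendants of node_id, in tree (lft) order."""
    rows = conn.execute(
        """
        SELECT child.id
        FROM nested AS node
        JOIN nested AS child
          ON child.lft > node.lft AND child.rgt < node.rgt
        WHERE node.id = ?
        ORDER BY child.lft
        """,
        (node_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(all_children(conn, 1))
```

Reads are a single range join with no recursion. The cost shows up on writes, because inserting or deleting a node means renumbering `lft`/`rgt` for every node to its right.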


    If you're also considering approaches without MySQL, then you might want to look at PostgreSQL, which is another free and open-source database. PostgreSQL supports recursive queries in the form of recursive common table expressions, which make querying hierarchical data easier than in MySQL and also give better performance. Quassnoi has also written an article, Adjacency list vs. nested sets: PostgreSQL, that shows the details.
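As an illustration of what a recursive common table expression looks like, here is a minimal sketch. It reuses the `yourtable`, `id` and `parent` names from the Oracle query in this answer, and the sample rows are made up. It runs on SQLite through Python's `sqlite3` purely for convenience. The `WITH RECURSIVE` statement itself is standard SQL and should need little or no change on PostgreSQL, but treat that as an assumption to verify.

```python
import sqlite3


def all_children(conn, root_id):
    """Return ids of all nodes below root_id, excluding root_id itself."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            -- Anchor: direct children of the starting node.
            SELECT id FROM yourtable WHERE parent = ?
            UNION ALL
            -- Recursive step: children of rows already found.
            SELECT t.id
            FROM yourtable AS t
            JOIN subtree AS s ON t.parent = s.id
        )
        SELECT id FROM subtree ORDER BY id
        """,
        (root_id,),
    ).fetchall()
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (id INTEGER PRIMARY KEY, parent INTEGER)")
# Hypothetical data: two separate trees rooted at 1 and 6.
conn.executemany(
    "INSERT INTO yourtable (id, parent) VALUES (?, ?)",
    [(1, None), (2, 1), (3, 1), (4, 2), (5, 4), (6, None)],
)
print(all_children(conn, 1))
```

The table stays a plain adjacency list, so inserts and deletes remain single-row operations and no auxiliary table or trigger is needed.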

    While we are looking at other approaches, Oracle's database is also worth a mention. Oracle has a custom extension, CONNECT BY, which makes querying hierarchical data very easy and fast. Quassnoi's article Adjacency list vs. nested sets: Oracle again covers the performance details. The query you need to get all children is extremely simple in this case:

    SELECT *
    FROM yourtable
    START WITH id = 42
    CONNECT BY parent = PRIOR id