mysqlsqltreematerialized-path-pattern

How do I properly sort a materialized paths in SQL?


I am using Materialized Path to store a tree structure in SQl (MySQL 5.7 in my case). I am storing the paths as slash-separated slugs. All the tutorials I have read said to sort the rows by the path to extract it in the right order, but it appears not to work when parts of the path have a similar prefix.

Some example code:

CREATE TABLE categories (
  id int(11),
  parent_id int(11) DEFAULT NULL,
  slug varchar(255),
  path varchar(255)
);

INSERT INTO categories VALUES
  (1, null, 'foo', '/foo'),
  (2, 1, 'bar', '/foo/bar'),
  (3, null, 'foo-it', '/foo-it'),
  (4, 3, 'boy', '/foo-it/boy');

Now, when sorting by path I get the wrong order:

SELECT * FROM categories ORDER BY path;

Output:

+------+-----------+--------+-------------+
| id   | parent_id | slug   | path        |
+------+-----------+--------+-------------+
|    1 |      NULL | foo    | /foo        |
|    3 |      NULL | foo-it | /foo-it     |
|    4 |         3 | boy    | /foo-it/boy |
|    2 |         1 | bar    | /foo/bar    |
+------+-----------+--------+-------------+
4 rows in set (0.00 sec)

This appears to be caused because - precedes / in most (all?) collations.

Crazy thing is, the unix sort commandline utility does the right thing. If I put all the paths in a file and sort it, I get the correct output:

$ sort paths.txt 
/foo
/foo/bar
/foo-it
/foo-it/boy

Is there any way to make MySQL sort the tree properly? To sort it the same way that unix's sort utility does? Perhaps a different collation or something? Or any other tricks?


Solution

  • Try this:

    SELECT * FROM categories ORDER BY path + '/';
    

    Produces:

    /foo-it
    /foo-it/boy
    /foo
    /foo/bar
    

    /foo is sorted after /foo-it because /foo/ comes after /foo-.

    You can fiddle around a bit like replacing - with something that comes after / in ordering and not allowed in paths or file name.

    SELECT * FROM categories ORDER BY replace(path,'-','?') + '/';
    

    Produces:

    /foo
    /foo/bar
    /foo-it
    /foo-it/boy