I have a tree structure in a table and it uses materialized paths to allow me to find children quickly. However, I also need to sort the results depth-first, as one would expect with threaded forum replies.
id | parent_id | matpath | created
----+-----------+---------+----------------------------
2 | 1 | 1 | 2010-05-08 15:18:37.987544
3 | 1 | 1 | 2010-05-08 17:38:14.125377
4 | 1 | 1 | 2010-05-08 17:38:57.26743
5 | 1 | 1 | 2010-05-08 17:43:28.211708
7 | 1 | 1 | 2010-05-08 18:18:11.849735
6 | 2 | 1.2 | 2010-05-08 17:50:43.288759
9 | 5 | 1.5 | 2010-05-09 14:02:43.818646
8 | 6 | 1.2.6 | 2010-05-09 14:01:17.632695
So the final results should actually be sorted like this:
id | parent_id | matpath | created
----+-----------+---------+----------------------------
2 | 1 | 1 | 2010-05-08 15:18:37.987544
6 | 2 | 1.2 | 2010-05-08 17:50:43.288759
8 | 6 | 1.2.6 | 2010-05-09 14:01:17.632695
3 | 1 | 1 | 2010-05-08 17:38:14.125377
4 | 1 | 1 | 2010-05-08 17:38:57.26743
5 | 1 | 1 | 2010-05-08 17:43:28.211708
9 | 5 | 1.5 | 2010-05-09 14:02:43.818646
7 | 1 | 1 | 2010-05-08 18:18:11.849735
How would I work that out? Can I do that in straight SQL (this is PostgreSQL 8.4) or should additional information be added to this table?
Update: trying to explain sort criteria better.
Imagine that id '1' is the root post to a forum and everything with a 'matpath' beginning with '1' is a child of that post. So ids 2 through 5 are direct replies to 1 and get matpaths of '1'. However, id 6 is a reply 2, not directly to 1, so it gets a matpath of 1.2. This means that for a threaded forum with proper nesting, with all ids shown in the tables, the structure of the forum would look like this, hence the ordering requirement:
* id 1 (root post)
* id 2
* id 6
* id 8
* id 3
* id 4
* id 5
* id 9
* id 7
I typically create an additional columnn for this, called something like SortPath
. It would contain the data that you need to sort by, concatenated together. That column would be of type varchar
, and get sorted as a string. Something like this:
id | parent_id | matpath | created | sortpath
---+-----------+---------+-----------------------------+--------------------------------------------------------------------------------------
2 | 1 | 1 | 2010-05-08 15:18:37.987544 | 2010-05-08 15:18:37.987544-2
6 | 2 | 1.2 | 2010-05-08 17:50:43.288759 | 2010-05-08 15:18:37.987544-2.2010-05-08 17:50:43.288759-6
8 | 6 | 1.2.6 | 2010-05-09 14:01:17.632695 | 2010-05-08 15:18:37.987544-2.2010-05-08 17:50:43.288759-6.2010-05-09 14:01:17.632695-8
3 | 1 | 1 | 2010-05-08 17:38:14.125377 | 2010-05-08 17:38:14.125377-3
4 | 1 | 1 | 2010-05-08 17:38:57.26743 | 2010-05-08 17:38:57.267430-4
5 | 1 | 1 | 2010-05-08 17:43:28.211708 | 2010-05-08 17:43:28.211708-5
9 | 5 | 1.5 | 2010-05-09 14:02:43.818646 | 2010-05-08 17:43:28.211708-5.2010-05-09 14:02:43.818646-9
7 | 1 | 1 | 2010-05-08 18:18:11.849735 | 2010-05-08 18:18:11.849735-7
A couple of things to note here:
sortpath
will be sorted as a string, so it is important all dates have the same length for it to correctly sort. E.g., observe how 2010-05-08 17:38:57.26743
has an extra zero added in the sortpath
column.sortpath
, but it is not in matpath
. I would prefer to see it in both.sortcolumn
as well. This is so that if you ever want to query for more than one forum at a time (you probably won't), then it will still sort correctly.