I'm working on converting an old database and queries to recursive CTEs. In the current code, I'm storing hierarchical data as a dash separated list of 0 padded strings. So as an example, I may have the following forums, with their respective heritage fields
- forum1 (0001)
-- forum4 (0001-0004)
--- forum5 (0001-0004-0005)
The tables also contain a parentID, which I'm trying to build the recursive query off of. For example, I have this query, which grabs all the forums a user is subscribed to, and it's parents:
SELECT
p.forumID,
p.title,
p.parentID,
p.order,
IF(s.ID = p.forumID, 1, 0) isSubbed
FROM
forumSubs s
INNER JOIN forums f ON s.ID = f.forumID
INNER JOIN forums p ON f.heritage LIKE CONCAT(p.heritage, '%')
WHERE
p.forumID != 0
AND s.userID = {$userID}
AND s.`type` = 'f'
ORDER BY
LENGTH(p.heritage),
`order`
I created a CTE to get a forum and its parents:
with recursive forum_with_parents (forumID, title, parentID, `order`) as (
select forumID, title, parentID, `order`
from forums
where forumID = ?
union all
select p.forumID, p.title, p.parentID, p.`order`
from forums p
inner join forum_with_parents
on p.forumID = forum_with_parents.parentID
)
select * from forum_with_parents;
But it needs a forumID to work. So how could I join against it? I'd figure I'd be replacing the forums p
with forums_with_parents
, but I don't know how to join against it, because I need that info before I can set the value in the CTE itself. Does the ENTIRE thing have to be a CTE? If so, I'm struggling to think how to do that.
Your recursive initialization query is standard SQL, so it can absolutely use a JOIN
, either with a "real" table, a sub-SELECT
table (from forums join (select …) subsel on subsel.forumID = forums.forumID
), or a CTE, which would be my preferred (clearest) way of doing a multi-stage query with some recursive parts (you can even have two recursive CTEs in the same with
block. But only the latest of the two will be joinable to the results of the first one; you cannot have two recursive queries accessing each other's temporary result set while running, the first one has to be finished before the second one can launch).
You can mix non-recursive and recursive CTEs in the same with recursive
query;
the recursive
keyword then has to be mentioned only once at the start, not before each recursive CTE; the RDBMS will infer which parts run recursively, and which are simple CTEs.
Thus you can write:
with recursive
subsel as (/* not a recursive CTE */),
forum_with_parents as
(
/* Iteration #0: */
select …
from forums join subsel on subsel.forumID = forums.forumID
union all
/* Recursive iterations: */
select
from forums p inner join forum_with_parents …
)
…;
If I understand correctly, you are looking to use a recursive CTE to rebuild the hierarchy of forums, of which one user has subscribed to some:
have the same results as your first (non-recursive) query, that relies on a denormalized heritage
column that holds the results of a recursive walkthrough of the tree of forums,
using only recursive CTEs (and dropping your heritage
column that poses maintenance problems).
We don't even have to use a second CTE (as I told for the general case),
you can simply replace your = ?
by an in (select s.ID from forumSubs s where s.userID = {$userID} and s.`type` = 'f')
.
I would suggest an additional modification: use union
, not union all
:\0
This will automatically handle deduplication, in case the user subscribed to two forums of the same hierarchy, only one of them will appear in the final result
(if he subscribed to both 4 and 5 for example, both will be detected as direct subscriptions in iteration #0, and then on iteration #1: 4 will fetch its parent 1, and 5 will fetch 4, but as 4 has already been fetched on iteration #0 the recursivity stops propagating it and only continues with 1)
Thus your query would be (with in (select …)
+ union
without all
):
with recursive
forum_with_parents (forumID, title, parentID, `order`) as (
select forumID, title, parentID, `order`
from forums
where forumID in (select s.ID from forumSubs s where s.userID = {$userID} and s.\`type\` = 'f')
union
select p.forumID, p.title, p.parentID, p.`order`
from forums p
inner join forum_with_parents
on p.forumID = forum_with_parents.parentID
)
select * from forum_with_parents;
You can see it (adapted to multiple users) running in a fiddle, with a user 1 having subscribed to both 1 and 5. You can see:
heritage
), showing how 1 appears twice (due to being both a direct subscription, and the grand-parent of 5)union all
(and thus still the double appearance of 1)union
(with 1 appearing only once)