mysqlpostgresqlcategoriesnested-sets

Why in my nested set hierarchy is my HAVING clause not comparing int values properly?


I'm modeling a category hierarchy in postgres using the nested set model as explain here. I am using a query to find the immediate subordinates of a node which is explained below in an excerpt from the article:

Imagine you are showing a category of electronics products on a retailer web site. When a user clicks on a category, you would want to show the products of that category, as well as list its immediate sub-categories, but not the entire tree of categories beneath it. For this, we need to show the node and its immediate sub-nodes, but no further down the tree. For example, when showing the PORTABLE ELECTRONICS category, we will want to show MP3 PLAYERS, CD PLAYERS, and 2 WAY RADIOS, but not FLASH.

This can be easily accomplished by adding a HAVING clause to our previous query:

SELECT node.name, (COUNT(parent.name) - (sub_tree.depth + 1)) AS depth
FROM nested_category AS node,
        nested_category AS parent,
        nested_category AS sub_parent,
        (
                SELECT node.name, (COUNT(parent.name) - 1) AS depth
                FROM nested_category AS node,
                        nested_category AS parent
                WHERE node.lft BETWEEN parent.lft AND parent.rgt
                        AND node.name = 'PORTABLE ELECTRONICS'
                GROUP BY node.name
                ORDER BY node.lft
        )AS sub_tree WHERE node.lft BETWEEN parent.lft AND parent.rgt
        AND node.lft BETWEEN sub_parent.lft AND sub_parent.rgt
        AND sub_parent.name = sub_tree.name GROUP BY node.name HAVING depth <= 1 ORDER BY node.lft;

My schema differs slightly from the articles schema in the following ways:

Here is my category table definition:

(
    category_id uuid NOT NULL,
    name text COLLATE pg_catalog."default" NOT NULL,
    description text COLLATE pg_catalog."default",
    created_at timestamp with time zone NOT NULL,
    updated_at timestamp with time zone NOT NULL,
    updated_by text COLLATE pg_catalog."default" NOT NULL,
    image_url text COLLATE pg_catalog."default",
    is_main boolean NOT NULL DEFAULT false,
    CONSTRAINT category_category_id_pk PRIMARY KEY (category_id),
    CONSTRAINT category_category_name_key UNIQUE (name)
)

Here is my category tree table definition:

(
    category_tree_id uuid NOT NULL,
    category_id uuid NOT NULL,
    lft integer NOT NULL,
    rgt integer NOT NULL,
    created_at timestamp with time zone NOT NULL,
    updated_at timestamp with time zone NOT NULL,
    updated_by text COLLATE pg_catalog."default" NOT NULL,
    CONSTRAINT category_tree_pkey PRIMARY KEY (category_tree_id)
)

I have translated the query in the article to fit the schema in my db to the following:

SELECT node.category_tree_id,
       node.category_id,
       cat.name,
       node.lft,
       node.rgt,
       ( Count(parent.category_tree_id) - ( sub_tree.depth + 1 ) ) AS depth,
       node.created_at,
       node.updated_at,
       node.updated_by
FROM   category_tree AS node
       JOIN category cat
         ON cat.category_id = node.category_id,
       category_tree AS parent,
       category_tree AS sub_parent,
       (SELECT node.category_tree_id,
               ( Count(parent.category_tree_id) - 1 ) AS depth
        FROM   category_tree AS node,
               category_tree AS parent
        WHERE  node.lft BETWEEN parent.lft AND parent.rgt
               AND node.category_tree_id = 'f47f6269-9605-459e-a319-2b0177f9f4d0' //lookup by category tree id
        GROUP  BY node.category_tree_id
        ORDER  BY node.lft) AS sub_tree
WHERE  ( node.lft BETWEEN parent.lft AND parent.rgt )
       AND ( node.lft BETWEEN sub_parent.lft AND sub_parent.rgt )
       AND ( sub_parent.category_tree_id = sub_tree.category_tree_id )
GROUP  BY node.category_tree_id,
          depth,
          cat.name
HAVING depth <= 1
ORDER  BY node.lft 

This query does not produce any results and the issue seems to be coming from the HAVING depth ... clause. I have two rows in my category_tree table who's depth value are 0 and 1, respectively as shown below:

"f47f6269-9605-459e-a319-2b0177f9f4d0", "d06a143b-523e-4136-8a17-1049abbf76f4", "Parent", 7, 10, 0
"14b939e9-6784-4905-ba24-f67768c57085", "21191930-a5b9-4868-883f-3798f29d70a3", "Child", 8, 9,  1

Oddly enough, if I change the clause HAVING depth > 1 I actually get the two results I'm expecting, which is obviously wrong.

What's my issue here?


Solution

  • Answering my own question.

    You can't use a column alias on the same level where you defined it and having is only allowed in a query that uses aggregation. If you want to avoid repeating the expression, use a derived table. -source

    Full query:

    SELECT node.category_tree_id,
           node.category_id,
           cat.name,
           node.lft,
           node.rgt,
           (Count(parent.category_tree_id) - ( sub_tree.depth + 1 )) AS depth,
           node.created_at,
           node.updated_at,
           node.updated_by
    FROM   category_tree AS node
           JOIN category cat
             ON cat.category_id = node.category_id,
           category_tree AS parent,
           category_tree AS sub_parent,
           (SELECT node.category_tree_id,
                   ( Count(parent.category_tree_id) - 1) AS depth
            FROM   category_tree AS node,
                   category_tree AS parent
            WHERE  node.lft BETWEEN parent.lft AND parent.rgt
                   AND node.category_tree_id = 'f47f6269-9605-459e-a319-2b0177f9f4d0'
            GROUP  BY node.category_tree_id
            ORDER  BY node.lft) AS sub_tree
    WHERE  (node.lft BETWEEN parent.lft AND parent.rgt)
           AND (node.lft BETWEEN sub_parent.lft AND sub_parent.rgt)
           AND (sub_parent.category_tree_id = sub_tree.category_tree_id)
    GROUP  BY node.category_tree_id,
              depth,
              cat.name
    HAVING (Count(parent.category_tree_id) - ( sub_tree.depth + 1 )) <= 1
    ORDER  BY node.lft